archived-llvm

RPCS3/archived-llvm

mirror of https://github.com/RPCS3/llvm.git synced 2026-01-31 01:25:19 +01:00

Commit Graph

Author

SHA1

Message

Date

Piotr Padlewski

a6db8552b8

[thinlto] Don't decay threshold for hot callsites

Summary:
We don't want to decay the import threshold for hot callsites, so that
chains of hot callsites can be imported. The same mechanism is used in LIPO.

Reviewers: tejohnson, eraman, mehdi_amini

Subscribers: llvm-commits, mehdi_amini

Differential Revision: https://reviews.llvm.org/D24976

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282833 91177308-0d34-0410-b5e6-96231b3b80d8

2016-09-30 03:01:17 +00:00
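
To illustrate the idea behind a6db8552b8, the sketch below models an import threshold that shrinks at each step down a call chain, except when the step goes through a hot callsite. It is a minimal Python model under stated assumptions, not the LLVM implementation: the function names, the base threshold of 100, and the 0.5 decay factor are all hypothetical values chosen for the example.

```python
from typing import List


def next_threshold(threshold: float, callsite_is_hot: bool,
                   decay_factor: float = 0.5,
                   hot_decay_factor: float = 1.0) -> float:
    """Return the threshold to use one level deeper in the call chain.

    decay_factor and hot_decay_factor are illustrative assumptions;
    a hot factor of 1.0 means "do not decay for hot callsites".
    """
    factor = hot_decay_factor if callsite_is_hot else decay_factor
    return threshold * factor


def thresholds_along_chain(base: float, hot_flags: List[bool]) -> List[float]:
    """Threshold applied at each callsite of a chain, outermost first."""
    applied = []
    threshold = base
    for is_hot in hot_flags:
        applied.append(threshold)
        threshold = next_threshold(threshold, is_hot)
    return applied


if __name__ == "__main__":
    # Two hot callsites followed by two non-hot ones.
    print(thresholds_along_chain(100.0, [True, True, False, False]))
```

In this model a chain made only of hot callsites keeps the full threshold at every depth, while the threshold starts to shrink as soon as the chain passes through a callsite that is not hot.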

Piotr Padlewski

584dba4143

[thinlto] Add cold-callsite import heuristic

Summary:
The heuristic is not tuned yet, but even this small heuristic gives about a
+0.10% improvement on SPEC 2006.

Reviewers: tejohnson, mehdi_amini, eraman

Subscribers: mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D24940

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282733 91177308-0d34-0410-b5e6-96231b3b80d8

2016-09-29 17:32:07 +00:00

Piotr Padlewski

fdf7354745

[thinlto] Basic thinlto fdo heuristic

Summary:
This patch improves the ThinLTO importer
by importing functions up to 3x larger when they are called from a hot block.

I compared performance against trunk on SPEC and saw improvements
of about 2% on povray and 3.33% on milc. These results seem
to be consistent and match the results Teresa got with her simple
heuristic. Some benchmarks got slower, but I think they are just
noisy (mcf, xalancbmk, omnetpp); I am running the benchmarks again with
more iterations to confirm. The geomean of all benchmarks, including the noisy
ones, was about +0.02%.

I see a much better improvement on the Google branch with Easwaran's patch
for PGO callsite inlining (the inliner actually inlines those big functions).
Overall I see a +0.5% improvement, and I get +8.65% on povray.
So I expect a much bigger change once Easwaran's patch lands
(it depends on the new pass manager), but it is still worth putting this into
trunk before then.

Implementation changes:
- Removed CallsiteCount.
- Replaced ProfileCount with Hotness.
- hot-import-multiplier is set to 3.0 for now. I didn't have time to tune it,
but we get most of the interesting functions with 3, so there is not much
performance difference with higher values, and binary size doesn't grow as
much as with 10.0.

Reviewers: eraman, mehdi_amini, tejohnson

Subscribers: mehdi_amini, llvm-commits

Differential Revision: https://reviews.llvm.org/D24638

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282437 91177308-0d34-0410-b5e6-96231b3b80d8

2016-09-26 20:37:32 +00:00
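
As a rough illustration of the change described in fdf7354745, the sketch below scales a base import size limit by 3.0 for callees reached from a hot callsite. It is a minimal Python model, not the LLVM implementation: the names `Hotness`, `import_threshold`, and `should_import`, the enum members, and the base limit of 100 instructions are assumptions made for this example. Only the 3.0 multiplier comes from the commit message.

```python
from enum import Enum


class Hotness(Enum):
    # Hypothetical hotness levels for a callsite.
    UNKNOWN = 0
    COLD = 1
    HOT = 2


# Multiplier value quoted in the commit message.
HOT_IMPORT_MULTIPLIER = 3.0


def import_threshold(base_limit: float, hotness: Hotness) -> float:
    """Size limit for importing a callee, scaled up for hot callsites."""
    if hotness is Hotness.HOT:
        return base_limit * HOT_IMPORT_MULTIPLIER
    return base_limit


def should_import(callee_inst_count: int, base_limit: float,
                  hotness: Hotness) -> bool:
    """Import the callee only if its size fits under the threshold."""
    return callee_inst_count <= import_threshold(base_limit, hotness)


if __name__ == "__main__":
    # Assumed base limit of 100 instructions; callee has 250 instructions.
    print(should_import(250, 100, Hotness.HOT))
    print(should_import(250, 100, Hotness.UNKNOWN))
```

With these assumed numbers, a 250-instruction callee is imported only when its callsite is hot, which matches the "3x larger functions" wording of the summary.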

3 Commits