RPCS3/llvm - llvm - Gitea: Git with a cup of tea

RPCS3/llvm

mirror of https://github.com/RPCS3/llvm.git synced 2025-04-15 12:30:22 +00:00

Author	SHA1	Message	Date
Fedor Sergeev	2f9c8a0c1d	[LoopUnroll] allow customization for new-pass-manager version of LoopUnroll Unlike its legacy counterpart new pass manager's LoopUnrollPass does not provide any means to select which flavors of unroll to run (runtime, peeling, partial), relying on global defaults. In some cases having ability to run a restricted LoopUnroll that does more than LoopFullUnroll is needed. Introduced LoopUnrollOptions to select optional unroll behaviors. Added 'unroll<peeling>' to PassRegistry mainly for the sake of testing. Reviewers: chandlerc, tejohnson Differential Revision: https://reviews.llvm.org/D53440 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345723 91177308-0d34-0410-b5e6-96231b3b80d8	2018-10-31 14:33:14 +00:00
Chandler Carruth	2aaf7228e0	[TI removal] Make variables declared as `TerminatorInst` and initialized by `getTerminator()` calls instead be declared as `Instruction`. This is the biggest remaining chunk of the usage of `getTerminator()` that insists on the narrow type and so is an easy batch of updates. Several files saw more extensive updates where this would cascade to requiring API updates within the file to use `Instruction` instead of `TerminatorInst`. All of these were trivial in nature (pervasively using `Instruction` instead just worked). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@344502 91177308-0d34-0410-b5e6-96231b3b80d8	2018-10-15 10:04:59 +00:00
Neil Henning	f4bd5343c9	Add missing period to comment to match style of file. This is a test commit to show that my commit access is working. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343842 91177308-0d34-0410-b5e6-96231b3b80d8	2018-10-05 09:39:07 +00:00
Fangrui Song	af7b1832a0	Remove trailing space sed -Ei 's/[[:space:]]+$//' include/*/.{def,h,td} lib/*/.{cpp,h} git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@338293 91177308-0d34-0410-b5e6-96231b3b80d8	2018-07-30 19:41:25 +00:00
David Green	e101271f21	[UnrollAndJam] New Unroll and Jam pass This is a simple implementation of the unroll-and-jam classical loop optimisation. The basic idea is that we take an outer loop of the form: for i.. ForeBlocks(i) for j.. SubLoopBlocks(i, j) AftBlocks(i) Instead of doing normal inner or outer unrolling, we unroll as follows: for i... i+=2 ForeBlocks(i) ForeBlocks(i+1) for j.. SubLoopBlocks(i, j) SubLoopBlocks(i+1, j) AftBlocks(i) AftBlocks(i+1) Remainder Loop So we have unrolled the outer loop, then jammed the two inner loops into one. This can lead to a simpler inner loop if memory accesses can be shared between the now jammed loops. To do this we have to prove that this is all safe, both for the memory accesses (using dependence analysis) and that ForeBlocks(i+1) can move before AftBlocks(i) and SubLoopBlocks(i, j). Differential Revision: https://reviews.llvm.org/D41953 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336062 91177308-0d34-0410-b5e6-96231b3b80d8	2018-07-01 12:47:30 +00:00
Matt Arsenault	93ae5a35af	LoopUnroll: Allow analyzing intrinsic call costs I'm not sure why the code here is skipping calls since TTI does try to do something for general calls, but it at least should allow intrinsics. Skip intrinsics that should not be omitted as calls, which is by far the most common case on AMDGPU. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335645 91177308-0d34-0410-b5e6-96231b3b80d8	2018-06-26 18:51:17 +00:00
Hiroshi Inoue	26570985ef	[NFC] fix trivial typos in comments git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@334687 91177308-0d34-0410-b5e6-96231b3b80d8	2018-06-14 05:41:49 +00:00
David Green	00d34a85c6	Revert 333358 as it's failing on some builders. I'm guessing the tests reply on the ARM backend being built. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@333359 91177308-0d34-0410-b5e6-96231b3b80d8	2018-05-27 12:54:33 +00:00
David Green	3dde46793c	[UnrollAndJam] Add a new Unroll and Jam pass This is a simple implementation of the unroll-and-jam classical loop optimisation. The basic idea is that we take an outer loop of the form: for i.. ForeBlocks(i) for j.. SubLoopBlocks(i, j) AftBlocks(i) Instead of doing normal inner or outer unrolling, we unroll as follows: for i... i+=2 ForeBlocks(i) ForeBlocks(i+1) for j.. SubLoopBlocks(i, j) SubLoopBlocks(i+1, j) AftBlocks(i) AftBlocks(i+1) Remainder So we have unrolled the outer loop, then jammed the two inner loops into one. This can lead to a simpler inner loop if memory accesses can be shared between the now-jammed loops. To do this we have to prove that this is all safe, both for the memory accesses (using dependence analysis) and that ForeBlocks(i+1) can move before AftBlocks(i) and SubLoopBlocks(i, j). Differential Revision: https://reviews.llvm.org/D41953 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@333358 91177308-0d34-0410-b5e6-96231b3b80d8	2018-05-27 12:11:21 +00:00
Nicola Zaghen	0818e789cb	Rename DEBUG macro to LLVM_DEBUG. The DEBUG() macro is very generic so it might clash with other projects. The renaming was done as follows: - git grep -l 'DEBUG' \| xargs sed -i 's/\bDEBUG\s\?(/LLVM_DEBUG(/g' - git diff -U0 master \| ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM - Manual change to APInt - Manually chage DOCS as regex doesn't match it. In the transition period the DEBUG() macro is still present and aliased to the LLVM_DEBUG() one. Differential Revision: https://reviews.llvm.org/D43624 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@332240 91177308-0d34-0410-b5e6-96231b3b80d8	2018-05-14 12:53:11 +00:00
Adrian Prantl	26b584c691	Remove \brief commands from doxygen comments. We've been running doxygen with the autobrief option for a couple of years now. This makes the \brief markers into our comments redundant. Since they are a visual distraction and we don't want to encourage more \brief markers in new code either, this patch removes them all. Patch produced by for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done Differential Revision: https://reviews.llvm.org/D46290 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@331272 91177308-0d34-0410-b5e6-96231b3b80d8	2018-05-01 15:54:18 +00:00
Ikhlas Ajbar	7c6b96497b	[Hexagon] peel loops with runtime small trip counts Move the check canPeel() to Hexagon Target before setting PeelCount. Differential Revision: https://reviews.llvm.org/D44880 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@329129 91177308-0d34-0410-b5e6-96231b3b80d8	2018-04-03 22:55:09 +00:00
Ikhlas Ajbar	746a5dcae6	peel loops with runtime small trip counts For Hexagon, peeling loops with small runtime trip count is beneficial for our benchmarks. We set PeelCount in HexagonTargetInfo.cpp and we use PeelCount set by the target for computing the desired peel count. Differential Revision: https://reviews.llvm.org/D44880 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@329042 91177308-0d34-0410-b5e6-96231b3b80d8	2018-04-03 03:39:43 +00:00
David Blaikie	49ca55e381	Transforms: Introduce Transforms/Utils.h rather than spreading the declarations amongst Scalar.h and IPO.h Fixes layering - Transforms/Utils shouldn't depend on including a Scalar or IPO header, because Scalar and IPO depend on Utils. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328717 91177308-0d34-0410-b5e6-96231b3b80d8	2018-03-28 17:44:36 +00:00
Florian Hahn	648727d7b9	[LoopUnroll] Peel off iterations if it makes conditions true/false. If the loop body contains conditions of the form IndVar < #constant, we can remove the checks by peeling off #constant iterations. This improves codegen for PR34364. Reviewers: mkuper, mkazantsev, efriedma Reviewed By: mkazantsev Differential Revision: https://reviews.llvm.org/D43876 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@327671 91177308-0d34-0410-b5e6-96231b3b80d8	2018-03-15 21:34:43 +00:00
Andrei Elovikov	0e27effab3	[LoopUnroll] Ignore ephemeral values when checking full unroll profitability. Summary: Before this patch call graph is like this in the LoopUnrollPass: tryToUnrollLoop ApproximateLoopSize collectEphemeralValues /* Use collected ephemeral values / computeUnrollCount analyzeLoopUnrollCost / Bail out from the analysis if loop contains CallInst / This patch moves collection of the ephemeral values to the tryToUnrollLoop function and passes the collected values into both ApproximateLoopsize (as before) and additionally starts using them in analyzeLoopUnrollCost: tryToUnrollLoop collectEphemeralValues ApproximateLoopSize(EphValues) / Use EphValues / computeUnrollCount(EphValues) analyzeLoopUnrollCost(EphValues) / Ignore ephemeral values - they don't contribute to the final cost / / Bail out from the analysis if loop contains CallInst */ Reviewers: mzolotukhin, evstupac, sanjoy Reviewed By: evstupac Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D43931 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@327617 91177308-0d34-0410-b5e6-96231b3b80d8	2018-03-15 09:59:15 +00:00
Yaxun Liu	a0af6ca70c	LoopUnroll: respect pragma unroll when AllowRemainder is disabled Currently when AllowRemainder is disabled, pragma unroll count is not respected even though there is no remainder. This bug causes a loop fully unrolled in many cases even though the user specifies a unroll count. Especially it affects OpenCL/CUDA since in many cases a loop contains convergent instructions and currently AllowRemainder is disabled for such loops. Differential Revision: https://reviews.llvm.org/D43826 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326585 91177308-0d34-0410-b5e6-96231b3b80d8	2018-03-02 16:22:32 +00:00
Easwaran Raman	fe7b9dc2d2	Add hasProfileData() to check if a function has profile data. NFC. Summary: This replaces calls to getEntryCount().hasValue() with hasProfileData that does the same thing. This refactoring is useful to do before adding synthetic function entry counts but also a useful cleanup IMO even otherwise. I have used hasProfileData instead of hasRealProfileData as David had earlier suggested since I think profile implies "real" and I use the phrase "synthetic entry count" and not "synthetic profile count" but I am fine calling it hasRealProfileData if you prefer. Reviewers: davidxl, silvas Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D41461 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321331 91177308-0d34-0410-b5e6-96231b3b80d8	2017-12-22 01:33:52 +00:00
Simon Pilgrim	a9278fb1d8	Fix MSVC signed/unsigned comparison warning git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316161 91177308-0d34-0410-b5e6-96231b3b80d8	2017-10-19 15:00:31 +00:00
Eugene Zelenko	e77c212f96	[Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316128 91177308-0d34-0410-b5e6-96231b3b80d8	2017-10-18 21:46:47 +00:00
Hongbin Zheng	0446db2b0e	[LoopInfo][Refactor] Make SetLoopAlreadyUnrolled a member function of the Loop Pass, NFC. This avoid code duplication and allow us to add the disable unroll metadata elsewhere. Differential Revision: https://reviews.llvm.org/D38928 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315850 91177308-0d34-0410-b5e6-96231b3b80d8	2017-10-15 07:31:02 +00:00
Vivek Pandya	2540c741d5	[NFC] Convert OptimizationRemarkEmitter old emit() calls to new closure parameterized emit() calls Summary: This is not functional change to adopt new emit() API added in r313691. Reviewed By: anemet Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D38285 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315476 91177308-0d34-0410-b5e6-96231b3b80d8	2017-10-11 17:12:59 +00:00
Adam Nemet	3b8950a6d6	Rename OptimizationDiagnosticInfo.* to OptimizationRemarkEmitter.* Sync it up with the name of the class actually defined here. This has been bothering me for a while... git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315249 91177308-0d34-0410-b5e6-96231b3b80d8	2017-10-09 23:19:02 +00:00
Benjamin Kramer	2b11cf60db	[LoopUnroll] Fix use after poison. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314418 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-28 14:47:39 +00:00
Sanjoy Das	e87cf87e45	Use a BumpPtrAllocator for Loop objects Summary: And now that we no longer have to explicitly free() the Loop instances, we can (with more ease) use the destructor of LoopBase to do what LoopBase::clear() was doing. Reviewers: chandlerc Subscribers: mehdi_amini, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D38201 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314375 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-28 02:45:42 +00:00
Rui Ueyama	5e0e7e8321	Fix -Wunused-variable for Release build. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314353 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-27 22:03:15 +00:00
Sanjoy Das	f391e18bcd	Return the LoopUnrollResult from tryToUnrollLoop; NFC I will use this in a later change. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314352 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-27 21:45:22 +00:00
Sanjoy Das	18ed8f9b45	Rename LoopUnrollStatus to LoopUnrollResult; NFC A "Result" suffix is more appropriate here git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314350 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-27 21:45:19 +00:00
Sanjoy Das	198959c487	Tighten the invariants around LoopBase::invalidate Summary: With this change: - Methods in LoopBase trip an assert if the receiver has been invalidated - LoopBase::clear frees up the memory held the LoopBase instance This change also shuffles things around as necessary to work with this stricter invariant. Reviewers: chandlerc Subscribers: mehdi_amini, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D38055 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313708 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-20 02:31:57 +00:00
Davide Italiano	628b9ff95d	[LoopUnroll] Add a cl::opt to force peeling, for testing purposes. Will be used to test the patch proposed in D37153. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311915 91177308-0d34-0410-b5e6-96231b3b80d8	2017-08-28 19:50:55 +00:00
Sam Parker	66f113a5b0	[LoopUnroll] Enable option to peel remainder loop On some targets, the penalty of executing runtime unrolling checks and then not the unrolled loop can be significantly detrimental to performance. This results in the need to be more conservative with the unroll count, keeping a trip count of 2 reduces the overhead as well as increasing the chance of the unrolled body being executed. But being conservative leaves performance gains on the table. This patch enables the unrolling of the remainder loop introduced by runtime unrolling. This can help reduce the overhead of misunrolled loops because the cost of non-taken branches is much less than the cost of the backedge that would normally be executed in the remainder loop. This allows larger unroll factors to be used without suffering performance loses with smaller iteration counts. Differential Revision: https://reviews.llvm.org/D36309 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@310824 91177308-0d34-0410-b5e6-96231b3b80d8	2017-08-14 09:25:26 +00:00
Chandler Carruth	f4c6127fb8	[PM] Fix new LoopUnroll function pass by invalidating loop analysis results when a loop is completely removed. This is very hard to manifest as a visible bug. You need to arrange for there to be a subsequent allocation of a 'Loop' object which gets the exact same address as the one which the unroll deleted, and you need the LoopAccessAnalysis results to be significant in the way that they're stale. And you need a million other things to align. But when it does, you get a deeply mysterious crash due to actually finding a stale analysis result. This fixes the issue and tests for it by directly checking we successfully invalidate things. I have not been able to get any test case to reliably trigger this. Changes to LLVM itself caused the only test case I ever had to cease to crash. I've looked pretty extensively at less brittle ways of fixing this and they are actually very, very hard to do. This is a somewhat strange and unusual case as we have a pass which is deleting an IR unit, but is not running within that IR unit's pass framework (which is what handles this cleanly for the normal loop unroll). And where there isn't a definitive way to clear all of the stale cache entries. And where the pass is updating the core analysis that provides the IR units! For example, we don't have any of these problems with Function analyses because it is easy to clear out function analyses when the functions themselves may have been deleted -- we clear an entire module's worth! But that is too heavy of a hammer down here in the LoopAnalysisManager layer. A better long-term solution IMO is to require that AnalysisManager's make their keys durable to this kind of thing. Specifically, when caching an analysis for one IR unit that is conceptually "owned" by a higher level IR unit, the AnalysisManager should incorporate this into its data structures so that we can reliably clear these results without having to teach each and every pass to do so manually as we do here. But that is a change for another day as it will be a fairly invasive change to the AnalysisManager infrastructure. Until then, this fortunately seems to be quite rare. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@310333 91177308-0d34-0410-b5e6-96231b3b80d8	2017-08-08 02:24:20 +00:00
Teresa Johnson	dad922aba2	Use profile summary to disable peeling for huge working sets Summary: Detect when the working set size of a profiled application is huge, by comparing the number of counts required to reach the hot percentile in the profile summary to a large threshold. When the working set size is determined to be huge, disable peeling to avoid bloating the working set further. Note that the selected threshold (15K) is significantly larger than the largest working set value in SPEC cpu2006 (which is gcc at around 11K). Reviewers: davidxl Subscribers: mehdi_amini, mzolotukhin, eraman, llvm-commits Differential Revision: https://reviews.llvm.org/D36288 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@310005 91177308-0d34-0410-b5e6-96231b3b80d8	2017-08-03 23:42:58 +00:00
Teresa Johnson	aebcb9ce1d	Disable loop peeling during full unrolling pass. Summary: Peeling should not occur during the full unrolling invocation early in the pipeline, but rather later with partial and runtime loop unrolling. The later loop unrolling invocation will also eventually utilize profile summary and branch frequency information, which we would like to use to control peeling. And for ThinLTO we want to delay peeling until the backend (post thin link) phase, just as we do for most types of unrolling. Ensure peeling doesn't occur during the full unrolling invocation by adding a parameter to the shared implementation function, similar to the way partial and runtime loop unrolling are disabled. Performance results for ThinLTO suggest this has a neutral to positive effect on some internal benchmarks. Reviewers: chandlerc, davidxl Subscribers: mzolotukhin, llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D36258 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309966 91177308-0d34-0410-b5e6-96231b3b80d8	2017-08-03 17:52:38 +00:00
Teresa Johnson	99dd11f7e8	[PM] Split LoopUnrollPass and make partial unroller a function pass Summary: This is largely NFC, in preparation for utilizing ProfileSummaryInfo and BranchFrequencyInfo analyses. In this patch I am only doing the splitting for the New PM, but I can do the same for the legacy PM as a follow-on if this looks good. Not NFC since for partial unrolling we lose the updates done to the loop traversal (adding new sibling and child loops) - according to Chandler this is not very useful for partial unrolling, but it also means that the debugging flag -unroll-revisit-child-loops no longer works for partial unrolling. Reviewers: chandlerc Subscribers: mehdi_amini, mzolotukhin, eraman, llvm-commits Differential Revision: https://reviews.llvm.org/D36157 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@309886 91177308-0d34-0410-b5e6-96231b3b80d8	2017-08-02 20:35:29 +00:00
Geoff Berry	b6867d2be9	[LoopUnroll] Fix bug in computeUnrollCount causing it to not honor MaxCount Reviewers: sanjoy, anna, reames, apilipenko, igor-laevsky, mkuper Subscribers: mcrosier, llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D34532 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306564 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-28 17:01:15 +00:00
Geoff Berry	28b3f06e1a	[LoopUnroll] Pass SCEV to getUnrollingPreferences hook. NFCI. Reviewers: sanjoy, anna, reames, apilipenko, igor-laevsky, mkuper Subscribers: jholewinski, arsenm, mzolotukhin, nemanjai, nhaehnle, javed.absar, mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D34531 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@306554 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-28 15:53:17 +00:00
Chandler Carruth	ddfada260a	[IR] Redesign the case iterator in SwitchInst to actually be an iterator and to expose a handle to represent the actual case rather than having the iterator return a reference to itself. All of this allows the iterator to be used with common STL facilities, standard algorithms, etc. Doing this exposed some missing facilities in the iterator facade that I've fixed and required some work to the actual iterator to fully support the necessary API. Differential Revision: https://reviews.llvm.org/D31548 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300032 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-12 07:27:28 +00:00
Sanjoy Das	c4c0c1a3ed	[LoopUnrolling] Re-prioritize Peeling and Partial unrolling Summary: In current implementation the loop peeling happens after trip-count based partial unrolling and may sometimes not happen at all due to it (for example, if trip count is known, but UP.Partial = false). This is generally bad, the more than there are some situations where peeling is profitable even if the partial unrolling is disabled. This patch is a NFC which reorders peeling and partial unrolling application and prepares the code for implementation of the said optimizations. Patch by Max Kazantsev! Reviewers: sanjoy, anna, reames, apilipenko, igor-laevsky, mkuper Reviewed By: mkuper Subscribers: mkuper, llvm-commits, mzolotukhin Differential Revision: https://reviews.llvm.org/D30243 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296897 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-03 18:19:10 +00:00
Michael Kuperstein	2b21febc6b	[LoopUnroll] Enable PGO-based loop peeling by default. This enables peeling of loops with low dynamic iteration count by default, when profile information is available. Differential Revision: https://reviews.llvm.org/D27734 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295796 91177308-0d34-0410-b5e6-96231b3b80d8	2017-02-22 00:27:34 +00:00
Dehao Chen	1ae1089dec	Increases full-unroll threshold. Summary: The default threshold for fully unroll is too conservative. This patch doubles the full-unroll threshold This change will affect the following speccpu2006 benchmarks (performance numbers were collected from Intel Sandybridge): Performance: 403 0.11% 433 0.51% 445 0.48% 447 3.50% 453 1.49% 464 0.75% Code size: 403 0.56% 433 0.96% 445 2.16% 447 2.96% 453 0.94% 464 8.02% The compiler time overhead is similar with code size. Reviewers: davidxl, mkuper, mzolotukhin, hfinkel, chandlerc Reviewed By: hfinkel, chandlerc Subscribers: mehdi_amini, zzheng, efriedma, haicheng, hfinkel, llvm-commits Differential Revision: https://reviews.llvm.org/D28368 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295538 91177308-0d34-0410-b5e6-96231b3b80d8	2017-02-18 03:46:51 +00:00
Chandler Carruth	6d7e8f10e2	[PM] Simplify the new PM interface to the loop unroller and expose two factory functions for the two modes the loop unroller is actually used in in-tree: simplified full-unrolling and the entire thing including partial unrolling. I've also wired these up to nice names so you can express both of these being in a pipeline easily. This is a precursor to actually enabling these parts of the O2 pipeline. Differential Revision: https://reviews.llvm.org/D28897 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293136 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-26 02:13:50 +00:00
Michael Kuperstein	3ef6c7c651	[LoopUnroll] Properly update loopinfo for runtime unrolling by 2 Even when we don't create a remainder loop (that is, when we unroll by 2), we may duplicate nested loops into the remainder. This is complicated by the fact the remainder may itself be either inserted into an outer loop, or at the top level. In the latter case, we may need to create new top-level loops. Differential Revision: https://reviews.llvm.org/D29156 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293124 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-26 01:04:11 +00:00
Chandler Carruth	f541c74552	[PM] Teach LoopUnroll to update the LPM infrastructure as it unrolls loops. We do this by reconstructing the newly added loops after the unroll completes to avoid threading pass manager details through all the mess of the unrolling infrastructure. I've enabled some extra assertions in the LPM to try and catch issues here and enabled a bunch of unroller tests to try and make sure this is sane. Currently, I'm manually running loop-simplify when needed. That should go away once it is folded into the LPM infrastructure. Differential Revision: https://reviews.llvm.org/D28848 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293011 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-25 02:49:01 +00:00
Dehao Chen	f519872cb6	Introduce -unroll-partial-threshold to separate PartialThreshold from Threshold in loop unorller. Summary: Partial unrolling should have separate threshold with full unrolling. Reviewers: efriedma, mzolotukhin Reviewed By: efriedma, mzolotukhin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D28831 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292293 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-17 23:39:33 +00:00
Chandler Carruth	10dd00ced5	[PM] Introduce an analysis set used to preserve all analyses over a function's CFG when that CFG is unchanged. This allows transformation passes to simply claim they preserve the CFG and analysis passes to check for the CFG being preserved to remove the fanout of all analyses being listed in all passes. I've gone through and removed or cleaned up as many of the comments reminding us to do this as I could. Differential Revision: https://reviews.llvm.org/D28627 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292054 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-15 06:32:49 +00:00
Chandler Carruth	c68d25fb58	[PM] Separate the LoopAnalysisManager from the LoopPassManager and move the latter to the Transforms library. While the loop PM uses an analysis to form the IR units, the current plan is to have the PM itself establish and enforce both loop simplified form and LCSSA. This would be a layering violation in the analysis library. Fundamentally, the idea behind the loop PM is to transform loops in addition to running passes over them, so it really seemed like the most natural place to sink this was into the transforms library. We can't just move everything because we also have loop analyses that rely on a subset of the invariants. So this patch splits the the loop infrastructure into the analysis management that has to be part of the analysis library, and the transform-aware pass manager. This also required splitting the loop analyses' printer passes out to the transforms library, which makes sense to me as running these will transform the code into LCSSA in theory. I haven't split the unittest though because testing one component without the other seems nearly intractable. Differential Revision: https://reviews.llvm.org/D28452 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291662 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-11 09:43:56 +00:00
Chandler Carruth	d27a39a962	[PM] Rewrite the loop pass manager to use a worklist and augmented run arguments much like the CGSCC pass manager. This is a major redesign following the pattern establish for the CGSCC layer to support updates to the set of loops during the traversal of the loop nest and to support invalidation of analyses. An additional significant burden in the loop PM is that so many passes require access to a large number of function analyses. Manually ensuring these are cached, available, and preserved has been a long-standing burden in LLVM even with the help of the automatic scheduling in the old pass manager. And it made the new pass manager extremely unweildy. With this design, we can package the common analyses up while in a function pass and make them immediately available to all the loop passes. While in some cases this is unnecessary, I think the simplicity afforded is worth it. This does not (yet) address loop simplified form or LCSSA form, but those are the next things on my radar and I have a clear plan for them. While the patch is very large, most of it is either mechanically updating loop passes to the new API or the new testing for the loop PM. The code for it is reasonably compact. I have not yet updated all of the loop passes to correctly leverage the update mechanisms demonstrated in the unittests. I'll do that in follow-up patches along with improved FileCheck tests for those passes that ensure things work in more realistic scenarios. In many cases, there isn't much we can do with these until the loop simplified form and LCSSA form are in place. Differential Revision: https://reviews.llvm.org/D28292 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@291651 91177308-0d34-0410-b5e6-96231b3b80d8	2017-01-11 06:23:21 +00:00
Dehao Chen	ddd3bb716c	Use continuous boosting factor for complete unroll. Summary: The current loop complete unroll algorithm checks if unrolling complete will reduce the runtime by a certain percentage. If yes, it will apply a fixed boosting factor to the threshold (by discounting cost). The problem for this approach is that the threshold abruptly. This patch makes the boosting factor a function of runtime reduction percentage, capped by a fixed threshold. In this way, the threshold changes continuously. The patch also simplified the code by reducing one parameter in UP. The patch only affects code-gen of two speccpu2006 benchmark: 445.gobmk binary size decreases 0.08%, no performance change. 464.h264ref binary size increases 0.24%, no performance change. Reviewers: mzolotukhin, chandlerc Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D26989 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290737 91177308-0d34-0410-b5e6-96231b3b80d8	2016-12-30 00:50:28 +00:00
Daniel Jasper	8de3a54f07	Revert @llvm.assume with operator bundles (r289755-r289757) This creates non-linear behavior in the inliner (see more details in r289755's commit thread). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290086 91177308-0d34-0410-b5e6-96231b3b80d8	2016-12-19 08:22:17 +00:00

1 2 3 4 5 ...

258 Commits