RPCS3/llvm - llvm - Gitea: Git with a cup of tea

RPCS3/llvm

mirror of https://github.com/RPCS3/llvm.git synced 2025-05-23 05:46:05 +00:00

Author	SHA1	Message	Date
Duncan P. N. Exon Smith	d88fd1226e	Revert "code hoisting pass based on GVN" This reverts commit r274305, since it breaks self-hosting: http://lab.llvm.org:8080/green/job/clang-stage1-configure-RA_build/22349/ http://lab.llvm.org:8011/builders/clang-x86_64-linux-selfhost-modules/builds/17232 Note that the blamelist on lab.llvm.org:8011 is incorrect. The previous build was r274299, but somehow r274305 wasn't included in the blamelist: http://lab.llvm.org:8011/builders/clang-x86_64-linux-selfhost-modules git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274320 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-01 01:51:40 +00:00
Sebastian Pop	164e0c1535	code hoisting pass based on GVN This pass hoists duplicated computations in the program. The primary goal of gvn-hoist is to reduce the size of functions before inline heuristics to reduce the total cost of function inlining. Pass written by Sebastian Pop, Aditya Kumar, Xiaoyu Hu, and Brian Rzycki. Important algorithmic contributions by Daniel Berlin under the form of reviews. Differential Revision: http://reviews.llvm.org/D19338 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274305 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-01 00:24:31 +00:00
Peter Collingbourne	dba9146333	IR: New representation for CFI and virtual call optimization pass metadata. The bitset metadata currently used in LLVM has a few problems: 1. It has the wrong name. The name "bitset" refers to an implementation detail of one use of the metadata (i.e. its original use case, CFI). This makes it harder to understand, as the name makes no sense in the context of virtual call optimization. 2. It is represented using a global named metadata node, rather than being directly associated with a global. This makes it harder to manipulate the metadata when rebuilding global variables, summarise it as part of ThinLTO and drop unused metadata when associated globals are dropped. For this reason, CFI does not currently work correctly when both CFI and vcall opt are enabled, as vcall opt needs to rebuild vtable globals, and fails to associate metadata with the rebuilt globals. As I understand it, the same problem could also affect ASan, which rebuilds globals with a red zone. This patch solves both of those problems in the following way: 1. Rename the metadata to "type metadata". This new name reflects how the metadata is currently being used (i.e. to represent type information for CFI and vtable opt). The new name is reflected in the name for the associated intrinsic (llvm.type.test) and pass (LowerTypeTests). 2. Attach metadata directly to the globals that it pertains to, rather than using the "llvm.bitsets" global metadata node as we are doing now. This is done using the newly introduced capability to attach metadata to global variables (r271348 and r271358). See also: http://lists.llvm.org/pipermail/llvm-dev/2016-June/100462.html Differential Revision: http://reviews.llvm.org/D21053 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273729 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-24 21:21:32 +00:00
David Majnemer	0c4f69f653	Remove the ScalarReplAggregates pass Nearly all the changes to this pass have been done while maintaining and updating other parts of LLVM. LLVM has had another pass, SROA, which has superseded ScalarReplAggregates for quite some time. Differential Revision: http://reviews.llvm.org/D21316 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272737 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-15 00:19:09 +00:00
Manuel Jacob	b6968cd808	[PM] Schedule InstSimplify after late LICM run, to clean up LCSSA nodes. Summary: The module pass pipeline includes a late LICM run after loop unrolling. LCSSA is implicitly run as a pass dependency of LICM. However no cleanup pass was run after this, so the LCSSA nodes ended in the optimized output. Reviewers: hfinkel, mehdi_amini Subscribers: majnemer, bruno, mzolotukhin, mehdi_amini, llvm-commits Differential Revision: http://reviews.llvm.org/D20606 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271602 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-02 22:14:26 +00:00
Peter Collingbourne	03ef7f1eec	Move whole-program virtual call optimization pass after function attribute inference in LTO pipeline. As a result of D18634 we no longer infer certain attributes on linkonce_odr functions at compile time, and may only infer them at LTO time. The readnone attribute in particular is required for virtual constant propagation (part of whole-program virtual call optimization) to work correctly. This change moves the whole-program virtual call optimization pass after the function attribute inference passes, and enables the attribute inference passes at opt level 1, so that virtual constant propagation has a chance to work correctly for linkonce_odr functions. Differential Revision: http://reviews.llvm.org/D20643 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270765 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-25 21:26:14 +00:00
Xinliang David Li	17b42bdb89	Rename pass name to prepare to new PM porting /NFC git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@269586 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-15 01:04:24 +00:00
Xinliang David Li	f05d4a7577	[PM] code refactoring -- preparation for new PM porting /NFC git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268851 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-07 05:39:12 +00:00
Mehdi Amini	4176423406	Tweak the ThinLTO pass pipeline Summary: The original ThinLTO pipeline was derived from some work I did tuning FullLTO on the test suite and SPEC. This patch reduces the amount of work done in the "linker phase" of the build, and extend the function simplifications passes performed during the "compile phase". This helps the build time by reducing the IR as much as possible during the compile phase and limiting the work to be performed during the "link phase", while keeping the performance "on par" with the existing pipeline. Reviewers: tejohnson Subscribers: llvm-commits, joker.eph Differential Revision: http://reviews.llvm.org/D19773 From: Mehdi Amini <mehdi.amini@apple.com> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268769 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-06 18:17:03 +00:00
Xinliang David Li	ea2fb7d1dd	[PM] port IR based PGO prof-gen pass to new pass manager git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268710 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-06 05:49:19 +00:00
Mehdi Amini	6c3ac407d3	Move "Eliminate Available Externally" immediately after the inliner This pass is supposed to reduce the size of the IR for compile time purpose. We should run it ASAP, except when we prepare for LTO or ThinLTO, and we want to keep them available for link-time inline. Differential Revision: http://reviews.llvm.org/D19813 From: Mehdi Amini <mehdi.amini@apple.com> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268394 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-03 15:46:00 +00:00
Mehdi Amini	282acf461e	Move createReversePostOrderFunctionAttrsPass right after the inliner is done This is where it was originally, until LoopVersioningLICM was inserted before in r259986, I don't believe it was on purpose. Differential Revision: http://reviews.llvm.org/D19809 From: Mehdi Amini <mehdi.amini@apple.com> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268252 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-02 16:53:16 +00:00
Rong Xu	1564d12c82	[PGO] Promote indirect calls to conditional direct calls with value-profile This patch implements the transformation that promotes indirect calls to conditional direct calls when the indirect-call value profile meta-data is available. Differential Revision: http://reviews.llvm.org/D17864 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267815 91177308-0d34-0410-b5e6-96231b3b80d8	2016-04-27 23:20:27 +00:00
Adam Nemet	8d171c8f85	[LoopDist] Add llvm.loop.distribute.enable loop metadata Summary: D19403 adds a new pragma for loop distribution. This change adds support for the corresponding metadata that the pragma is translated to by the FE. As part of this I had to rethink the flag -enable-loop-distribute. My goal was to be backward compatible with the existing behavior: A1. pass is off by default from the optimization pipeline unless -enable-loop-distribute is specified A2. pass is on when invoked directly from opt (e.g. for unit-testing) The new pragma/metadata overrides these defaults so the new behavior is: B1. A1 + enable distribution for individual loop with the pragma/metadata B2. A2 + disable distribution for individual loop with the pragma/metadata The default value whether the pass is on or off comes from the initiator of the pass. From the PassManagerBuilder the default is off, from opt it's on. I moved -enable-loop-distribute under the pass. If the flag is specified it overrides the default from above. Then the pragma/metadata can further modifies this per loop. As a side-effect, we can now also use -enable-loop-distribute=0 from opt to emulate the default from the optimization pipeline. So to be precise this is the new behavior: C1. pass is off by default from the optimization pipeline unless -enable-loop-distribute or the pragma/metadata enables it C2. pass is on when invoked directly from opt unless -enable-loop-distribute=0 or the pragma/metadata disables it Reviewers: hfinkel Subscribers: joker.eph, mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D19431 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267672 91177308-0d34-0410-b5e6-96231b3b80d8	2016-04-27 05:28:18 +00:00
Mehdi Amini	3f9946eb0d	Run GlobalOpt before emitting the bitcode for ThinLTO This is motivated by reducing the size of the IR and thus reduce compile time. From: Mehdi Amini <mehdi.amini@apple.com> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267385 91177308-0d34-0410-b5e6-96231b3b80d8	2016-04-25 08:47:49 +00:00
Mehdi Amini	7d2a81594a	ThinLTO: Move createNameAnonFunctionPass insertion in PassManagerBuilder (NFC) It is just code motion, but makes more sense this way. From: Mehdi Amini <mehdi.amini@apple.com> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267384 91177308-0d34-0410-b5e6-96231b3b80d8	2016-04-25 08:47:37 +00:00
Xinliang David Li	6be977abce	Port InstrProfiling pass to the new pass manager Differential Revision: http://reviews.llvm.org/D18126 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@266637 91177308-0d34-0410-b5e6-96231b3b80d8	2016-04-18 17:47:38 +00:00
Justin Lebar	c904368f3d	[PM] Add a SpeculativeExecution pass for targets with divergent branches. Summary: This IR pass is helpful for GPUs, and other targets with divergent branches. It's a nop on targets without divergent branches. Reviewers: chandlerc Subscribers: llvm-commits, jingyue, rnk, joker.eph, tra Differential Revision: http://reviews.llvm.org/D18626 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@266399 91177308-0d34-0410-b5e6-96231b3b80d8	2016-04-15 00:32:12 +00:00
Mehdi Amini	28b33112e2	Add a pass to name anonymous/nameless function Summary: For correct handling of alias to nameless function, we need to be able to refer them through a GUID in the summary. Here we name them using a hash of the non-private global names in the module. Reviewers: tejohnson Subscribers: joker.eph, llvm-commits Differential Revision: http://reviews.llvm.org/D18883 From: Mehdi Amini <mehdi.amini@apple.com> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@266132 91177308-0d34-0410-b5e6-96231b3b80d8	2016-04-12 21:35:28 +00:00
Justin Lebar	3daeed4a53	[PassManager] Make PassManagerBuilder::addExtension take an std::function, rather than a function pointer. Summary: This gives callers flexibility to pass lambdas with captures, which lets callers avoid the C-style void*-ptr closure style. (Currently, callers in clang store state in the PassManagerBuilderBase arg.) No functional change, and the new API is backwards-compatible. Reviewers: chandlerc Subscribers: joker.eph, cfe-commits Differential Revision: http://reviews.llvm.org/D18613 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264918 91177308-0d34-0410-b5e6-96231b3b80d8	2016-03-30 20:39:29 +00:00
Adam Nemet	5ee7b3ce18	Turn LoopLoadElimination on again The latent bug that LLE exposed in the LoopVectorizer was resolved (PR26952). The pass can be disabled with -mllvm -enable-loop-load-elim=0 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263595 91177308-0d34-0410-b5e6-96231b3b80d8	2016-03-15 22:26:12 +00:00
Teresa Johnson	f2403fe5b5	[ThinLTO] Renaming of function index to module summary index (NFC) (Resubmitting after fixing missing file issue) With the changes in r263275, there are now more than just functions in the summary. Completed the renaming of data structures (started in r263275) to reflect the wider scope. In particular, changed the FunctionIndex* data structures to ModuleIndex*, and renamed related variables and comments. Also renamed the files to reflect the changes. A companion clang patch will immediately succeed this patch to reflect this renaming. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263513 91177308-0d34-0410-b5e6-96231b3b80d8	2016-03-15 00:04:37 +00:00
Teresa Johnson	c37b05528e	Revert "[ThinLTO] Renaming of function index to module summary index (NFC)" This reverts commit r263490. Missed a file. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263493 91177308-0d34-0410-b5e6-96231b3b80d8	2016-03-14 21:18:10 +00:00
Teresa Johnson	256128f217	[ThinLTO] Renaming of function index to module summary index (NFC) With the changes in r263275, there are now more than just functions in the summary. Completed the renaming of data structures (started in r263275) to reflect the wider scope. In particular, changed the FunctionIndex* data structures to ModuleIndex*, and renamed related variables and comments. Also renamed the files to reflect the changes. A companion clang patch will immediately succeed this patch to reflect this renaming. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263490 91177308-0d34-0410-b5e6-96231b3b80d8	2016-03-14 21:05:56 +00:00
Adam Nemet	bb458432a8	Revert "Turn LoopLoadElimination on again" This reverts commit r263472. There is an LNT failure on clang-ppc64be-linux-lnt. Turn this off, while I am investigating. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263485 91177308-0d34-0410-b5e6-96231b3b80d8	2016-03-14 20:38:55 +00:00
Adam Nemet	753ff056a7	Turn LoopLoadElimination on again The two issues that were discovered got fixed (r263058, r263173). The pass can be disabled with -mllvm -enable-loop-load-elim=0 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263472 91177308-0d34-0410-b5e6-96231b3b80d8	2016-03-14 19:40:25 +00:00
Chandler Carruth	dd15ed0335	[PM] Port GVN to the new pass manager, wire it up, and teach a couple of tests to run GVN in both modes. This is mostly the boring refactoring just like SROA and other complex transformation passes. There is some trickiness in that GVN's ValueNumber class requires hand holding to get to compile cleanly. I'm open to suggestions about a better pattern there, but I tried several before settling on this. I was trying to balance my desire to sink as much implementation detail into the source file as possible without introducing overly many layers of abstraction. Much like with SROA, the design of this system is made somewhat more cumbersome by the need to support both pass managers without duplicating the significant state and logic of the pass. The same compromise is struck here. I've also left a FIXME in a doxygen comment as the GVN pass seems to have pretty woeful documentation within it. I'd like to submit this with the FIXME and let those more deeply familiar backfill the information here now that we have a nice place in an interface to put that kind of documentaiton. Differential Revision: http://reviews.llvm.org/D18019 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263208 91177308-0d34-0410-b5e6-96231b3b80d8	2016-03-11 08:50:55 +00:00
Matthias Braun	e152c1527d	InstCombine: Restrict computeKnownBits() on all Values to OptLevel > 2 As part of r251146 InstCombine was extended to call computeKnownBits on every value in the function to determine whether it happens to be constant. This increases typical compiletime by 1-3% (5% in irgen+opt time) in my measurements. On the other hand this case did not trigger once in the whole llvm-testsuite. This patch introduces the notion of ExpensiveCombines which are only enabled for OptLevel > 2. I removed the check in InstructionSimplify as that is called from various places where the OptLevel is not known but given the rarity of the situation I think a check in InstCombine is enough. Differential Revision: http://reviews.llvm.org/D16835 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263047 91177308-0d34-0410-b5e6-96231b3b80d8	2016-03-09 18:47:11 +00:00
Adam Nemet	6b38e591a6	Revert "Enable LoopLoadElimination by default" This reverts commit r262250. It causes SPEC2006/gcc to generate wrong result (166.s) in AArch64 when running with ref data set. The error happens with "-Ofast -flto -fuse-ld=gold" or "-O3 -fno-strict-aliasing". git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262839 91177308-0d34-0410-b5e6-96231b3b80d8	2016-03-07 17:38:02 +00:00
Adam Nemet	7ff3ae62d2	Enable LoopLoadElimination by default Summary: I re-benchmarked this and results are similar to original results in D13259: On ARM64: SingleSource/Benchmarks/Polybench/linear-algebra/solvers/dynprog -59.27% SingleSource/Benchmarks/Polybench/stencils/adi -19.78% On x86: SingleSource/Benchmarks/Polybench/linear-algebra/solvers/dynprog -27.14% And of course the original ~20% gain on SPECint_2006/456.hmmer with Loop Distribution. In terms of compile time, there is ~5% increase on both SingleSource/Benchmarks/Misc/oourafft and SingleSource/Benchmarks/Linkpack/linkpack-pc. These are both very tiny loop-intensive programs where SCEV computations dominates compile time. The reason that time spent in SCEV increases has to do with the design of the old pass manager. If a transform pass does not preserve an analysis we invalidate the analysis even if there was no modification made by the transform pass. This means that currently we don't take advantage of LLE and LV sharing the same analysis (LAA) and unfortunately we recompute LAA and SCEV for LLE. (There should be a way to work around this limitation in the case of SCEV and LAA since both compute things on demand and internally cache their result. Thus we could pretend that transform passes preserve these analyses and manually invalidate them upon actual modification. On the other hand the new pass manager is supposed to solve so I am not sure if this is worthwhile.) Reviewers: hfinkel, dberlin Subscribers: dberlin, reames, mssimpso, aemerson, joker.eph, llvm-commits Differential Revision: http://reviews.llvm.org/D16300 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262250 91177308-0d34-0410-b5e6-96231b3b80d8	2016-02-29 20:35:11 +00:00
Chandler Carruth	e9afeb0bd1	[PM] Port the PostOrderFunctionAttrs pass to the new pass manager and convert one test to use this. This is a particularly significant milestone because it required a working per-function AA framework which can be queried over each function from within a CGSCC transform pass (and additionally a module analysis to be accessible). This is essentially the point of the entire pass manager rewrite. A CGSCC transform is able to query for multiple different function's analysis results. It works. The whole thing appears to actually work and accomplish the original goal. While we were able to hack function attrs and basic-aa to "work" in the old pass manager, this port doesn't use any of that, it directly leverages the new fundamental functionality. For this to work, the CGSCC framework also has to support SCC-based behavior analysis, etc. The only part of the CGSCC pass infrastructure not sorted out at this point are the updates in the face of inlining and running function passes that mutate the call graph. The changes are pretty boring and boiler-plate. Most of the work was factored into more focused preperatory patches. But this is what wires it all together. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@261203 91177308-0d34-0410-b5e6-96231b3b80d8	2016-02-18 11:03:11 +00:00
Mehdi Amini	f5146973a3	Define the ThinLTO Pipeline (experimental) Summary: On the contrary to Full LTO, ThinLTO can afford to shift compile time from the frontend to the linker: both phases are parallel (even if it is not totally "free": projects like clang are reusing product from the "compile phase" for multiple link, think about libLLVMSupport reused for opt, llc, etc.). This pipeline is based on the proposal in D13443 for full LTO. We didn't move forward on this proposal because the LTO link was far too long after that. We believe that we can afford it with ThinLTO. The ThinLTO pipeline integrates in the regular O2/O3 flow: - The compile phase perform the inliner with a somehow lighter function simplification. (TODO: tune the inliner thresholds here) This is intendend to simplify the IR and get rid of obvious things like linkonce_odr that will be inlined. - The link phase will run the pipeline from the start, extended with some specific passes that leverage the augmented knowledge we have during LTO. Especially after the inliner is done, a sequence of globalDCE/globalOpt is performed, followed by another run of the "function simplification" passes. It is not clear if this part of the pipeline will stay as is, as the split model of ThinLTO does not allow the same benefit as FullLTO without added tricks. The measurements on the public test suite as well as on our internal suite show an overall net improvement. The binary size for the clang executable is reduced by 5%. We're still tuning it with the bringup of ThinLTO and it will evolve, but this should provide a good starting point. Reviewers: tejohnson Differential Revision: http://reviews.llvm.org/D17115 From: Mehdi Amini <mehdi.amini@apple.com> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@261029 91177308-0d34-0410-b5e6-96231b3b80d8	2016-02-16 23:02:29 +00:00
Mehdi Amini	e300292233	Refactor the PassManagerBuilder: extract a "addFunctionSimplificationPasses()" (NFC) It is intended to contains the passes run over a function after the inliner is done with a function and before it moves to its callers. From: Mehdi Amini <mehdi.amini@apple.com> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@261028 91177308-0d34-0410-b5e6-96231b3b80d8	2016-02-16 22:54:27 +00:00
Mehdi Amini	3fa81fcae7	Revert "Refactor the PassManagerBuilder: extract a "addFunctionSimplificationPasses()"" This reverts commit r260603. I didn't intend to push it :( From: Mehdi Amini <mehdi.amini@apple.com> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260607 91177308-0d34-0410-b5e6-96231b3b80d8	2016-02-11 22:09:11 +00:00
Mehdi Amini	72cdd19eb0	Revert "Define the ThinLTO Pipeline" This reverts commit r260604. I didn't intend to push this now. From: Mehdi Amini <mehdi.amini@apple.com> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260606 91177308-0d34-0410-b5e6-96231b3b80d8	2016-02-11 22:09:07 +00:00
Mehdi Amini	1a87474d88	Define the ThinLTO Pipeline Summary: On the contrary to Full LTO, ThinLTO can afford to shift compile time from the frontend to the linker: both phases are parallel. This pipeline is based on the proposal in D13443 for full LTO. We ] didn't move forward on this proposal because the link was far too long after that. This patch refactor the "function simplification" passes that are part of the inliner loop in a helper function (this part is NFC and can be commited separately to simplify the diff). The ThinLTO pipeline integrates in the regular O2/O3 flow: - The compile phase perform the inliner with a somehow lighter function simplification. (TODO: tune the inliner thresholds here) This is intendend to simplify the IR and get rid of obvious things like linkonce_odr that will be inlined. - The link phase will run the pipeline from the start, extended with some specific passes that leverage the augmented knowledge we have during LTO. Especially after the inliner is done, a sequence of globalDCE/globalOpt is performed, followed by another run of the "function simplification" passes. The measurements on the public test suite as well as on our internal suite show an overall net improvement. The binary size for the clang executable is reduced by 5%. We're still tuning it with the bringup of ThinLTO but this should provide a good starting point. Reviewers: tejohnson Subscribers: joker.eph, llvm-commits, dexonsmith Differential Revision: http://reviews.llvm.org/D17115 From: Mehdi Amini <mehdi.amini@apple.com> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260604 91177308-0d34-0410-b5e6-96231b3b80d8	2016-02-11 22:00:31 +00:00
Mehdi Amini	9794058401	Refactor the PassManagerBuilder: extract a "addFunctionSimplificationPasses()" It is intended to contains the passes run over a function after the inliner is done with a function and before it moves to its callers. From: Mehdi Amini <mehdi.amini@apple.com> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260603 91177308-0d34-0410-b5e6-96231b3b80d8	2016-02-11 22:00:25 +00:00
Ashutosh Nema	30082e800b	Fixed typo in comment & coding style for LoopVersioningLICM. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260504 91177308-0d34-0410-b5e6-96231b3b80d8	2016-02-11 09:23:53 +00:00
Peter Collingbourne	40cd497a24	WholeProgramDevirt: introduce. This pass implements whole program optimization of virtual calls in cases where we know (via bitset information) that the list of callees is fixed. This includes the following: - Single implementation devirtualization: if a virtual call has a single possible callee, replace all calls with a direct call to that callee. - Virtual constant propagation: if the virtual function's return type is an integer <=64 bits and all possible callees are readnone, for each class and each list of constant arguments: evaluate the function, store the return value alongside the virtual table, and rewrite each virtual call as a load from the virtual table. - Uniform return value optimization: if the conditions for virtual constant propagation hold and each function returns the same constant value, replace each virtual call with that constant. - Unique return value optimization for i1 return values: if the conditions for virtual constant propagation hold and a single vtable's function returns 0, or a single vtable's function returns 1, replace each virtual call with a comparison of the vptr against that vtable's address. Differential Revision: http://reviews.llvm.org/D16795 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260312 91177308-0d34-0410-b5e6-96231b3b80d8	2016-02-09 22:50:34 +00:00
Ashutosh Nema	9feccf470d	New Loop Versioning LICM Pass Summary: When alias analysis is uncertain about the aliasing between any two accesses, it will return MayAlias. This uncertainty from alias analysis restricts LICM from proceeding further. In cases where alias analysis is uncertain we might use loop versioning as an alternative. Loop Versioning will create a version of the loop with aggressive aliasing assumptions in addition to the original with conservative (default) aliasing assumptions. The version of the loop making aggressive aliasing assumptions will have all the memory accesses marked as no-alias. These two versions of loop will be preceded by a memory runtime check. This runtime check consists of bound checks for all unique memory accessed in loop, and it ensures the lack of memory aliasing. The result of the runtime check determines which of the loop versions is executed: If the runtime check detects any memory aliasing, then the original loop is executed. Otherwise, the version with aggressive aliasing assumptions is used. The pass is off by default and can be enabled with command line option -enable-loop-versioning-licm. Reviewers: hfinkel, anemet, chatur01, reames Subscribers: MatzeB, grosser, joker.eph, sanjoy, javed.absar, sbaranga, llvm-commits Differential Revision: http://reviews.llvm.org/D9151 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259986 91177308-0d34-0410-b5e6-96231b3b80d8	2016-02-06 07:47:48 +00:00
Rong Xu	f588a4e23f	[PGO] Passmanagerbuilder change that enable IR level PGO instrumentation This patch includes the passmanagerbuilder change that enables IR level PGO instrumentation. It adds two passmanagerbuilder options: -profile-generate=<profile_filename> and -profile-use=<profile_filename>. The new options are primarily for debug purpose. Reviewers: davidxl, silvas Differential Revision: http://reviews.llvm.org/D15828 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258420 91177308-0d34-0410-b5e6-96231b3b80d8	2016-01-21 18:28:59 +00:00
James Molloy	994f24a703	[LTO] Add a run of LoopUnroll Loop trip counts can often be resolved during LTO. We should obviously be unrolling small loops once those trip counts have been resolved, but we weren't. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257767 91177308-0d34-0410-b5e6-96231b3b80d8	2016-01-14 15:00:09 +00:00
Chandler Carruth	e96fb9ab15	[attrs] Split the late-revisit pattern for deducing norecurse in a top-down manner into a true top-down or RPO pass over the call graph. There are specific patterns of function attributes, notably the norecurse attribute, which are most effectively propagated top-down because all they us caller information. Walk in RPO over the call graph SCCs takes the form of a module pass run immediately after the CGSCC pass managers postorder walk of the SCCs, trying again to deduce norerucrse for each singular SCC in the call graph. This removes a very legacy pass manager specific trick of using a lazy revisit list traversed during finalization of the CGSCC pass. There is no analogous finalization step in the new pass manager, and a lazy revisit list is just trying to produce an RPO iteration of the call graph. We can do that more directly if more expensively. It seems unlikely that this will be the expensive part of any compilation though as we never examine the function bodies here. Even in an LTO run over a very large module, this should be a reasonable fast set of operations over a reasonably small working set -- the function call graph itself. In the future, if this really is a compile time performance issue, we can look at building support for both post order and RPO traversals directly into a pass manager that builds and maintains the PO list of SCCs. Differential Revision: http://reviews.llvm.org/D15785 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257163 91177308-0d34-0410-b5e6-96231b3b80d8	2016-01-08 10:55:52 +00:00
Chandler Carruth	d97964253e	[attrs] Extract the pure inference of function attributes into a standalone pass. There is no call graph or even interesting analysis for this part of function attributes -- it is literally inferring attributes based on the target library identification. As such, we can do it using a much simpler module pass that just walks the declarations. This can also happen much earlier in the pass pipeline which has benefits for any number of other passes. In the process, I've cleaned up one particular aspect of the logic which was necessary in order to separate the two passes cleanly. It now counts inferred attributes independently rather than just counting all the inferred attributes as one, and the counts are more clearly explained. The two test cases we had for this code path are both ... woefully inadequate and copies of each other. I've kept the superset test and updated it. We need more testing here, but I had to pick somewhere to stop fixing everything broken I saw here. Differential Revision: http://reviews.llvm.org/D15676 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256466 91177308-0d34-0410-b5e6-96231b3b80d8	2015-12-27 08:41:34 +00:00
Chandler Carruth	6a1ce8ecd6	[attrs] Split off the forced attributes utility into its own pass that is (by default) run much earlier than FuncitonAttrs proper. This allows forcing optnone or other widely impactful attributes. It is also a bit simpler as the force attribute behavior needs no specific iteration order. I've added the pass into the default module pass pipeline and LTO pass pipeline which mirrors where function attrs itself was being run. Differential Revision: http://reviews.llvm.org/D15668 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256465 91177308-0d34-0410-b5e6-96231b3b80d8	2015-12-27 08:13:45 +00:00
Evgeniy Stepanov	053615be02	Cross-DSO control flow integrity (LLVM part). An LTO pass that generates a __cfi_check() function that validates a call based on a hash of the call-site-known type and the target pointer. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255693 91177308-0d34-0410-b5e6-96231b3b80d8	2015-12-15 23:00:08 +00:00
James Molloy	bfebefca9f	[PassManagerBuilder] Add a few more scalar optimization passes This patch does two things: 1. mem2reg is now run immediately after globalopt. Now that globalopt can localize variables more aggressively, it makes sense to lower them to SSA form earlier rather than later so they can benefit from the full set of optimization passes. 2. More scalar optimizations are run after the loop optimizations in LTO mode. The loop optimizations (especially indvars) can clean up scalar code sufficiently to make it worthwhile running more scalar passes. I've particularly added SCCP here as it isn't run anywhere else in the LTO pass pipeline. Mem2reg is super cheap and shouldn't affect compilation time at all. The rest of the added passes are in the LTO pipeline only so doesn't affect the vast majority of compilations, just the link step. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255634 91177308-0d34-0410-b5e6-96231b3b80d8	2015-12-15 09:24:01 +00:00
Teresa Johnson	5d0d98f6ec	[ThinLTO] Support for specifying function index from pass manager Summary: Add a field on the PassManagerBuilder that clang or gold can use to pass down a pointer to the function index in memory to use for importing when the ThinLTO backend is triggered. Add support to supply this to the function import pass. Reviewers: joker.eph, dexonsmith Subscribers: davidxl, llvm-commits, joker.eph Differential Revision: http://reviews.llvm.org/D15024 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254926 91177308-0d34-0410-b5e6-96231b3b80d8	2015-12-07 19:21:11 +00:00
James Molloy	683e0e5156	[LTO] Add an early run of functionattrs Because we internalize early, we can potentially mark a bunch of functions as norecurse. Do this before globalopt. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253451 91177308-0d34-0410-b5e6-96231b3b80d8	2015-11-18 11:24:42 +00:00
Adam Nemet	bd67564ff2	LLE 6/6: Add LoopLoadElimination pass Summary: The goal of this pass is to perform store-to-load forwarding across the backedge of a loop. E.g.: for (i) A[i + 1] = A[i] + B[i] => T = A[0] for (i) T = T + B[i] A[i + 1] = T The pass relies on loop dependence analysis via LoopAccessAnalisys to find opportunities of loop-carried dependences with a distance of one between a store and a load. Since it's using LoopAccessAnalysis, it was easy to also add support for versioning away may-aliasing intervening stores that would otherwise prevent this transformation. This optimization is also performed by Load-PRE in GVN without the option of multi-versioning. As was discussed with Daniel Berlin in http://reviews.llvm.org/D9548, this is inferior to a more loop-aware solution applied here. Hopefully, we will be able to remove some complexity from GVN/MemorySSA as a consequence. In the long run, we may want to extend this pass (or create a new one if there is little overlap) to also eliminate loop-indepedent redundant loads and store that require versioning due to may-aliasing intervening stores/loads. I have some motivating cases for store elimination. My plan right now is to wait for MemorySSA to come online first rather than using memdep for this. The main motiviation for this pass is the 456.hmmer loop in SPECint2006 where after distributing the original loop and vectorizing the top part, we are left with the critical path exposed in the bottom loop. Being able to promote the memory dependence into a register depedence (even though the HW does perform store-to-load fowarding as well) results in a major gain (~20%). This gain also transfers over to x86: it's around 8-10%. Right now the pass is off by default and can be enabled with -enable-loop-load-elim. On the LNT testsuite, there are two performance changes (negative number -> improvement): 1. -28% in Polybench/linear-algebra/solvers/dynprog: the length of the critical paths is reduced 2. +2% in Polybench/stencils/adi: Unfortunately, I couldn't reproduce this outside of LNT The pass is scheduled after the loop vectorizer (which is after loop distribution). The rational is to try to reuse LAA state, rather than recomputing it. The order between LV and LLE is not critical because normally LV does not touch scalar st->ld forwarding cases where vectorizing would inhibit the CPU's st->ld forwarding to kick in. LoopLoadElimination requires LAA to provide the full set of dependences (including forward dependences). LAA is known to omit loop-independent dependences in certain situations. The big comment before removeDependencesFromMultipleStores explains why this should not occur for the cases that we're interested in. Reviewers: dberlin, hfinkel Subscribers: junbuml, dberlin, mssimpso, rengolin, sanjoy, llvm-commits Differential Revision: http://reviews.llvm.org/D13259 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252017 91177308-0d34-0410-b5e6-96231b3b80d8	2015-11-03 23:50:08 +00:00

... 2 3 4 5 6 ...

355 Commits