llvm-mirror

mirror of https://github.com/RPCS3/llvm-mirror.git synced 2024-12-14 07:09:08 +00:00

Author	SHA1	Message	Date
Florian Hahn	6252c9b891	Recommit r325001: [CallSiteSplitting] Support splitting of blocks with instrs before call. For basic blocks with instructions between the beginning of the block and a call we have to duplicate the instructions before the call in all split blocks and add PHI nodes for uses of the duplicated instructions after the call. Currently, the threshold for the number of instructions before a call is quite low, to keep the impact on binary size low. Reviewers: junbuml, mcrosier, davidxl, davide Reviewed By: junbuml Differential Revision: https://reviews.llvm.org/D41860 llvm-svn: 325126	2018-02-14 13:59:12 +00:00
Florian Hahn	16c6e1004c	Revert r325001: [CallSiteSplitting] Support splitting of blocks with instrs before call. Due to memsan not being happy with the array of ValueToValue maps. llvm-svn: 325009	2018-02-13 14:48:39 +00:00
Florian Hahn	8156315329	[CallSiteSplitting] Fix new-pm test, as TargetIRAnalysis is run earlier now llvm-svn: 325002	2018-02-13 12:22:32 +00:00
Amjad Aboud	ba09d82dc0	Another try to commit 323321 (aggressive instruction combine). llvm-svn: 323416	2018-01-25 12:06:32 +00:00
Amjad Aboud	bed9def2b0	Reverted 323321. llvm-svn: 323326	2018-01-24 14:48:49 +00:00
Amjad Aboud	5a41bfbb07	[InstCombine] Introducing Aggressive Instruction Combine pass (-aggressive-instcombine). Combine expression patterns to form expressions with fewer, simple instructions. This pass does not modify the CFG. For example, this pass reduce width of expressions post-dominated by TruncInst into smaller width when applicable. It differs from instcombine pass in that it contains pattern optimization that requires higher complexity than the O(1), thus, it should run fewer times than instcombine pass. Differential Revision: https://reviews.llvm.org/D38313 llvm-svn: 323321	2018-01-24 12:42:42 +00:00
Jun Bum Lim	3e63efac67	Recommit r317351 : Add CallSiteSplitting pass This recommit r317351 after fixing a buildbot failure. Original commit message: Summary: This change add a pass which tries to split a call-site to pass more constrained arguments if its argument is predicated in the control flow so that we can expose better context to the later passes (e.g, inliner, jump threading, or IPA-CP based function cloning, etc.). As of now we support two cases : 1) If a call site is dominated by an OR condition and if any of its arguments are predicated on this OR condition, try to split the condition with more constrained arguments. For example, in the code below, we try to split the call site since we can predicate the argument (ptr) based on the OR condition. Split from : if (!ptr \|\| c) callee(ptr); to : if (!ptr) callee(null ptr) // set the known constant value else if (c) callee(nonnull ptr) // set non-null attribute in the argument 2) We can also split a call-site based on constant incoming values of a PHI For example, from : BB0: %c = icmp eq i32 %i1, %i2 br i1 %c, label %BB2, label %BB1 BB1: br label %BB2 BB2: %p = phi i32 [ 0, %BB0 ], [ 1, %BB1 ] call void @bar(i32 %p) to BB0: %c = icmp eq i32 %i1, %i2 br i1 %c, label %BB2-split0, label %BB1 BB1: br label %BB2-split1 BB2-split0: call void @bar(i32 0) br label %BB2 BB2-split1: call void @bar(i32 1) br label %BB2 BB2: %p = phi i32 [ 0, %BB2-split0 ], [ 1, %BB2-split1 ] llvm-svn: 317362	2017-11-03 20:41:16 +00:00
Jun Bum Lim	aa115afcd5	Revert "Add CallSiteSplitting pass" Revert due to Buildbot failure. This reverts commit r317351. llvm-svn: 317353	2017-11-03 19:17:11 +00:00
Jun Bum Lim	d171e076c9	Add CallSiteSplitting pass Summary: This change add a pass which tries to split a call-site to pass more constrained arguments if its argument is predicated in the control flow so that we can expose better context to the later passes (e.g, inliner, jump threading, or IPA-CP based function cloning, etc.). As of now we support two cases : 1) If a call site is dominated by an OR condition and if any of its arguments are predicated on this OR condition, try to split the condition with more constrained arguments. For example, in the code below, we try to split the call site since we can predicate the argument (ptr) based on the OR condition. Split from : if (!ptr \|\| c) callee(ptr); to : if (!ptr) callee(null ptr) // set the known constant value else if (c) callee(nonnull ptr) // set non-null attribute in the argument 2) We can also split a call-site based on constant incoming values of a PHI For example, from : BB0: %c = icmp eq i32 %i1, %i2 br i1 %c, label %BB2, label %BB1 BB1: br label %BB2 BB2: %p = phi i32 [ 0, %BB0 ], [ 1, %BB1 ] call void @bar(i32 %p) to BB0: %c = icmp eq i32 %i1, %i2 br i1 %c, label %BB2-split0, label %BB1 BB1: br label %BB2-split1 BB2-split0: call void @bar(i32 0) br label %BB2 BB2-split1: call void @bar(i32 1) br label %BB2 BB2: %p = phi i32 [ 0, %BB2-split0 ], [ 1, %BB2-split1 ] Reviewers: davidxl, huntergr, chandlerc, mcrosier, eraman, davide Reviewed By: davidxl Subscribers: sdesmalen, ashutosh.nema, fhahn, mssimpso, aemerson, mgorny, mehdi_amini, kristof.beyls, llvm-commits Differential Revision: https://reviews.llvm.org/D39137 llvm-svn: 317351	2017-11-03 19:01:57 +00:00
Matthew Simpson	05237a905d	Add CalledValuePropagation pass This patch adds a new pass for attaching !callees metadata to indirect call sites. The pass propagates values to call sites by performing an IPSCCP-like analysis using the generic sparse propagation solver. For indirect call sites having a small set of possible callees, the attached metadata indicates what those callees are. The metadata can be used to facilitate optimizations like intersecting the function attributes of the possible callees, refining the call graph, performing indirect call promotion, etc. Differential Revision: https://reviews.llvm.org/D37355 llvm-svn: 316576	2017-10-25 13:40:08 +00:00
Dehao Chen	a8aee54f27	Make ICP uses PSI to check for hotness. Summary: Currently, ICP checks the count against a fixed value to see if it is hot enough to be promoted. This does not work for SamplePGO because sampled count may be much smaller. This patch uses PSI to check if the count is hot enough to be promoted. Reviewers: davidxl, tejohnson, eraman Reviewed By: davidxl Subscribers: sanjoy, llvm-commits, mehdi_amini Differential Revision: https://reviews.llvm.org/D36341 llvm-svn: 310416	2017-08-08 20:57:33 +00:00
Adam Nemet	ccc7f67a4e	Relax the matching in these tests Looks like the template arguments are displayed differently depending on the host compiler(?). E.g.: InnerAnalysisManagerProxy<CGSCCAnalysisManager InnerAnalysisManagerProxy<llvm::AnalysisManager<llvm::LazyCallGraph::SCC, ... Fix fallout after r309294 llvm-svn: 309297	2017-07-27 17:45:02 +00:00
Adam Nemet	e1bfd295b2	[ICP] Migrate to OptimizationRemarkEmitter This is a module pass so for the old PM, we can't use ORE, the function analysis pass. Instead ORE is created on the fly. A few notes: - isPromotionLegal is folded in the caller since we want to emit the Function in the remark but we can only do that if the symbol table look-up succeeded. - There was good test coverage for remarks in this pass. - promoteIndirectCall uses ORE conditionally since it's also used from SampleProfile which does not use ORE yet. Fixes PR33792. Differential Revision: https://reviews.llvm.org/D35929 llvm-svn: 309294	2017-07-27 16:54:15 +00:00
Adam Nemet	7b7de60cec	Migrate SimplifyLibCalls to new OptimizationRemarkEmitter Summary: This changes SimplifyLibCalls to use the new OptimizationRemarkEmitter API. In fact, as SimplifyLibCalls is only ever called via InstCombine, (as far as I can tell) the OptimizationRemarkEmitter is added there, and then passed through to SimplifyLibCalls later. I have avoided changing any remark text. This closes PR33787 Patch by Sam Elliott! Reviewers: anemet, davide Reviewed By: anemet Subscribers: davide, mehdi_amini, eraman, fhahn, llvm-commits Differential Revision: https://reviews.llvm.org/D35608 llvm-svn: 309158	2017-07-26 19:03:18 +00:00
Philip Pfaffe	3bb640f8fb	[PM] Enable registration of out-of-tree passes with PassBuilder Summary: This patch adds a callback registration API to the PassBuilder, enabling registering out-of-tree passes with it. Through the Callback API, callers may register callbacks with the various stages at which passes are added into pass managers, including parsing of a pass pipeline as well as at extension points within the default -O pipelines. Registering utilities like `require<>` and `invalidate<>` needs to be handled manually by the caller, but a helper is provided. Additionally, adding passes at pipeline extension points is exposed through the opt tool. This patch adds a `-passes-ep-X` commandline option for every extension point X, which opt parses into pipelines inserted into that extension point. Reviewers: chandlerc Reviewed By: chandlerc Subscribers: lksbhm, grosser, davide, mehdi_amini, llvm-commits, mgorny Differential Revision: https://reviews.llvm.org/D33464 llvm-svn: 307532	2017-07-10 10:57:55 +00:00
Chandler Carruth	ef741a9d22	[PM/Inliner] Make the new PM's inliner process call edges across an entire SCC before iterating on newly-introduced call edges resulting from any inlined function bodies. This more closely matches the behavior of the old PM's inliner. While it wasn't really clear to me initially, this behavior is actually essential to the inliner behaving reasonably in its current design. Because the inliner is fundamentally a bottom-up inliner and all of its cost modeling is designed around that it often runs into trouble within an SCC where we don't have any meaningful bottom-up ordering to use. In addition to potentially cyclic, infinite inlining that we block with the inline history mechanism, it can also take seemingly simple call graph patterns within an SCC and turn them into insanely large functions by accidentally working top-down across the SCC without any of the threshold limitations that traditional top-down inliners use. Consider this diabolical monster.cpp file that Richard Smith came up with to help demonstrate this issue: ``` template <int N> extern const char str; void g(const char ); template <bool K, int N> void f(bool B, bool E) { if (K) g(str<N>); if (B == E) return; if (B) f<true, N + 1>(B + 1, E); else f<false, N + 1>(B + 1, E); } template <> void f<false, MAX>(bool B, bool E) { return f<false, 0>(B, E); } template <> void f<true, MAX>(bool B, bool E) { return f<true, 0>(B, E); } extern bool arr, end; void test() { f<false, 0>(arr, end); } ``` When compiled with '-DMAX=N' for various values of N, this will create an SCC with a reasonably large number of functions. Previously, the inliner would try to exhaust the inlining candidates in a single function before moving on. This, unfortunately, turns it into a top-down inliner within the SCC. Because our thresholds were never built for that, we will incrementally decide that it is always worth inlining and proceed to flatten the entire SCC into that one function. What's worse, we'll then proceed to the next function, and do the exact same thing except we'll skip the first function, and so on. And at each step, we'll also make some of the constant factors larger, which is awesome. The fix in this patch is the obvious one which makes the new PM's inliner use the same technique used by the old PM: consider all the call edges across the entire SCC before beginning to process call edges introduced by inlining. The result of this is essentially to distribute the inlining across the SCC so that every function incrementally grows toward the inline thresholds rather than allowing the inliner to grow one of the functions vastly beyond the threshold. The code for this is a bit awkward, but it works out OK. We could consider in the future doing something more powerful here such as prioritized order (via lowest cost and/or profile info) and/or a code-growth budget per SCC. However, both of those would require really substantial work both to design the system in a way that wouldn't break really useful abstraction decomposition properties of the current inliner and to be tuned across a reasonably diverse set of code and workloads. It also seems really risky in many ways. I have only found a single real-world file that triggers the bad behavior here and it is generated code that has a pretty pathological pattern. I'm not worried about the inliner not doing an awesome* job here as long as it does ok. On the other hand, the cases that will be tricky to get right in a prioritized scheme with a budget will be more common and idiomatic for at least some frontends (C++ and Rust at least). So while these approaches are still really interesting, I'm not in a huge rush to go after them. Staying even closer to the existing PM's behavior, especially when this easy to do, seems like the right short to medium term approach. I don't really have a test case that makes sense yet... I'll try to find a variant of the IR produced by the monster template metaprogram that is both small enough to be sane and large enough to clearly show when we get this wrong in the future. But I'm not confident this exists. And the behavior change here should be unobservable without snooping on debug logging. So there isn't really much to test. The test case updates come from two incidental changes: 1) We now visit functions in an SCC in the opposite order. I don't think there really is a "right" order here, so I just update the test cases. 2) We no longer compute some analyses when an SCC has no call instructions that we consider for inlining. llvm-svn: 297374	2017-03-09 11:35:40 +00:00
Chandler Carruth	d7cc3d1b4a	[PH] Replace uses of AssertingVH from members of analysis results with a lazy-asserting PoisoningVH. AssertVH is fundamentally incompatible with cache-invalidation of analysis results. The invaliadtion happens after the AssertingVH has already fired. Instead, use a PoisoningVH that will assert if the dangling handle is ever used rather than merely be assigned or destroyed. This patch also removes all of the (numerous) doomed attempts to work around this fundamental incompatibility. It is a pretty significant simplification IMO. The most interesting change is in the Inliner where we still do some clearing because we don't want to rely on the coarse grained invalidation strategy of the containing pass manager. However, I prefer the approach that contains this logic to the cleanup phase of the Inliner, and I think we could enhance the CGSCC analysis management layer to make this even better in the future if desired. The rest is straight cleanup. I've also added a test for one of the harder cases to work around: when a module analysis contains many AssertingVHes pointing at functions. Differential Revision: https://reviews.llvm.org/D29006 llvm-svn: 292928	2017-01-24 12:55:57 +00:00
Chandler Carruth	8384cc2f9e	[PM] Further fixes to the test case in r292863. This should hopefully fix the MSVC failures remaining. llvm-svn: 292887	2017-01-24 05:30:41 +00:00
Davide Italiano	e929d35db5	[PM] Try to make all three compilers happy when it comes to pretty printing. Modeled after a similar change from Michael Kuperstein. Let's hope this sticks together. llvm-svn: 292872	2017-01-24 01:45:53 +00:00
Davide Italiano	ffa8336285	[PM] Flesh out the new pass manager LTO pipeline. Differential Revision: https://reviews.llvm.org/D28996 llvm-svn: 292863	2017-01-24 00:57:39 +00:00

20 Commits