RPCSX/llvm - llvm - Gitea: Git with a cup of tea

RPCSX/llvm

mirror of https://github.com/RPCSX/llvm.git synced 2024-11-27 21:50:40 +00:00

Author	SHA1	Message	Date
Simon Pilgrim	df7181d34e	[CostModel][X86] Add missing AVX512DQ v8i64 fptosi/sitofp costs git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287760 91177308-0d34-0410-b5e6-96231b3b80d8	2016-11-23 13:42:09 +00:00
Simon Pilgrim	fca1eeaa1f	[CostModel][X86] Add v2f32 -> v2i64 fptosi/fptoui cost tests git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287756 91177308-0d34-0410-b5e6-96231b3b80d8	2016-11-23 11:43:00 +00:00
Simon Pilgrim	d93dfb18c0	[CostModel][X86] Updated sitofp/uitofp scalar/vector cost tests Better coverage of all legal types + special cases. Removed old fptoui tests which are all handled in fptoui.ll git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287678 91177308-0d34-0410-b5e6-96231b3b80d8	2016-11-22 18:55:49 +00:00
Yaxun Liu	f570b5fd16	Fix known zero bits for addrspacecast. Currently LLVM assumes that a pointer addrspacecasted to a different addr space is equivalent to trunc or zext bitwise, which is not true. For example, in amdgcn target, when a null pointer is addrspacecasted from addr space 4 to 0, its value is changed from i64 0 to i32 -1. This patch teaches LLVM not to assume known bits of addrspacecast instruction to its operand. Differential Revision: https://reviews.llvm.org/D26803 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287545 91177308-0d34-0410-b5e6-96231b3b80d8	2016-11-21 15:42:31 +00:00
Craig Topper	fdec4f508b	[AVX-512] Support FCOPYSIGN for v16f32 and v8f64 Summary: This extends FCOPYSIGN support to 512-bit vectors. I've also added tests to show what the 128-bit and 256-bit cases look like with broadcast loads. Reviewers: delena, zvi, RKSimon, spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D26791 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287298 91177308-0d34-0410-b5e6-96231b3b80d8	2016-11-18 02:25:34 +00:00
Simon Pilgrim	ad27fdae89	[CostModel][X86] Added mul costs for vXi8 vectors More realistic v16i8/v32i8/v64i8 MUL costs - we have to extend to vXi16, use PMULLW and then truncate the result git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@286838 91177308-0d34-0410-b5e6-96231b3b80d8	2016-11-14 15:54:24 +00:00
Simon Pilgrim	2e6f35ab88	[X86][AVX] Fixed v16i16/v32i8 ADD/SUB costs on AVX1 subtargets Add explicit v16i16/v32i8 ADD/SUB costs, matching the costs of v4i64/v8i32 - they were missing for some reason. This has side effects on the LV max bandwidth tests (AVX1 now prefers 128-bit vectors vs AVX2 which still prefers 256-bit) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@286832 91177308-0d34-0410-b5e6-96231b3b80d8	2016-11-14 14:45:16 +00:00
Peter Collingbourne	ca668e1942	IR: Introduce inrange attribute on getelementptr indices. If the inrange keyword is present before any index, loading from or storing to any pointer derived from the getelementptr has undefined behavior if the load or store would access memory outside of the bounds of the element selected by the index marked as inrange. This can be used, e.g. for alias analysis or to split globals at element boundaries where beneficial. As previously proposed on llvm-dev: http://lists.llvm.org/pipermail/llvm-dev/2016-July/102472.html Differential Revision: https://reviews.llvm.org/D22793 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@286514 91177308-0d34-0410-b5e6-96231b3b80d8	2016-11-10 22:34:55 +00:00
Tobias Grosser	541a4fd75d	[RegionInfo] Add three tests that include infinite loops These examples are variations that were inspired from a small subgraph taken from paper.ll which are interesting as they show certain issues with infinite loops. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@286450 91177308-0d34-0410-b5e6-96231b3b80d8	2016-11-10 13:56:19 +00:00
Andrew Kaylor	665c3d9425	[BasicAA] Teach BasicAA to handle the inaccessiblememonly and inaccessiblemem_or_argmemonly attributes Differential Revision: https://reviews.llvm.org/D26382 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@286294 91177308-0d34-0410-b5e6-96231b3b80d8	2016-11-08 21:07:42 +00:00
Simon Pilgrim	169b408a54	[VectorLegalizer] Expansion of CTLZ using CTPOP when possible This patch avoids scalarization of CTLZ by instead expanding to use CTPOP (ref: "Hacker's Delight") when the necessary operations are available. This also adds the necessary cost models for X86 SSE2 targets (the main beneficiary) to ensure vectorization only happens when its useful. Differential Revision: https://reviews.llvm.org/D25910 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@286233 91177308-0d34-0410-b5e6-96231b3b80d8	2016-11-08 14:10:28 +00:00
Chad Rosier	4438f46674	[AliasSetTracker] Make AST smarter about assume intrinsics that don't actually affect memory. Differential Revision: https://reviews.llvm.org/D26252 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@286108 91177308-0d34-0410-b5e6-96231b3b80d8	2016-11-07 14:11:45 +00:00
Alexey Bataev	7af02fcbc9	Improved cost model for FDIV and FSQRT, by Andrew Tischenko There is a bug describing poor cost model for floating point operations: Bug 29083 - [X86][SSE] Improve costs for floating point operations. This patch is the second one in series of patches dealing with cost model. Differential Revision: https://reviews.llvm.org/D25722 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285564 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-31 12:10:53 +00:00
Tom Stellard	d0b25b0041	[Loads] Fix crash in is isDereferenceableAndAlignedPointer() Summary: We were trying to add APInt values with different bit sizes after visiting an addrspacecast instruction which changed the bit width of the pointer. Reviewers: majnemer, hfinkel Subscribers: hfinkel, wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D24774 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285407 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-28 15:32:28 +00:00
Simon Pilgrim	6c0e6ef493	[X86][AVX512] Fix MUL v8i64 costs on non-AVX512DQ targets git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285329 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-27 18:32:06 +00:00
Simon Pilgrim	0de3e81c28	[X86][AVX512DQ] Improve lowering of MUL v2i64 and v4i64 With DQI but without VLX, lower v2i64 and v4i64 MUL operations with v8i64 MUL (vpmullq). Updated cost table accordingly. Differential Revision: https://reviews.llvm.org/D26011 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285304 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-27 15:27:00 +00:00
Chad Rosier	5a50eb6b58	Revert "[AliasSetTracker] Make AST smarter about intrinsics that don't actually affect memory." This reverts commit r285191. LICM appears to rely on the Alias Set Tracker hitting lifetime markers to prevent code from being moved outside of the original scope. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285227 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-26 19:18:19 +00:00
Chad Rosier	e131755b83	[AliasSetTracker] Make AST smarter about intrinsics that don't actually affect memory. Differential Revision: https://reviews.llvm.org/D25969 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285191 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-26 12:42:11 +00:00
Eli Friedman	10310d2d11	Fix regression from my recent GlobalsAA fix. There are two fixes here: one, AnalyzeUsesOfPointer can't return false until it has checked all the uses of the pointer. Two, if a global uses another global, we have to assume the address of the first global escapes. Fixes https://llvm.org/bugs/show_bug.cgi?id=30707 . Differential Revision: https://reviews.llvm.org/D25798 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285034 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-24 21:47:44 +00:00
Simon Pilgrim	27ae3e891c	[CostModel][X86] Added tests for current integer signed/unsigned remainder costs git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284940 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-23 18:35:02 +00:00
Simon Pilgrim	e8a0965304	[X86][SSE] Add SSE41/AVX1 costs for vector shifts. We were defaulting to SSE2 costs which weren't taking into account the availability of PBLENDW/PBLENDVB to improve merging of per-element shift results. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284939 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-23 16:49:04 +00:00
Simon Pilgrim	743f4719d2	[CostModel][X86] Added tests for current integer trunc costs git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284938 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-23 15:17:52 +00:00
Gerolf Hoflehner	e513820659	[BasicAA] Fix - missed alias in GEP expressions In BasicAA GEP operand values get adjusted ("wrap-around") based on the pointersize. Otherwise, in non-64b modes, AA could report false negatives. However, a wrap-around is valid only for a fully evaluated expression. It had been introduced to fix an alias problem in http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160118/326163.html. This commit restricts the wrap-around to constant gep operands only where the value is known at compile-time. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284908 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-22 02:41:39 +00:00
Li Huang	f0a5715382	[SCEV] Memoize visitMulExpr results in SCEVRewriteVisitor. Summary: When SCEVRewriteVisitor traverses the SCEV DAG, it may visit the same SCEV multiple times if this SCEV is referenced by multiple other SCEVs. This has exponential time complexity in the worst case. Memoizing the results will avoid re-visiting the same SCEV. Add a map to save the results, and override the visit function of SCEVVisitor. Now SCEVRewriteVisitor only visit each SCEV once and thus returns the same result for the same input SCEV. This patch fixes PR18606, PR18607. Reviewers: Sanjoy Das, Mehdi Amini, Michael Zolotukhin Differential Revision: https://reviews.llvm.org/D25810 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284868 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-21 20:05:21 +00:00
John Brawn	9e0c61cbeb	[LoopUnroll] Keep the loop test only on the first iteration of max-or-zero loops When we have a loop with a known upper bound on the number of iterations, and furthermore know that either the number of iterations will be either exactly that upper bound or zero, then we can fully unroll up to that upper bound keeping only the first loop test to check for the zero iteration case. Most of the work here is in plumbing this 'max-or-zero' information from the part of scalar evolution where it's detected through to loop unrolling. I've also gone for the safe default of 'false' everywhere but howManyLessThans which could probably be improved. Differential Revision: https://reviews.llvm.org/D25682 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284818 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-21 11:08:48 +00:00
Li Huang	3c09f0b786	[SCEV] Add a threshold to restrict number of mul operands to be inlined into SCEV This is to avoid inlining too many multiplication operands into a SCEV, which could take exponential time in the worst case. Reviewers: Sanjoy Das, Mehdi Amini, Michael Zolotukhin Differential Revision: https://reviews.llvm.org/D25794 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284784 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-20 21:38:39 +00:00
Simon Pilgrim	e1ac64bc87	[CostModel][X86] Fixed AVX1/AVX512 sdiv/udiv uniformconst costs for 256/512 bit integer vectors We weren't checking for uniform const costs before the general cost, resulting in very high estimates. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284755 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-20 18:00:35 +00:00
Simon Pilgrim	ca13ae18d8	[CostModel][X86] Added tests for sdiv/udiv costs for uniform const and uniform const power-of-2 Shows poor costings in AVX1/AVX512BW for certain vector types git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284748 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-20 17:16:38 +00:00
Simon Pilgrim	99edc4fc3c	[CostModel][X86] Fixed AVX1/AVX512 sdiv/udiv general costs for 256/512 bit integer vectors We weren't accounting for legal types on every subtarget, meaning that many of the costs were using defaults. We still don't correctly cost (or test) the 512-bit sdiv/udiv by uniform const cases, nor the power-of-2 cases. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284744 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-20 16:39:11 +00:00
Simon Pilgrim	7b259eb4f8	[CostModel][X86] Added tests for sdiv/udiv costs for scalar and 128/256/512 bit integer vectors Shows current bug in AVX1/AVX512BW costs for 256 bit vector types git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284723 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-20 12:34:00 +00:00
Chad Rosier	db638de2de	[AliasSetTracker] Add support for memcpy and memmove. Differential Revision: https://reviews.llvm.org/D25776 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284630 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-19 19:09:03 +00:00
John Brawn	a6faf3af51	[SCEV] More accurate calculation of max backedge count of some less-than loops In loops that look something like i = n; do { ... } while(i++ < n+k); where k is a constant, the maximum backedge count is k (in fact the backedge count will be either 0 or k, depending on whether n+k wraps). More generally for LHS < RHS if RHS-(LHS of first comparison) is a constant then the loop will iterate either 0 or that constant number of times. This allows for more loop unrolling with the recent upper bound loop unrolling changes, and I'm working on a patch that will let loop unrolling additionally make use of the loop being executed either 0 or k times (we need to retain the loop comparison only on the first unrolled iteration). Differential Revision: https://reviews.llvm.org/D25607 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284465 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-18 10:10:53 +00:00
Simon Pilgrim	e615ec6e15	[X86][SSE] Add lowering to cvttpd2dq/cvttps2dq for sitofp v2f64/2f32 to 2i32 As discussed on PR28461 we currently miss the chance to lower "fptosi <2 x double> %arg to <2 x i32>" to cvttpd2dq due to its use of illegal types. This patch adds support for fptosi to 2i32 from both 2f64 and 2f32. It also recognises that cvttpd2dq zeroes the upper 64-bits of the xmm result (similar to D23797) - we still don't do this for the cvttpd2dq/cvttps2dq intrinsics - this can be done in a future patch. Differential Revision: https://reviews.llvm.org/D23808 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284459 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-18 07:42:15 +00:00
Tobias Grosser	a00e39fd38	[SCEV] Consider delinearization pattern with extension with identity factor Summary: The delinearization algorithm did not consider terms which had an extension without a multiply factor, i.e. a identify factor. We lose cases where size is char type where there will no multiply factor. Reviewers: sanjoy, grosser Subscribers: mzolotukhin, Eugene.Zelenko, llvm-commits, mssimpso, sanjoy, grosser Differential Revision: https://reviews.llvm.org/D16492 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284378 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-17 11:56:26 +00:00
Tom Stellard	8b2bb5de9a	[ValueTracking] Fix crash in GetPointerBaseWithConstantOffset() Summary: While walking defs of pointer operands we were assuming that the pointer size would remain constant. This is not true, because addresspacecast instructions may cast the pointer to an address space with a different pointer width. This partial reverts r282612, which was a more conservative solution to this problem. Reviewers: reames, sanjoy, apilipenko Subscribers: wdng, llvm-commits Differential Revision: https://reviews.llvm.org/D24772 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283557 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-07 14:23:29 +00:00
Bjorn Pettersson	c4e191a67a	[ValueTracking] Teach computeKnownBits and ComputeNumSignBits to look through ExtractElement. Summary: The computeKnownBits and ComputeNumSignBits functions in ValueTracking can now do a simple look-through of ExtractElement. Reviewers: majnemer, spatel Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D24955 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283434 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-06 09:56:21 +00:00
Eli Friedman	aba0070ed8	Make GlobalsAA ignore dead constant expressions. Slightly improves the precision of GlobalsAA in certain situations, and makes the behavior of optimization passes more predictable. Differential Revision: https://reviews.llvm.org/D24104 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283165 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-04 00:03:55 +00:00
Sanjay Patel	7b05e5c94e	[x86, SSE/AVX] allow 128/256-bit lowering for copysign vector intrinsics (PR30433) This should fix: https://llvm.org/bugs/show_bug.cgi?id=30433 There are a couple of open questions about the codegen: 1. Should we let scalar ops be scalars and avoid vector constant loads/splats? 2. Should we have a pass to combine constants such as the inverted pair that we have here? Differential Revision: https://reviews.llvm.org/D25165 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283119 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-03 16:38:27 +00:00
Simon Pilgrim	065e5924cd	[CostModel][X86] Added tests for current fptosi/fptoui costs git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283047 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-01 19:09:59 +00:00
Simon Pilgrim	1df457ad5e	[CostModel][X86] Added fcopysign costs git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283044 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-01 16:41:52 +00:00
Simon Pilgrim	089e03e79a	[CostModel][X86] Added fabs costs git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283042 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-01 16:30:13 +00:00
Jun Bum Lim	32ed609373	Enhance calcColdCallHeuristics for InvokeInst Summary: When identifying cold blocks, consider only the edge to the normal destination if the terminator is InvokeInst and let calcInvokeHeuristics() decide edge weights for the InvokeInst. Reviewers: mcrosier, hfinkel, davidxl Subscribers: mcrosier, llvm-commits Differential Revision: https://reviews.llvm.org/D24868 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282262 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-23 17:26:14 +00:00
Simon Pilgrim	08a3649b8b	[CostModel][X86] Added scalar float op costs git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281864 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-18 21:01:20 +00:00
Sanjay Patel	58071c976f	auto-generate checks git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281756 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-16 17:54:52 +00:00
David L Kreitzer	cd01b58518	Reapplying r278731 after fixing the problem that caused it to be reverted. Enhance SCEV to compute the trip count for some loops with unknown stride. Patch by Pankaj Chawla Differential Revision: https://reviews.llvm.org/D22377 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281732 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-16 14:38:13 +00:00
Wei Mi	81a2b8958e	Create a getelementptr instead of sub expr for ValueOffsetPair if the value is a pointer. This patch is to fix PR30213. When expanding an expr based on ValueOffsetPair, if the value is of pointer type, we can only create a getelementptr instead of sub expr. Differential Revision: https://reviews.llvm.org/D24088 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281439 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-14 04:39:50 +00:00
Elena Demikhovsky	1b2a8500af	[Loop Vectorizer] Fixed memory confilict checks. Fixed a bug in run-time checks for possible memory conflicts inside loop. The bug is in Low <-> High boundaries calculation. The High boundary should be calculated as "last memory access pointer + element size". Differential revision: https://reviews.llvm.org/D23176 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279930 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-28 08:53:53 +00:00
Wei Mi	8ce317bf15	[UNROLL] Postpone ScalarEvolution::forgetLoop after TripCountSC is expanded when unroll runtime iteration loop. In llvm::UnrollRuntimeLoopRemainder, if the loop to be unrolled is the inner loop inside a loop nest, the scalar evolution needs to be dropped for its parent loop which is done by ScalarEvolution::forgetLoop. However, we can postpone forgetLoop to the end of UnrollRuntimeLoopRemainder so TripCountSC expansion can still reuse existing value. Differential Revision: https://reviews.llvm.org/D23572 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279748 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-25 16:17:18 +00:00
Evgeny Stupachenko	2af858265d	The patch improves ValueTracking on left shift with nsw flag. Summary: The patch fixes PR28946. Reviewers: majnemer, sanjoy Differential Revision: http://reviews.llvm.org/D23296 From: Li Huang git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279684 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-24 23:01:33 +00:00
Artur Pilipenko	cd61ceee32	Remove missing file from r279433 reversal git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279434 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-22 13:18:19 +00:00
Simon Pilgrim	fc26436d17	[CostModel][X86] Removed shift tests There are more thorough tests found in vshift-*-cost.ll git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279406 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-21 19:56:02 +00:00
Simon Pilgrim	26eaf4f816	[CostModel][X86] Added costs for vXi16 and vXi8 vectors for add/sub/mul/and/or/xor tests git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279405 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-21 19:44:44 +00:00
Simon Pilgrim	799c5ebb1b	[CostModel][X86] Replaced SSSE3 with SSE2 costs to create a better baseline git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279404 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-21 19:14:48 +00:00
Simon Pilgrim	4630834e7c	[CostModel][X86] Added fsqrt and fma costs git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279403 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-21 19:06:25 +00:00
Simon Pilgrim	4597dd96a3	[CostModel][X86] Split off float arithmetic cost tests git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279402 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-21 18:34:47 +00:00
Simon Pilgrim	18333ab15e	[CostModel][X86] Added sub, or, and, fadd and fsub costs and missing 512-bit mul costs git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279301 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-19 19:07:10 +00:00
Simon Pilgrim	07010a7197	[CostModel][X86] Added some AVX512 and 512-bit vector cost tests git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279291 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-19 18:24:10 +00:00
Simon Pilgrim	50fbf501e3	[CostModel][X86] Add fdiv + frem cost tests git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279283 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-19 17:39:00 +00:00
Michael Kuperstein	15095e4af8	[AliasSetTracker] Degrade AliasSetTracker when may-alias sets get too large. Repeated inserts into AliasSetTracker have quadratic behavior - inserting a pointer into AST is linear, since it requires walking over all "may" alias sets and running an alias check vs. every pointer in the set. We can avoid this by tracking the total number of pointers in "may" sets, and when that number exceeds a threshold, declare the tracker "saturated". This lumps all pointers into a single "may" set that aliases every other pointer. (This is a stop-gap solution until we migrate to MemorySSA) This fixes PR28832. Differential Revision: https://reviews.llvm.org/D23432 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279274 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-19 17:05:22 +00:00
Hans Wennborg	3a15af3d32	SCEV: Don't assert about non-SCEV-able value in isSCEVExprNeverPoison() (PR28932) Differential Revision: https://reviews.llvm.org/D23594 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278999 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-17 22:50:18 +00:00
Reid Kleckner	8f5027f8f0	Revert "Enhance SCEV to compute the trip count for some loops with unknown stride." This reverts commit r278731. It caused http://crbug.com/638314 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278853 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-16 21:02:04 +00:00
Sanjoy Das	b4ca813e88	Revert "[ValueTracking] Improve ValueTracking on left shift with nsw flag" This reverts commit r278172. It causes PR28946. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278740 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-15 21:01:31 +00:00
David L Kreitzer	1e230bc8b4	Enhance SCEV to compute the trip count for some loops with unknown stride. Patch by Pankaj Chawla Differential Revision: https://reviews.llvm.org/D22377 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278731 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-15 20:21:41 +00:00
Eli Friedman	35ff6f0857	[DSE] Don't remove stores made live by a call which unwinds. Issue exposed by noalias or more aggressive alias analysis. Fixes http://llvm.org/PR25422. Differential revision: https://reviews.llvm.org/D21007 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278451 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-12 01:09:53 +00:00
Andrew Kaylor	72626e148e	[ValueTracking] An improvement to IR ValueTracking on Non-negative Integers Patch by Li Huang Differential Revision: https://reviews.llvm.org/D18777 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278267 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-10 18:47:19 +00:00
Andrew Kaylor	6c6978f931	[ValueTracking] Improve ValueTracking on left shift with nsw flag Patch by Li Huang Differential Revison: https://reviews.llvm.org/D23296 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278172 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-09 22:41:35 +00:00
Wei Mi	fc25cfb37a	Fix the runtime error caused by "Use ValueOffsetPair to enhance value reuse during SCEV expansion". The patch is to fix the bug in PR28705. It was caused by setting wrong return value for SCEVExpander::findExistingExpansion. The return values of findExistingExpansion have different meanings when the function is used in different ways so it is easy to make mistake. The fix creates two new interfaces to replace SCEVExpander::findExistingExpansion, and specifies where each interface is expected to be used. Differential Revision: https://reviews.llvm.org/D22942 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278161 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-09 20:40:03 +00:00
Wei Mi	8b93225c07	Recommit "Use ValueOffsetPair to enhance value reuse during SCEV expansion". The fix for PR28705 will be committed consecutively. In D12090, the ExprValueMap was added to reuse existing value during SCEV expansion. However, const folding and sext/zext distribution can make the reuse still difficult. A simplified case is: suppose we know S1 expands to V1 in ExprValueMap, and S1 = S2 + C_a S3 = S2 + C_b where C_a and C_b are different SCEVConstants. Then we'd like to expand S3 as V1 - C_a + C_b instead of expanding S2 literally. It is helpful when S2 is a complex SCEV expr and S2 has no entry in ExprValueMap, which is usually caused by the fact that S3 is generated from S1 after const folding. In order to do that, we represent ExprValueMap as a mapping from SCEV to ValueOffsetPair. We will save both S1->{V1, 0} and S2->{V1, C_a} into the ExprValueMap when we create SCEV for V1. When S3 is expanded, it will first expand S2 to V1 - C_a because of S2->{V1, C_a} in the map, then expand S3 to V1 - C_a + C_b. Differential Revision: https://reviews.llvm.org/D21313 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278160 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-09 20:37:50 +00:00
Sanjoy Das	549a67c20d	[SCEV] Un-grep'ify tests; NFC git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277861 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-05 20:33:49 +00:00
Sanjoy Das	a2ef5492ed	[SCEV] Don't infinitely recurse on unreachable code git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277848 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-05 18:34:14 +00:00
Michael Kuperstein	e788186982	[LV, X86] Be more optimistic about vectorizing shifts. Shifts with a uniform but non-constant count were considered very expensive to vectorize, because the splat of the uniform count and the shift would tend to appear in different blocks. That made the splat invisible to ISel, and we'd scalarize the shift at codegen time. Since r201655, CodeGenPrepare sinks those splats to be next to their use, and we are able to select the appropriate vector shifts. This updates the cost model to to take this into account by making shifts by a uniform cheap again. Differential Revision: https://reviews.llvm.org/D23049 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277782 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-04 22:48:03 +00:00
Simon Pilgrim	2bdfdf95e6	[X86] Dropped XOP ctbits checks - they match the AVX checks git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277718 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-04 11:04:13 +00:00
Simon Pilgrim	d81c2d5aa5	[X86][SSE] Add initial costs for vector CTTZ/CTLZ git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277716 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-04 10:51:41 +00:00
George Burgess IV	4df351da7d	[CFLAA] Remove modref queries from CFLAA. As it turns out, modref queries are broken with CFLAA. Specifically, the data source we were using for determining modref behaviors explicitly ignores operations on non-pointer values. So, it wouldn't note e.g. storing an i32 to an i32* (or loading an i64 from an i64). It also ignores external function calls, rather than acting conservatively for them. (N.B. These operations, where necessary, are* tracked by CFLAA; we just use a different mechanism to do so. Said mechanism is relatively imprecise, so it's unlikely that we can provide reasonably good modref answers with it as implemented.) Patch by Jia Chen. Differential Revision: https://reviews.llvm.org/D22978 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277366 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-01 18:47:28 +00:00
Adam Nemet	b17a45cae0	[BPI] Add new LazyBPI analysis Summary: The motivation is the same as in D22141: In order to add the hotness attribute to optimization remarks we need BFI to be available in all passes that emit optimization remarks. BFI depends on BPI so unless we make this lazy as well we would still compute BPI unconditionally. The solution is to use the new LazyBPI pass in LazyBFI and only compute BPI when computation of BFI is requested by the client. I extended the laziness test using a LoopDistribute test to also cover BPI. Reviewers: hfinkel, davidxl Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D22835 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277083 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-28 23:31:12 +00:00
George Burgess IV	aa6ca74bfc	[CFLAA] Add getModRefBehavior to CFLAnders. This patch lets CFLAnders respond to mod-ref queries. It also includes a small bugfix to CFLSteens. Patch by Jia Chen. Differential Revision: https://reviews.llvm.org/D22823 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276939 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-27 23:07:07 +00:00
Hans Wennborg	df483ab5a3	Revert r276136 "Use ValueOffsetPair to enhance value reuse during SCEV expansion." It causes Clang tests to fail after Windows self-host (PR28705). (Also reverts follow-up r276139.) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276822 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-26 23:25:13 +00:00
Wei Mi	feca1dba45	Remove useless pass from the pipeline in test/Analysis/Dominators/2007-01-14-BreakCritEdges.ll. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276644 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-25 16:27:34 +00:00
Sanjoy Das	fa29211988	[SCEV] Make isImpliedCondOperandsViaRanges smarter This change lets us prove things like "{X,+,10} s< 5000" implies "{X+7,+,10} does not sign overflow" It does this by replacing replacing getConstantDifference by computeConstantDifference (which is smarter) in isImpliedCondOperandsViaRanges. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276505 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-23 00:54:36 +00:00
Wei Mi	a6c7a032fe	[PM] Port BreakCriticalEdges to the new PM. Differential Revision: https://reviews.llvm.org/D22688 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276449 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-22 18:04:25 +00:00
Wei Mi	edd8aa084e	Fix test/Analysis/ScalarEvolution/scev-expander-existing-value-offset.ll for rL276136. The content in this testcase was accidentally duplicated. Fix the error. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276139 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-20 16:54:58 +00:00
Wei Mi	fe11fa4d16	Use ValueOffsetPair to enhance value reuse during SCEV expansion. In D12090, the ExprValueMap was added to reuse existing value during SCEV expansion. However, const folding and sext/zext distribution can make the reuse still difficult. A simplified case is: suppose we know S1 expands to V1 in ExprValueMap, and S1 = S2 + C_a S3 = S2 + C_b where C_a and C_b are different SCEVConstants. Then we'd like to expand S3 as V1 - C_a + C_b instead of expanding S2 literally. It is helpful when S2 is a complex SCEV expr and S2 has no entry in ExprValueMap, which is usually caused by the fact that S3 is generated from S1 after const folding. In order to do that, we represent ExprValueMap as a mapping from SCEV to ValueOffsetPair. We will save both S1->{V1, 0} and S2->{V1, C_a} into the ExprValueMap when we create SCEV for V1. When S3 is expanded, it will first expand S2 to V1 - C_a because of S2->{V1, C_a} in the map, then expand S3 to V1 - C_a + C_b. Differential Revision: https://reviews.llvm.org/D21313 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276136 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-20 16:40:33 +00:00
Simon Pilgrim	002e4b5635	[X86][SSE] Add cost model values for CTPOP of vectors This patch adds costs for the vectorized implementations of CTPOP, the default values were seriously underestimating the cost of these and was encouraging vectorization on targets where serialized use of POPCNT would be much better. Differential Revision: https://reviews.llvm.org/D22456 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276104 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-20 10:41:28 +00:00
George Burgess IV	bfc580351a	[CFLAA] Make a test tell the truth. NFC. Dishonesty noted by Jia Chen. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276028 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-19 20:56:41 +00:00
George Burgess IV	1caa063db8	[CFLAA] Add some interproc. analysis to CFLAnders. This patch adds function summary support to CFLAnders. It also comes with a lot of tests! Woohoo! Patch by Jia Chen. Differential Revision: https://reviews.llvm.org/D22450 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276026 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-19 20:47:15 +00:00
Simon Pilgrim	08df0eb04c	[X86] Add CTPOP/CTLZ/CTTZ scalar cost tests git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275725 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-17 18:29:19 +00:00
George Burgess IV	2347de17da	[CFLAA] Add attributes handling for CFLAnders. This patch adds proper handling of stratified attributes into our anders-style CFLAA implementation. It also comes bundled with more CFLAnders tests. :) Patch by Jia Chen. Differential Revision: https://reviews.llvm.org/D22325 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275604 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-15 20:02:49 +00:00
George Burgess IV	723a3ff949	[CFLAA] Add an initial CFLAnders implementation. This adds an incomplete anders-style implementation for CFLAA. It's incomplete in that it's missing interprocedural analysis, attrs handling, etc. and that it needs more tests. More tests and features will be added in future commits. Patch by Jia Chen. Differential Revision: https://reviews.llvm.org/D22291 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275602 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-15 19:53:25 +00:00
Tom Stellard	6e4b75df4f	GlobalsAA: Functions with the argmemonly attribute won't read arbitrary globals Summary: In preparation for changing GlobalsAA to stop assuming that intrinsics can't read arbitrary globals, we need to make sure GlobalsAA is querying function attributes rather than relying on this assumption. This patch was inspired by: http://reviews.llvm.org/D20206 Reviewers: jmolloy, hfinkel Subscribers: eli.friedman, llvm-commits Differential Revision: https://reviews.llvm.org/D21318 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275433 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-14 15:50:27 +00:00
Adam Nemet	15e85ff776	[BFI] Add new LazyBFI analysis pass Summary: This is necessary for D21771. In order to add the hotness attribute to optimization remarks we need BFI to be available in all passes that emit optimization remarks. However we don't want to pay for computing BFI unless the hotness attribute is requested. This is achieved by making BFI lazy at the very high-level through a new analysis pass -- BFI is not calculated unless requested. I am adding a test to check the laziness under D21771 where the first user of the analysis is added. Reviewers: hfinkel, dexonsmith, davidxl Subscribers: davidxl, dexonsmith, llvm-commits Differential Revision: http://reviews.llvm.org/D22141 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275250 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-13 05:01:48 +00:00
Keno Fischer	c8ce9e59a8	Fix ScalarEvolutionExpander step scaling bug The expandAddRecExprLiterally function incorrectly transforms `[Start + Step * X]` into `Step * [Start + X]` instead of the correct transform of `[Step * X] + Start`. This caused https://github.com/JuliaLang/julia/issues/14704#issuecomment-174126219 due to what appeared to be sufficiently complicated loop interactions. Patch by Jameson Nash (jameson@juliacomputing.com). Reviewers: sanjoy Differential Revision: http://reviews.llvm.org/D16505 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275239 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-13 01:28:12 +00:00
Michael Kuperstein	b1fce5cc4c	[X86] Make some cast costs more precise Make some AVX and AVX512 cast costs more precise. Based on part of a patch by Elena Demikhovsky (D15604). Differential Revision: http://reviews.llvm.org/D22064 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275106 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-11 21:39:44 +00:00
Hal Finkel	0017f3683d	Teach isDereferenceablePointer to look through returned-argument functions For functions which are known to return their argument, isDereferenceableAndAlignedPointer can examine the argument value. Differential Revision: http://reviews.llvm.org/D9384 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275038 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-11 03:08:49 +00:00
Hal Finkel	837551f8d6	Teach SCEV to look through returned-argument functions When building SCEVs, if a function is known to return its argument, then we can build the SCEV using the corresponding argument value. Differential Revision: http://reviews.llvm.org/D9381 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275037 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-11 02:48:23 +00:00
Hal Finkel	4cb3366de0	BasicAA should look through functions with returned arguments Motivated by the work on the llvm.noalias intrinsic, teach BasicAA to look through returned-argument functions when answering queries. This is essential so that we don't loose all other AA information when supplementing with llvm.noalias. Differential Revision: http://reviews.llvm.org/D9383 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275035 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-11 01:32:20 +00:00
Adam Nemet	a898ac8198	[LAA] Port test to the new PM This is a follow-on to r274452. The LAA with the new PM is a loop pass so we go from inner to outer loops. Also using a CHECK-NOT didn't make much sense because we print something in either case; whether an invariant is 'found' or 'not found'. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274935 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-08 21:24:06 +00:00
Justin Bogner	cde945cbcd	NVPTX: Remove the legacy ptx intrinsics - Rename the ptx.read.* intrinsics to nvvm.read.ptx.sreg.* - some but not all of these registers were already accessible via the nvvm name. - Rename ptx.bar.sync nvvm.bar.sync, to match nvvm.bar0. There's a fair amount of code motion here, but it's all very mechanical. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274769 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-07 16:40:17 +00:00
Sean Silva	aa13ea4ca0	[PM] Avoid getResult on a higher level in LoopAccessAnalysis Note that require<domtree> and require<loops> aren't needed because they come in implicitly via the loop pass manager. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274712 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-07 01:01:53 +00:00
Sanjay Patel	368e7e3ad1	[x86] fix cost of SINT_TO_FP for i32 --> float (PR21356, PR28434) This is "cvtdq2ps" which does not appear to be particularly slow on any CPU according to Agner's tables. Choosing "5" as a cost here as suggested in: https://llvm.org/bugs/show_bug.cgi?id=21356 ...but it seems very conservative given that the instruction is fully pipelined, and I think these costs are supposed to model throughput. Note that related costs are also most likely too high, but this fixes PR21356 and partly fixes PR28434. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274658 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-06 19:15:54 +00:00
Michael Kuperstein	c7432f9ad3	[TTI] The cost model should not assume vector casts get completely scalarized The cost model should not assume vector casts get completely scalarized, since on targets that have vector support, the common case is a partial split up to the legal vector size. So, when a vector cast gets split, the resulting casts end up legal and cheap. Instead of pessimistically assuming scalarization, base TTI can use the costs the concrete TTI provides for the split vector, plus a fudge factor to account for the cost of the split itself. This fudge factor is currently 1 by default, except on AMDGPU where inserts and extracts are considered free. Differential Revision: http://reviews.llvm.org/D21251 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274642 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-06 17:30:56 +00:00
George Burgess IV	3b5b98a488	[CFLAA] Split into Anders+Steens analysis. StratifiedSets (as implemented) is very fast, but its accuracy is also limited. If we take a more aggressive andersens-like approach, we can be way more accurate, but we'll also end up being slower. So, we've decided to split CFLAA into CFLSteensAA and CFLAndersAA. Long-term, we want to end up in a place where CFLSteens is queried first; if it can provide an answer, great (since queries are basically map lookups). Otherwise, we'll fall back to CFLAnders, BasicAA, etc. This patch splits everything out so we can try to do something like that when we get a reasonable CFLAnders implementation. Patch by Jia Chen. Differential Revision: http://reviews.llvm.org/D21910 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274589 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-06 00:26:41 +00:00
Nemanja Ivanovic	ba988eb430	[PowerPC] - Legalize vector types by widening instead of integer promotion This patch corresponds to review: http://reviews.llvm.org/D20443 It changes the legalization strategy for illegal vector types from integer promotion to widening. This only applies for vectors with elements of width that is a multiple of a byte since we have hardware support for vectors with 1, 2, 3, 8 and 16 byte elements. Integer promotion for vectors is quite expensive on PPC due to the sequence of breaking apart the vector, extending the elements and reconstituting the vector. Two of these operations are expensive. This patch causes between minor and major improvements in performance on most benchmarks. There are very few benchmarks whose performance regresses. These regressions can be handled in a subsequent patch with a DAG combine (similar to how this patch handles int -> fp conversions of illegal vector types). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274535 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-05 09:22:29 +00:00
Nicolai Haehnle	b07f540456	Add writeonly IR attribute Summary: This complements the earlier addition of IntrWriteMem and IntrWriteArgMem LLVM intrinsic properties, see D18291. Also start using the attribute for memset, memcpy, and memmove intrinsics, and remove their special-casing in BasicAliasAnalysis. Reviewers: reames, joker.eph Subscribers: joker.eph, llvm-commits Differential Revision: http://reviews.llvm.org/D18714 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274485 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-04 08:01:29 +00:00
Xinliang David Li	10b22c8894	[PM] Port LoopAccessInfo analysis to new PM It is implemented as a LoopAnalysis pass as discussed and agreed upon. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274452 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-02 21:18:40 +00:00
Sanjoy Das	b4b9bfafe7	[SCEV] Compute max be count from shift operator only if all else fails In particular, check to see if we can compute a precise trip count by exhaustively simulating the loop first. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274199 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-30 02:47:28 +00:00
George Burgess IV	2e8079ddd8	[CFLAA] Add support for ModRef queries. This patch makes CFLAA answer some ModRef queries. Because we don't distinguish between reading/writing when making StratifiedSets, we're unable to offer any of the readonly-related answers. Patch by Jia Chen. Differential Revision: http://reviews.llvm.org/D21858 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274197 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-30 02:11:26 +00:00
Artur Pilipenko	48917c9e44	Support arbitrary addrspace pointers in masked load/store intrinsics This is a resubmittion of 263158 change after fixing the existing problem with intrinsics mangling (see LTO and intrinsics mangling llvm-dev thread for details). This patch fixes the problem which occurs when loop-vectorize tries to use @llvm.masked.load/store intrinsic for a non-default addrspace pointer. It fails with "Calling a function with a bad signature!" assertion in CallInst constructor because it tries to pass a non-default addrspace pointer to the pointer argument which has default addrspace. The fix is to add pointer type as another overloaded type to @llvm.masked.load/store intrinsics. Reviewed By: reames Differential Revision: http://reviews.llvm.org/D17270 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274043 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-28 18:27:25 +00:00
Artur Pilipenko	be0da39a48	Revert -r273892 "Support arbitrary addrspace pointers in masked load/store intrinsics" since some of the clang tests don't expect to see the updated signatures. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273895 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-27 16:54:33 +00:00
Artur Pilipenko	9227558e8e	Support arbitrary addrspace pointers in masked load/store intrinsics This is a resubmittion of 263158 change after fixing the existing problem with intrinsics mangling (see LTO and intrinsics mangling llvm-dev thread for details). This patch fixes the problem which occurs when loop-vectorize tries to use @llvm.masked.load/store intrinsic for a non-default addrspace pointer. It fails with "Calling a function with a bad signature!" assertion in CallInst constructor because it tries to pass a non-default addrspace pointer to the pointer argument which has default addrspace. The fix is to add pointer type as another overloaded type to @llvm.masked.load/store intrinsics. Reviewed By: reames Differential Revision: http://reviews.llvm.org/D17270 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273892 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-27 16:29:26 +00:00
George Burgess IV	7161bf8a49	[CFLAA] Propagate StratifiedAttrs in interproc. analysis. This patch also has a refactor that kills StratifiedAttr, and leaves us with StratifiedAttrs, because having both was mildly redundant. This patch makes us correctly handle stratified attributes when doing interprocedural analysis. It also adds another attribute, AttrCaller, which acts like AttrUnknown. We can filter out AttrCaller values when during interprocedural analysis, since the caller should have information about what arguments it's passing to its callee. Patch by Jia Chen. Differential Revision: http://reviews.llvm.org/D21645 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273636 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-24 01:00:03 +00:00
George Burgess IV	39639d90df	[CFLAA] Use better interprocedural function summaries. Previously, we just unified any arguments that seemed to be related to each other. With this patch, we now respect dereference levels, etc. which should make us substantially more accurate. Proper handling of StratifiedAttrs will be done in a later patch. Patch by Jia Chen. Differential Revision: http://reviews.llvm.org/D21536 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273596 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-23 18:55:23 +00:00
Michael Kuperstein	4e93d1c1e6	[X86] Make arithmetic operations cost model test saner. NFC. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273316 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-21 20:41:40 +00:00
George Burgess IV	bc51bb9f20	[CFLAA] Be more aggressive with interprocedural analysis. This patch makes us perform interprocedural analysis on functions that don't have internal linkage. It also removes a test that should've been deleted in an earlier commit (since other tests now cover everything that the newly-removed test covers). Patch by Jia Chen. Differential Revision: http://reviews.llvm.org/D21513 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273229 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-21 01:42:47 +00:00
George Burgess IV	4ac871846d	[CFLAA] Add interprocedural function summaries. This patch adds function summaries, so that we don't need to recompute various properties about function parameters/return values at each callsite of a function. It also adds many interprocedural tests for CFLAA. Patch by Jia Chen. Differential Revision: http://reviews.llvm.org/D21475#inline-182390 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273219 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-20 23:10:56 +00:00
Simon Pilgrim	06026c4ca3	[X86][SSE] Add cost model for BSWAP of vectors The BSWAP of vector types is quite efficiently implemented using vector shuffles on SSE/AVX targets, we should reflect the typical cost of this to encourage vectorization. Differential Revision: http://reviews.llvm.org/D21521 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273217 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-20 23:08:21 +00:00
Sanjoy Das	bac5341e94	[SCEV] Fix incorrect trip count computation The way we elide max expressions when computing trip counts is incorrect -- it breaks cases like this: ``` static int wrapping_add(int a, int b) { return (int)((unsigned)a + (unsigned)b); } void test() { volatile int end_buf = 2147483548; // INT_MIN - 100 int end = end_buf; unsigned counter = 0; for (int start = wrapping_add(end, 200); start < end; start++) counter++; print(counter); } ``` Note: the `NoWrap` variable that was being tested has little to do with the values flowing into the max expression; it is a property of the induction variable. test/Transforms/LoopUnroll/nsw-tripcount.ll was added to solely test functionality I'm reverting in this change, so I've deleted the test fully. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273079 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-18 04:38:31 +00:00
George Burgess IV	e686c89077	[CFLAA] Tag arguments as escaped instead of unknown. This patch also includes some refactoring. Prior to this patch, we tagged all CFLAA attributes as unknown. This is suboptimal, since it meant that any Value used as an argument would be considered to alias any other Value that existed. Now that we have the machinery to tag sets below the set for an arbitrary value with attributes, it's okay to be less conservative with arguments. (Specifically, we still tag the set under an argument with unknown). Patch by Jia Chen. Differential Revision: http://reviews.llvm.org/D21262 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272690 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-14 18:12:28 +00:00
Simon Pilgrim	07ef6bb26b	[CostModel][X86][SSE] Updated costs for vector BITREVERSE ops on SSSE3+ targets To account for the fast PSHUFB implementation now available git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272484 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-11 19:23:02 +00:00
Michael Kuperstein	e47deb88ee	[X86] Add costs for SSE zext/sext to v4i64 to TTI The costs are somewhat hand-wavy, but should be much closer to the truth than what we get from BasicTTI. Differential Revision: http://reviews.llvm.org/D21156 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272406 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-10 17:01:05 +00:00
George Burgess IV	bec2df684f	[CFLAA] Handle global/arg attrs more sanely. Prior to this patch, we used argument/global stratified attributes in order to note that a value could have come from either dereferencing a global/arg, or from the assignment from a global/arg. Now, AttrUnknown is placed on sets when we see a dereference, instead of the global/arg attributes. This allows us to be more aggressive in the future when we see global/arg attributes without AttrUnknown. Patch by Jia Chen. Differential Revision: http://reviews.llvm.org/D21110 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272335 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-09 23:15:04 +00:00
Sanjoy Das	05c5b3fd8e	Be wary of abnormal exits from loop when exploiting UB We can safely rely on a NoWrap add recurrence causing UB down the road only if we know the loop does not have a exit expressed in a way that is opaque to ScalarEvolution (e.g. by a function call that conditionally calls exit(0)). I believe with this change PR28012 is fixed. Note: I had to change some llvm-lit tests in LoopReroll, since it looks like they were depending on this incorrect behavior. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272237 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-09 01:13:59 +00:00
Sanjoy Das	27dabd2db6	[SCEV] Track no-abnormal-exits instead of no-throw calls Absence of may-unwind calls is not enough to guarantee that a UB-generating use of an add-rec poison in the loop latch will actually cause UB. We also need to guard against calls that terminate the thread or infinite loop themselves. This partially addresses PR28012. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272181 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-08 17:48:42 +00:00
Sanjoy Das	39ede080e2	Teach isGuarantdToTransferExecToSuccessor about debug info intrinsics Calls to `@llvm.dbg.*` can be assumed to terminate. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272180 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-08 17:48:36 +00:00
Sanjoy Das	e118351c7e	Fix a bug in SCEV's poison value propagation The worklist algorithm introduced in rL271151 didn't check to see if the direct users of the post-inc add recurrence propagates poison. This change fixes the problem and makes the code structure more obvious. Note for release managers: correctness wise, this bug wasn't a regression introduced by rL271151 -- the behavior of SCEV around post-inc add recurrences was strictly improved (in terms of correctness) in rL271151. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272179 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-08 17:48:31 +00:00
George Burgess IV	90fe2c51ce	[CFLAA] Add AttrEscaped, remove bit twiddling functions. This patch does a few things: - Unifies AttrAll and AttrUnknown (since they were used for more or less the same purpose anyway). - Introduces AttrEscaped, an attribute that notes that a value escapes our analysis for a given set, but not that an unknown value flows into said set. - Removes functions that take bit indices, since we also had functions that took bitsets, and the use of both (with similar names) was unclear and bug-prone. Patch by Jia Chen. Differential Revision: http://reviews.llvm.org/D21000 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272040 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-07 18:35:37 +00:00
Andrey Turetskiy	a87b055656	[LAA] Improve non-wrapping pointer detection by handling loop-invariant case. This fixes PR26314. This patch adds new helper “isNoWrap” with detection of loop-invariant pointer case. Patch by Roman Shirokiy. Ref: https://llvm.org/bugs/show_bug.cgi?id=26314 Differential Revision: http://reviews.llvm.org/D17268 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272014 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-07 14:55:27 +00:00
Easwaran Raman	7ef349f4f7	Reapply r271728 after adding move cobstructor for ProfileSummaryInfo git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271745 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-03 22:54:26 +00:00
Easwaran Raman	b920b27660	Revert r271728 as it breaks Windows build git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271738 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-03 21:14:26 +00:00
Easwaran Raman	f22c11f6fb	Analysis pass to access profile summary info Differential Revision: http://reviews.llvm.org/D20648 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271728 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-03 20:37:19 +00:00
Daniel Berlin	aed422107d	Revert "Claim NoAlias if two GEPs index different fields of the same struct" This reverts commit 2d5d6493f43eb68493a3852b8c226ac9fafdc7eb. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271422 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-01 18:55:32 +00:00
George Burgess IV	dbfc82f334	[CFLAA] Recognize builtin allocation functions. This patch extends CFLAA to recognize allocation functions such as malloc, free, etc, so we can treat them more aggressively. Patch by Jia Chen. Differential Revision: http://reviews.llvm.org/D20776 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271421 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-01 18:39:54 +00:00
Daniel Berlin	7e49342f2d	Claim NoAlias if two GEPs index different fields of the same struct Patch by Taewook Oh Summary: Patch for Bug 27478. Make BasicAliasAnalysis claims NoAlias if two GEPs index different fields of the same structure. Reviewers: hfinkel, dberlin Subscribers: dberlin, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D20665 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271415 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-01 18:12:01 +00:00
Sanjoy Das	07e4e5556a	Reduce dependence on pointee types when deducing dereferenceability Summary: Change some of the internal interfaces in Loads.cpp to keep track of the number of bytes we're trying to prove dereferenceable using an explicit `Size` parameter. Before this, the `Size` parameter was implicitly inferred from the pointee type of the pointer whose dereferenceability we were trying to prove, causing us to be conservative around bitcasts. This was unfortunate since bitcast instructions are no-ops and should never break optimizations. With an explicit `Size` parameter, we're more precise (as shown in the test cases), and the code is simpler. We should eventually move towards a `DerefQuery` struct that groups together a base pointer, an offset, a size and an alignment; but this patch is a first step. Reviewers: apilipenko, dblaikie, hfinkel, reames Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D20764 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271406 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-01 16:47:45 +00:00
George Burgess IV	31047306fc	[CFLAA] Don't link GEP pointers to GEP indices. Code like the following is considered broken, and doesn't need to be supported by our AA magicks: void getFoo(int P) { int PAlias = (int )((char )NULL + (uintptr_t)P); } This patch makes CFLAA drop support for code like this. Patch by Jia Chen. Differential Revision: http://reviews.llvm.org/D20775 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271322 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-31 19:55:05 +00:00
Sanjoy Das	09cfc1ebb1	[SCEV] See through op.with.overflow intrinsics (re-apply) Summary: This change teaches SCEV to see reduce `(extractvalue 0 (op.with.overflow X Y))` into `op X Y` (with a no-wrap tag if possible). This was first checked in at r265912 but reverted in r265950 because it exposed some issues around how SCEV handled post-inc add recurrences. Those issues have now been fixed. Reviewers: atrick, regehr Subscribers: mcrosier, mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D18684 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271152 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-29 00:34:42 +00:00
Sanjoy Das	11ed8edc19	[SCEV] Don't always add no-wrap flags to post-inc add recs Fixes PR27315. The post-inc version of an add recurrence needs to "follow the same rules" as a normal add or subtract expression. Otherwise we miscompile programs like ``` int main() { int a = 0; unsigned a_u = 0; volatile long last_value; do { a_u += 3; last_value = (long) ((int) a_u); if (will_add_overflow(a, 3)) { // Leave, and don't actually do the increment, so no UB. printf("last_value = %ld\n", last_value); exit(0); } a += 3; } while (a != 46); return 0; } ``` This patch changes SCEV to put no-wrap flags on post-inc add recurrences only when the poison from a potential overflow will go ahead to cause undefined behavior. To avoid regressing performance too much, I've assumed infinite loops without side effects is undefined behavior to prove poison<->UB equivalence in more cases. This isn't ideal, but is not new to LLVM as a whole, and far better than the situation I'm trying to fix. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271151 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-29 00:32:17 +00:00
Sanjoy Das	2f6c3f7e20	[ValueTracking] ICmp instructions propagate poison This is a stripped down version of D19211, leaving out the questionable "branching in poison is UB" bit. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271150 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-29 00:31:18 +00:00
Michael Kuperstein	b80d8daba8	[BasicAA] Extend inbound GEP negative offset logic to GlobalVariables r270777 improved the precision of alloca vs. inbounbds GEP alias queries: if we have (a) an inbounds GEP and (b) a pointer based on an alloca, and the beginning of the object the GEP points to would have a negative offset with respect to the alloca, then the GEP can not alias pointer (b). This makes the same logic fire when (b) is based on a GlobalVariable instead of an alloca. Differential Revision: http://reviews.llvm.org/D20652 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270893 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-26 19:30:49 +00:00
Michael Kuperstein	92631b8db7	[BasicAA] Improve precision of alloca vs. inbounds GEP alias queries If a we have (a) a GEP and (b) a pointer based on an alloca, and the beginning of the object the GEP points would have a negative offset with repsect to the alloca, then the GEP can not alias pointer (b). For example, consider code like: struct { int f0, int f1, ...} foo; ... foo alloca; foo random = bar(alloca); int f0 = &alloca.f0 int f1 = &random->f1; Which is lowered, approximately, to: %alloca = alloca %struct.foo %random = call %struct.foo @random(%struct.foo* %alloca) %f0 = getelementptr inbounds %struct, %struct.foo* %alloca, i32 0, i32 0 %f1 = getelementptr inbounds %struct, %struct.foo* %random, i32 0, i32 1 Assume %f1 and %f0 alias. Then %f1 would point into the object allocated by %alloca. Since the %f1 GEP is inbounds, that means %random must also point into the same object. But since %f0 points to the beginning of %alloca, the highest %f1 can be is (%alloca + 3). This means %random can not be higher than (%alloca - 1), and so is not inbounds, a contradiction. Differential Revision: http://reviews.llvm.org/D20495 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270777 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-25 22:23:08 +00:00
Oleg Ranevskyy	d380dee4ce	[SCEV] No-wrap flags are not propagated when folding "{S,+,X}+T ==> {S+T,+,X}" Summary: Description This makes `WidenIV::widenIVUse` (IndVarSimplify.cpp) fail to widen narrow IV uses in some cases. The latter affects IndVarSimplify which may not eliminate narrow IV's when there actually exists such a possibility, thereby producing ineffective code. When `WidenIV::widenIVUse` gets a NarrowUse such as `{(-2 + %inc.lcssa),+,1}<nsw><%for.body3>`, it first tries to get a wide recurrence for it via the `getWideRecurrence` call. `getWideRecurrence` returns recurrence like this: `{(sext i32 (-2 + %inc.lcssa) to i64),+,1}<nsw><%for.body3>`. Then a wide use operation is generated by `cloneIVUser`. The generated wide use is evaluated to `{(-2 + (sext i32 %inc.lcssa to i64))<nsw>,+,1}<nsw><%for.body3>`, which is different from the `getWideRecurrence` result. `cloneIVUser` sees the difference and returns nullptr. This patch also fixes the broken LLVM tests by adding missing <nsw> entries introduced by the correction. Minimal reproducer: ``` int foo(int a, int b, int c); int baz(); void bar() { int arr[20]; int i = 0; for (i = 0; i < 4; ++i) arr[i] = baz(); for (; i < 20; ++i) arr[i] = foo(arr[i - 4], arr[i - 3], arr[i - 2]); } ``` Clang command line: ``` clang++ -mllvm -debug -S -emit-llvm -O3 --target=aarch64-linux-elf test.cpp -o test.ir ``` Expected result: The ` -mllvm -debug` log shows that all the IV's for the second `for` loop have been eliminated. Reviewers: sanjoy Subscribers: atrick, asl, aemerson, mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D20058 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270695 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-25 13:01:33 +00:00
Simon Pilgrim	611c8e7533	[CostModel][X86][XOP] Added XOP costmodel for BITREVERSE Now that we have a nice fast VPPERM solution. Added framework for future intrinsic costs as well. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270537 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-24 08:17:50 +00:00
Matthew Simpson	204e3200ad	[LAA] Check independence of strided accesses before forward case This patch changes the order in which we attempt to prove the independence of strided accesses. We previously did this after we knew the dependence distance was positive. With this change, we check for independence before handling the negative distance case. The patch prevents LAA from reporting forward dependences for independent strided accesses. This change was requested in the review of D19984. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270072 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-19 15:37:19 +00:00
Sanjoy Das	2848e3cd95	[SCEV] Be more aggressive in proving NUW ... for AddRec's in loops for which SCEV is unable to compute a max tripcount. This is the NUW variant of r269211 and fixes PR27691. (Note: PR27691 is not a correct or stability bug, it was created to track a pending task). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@269790 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-17 17:51:14 +00:00
Simon Pilgrim	7537c45fbd	[CostModel][X86] Tidied up checks git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@269770 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-17 14:43:41 +00:00
Simon Pilgrim	78ba7287ef	[CostModel][X86] Added scalar bitreverse tests git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@269594 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-15 17:40:48 +00:00
Adam Nemet	27aef5052b	[LAA] Include MaxSafeDepDistBytes in the analysis print-out git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@269508 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-13 22:49:13 +00:00
Sanjoy Das	2731c7a952	[SCEVExpander] Fix a failed cast<> assertion SCEVExpander::replaceCongruentIVs assumes the backedge value of an SCEV-analysable PHI to always be an instruction, when this is not necessarily true. For now address this by bailing out of the optimization if the backedge value of the PHI is a non-Instruction. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@269213 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-11 17:41:41 +00:00
Sanjoy Das	5bbe5abe5e	[SCEVExpander] Don't break SSA in replaceCongruentIVs `SCEVExpander::replaceCongruentIVs` bypasses `hoistIVInc` if both the original and the isomorphic increments are PHI nodes. Doing this can break SSA if the isomorphic increment is not dominated by the original increment. Get rid of the bypass, and let `hoistIVInc` do the right thing. Fixes PR27232 (compile time crash/hang). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@269212 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-11 17:41:34 +00:00
Sanjoy Das	836f35c484	[SCEV] Be more aggressive around proving no-wrap ... for AddRec's in loops for which SCEV is unable to compute a max tripcount. This is not a problem for "normal" loops[0] that don't have guards or assumes, but helps in cases where we have guards or assumes in the loop that can be used to constrain incoming values over the backedge. This partially fixes PR27691 (we still don't handle the NUW case). [0]: for "normal" loops, in the cases where we'd be able to prove no-wrap via isKnownPredicate, we'd also be able to compute a max tripcount. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@269211 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-11 17:41:26 +00:00
Vedant Kumar	0a0124375c	[BasicAA] Compare GEP indices based on value (Fix PR27418) Equivalent GEP indices with different types are treated as different indices altogether, leading to an incorrect AA result. Fix the issue by comparing indices based on their values. Thanks to Mikael Holmén for reporting the issue! Differential Revision: http://reviews.llvm.org/D19935 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@269197 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-11 15:45:43 +00:00
Sanjoy Das	bbd902f57e	[BasicAA] Guard intrinsics don't write to memory Summary: The idea is very close to what we do for assume intrinsics: we mark the guard intrinsics as writing to arbitrary memory to maintain control dependence, but under the covers we teach AA that they do not mod any particular memory location. Reviewers: chandlerc, hfinkel, gbiv, reames Subscribers: george.burgess.iv, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D19575 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@269007 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-10 02:35:41 +00:00
Sanjoy Das	cbda428c36	[SCEV] Use guards to prove predicates We can use calls to @llvm.experimental.guard to prove predicates, relying on the fact that in all locations domianted by a call to @llvm.experimental.guard the predicate it is guarding is known to be true. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268997 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-10 00:31:49 +00:00
Simon Pilgrim	15a59473b3	[X86][SSE] Improve cost model for i64 vector comparisons on pre-SSE42 targets As discussed on PR24888, until SSE42 we don't have access to PCMPGTQ for v2i64 comparisons, but the cost models don't reflect this, resulting in over-optimistic vectorizaton. This patch adds SSE2 'base level' costs that match what a typical target is capable of and only reduces the v2i64 costs at SSE42. Technically SSE41 provides a PCMPEQQ v2i64 equality test, but as getCmpSelInstrCost doesn't give us a way to discriminate between comparison test types we can't easily make use of this, otherwise we could split the cost of integer equality and greater-than tests to give better costings of each. Differential Revision: http://reviews.llvm.org/D20057 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268972 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-09 21:14:38 +00:00
Matt Arsenault	51485ffe89	DivergenceAnalysis: Fix crash with no return blocks The post dominator tree does not have a root node in this case. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268933 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-09 16:57:08 +00:00
Simon Pilgrim	ba60f16656	[CostModel][X86] Extended comparison instruction cost model tests to include SSE2/SSE3/SSSE3/SSE41/SSE42 targets git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268877 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-08 15:24:53 +00:00
Simon Pilgrim	a3ea594334	[CostModel][X86] Split BSWAP/BITREVERSE cost tests from CTPOP/CTLZ/CTTZ 'bit count' cost tests git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268859 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-07 16:34:16 +00:00
Simon Pilgrim	c67a48457d	[CostModel][X86] Tweak 'SSE2-only' test CPU as it was only disabling SSE41 not SSE3/SSSE3 etc. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268763 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-06 17:50:07 +00:00
Simon Pilgrim	556c69a0f5	[CostModel][X86] Added ctlz/cttz undef-zero costmodel tests git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268761 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-06 17:48:35 +00:00
Simon Pilgrim	179257b158	[CostModel][X86] Added costmodel tests for vector ctpop/ctlz/cttz/bitreverse/bswap git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268738 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-06 14:38:14 +00:00
Xinliang David Li	da713ca23f	[PM] port Branch Frequency Analaysis pass to new PM git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268687 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-05 21:13:27 +00:00
Xinliang David Li	454f7fa94d	[PM] Port Branch Probability Analysis pass to the new pass manager. Differential Revision: http://reviews.llvm.org/D19839 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268601 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-05 02:59:57 +00:00
Sanjoy Das	b78000e38b	[SCEV] Tweak the output format and content of -analyze In the "LoopDispositions:" section: - Instead of printing out a list, print out a "dictionary" to make it obvious by inspection which disposition is for which loop. This is just a cosmetic change. - Print dispositions for parent _and_ sibling loops. I will use this to write a test case. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268405 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-03 17:49:57 +00:00
Nicolai Haehnle	10ea516563	AMDGPU: llvm.SI.fs.constant is a source of divergence Summary: This intrinsic is used to get flat-shaded fragment shader inputs. Those are uniform across a primitive, but a fragment shader wave may process pixels from multiple primitives (as indicated by the prim_mask), and so that's where divergence can arise. Reviewers: arsenm, tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D19747 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268259 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-02 17:37:01 +00:00
Sanjoy Das	17d8569b6e	[SCEV] When printing via -analysis, dump loop disposition There are currently some bugs in tree around SCEV caching an incorrect loop disposition. Printing out loop dispositions will let us write whitebox tests as those are fixed. The dispositions are printed as a list in "inside out" order, i.e. innermost loop first. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268177 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-01 04:51:05 +00:00
Matt Arsenault	e8448abead	DivergenceAnalysis: Fix crash with unreachable blocks Unreachable blocks may not be in the dominator tree, so don't crash on them. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268001 91177308-0d34-0410-b5e6-96231b3b80d8	2016-04-29 06:17:47 +00:00
Artur Pilipenko	5f2b68d895	Use DL preferred alignment for alloca in Value::getPointerAlignment Teach Value::getPointerAlignment that allocas with no explicit alignment are aligned to preferred alignment of the allocated type. Reviewed By: hfinkel Differential Revision: http://reviews.llvm.org/D17569 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267689 91177308-0d34-0410-b5e6-96231b3b80d8	2016-04-27 10:42:29 +00:00
Silviu Baranga	5a397274ef	[SCEV] Improve the run-time checking of the NoWrap predicate Summary: This implements a new method of run-time checking the NoWrap SCEV predicates, which should be easier to optimize and nicer for targets that don't correctly handle multiplication/addition of large integer types (like i128). If the AddRec is {a,+,b} and the backedge taken count is c, the idea is to check that \|b\| * c doesn't have unsigned overflow, and depending on the sign of b, that: a + \|b\| * c >= a (b >= 0) or a - \|b\| * c <= a (b <= 0) where the comparisons above are signed or unsigned, depending on the flag that we're checking. The advantage of doing this is that we avoid extending to a larger type and we avoid the multiplication of large types (multiplying i128 can be expensive). Reviewers: sanjoy Subscribers: llvm-commits, mzolotukhin Differential Revision: http://reviews.llvm.org/D19266 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267389 91177308-0d34-0410-b5e6-96231b3b80d8	2016-04-25 09:27:16 +00:00
Sanjoy Das	c8ce43d192	Have isKnownNotFullPoison be smarter around control flow Summary: (... while still not using a PostDomTree) The way we use isKnownNotFullPoison from SCEV today, the new CFG walking logic will not trigger for any realistic cases -- it will kick in only for situations where we could have merged the contiguous basic blocks anyway[0], since the poison generating instruction dominates all of its non-PHI uses (which are the only uses we consider right now). However, having this change in place will allow a later bugfix to break fewer llvm-lit tests. [0]: i.e. cases where block A branches to block B and B is A's only successor and A is B's only predecessor. Reviewers: broune, bjarke.roune Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D19212 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267175 91177308-0d34-0410-b5e6-96231b3b80d8	2016-04-22 17:41:06 +00:00
Ashutosh Nema	91d7ac06b6	[X86]: Changing cost for “TRUNCATE v16i32 to v16i8” in SSE4.1 mode. Summary: rL256194 transforms truncations between vectors of integers into PACKUS/PACKSS operations during DAG combine. This generates better code for truncate, so cost of truncate needs to be changed but looks like it got changed only in SSE2 table Whereas this change is also applicable for SSE4.1, so the cost of truncate needs to be changed for that as well. Cost of “TRUNCATE v16i32 to v16i8” & “TRUNCATE v16i16 to v16i8” should be same in SSE4.1 & SSE2 table. Removing their cost from SSE4.1, so it will fall back to SSE2. Reviewers: Simon Pilgrim git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267123 91177308-0d34-0410-b5e6-96231b3b80d8	2016-04-22 08:34:05 +00:00
Michael Kuperstein	3445cc7801	Port DemandedBits to the new pass manager. Differential Revision: http://reviews.llvm.org/D18679 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@266699 91177308-0d34-0410-b5e6-96231b3b80d8	2016-04-18 23:55:01 +00:00
Sanjoy Das	7f8cd41782	[BPI] Consider deoptimize calls as "unreachable" Summary: Calls to @llvm.experimental.deoptimize are expected to "never execute", so optimize them as such. Reviewers: chandlerc Subscribers: junbuml, mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D19095 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@266654 91177308-0d34-0410-b5e6-96231b3b80d8	2016-04-18 19:01:28 +00:00
Nicolai Haehnle	a16ecd4300	[DivergenceAnalysis] Treat PHI with incoming undef as constant Summary: If a PHI has an incoming undef, we can pretend that it is equal to one non-undef, non-self incoming value. This is particularly relevant in combination with the StructurizeCFG pass, which introduces PHI nodes with undefs. Previously, this lead to branch conditions that were uniform before StructurizeCFG to become non-uniform afterwards, which confused the SIAnnotateControlFlow pass. This fixes a crash when Mesa radeonsi compiles a shader from dEQP-GLES3.functional.shaders.switch.switch_in_for_loop_dynamic_vertex Reviewers: arsenm, tstellarAMD, jingyue Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D19013 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@266347 91177308-0d34-0410-b5e6-96231b3b80d8	2016-04-14 17:42:47 +00:00
Silviu Baranga	5cb657f8dd	[SCEV][LAA] Add tests for SCEV expression transformations performed during LAA Summary: Add a print method to Predicated Scalar Evolution which prints all interesting transformations done by PSE. Loop Access Analysis will now print this as part of the analysis output. We now use this to check the exact expression transformations that were done by PSE in LAA. The additional checking also acts as white-box testing for the getAsAddRec method. Reviewers: anemet, sanjoy Subscribers: sanjoy, mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D18792 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@266334 91177308-0d34-0410-b5e6-96231b3b80d8	2016-04-14 16:08:45 +00:00
Adam Nemet	cf0a711bff	Revert "Support arbitrary addrspace pointers in masked load/store intrinsics" This reverts commit r266086. It breaks the LTO build of gcc in SPEC2000. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@266282 91177308-0d34-0410-b5e6-96231b3b80d8	2016-04-14 08:47:17 +00:00
Matt Arsenault	d5a8ffbb92	AMDGPU: Remove leftover ShaderType attributes in tests git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@266155 91177308-0d34-0410-b5e6-96231b3b80d8	2016-04-13 00:39:48 +00:00
Artur Pilipenko	80ce67004b	Support arbitrary addrspace pointers in masked load/store intrinsics This is a resubmittion of 263158 change. This patch fixes the problem which occurs when loop-vectorize tries to use @llvm.masked.load/store intrinsic for a non-default addrspace pointer. It fails with "Calling a function with a bad signature!" assertion in CallInst constructor because it tries to pass a non-default addrspace pointer to the pointer argument which has default addrspace. The fix is to add pointer type as another overloaded type to @llvm.masked.load/store intrinsics. Reviewed By: reames Differential Revision: http://reviews.llvm.org/D17270 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@266086 91177308-0d34-0410-b5e6-96231b3b80d8	2016-04-12 15:58:04 +00:00
Sanjoy Das	5e07ce6898	This reverts commit r265913 and r265912 See PR27315 r265913: "[IndVars] Eliminate op.with.overflow when possible" r265912: "[SCEV] See through op.with.overflow intrinsics" git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265950 91177308-0d34-0410-b5e6-96231b3b80d8	2016-04-11 15:26:18 +00:00
Sanjoy Das	97ad447f43	[SCEV] See through op.with.overflow intrinsics Summary: This change teaches SCEV to see reduce `(extractvalue 0 (op.with.overflow X Y))` into `op X Y` (with a no-wrap tag if possible). Reviewers: atrick, regehr Subscribers: mcrosier, mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D18684 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265912 91177308-0d34-0410-b5e6-96231b3b80d8	2016-04-10 22:50:26 +00:00
Silviu Baranga	d8cc816f81	Re-commit [SCEV] Introduce a guarded backedge taken count and use it in LAA and LV This re-commits r265535 which was reverted in r265541 because it broke the windows bots. The problem was that we had a PointerIntPair which took a pointer to a struct allocated with new. The problem was that new doesn't provide sufficient alignment guarantees. This pattern was already present before r265535 and it just happened to work. To fix this, we now separate the PointerToIntPair from the ExitNotTakenInfo struct into a pointer and a bool. Original commit message: Summary: When the backedge taken codition is computed from an icmp, SCEV can deduce the backedge taken count only if one of the sides of the icmp is an AddRecExpr. However, due to sign/zero extensions, we sometimes end up with something that is not an AddRecExpr. However, we can use SCEV predicates to produce a 'guarded' expression. This change adds a method to SCEV to get this expression, and the SCEV predicate associated with it. In HowManyGreaterThans and HowManyLessThans we will now add a SCEV predicate associated with the guarded backedge taken count when the analyzed SCEV expression is not an AddRecExpr. Note that we only do this as an alternative to returning a 'CouldNotCompute'. We use new feature in Loop Access Analysis and LoopVectorize to analyze and transform more loops. Reviewers: anemet, mzolotukhin, hfinkel, sanjoy Subscribers: flyingforyou, mcrosier, atrick, mssimpso, sanjoy, mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D17201 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265786 91177308-0d34-0410-b5e6-96231b3b80d8	2016-04-08 14:29:09 +00:00
Sanjoy Das	c9e3e3cbfd	Don't IPO over functions that can be de-refined Summary: Fixes PR26774. If you're aware of the issue, feel free to skip the "Motivation" section and jump directly to "This patch". Motivation: I define "refinement" as discarding behaviors from a program that the optimizer has license to discard. So transforming: ``` void f(unsigned x) { unsigned t = 5 / x; (void)t; } ``` to ``` void f(unsigned x) { } ``` is refinement, since the behavior went from "if x == 0 then undefined else nothing" to "nothing" (the optimizer has license to discard undefined behavior). Refinement is a fundamental aspect of many mid-level optimizations done by LLVM. For instance, transforming `x == (x + 1)` to `false` also involves refinement since the expression's value went from "if x is `undef` then { `true` or `false` } else { `false` }" to "`false`" (by definition, the optimizer has license to fold `undef` to any non-`undef` value). Unfortunately, refinement implies that the optimizer cannot assume that the implementation of a function it can see has all of the behavior an unoptimized or a differently optimized version of the same function can have. This is a problem for functions with comdat linkage, where a function can be replaced by an unoptimized or a differently optimized version of the same source level function. For instance, FunctionAttrs cannot assume a comdat function is actually `readnone` even if it does not have any loads or stores in it; since there may have been loads and stores in the "original function" that were refined out in the currently visible variant, and at the link step the linker may in fact choose an implementation with a load or a store. As an example, consider a function that does two atomic loads from the same memory location, and writes to memory only if the two values are not equal. The optimizer is allowed to refine this function by first CSE'ing the two loads, and the folding the comparision to always report that the two values are equal. Such a refined variant will look like it is `readonly`. However, the unoptimized version of the function can still write to memory (since the two loads //can// result in different values), and selecting the unoptimized version at link time will retroactively invalidate transforms we may have done under the assumption that the function does not write to memory. Note: this is not just a problem with atomics or with linking differently optimized object files. See PR26774 for more realistic examples that involved neither. This patch: This change introduces a new set of linkage types, predicated as `GlobalValue::mayBeDerefined` that returns true if the linkage type allows a function to be replaced by a differently optimized variant at link time. It then changes a set of IPO passes to bail out if they see such a function. Reviewers: chandlerc, hfinkel, dexonsmith, joker.eph, rnk Subscribers: mcrosier, llvm-commits Differential Revision: http://reviews.llvm.org/D18634 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265762 91177308-0d34-0410-b5e6-96231b3b80d8	2016-04-08 00:48:30 +00:00
Nicolai Haehnle	ea7a0c0467	AMDGPU: Add a shader calling convention This makes it possible to distinguish between mesa shaders and other kernels even in the presence of compute shaders. Patch By: Bas Nieuwenhuizen <bas@basnieuwenhuizen.nl> Differential Revision: http://reviews.llvm.org/D18559 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265589 91177308-0d34-0410-b5e6-96231b3b80d8	2016-04-06 19:40:20 +00:00
Silviu Baranga	89e8236bfb	Revert r265535 until we know how we can fix the bots git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265541 91177308-0d34-0410-b5e6-96231b3b80d8	2016-04-06 14:06:32 +00:00
Silviu Baranga	39fbde60e1	[SCEV] Introduce a guarded backedge taken count and use it in LAA and LV Summary: When the backedge taken codition is computed from an icmp, SCEV can deduce the backedge taken count only if one of the sides of the icmp is an AddRecExpr. However, due to sign/zero extensions, we sometimes end up with something that is not an AddRecExpr. However, we can use SCEV predicates to produce a 'guarded' expression. This change adds a method to SCEV to get this expression, and the SCEV predicate associated with it. In HowManyGreaterThans and HowManyLessThans we will now add a SCEV predicate associated with the guarded backedge taken count when the analyzed SCEV expression is not an AddRecExpr. Note that we only do this as an alternative to returning a 'CouldNotCompute'. We use new feature in Loop Access Analysis and LoopVectorize to analyze and transform more loops. Reviewers: anemet, mzolotukhin, hfinkel, sanjoy Subscribers: flyingforyou, mcrosier, atrick, mssimpso, sanjoy, mzolotukhin, llvm-commits Differential Revision: http://reviews.llvm.org/D17201 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265535 91177308-0d34-0410-b5e6-96231b3b80d8	2016-04-06 13:18:26 +00:00
George Burgess IV	a8ddf55110	[CFLAA] Fix PR27213; incorrect tagging of args/globals Prior to this patch, CFLAA wouldn't tag arguments/globals properly if it didn't find any "interesting" edges on them. This means that, if all you do is store constants to a global or argument, we would never actually treat it as a global/argument. Test case: define void @foo(i32* %A, i32* %B) #0 { entry: store i32 0, i32* %A, align 4 store i32 0, i32* %B, align 4 ret void } CFLAA would say that %A can't alias %B, because neither pointer was used in an interesting way. This patch makes us note whether something is an argument, global, ... regardless of how interesting CFLAA thinks its uses are. (For the record, using a value in an interesting way means loading from it, using it in a GEP, ...) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265474 91177308-0d34-0410-b5e6-96231b3b80d8	2016-04-05 21:40:45 +00:00
Brendon Cahoon	3188534401	[DependenceAnalysis] Check if result of getConstantPart is null A seg-fault occurs due to a reference of a null pointer, which is the value returned by getConstantPart. This function returns null if the constant part is not found. The code that calls this function needs to check for the null return value. Differential Revision: http://reviews.llvm.org/D18718 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265319 91177308-0d34-0410-b5e6-96231b3b80d8	2016-04-04 18:13:18 +00:00
Benjamin Kramer	638cd0356d	[TTI] Let the cost model estimate ctpop costs based on legality PPC has a vector popcount, this lets the vectorizer use the correct cost for it. Tweak X86 test to use an intrinsic that's actually scalarized (we have a somewhat efficient lowering for vector popcount using SSE, the cost model finds that now). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265005 91177308-0d34-0410-b5e6-96231b3b80d8	2016-03-31 10:42:40 +00:00
Matt Arsenault	bb366db643	AMDGPU: Cost model for basic integer operations This resolves bug 21148 by preventing promotion to i64 induction variables. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264376 91177308-0d34-0410-b5e6-96231b3b80d8	2016-03-25 01:16:40 +00:00
Matt Arsenault	359a7d918e	AMDGPU: Partially implement getArithmeticInstrCost for FP ops git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264374 91177308-0d34-0410-b5e6-96231b3b80d8	2016-03-25 01:00:32 +00:00
Matt Arsenault	e4e369ab90	TTI: Report 0 cost for free addrspacecasts git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264369 91177308-0d34-0410-b5e6-96231b3b80d8	2016-03-25 00:26:29 +00:00
Matt Arsenault	93e0b28a0e	TTI: Use 0 for cost of fabs if free Ideally this would also happen for fneg, but that isn't a distinct operation in the IR. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264368 91177308-0d34-0410-b5e6-96231b3b80d8	2016-03-25 00:26:22 +00:00
Matt Arsenault	42792ff39d	AMDGPU: TTI: Make insertelement free. We don't want to have a cost to scalarizing operations. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264364 91177308-0d34-0410-b5e6-96231b3b80d8	2016-03-25 00:14:11 +00:00
Adam Nemet	7cacf39c01	[LAA] Support memchecks involving loop-invariant addresses We used to only allow SCEVAddRecExpr for pointer expressions in order to be able to compute the bounds. However this is also trivially possible for loop-invariant addresses (scUnknown) since then the bounds are the address itself. Interestingly, we used allow this for the special case when the loop-invariant address happens to also be an SCEVAddRecExpr (in an outer loop). There are a couple more loops that are vectorized in SPEC after this. My guess is that the main reason we don't see more because for example a loop-invariant load is vectorized into a splat vector with several vector-inserts. This is likely to make the vectorization unprofitable. I.e. we don't notice that a later LICM will move all of this out of the loop so the cost estimate should really be 0. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264243 91177308-0d34-0410-b5e6-96231b3b80d8	2016-03-24 04:28:47 +00:00
Matthias Braun	a31e891389	Revert "Support arbitrary addrspace pointers in masked load/store intrinsics" This commit broke LTO builds. Reverting it to unbreak the bots while the issue is investigated. See also: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160321/341002.html This reverts r263158 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264088 91177308-0d34-0410-b5e6-96231b3b80d8	2016-03-22 20:24:34 +00:00
Nicolai Haehnle	e9f50a4929	AMDGPU/SI: Add llvm.amdgcn.buffer.atomic.* intrinsics Summary: These intrinsics expose the BUFFER_ATOMIC_* instructions and will be used by Mesa to implement atomics with buffer semantics. The intrinsic interface matches that of buffer.load.format and buffer.store.format, except that the GLC bit is not exposed (it is automatically deduced based on whether the return value is used). The change of hasSideEffects is required for TableGen to accept the pattern that matches the intrinsic. Reviewers: tstellarAMD, arsenm Subscribers: arsenm, rivanvx, llvm-commits Differential Revision: http://reviews.llvm.org/D18151 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263791 91177308-0d34-0410-b5e6-96231b3b80d8	2016-03-18 16:24:31 +00:00
Nicolai Haehnle	d3939c80f8	AMDGPU: mark atomic instructions as sources of divergence Summary: As explained by the comment, threads will typically see different values returned by atomic instructions even if the arguments are equal. Reviewers: arsenm, tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18156 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263719 91177308-0d34-0410-b5e6-96231b3b80d8	2016-03-17 16:21:59 +00:00
Nicolai Haehnle	555e806a6f	AMDGPU: mark llvm.amdgcn.image.atomic.* as a source of divergence Summary: When multiple threads perform an atomic op with the same arguments, they will usually see different return values. Reviewers: arsenm, tstellarAMD Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D18101 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263440 91177308-0d34-0410-b5e6-96231b3b80d8	2016-03-14 15:37:18 +00:00
Chandler Carruth	f51faf0abd	[PM/AA] Teach the AAManager how to handle module analyses in addition to function analyses, and use it to wire up globals-aa to the new pass manager. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263211 91177308-0d34-0410-b5e6-96231b3b80d8	2016-03-11 09:15:11 +00:00
Artur Pilipenko	980df33d17	Support arbitrary addrspace pointers in masked load/store intrinsics This patch fixes the problem which occurs when loop-vectorize tries to use @llvm.masked.load/store intrinsic for a non-default addrspace pointer. It fails with "Calling a function with a bad signature!" assertion in CallInst constructor because it tries to pass a non-default addrspace pointer to the pointer argument which has default addrspace. The fix is to add pointer type as another overloaded type to @llvm.masked.load/store intrinsics. Reviewed By: reames Differential Revision: http://reviews.llvm.org/D17270 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263158 91177308-0d34-0410-b5e6-96231b3b80d8	2016-03-10 20:39:22 +00:00
Chandler Carruth	f1b2c7b945	[CG] Add a new pass manager printer pass for the old call graph and actually finish wiring up the old call graph. There were bugs in the old call graph that hadn't been caught because it wasn't being tested. It wasn't being tested because it wasn't in the pipeline system and we didn't have a printing pass to run in tests. This fixes all of that. As for why I'm still keeping the old call graph alive its so that I can port GlobalsAA to the new pass manager with out forking it to work with the lazy call graph. That's clearly the right eventual design, but it seems pragmatic to defer that until its necessary. The old call graph works just fine for GlobalsAA. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263104 91177308-0d34-0410-b5e6-96231b3b80d8	2016-03-10 11:24:11 +00:00
Chandler Carruth	c9fd7655a0	[LCG] Spell the printing pass pipeline name for the lazy call graph 'lcg' instead of just 'cg'. This makes it consistent with the analysis name of 'lcg'. No functionality changed. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263103 91177308-0d34-0410-b5e6-96231b3b80d8	2016-03-10 11:24:06 +00:00

... 2 3 4 5 6 ...

1387 Commits