RPCSX/llvm - llvm - Gitea: Git with a cup of tea

RPCSX/llvm

mirror of https://github.com/RPCSX/llvm.git synced 2024-12-13 23:18:51 +00:00

Author	SHA1	Message	Date
Dario Domizioli	a054f10ffe	Test commit. Added blank line. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208298 91177308-0d34-0410-b5e6-96231b3b80d8	2014-05-08 11:28:14 +00:00
Hal Finkel	f35ce2376c	Move late partial-unrolling thresholds into the processor definitions The old method used by X86TTI to determine partial-unrolling thresholds was messy (because it worked by testing target features), and also would not correctly identify the target CPU if certain target features were disabled. After some discussions on IRC with Chandler et al., it was decided that the processor scheduling models were the right containers for this information (because it is often tied to special uop dispatch-buffer sizes). This does represent a small functionality change: - For generic x86-64 (which uses the SB model and, thus, will get some unrolling). - For AMD cores (because they still currently use the SB scheduling model) - For Haswell (based on benchmarking by Louis Gerbarg, it was decided to bump the default threshold to 50; we're working on a test case for this). Otherwise, nothing has changed for any other targets. The logic, however, has been moved into BasicTTI, so other targets may now also opt-in to this functionality simply by setting LoopMicroOpBufferSize in their processor model definitions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208289 91177308-0d34-0410-b5e6-96231b3b80d8	2014-05-08 09:14:44 +00:00
Duncan P. N. Exon Smith	76c17d324c	IR: Don't allow non-default visibility on local linkage Visibilities of `hidden` and `protected` are meaningless for symbols with local linkage. - Change the assembler to reject non-default visibility on symbols with local linkage. - Change the bitcode reader to auto-upgrade `hidden` and `protected` to `default` when the linkage is local. - Update LangRef. <rdar://problem/16141113> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208263 91177308-0d34-0410-b5e6-96231b3b80d8	2014-05-07 22:57:20 +00:00
Michael Zolotukhin	355e0a6460	[InstCombine] Add optimization of redundant insertvalue instructions. rdar://problem/11861387 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208214 91177308-0d34-0410-b5e6-96231b3b80d8	2014-05-07 14:30:18 +00:00
Nick Lewycky	05da4dd998	Improve 'tail' call marking in TRE. A bootstrap of clang goes from 375k calls marked tail in the IR to 470k, however this improvement does not carry into an improvement of the call/jmp ratio on x86. The most common pattern is a tail call + br to a block with nothing but a 'ret'. The number of tail call to loop conversions remains the same (1618 by my count). The new algorithm does a local scan over the use-def chains to identify local "alloca-derived" values, as well as points where the alloca could escape. Then, a visit over the CFG marks blocks as being before or after the allocas have escaped, and annotates the calls accordingly. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208017 91177308-0d34-0410-b5e6-96231b3b80d8	2014-05-05 23:59:03 +00:00
Michael Zolotukhin	830195e5fb	Move test from r207969 to another folder and rename it. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207984 91177308-0d34-0410-b5e6-96231b3b80d8	2014-05-05 18:10:15 +00:00
Yi Jiang	606660f1f3	Always set alignment of vectorized LD/ST in SLP-Vectorizer. <rdar://problem/16812145> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207983 91177308-0d34-0410-b5e6-96231b3b80d8	2014-05-05 17:59:14 +00:00
Duncan P. N. Exon Smith	bbd9c21e07	LTO: -internalize sets visibility to default Visibility is meaningless when the linkage is local. Change `-internalize` to reset the visibility to `default`. <rdar://problem/16141113> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207979 91177308-0d34-0410-b5e6-96231b3b80d8	2014-05-05 17:40:44 +00:00
Michael Zolotukhin	1c87e2a3a8	Fix test from r207966 and add a comment there. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207969 91177308-0d34-0410-b5e6-96231b3b80d8	2014-05-05 14:46:53 +00:00
Michael Zolotukhin	07db1c9b30	Add regression test for r207692. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207966 91177308-0d34-0410-b5e6-96231b3b80d8	2014-05-05 14:05:25 +00:00
Benjamin Kramer	99b03e3401	LoopUnroll: If we're doing partial unrolling, use the PartialThreshold to limit unrolling. Otherwise we use the same threshold as for complete unrolling, which is way too high. This made us unroll any loop smaller than 150 instructions by 8 times, but only if someone specified -march=core2 or better, which happens to be the default on darwin. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207940 91177308-0d34-0410-b5e6-96231b3b80d8	2014-05-04 19:12:38 +00:00
Arnold Schwaighofer	28a739b4dc	SLPVectorizer: Bring back the insertelement patch (r205965) with fixes When can't assume a vectorized tree is rooted in an instruction. The IRBuilder could have constant folded it. When we rebuild the build_vector (the series of InsertElement instructions) use the last original InsertElement instruction. The vectorized tree root is guaranteed to be before it. Also, we can't assume that the n-th InsertElement inserts the n-th element into a vector. This reverts r207746 which reverted the revert of the revert of r205018 or so. Fixes the test case in PR19621. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207939 91177308-0d34-0410-b5e6-96231b3b80d8	2014-05-04 17:10:15 +00:00
Karthik Bhat	486ad6262e	Vectorize intrinsic math function calls in SLPVectorizer. This patch adds support to recognize and vectorize intrinsic math functions in SLPVectorizer. Review: http://reviews.llvm.org/D3560 and http://reviews.llvm.org/D3559 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207901 91177308-0d34-0410-b5e6-96231b3b80d8	2014-05-03 09:59:54 +00:00
Adam Nemet	3aa9b4911c	[LSR] Add llc testcase for r207271/r207569. See PR19608 for the details but to summarize it was easy to modify the .ll file to get the desired def-use ordering. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207887 91177308-0d34-0410-b5e6-96231b3b80d8	2014-05-02 23:49:01 +00:00
Nico Weber	3a5b1043d0	Teach GlobalDCE how to remove empty global_ctor entries. This moves most of GlobalOpt's constructor optimization code out of GlobalOpt into Transforms/Utils/CDtorUtils.{h,cpp}. The public interface is a single function OptimizeGlobalCtorsList() that takes a predicate returning which constructors to remove. GlobalOpt calls this with a function that statically evaluates all constructors, just like it did before. This part of the change is behavior-preserving. Also add a call to this from GlobalDCE with a filter that removes global constructors that contain a "ret" instruction and nothing else – this fixes PR19590. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207856 91177308-0d34-0410-b5e6-96231b3b80d8	2014-05-02 18:35:25 +00:00
Akira Hatanaka	d753e830cd	[GVN] Pass the phi-translated address of a load instead of the untranslated address to AnalyzeLoadFromClobberingLoad. This fixes a bug in load-PRE where PRE is applied to a load that is not partially redundant. <rdar://problem/16638765>. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207853 91177308-0d34-0410-b5e6-96231b3b80d8	2014-05-02 17:59:17 +00:00
Nick Lewycky	26ad3eb69d	Fold strlen(expr ? "str1" : "str2") to x ? len1 : len2. This fires about 330 times in a bootstrap of clang. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207828 91177308-0d34-0410-b5e6-96231b3b80d8	2014-05-02 04:11:45 +00:00
Eli Bendersky	167a57ca45	Add an optimization that does CSE in a group of similar GEPs. This optimization merges the common part of a group of GEPs, so we can compute each pointer address by adding a simple offset to the common part. The optimization is currently only enabled for the NVPTX backend, where it has a large payoff on some benchmarks. Review: http://reviews.llvm.org/D3462 Patch by Jingyue Wu. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207783 91177308-0d34-0410-b5e6-96231b3b80d8	2014-05-01 18:38:36 +00:00
Chandler Carruth	06ace3c869	Revert r205965, which essentially reverts r205018 for the second time. =[ Turns out that this was the root cause of PR19621. We found a crasher only recently (likely due to improvements elsewhere in the SLP vectorizer) but the reduced test case failed all the way back to here. I've confirmed that reverting this patch both fixes the reduced test case in PR19621 and the actual source file that led to it, so it seems to really be rooted here. I've replied to the commit thread with discussion of my (feeble) attempts to debug this. Didn't make it very far, so reverting now that we have a good test case so that things can get back to healthy while the debugging carries on. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207746 91177308-0d34-0410-b5e6-96231b3b80d8	2014-05-01 11:24:11 +00:00
Michael Zolotukhin	c80b103a2b	[X86] Never hoist the shift value of a shift instruction. There is no need to check if we want to hoist the immediate value of an shift instruction. Simply return TCC_Free right away. This change is like r206101, but for X86. rdar://problem/16190769 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207692 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-30 19:17:32 +00:00
Carlo Kok	78ecea93a3	[IPO/MergeFunctions] changes so it doesn't try to bitcast a struct return type but instead recreates it with insert/extract value. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207679 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-30 17:53:04 +00:00
David Majnemer	bf741d2bdd	IR: Conservatively verify inalloca arguments Summary: Try to spot obvious mismatches with inalloca use. Reviewers: rnk Subscribers: llvm-commits Differential Revision: http://reviews.llvm.org/D3572 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207676 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-30 17:22:00 +00:00
Rafael Espindola	2259a26a5d	Also handle ConstantAggregateZero when optimizing vpermilvar*. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207582 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-29 22:20:40 +00:00
Rafael Espindola	984f2fc09e	Two fixes to the vpermilvar optimization. The instcomine logic to handle vpermilvar's pd and 256 variants was incorrect. The _256 variants have indexes into the individual 128 bit lanes and in all cases it also has to mask out unused bits. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207577 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-29 20:41:54 +00:00
Diego Novillo	55deff895d	Fix vectorization remarks. This patch changes the vectorization remarks to also inform when vectorization is possible but not beneficial. Added tests to exercise some loop remarks. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207574 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-29 20:06:10 +00:00
Yi Jiang	bbea6143f2	Continue slp vectorization even the BB already has vectorized store radar://16641956 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207572 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-29 19:37:20 +00:00
Zinovy Nis	c5e41aed09	[OPENMP][LV][D3423] Respect Hints.Force meta-data for loops in LoopVectorizer git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207512 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-29 08:55:11 +00:00
Chandler Carruth	af09fb613f	Revert r207271 for now. This commit introduced a test case that ran clang directly from the LLVM test suite! That doesn't work. I've followed up on the review thread to try and get a viable solution sorted out, but trying to get the tree clean here. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207462 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-28 23:07:49 +00:00
Hans Wennborg	6426666f65	InstCombine: don't drop 'inalloca' in PromoteCastOfAllocation (PR19569) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207426 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-28 17:40:03 +00:00
Chandler Carruth	d6d57bc3fb	[inliner] Significantly improve the compile time in cases like PR19499 by avoiding inlining massive switches merely because they have no instructions in them. These switches still show up where we fail to form lookup tables, and in those cases they are actually going to cause a very significant code size hit anyways, so inlining them is not the right call. The right way to fix any performance regressions stemming from this is to enhance the switch-to-lookup-table logic to fire in more places. This makes PR19499 about 5x less bad. It uncovers a second compile time problem in that test case that is unrelated (surprisingly!). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207403 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-28 08:52:44 +00:00
Gerolf Hoflehner	b79f1fe084	RecursivelyDeleteTriviallyDeadInstructions() could remove more than 1 instruction. The caller need to be aware of this and adjust instruction iterators accordingly. rdar://16679376 Repaired r207302. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207309 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-26 05:58:11 +00:00
Gerolf Hoflehner	9d4048578c	Revert commit r207302 since build failures have been reported. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207303 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-26 02:03:17 +00:00
Gerolf Hoflehner	4c9277bb9f	RecursivelyDeleteTriviallyDeadInstructions() could remove more than 1 instruction. The caller need to be aware of this and adjust instruction iterators accordingly. rdar://16679376 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207302 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-26 01:19:16 +00:00
Andrea Di Biagio	96db9b8ed8	[InstCombine][X86] Teach how to fold calls to SSE2/AVX2 packed logical shift right intrinsics. A packed logical shift right with a shift count bigger than or equal to the element size always produces a zero vector. In all other cases, it can be safely replaced by a 'lshr' instruction. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207299 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-26 01:03:22 +00:00
Adam Nemet	d761cc1dfa	[LoopStrengthReduce] Don't trim formula that uses a subset of required registers Consider this use from the new testcase: LSR Use: Kind=ICmpZero, Offsets={0}, widest fixup type: i32 reg({1000,+,-1}<nw><%for.body>) -3003 + reg({3,+,3}<nw><%for.body>) -1001 + reg({1,+,1}<nuw><nsw><%for.body>) -1000 + reg({0,+,1}<nw><%for.body>) -3000 + reg({0,+,3}<nuw><%for.body>) reg({-1000,+,1}<nw><%for.body>) reg({-3000,+,3}<nsw><%for.body>) This is the last use we consider for a solution in SolveRecurse, so CurRegs is a large set. (CurRegs is the set of registers that are needed by the previously visited uses in the in-progress solution.) ReqRegs is { {3,+,3}<nw><%for.body>, {1,+,1}<nuw><nsw><%for.body> } This is the intersection of the regs used by any of the formulas for the current use and CurRegs. Now, the code requires a formula to contain all these regs (the comment is simply wrong), otherwise the formula is immediately disqualified. Obviously, no formula for this use contains two regs so they will all get disqualified. The fix modifies the check to allow the formula in this case. The idea is that neither of these formulae is introducing any new registers which is the point of this early pruning as far as I understand. In terms of set arithmetic, we now allow formulas whose used regs are a subset of the required regs not just the other way around. There are few more loops in the test-suite that are now successfully LSRed. I have benchmarked those and found very minimal change. Fixes <rdar://problem/13965777> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207271 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-25 21:02:21 +00:00
Manman Ren	3bd471dee2	[inline cold threshold] Command line argument for inline threshold will override the default cold threshold. When we use command line argument to set the inline threshold, the default cold threshold will not be used. This is in line with how we use OptSizeThreshold. When we want a higher threshold for all functions, we do not have to set both inline threshold and cold threshold. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207245 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-25 17:34:55 +00:00
Karthik Bhat	ac16f0e024	Allow vectorization of bit intrinsics in BB Vectorizer. This patch adds support for vectorization of bit intrinsics such as bswap,ctpop,ctlz,cttz. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207174 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-25 03:33:48 +00:00
Zinovy Nis	25209ab486	[CLNUP] Test commit. Remove newline. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207089 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-24 08:42:58 +00:00
Karthik Bhat	0698b2b6cc	Allow vectorization of few missed llvm intrinsic calls in BBVectorizor by handling them in isVectorizableIntrinsic function. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207085 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-24 07:29:55 +00:00
Michael J. Spencer	96363d5001	[InstCombine][x86] Constant fold psll intrinsics. This excludes avx512 as I don't have hardware to verify. It excludes _dq variants because they are represented in the IR as <{2,4} x i64> when it's actually a byte shift of the entire i{128,265}. This also excludes _dq_bs as they aren't at all supported by the backend. There are also no corresponding instructions in the ISA. I have no idea why they exist... git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207058 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-24 00:58:18 +00:00
Filipe Cabecinhas	cd9f6b870e	Optimize some special cases for SSE4a insertqi Summary: Since the upper 64 bits of the destination register are undefined when performing this operation, we can substitute it and let the optimizer figure out that only a copy is needed. Also added range merging, if an instruction copies a range that can be merged with a previous copied range. Added test cases for both optimizations. Reviewers: grosbach, nadav CC: llvm-commits Differential Revision: http://reviews.llvm.org/D3357 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207055 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-24 00:38:14 +00:00
Matt Arsenault	8bd9405026	Handle addrspacecast when looking at memcpys from globals git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207054 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-24 00:01:09 +00:00
Matt Arsenault	0e92fe9dce	Convert test to FileCheck git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@207015 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-23 19:32:37 +00:00
Alexander Musman	bf255f5d5a	[LV] Statistics numbers for LoopVectorize introduced: a number of analyzed loops & a number of vectorized loops. Use -stats to see how many loops were analyzed for possible vectorization and how many of them were actually vectorized. Patch by Zinovy Nis Differential Revision: http://reviews.llvm.org/D3438 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206956 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-23 08:40:37 +00:00
Juergen Ributzka	b95412cc24	[Constant Hoisting] Materialize the constant before the cloned cast instruction. In the case where the constant comes from a cloned cast instruction, the materialization code has to go before the cloned cast instruction. This commit fixes the method that finds the materialization insertion point by making it aware of this case. This fixes <rdar://problem/15532441> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206913 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-22 18:06:58 +00:00
Rafael Espindola	db0a73f31b	Simplify a vpermil* with constant mask. With a constant mask a vpermil* is just a shufflevector. This patch implements that simplification. This allows us to produce denser code. It should also allow more folding down the line. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206801 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-21 22:06:04 +00:00
Reid Kleckner	0df9abbd63	Fix PR7272 in -tailcallelim instead of the inliner The -tailcallelim pass should be checking if byval or inalloca args can be captured before marking calls as tail calls. This was the real root cause of PR7272. With a better fix in place, revert the inliner change from r105255. The test case it introduced still passes and has been moved to test/Transforms/Inline/byval-tail-call.ll. Reviewers: chandlerc Differential Revision: http://reviews.llvm.org/D3403 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206789 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-21 20:48:47 +00:00
Jiangning Liu	eea662fead	Add missing config file for newly added test case introduced by r206563. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206567 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-18 09:05:50 +00:00
Jiangning Liu	a1da819896	This commit allows vectorized loops to be unrolled by a factor of 2 for AArch64. A new test case is also added for ARM64. Patched by Z.Zheng git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206563 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-18 07:57:54 +00:00
Diego Novillo	0a0d620db3	Fix bug 19437 - Only add discriminators for DWARF 4 and above. Summary: This prevents the discriminator generation pass from triggering if the DWARF version being used in the module is prior to 4. Reviewers: echristo, dblaikie CC: llvm-commits Differential Revision: http://reviews.llvm.org/D3413 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@206507 91177308-0d34-0410-b5e6-96231b3b80d8	2014-04-17 22:33:50 +00:00

1 2 3 4 5 ...

5542 Commits