archived-llvm

mirror of https://github.com/RPCS3/llvm.git synced 2026-01-31 01:25:19 +01:00

Author	SHA1	Message	Date
Sanjay Patel	c810b4e17d	[InstCombine] fold shuffle-with-binop and common value This is the last significant change suggested in PR37806: https://bugs.llvm.org/show_bug.cgi?id=37806#c5 ...though there are several follow-ups noted in the code comments in this patch to complete this transform. It's possible that a binop feeding a select-shuffle has been eliminated by earlier transforms (or the code was just written like this in the 1st place), so we'll fail to match the patterns that have 2 binops from: D48401, D48678, D48662, D48485. In that case, we can try to materialize identity constants for the remaining binop to fill in the "ghost" lanes of the vector (where we just want to pass through the original values of the source operand). I added comments to ConstantExpr::getBinOpIdentity() to show planned follow-ups. For now, we only handle the 5 commutative integer binops (add/mul/and/or/xor). Differential Revision: https://reviews.llvm.org/D48830 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336196 91177308-0d34-0410-b5e6-96231b3b80d8	2018-07-03 13:44:22 +00:00
Sanjay Patel	c9a157f7fd	[InstCombine] reverse canonicalization of add --> or to allow more shuffle folding This extends D48485 to allow another pair of binops (add/or) to be combined either with or without a leading shuffle: or X, C --> add X, C (when X and C have no common bits set) Here, we need value tracking to determine that the 'or' can be reversed into an 'add', and we've added general infrastructure to allow extending to other opcodes or moving to where other passes could use that functionality. Differential Revision: https://reviews.llvm.org/D48662 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336128 91177308-0d34-0410-b5e6-96231b3b80d8	2018-07-02 17:42:29 +00:00
Sanjay Patel	ce6e09c740	[InstCombine] enhance shuffle-of-binops to allow different variable ops (PR37806) This was discussed in D48401 as another improvement for: https://bugs.llvm.org/show_bug.cgi?id=37806 If we have 2 different variable values, then we shuffle (select) those lanes, shuffle (select) the constants, and then perform the binop. This eliminates a binop. The new shuffle uses the same shuffle mask as the existing shuffle, so there's no danger of creating a difficult shuffle. All of the earlier constraints still apply, but we also check for extra uses to avoid creating more instructions than we'll remove. Additionally, we're disallowing the fold for div/rem because that could expose a UB hole. Differential Revision: https://reviews.llvm.org/D48678 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335974 91177308-0d34-0410-b5e6-96231b3b80d8	2018-06-29 13:44:06 +00:00
Sanjay Patel	83bc27f0aa	[InstCombine] fix opcode check in shuffle fold There's no way to expose this difference currently, but we should use the updated variable because the original opcodes can go stale if we transform into something new. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335920 91177308-0d34-0410-b5e6-96231b3b80d8	2018-06-28 20:52:43 +00:00
Sanjay Patel	25be6bdd6f	[InstCombine] allow shl+mul combos with shuffle (select) fold (PR37806) This is an enhancement to D48401 that was discussed in: https://bugs.llvm.org/show_bug.cgi?id=37806 We can convert a shift-left-by-constant into a multiply (we canonicalize IR in the other direction because that's generally better of course). This allows us to remove the shuffle as we do in the regular opcodes-are-the-same cases. This requires a small hack to make sure we don't introduce any extra poison: https://rise4fun.com/Alive/ZGv Other examples of opcodes where this would work are add+sub and fadd+fsub, but we already canonicalize those subs into adds, so there's nothing to do for those cases AFAICT. There are planned enhancements for opcode transforms such as or -> add. Note that there's a different fold needed if we've already managed to simplify away a binop as seen in the test based on PR37806, but we manage to get that one case here because this fold is positioned above the demanded elements fold currently. Differential Revision: https://reviews.llvm.org/D48485 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335888 91177308-0d34-0410-b5e6-96231b3b80d8	2018-06-28 17:48:04 +00:00
Sanjay Patel	bbfb91da0d	[InstCombine] rearrange shuffle-of-binops logic; NFC The commutative matcher makes things more complicated here, and I'm planning an enhancement where this form is more readable. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335343 91177308-0d34-0410-b5e6-96231b3b80d8	2018-06-22 12:46:16 +00:00
Sanjay Patel	7fe9364c8b	[InstCombine] fix shuffle-of-binops bug With non-commutative binops, we could be using the same variable value as operand 0 in 1 binop and operand 1 in the other, so we have to check for that possibility and bail out. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335312 91177308-0d34-0410-b5e6-96231b3b80d8	2018-06-21 23:56:59 +00:00
Sanjay Patel	d7f1ecfded	[InstCombine] fold vector select of binops with constant ops to 1 binop (PR37806) This is the simplest case from PR37806: https://bugs.llvm.org/show_bug.cgi?id=37806 If we have a common variable operand used in a pair of binops with vector constants that are vector selected together, then we can constant shuffle the constant vectors to eliminate the shuffle instruction. This has some tricky parts that are hopefully addressed in the tests and their respective comments: 1. If the shuffle mask contains an undef element, then that lane of the result is undef: http://llvm.org/docs/LangRef.html#shufflevector-instruction Therefore, we can replace the constant in that lane with an undef value except for div/rem. With div/rem, an undef in the divisor would cause the whole op to be undef. So I'm using the same hack as in D47686 - replace the undefs with '1'. 2. Intersect the wrapping and FMF of the original binops for the new binop. There should be no extra poison or fast-math potential in the new binop that wasn't possible in the original code. 3. Disregard other uses. Given that we're eliminating uses (shortening the dependency chain), I think that's always the right IR canonicalization. But I purposely chose the udiv test to demonstrate the scenario where both intermediate values have other uses because that seems likely worse for codegen with an expensive math op. This seems like a very rare possibility to me, so I don't think it requires a backend patch first. Differential Revision: https://reviews.llvm.org/D48401 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335283 91177308-0d34-0410-b5e6-96231b3b80d8	2018-06-21 20:15:09 +00:00
Simon Pilgrim	415d43cb70	[InstCombine] Gracefully handle out of range extractelement indices InstSimplify is responsible for handling these, but we shouldn't just assert here. Reduced from oss-fuzz #4808 test case git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321489 91177308-0d34-0410-b5e6-96231b3b80d8	2017-12-27 12:00:18 +00:00
Igor Laevsky	fcf12e077b	Reintroduce r320049, r320014 and r319894. OpenGL issues should be fixed by now. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320568 91177308-0d34-0410-b5e6-96231b3b80d8	2017-12-13 11:21:18 +00:00
Igor Laevsky	535b72d219	Revert r320049, r320014 and r319894 They were causing failures of the piglit OpenGL tests with AMD GPUs using the Mesa radeonsi driver. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320466 91177308-0d34-0410-b5e6-96231b3b80d8	2017-12-12 10:03:39 +00:00
Igor Laevsky	e18338969d	[InstCombine] Don't crash on out of bounds index in the insertelement Differential Revision: https://reviews.llvm.org/D40390 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320049 91177308-0d34-0410-b5e6-96231b3b80d8	2017-12-07 15:00:52 +00:00
Sanjay Patel	c7c532a9fd	[InstCombine] use 'auto' with 'dyn_cast'; NFC git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319067 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-27 18:19:32 +00:00
Eugene Zelenko	26ee77f253	[Transforms] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316503 91177308-0d34-0410-b5e6-96231b3b80d8	2017-10-24 21:24:53 +00:00
Sanjay Patel	963e18e511	[InstCombine] fix formatting; NFC git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315223 91177308-0d34-0410-b5e6-96231b3b80d8	2017-10-09 17:54:46 +00:00
Sanjay Patel	00b5bff2c2	[InstCombine] remove extract-of-select vector transform (2nd try) The 1st attempt at this: https://reviews.llvm.org/rL314117 was reverted at: https://reviews.llvm.org/rL314118 because of bot fails for clang tests that were checking optimized IR. That should be fixed with: https://reviews.llvm.org/rL314144 ...so try again. Original commit message: The transform to convert an extract-of-a-select-of-vectors was added at: https://reviews.llvm.org/rL194013 And a question about the validity of this transform was raised in the review: https://reviews.llvm.org/D1539: ...but not answered AFAICT. Most of the motivating cases in that patch are now handled by other combines. These are the tests that were added with the original commit, but they are not regressing even after we remove the transform in this patch. The diffs we see after removing this transform cause us to avoid increasing the instruction count, so we don't want to do those transforms as canonicalizations. The motivation for not turning a vector-select-of-vectors into a scalar operation is shown in PR33301: https://bugs.llvm.org/show_bug.cgi?id=33301 ...in those cases, we'll get vector ops with this patch rather than the vector/scalar mix that we currently see. Differential Revision: https://reviews.llvm.org/D38006 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314147 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-25 20:30:53 +00:00
Sanjay Patel	26f60ee8f1	revert r314117 because there are bogus clang tests that depend on the optimizer git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314118 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-25 17:00:04 +00:00
Sanjay Patel	752909f8e3	[InstCombine] remove extract-of-select vector transform The transform to convert an extract-of-a-select-of-vectors was added at: rL194013 And a question about the validity of this transform was raised in the review: https://reviews.llvm.org/D1539: ...but not answered AFAICT. Most of the motivating cases in that patch are now handled by other combines. These are the tests that were added with the original commit, but they are not regressing even after we remove the transform in this patch. The diffs we see after removing this transform cause us to avoid increasing the instruction count, so we don't want to do those transforms as canonicalizations. The motivation for not turning a vector-select-of-vectors into a scalar operation is shown in PR33301: https://bugs.llvm.org/show_bug.cgi?id=33301 ...in those cases, we'll get vector ops with this patch rather than the vector/scalar mix that we currently see. Differential Revision: https://reviews.llvm.org/D38006 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314117 91177308-0d34-0410-b5e6-96231b3b80d8	2017-09-25 16:41:34 +00:00
Sanjay Patel	da536d4e17	[InstCombine] improve demanded vector elements analysis of insertelement Recurse instead of returning on the first found optimization. Also, return early in the caller instead of continuing because that allows another round of simplification before we might potentially lose undef information from a shuffle mask by eliminating the shuffle. As noted in the review, we could probably do better and be more efficient by moving all of demanded elements into a separate pass, but this is yet another quick fix to instcombine. Differential Revision: https://reviews.llvm.org/D37236 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312248 91177308-0d34-0410-b5e6-96231b3b80d8	2017-08-31 15:57:17 +00:00
Florian Hahn	785320780e	[InstCombine] Fold insert sequence if first ins has multiple users. Summary: If the first insertelement instruction has multiple users and inserts at position 0, we can re-use this instruction when folding a chain of insertelement instructions. As we need to generate the first insertelement instruction anyways, this should be a strict improvement. We could get rid of the restriction of inserting at position 0 by creating a different shufflemask, but it is probably worth to keep the first insertelement instruction with position 0, as this is easier to do efficiently than at other positions I think. Reviewers: grosser, mkuper, fpetrogalli, efriedma Reviewed By: fpetrogalli Subscribers: gareevroman, llvm-commits Differential Revision: https://reviews.llvm.org/D37064 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312110 91177308-0d34-0410-b5e6-96231b3b80d8	2017-08-30 10:54:21 +00:00
Craig Topper	f552e96e02	[InstCombine] Make InstCombine's IRBuilder be passed by reference everywhere Previously the InstCombiner class contained a pointer to an IR builder that had been passed to the constructor. Sometimes this would be passed to helper functions as either a pointer or the pointer would be dereferenced to be passed by reference. This patch makes it a reference everywhere including the InstCombiner class itself so there is no more inconsistency. This is a large but mechanical patch. I've done very minimal formatting changes on it despite what clang-format wanted to do. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@307451 91177308-0d34-0410-b5e6-96231b3b80d8	2017-07-07 23:16:26 +00:00
Craig Topper	d93348f701	[InstCombine] Pass a proper context instruction to all of the calls into InstSimplify Summary: This matches the behavior we already had for compares and makes us consistent everywhere. Reviewers: dberlin, hfinkel, spatel Reviewed By: dberlin Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D33604 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@305049 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-09 03:21:29 +00:00
Sven van Haastregt	3303806a3d	[InstCombine] Fix extractelement use before def This fixes a bug that can cause extractelements with operands that haven't been defined yet to be inserted at a wrong point when optimising insertelements. Patch by Karl Hylen. Differential Revision: https://reviews.llvm.org/D33449 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@304701 91177308-0d34-0410-b5e6-96231b3b80d8	2017-06-05 09:18:10 +00:00
Daniel Berlin	516ca41d78	InstCombine: Use the new SimplifyQuery versions of Simplify*. Use AssumptionCache, DominatorTree, TargetLibraryInfo everywhere. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301464 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-26 20:56:07 +00:00
Zvi Rackover	c7f9b76a74	InstCombine: Use the InstSimplify hook for shufflevector Summary: Start using the recently added InstSimplify hook for shuffles in the respective InstCombine visitor. Reviewers: spatel, RKSimon, craig.topper, majnemer Reviewed By: majnemer Subscribers: majnemer, llvm-commits Differential Revision: https://reviews.llvm.org/D31526 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299412 91177308-0d34-0410-b5e6-96231b3b80d8	2017-04-04 04:47:57 +00:00
Sanjay Patel	cdd1430efb	[InstCombine] canonicalize insertelement of scalar constant ahead of insertelement of variable insertelement (insertelement X, Y, IdxC1), ScalarC, IdxC2 --> insertelement (insertelement X, ScalarC, IdxC2), Y, IdxC1 As noted in the code comment and seen in the test changes, the motivation is that by pulling constant insertion up, we may be able to constant fold some insertelement instructions. Differential Revision: https://reviews.llvm.org/D31196 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298520 91177308-0d34-0410-b5e6-96231b3b80d8	2017-03-22 17:10:44 +00:00
Eugene Leviant	6684504281	InstCombine: fix extraction when performing vector/array punning Differential revision: https://reviews.llvm.org/D29491 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295429 91177308-0d34-0410-b5e6-96231b3b80d8	2017-02-17 07:36:03 +00:00
Craig Topper	f05137f8f8	[InstCombine] Use getVectorNumElements instead of explicitly casting to VectorType and calling getNumElements. NFC git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290707 91177308-0d34-0410-b5e6-96231b3b80d8	2016-12-29 07:03:18 +00:00
Michael Kuperstein	04912c8225	[InstCombine] Canonicalize insert splat sequences into an insert + shuffle This adds a combine that canonicalizes a chain of inserts which broadcasts a value into a single insert + a splat shufflevector. This fixes PR31286. Differential Revision: https://reviews.llvm.org/D27992 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290641 91177308-0d34-0410-b5e6-96231b3b80d8	2016-12-28 00:18:08 +00:00
Daniel Jasper	8de3a54f07	Revert @llvm.assume with operator bundles (r289755-r289757) This creates non-linear behavior in the inliner (see more details in r289755's commit thread). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290086 91177308-0d34-0410-b5e6-96231b3b80d8	2016-12-19 08:22:17 +00:00
Hal Finkel	bffeba468d	Remove the AssumptionCache After r289755, the AssumptionCache is no longer needed. Variables affected by assumptions are now found by using the new operand-bundle-based scheme. This new scheme is more computationally efficient, and also we need much less code... git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289756 91177308-0d34-0410-b5e6-96231b3b80d8	2016-12-15 03:02:15 +00:00
Sanjay Patel	9c5e4bac4a	[InstCombine] avoid infinite loop from shuffle-extract-insert sequence (PR30923) Removing the limitation in visitInsertElementInst() causes several regressions because we're not prepared to fold sequences of shuffles or inserts and extracts separated by shuffles. Fixing that appears to be a difficult mission because we are purposely trying to avoid creating shuffles with arbitrary shuffle masks because some targets may choke on those. https://llvm.org/bugs/show_bug.cgi?id=30923 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@286423 91177308-0d34-0410-b5e6-96231b3b80d8	2016-11-10 00:15:14 +00:00
Alexey Bataev	b9dfeea817	[InstCombine] Fix for PR29124: reduce insertelements to shufflevector If inserting more than one constant into a vector: define <4 x float> @foo(<4 x float> %x) { %ins1 = insertelement <4 x float> %x, float 1.0, i32 1 %ins2 = insertelement <4 x float> %ins1, float 2.0, i32 2 ret <4 x float> %ins2 } InstCombine could reduce that to a shufflevector: define <4 x float> @goo(<4 x float> %x) { %shuf = shufflevector <4 x float> %x, <4 x float> <float undef, float 1.0, float 2.0, float undef>, <4 x i32><i32 0, i32 5, i32 6, i32 3> ret <4 x float> %shuf } Also, InstCombine tries to convert shuffle instruction to single insertelement, if one of the vectors is a constant vector and only a single element from this constant should be used in shuffle, i.e. shufflevector <4 x float> %v, <4 x float> <float undef, float 1.0, float undef, float undef>, <4 x i32> <i32 0, i32 5, i32 undef, i32 undef> -> insertelement <4 x float> %v, float 1.0, 1 Differential Revision: https://reviews.llvm.org/D24182 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282237 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-23 09:14:08 +00:00
Sanjay Patel	ef2c802039	[InstCombine] fold insertelement of constant into shuffle with constant operand (PR29126) The motivating case occurs with SSE/AVX scalar intrinsics, so this is a first step towards shrinking that to a single shufflevector. Note that the transform is intentionally limited to shuffles that are equivalent to vector selects to avoid creating arbitrary shuffle masks that may not lower well. This should solve PR29126: https://llvm.org/bugs/show_bug.cgi?id=29126 Differential Revision: https://reviews.llvm.org/D23886 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280504 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-02 17:05:43 +00:00
Justin Bogner	afba697b6c	InstCombine: Replace some never-null pointers with references. NFC git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277792 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-05 01:06:44 +00:00
Michael Kuperstein	6954e6256d	[InstCombine] scalarizePHI should not assume the code it sees has been CSE'd scalarizePHI only looked for phis that have exactly two uses - the "latch" use, and an extract. Unfortunately, we can not assume all equivalent extracts are CSE'd, since InstCombine itself may create an extract which is a duplicate of an existing one. This extends it to handle several distinct extracts from the same index. This should fix at least some of the performance regressions from PR27988. Differential Revision: http://reviews.llvm.org/D20983 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271961 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-06 23:38:33 +00:00
Owen Anderson	2b8030cd97	Fix an issue where fast math flags were dropped during scalarization. Most portions of InstCombine properly propagate fast math flags, but apparently the vector scalarization section was overlooked. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262376 91177308-0d34-0410-b5e6-96231b3b80d8	2016-03-01 19:35:52 +00:00
Sanjay Patel	7d0cdb4a10	function names start with a lowercase letter; NFC git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259425 91177308-0d34-0410-b5e6-96231b3b80d8	2016-02-01 22:23:39 +00:00
Sanjay Patel	b17df8b4d7	[InstCombine] avoid an insertelement transformation that induces the opposite extractelement fold (PR26354) We would infinite loop because we created a shufflevector that was wider than needed and then failed to combine that with the insertelement. When subsequently visiting the extractelement from that shuffle, we see that it's unnecessary, delete it, and trigger another visit to the insertelement. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259236 91177308-0d34-0410-b5e6-96231b3b80d8	2016-01-29 20:21:02 +00:00
Sanjay Patel	810605370d	[InstCombine] insert a new shuffle in a safe place (PR25999) Limit this transform to a basic block and guard against PHIs. Hopefully, this fixes the remaining failures in PR25999: https://llvm.org/bugs/show_bug.cgi?id=25999 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257133 91177308-0d34-0410-b5e6-96231b3b80d8	2016-01-08 01:39:16 +00:00
Sanjay Patel	7a3b692c47	[InstCombine] insert a new shuffle before its uses (PR26015) Although this solves the test case in PR26015: https://llvm.org/bugs/show_bug.cgi?id=26015 And may solve PR25999: https://llvm.org/bugs/show_bug.cgi?id=25999 ...I suspect this is not the best solution. I think we want to insert the new shuffle just ahead of the earliest ExtractElementInst that we're replacing, but I don't know how that should be implemented. Differential Revision: http://reviews.llvm.org/D15878 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256857 91177308-0d34-0410-b5e6-96231b3b80d8	2016-01-05 19:09:47 +00:00
Sanjay Patel	75759ab3e9	[InstCombine] transform more extract/insert pairs into shuffles (PR2109) This is an extension of the shuffle combining from r203229: http://reviews.llvm.org/rL203229 The idea is to widen a short input vector with undef elements so the existing shuffle transform for extract/insert can kick in. The motivation is to finally solve PR2109: https://llvm.org/bugs/show_bug.cgi?id=2109 For that example, the IR becomes: %1 = bitcast <2 x i32>* %P to <2 x float>* %ld1 = load <2 x float>, <2 x float>* %1, align 8 %2 = shufflevector <2 x float> %ld1, <2 x float> undef, <4 x i32> <i32 0, i32 1, i32 undef, i32 undef> %i2 = shufflevector <4 x float> %A, <4 x float> %2, <4 x i32> <i32 0, i32 1, i32 4, i32 5> ret <4 x float> %i2 And x86 SSE output improves from: movq (%rdi), %xmm1 ## xmm1 = mem[0],zero movdqa %xmm1, %xmm2 shufps $229, %xmm2, %xmm2 ## xmm2 = xmm2[1,1,2,3] shufps $48, %xmm0, %xmm1 ## xmm1 = xmm1[0,0],xmm0[3,0] shufps $132, %xmm1, %xmm0 ## xmm0 = xmm0[0,1],xmm1[0,2] shufps $32, %xmm0, %xmm2 ## xmm2 = xmm2[0,0],xmm0[2,0] shufps $36, %xmm2, %xmm0 ## xmm0 = xmm0[0,1],xmm2[2,0] retq To the almost optimal: movhpd (%rdi), %xmm0 Note: There's a tension in the existing transform related to generating arbitrary shufflevector masks. We avoid that in other places in InstCombine because we're scared that codegen can't handle strange masks, but it looks like we're ok with producing those here. I purposely chose weird insert/extract indexes for the regression tests to see the effect in these cases. For PowerPC+Altivec, AArch64, and X86+SSE/AVX, I think the codegen is equal or better for these examples. Differential Revision: http://reviews.llvm.org/D15096 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256394 91177308-0d34-0410-b5e6-96231b3b80d8	2015-12-24 21:17:56 +00:00
Sanjay Patel	f7790eca1d	fix typos in comments; NFC git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254266 91177308-0d34-0410-b5e6-96231b3b80d8	2015-11-29 22:09:34 +00:00
Sanjay Patel	7271f32728	function names start with a lower case letter; NFC git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253348 91177308-0d34-0410-b5e6-96231b3b80d8	2015-11-17 17:24:08 +00:00
Sanjay Patel	eaf6bc683e	use range-based for loop; NFCI git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253256 91177308-0d34-0410-b5e6-96231b3b80d8	2015-11-16 22:16:52 +00:00
Duncan P. N. Exon Smith	f83f208edf	InstCombine: Remove ilist iterator implicit conversions, NFC Stop relying on implicit conversions of ilist iterators in LLVMInstCombine. No functionality change intended. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@250183 91177308-0d34-0410-b5e6-96231b3b80d8	2015-10-13 16:59:33 +00:00
Sanjay Patel	d096e43858	don't repeat function names in comments; NFC git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247154 91177308-0d34-0410-b5e6-96231b3b80d8	2015-09-09 15:24:36 +00:00
David Majnemer	46b13dd880	[InstSimplify] Teach InstSimplify how to simplify extractelement git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242008 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-13 01:15:53 +00:00
David Majnemer	701c2fca7e	[InstCombine] Use DataLayout to determine vector element width InstCombine didn't realize that it needs to use DataLayout to determine how wide pointers are. This led to assertion failures. This fixes PR23113. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@234046 91177308-0d34-0410-b5e6-96231b3b80d8	2015-04-03 20:18:40 +00:00
David Blaikie	99b7898c29	[opaque pointer type] more gep API migrations Adding nullptr to all the IRBuilder stuff because it's the first thing that fails to build when testing without the back-compat functions, so I'll keep having to re-add these locally for each chunk of migration I do. Might as well check them in to save me the churn. Eventually I'll have to migrate these too, but I'm going breadth-first. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@232270 91177308-0d34-0410-b5e6-96231b3b80d8	2015-03-14 19:24:04 +00:00
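
The select-shuffle folds described in the D48401 and D48830 entries above can be illustrated with a small model. This is only a sketch, not the InstCombine implementation: vectors are plain Python lists, integers do not wrap, and undef lanes, poison, wrapping flags, and fast-math flags are ignored. All function names below (select_shuffle, binop, fold_two_binops, fold_binop_and_passthrough) are hypothetical and exist only for this illustration.

```python
import operator


def select_shuffle(a, b, mask):
    """Model a shufflevector whose mask is a lane-wise select.

    Mask index i picks a[i]; mask index i + n picks b[i]. Each lane must
    stay in its own position, which is what makes the shuffle a "select".
    """
    n = len(a)
    both = a + b
    for lane, idx in enumerate(mask):
        assert idx % n == lane, "mask is not a select-style mask"
    return [both[idx] for idx in mask]


def binop(op, x, c):
    """Apply a binary operator lane by lane."""
    return [op(xi, ci) for xi, ci in zip(x, c)]


def fold_two_binops(op, x, c0, c1, mask):
    # shuffle (op x, c0), (op x, c1), mask --> op x, (shuffle c0, c1, mask)
    # The constants are shuffled ahead of time, so only one binop remains.
    return binop(op, x, select_shuffle(c0, c1, mask))


def fold_binop_and_passthrough(op, identity, x, c, mask):
    # shuffle x, (op x, c), mask --> op x, (shuffle identity_vec, c, mask)
    # Lanes that pass x through unchanged get the identity constant of op
    # (0 for add/or/xor, 1 for mul, all-ones for and).
    ident = [identity] * len(x)
    return binop(op, x, select_shuffle(ident, c, mask))


x = [1, 2, 3, 4]
mask = [0, 5, 2, 7]  # lanes 0 and 2 from the first operand, 1 and 3 from the second

# Two binops with a common variable operand, selected together.
before = select_shuffle(
    binop(operator.add, x, [10, 20, 30, 40]),
    binop(operator.add, x, [5, 6, 7, 8]),
    mask,
)
after = fold_two_binops(operator.add, x, [10, 20, 30, 40], [5, 6, 7, 8], mask)

# One binop selected against the unmodified source operand.
before_id = select_shuffle(x, binop(operator.mul, x, [10, 20, 30, 40]), mask)
after_id = fold_binop_and_passthrough(operator.mul, 1, x, [10, 20, 30, 40], mask)
```

In both cases the folded form computes the same lanes as the original shuffle while needing no shuffle instruction on variable operands, which is the point of the transforms named in those commit messages.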

113 Commits