If we are adding/subtracting 0s below the highest demanded bit, we can just use the other operand and remove the operation.
My primary motivation is the observation that we can call ShrinkDemandedConstant for the add/sub and create a 0 constant, rather than removing the add completely. In the case I saw, we modified the constant on an add instruction to 0, but the add was not put back on the worklist, so we didn't revisit it until the next InstCombine iteration. That caused an IR modification (removing the add) and a subsequent iteration to be run.
With this change we bypass the add in the first iteration and prevent the second iteration from changing anything.
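As a sketch with made-up constants, suppose only the low 4 bits of the add are demanded:
%a = add i32 %x, 16
%r = and i32 %a, 15
The constant has no bits set at or below the highest demanded bit, so the add can be bypassed and %r computed directly from %x, instead of first shrinking the constant to 0 and waiting for a later iteration to drop the add.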
Differential Revision: https://reviews.llvm.org/D31120
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300075 91177308-0d34-0410-b5e6-96231b3b80d8
This moves the isMask and isShiftedMask functions to be class methods. They now use the MathExtras.h functions for the single-word case, and leading/trailing zeros/ones counts or countPopulation for the multi-word case. The previous implementation made multiple temporary memory allocations to perform the bitwise arithmetic needed to match the MathExtras.h implementation.
Differential Revision: https://reviews.llvm.org/D31565
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299362 91177308-0d34-0410-b5e6-96231b3b80d8
The first thing it did was get the User for the Use in order to get the instruction back. That requires looking through the Uses for the User via the waymarking walk. That's pretty fast, but it's probably still better to just pass the Instruction we already had.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298772 91177308-0d34-0410-b5e6-96231b3b80d8
SimplifyDemandedUseBits for Add/Sub already recursed down the LHS and RHS to simplify bits. If that didn't provide any simplifications, we fell back to calling computeKnownBits, which recurses again. Instead, just take the known bits for the LHS and RHS that we already have and call into a new function in ValueTracking that can calculate the known bits given the LHS/RHS bits.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298711 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: Subtracts can have constants on the left side, but we don't shrink them based on demanded bits. This patch fixes that to match the handling of the right-hand side.
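For example, a sketch with made-up constants where only the low 3 bits are demanded:
%s = sub i32 73, %x
%r = and i32 %s, 7
ShrinkDemandedConstant can now shrink the left-hand constant to 73 & 7 = 1, just as it already could for a constant on the right-hand side.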
Reviewers: davide, majnemer, spatel, sanjoy, hfinkel
Reviewed By: spatel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D31119
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298478 91177308-0d34-0410-b5e6-96231b3b80d8
The insertion point may be later than the next instruction,
so it is necessary to set it when replacing the call.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297439 91177308-0d34-0410-b5e6-96231b3b80d8
Added an early out for a single undef input; we were already supporting (and testing) this in the constant folding code, we just do it more quickly now.
Dropped undef handling from the demanded elts code now that we handle it fully in InstCombiner::visitCallInst.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292913 91177308-0d34-0410-b5e6-96231b3b80d8
We bypassed the intrinsic and returned the passthru operand, but we should also add the intrinsic to the worklist since it's now dead. This allows DCE to find it sooner and remove it. The same was done for InsertElement when the inserted element isn't demanded.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290704 91177308-0d34-0410-b5e6-96231b3b80d8
PMULDQ/PMULUDQ vXi64 instructions only use the even-numbered v2Xi32 input elements, which SimplifyDemandedVectorElts should take advantage of.
This builds on r290554, which added support for the 128-bit and 256-bit cases.
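As a sketch using the historical 128-bit intrinsic name (an assumption for illustration), an insert into an odd-numbered lane is never read:
%a1 = insertelement <4 x i32> %a, i32 %x, i32 1
%r  = call <2 x i64> @llvm.x86.sse2.pmulu.dq(<4 x i32> %a1, <4 x i32> %b)
Only lanes 0 and 2 of the i32 inputs are demanded, so the insertelement into lane 1 can be skipped and %a used directly.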
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290582 91177308-0d34-0410-b5e6-96231b3b80d8
Now we only pass bit 0 of the DemandedElts to optimize operand 1 as we recurse, since the upper bits are unused. Similarly, we clear bit 0 when optimizing operand 0.
Also calculate UndefElts correctly.
Simplify InstCombineCalls for these intrinsics to just call SimplifyDemandedVectorElts for the call instruction to reuse this support.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289629 91177308-0d34-0410-b5e6-96231b3b80d8
Now we only pass bit 0 of the DemandedElts to optimize operand 1 as we recurse, since the upper bits are unused.
Also calculate UndefElts correctly.
Simplify InstCombineCalls for these intrinsics to just call SimplifyDemandedVectorElts for the call instruction to reuse this support.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289628 91177308-0d34-0410-b5e6-96231b3b80d8
Only the lower bits of the input element are used, and only the lower element can be undef, since the upper bits are zeroed.
Have InstCombineCalls call SimplifyDemandedVectorElts for these intrinsics to reuse this support.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289523 91177308-0d34-0410-b5e6-96231b3b80d8
This teaches SimplifyDemandedElts that the FMA can be removed if the lower element isn't used. It also teaches it that if upper elements of the first operand aren't used then we can simplify them.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289377 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: These intrinsics have been unused by clang for a while. This patch removes them. We auto-upgrade them to extractelements, a scalar operation, and then an insertelement. This matches the sequence used by clang's intrinsic file.
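A sketch of the upgrade sequence for a binary case (the fadd on <4 x float> operands is an assumed example, not a specific upgraded intrinsic):
%a0 = extractelement <4 x float> %a, i32 0
%b0 = extractelement <4 x float> %b, i32 0
%r0 = fadd float %a0, %b0
%r  = insertelement <4 x float> %a, float %r0, i32 0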
Reviewers: zvi, delena, RKSimon
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D26660
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287083 91177308-0d34-0410-b5e6-96231b3b80d8
The index of the new insertelement instruction was evaluated in the wrong way: it was treated as the index of the inserted value instead of the index of the position where the value should be inserted.
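As a reminder of the operand order (a generic sketch, not the original test case), the trailing operand of insertelement is the position:
%r = insertelement <4 x float> %v, float %s, i32 2   ; %s is inserted at position 2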
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282401 91177308-0d34-0410-b5e6-96231b3b80d8
If inserting more than one constant into a vector:
define <4 x float> @foo(<4 x float> %x) {
%ins1 = insertelement <4 x float> %x, float 1.0, i32 1
%ins2 = insertelement <4 x float> %ins1, float 2.0, i32 2
ret <4 x float> %ins2
}
InstCombine could reduce that to a shufflevector:
define <4 x float> @goo(<4 x float> %x) {
%shuf = shufflevector <4 x float> %x, <4 x float> <float undef, float 1.0, float 2.0, float undef>, <4 x i32><i32 0, i32 5, i32 6, i32 3>
ret <4 x float> %shuf
}
Also, InstCombine tries to convert a shufflevector instruction to a single insertelement if one of the vectors is a constant vector and only a single element from this constant is used in the shuffle, i.e.
shufflevector <4 x float> %v, <4 x float> <float undef, float 1.0, float undef, float undef>, <4 x i32> <i32 0, i32 5, i32 undef, i32 undef> ->
insertelement <4 x float> %v, float 1.0, i32 1
Differential Revision: https://reviews.llvm.org/D24182
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282237 91177308-0d34-0410-b5e6-96231b3b80d8
Add the MMX implementation to the SimplifyDemandedUseBits SSE/AVX MOVMSK support added in D19614.
This requires a minor tweak, as llvm.x86.mmx.pmovmskb takes an x86_mmx argument, so we have to be explicit about the implied v8i8 vector type.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271789 91177308-0d34-0410-b5e6-96231b3b80d8
The MOVMSK instructions copy the sign bits of a vector's elements to the low bits of a scalar register and zero the high bits.
This patch adds MOVMSK support to SimplifyDemandedUseBits so that it is aware that the upper bits are known to be zero. It also removes the MOVMSK call if none of the lower bits are actually required, returning zero instead.
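A sketch using the 128-bit pmovmskb variant (chosen here as an assumed example), which produces a 16-bit mask in an i32:
%m  = call i32 @llvm.x86.sse2.pmovmskb.128(<16 x i8> %v)
%hi = and i32 %m, -65536   ; only bits 16 and up are demanded, and they are known zero
Since none of the 16 mask bits are demanded, the whole expression folds to 0 and the MOVMSK call is dropped.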
Differential Revision: http://reviews.llvm.org/D19614
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267873 91177308-0d34-0410-b5e6-96231b3b80d8
As discussed on D19318, if we only demand the first element of a DIVSS/DIVSD intrinsic, then reduce it to a scalar fdiv. This matches the existing FADD/FSUB/FMUL patterns.
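A sketch where only the first element is used (the extractelement consumer is an assumed example); the net effect is roughly:
%d  = call <4 x float> @llvm.x86.sse.div.ss(<4 x float> %a, <4 x float> %b)
%lo = extractelement <4 x float> %d, i32 0
->
%a0 = extractelement <4 x float> %a, i32 0
%b0 = extractelement <4 x float> %b, i32 0
%lo = fdiv float %a0, %b0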
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267359 91177308-0d34-0410-b5e6-96231b3b80d8
Split from D17490. This patch improves support for determining the demanded vector elements through SSE scalar intrinsics:
1 - demanded vector element support for unary and some extra binary scalar intrinsics (RCP/RSQRT/SQRT/FRCZ and ADD/CMP/DIV/ROUND).
2 - addss/addsd get simplified to an fadd if we aren't interested in the pass-through elements.
3 - if we don't need the lowest element of a scalar operation, then just use the first argument (the pass-through elements) directly (see the sketch below).
We can add support for propagating demanded elements through any equivalent packed SSE intrinsics in a future patch (these wouldn't use the pass-through patterns).
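A sketch of point 3, where the lowest element of the result is never demanded (the shufflevector consumer is an assumed example):
%r  = call <4 x float> @llvm.x86.sse.add.ss(<4 x float> %a, <4 x float> %b)
%hi = shufflevector <4 x float> %r, <4 x float> undef, <4 x i32> <i32 1, i32 2, i32 3, i32 undef>
Elements 1-3 of %r are just the pass-through elements of %a, so %r can be replaced by %a in the shuffle.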
Differential Revision: http://reviews.llvm.org/D19318
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267357 91177308-0d34-0410-b5e6-96231b3b80d8
If the mask of a select instruction is a ConstantVector, method
SimplifyDemandedVectorElts iterates over the mask elements to identify which
values are selected from the select inputs.
Before this patch, method SimplifyDemandedVectorElts always used method
Constant::isNullValue() to check if a value in the mask was zero. Unfortunately
that method always returns false when called on a ConstantExpr.
This patch fixes the problem in SimplifyDemandedVectorElts by adding an explicit
check for ConstantExpr values. Now, if a value in the mask is a ConstantExpr, we
avoid calling isNullValue() on it.
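A sketch of the kind of mask involved (the global, function name, types, and expression here are assumptions, not the original reproducer):
@g = external global i32
define <2 x i32> @test(<2 x i32> %a, <2 x i32> %b) {
  %s = select <2 x i1> <i1 icmp eq (i32 ptrtoint (i32* @g to i32), i32 0), i1 true>, <2 x i32> %a, <2 x i32> %b
  ret <2 x i32> %s
}
The first mask element is a ConstantExpr, so isNullValue() reports false regardless of what the expression folds to.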
Fixes PR24922.
Differential Revision: http://reviews.llvm.org/D13219
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@249390 91177308-0d34-0410-b5e6-96231b3b80d8