archived-llvm

mirror of https://github.com/RPCS3/llvm.git synced 2026-01-31 01:25:19 +01:00

Author	SHA1	Message	Date
Sanjay Patel	e5a55cebc1	[DAGCombiner] narrow shuffle of concatenated vectors // shuffle (concat X, undef), (concat Y, undef), Mask --> // concat (shuffle X, Y, Mask0), (shuffle X, Y, Mask1) The ARM changes with 'vtrn' and narrowed 'vuzp' are improvements. The x86 changes look neutral or better. There's one test with an extra instruction, but that could be reversed for a subtarget with the right attributes. But by default, we want to avoid the 256-bit op when possible (in my motivating benchmark, a handful of ymm ops sprinkled into a sequence of xmm ops are triggering frequency throttling on Haswell resulting in significantly worse perf). Differential Revision: https://reviews.llvm.org/D60545 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358291 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-12 16:31:56 +00:00
Simon Pilgrim	dfff56a5b1	[X86][SSE] Recognise vXi1 boolean anyof/allof reduction patterns Currently combineHorizontalPredicateResult only handles anyof/allof reduction patterns of legal types, which can be tricky to match as type legalization of bools can introduce bitcasts/truncs/extensions. This patch extends combineHorizontalPredicateResult to recognise vXi1 bool reductions as well and uses the existing combineBitcastvxi1 helper to create the MOVMSK necessary to then compare the signmask result. This ensures the accuracy of the reduction costs added in D60403 which assume the MOVMSK generation. Differential Revision: https://reviews.llvm.org/D60610 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358286 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-12 14:22:57 +00:00
Hans Wennborg	0366e3e181	Revert r358268 "[DebugInfo] DW_OP_deref_size in PrologEpilogInserter." It causes clang to crash while building Chromium. See https://crbug.com/952230 for reproducer. > The PrologEpilogInserter need to insert a DW_OP_deref_size before > prepending a memory location expression to an already implicit > expression to avoid having the existing expression act on the memory > address instead of the value behind it. > > The reason for using DW_OP_deref_size and not plain DW_OP_deref is that > big-endian targets need to read the right size as simply truncating a > larger read would yield the wrong result (LSB bytes are not at the lower > address). > > Differential Revision: https://reviews.llvm.org/D59687 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358281 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-12 12:54:52 +00:00
Markus Lavin	bcae242878	[DebugInfo] DW_OP_deref_size in PrologEpilogInserter. The PrologEpilogInserter need to insert a DW_OP_deref_size before prepending a memory location expression to an already implicit expression to avoid having the existing expression act on the memory address instead of the value behind it. The reason for using DW_OP_deref_size and not plain DW_OP_deref is that big-endian targets need to read the right size as simply truncating a larger read would yield the wrong result (LSB bytes are not at the lower address). Differential Revision: https://reviews.llvm.org/D59687 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358268 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-12 08:23:55 +00:00
Craig Topper	3f8580a865	[TargetLowering][X86] Teach SimplifyDemandedBits to use ShrinkDemandedOp on ISD::SHL nodes. If the upper bits of the SHL result aren't used, we might be able to use a narrower shift. For example, on X86 this can turn a 64-bit into 32-bit enabling a smaller encoding. Differential Revision: https://reviews.llvm.org/D60358 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358257 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-12 06:49:28 +00:00
Aaron Smith	ef13026b83	[DebugInfo] Combine Trivial and NonTrivial flags Summary: Companion to https://reviews.llvm.org/D59347 Reviewers: rnk, zturner, probinson, dblaikie, deadalnix Subscribers: aprantl, jdoerfert, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D59348 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358220 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-11 20:25:10 +00:00
Craig Topper	0a7c2b2b16	[X86] Restrict vselect handling in scalarizeExtEltFP to only case to pre type legalization where the setcc result type is vXi1. If the vector setcc has been legalized then we will need to convert a vector boolean of 0 or -1 to a scalar boolean of 0 or 1. The added test case previously crashed in 32-bit mode by creating a setcc with an i64 condition that type legalization couldn't expand. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358218 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-11 19:57:44 +00:00
Craig Topper	b0ed82c8f9	[X86] Add 32-bit command line to extractelement-fp.ll so I can add a test case for a 32-bit only crasher. NFC This is a bit ugly for ABI reasons about how floats/doubles are returned. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358217 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-11 19:57:24 +00:00
Craig Topper	0bc1a86ddd	[X86] Add patterns for using movss/movsd for atomic load/store of f32/64. Remove atomic fadd pseudos use isel patterns instead. This patch adds patterns for turning bitcasted atomic load/store into movss/sd. It also removes the pseudo instructions for atomic RMW fadd. Instead just adding isel patterns for folding an atomic load into addss/sd. And relying on the new movss/sd store pattern to handle the write part. This also makes the fadd patterns use VEX and EVEX instructions when AVX or AVX512F are enabled. Differential Revision: https://reviews.llvm.org/D60394 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358215 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-11 19:19:52 +00:00
Craig Topper	b5079b20d2	Recommit r358211 "[X86] Use FILD/FIST to implement i64 atomic load on 32-bit targets with X87, but no SSE2" With correct test checks this time. If we have X87, but not SSE2 we can atomicaly load an i64 value into the significand of an 80-bit extended precision x87 register using fild. We can then use a fist instruction to convert it back to an i64 integ This matches what gcc and icc do for this case and removes an existing FIXME. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358214 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-11 19:19:42 +00:00
Craig Topper	d11a7fa9a6	Revert r358211 "[X86] Use FILD/FIST to implement i64 atomic load on 32-bit targets with X87, but no SSE2" I seem to have messed up the test checks. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358212 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-11 19:04:38 +00:00
Craig Topper	27970a3895	[X86] Use FILD/FIST to implement i64 atomic load on 32-bit targets with X87, but no SSE2 If we have X87, but not SSE2 we can atomicaly load an i64 value into the significand of an 80-bit extended precision x87 register using fild. We can then use a fist instruction to convert it back to an i64 integer and store it to a stack temporary. From there we can do two 32-bit loads to get the value into integer registers without worrying about atomicness. This matches what gcc and icc do for this case and removes an existing FIXME. Differential Revision: https://reviews.llvm.org/D60156 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358211 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-11 18:40:21 +00:00
Craig Topper	f509674afe	[X86] Pre-commit i64 volatile test case for D60156. NFC git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358210 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-11 18:40:08 +00:00
Simon Pilgrim	6b550bf3fa	[X86] SimplifyDemandedVectorElts - add X86ISD::VPERMV3 mask support Completes SimplifyDemandedVectorElts's basic variable shuffle mask support which should help D60512 + D60562 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358186 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-11 15:29:15 +00:00
Simon Pilgrim	30bc60c9e3	[X86][AVX] Tweak X86ISD::VPERMV3 demandedelts test Original test was too dependent on the order of the combines that could cause the inserted element being demanded after all git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358182 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-11 15:09:03 +00:00
Simon Pilgrim	d0586349ae	[X86][AVX] Add X86ISD::VPERMV3 demandedelts test git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358175 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-11 14:48:46 +00:00
Simon Pilgrim	9078304a8e	[X86] SimplifyDemandedVectorElts - add X86ISD::VPERMV mask support git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358174 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-11 14:35:45 +00:00
Simon Pilgrim	2d199db3b4	[X86][AVX] Add X86ISD::VPERMV demandedelts test git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358173 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-11 14:26:32 +00:00
Sanjay Patel	8e4da35c4a	[DAGCombiner][x86] scalarize inserted vector FP ops // bo (build_vec ...undef, x, undef...), (build_vec ...undef, y, undef...) --> // build_vec ...undef, (bo x, y), undef... The lifetime of the nodes in these examples is different for variables versus constants, but they are all build vectors briefly, so I'm proposing to catch them in this form to handle all of the leading examples in the motivating test file. Before we have build vectors, we might have insert_vector_element. After that, we might have scalar_to_vector and constant pool loads. It's going to take more work to ensure that FP vector operands are getting simplified with undef elements, so this transform can apply more widely. In a non-loose FP environment, we are likely simplifying FP elements to NaN values rather than undefs. We also need to allow more opcodes down this path. Eg, we don't handle FP min/max flavors yet. Differential Revision: https://reviews.llvm.org/D60514 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358172 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-11 14:21:57 +00:00
Simon Pilgrim	31e4f8550c	[X86] SimplifyDemandedVectorElts - add X86ISD::VPERMILPV mask support git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358170 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-11 14:15:01 +00:00
Simon Pilgrim	491c504644	[X86][AVX] Add X86ISD::VPERMILPV demandedelts tests git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358168 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-11 14:09:35 +00:00
Simon Pilgrim	c4c427ffea	[X86] SimplifyDemandedVectorElts - add X86ISD::VPERMIL2 mask support git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358167 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-11 14:04:19 +00:00
Simon Pilgrim	e5d96fd941	[X86][XOP] Add X86ISD::VPERMIL2 demandedelts test git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358166 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-11 13:52:43 +00:00
Simon Pilgrim	ec7786cfeb	[X86] SimplifyDemandedVectorElts - add VPPERM support We need to add support for all variable shuffle mask ops, but VPPERM is the only one that already has test coverage. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358165 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-11 13:30:38 +00:00
Craig Topper	c5f25a94d8	[X86] Add SSE1 command line to atomic-fp.ll and atomic-non-integer.ll. NFC git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358141 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-10 22:35:32 +00:00
Craig Topper	e99a86d0a5	[X86] Autogenerate complete checks. NFC git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358140 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-10 22:35:24 +00:00
Craig Topper	b8a9dcf26b	[X86] Teach foldMaskedShiftToScaledMask to look through an any_extend from i32 to i64 between the and & shl foldMaskedShiftToScaledMask tries to reorder and & shl to enable the shl to fold into an LEA. But if there is an any_extend between them it doesn't work. This patch modifies the code to look through any_extend from i32 to i64 when the and mask only uses bits that weren't from the extended part. This will prevent a regression from D60358 caused by 64-bit SHL being narrowed to 32-bits when their upper bits aren't demanded. Differential Revision: https://reviews.llvm.org/D60532 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358139 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-10 21:42:08 +00:00
Craig Topper	a8783de71b	[X86] Add test case for LEA formation regression seen with D60358. NFC If we have an (add X, (and (aext (shl Y, C1)), C2)), we can pull the shift through and+aext to fold into an LEA with the. Assuming C1 is small enough and C2 masks off all of the extend bits. This pattern showed up in D60358. And we need to handle it to prevent a regression. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358124 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-10 19:09:06 +00:00
David Green	45a375eb6b	Revert rL357745: [SelectionDAG] Compute known bits of CopyFromReg Certain optimisations from ConstantHoisting and CGP rely on Selection DAG not seeing through to the constant in other blocks. Revert this patch while we come up with a better way to handle that. I will try to follow this up with some better tests. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358113 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-10 18:00:41 +00:00
Simon Pilgrim	d4d582af7c	[X86][AVX] getTargetConstantBitsFromNode - extract bits from X86ISD::SUBV_BROADCAST git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358096 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-10 16:24:47 +00:00
Craig Topper	b77427871e	[DAGCombiner][X86][SystemZ] Canonicalize SSUBO with immediate RHS to SADDO by negating the immediate. This lines up with what we do for regular subtract and it matches up better with X86 assumptions in isel patterns that add with immediate is more canonical than sub with immediate. Differential Revision: https://reviews.llvm.org/D60020 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358027 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-09 18:33:56 +00:00
Simon Pilgrim	0c7bc1e6bb	[TargetLowering] SimplifyDemandedBits - add ISD::INSERT_SUBVECTOR support git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358019 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-09 16:52:21 +00:00
Simon Pilgrim	6aaa856e17	[TargetLowering] SimplifyDemandedBits - call SimplifyDemandedBits in bitcast handling When bitcasting from a source op to a larger bitwidth op, split the demanded bits and OR them on top of one another and demand those merged bits in the SimplifyDemandedBits call on the source op. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357992 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-09 10:27:59 +00:00
Simon Pilgrim	a2b71f96a9	[TargetLowering] SimplifyDemandedBits - use DemandedElts in bitcast handling Be more selective in the SimplifyDemandedBits -> SimplifyDemandedVectorElts bitcast call based on the demanded elts. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357942 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-08 20:59:38 +00:00
Simon Pilgrim	c32c31e720	[X86][AVX] Add PR34380 shuffle test cases git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357914 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-08 14:05:42 +00:00
Sanjay Patel	50de781fc9	[x86] make 8-bit shl undesirable I was looking at a potential DAGCombiner fix for 1 of the regressions in D60278, and it caused severe regression test pain because x86 TLI lies about the desirability of 8-bit shift ops. We've hinted at making all 8-bit ops undesirable for the reason in the code comment: // TODO: Almost no 8-bit ops are desirable because they have no actual // size/speed advantages vs. 32-bit ops, but they do have a major // potential disadvantage by causing partial register stalls. ...but that leads to massive diffs and exposes all kinds of optimization holes itself. Differential Revision: https://reviews.llvm.org/D60286 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357912 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-08 13:58:50 +00:00
Craig Topper	064d9fea37	[X86] Split floating point tests out of atomic-mi.ll into atomic-fp.ll. Add avx and avx512f command lines. NFC git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357882 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-08 01:54:27 +00:00
Craig Topper	968007b51b	[X86] Add avx and avx512f command lines to atomic-non-integer.ll. NFC git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357881 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-08 01:54:24 +00:00
Craig Topper	7ef283f68d	[X86] Use (SUBREG_TO_REG (MOV32rm)) for extloadi64i8/extloadi64i16 when the load is 4 byte aligned or better and not volatile. Summary: Previously we would use MOVZXrm8/MOVZXrm16, but those are longer encodings. This is similar to what we do in the loadi32 predicate. Reviewers: RKSimon, spatel Reviewed By: RKSimon Subscribers: hiraditya, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60341 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357875 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-07 19:19:44 +00:00
Simon Pilgrim	e9d45f9ecc	[X86][SSE] SimplifyDemandedBitsForTargetNode - Add initial PACKSS support In the case where we only want the sign bit (e.g. when using PACKSS truncation of comparison results for MOVMSK) then we can just demand the sign bit of the source operands. This makes use of the fact that PACKSS saturates out of range values to the min/max int values - so the sign bit is always preserved. Differential Revision: https://reviews.llvm.org/D60333 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357859 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-07 10:40:01 +00:00
Craig Topper	ec0a133717	[X86] When converting (x << C1) AND C2 to (x AND (C2>>C1)) << C1 during isel, try using andl over andq by favoring 32-bit unsigned immediates. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357848 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-06 19:00:11 +00:00
Craig Topper	58723280e4	[X86] Use a signed mask in foldMaskedShiftToScaledMask to enable a shorter immediate encoding. This function reorders AND and SHL to enable the SHL to fold into an LEA. The upper bits of the AND will be shifted out by the SHL so it doesn't matter what mask value we use for these bits. By using sign bits from the original mask in these upper bits we might enable a shorter immediate encoding to be used. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357846 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-06 18:00:50 +00:00
Craig Topper	37f40046fb	[X86] Add test cases to show missed opportunities to use a sign extended 8 or 32 bit immediate AND when reversing SHL+AND to form an LEA. When we shift the AND mask over we should shift in sign bits instead of zero bits. The scale in the LEA will shift these bits out so it doesn't matter whether we mask the bits off or not. Using sign bits will potentially allow a sign extended immediate to be used. Also add some other test cases for cases that are currently optimal. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357845 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-06 18:00:45 +00:00
Craig Topper	af98c632de	[X86] Autogenerate complete checks. NFC git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357844 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-06 18:00:41 +00:00
Simon Pilgrim	b29aa0ed5a	[X86] Add AVX-target expandload and compressstore tests git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357842 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-06 14:40:52 +00:00
Simon Pilgrim	98514317dd	[X86] Split expandload and compressstore tests git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357840 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-06 14:14:54 +00:00
Simon Pilgrim	9d6cfa979f	[X86][SSE] Add more exhaustive masked load/store tests Reordered/renamed some existing tests to match the cleaned up order git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357839 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-06 14:01:37 +00:00
Francis Visoiu Mistrih	5122ddd5d5	[X86] Enable tail calls for CallingConv::Swift It's currently only enabled on AArch64 (enabled in r281376). git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357809 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-05 20:18:25 +00:00
Francis Visoiu Mistrih	051d4978f8	[X86] Preserve operand flag when expanding TCRETURNri The expansion of TCRETURNri(64) would not keep operand flags like undef/renamable/etc. which can result in machine verifier issues. Also add plumbing to be able to use `-run-pass=x86-pseudo`. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357808 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-05 20:18:21 +00:00
Craig Topper	2310900b63	[X86] Merge the different Jcc instructions for each condition code into single instructions that store the condition code as an operand. Summary: This avoids needing an isel pattern for each condition code. And it removes translation switches for converting between Jcc instructions and condition codes. Now the printer, encoder and disassembler take care of converting the immediate. We use InstAliases to handle the assembly matching. But we print using the asm string in the instruction definition. The instruction itself is marked IsCodeGenOnly=1 to hide it from the assembly parser. Reviewers: spatel, lebedev.ri, courbet, gchatelet, RKSimon Reviewed By: RKSimon Subscribers: MatzeB, qcolombet, eraman, hiraditya, arphaman, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D60228 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357802 91177308-0d34-0410-b5e6-96231b3b80d8	2019-04-05 19:28:09 +00:00

1 2 3 4 5 ...

14009 Commits