When checking if a chain node is foldable, make sure the intermediate nodes have a single use across all results, not just the result that was used to reach the chain node.
This recovers a test case that was severely broken by r296476, by making sure we don't create an ADD/ADC that loads and stores when there is also a flag dependency.
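For reference, a hedged sketch of the API distinction involved (illustrative, not the actual fix):

    // SDValue::hasOneUse() only counts uses of that one result, but a node
    // can have other results (e.g. a flag) with additional users. Checking
    // the node itself covers uses across all results.
    bool hasSingleUseAcrossAllResults(SDValue V) {
      // V.hasOneUse() here would miss users of the node's other results.
      return V.getNode()->hasOneUse();
    }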
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297698 91177308-0d34-0410-b5e6-96231b3b80d8
Recommitting with compile time improvements
Recommitting after fixup of 32-bit aliasing sign offset bug in DAGCombiner.
* Simplify Consecutive Merge Store Candidate Search
Now that address aliasing is much less conservative, push through a
simplified store merging search and chain alias analysis which only
checks for parallel stores through the chain subgraph. This cleanly
separates the non-interfering loads/stores from the store-merging
logic.
When merging stores, search up the chain through a single load, and
find all possible stores by looking down through a load and a
TokenFactor to all stores visited.
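As a hedged sketch of this chain-subgraph walk (illustrative names, not the committed DAGCombiner code):

    // Collect stores that are parallel on the chain: the operands of a
    // TokenFactor are independent chains, so stores reached through
    // different operands have no ordering between them.
    static void collectParallelStores(SDValue Chain,
                                      SmallVectorImpl<StoreSDNode *> &Stores) {
      if (Chain.getOpcode() == ISD::TokenFactor) {
        for (SDValue Op : Chain->op_values())
          collectParallelStores(Op, Stores);
      } else if (auto *St = dyn_cast<StoreSDNode>(Chain.getNode())) {
        Stores.push_back(St);
      }
    }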
This improves the quality of the output SelectionDAG and the output
Codegen (save perhaps for some ARM cases where we correctly construct
wider loads, but then promote them to float operations, which
requires more expensive constant generation).
Some minor peephole optimizations to deal with improved SubDAG shapes (listed below).
Additional Minor Changes:
1. Finishes removing unused AliasLoad code
2. Unifies the chain aggregation in the merged stores across code
paths
3. Re-add the Store node to the worklist after calling
SimplifyDemandedBits.
4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is
arbitrary, but seems sufficient to not cause regressions in
tests.
5. Remove Chain dependencies of Memory operations on CopyFromReg
nodes as these are captured by the data dependence
6. Forward load-store values through TokenFactors containing
{CopyToReg,CopyFromReg} Values.
7. Peephole to convert a buildvector of extract_vector_elt to
extract_subvector if possible (see
CodeGen/AArch64/store-merge.ll and the sketch after this list)
8. Store merging for the ARM target is restricted to 32-bit, as
in some contexts invalid 64-bit operations are being
generated. This can be removed once appropriate checks are
added.
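For item 7 above, a hedged sketch of the shape being matched (illustrative only, not the committed code):

    // A BUILD_VECTOR whose operands are EXTRACT_VECTOR_ELTs with
    // consecutive constant indices from one source vector can become an
    // EXTRACT_SUBVECTOR of that source.
    static bool looksLikeExtractSubvector(SDNode *BV) {
      SDValue First = BV->getOperand(0);
      if (First.getOpcode() != ISD::EXTRACT_VECTOR_ELT ||
          !isa<ConstantSDNode>(First.getOperand(1)))
        return false;
      SDValue Src = First.getOperand(0);
      uint64_t Base = cast<ConstantSDNode>(First.getOperand(1))->getZExtValue();
      for (unsigned i = 1, e = BV->getNumOperands(); i != e; ++i) {
        SDValue Op = BV->getOperand(i);
        if (Op.getOpcode() != ISD::EXTRACT_VECTOR_ELT ||
            Op.getOperand(0) != Src ||
            !isa<ConstantSDNode>(Op.getOperand(1)) ||
            cast<ConstantSDNode>(Op.getOperand(1))->getZExtValue() != Base + i)
          return false;
      }
      return true; // fold to EXTRACT_SUBVECTOR(Src, Base)
    }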
This finishes the change Matt Arsenault started in r246307 and
jyknight's original patch.
Many tests required some changes as memory operations are now
reorderable, improving load-store forwarding. One test in
particular is worth noting:
CodeGen/PowerPC/ppc64-align-long-double.ll - Improved load-store
forwarding converts a load-store pair into a parallel store and
a memory-realized bitcast of the same value. However, because we
lose the sharing of the explicit and implicit store values, we
must create another local store. A similar transformation
happens before SelectionDAG as well.
Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297695 91177308-0d34-0410-b5e6-96231b3b80d8
rL230225 made the assumption that only the lower 32 bits of an MMX register load are used as the shift value, when in fact the whole 64 bits are reloaded and treated as an i64 to determine the shift value.
This patch reverts rL230225 to ensure that the whole 64 bits of memory are folded, and ensures that the upper 32 bits are zeroed for cases where the shift value has come from a scalar source.
Found during fuzz testing.
Differential Revision: https://reviews.llvm.org/D30833
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297667 91177308-0d34-0410-b5e6-96231b3b80d8
I am leaving the code in clang which filters mxcsr from the clobber list because that is still technically correct and will be useful again when the MXCSR register is reintroduced.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297664 91177308-0d34-0410-b5e6-96231b3b80d8
This commit adds tail call support to the MachineOutliner pass. This allows
the outliner to insert jumps rather than calls in areas where tail calling is
possible. Outlined tail calls include the return or terminator of the basic
block being outlined from.
Tail call support allows the outliner to take returns and terminators into
consideration while finding candidates to outline. It also allows the outliner
to save more instructions. For example, in the X86-64 outliner, a tail-called
outlined function saves one instruction since no return has to be inserted.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297653 91177308-0d34-0410-b5e6-96231b3b80d8
For AVX-512 we force the input to zero if the input is undef or the mask is all ones to break an execution dependency. This patch brings the same behavior to AVX2.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297652 91177308-0d34-0410-b5e6-96231b3b80d8
We were already forcing undef inputs to become a zero vector; this now catches an all-ones mask too.
Ideally we'd use undef and let the execution dependency fix pass handle picking the best register/clearance for the undef, but I don't think it can handle the early clobber today.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297651 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts r297596.
There were other issues that were making this not work that have been fixed now. Reverting this results in a more accurate table.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297602 91177308-0d34-0410-b5e6-96231b3b80d8
This exposed that we have several intrinsic instructions that have identical TSFlags to other instructions. We should merge their patterns and kill off the duplicates. I'll fix that in a follow-up patch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297596 91177308-0d34-0410-b5e6-96231b3b80d8
The immediate should be 1 or 2, not 0 or 1. This was found while adding bounds checking to clang. In fact, the existing clang builtin test failed if we ran it all the way to assembly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297591 91177308-0d34-0410-b5e6-96231b3b80d8
I noticed unnecessary 'sbb' instructions in D30472 and while looking at 'ptest' codegen recently.
This happens because we were transforming any 'setb' - even when we only wanted a single-bit result.
This patch moves those transforms under visitAdd/visitSub, so we're only creating sbb/adc when it
is a win. I don't know why we need a SETCC_CARRY node type, but I'm not proposing to change that
existing behavior in this patch.
Also, I'm skeptical that sbb/adc are a win for all micro-arches, so I added comments to the test files
where this transform still fires.
The test changes here are all cases where we no longer produce sbb/adc. Avoiding partial register
stalls (generating an xor to clear a register) is not handled in some cases, but that's a separate
issue.
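A scalar illustration of the distinction (hedged; the function names are mine):

    // For a bare boolean result, 'setb' is already ideal; materializing the
    // carry with sbb yields 0/-1 and needs an extra mask. sbb only pays off
    // when the carry feeds a subtraction (or adc an addition).
    unsigned isBelow(unsigned a, unsigned b) {
      return a < b;                 // best: cmp; setb
    }
    unsigned subWithBorrow(unsigned a, unsigned b, unsigned lo1, unsigned lo2) {
      return a - b - (lo1 < lo2);   // profitable: cmp; sbb
    }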
Differential Revision: https://reviews.llvm.org/D30611
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297586 91177308-0d34-0410-b5e6-96231b3b80d8
I'm pretty sure there are more problems lurking here. But I think this fixes PR32241.
I've added the test case from that bug and added asserts that will fail if we ever try to copy between high registers and mask registers again.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297574 91177308-0d34-0410-b5e6-96231b3b80d8
Without SSE41 (pextrb) we currently extract byte elements from a vector by spilling to stack and reloading the byte.
This patch is an initial attempt at using MOVD/PEXTRW to extract the relevant DWORD/WORD from the vector and then shift+truncate to collect the correct byte.
Extraction of multiple bytes this way would result in code bloat, but as explained in the patch we could probably afford to be more aggressive with the supported extractions before falling back on spilling again - possibly by counting the number of extracts and which DWORD/WORD they originate from.
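A scalar sketch of the DWORD route (hypothetical helper, not LLVM code):

    #include <cstdint>
    // Fetch the 32-bit lane that contains the requested byte (what a
    // MOVD/PEXTRW-style lane read would give us), then shift and truncate.
    uint8_t extractByte(const uint32_t Lanes[4], unsigned Idx) {
      uint32_t DWord = Lanes[Idx / 4];
      return uint8_t(DWord >> ((Idx % 4) * 8));
    }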
Differential Revision: https://reviews.llvm.org/D29841
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297568 91177308-0d34-0410-b5e6-96231b3b80d8
This only requires a 64-bit memory source, not the whole 128 bits. But the 128-bit case is still supported via X86InstrInfo::foldMemoryOperandImpl.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297523 91177308-0d34-0410-b5e6-96231b3b80d8
SelectionDAG::ComputeNumSignBits is poor at build_vector handling, meaning that we can't see that all the vXi64 sources are in fact sign extended i32 or smaller.
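For reference, a hedged sketch of the missing build_vector handling (a fragment in the style of ComputeNumSignBits; the conservative answer is the minimum over all operands):

    case ISD::BUILD_VECTOR: {
      // Sign bits common to the whole vector: the minimum over the scalar
      // operands (assumes operands have the vector's element type).
      unsigned Common = VTBits;
      for (SDValue Elt : Op->op_values())
        Common = std::min(Common, ComputeNumSignBits(Elt, Depth + 1));
      return Common;
    }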
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297486 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Depends on D30379
This improves the state of things for the sub class of operations.
Reviewers: jyknight, nemanjai, mkuper, spatel, RKSimon, zvi, bkramer
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D30436
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297482 91177308-0d34-0410-b5e6-96231b3b80d8
If we are transferring MMX registers to XMM for conversion we could use the MMX equivalents (CVTPI2PD + CVTPI2PS) without affecting rounding/exceptions etc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297481 91177308-0d34-0410-b5e6-96231b3b80d8
If we are transferring XMM conversion results to MMX registers we could use the MMX equivalents (CVTPD2PI/CVTTPD2PI + CVTPS2PI/CVTTPS2PI) without affecting rounding/exceptions etc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297476 91177308-0d34-0410-b5e6-96231b3b80d8
As discussed in the review thread for rL297026, this is actually 2 changes that
would independently fix all of the test cases in the patch:
1. Return undef in FoldConstantArithmetic for div/rem by 0 (see the sketch after this list).
2. Move basic undef simplifications for div/rem (simplifyDivRem()) before
foldBinopIntoSelect() as a matter of efficiency.
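A minimal sketch of change 1, with details elided ('Opcode', 'Divisor', and 'VT' stand in for the surrounding FoldConstantArithmetic state):

    // Division or remainder by zero has no defined result, so fold to
    // undef instead of trying to compute a value.
    if (Opcode == ISD::SDIV || Opcode == ISD::UDIV ||
        Opcode == ISD::SREM || Opcode == ISD::UREM)
      if (Divisor && Divisor->isNullValue()) // constant divisor operand
        return getUNDEF(VT);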
I will handle the case of vectors with any zero element as a follow-up. That change
is the DAG sibling of D30665, plus adding a check of vector elements to FoldConstantVectorArithmetic().
I'm deleting the test for PR30693 because it does not test for the actual bug any more
(dangers of using bugpoint).
Differential Revision: https://reviews.llvm.org/D30741
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297384 91177308-0d34-0410-b5e6-96231b3b80d8
With this, it shows up as an attribute in YAML and non-printable characters
are properly removed by GlobalValue::getRealLinkageName.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297362 91177308-0d34-0410-b5e6-96231b3b80d8
This test could probably be reduced. The check fails for a seemingly unrelated change,
so I'm adding full checks to see what is happening.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297296 91177308-0d34-0410-b5e6-96231b3b80d8
A bit more painful than G_INSERT because it was more widely used, but this
should simplify the handling of extract operations in most locations.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297100 91177308-0d34-0410-b5e6-96231b3b80d8
Fixed the ASan bot failure which led to the last commit of the outliner being reverted.
The change is in lib/CodeGen/MachineOutliner.cpp in the SuffixTree's constructor: LeafVector
is no longer initialized using reserve but with a standard constructor.
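A hedged illustration of the bug class (generic C++, not the outliner source):

    #include <vector>
    std::vector<int> makeLeaves(unsigned N) {
      std::vector<int> V;
      V.reserve(N); // allocates capacity only; size() is still 0, so V[i]
                    // would be out of bounds here (what ASan catches)
      V.resize(N);  // actually creates N elements; indexing is now valid
      return V;
    }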
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297081 91177308-0d34-0410-b5e6-96231b3b80d8
Use the store size of the argument type, which will be a byte-sized
quantity, rather than dividing the size in bits by 8.
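A hedged sketch of the distinction ('DL' and 'ArgTy' are illustrative; getTypeStoreSize is the DataLayout API):

    // getTypeStoreSize rounds up to whole bytes, so i1 yields 1 and i36
    // yields 5; dividing the bit width by 8 truncates to 0 and 4.
    uint64_t Bytes = DL.getTypeStoreSize(ArgTy);
    // previously (buggy for non-byte-multiple widths):
    // uint64_t Bytes = DL.getTypeSizeInBits(ArgTy) / 8;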
Fixes PR32136 and re-enables copy elision from i64 arguments.
Reverts the workaround from r296950.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297045 91177308-0d34-0410-b5e6-96231b3b80d8
Refactoring of duplicated code and more fixes to follow.
This is motivated by the post-commit comments for r296699:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20170306/435182.html
I.e., we can crash if we're missing obvious simplifications like this that
exist in the IR simplifier or if these occur later than expected.
The x86 change for non-splat division shows a potential opportunity to improve
vector codegen: we assumed that since only one lane had meaningful results, we
should do the math in scalar. But that means moving back and forth from vector
registers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297026 91177308-0d34-0410-b5e6-96231b3b80d8
These are not x86-specific, but the problem is not visible for all targets
because it is masked by other transforms. These can lead to compiler crashes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297017 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Functions with the "xray-log-args" attribute will have a special XRay sled kind
emitted, for compiler-rt to copy any call arguments to your logging handler.
For practical and performance reasons, only the first argument is supported, and
only up to 64 bits.
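A hedged usage illustration, assuming the matching clang-side attribute is spelled xray_log_args (argument numbering starts at 1):

    #include <cstdint>
    __attribute__((xray_always_instrument, xray_log_args(1)))
    void handleRequest(uint64_t RequestId) {
      // With the log-args sled, the XRay runtime can copy RequestId (the
      // first argument, at most 64 bits) into the log entry.
    }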
Reviewers: dberris
Reviewed By: dberris
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D29702
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296998 91177308-0d34-0410-b5e6-96231b3b80d8
As described on PR31712, we miss a variety of legalization combines because we lower these to X86ISD::VSEXT/VZEXT despite them having the same functionality. This patch makes 128-bit (SSE41) SIGN/ZERO_EXTEND_VECTOR_IN_REG ops legal, adds the necessary tablegen plumbing and uses a helper 'getExtendInVec' to decide when to use SIGN/ZERO_EXTEND_VECTOR_IN_REG or VSEXT/VZEXT.
We're missing a couple of shuffle combines that will be added in a future patch for review.
Later patches can then support the AVX2 cases as a mixture of SIGN/ZERO_EXTEND and SIGN/ZERO_EXTEND_VECTOR_IN_REG, and then finally deal with the AVX512 cases.
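A minimal sketch of what the 'getExtendInVec' helper chooses between, with an illustrative structure (the committed version may differ):

    static SDValue getExtendInVec(bool Signed, const SDLoc &DL, EVT VT,
                                  SDValue In, SelectionDAG &DAG) {
      // 128-bit results use the now-legal generic nodes; wider types keep
      // using the X86-specific opcodes until AVX2/AVX512 support lands.
      if (VT.is128BitVector())
        return DAG.getNode(Signed ? ISD::SIGN_EXTEND_VECTOR_IN_REG
                                  : ISD::ZERO_EXTEND_VECTOR_IN_REG,
                           DL, VT, In);
      return DAG.getNode(Signed ? X86ISD::VSEXT : X86ISD::VZEXT, DL, VT, In);
    }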
Differential Revision: https://reviews.llvm.org/D30549
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296985 91177308-0d34-0410-b5e6-96231b3b80d8
The larger goal is to move the ADC/SBB transforms currently in
combineX86SetCC() to combineAddOrSubToADCOrSBB() because we're
creating ADC/SBB in lots of places where we shouldn't.
This was intended to be an NFC change, but avx-512 has something
strange going on. It doesn't seem like any of the affected tests
should really be using SET+TEST or ADC; a simple ADD could replace
several instructions. But that's another bug...
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296978 91177308-0d34-0410-b5e6-96231b3b80d8
select Cond, C +/- 1, C --> add(ext Cond), C -- with a target hook.
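In scalar terms, a hedged illustration of the transform (the function names are mine):

    // select Cond, C+1, C  becomes  add (zext Cond), C
    int selectForm(bool c) { return c ? 43 : 42; }
    int addForm(bool c)    { return int(c) + 42; } // same result, no branch/cmov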
This is part of the ongoing process to obsolete D24480. The motivation is to
canonicalize to select IR in InstCombine whenever possible, so we need to have a way to
undo that easily in codegen.
PowerPC is an obvious winner for this kind of transform because it has fast and complete
bit-twiddling abilities but generally lousy conditional execution perf (although this might
have changed in recent implementations).
x86 also sees some wins, but the effect is limited because these transforms already mostly
exist in its target-specific combineSelectOfTwoConstants(). The fact that we see any x86
changes just shows that that code is a mess of special-case holes. We may be able to remove
some of that logic now.
My guess is that other targets will want to enable this hook for most cases. The likely
follow-ups would be to add value type and/or the constants themselves as parameters for the
hook. As the tests in select_const.ll show, we can transform any select-of-constants to
math/logic, but the general transform for any 2 constants needs one more instruction
(multiply or 'and').
ARM is one target that I think may not want this for most cases. I see infinite loops there
because it wants to use selects to enable conditionally executed instructions.
Differential Revision: https://reviews.llvm.org/D30537
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296977 91177308-0d34-0410-b5e6-96231b3b80d8
Long ago (2010 according to svn blame), combineShuffle probably needed to prevent the accidental creation of illegal i64 types, but there no longer appear to be any combines that can cause this, as they all have their own legality checks.
Differential Revision: https://reviews.llvm.org/D30213
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296966 91177308-0d34-0410-b5e6-96231b3b80d8