RPCSX/llvm - llvm - Gitea: Git with a cup of tea

RPCSX/llvm

mirror of https://github.com/RPCSX/llvm.git synced 2025-04-12 13:03:14 +00:00

Author	SHA1	Message	Date
Sanjay Patel	66c0dee698	[DAG] add folds for negated shifted sign bit The same folds exist in InstCombine already. This came up as part of: https://reviews.llvm.org/D25485 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284239 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-14 14:26:47 +00:00
Nicolai Haehnle	9b219839bc	Fix use-after-frees Extracted from D25313, as suggested by Justin Bogner. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284220 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-14 09:49:51 +00:00
Craig Topper	cfa4f53d33	[DAGCombiner] Teach createBuildVecShuffle to handle cases where input vectors are less than half of the output vector size. This will be needed by a future commit to support sign/zero extending from v8i8 to v8i64 which requires a sign/zero_extend_vector_inreg to be created which requires v8i8 to be concatenated upto v64i8 and goes through this code. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284204 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-14 06:00:42 +00:00
Sanjay Patel	52b9988f47	[DAG] hoist DL(N) and fix formatting; NFC git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284170 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-13 22:27:10 +00:00
Nirav Dave	080559c6d3	Revert "In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled." This reverts commit r284151 which appears to be triggering a LTO failures on Hexagon git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284157 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-13 20:23:25 +00:00
Nirav Dave	19dc709f4b	In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled. Retrying after upstream changes. Simplify Consecutive Merge Store Candidate Search Now that address aliasing is much less conservative, push through simplified store merging search which only checks for parallel stores through the chain subgraph. This is cleaner as the separation of non-interfering loads/stores from the store-merging logic. Whem merging stores, search up the chain through a single load, and finds all possible stores by looking down from through a load and a TokenFactor to all stores visited. This improves the quality of the output SelectionDAG and generally the output CodeGen (with some exceptions). Additional Minor Changes: 1. Finishes removing unused AliasLoad code 2. Unifies the the chain aggregation in the merged stores across code paths 3. Re-add the Store node to the worklist after calling SimplifyDemandedBits. 4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is arbitrary, but seemed sufficient to not cause regressions in tests. This finishes the change Matt Arsenault started in r246307 and jyknight's original patch. Many tests required some changes as memory operations are now reorderable. Some tests relying on the order were changed to use volatile memory operations Noteworthy tests: CodeGen/AArch64/argument-blocks.ll - It's not entirely clear what the test_varargs_stackalign test is supposed to be asserting, but the new code looks right. CodeGen/AArch64/arm64-memset-inline.lli - CodeGen/AArch64/arm64-stur.ll - CodeGen/ARM/memset-inline.ll - The backend now generates worse code due to store merging succeeding, as we do do a 16-byte constant-zero store efficiently. CodeGen/AArch64/merge-store.ll - Improved, but there still seems to be an extraneous vector insert from an element to itself? CodeGen/PowerPC/ppc64-align-long-double.ll - Worse code emitted in this case, due to the improved store->load forwarding. CodeGen/X86/dag-merge-fast-accesses.ll - CodeGen/X86/MergeConsecutiveStores.ll - CodeGen/X86/stores-merging.ll - CodeGen/Mips/load-store-left-right.ll - Restored correct merging of non-aligned stores CodeGen/AMDGPU/promote-alloca-stored-pointer-value.ll - Improved. Correctly merges buffer_store_dword calls CodeGen/AMDGPU/si-triv-disjoint-mem-access.ll - Improved. Sidesteps loading a stored value and merges two stores CodeGen/X86/pr18023.ll - This test has been removed, as it was asserting incorrect behavior. Non-volatile stores CAN be moved past volatile loads, and now are. CodeGen/X86/vector-idiv.ll - CodeGen/X86/vector-lzcnt-128.ll - It's basically impossible to tell what these tests are actually testing. But, looks like the code got better due to the memory operations being recognized as non-aliasing. CodeGen/X86/win32-eh.ll - Both loads of the securitycookie are now merged. CodeGen/AMDGPU/vgpr-spill-emergency-stack-slot-compute.ll - This test appears to work but no longer exhibits the spill behavior. Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle Subscribers: wdng, nhaehnle, nemanjai, arsenm, weimingz, niravd, RKSimon, aemerson, qcolombet, dsanders, resistor, tstellarAMD, t.p.northover, spatel Differential Revision: https://reviews.llvm.org/D14834 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284151 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-13 19:20:16 +00:00
Simon Pilgrim	97ca021c6f	[DAGCombiner] Add vector support to (mul (shl X, Y), Z) -> (shl (mul X, Z), Y) style combines git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284122 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-13 14:04:35 +00:00
Simon Pilgrim	9dbdf67986	[DAGCombiner] Add vector support to C2-(A+C1) -> (C2-C1)-A folding git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284117 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-13 12:49:31 +00:00
Simon Pilgrim	00d9dddfae	[DAGCombiner] Add vector support to (sub -1, x) -> (xor x, -1) canonicalization Improves commutation potential git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284113 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-13 12:05:20 +00:00
Simon Pilgrim	58682d3599	[DAGCombiner] Update most ADD combines to support general vector combines Add a number of helper functions to match scalar or vector equivalent constant/splat values to allow most of the combine patterns to be used by vectors. Differential Revision: https://reviews.llvm.org/D25374 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284015 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-12 13:48:10 +00:00
Konstantin Zhuravlyov	b34a186175	[DAGCombiner] Do not remove the load of stored values when optimizations are disabled This combiner breaks debug experience and should not be run when optimizations are disabled. For example: int main() { int j = 0; j += 2; if (j == 2) return 0; return 5; } When debugging this code compiled in /O0, it should be valid to break at line "j+=2;" and edit the value of j. It should change the return value of the function. Differential Revision: https://reviews.llvm.org/D19268 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284014 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-12 13:44:24 +00:00
Michael Kuperstein	13a7e10301	[DAG] Fix crash in build_vector -> vector_shuffle combine Fixes a crash in the build_vector -> vector_shuffle combine when the first vector input is twice as wide as the output, and the second input vector is even wider. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283953 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-11 22:44:31 +00:00
Sanjay Patel	14c8ab9f30	[DAG] add fold for masked negated sign-extended bool This enhances the fold added with: https://reviews.llvm.org/rL283900 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283905 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-11 17:05:52 +00:00
Sanjay Patel	4cc2f1f09a	[DAG] add fold for masked negated extended bool The non-obvious motivation for adding this fold (which already happens in InstCombine) is that we want to canonicalize IR towards select instructions and canonicalize DAG nodes towards boolean math. So we need to recreate some folds in the DAG to handle that change in direction. An interesting implementation difference for cases like this is that InstCombine generally works top-down while the DAG goes bottom-up. That means we need to detect different patterns. In this case, the SimplifyDemandedBits fold prevents us from performing a zext to sext fold that would then be recognized as a negation of a sext. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283900 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-11 16:26:36 +00:00
Sanjay Patel	b3c22e4e95	[DAG] simplify logic; NFC git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283885 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-11 14:14:30 +00:00
Sanjay Patel	8a60c21c5e	[DAG] hoist DL(N) and fix formatting; NFC git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283884 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-11 14:04:24 +00:00
Sanjay Patel	4566b7e67b	[DAG] fix formatting; NFC git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283878 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-11 13:47:43 +00:00
Sanjay Patel	72ac867a92	[DAG] clean up foldSelectOfConstants(); NFCI Rename variables, simplify logic. Not clear yet why we don't handle a target with ZeroOrNegativeOneBooleanContent too. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283613 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-07 21:55:42 +00:00
Sanjay Patel	7e1f5027b3	[DAG] move fold (select C, 0, 1 -> xor C, 1) to a helper function; NFC We're missing at least 3 other similar folds based on what we have in InstCombine. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283596 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-07 20:47:51 +00:00
Vedant Kumar	4a0edca8c6	Delete some dead code in SelectionDAG (NFC) Differential Revision: https://reviews.llvm.org/D24435 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283505 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-06 22:53:43 +00:00
Michael Kuperstein	1ac52953b4	[DAG] Generalize build_vector -> vector_shuffle combine for more than 2 inputs This generalizes the build_vector -> vector_shuffle combine to support any number of inputs. The idea is to create a binary tree of shuffles, where the first layer performs pairwise shuffles of the input vectors placing each input element into the correct lane, and the rest of the tree blends these shuffles together. This doesn't try to be smart and create any sort of "optimal" shuffles. The assumption is that even a "poor" shuffle sequence is better than extracting and inserting the elements one by one. Differential Revision: https://reviews.llvm.org/D24683 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283480 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-06 18:58:24 +00:00
Sanjay Patel	097e03386e	fix formatting; NFC git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283115 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-03 15:18:36 +00:00
Nirav Dave	bb15ebf5c7	Revert "In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled." This reverts commit r282600 due to test failues with MCJIT git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282604 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-28 16:37:50 +00:00
Nirav Dave	a6d3e00dff	In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled. Simplify Consecutive Merge Store Candidate Search Now that address aliasing is much less conservative, push through simplified store merging search which only checks for parallel stores through the chain subgraph. This is cleaner as the separation of non-interfering loads/stores from the store-merging logic. Whem merging stores, search up the chain through a single load, and finds all possible stores by looking down from through a load and a TokenFactor to all stores visited. This improves the quality of the output SelectionDAG and generally the output CodeGen (with some exceptions). Additional Minor Changes: 1. Finishes removing unused AliasLoad code 2. Unifies the the chain aggregation in the merged stores across code paths 3. Re-add the Store node to the worklist after calling SimplifyDemandedBits. 4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is arbitrary, but seemed sufficient to not cause regressions in tests. This finishes the change Matt Arsenault started in r246307 and jyknight's original patch. Many tests required some changes as memory operations are now reorderable. Some tests relying on the order were changed to use volatile memory operations Noteworthy tests: CodeGen/AArch64/argument-blocks.ll - It's not entirely clear what the test_varargs_stackalign test is supposed to be asserting, but the new code looks right. CodeGen/AArch64/arm64-memset-inline.lli - CodeGen/AArch64/arm64-stur.ll - CodeGen/ARM/memset-inline.ll - The backend now generates worse code due to store merging succeeding, as we do do a 16-byte constant-zero store efficiently. CodeGen/AArch64/merge-store.ll - Improved, but there still seems to be an extraneous vector insert from an element to itself? CodeGen/PowerPC/ppc64-align-long-double.ll - Worse code emitted in this case, due to the improved store->load forwarding. CodeGen/X86/dag-merge-fast-accesses.ll - CodeGen/X86/MergeConsecutiveStores.ll - CodeGen/X86/stores-merging.ll - CodeGen/Mips/load-store-left-right.ll - Restored correct merging of non-aligned stores CodeGen/AMDGPU/promote-alloca-stored-pointer-value.ll - Improved. Correctly merges buffer_store_dword calls CodeGen/AMDGPU/si-triv-disjoint-mem-access.ll - Improved. Sidesteps loading a stored value and merges two stores CodeGen/X86/pr18023.ll - This test has been removed, as it was asserting incorrect behavior. Non-volatile stores CAN be moved past volatile loads, and now are. CodeGen/X86/vector-idiv.ll - CodeGen/X86/vector-lzcnt-128.ll - It's basically impossible to tell what these tests are actually testing. But, looks like the code got better due to the memory operations being recognized as non-aliasing. CodeGen/X86/win32-eh.ll - Both loads of the securitycookie are now merged. CodeGen/AMDGPU/vgpr-spill-emergency-stack-slot-compute.ll - This test appears to work but no longer exhibits the spill behavior. Reviewers: arsenm, hfinkel, tstellarAMD, nhaehnle, jyknight Subscribers: wdng, nhaehnle, nemanjai, arsenm, weimingz, niravd, RKSimon, aemerson, qcolombet, resistor, tstellarAMD, t.p.northover, spatel Differential Revision: https://reviews.llvm.org/D14834 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282600 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-28 15:50:43 +00:00
Michael Kuperstein	db4e01f9a1	[DAG] Remove isVectorClearMaskLegal() check from vector_build dagcombine This check currently doesn't seem to do anything useful on any in-tree target: On non-x86, it always evaluates to false, so we never hit the code path that creates the shuffle with zero. On x86, it just forwards to isShuffleMaskLegal(), which is a reasonable thing to query in general, but doesn't make sense if only restricted to zero blends. Differential Revision: https://reviews.llvm.org/D24625 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282567 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-28 06:13:58 +00:00
Wei Mi	f54e133dd2	Change the order of the splitted store from high - low to low - high. It is a trivial change which could make the testcase easier to be reused for the store splitting in CodeGenPrepare. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281846 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-18 06:10:32 +00:00
Sanjay Patel	a7c48ccd3f	getValueType().getSizeInBits() -> getValueSizeInBits() ; NFCI git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281493 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-14 16:05:51 +00:00
Sanjay Patel	04e0167eac	getValueType().getScalarSizeInBits() -> getScalarValueSizeInBits() ; NFCI git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281490 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-14 15:43:44 +00:00
Sanjay Patel	f4559b5e2c	getScalarType().getSizeInBits() -> getScalarSizeInBits() ; NFCI git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281489 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-14 15:21:00 +00:00
Michael Kuperstein	3389341950	[DAG] Allow build-to-shuffle combine to combine builds from two wide vectors. This allows us to, in some cases, create a vector_shuffle out of a build_vector, when the inputs to the build are extract_elements from two different vectors, at least one of which is wider than the output. (E.g. a <8 x i16> being constructed out of elements from a <16 x i16> and a <8 x i16>). Differential Revision: https://reviews.llvm.org/D24491 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281402 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-13 21:53:32 +00:00
Simon Pilgrim	8288ccf5dc	[DAGCombiner] Use APInt directly in (shl (zext (srl x, C)), C) combine range test To avoid assertion, we must ensure that the inner shift constant is within range before calling ConstantSDNode::getZExtValue(). We already know that the outer shift constant is in range. Followup to D23007 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281362 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-13 18:33:29 +00:00
Simon Pilgrim	45e3537596	[DAGCombiner] Use APInt directly in (shl (ext (shl x, c1)), c2) combine Fix failure to detect out of range shift constants leading to assert in ConstantSDNode::getZExtValue() Followup to D23007 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281354 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-13 17:15:28 +00:00
Ayman Musa	7dc5b7e72a	Remove MVT:i1 xor instruction before SELECT. (Performance improvement). Differential Revision: https://reviews.llvm.org/D23764 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281308 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-13 09:12:45 +00:00
Michael Kuperstein	8d684cae12	[DAG] Refactor BUILD_VECTOR combine to make it easier to extend. NFCI. This should make it easier to add cases that we currently don't cover, like supporting more kinds of type mismatches and more than 2 input vectors. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281283 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-13 00:57:43 +00:00
Justin Lebar	c71d5b41ef	[CodeGen] Split out the notions of MI invariance and MI dereferenceability. Summary: An IR load can be invariant, dereferenceable, neither, or both. But currently, MI's notion of invariance is IR-invariant && IR-dereferenceable. This patch splits up the notions of invariance and dereferenceability at the MI level. It's NFC, so adds some probably-unnecessary "is-dereferenceable" checks, which we can remove later if desired. Reviewers: chandlerc, tstellarAMD Subscribers: jholewinski, arsenm, nemanjai, llvm-commits Differential Revision: https://reviews.llvm.org/D23371 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281151 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-11 01:38:58 +00:00
Simon Pilgrim	31bcc1eeed	[SelectionDAG] Ensure DAG::getZeroExtendInReg is called with a scalar type Fixes issue with rL280927 identified by Mikael Holmén git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281042 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-09 13:31:52 +00:00
Simon Pilgrim	707719e625	[DAGCombiner] Enable AND combines of splatted constant vectors Allow AND combines to use a vector splatted constant as well as a constant scalar. Preliminary part of D24253. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280926 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-08 12:36:39 +00:00
Hal Finkel	8d0ebe9402	[DAGCombine] More fixups to SETCC legality checking (visitANDLike/visitORLike) I might have called this "r246507, the sequel". It fixes the same issue, as the issue has cropped up in a few more places. The underlying problem is that isSetCCEquivalent can pick up select_cc nodes with a result type that is not legal for a setcc node to have, and if we use that type to create new setcc nodes, nothing fixes that (and so we've violated the contract that the infrastructure has with the backend regarding setcc node types). Fixes PR30276. For convenience, here's the commit message from r246507, which explains the problem is greater detail: [DAGCombine] Fixup SETCC legality checking SETCC is one of those special node types for which operation actions (legality, etc.) is keyed off of an operand type, not the node's value type. This makes sense because the value type of a legal SETCC node is determined by its operands' value type (via the TLI function getSetCCResultType). When the SDAGBuilder creates SETCC nodes, it either creates them with an MVT::i1 value type, or directly with the value type provided by TLI.getSetCCResultType. The first problem being fixed here is that DAGCombine had several places querying TLI.isOperationLegal on SETCC, but providing the return of getSetCCResultType, instead of the operand type directly. This does not mean what the author thought, and "luckily", most in-tree targets have SETCC with Custom lowering, instead of marking them Legal, so these checks return false anyway. The second problem being fixed here is that two of the DAGCombines could create SETCC nodes with arbitrary (integer) value types; specifically, those that would simplify: (setcc a, b, op1) and\|or (setcc a, b, op2) -> setcc a, b, op3 (which is possible for some combinations of (op1, op2)) If the operands of the and\|or node are actual setcc nodes, then this is not an issue (because the and\|or must share the same type), but, the relevant code in DAGCombiner::visitANDLike and DAGCombiner::visitORLike actually calls DAGCombiner::isSetCCEquivalent on each operand, and that function will recognise setcc-like select_cc nodes with other return types. And, thus, when creating new SETCC nodes, we need to be careful to respect the value-type constraint. This is even true before type legalization, because it is quite possible for the SELECT_CC node to have a legal type that does not happen to match the corresponding TLI.getSetCCResultType type. To be explicit, there is nothing that later fixes the value types of SETCC nodes (if the type is legal, but does not happen to match TLI.getSetCCResultType). Creating SETCCs with an MVT::i1 value type seems to work only because, either MVT::i1 is not legal, or it is what TLI.getSetCCResultType returns if it is legal. Fixing that is a larger change, however. For the time being, restrict the relevant transformations to produce only SETCC nodes with a value type matching TLI.getSetCCResultType (or MVT::i1 prior to type legalization). Fixes PR24636. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280767 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-06 23:02:23 +00:00
Wei Mi	c306b6131a	Split the store of a wide value merged from an int-fp pair into multiple stores. For the store of a wide value merged from a pair of values, especially int-fp pair, sometimes it is more efficent to split it into separate narrow stores, which can remove the bitwise instructions or sink them to colder places. Now the feature is only enabled on x86 target, and only store of int-fp pair is splitted. It is possible that the application scope gets extended with perf evidence support in the future. Differential Revision: https://reviews.llvm.org/D22840 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280505 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-02 17:17:04 +00:00
Andrea Di Biagio	997fee1acc	[DAGcombiner] Fix incorrect sinking of a truncate into the operand of a shift. This fixes a regression introduced by revision 268094. Revision 268094 added the following dag combine rule: // trunc (shl x, K) -> shl (trunc x), K => K < vt.size / 2 That rule converts a truncate of a shift-by-constant into a shift of a truncated value. We do this only if the shift count is less than half the size in bits of the truncated value (K < vt.size / 2). The problem is that the constraint on the shift count is incorrect, so the rule doesn't work well in some cases involving vector types. The combine rule should have been written instead like this: // trunc (shl x, K) -> shl (trunc x), K => K < vt.getScalarSizeInBits() Basically, if K is smaller than the "scalar size in bits" of the truncated value then we know that by "sinking" the truncate into the operand of the shift we would never accidentally make the shift undefined. This patch fixes the check on the shift count, and adds test cases to make sure that we don't regress the behavior. Differential Revision: https://reviews.llvm.org/D24154 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280482 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-02 11:29:09 +00:00
Michael Kuperstein	1b35037238	[DAGCombine] Don't fold a trunc if it feeds an anyext Legalization tends to create anyext(trunc) patterns. This should always be combined - into either a single trunc, a single ext, or nothing if the types match exactly. But if we happen to combine the trunc first, we may pull the trunc away from the anyext or make it implicit (e.g. the truncate(extract) -> extract(bitcast) fold). To prevent this, we can avoid doing the fold, similarly to how we already handle fpround(fpextend). Differential Revision: https://reviews.llvm.org/D23893 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280386 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-01 17:59:24 +00:00
Simon Pilgrim	24f7c903eb	Use SDValue::getOpcode() helper instead of via SDValue::getNode() git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279381 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-20 20:04:18 +00:00
Justin Bogner	7d7a23e700	Replace a few more "fall through" comments with LLVM_FALLTHROUGH Follow up to r278902. I had missed "fall through", with a space. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278970 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-17 20:30:52 +00:00
David Majnemer	975248e4fb	Use the range variant of find instead of unpacking begin/end If the result of the find is only used to compare against end(), just use is_contained instead. No functionality change is intended. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278433 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-11 22:21:41 +00:00
David Majnemer	dc9c737666	Use range algorithms instead of unpacking begin/end No functionality change is intended. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278417 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-11 21:15:00 +00:00
Simon Pilgrim	d8d5e79555	[DAGCombine] Avoid INSERT_SUBVECTOR reinsertions (PR28678) If the input vector to INSERT_SUBVECTOR is another INSERT_SUBVECTOR, and this inserted subvector replaces the last insertion, then insert into the common source vector. i.e. INSERT_SUBVECTOR( INSERT_SUBVECTOR( Vec, SubOld, Idx ), SubNew, Idx ) --> INSERT_SUBVECTOR( Vec, SubNew, Idx ) Differential Revision: https://reviews.llvm.org/D23330 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278211 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-10 10:50:53 +00:00
Simon Pilgrim	4cdc9853bc	[DAGCombiner] Better support for shifting large value type by constants As detailed on D22726, much of the shift combining code assume constant values will fit into a uint64_t value and calls ConstantSDNode::getZExtValue where it probably shouldn't (leading to asserts). Using APInt directly avoids this problem but we encounter other assertions if we attempt to compare/operate on 2 APInt of different bitwidths. This patch adds a helper function to ensure that 2 APInt values are zero extended as required so that they can be safely used together. I've only added an initial example use for this to the '(SHIFT (SHIFT x, c1), c2) --> (SHIFT x, (ADD c1, c2))' combines. Further cases can easily be added as required. Differential Revision: https://reviews.llvm.org/D23007 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278141 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-09 17:39:11 +00:00
Nikolai Bozhenov	17c4ba4fe4	[X86] Heuristic to selectively build Newton-Raphson SQRT estimation On modern Intel processors hardware SQRT in many cases is faster than RSQRT followed by Newton-Raphson refinement. The patch introduces a simple heuristic to choose between hardware SQRT instruction and Newton-Raphson software estimation. The patch treats scalars and vectors differently. The heuristic is that for scalars the compiler should optimize for latency while for vectors it should optimize for throughput. It is based on the assumption that throughput bound code is likely to be vectorized. Basically, the patch disables scalar NR for big cores and disables NR completely for Skylake. Firstly, scalar SQRT has shorter latency than NR code in big cores. Secondly, vector SQRT has been greatly improved in Skylake and has better throughput compared to NR. Differential Revision: https://reviews.llvm.org/D21379 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277725 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-04 12:47:28 +00:00
Diana Picus	de3bef53a2	Typo fix in comment. NFC git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277704 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-04 08:25:08 +00:00
Michael Kuperstein	8be735fdf7	[DAGCombine] Make sext(setcc) combine respect getBooleanContents We used to combine "sext(setcc x, y, cc) -> (select (setcc x, y, cc), -1, 0)" Instead, we should combine to (select (setcc x, y, cc), T, 0) where the value of T is 1 or -1, depending on the type of the setcc, and getBooleanContents() for the type if it is not i1. This fixes PR28504. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277371 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-01 19:39:49 +00:00

1 2 3 4 5 ...

1631 Commits