1653 Commits

Author SHA1 Message Date
Simon Pilgrim
d9bc309e9c [DAGCombiner] Enable (urem x, (shl pow2, y)) -> (and x, (add (shl pow2, y), -1)) combine for splatted vectors
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285129 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-25 22:01:09 +00:00
Simon Pilgrim
b03ba30cbf [DAGCombiner] Enable srem(x.y) -> urem(x,y) combine for vectors
SelectionDAG::SignBitIsZero (via SelectionDAG::computeKnownBits) has supported vectors since rL280927

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285123 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-25 21:20:18 +00:00
Simon Pilgrim
baad275225 [DAGCombiner] Enable sdiv(x.y) -> udiv(x,y) combine for vectors
SelectionDAG::SignBitIsZero (via SelectionDAG::computeKnownBits) has supported vectors since rL280927

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285118 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-25 20:56:42 +00:00
Zvi Rackover
3ed32c90ce [DAGCombine] Preserve shuffles when one of the vector operands is constant
Summary:
Do *not* perform combines such as:

    vector_shuffle<4,1,2,3>(build_vector(Ud, C0, C1 C2), scalar_to_vector(X))
    ->
    build_vector(X, C0, C1, C2)

Keeping the shuffle allows lowering the constant build_vector to a materialized
constant vector (such as a vector-load from the constant-pool or some other idiom).

Reviewers: delena, igorb, spatel, mkuper, andreadb, RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D25524

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285063 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-25 12:14:19 +00:00
Simon Pilgrim
09116d9d18 Use SDValue::getConstantOperandVal() helper. NFCI.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284949 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-23 20:17:21 +00:00
Sanjay Patel
be29b1044e [DAG] fold negation of sign-bit
0 - X --> 0, if the sub is NUW
0 - X --> 0, if X is 0 or the minimum signed value and the sub is NSW
0 - X --> X, if X is 0 or the minimum signed value

This is the DAG equivalent of:
https://reviews.llvm.org/rL284649

plus the fold for the NUW case which already existed in InstSimplify.

Note that we miss a vector fold because of a deficiency in the DAG version of
computeKnownBits().



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284844 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-21 17:24:26 +00:00
Sanjay Patel
f8de69b006 [DAG] use SDNode flags 'nsz' to enable fadd/fsub with zero folds
As discussed in D24815, let's start the process of killing off the broken fast-math global
state housed in TargetOptions and eliminate the need for function-level fast-math attributes.

Here we enable two similar folds that are possible when we don't care about signed-zero:
fadd nsz x, 0 --> x
fsub nsz 0, x --> -x

Note that although the test cases include a 'sin' function call, I'm side-stepping the 
FMF-on-calls question (and lack of support in the DAG) for now. It's not needed for these
tests - isNegatibleForFree/GetNegatedExpression just look through a ISD::FSIN node.

Also, when we create an FNEG node and propagate the Flags of the FSUB to it, this doesn't
actually do anything today because Flags are silently dropped for any node that is not a
binary operator.

Differential Revision: https://reviews.llvm.org/D25297


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284824 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-21 14:36:58 +00:00
Sanjay Patel
928f047b68 [Target] remove TargetRecip class; 2nd try
This is a retry of r284495 which was reverted at r284513 due to use-after-scope bugs
caused by faulty usage of StringRef.

This version also renames a pair of functions:
getRecipEstimateDivEnabled()
getRecipEstimateSqrtEnabled()
as suggested by Eric Christopher.

original commit msg:

[Target] remove TargetRecip class; move reciprocal estimate isel functionality to TargetLowering

This is a follow-up to https://reviews.llvm.org/D24816 - where we changed reciprocal estimates to be function attributes
rather than TargetOptions.

This patch is intended to be a structural, but not functional change. By moving all of the
TargetRecip functionality into TargetLowering, we can remove all of the reciprocal estimate
state, shield the callers from the string format implementation, and simplify/localize the
logic needed for a target to enable this.

If a function has a "reciprocal-estimates" attribute, those settings may override the target's
default reciprocal preferences for whatever operation and data type we're trying to optimize.
If there's no attribute string or specific setting for the op/type pair, just use the target
default settings.

As noted earlier, a better solution would be to move the reciprocal estimate settings to IR
instructions and SDNodes rather than function attributes, but that's a multi-step job that
requires infrastructure improvements. I intend to work on that, but it's not clear how long
it will take to get all the pieces in place.

Differential Revision: https://reviews.llvm.org/D25440



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284746 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-20 16:55:45 +00:00
Simon Pilgrim
b4c99dd5d2 [DAGCombiner] Add general constant vector support to (srl (shl x, c), c) -> (and x, cst2)
We already supported scalar constant / splatted constant vector - now accepts any (non opaque) constant scalar / vector

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284717 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-20 11:10:21 +00:00
Simon Pilgrim
440f75c589 Merged nested ifs. NFCI.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284616 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-19 17:30:24 +00:00
Simon Pilgrim
9f2ab76dca [DAGCombiner] Add general constant vector support to (shl (add x, c1), c2) -> (add (shl x, c2), c1 << c2)
We already supported scalar constant / splatted constant vector - now accepts any (non opaque) constant scalar / vector

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284613 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-19 17:12:22 +00:00
Simon Pilgrim
057fdd87c1 [DAGCombiner] Add general constant vector support to (shl (sra x, c1), c1) -> (and x, (shl -1, c1))
We already supported scalar constant / splatted constant vector - now accepts any (non opaque) constant scalar / vector

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284608 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-19 16:15:30 +00:00
Simon Pilgrim
f02821bc98 [DAGCombiner] Add general constant vector support to (shl (mul x, c1), c2) -> (mul x, c1 << c2)
We already supported scalar constant / splatted constant vector - now accepts any (non opaque) constant scalar / vector

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284607 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-19 15:59:28 +00:00
Simon Pilgrim
9f4aec5ce3 [DAGCombiner] Just call isConstOrConstSplat directly. NFCI.
This will get the same ConstantSDNode scalar or vector splat value as the current separate dyn_cast<ConstantSDNode> / isVector() approach.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284578 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-19 11:28:15 +00:00
Simon Pilgrim
8f03548fcb [DAGCombine] Generalize distributeTruncateThroughAnd to work with any non-opaque constant or constant vector
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284574 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-19 08:57:37 +00:00
Sanjay Patel
bbcb21daf0 revert r284495: [Target] remove TargetRecip class
There's something wrong with the StringRef usage while parsing the attribute string.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284513 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-18 18:36:49 +00:00
Sanjay Patel
5800d6e9a7 [Target] remove TargetRecip class; move reciprocal estimate isel functionality to TargetLowering
This is a follow-up to D24816 - where we changed reciprocal estimates to be function attributes
rather than TargetOptions.

This patch is intended to be a structural, but not functional change. By moving all of the
TargetRecip functionality into TargetLowering, we can remove all of the reciprocal estimate
state, shield the callers from the string format implementation, and simplify/localize the
logic needed for a target to enable this.

If a function has a "reciprocal-estimates" attribute, those settings may override the target's
default reciprocal preferences for whatever operation and data type we're trying to optimize.
If there's no attribute string or specific setting for the op/type pair, just use the target
default settings.

As noted earlier, a better solution would be to move the reciprocal estimate settings to IR
instructions and SDNodes rather than function attributes, but that's a multi-step job that
requires infrastructure improvements. I intend to work on that, but it's not clear how long
it will take to get all the pieces in place.

Differential Revision: https://reviews.llvm.org/D25440


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284495 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-18 17:05:05 +00:00
Simon Pilgrim
08bb504cb9 [DAGCombiner] Add splatted vector support to (udiv x, (shl pow2, y)) -> x >>u (log2(pow2)+y)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284491 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-18 16:36:00 +00:00
Simon Pilgrim
ede854370c Strip trailing whitespace (NFCI)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284478 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-18 13:44:00 +00:00
Sanjay Patel
8a2a445fa8 [DAG] make isConstOrConstSplat and isConstOrConstSplatFP more accessible; NFC
As noted in:
https://reviews.llvm.org/D25685

This is the next-to-smallest step needed to enable the ComputeNumSignBits fix in that patch. 
In a minor attempt to keep some structure, we're pulling the FP helper over along with its
integer sibling, but clearly we can and should do more refactoring of the similar helper
functions in DAGCombiner and SelectionDAG to simplify and not duplicate functionality.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284421 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-17 20:26:46 +00:00
Sanjay Patel
6eae33cd6b [DAG] optimize away an arithmetic-right-shift of a 0 or -1 value
This came up as part of:
https://reviews.llvm.org/D25485

Note that the vector case is missed because ComputeNumSignBits() is deficient for vectors.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284395 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-17 15:58:28 +00:00
Sanjay Patel
1167ade633 [DAG] avoid creating illegal node when transforming negated shifted sign bit
Eli noted this potential bug in the post-commit thread for:
https://reviews.llvm.org/rL284239
...but I'm not sure how to trigger it, so there's no test case yet.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284268 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-14 19:46:31 +00:00
Sanjay Patel
66c0dee698 [DAG] add folds for negated shifted sign bit
The same folds exist in InstCombine already.

This came up as part of:
https://reviews.llvm.org/D25485



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284239 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-14 14:26:47 +00:00
Nicolai Haehnle
9b219839bc Fix use-after-frees
Extracted from D25313, as suggested by Justin Bogner.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284220 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-14 09:49:51 +00:00
Craig Topper
cfa4f53d33 [DAGCombiner] Teach createBuildVecShuffle to handle cases where input vectors are less than half of the output vector size.
This will be needed by a future commit to support sign/zero extending from v8i8 to v8i64 which requires a sign/zero_extend_vector_inreg to be created which requires v8i8 to be concatenated upto v64i8 and goes through this code.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284204 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-14 06:00:42 +00:00
Sanjay Patel
52b9988f47 [DAG] hoist DL(N) and fix formatting; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284170 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-13 22:27:10 +00:00
Nirav Dave
080559c6d3 Revert "In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled."
This reverts commit r284151 which appears to be triggering a LTO
failures on Hexagon

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284157 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-13 20:23:25 +00:00
Nirav Dave
19dc709f4b In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled.
Retrying after upstream changes.

   Simplify Consecutive Merge Store Candidate Search

   Now that address aliasing is much less conservative, push through
   simplified store merging search which only checks for parallel stores
   through the chain subgraph. This is cleaner as the separation of
   non-interfering loads/stores from the store-merging logic.

   Whem merging stores, search up the chain through a single load, and
   finds all possible stores by looking down from through a load and a
   TokenFactor to all stores visited. This improves the quality of the
   output SelectionDAG and generally the output CodeGen (with some
   exceptions).

   Additional Minor Changes:

       1. Finishes removing unused AliasLoad code
       2. Unifies the the chain aggregation in the merged stores across
       code paths
       3. Re-add the Store node to the worklist after calling
       SimplifyDemandedBits.
       4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is
       arbitrary, but seemed sufficient to not cause regressions in
       tests.

   This finishes the change Matt Arsenault started in r246307 and
   jyknight's original patch.

   Many tests required some changes as memory operations are now
   reorderable. Some tests relying on the order were changed to use
   volatile memory operations

   Noteworthy tests:

    CodeGen/AArch64/argument-blocks.ll -
      It's not entirely clear what the test_varargs_stackalign test is
      supposed to be asserting, but the new code looks right.

    CodeGen/AArch64/arm64-memset-inline.lli -
    CodeGen/AArch64/arm64-stur.ll -
    CodeGen/ARM/memset-inline.ll -

      The backend now generates *worse* code due to store merging
      succeeding, as we do do a 16-byte constant-zero store efficiently.

    CodeGen/AArch64/merge-store.ll -
      Improved, but there still seems to be an extraneous vector insert
      from an element to itself?

    CodeGen/PowerPC/ppc64-align-long-double.ll -
      Worse code emitted in this case, due to the improved store->load
      forwarding.

    CodeGen/X86/dag-merge-fast-accesses.ll -
    CodeGen/X86/MergeConsecutiveStores.ll -
    CodeGen/X86/stores-merging.ll -
    CodeGen/Mips/load-store-left-right.ll -
      Restored correct merging of non-aligned stores

    CodeGen/AMDGPU/promote-alloca-stored-pointer-value.ll -
      Improved. Correctly merges buffer_store_dword calls

    CodeGen/AMDGPU/si-triv-disjoint-mem-access.ll -
      Improved. Sidesteps loading a stored value and
      merges two stores

    CodeGen/X86/pr18023.ll -
      This test has been removed, as it was asserting incorrect
      behavior. Non-volatile stores *CAN* be moved past volatile loads,
      and now are.

    CodeGen/X86/vector-idiv.ll -
    CodeGen/X86/vector-lzcnt-128.ll -
      It's basically impossible to tell what these tests are actually
      testing. But, looks like the code got better due to the memory
      operations being recognized as non-aliasing.

    CodeGen/X86/win32-eh.ll -
      Both loads of the securitycookie are now merged.

    CodeGen/AMDGPU/vgpr-spill-emergency-stack-slot-compute.ll -
      This test appears to work but no longer exhibits the spill behavior.

Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle

Subscribers: wdng, nhaehnle, nemanjai, arsenm, weimingz, niravd, RKSimon, aemerson, qcolombet, dsanders, resistor, tstellarAMD, t.p.northover, spatel

Differential Revision: https://reviews.llvm.org/D14834

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284151 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-13 19:20:16 +00:00
Simon Pilgrim
97ca021c6f [DAGCombiner] Add vector support to (mul (shl X, Y), Z) -> (shl (mul X, Z), Y) style combines
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284122 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-13 14:04:35 +00:00
Simon Pilgrim
9dbdf67986 [DAGCombiner] Add vector support to C2-(A+C1) -> (C2-C1)-A folding
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284117 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-13 12:49:31 +00:00
Simon Pilgrim
00d9dddfae [DAGCombiner] Add vector support to (sub -1, x) -> (xor x, -1) canonicalization
Improves commutation potential

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284113 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-13 12:05:20 +00:00
Simon Pilgrim
58682d3599 [DAGCombiner] Update most ADD combines to support general vector combines
Add a number of helper functions to match scalar or vector equivalent constant/splat values to allow most of the combine patterns to be used by vectors.

Differential Revision: https://reviews.llvm.org/D25374

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284015 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-12 13:48:10 +00:00
Konstantin Zhuravlyov
b34a186175 [DAGCombiner] Do not remove the load of stored values when optimizations are disabled
This combiner breaks debug experience and should not be run when optimizations are disabled.

For example:
  int main() {
    int j = 0;
    j += 2;
    if (j == 2)
      return 0;
    return 5;
  }
When debugging this code compiled in /O0, it should be valid to break at line "j+=2;" and edit the value of j. It should change the return value of the function.

Differential Revision: https://reviews.llvm.org/D19268


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284014 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-12 13:44:24 +00:00
Michael Kuperstein
13a7e10301 [DAG] Fix crash in build_vector -> vector_shuffle combine
Fixes a crash in the build_vector -> vector_shuffle combine
when the first vector input is twice as wide as the output,
and the second input vector is even wider.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283953 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-11 22:44:31 +00:00
Sanjay Patel
14c8ab9f30 [DAG] add fold for masked negated sign-extended bool
This enhances the fold added with:
https://reviews.llvm.org/rL283900



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283905 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-11 17:05:52 +00:00
Sanjay Patel
4cc2f1f09a [DAG] add fold for masked negated extended bool
The non-obvious motivation for adding this fold (which already happens in InstCombine)
is that we want to canonicalize IR towards select instructions and canonicalize DAG 
nodes towards boolean math. So we need to recreate some folds in the DAG to handle that
change in direction. 

An interesting implementation difference for cases like this is that InstCombine
generally works top-down while the DAG goes bottom-up. That means we need to detect 
different patterns. In this case, the SimplifyDemandedBits fold prevents us from 
performing a zext to sext fold that would then be recognized as a negation of a sext. 


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283900 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-11 16:26:36 +00:00
Sanjay Patel
b3c22e4e95 [DAG] simplify logic; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283885 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-11 14:14:30 +00:00
Sanjay Patel
8a60c21c5e [DAG] hoist DL(N) and fix formatting; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283884 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-11 14:04:24 +00:00
Sanjay Patel
4566b7e67b [DAG] fix formatting; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283878 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-11 13:47:43 +00:00
Sanjay Patel
72ac867a92 [DAG] clean up foldSelectOfConstants(); NFCI
Rename variables, simplify logic. 
Not clear yet why we don't handle a target with ZeroOrNegativeOneBooleanContent too.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283613 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-07 21:55:42 +00:00
Sanjay Patel
7e1f5027b3 [DAG] move fold (select C, 0, 1 -> xor C, 1) to a helper function; NFC
We're missing at least 3 other similar folds based on what we have in InstCombine. 


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283596 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-07 20:47:51 +00:00
Vedant Kumar
4a0edca8c6 Delete some dead code in SelectionDAG (NFC)
Differential Revision: https://reviews.llvm.org/D24435

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283505 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-06 22:53:43 +00:00
Michael Kuperstein
1ac52953b4 [DAG] Generalize build_vector -> vector_shuffle combine for more than 2 inputs
This generalizes the build_vector -> vector_shuffle combine to support any
number of inputs. The idea is to create a binary tree of shuffles, where
the first layer performs pairwise shuffles of the input vectors placing each
input element into the correct lane, and the rest of the tree blends these
shuffles together.

This doesn't try to be smart and create any sort of "optimal" shuffles.
The assumption is that even a "poor" shuffle sequence is better than extracting
and inserting the elements one by one.

Differential Revision: https://reviews.llvm.org/D24683



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283480 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-06 18:58:24 +00:00
Sanjay Patel
097e03386e fix formatting; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283115 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-03 15:18:36 +00:00
Nirav Dave
bb15ebf5c7 Revert "In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled."
This reverts commit r282600 due to test failues with MCJIT

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282604 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-28 16:37:50 +00:00
Nirav Dave
a6d3e00dff In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled.
Simplify Consecutive Merge Store Candidate Search

  Now that address aliasing is much less conservative, push through
  simplified store merging search which only checks for parallel stores
  through the chain subgraph. This is cleaner as the separation of
  non-interfering loads/stores from the store-merging logic.

  Whem merging stores, search up the chain through a single load, and
  finds all possible stores by looking down from through a load and a
  TokenFactor to all stores visited. This improves the quality of the
  output SelectionDAG and generally the output CodeGen (with some
  exceptions).

  Additional Minor Changes:

    1. Finishes removing unused AliasLoad code
    2. Unifies the the chain aggregation in the merged stores across
       code paths
    3. Re-add the Store node to the worklist after calling
       SimplifyDemandedBits.
    4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is
       arbitrary, but seemed sufficient to not cause regressions in
       tests.

  This finishes the change Matt Arsenault started in r246307 and
  jyknight's original patch.

  Many tests required some changes as memory operations are now
  reorderable. Some tests relying on the order were changed to use
  volatile memory operations

  Noteworthy tests:

    CodeGen/AArch64/argument-blocks.ll -
      It's not entirely clear what the test_varargs_stackalign test is
      supposed to be asserting, but the new code looks right.

    CodeGen/AArch64/arm64-memset-inline.lli -
    CodeGen/AArch64/arm64-stur.ll -
    CodeGen/ARM/memset-inline.ll -
      The backend now generates *worse* code due to store merging
      succeeding, as we do do a 16-byte constant-zero store efficiently.

    CodeGen/AArch64/merge-store.ll -
      Improved, but there still seems to be an extraneous vector insert
      from an element to itself?

    CodeGen/PowerPC/ppc64-align-long-double.ll -
      Worse code emitted in this case, due to the improved store->load
      forwarding.

    CodeGen/X86/dag-merge-fast-accesses.ll -
    CodeGen/X86/MergeConsecutiveStores.ll -
    CodeGen/X86/stores-merging.ll -
    CodeGen/Mips/load-store-left-right.ll -
      Restored correct merging of non-aligned stores

    CodeGen/AMDGPU/promote-alloca-stored-pointer-value.ll -
      Improved. Correctly merges buffer_store_dword calls

    CodeGen/AMDGPU/si-triv-disjoint-mem-access.ll -
      Improved. Sidesteps loading a stored value and merges two stores

    CodeGen/X86/pr18023.ll -
      This test has been removed, as it was asserting incorrect
      behavior. Non-volatile stores *CAN* be moved past volatile loads,
      and now are.

    CodeGen/X86/vector-idiv.ll -
    CodeGen/X86/vector-lzcnt-128.ll -
      It's basically impossible to tell what these tests are actually
      testing. But, looks like the code got better due to the memory
      operations being recognized as non-aliasing.

    CodeGen/X86/win32-eh.ll -
      Both loads of the securitycookie are now merged.

    CodeGen/AMDGPU/vgpr-spill-emergency-stack-slot-compute.ll -
      This test appears to work but no longer exhibits the spill
      behavior.

Reviewers: arsenm, hfinkel, tstellarAMD, nhaehnle, jyknight

Subscribers: wdng, nhaehnle, nemanjai, arsenm, weimingz, niravd, RKSimon, aemerson, qcolombet, resistor, tstellarAMD, t.p.northover, spatel

Differential Revision: https://reviews.llvm.org/D14834

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282600 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-28 15:50:43 +00:00
Michael Kuperstein
db4e01f9a1 [DAG] Remove isVectorClearMaskLegal() check from vector_build dagcombine
This check currently doesn't seem to do anything useful on any in-tree target:
On non-x86, it always evaluates to false, so we never hit the code path that
creates the shuffle with zero.
On x86, it just forwards to isShuffleMaskLegal(), which is a reasonable thing to
query in general, but doesn't make sense if only restricted to zero blends.

Differential Revision: https://reviews.llvm.org/D24625


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282567 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-28 06:13:58 +00:00
Wei Mi
f54e133dd2 Change the order of the splitted store from high - low to low - high.
It is a trivial change which could make the testcase easier to be reused
for the store splitting in CodeGenPrepare.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281846 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-18 06:10:32 +00:00
Sanjay Patel
a7c48ccd3f getValueType().getSizeInBits() -> getValueSizeInBits() ; NFCI
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281493 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-14 16:05:51 +00:00
Sanjay Patel
04e0167eac getValueType().getScalarSizeInBits() -> getScalarValueSizeInBits() ; NFCI
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281490 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-14 15:43:44 +00:00