2609 Commits

Author SHA1 Message Date
Roman Lebedev
e270352fc3 [DAGCombine][X86][AArch64][ARM] (C - x) + y -> (y - x) + C fold
Summary:
All changes except ARM look **great**.
https://rise4fun.com/Alive/R2M

The regression `test/CodeGen/ARM/addsubcarry-promotion.ll`
is recovered fully by D62392 + D62450.

Reviewers: RKSimon, craig.topper, spatel, rogfer01, efriedma

Reviewed By: efriedma

Subscribers: dmgreen, javed.absar, kristof.beyls, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62266

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362487 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-04 11:06:08 +00:00
Simon Pilgrim
11c4864708 [SelectionDAG] Add fpto[us]i(undef) --> undef constant fold
Follow up to D62807.

Differential Revision: https://reviews.llvm.org/D62811

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362483 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-04 10:04:55 +00:00
QingShan Zhang
d59be50a0a [DAGCombine] Match a pattern where a wide type scalar value is stored by several narrow stores
This opportunity is found from spec 2017 557.xz_r. And it is used by the sha encrypt/decrypt. See sha-2/sha512.c

static void store64(u64 x, unsigned char* y)
{
    for(int i = 0; i != 8; ++i)
        y[i] = (x >> ((7-i) * 8)) & 255;
}

static u64 load64(const unsigned char* y)
{
    u64 res = 0;
    for(int i = 0; i != 8; ++i)
        res |= (u64)(y[i]) << ((7-i) * 8);
    return res;
}
The load64 has been implemented by https://reviews.llvm.org/D26149
This patch is trying to implement the store pattern.

Match a pattern where a wide type scalar value is stored by several narrow
stores. Fold it into a single store or a BSWAP and a store if the targets
supports it.

Assuming little endian target:
i8 *p = ...
i32 val = ...
p[0] = (val >> 0) & 0xFF;
p[1] = (val >> 8) & 0xFF;
p[2] = (val >> 16) & 0xFF;
p[3] = (val >> 24) & 0xFF;

>
*((i32)p) = val;

i8 *p = ...
i32 val = ...
p[0] = (val >> 24) & 0xFF;
p[1] = (val >> 16) & 0xFF;
p[2] = (val >> 8) & 0xFF;
p[3] = (val >> 0) & 0xFF;

>
*((i32)p) = BSWAP(val);

Differential Revision: https://reviews.llvm.org/D61843



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362472 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-04 08:53:53 +00:00
Michael Berg
66d9648f59 Propagate fmf for setcc/select folds
Summary: This change facilitates propagating fmf which was placed on setcc from fcmp through folds with selects so that back ends can model this path for arithmetic folds on selects in SDAG.

Reviewers: qcolombet, spatel

Reviewed By: qcolombet

Subscribers: nemanjai, jsji

Differential Revision: https://reviews.llvm.org/D62552

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362439 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-03 19:12:15 +00:00
Simon Pilgrim
2d2b2ea8ca [SelectionDAG] Add [us]itofp(undef) --> 0 constant fold (PR39205)
We were missing this fold in the DAG, which I've copied directly from llvm::ConstantFoldCastInstruction

Differential Revision: https://reviews.llvm.org/D62807

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362397 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-03 13:02:07 +00:00
Florian Hahn
cbd1abe8d8 Recommit r360171: [DAGCombiner] Avoid creating large tokenfactors in visitTokenFactor.
If we hit the limit, we do expand the outstanding tokenfactors.
Otherwise, we might drop nodes with users in the unexpanded
tokenfactors. This fixes the crashes reported by Jordan Rupprecht.

Reviewers: niravd, spatel, craig.topper, rupprecht

Reviewed By: niravd

Differential Revision: https://reviews.llvm.org/D62633

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362350 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-03 01:30:19 +00:00
Craig Topper
12cba04e86 [DAGCombiner][X86] Fold away masked store and scatter with all zeroes mask.
Similar to what was done for masked load and gather.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362342 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-02 22:52:38 +00:00
Craig Topper
010dd36cca [DAGCombiner] Replace masked loads with a zero mask with the passthru value
Similar to what was recently done for gathers in r362015.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362337 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-02 18:58:46 +00:00
Simon Pilgrim
68ea3a046b [DAGCombine] Fold insert_subvector(bitcast(x),bitcast(y),c1) -> bitcast(insert_subvector(x,y),c2)
Move this combine from x86 into generic DAGCombine, which currently only manages cases where the bitcast is between types of the same scalarsize.

Differential Revision: https://reviews.llvm.org/D59188

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362324 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-02 14:42:11 +00:00
Craig Topper
6f01df3d75 [DAGCombiner] Replace two unchecked dyn_casts with casts.
The results of the dyn_casts were immediately dereferenced on the next line
so they had better not be null.

I don't think there's any way for these dyn_casts to fail, so use a cast
of adding null check.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362315 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-02 03:31:01 +00:00
Roman Lebedev
deaea97947 [DAGCombine] Limit 'hoist add/sub binop w/ constant op' to non-opaque consts
I don't have a test case for these, but there is a test case for D62266
where, even after all the constant-folding patches, we still end up
with endless combine loop. Which makes sense, since we don't constant
fold for opaque constants.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362156 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-30 21:10:37 +00:00
Roman Lebedev
f9fcbf5fb1 [DAGCombiner][X86][AArch64] (x - C) + y -> (x + y) - C fold. Try 2
Summary:
Only vector tests are being affected here,
since subtraction by scalar constant is rewritten
as addition by negated constant.

No surprising test changes.

https://rise4fun.com/Alive/pbT

This is a recommit, originally committed in rL361852, but reverted
to investigate test-suite compile-time hangs.

Reviewers: RKSimon, craig.topper, spatel

Reviewed By: RKSimon

Subscribers: javed.absar, kristof.beyls, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62257

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362146 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-30 20:37:49 +00:00
Roman Lebedev
438582d348 [DAGCombine] (x - C) - y -> (x - y) - C fold. Try 3
Summary:
Again only vectors affected. Frustrating. Let me take a look into that..

https://rise4fun.com/Alive/AAq

This is a recommit, originally committed in rL361852, but reverted
to investigate test-suite compile-time hangs, and then reverted in
rL362109 to fix missing constant folds that were causing
endless combine loops.

Reviewers: RKSimon, craig.topper, spatel

Reviewed By: RKSimon

Subscribers: javed.absar, JDevlieghere, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62294

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362145 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-30 20:37:39 +00:00
Roman Lebedev
97a393580c [DAGCombine][X86][AArch64][AMDGPU] (x - y) + -1 -> add (xor y, -1), x fold. Try 3
Summary:
This prevents regressions in next patch,
and somewhat recovers from the regression to AMDGPU test in D62223.

It is indeed not great that we leave vector decrement,
don't transform it into vector add all-ones..

https://rise4fun.com/Alive/ZRl

This is a recommit, originally committed in rL361852, but reverted
to investigate test-suite compile-time hangs, and then reverted in
rL362109 to fix missing constant folds that were causing
endless combine loops.

Reviewers: RKSimon, craig.topper, spatel, arsenm

Reviewed By: RKSimon, arsenm

Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, javed.absar, dstuttard, tpr, t-tye, kristof.beyls, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62263

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362144 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-30 20:37:29 +00:00
Roman Lebedev
58c76e2b4e [DAGCombiner][X86][AArch64][SPARC][SystemZ] y - (x + C) -> (y - x) - C fold. Try 3
Summary:
Direct sibling of D62223 patch.
While i don't have a direct motivational pattern for this,
it would seem to make sense to handle both patterns (or none),
for symmetry?

The aarch64 changes look neutral;
sparc and systemz look like improvement (one less instruction each);
x86 changes - 32bit case improves, 64bit case shows that LEA no longer
gets constructed, which may be because that whole test is `-mattr=+slow-lea,+slow-3ops-lea`

https://rise4fun.com/Alive/ffh

This is a recommit, originally committed in rL361852, but reverted
to investigate test-suite compile-time hangs, and then reverted in
rL362109 to fix missing constant folds that were causing
endless combine loops.

Reviewers: RKSimon, craig.topper, spatel, t.p.northover

Reviewed By: t.p.northover

Subscribers: t.p.northover, jyknight, javed.absar, kristof.beyls, fedor.sergeev, jrtc27, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62252

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362143 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-30 20:37:18 +00:00
Roman Lebedev
1c91b28c61 [DAGCombiner][X86][AArch64][AMDGPU] (x + C) - y -> (x - y) + C fold. Try 3
Summary:
The main motivation is shown by all these `neg` instructions that are now created.
In particular, the `@reg32_lshr_by_negated_unfolded_sub_b` test.

AArch64 test changes all look good (`neg` created), or neutral.

X86 changes look neutral (vectors), or good (`neg` / `xor eax, eax` created).

I'm not sure about `X86/ragreedy-hoist-spill.ll`, it looks like the spill
is now hoisted into preheader (which should still be good?),
2 4-byte reloads become 1 8-byte reload, and are elsewhere,
but i'm not sure how that affects that loop.

I'm unable to interpret AMDGPU change, looks neutral-ish?

This is hopefully a step towards solving [[ https://bugs.llvm.org/show_bug.cgi?id=41952 | PR41952 ]].

https://rise4fun.com/Alive/pkdq (we are missing more patterns, i'll submit them later)

This is a recommit, originally committed in rL361852, but reverted
to investigate test-suite compile-time hangs, and then reverted in
rL362109 to fix missing constant folds that were causing
endless combine loops.

Reviewers: craig.topper, RKSimon, spatel, arsenm

Reviewed By: RKSimon

Subscribers: bjope, qcolombet, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, javed.absar, dstuttard, tpr, t-tye, kristof.beyls, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62223

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362142 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-30 20:36:54 +00:00
Roman Lebedev
c88a7aafed [DAGCombine] ((c1-A)-c2) -> ((c1-c2)-A) constant-fold
Summary: https://rise4fun.com/Alive/B0A

Reviewers: t.p.northover, RKSimon, spatel, craig.topper

Reviewed By: RKSimon

Subscribers: javed.absar, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62691

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362135 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-30 19:27:51 +00:00
Roman Lebedev
2b88706ce1 [DAGCombine] (A-C1)-C2 -> A-(C1+C2) constant-fold
Summary: https://rise4fun.com/Alive/Mb1M

Reviewers: RKSimon, craig.topper, spatel, t.p.northover

Reviewed By: t.p.northover

Subscribers: t.p.northover, javed.absar, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62689

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362134 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-30 19:27:42 +00:00
Roman Lebedev
42f63886b9 [DAGCombine] (A+C1)-C2 -> A+(C1-C2) constant-fold
Summary:
Direct sibling of D62662, the root cause of the endless combine loop in D62257

https://rise4fun.com/Alive/d3W

Reviewers: RKSimon, craig.topper, spatel, t.p.northover

Reviewed By: t.p.northover

Subscribers: t.p.northover, javed.absar, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62664

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362133 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-30 19:27:32 +00:00
Roman Lebedev
69aa603332 [DAGCombine] Use FoldConstantArithmetic() to perform C2-(A+C1) -> (C2-C1)-A fold
Summary:
No tests change, and i'm not sure how to test this, but it's better safe than sorry.

Reviewers: spatel, RKSimon, craig.topper, t.p.northover

Reviewed By: craig.topper

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62663

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362132 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-30 19:27:26 +00:00
Roman Lebedev
0a21e6fba3 [DAGCombine] ((A-c1)+c2) -> (A+(c2-c1)) constant-fold
Summary:
This was the root cause of the endless combine loop in D62257

https://rise4fun.com/Alive/d3W

Reviewers: RKSimon, spatel, craig.topper, t.p.northover

Reviewed By: t.p.northover

Subscribers: t.p.northover, javed.absar, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62662

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362131 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-30 19:27:19 +00:00
Roman Lebedev
59dfcbec08 [DAGCombine] Use FoldConstantArithmetic() to perform ((c1-A)+c2) -> (c1+c2)-A fold
Summary: No tests change, and i'm not sure how to test this, but it's better safe than sorry.

Reviewers: spatel, RKSimon, craig.topper, t.p.northover

Reviewed By: craig.topper

Subscribers: llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62661

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362130 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-30 19:27:10 +00:00
Roman Lebedev
ed880bf1d6 [DAGCombine] Revert of recommit of "binop-with-const hoisting" patches
I was looking into an endless combine loop the uncommitted follow-up patch
was causing, and it appears even these patches can exibit such an
endless loop. The root cause is that we try to hoist one binop (add/sub) with
constant operand, and if we get two such binops both of which are
eligible for this hoisting, we get stuck.

Some cases may highlight missing constant-folds.

Reverts r361871,r361872,r361873,r361874.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362109 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-30 16:07:11 +00:00
Benjamin Kramer
a9b9cf4281 [DAGCombiner] Replace gathers with a zero mask with the passthru value
These can be created by the legalizer when splitting a larger gather.

See https://llvm.org/PR42055 for a motivating example.

Differential Revision: https://reviews.llvm.org/D62613

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362015 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-29 19:24:19 +00:00
Roman Lebedev
d96f9a2ce2 [DAGCombine] (x - C) - y -> (x - y) - C fold. Try 2
Summary:
Again only vectors affected. Frustrating. Let me take a look into that..

https://rise4fun.com/Alive/AAq

This is a recommit, originally committed in rL361856, but reverted
to investigate test-suite compile-time hangs.

Reviewers: RKSimon, craig.topper, spatel

Reviewed By: RKSimon

Subscribers: javed.absar, JDevlieghere, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62294

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361874 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-28 20:40:10 +00:00
Roman Lebedev
0430f97eac [DAGCombine][X86][AArch64][AMDGPU] (x - y) + -1 -> add (xor y, -1), x fold. Try 2
Summary:
This prevents regressions in next patch,
and somewhat recovers from the regression to AMDGPU test in D62223.

It is indeed not great that we leave vector decrement,
don't transform it into vector add all-ones..

https://rise4fun.com/Alive/ZRl

This is a recommit, originally committed in rL361855, but reverted
to investigate test-suite compile-time hangs.

Reviewers: RKSimon, craig.topper, spatel, arsenm

Reviewed By: RKSimon, arsenm

Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, javed.absar, dstuttard, tpr, t-tye, kristof.beyls, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62263

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361873 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-28 20:40:03 +00:00
Roman Lebedev
1504fb4b30 [DAGCombiner][X86][AArch64][SPARC][SystemZ] y - (x + C) -> (y - x) - C fold. Try 2
Summary:
Direct sibling of D62223 patch.
While i don't have a direct motivational pattern for this,
it would seem to make sense to handle both patterns (or none),
for symmetry?

The aarch64 changes look neutral;
sparc and systemz look like improvement (one less instruction each);
x86 changes - 32bit case improves, 64bit case shows that LEA no longer
gets constructed, which may be because that whole test is `-mattr=+slow-lea,+slow-3ops-lea`

https://rise4fun.com/Alive/ffh

This is a recommit, originally committed in rL361853, but reverted
to investigate test-suite compile-time hangs.

Reviewers: RKSimon, craig.topper, spatel, t.p.northover

Reviewed By: t.p.northover

Subscribers: t.p.northover, jyknight, javed.absar, kristof.beyls, fedor.sergeev, jrtc27, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62252

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361872 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-28 20:39:55 +00:00
Roman Lebedev
a0926a5947 [DAGCombiner][X86][AArch64][AMDGPU] (x + C) - y -> (x - y) + C fold. Try 2
Summary:
The main motivation is shown by all these `neg` instructions that are now created.
In particular, the `@reg32_lshr_by_negated_unfolded_sub_b` test.

AArch64 test changes all look good (`neg` created), or neutral.

X86 changes look neutral (vectors), or good (`neg` / `xor eax, eax` created).

I'm not sure about `X86/ragreedy-hoist-spill.ll`, it looks like the spill
is now hoisted into preheader (which should still be good?),
2 4-byte reloads become 1 8-byte reload, and are elsewhere,
but i'm not sure how that affects that loop.

I'm unable to interpret AMDGPU change, looks neutral-ish?

This is hopefully a step towards solving [[ https://bugs.llvm.org/show_bug.cgi?id=41952 | PR41952 ]].

https://rise4fun.com/Alive/pkdq (we are missing more patterns, i'll submit them later)

This is a recommit, originally committed in rL361852, but reverted
to investigate test-suite compile-time hangs.

Reviewers: craig.topper, RKSimon, spatel, arsenm

Reviewed By: RKSimon

Subscribers: bjope, qcolombet, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, javed.absar, dstuttard, tpr, t-tye, kristof.beyls, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62223

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361871 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-28 20:39:39 +00:00
Roman Lebedev
f8b1164eaa Revert DAGCombine "hoist binop with const" folds
Appear to introduce test-suite compile-time hang.

http://lab.llvm.org:8011/builders/clang-cmake-x86_64-sde-avx512-linux/builds/22825

This reverts r361852,r361853,r361854,r361855,r361856

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361865 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-28 19:04:21 +00:00
Roman Lebedev
247b7e930a [DAGCombine] (x - C) - y -> (x - y) - C fold
Summary:
Again only vectors affected. Frustrating. Let me take a look into that..

https://rise4fun.com/Alive/AAq

Reviewers: RKSimon, craig.topper, spatel

Reviewed By: RKSimon

Subscribers: javed.absar, JDevlieghere, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62294

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361856 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-28 17:54:21 +00:00
Roman Lebedev
6f9cfda93b [DAGCombine][X86][AArch64][AMDGPU] (x - y) + -1 -> add (xor y, -1), x fold
Summary:
This prevents regressions in next patch,
and somewhat recovers from the regression to AMDGPU test in D62223.

It is indeed not great that we leave vector decrement,
don't transform it into vector add all-ones..

https://rise4fun.com/Alive/ZRl

Reviewers: RKSimon, craig.topper, spatel, arsenm

Reviewed By: RKSimon, arsenm

Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, javed.absar, dstuttard, tpr, t-tye, kristof.beyls, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62263

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361855 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-28 17:54:13 +00:00
Roman Lebedev
5188cc4e06 [DAGCombiner][X86][AArch64] (x - C) + y -> (x + y) - C fold
Summary:
Only vector tests are being affected here,
since subtraction by scalar constant is rewritten
as addition by negated constant.

No surprising test changes.

https://rise4fun.com/Alive/pbT

Reviewers: RKSimon, craig.topper, spatel

Reviewed By: RKSimon

Subscribers: javed.absar, kristof.beyls, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62257

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361854 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-28 17:54:04 +00:00
Roman Lebedev
7728af49e5 [DAGCombiner][X86][AArch64][SPARC][SystemZ] y - (x + C) -> (y - x) - C fold
Summary:
Direct sibling of D62223 patch.
While i don't have a direct motivational pattern for this,
it would seem to make sense to handle both patterns (or none),
for symmetry?

The aarch64 changes look neutral;
sparc and systemz look like improvement (one less instruction each);
x86 changes - 32bit case improves, 64bit case shows that LEA no longer
gets constructed, which may be because that whole test is `-mattr=+slow-lea,+slow-3ops-lea`

https://rise4fun.com/Alive/ffh

Reviewers: RKSimon, craig.topper, spatel, t.p.northover

Reviewed By: t.p.northover

Subscribers: t.p.northover, jyknight, javed.absar, kristof.beyls, fedor.sergeev, jrtc27, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62252

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361853 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-28 17:53:54 +00:00
Roman Lebedev
63a3358033 [DAGCombiner][X86][AArch64][AMDGPU] (x + C) - y -> (x - y) + C fold
Summary:
The main motivation is shown by all these `neg` instructions that are now created.
In particular, the `@reg32_lshr_by_negated_unfolded_sub_b` test.

AArch64 test changes all look good (`neg` created), or neutral.

X86 changes look neutral (vectors), or good (`neg` / `xor eax, eax` created).

I'm not sure about `X86/ragreedy-hoist-spill.ll`, it looks like the spill
is now hoisted into preheader (which should still be good?),
2 4-byte reloads become 1 8-byte reload, and are elsewhere,
but i'm not sure how that affects that loop.

I'm unable to interpret AMDGPU change, looks neutral-ish?

This is hopefully a step towards solving [[ https://bugs.llvm.org/show_bug.cgi?id=41952 | PR41952 ]].

https://rise4fun.com/Alive/pkdq (we are missing more patterns, i'll submit them later)

Reviewers: craig.topper, RKSimon, spatel, arsenm

Reviewed By: RKSimon

Subscribers: bjope, qcolombet, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, javed.absar, dstuttard, tpr, t-tye, kristof.beyls, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D62223

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361852 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-28 17:53:43 +00:00
Alexander Timofeev
d224ecc383 [AMDGPU] Divergence driven ISel. Assign register class for cross block values according to the divergence.
Details: To make instruction selection really divergence driven it is necessary to assign
             the correct register classes to the cross block values beforehand. For the divergent targets
             same value type requires different register classes dependent on the value divergence.

    Reviewers: rampitec, nhaehnle

    Differential Revision: https://reviews.llvm.org/D59990

    This commit was reverted because of the build failure.
    The reason was mlformed patch.
    Build failure fixed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361741 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-26 20:33:26 +00:00
Peter Collingbourne
d7a83f9517 Revert r361644, "[AMDGPU] Divergence driven ISel. Assign register class for cross block values according to the divergence."
Broke sanitizer bots:
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux/builds/21694/steps/bootstrap%20clang/logs/stdio
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/32478/steps/check-llvm%20asan/logs/stdio

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361688 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-25 01:52:38 +00:00
Alexander Timofeev
6a29119c95 [AMDGPU] Divergence driven ISel. Assign register class for cross block values according to the divergence.
Details: To make instruction selection really divergence driven it is necessary to assign
         the correct register classes to the cross block values beforehand. For the divergent targets
         same value type requires different register classes dependent on the value divergence.

Reviewers: rampitec, nhaehnle

Differential Revision: https://reviews.llvm.org/D59990

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361644 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-24 15:32:18 +00:00
Simon Pilgrim
128852a784 [SelectionDAG] computeKnownBits - support constant pool values from target
This patch adds the overridable TargetLowering::getTargetConstantFromLoad function which allows targets to return any constant value loaded by a LoadSDNode node - only X86 makes use of this so far but everything should be in place for other targets.

computeKnownBits then uses this function to improve codegen, notably vector code after legalization.

A future commit will do the same for ComputeNumSignBits but computeKnownBits sees the bigger benefit.

This required a couple of fixes:
* SimplifyDemandedBits must early-out for getTargetConstantFromLoad cases to prevent infinite loops of constant regeneration (similar to what we already do for BUILD_VECTOR).
* Fix a DAGCombiner::visitTRUNCATE issue as we had trunc(shl(v8i32),v8i16) <-> shl(trunc(v8i16),v8i32) infinite loops after legalization on AVX512 targets.

Differential Revision: https://reviews.llvm.org/D61887

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361620 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-24 10:03:11 +00:00
Sanjay Patel
52cdffd499 [DAGCombiner] make folds of binops safe for opcodes that produce >1 value
This is no-functional-change-intended currently because the definition
of isBinOp() only includes opcodes that produce 1 value. But if we
share that implementation with isCommutativeBinOp() as proposed in
D62191, then we need to make sure that the callers bail out for
opcodes that they are not prepared to handle correctly.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361547 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-23 20:17:25 +00:00
Sanjay Patel
c32b944575 [DAGCombiner] prevent unsafe reassociation of FP ops
There are no FP callers of DAGCombiner::reassociateOps() currently,
but we can add a fast-math check to make sure this API is not being
misused.

This was noted as a potential risk (and that risk might increase) with:
D62191

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361268 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-21 14:47:38 +00:00
Craig Topper
97086cbf72 [DAGCombiner] Refactor code in visitShiftByConstant slightly to make it more readable. NFC
This changes the isShift variable to include the constant operand
check that was previously in the if statement.

While there fix an 80 column violation and an unnecessary use of
getNode. Also fix variable name capitalization.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361168 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-20 16:26:55 +00:00
Roman Lebedev
9ca8d0cce0 [DAGCombiner] visitShiftByConstant(): drop bogus signbit check
Summary:
That check claims that the transform is illegal otherwise.
That isn't true:
1. For `ISD::ADD`, we only process `ISD::SHL` outer shift => sign bit does not matter
   https://rise4fun.com/Alive/K4A
2. For `ISD::AND`, there is no restriction on constants:
   https://rise4fun.com/Alive/Wy3
3. For `ISD::OR`, there is no restriction on constants:
   https://rise4fun.com/Alive/GOH
3. For `ISD::XOR`, there is no restriction on constants:
   https://rise4fun.com/Alive/ml6

So, why is it there then?

This changes the testcase that was touched by @spatel in rL347478,
but i'm not sure that test tests anything particular?

Reviewers: RKSimon, spatel, craig.topper, jojo, rengolin

Reviewed By: spatel

Subscribers: javed.absar, llvm-commits, spatel

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61918

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@361044 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-17 15:52:58 +00:00
Clement Courbet
def5cb78d0 [[DAGCombiner][NFC] Add a comment.
As suggested in D61846.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360755 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-15 08:21:18 +00:00
Sanjay Patel
172a32ffb7 [SDAG] fix unused variable warning and unneeded indirection; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360640 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-14 00:57:31 +00:00
Sanjay Patel
216d69b719 [SDAG, x86] allow targets to override test for binop opcodes
This follows the pattern of the existing isCommutativeBinOp().

x86 shows improvements from vector narrowing for the min/max opcodes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360639 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-14 00:39:40 +00:00
Sanjay Patel
f098d456c4 [DAGCombiner] narrow vector binop with inserts/extract
We catch most of these patterns (on x86 at least) by matching
a concat vectors opcode early in combining, but the pattern may
emerge later using insert subvector instead.

The AVX1 diffs for add/sub overflow show another missed narrowing
pattern. That one may be falling though the cracks because of
combine ordering and multiple uses.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360585 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-13 14:31:14 +00:00
Clement Courbet
013814dc77 [DAGCombiner] Fix invalid alias analysis.
Summary:
When we know for sure whether two addresses do or do not alias, we
should immediately return from DAGCombiner::isAlias().

I think this comes from a bad copy/paste, Sorry for not catching that during the
code review.

Fixes PR41855.

Reviewers: niravd, gchatelet, EricWF

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61846

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360566 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-13 09:07:37 +00:00
Sanjay Patel
874cb15d79 [DAGCombiner] try to move bitcast after extract_subvector
I noticed that we were failing to narrow an x86 ymm math op in a case similar
to the 'madd' test diff. That is because a bitcast is sitting between the math
and the extract subvector and thwarting our pattern matching for narrowing:

       t56: v8i32 = add t59, t58
      t68: v4i64 = bitcast t56
    t73: v2i64 = extract_subvector t68, Constant:i64<2>
  t96: v4i32 = bitcast t73

There are a few wins and neutral diffs in the other tests.

Differential Revision: https://reviews.llvm.org/D61806

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360541 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-12 14:43:20 +00:00
Jordan Rupprecht
fc8687d94d Revert [DAGCombiner] Avoid creating large tokenfactors in visitTokenFactor
This reverts r360171 (git commit a9d6c32eafc645c55b07eb50698c428e14c0bffd). A repro showing the asan/msan failures is forthcoming.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360481 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-10 23:20:02 +00:00
Sanjay Patel
354c14e139 [DAGCombiner] reduce code duplication; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360462 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-10 20:02:30 +00:00