337 Commits

Author SHA1 Message Date
Zi Xuan Wu
704914973a recommit: [LoopVectorize][PowerPC] Estimate int and float register pressure separately in loop-vectorize
In loop-vectorize, interleave count and vector factor depend on target register number. Currently, it does not
estimate different register pressure for different register class separately(especially for scalar type,
float type should not be on the same position with int type), so it's not accurate. Specifically,
it causes too many times interleaving/unrolling, result in too many register spills in loop body and hurting performance.

So we need classify the register classes in IR level, and importantly these are abstract register classes,
and are not the target register class of backend provided in td file. It's used to establish the mapping between
the types of IR values and the number of simultaneous live ranges to which we'd like to limit for some set of those types.

For example, POWER target, register num is special when VSX is enabled. When VSX is enabled, the number of int scalar register is 32(GPR),
float is 64(VSR), but for int and float vector register both are 64(VSR). So there should be 2 kinds of register class when vsx is enabled,
and 3 kinds of register class when VSX is NOT enabled.

It runs on POWER target, it makes big(+~30%) performance improvement in one specific bmk(503.bwaves_r) of spec2017 and no other obvious degressions.

Differential revision: https://reviews.llvm.org/D67148


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374634 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-12 02:53:04 +00:00
Jinsong Ji
cf65f7210c Revert "[LoopVectorize][PowerPC] Estimate int and float register pressure separately in loop-vectorize"
Also Revert "[LoopVectorize] Fix non-debug builds after rL374017"

This reverts commit 9f41deccc0e648a006c9f38e11919f181b6c7e0a.
This reverts commit 18b6fe07bcf44294f200bd2b526cb737ed275c04.

The patch is breaking PowerPC internal build, checked with author, reverting
on behalf of him for now due to timezone.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374091 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-08 17:32:56 +00:00
Zi Xuan Wu
ee8d82e802 [LoopVectorize][PowerPC] Estimate int and float register pressure separately in loop-vectorize
In loop-vectorize, interleave count and vector factor depend on target register number. Currently, it does not
estimate different register pressure for different register class separately(especially for scalar type,
float type should not be on the same position with int type), so it's not accurate. Specifically,
it causes too many times interleaving/unrolling, result in too many register spills in loop body and hurting performance.

So we need classify the register classes in IR level, and importantly these are abstract register classes,
and are not the target register class of backend provided in td file. It's used to establish the mapping between
the types of IR values and the number of simultaneous live ranges to which we'd like to limit for some set of those types.

For example, POWER target, register num is special when VSX is enabled. When VSX is enabled, the number of int scalar register is 32(GPR),
float is 64(VSR), but for int and float vector register both are 64(VSR). So there should be 2 kinds of register class when vsx is enabled,
and 3 kinds of register class when VSX is NOT enabled.

It runs on POWER target, it makes big(+~30%) performance improvement in one specific bmk(503.bwaves_r) of spec2017 and no other obvious degressions.

Differential revision: https://reviews.llvm.org/D67148


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374017 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-08 03:28:33 +00:00
Sanjay Patel
93fffbb1bb [LoopVectorize] add test that asserted after cost model change (PR43582); NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373913 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-07 14:48:27 +00:00
Philip Reames
3c63e9245b Remove a duplicate test
Turns out I'd already added exactly the same test under the name non_unit_stride.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371777 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-12 21:40:15 +00:00
Florian Hahn
a9994f8207 [LV] Update test case after r371768.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371769 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-12 20:07:17 +00:00
Philip Reames
e2b5847a9a Precommit tests for generalization of load dereferenceability in loop
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371747 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-12 17:09:01 +00:00
Philip Reames
f0e52d999f [LV] Support invariant addresses in speculation logic
Implement a TODO from rL371452, and handle loop invariant addresses in predicated blocks. If we can prove that the load is safe to speculate into the header, then we can avoid using a masked.load in favour of a normal load.

This is mostly about vectorization robustness. In the common case, it's generally expected that LICM/LoadStorePromotion would have eliminated such loads entirely.

Differential Revision: https://reviews.llvm.org/D67372



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371745 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-12 16:49:10 +00:00
Philip Reames
295b9f5b21 [Tests] Fix a typo in a test
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371456 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-09 21:33:59 +00:00
Philip Reames
1d5e67fb85 [Tests] Precommit test case for D67372
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371455 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-09 21:32:16 +00:00
Philip Reames
33ce8a44c1 [LoopVectorize] Leverage speculation safety to avoid masked.loads
If we're vectorizing a load in a predicated block, check to see if the load can be speculated rather than predicated.  This allows us to generate a normal vector load instead of a masked.load.

To do so, we must prove that all bytes accessed on any iteration of the original loop are dereferenceable, and that all loads (across all iterations) are properly aligned.  This is equivelent to proving that hoisting the load into the loop header in the original scalar loop is safe.

Note: There are a couple of code motion todos in the code.  My intention is to wait about a day - to be sure this sticks - and then perform the NFC motion without furthe review.

Differential Revision: https://reviews.llvm.org/D66688



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371452 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-09 20:54:13 +00:00
Craig Topper
5a75b69d70 [X86] Replace -mcpu with -mattr on some tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371260 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-06 21:48:44 +00:00
Ayal Zaks
cbb4ef7bbf [LV] Fold tail by masking - handle reductions
Allow vectorizing loops that have reductions when tail is folded by masking.
A select is introduced in VPlan, choosing between the last value carried by the
loop-exit/live-out instruction of the reduction, and the penultimate value
carried by the reduction phi, according to the "i < n" mask of fold-tail.
This select replaces the last value as the live-out value of the loop.

Differential Revision: https://reviews.llvm.org/D66720

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370173 91177308-0d34-0410-b5e6-96231b3b80d8
2019-08-28 09:02:23 +00:00
Philip Reames
acc77dfffb Preland test cases for D66688 to make diffs clear.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@369959 91177308-0d34-0410-b5e6-96231b3b80d8
2019-08-26 20:37:06 +00:00
Dorit Nuzman
ecfc353229 [LV] fold-tail predication should be respected even with assume_safety
assume_safety implies that loads under "if's" can be safely executed
speculatively (unguarded, unmasked). However this assumption holds only for the
original user "if's", not those introduced by the compiler, such as the
fold-tail "if" that guards us from loading beyond the original loop trip-count.
Currently the combination of fold-tail and assume-safety pragmas results in
ignoring the fold-tail predicate that guards the loads, generating unmasked
loads. This patch fixes this behavior.

Differential Revision: https://reviews.llvm.org/D66106

Reviewers: Ayal, hsaito, fhahn



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@368973 91177308-0d34-0410-b5e6-96231b3b80d8
2019-08-15 07:12:14 +00:00
Dorit Nuzman
6c9a51ee41 [LV] Fold-tail flag
This is the compiler-flag equivalent of the Predicate pragma
(https://reviews.llvm.org/D65197), to direct the vectorizer to fold the
remainder-loop into the main-loop using predication.

Differential Revision: https://reviews.llvm.org/D66108

Reviewers: Ayal, hsaito, fhahn, SjoerdMeije



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@368801 91177308-0d34-0410-b5e6-96231b3b80d8
2019-08-14 05:22:20 +00:00
Craig Topper
5a370c1ad8 [LoopVectorize][X86] Clamp interleave factor if we have a known constant trip count that is less than VF*interleave
If we know the trip count, we should make sure the interleave factor won't cause the vectorized loop to exceed it.

Improves one of the cases from PR42674

Differential Revision: https://reviews.llvm.org/D65896

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@368215 91177308-0d34-0410-b5e6-96231b3b80d8
2019-08-07 21:44:14 +00:00
Craig Topper
be53a73d95 [LoopVectorize][X86] Add test case for missed vectorization from PR42674.
We do end vectorizing the code, but use an interleave factor that
is too high and causes the vector code to be dead.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@368197 91177308-0d34-0410-b5e6-96231b3b80d8
2019-08-07 19:07:10 +00:00
Jay Foad
51e75e563c [LV] Fix test failure in a Release build.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@367666 91177308-0d34-0410-b5e6-96231b3b80d8
2019-08-02 08:33:41 +00:00
Hideki Saito
9fcaf1b39a Moves the newly added test interleaved-accesses-waw-dependency.ll to X86 subdirectory.
ps4-buildslave1 reported a failure. The test has x86 triple.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@367659 91177308-0d34-0410-b5e6-96231b3b80d8
2019-08-02 07:25:09 +00:00
Sjoerd Meijer
d527e64640 [LV] Tail-Loop Folding
This allows folding of the scalar epilogue loop (the tail) into the main
vectorised loop body when the loop is annotated with a "vector predicate"
metadata hint. To fold the tail, instructions need to be predicated (masked),
enabling/disabling lanes for the remainder iterations.

Differential Revision: https://reviews.llvm.org/D65197


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@367592 91177308-0d34-0410-b5e6-96231b3b80d8
2019-08-01 18:21:44 +00:00
Petr Hosek
bd47e3ac42 Revert "[IRBuilder] Fold consistently for or/and whether constant is LHS or RHS"
This reverts commit r365260 which broke the following tests:

    Clang :: CodeGenCXX/cfi-mfcall.cpp
    Clang :: CodeGenObjC/ubsan-nullability.m
    LLVM :: Transforms/LoopVectorize/AArch64/pr36032.ll

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365284 91177308-0d34-0410-b5e6-96231b3b80d8
2019-07-07 22:12:01 +00:00
Philip Reames
531d8c4961 [IRBuilder] Fold consistently for or/and whether constant is LHS or RHS
Without this, we have the unfortunate property that tests are dependent on the order of operads passed the CreateOr and CreateAnd functions.  In actual usage, we'd promptly optimize them away, but it made tests slightly more verbose than they should have been.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365260 91177308-0d34-0410-b5e6-96231b3b80d8
2019-07-06 04:28:00 +00:00
Orlando Cazalet-Hyams
b1ba8a93bc [DebugInfo@O2][LoopVectorize] pr39024: Vectorized code linenos step through loop even after completion
Summary:
Bug: https://bugs.llvm.org/show_bug.cgi?id=39024

The bug reports that a vectorized loop is stepped through 4 times and each step through the loop seemed to show a different path. I found two problems here:

A) An incorrect line number on a preheader block (for.body.preheader) instruction causes a step into the loop before it begins.
B) Instructions in the middle block have different line numbers which give the impression of another iteration.

In this patch I give all of the middle block instructions the line number of the scalar loop latch terminator branch. This seems to provide the smoothest debugging experience because the vectorized loops will always end on this line before dropping into the scalar loop. To solve problem A I have altered llvm::SplitBlockPredecessors to accommodate loop header blocks.

I have set up a separate review D61933 for a fix which is required for this patch.

Reviewers: samsonov, vsk, aprantl, probinson, anemet, hfinkel, jmorse

Reviewed By: hfinkel, jmorse

Subscribers: jmorse, javed.absar, eraman, kcc, bjope, jmellorcrummey, hfinkel, gbedwell, hiraditya, zzheng, llvm-commits

Tags: #llvm, #debug-info

Differential Revision: https://reviews.llvm.org/D60831

llvm-svn: 363046

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363786 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-19 10:50:47 +00:00
Warren Ristow
31868b92df [LV] Suppress vectorization in some nontemporal cases
When considering a loop containing nontemporal stores or loads for
vectorization, suppress the vectorization if the corresponding
vectorized store or load with the aligment of the original scaler
memory op is not supported with the nontemporal hint on the target.

This adds two new functions:
  bool isLegalNTStore(Type *DataType, unsigned Alignment) const;
  bool isLegalNTLoad(Type *DataType, unsigned Alignment) const;

to TTI, leaving the target independent default implementation as
returning true, but with overriding implementations for X86 that
check the legality based on available Subtarget features.

This fixes https://llvm.org/PR40759

Differential Revision: https://reviews.llvm.org/D61764


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363581 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-17 17:20:08 +00:00
Bjorn Pettersson
784000b4c2 [LV] Deny irregular types in interleavedAccessCanBeWidened
Summary:
Avoid that loop vectorizer creates loads/stores of vectors
with "irregular" types when interleaving. An example of
an irregular type is x86_fp80 that is 80 bits, but that
may have an allocation size that is 96 bits. So an array
of x86_fp80 is not bitcast compatible with a vector
of the same type.

Not sure if interleavedAccessCanBeWidened is the best
place for this check, but it solves the problem seen
in the added test case. And it is the same kind of check
that already exists in memoryInstructionCanBeWidened.

Reviewers: fhahn, Ayal, craig.topper

Reviewed By: fhahn

Subscribers: hiraditya, rkruppe, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63386

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363547 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-17 12:02:24 +00:00
Fangrui Song
1002960b9d [lit] Delete empty lines at the end of lit.local.cfg NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363538 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-17 09:51:07 +00:00
Sam Parker
d021f415d1 [SCEV] Pass NoWrapFlags when expanding an AddExpr
InsertBinop now accepts NoWrapFlags, so pass them through when
expanding a simple add expression.

This is the first re-commit of the functional changes from rL362687,
which was previously reverted.

Differential Revision: https://reviews.llvm.org/D61934


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363364 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-14 09:19:41 +00:00
Orlando Cazalet-Hyams
f36c1c031e Revert "[DebugInfo@O2][LoopVectorize] pr39024: Vectorized code linenos step through loop even after completion"
This reverts commit 1a0f7a2077b70c9864faa476e15b048686cf1ca7.
See phabricator thread for D60831.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363132 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-12 08:34:51 +00:00
Orlando Cazalet-Hyams
6c60c71a4e [DebugInfo@O2][LoopVectorize] pr39024: Vectorized code linenos step through loop even after completion
Summary:
Bug: https://bugs.llvm.org/show_bug.cgi?id=39024

The bug reports that a vectorized loop is stepped through 4 times and each step through the loop seemed to show a different path. I found two problems here:

A) An incorrect line number on a preheader block (for.body.preheader) instruction causes a step into the loop before it begins.
B) Instructions in the middle block have different line numbers which give the impression of another iteration.

In this patch I give all of the middle block instructions the line number of the scalar loop latch terminator branch. This seems to provide the smoothest debugging experience because the vectorized loops will always end on this line before dropping into the scalar loop. To solve problem A I have altered llvm::SplitBlockPredecessors to accommodate loop header blocks.

I have set up a separate review D61933 for a fix which is required for this patch.

Reviewers: samsonov, vsk, aprantl, probinson, anemet, hfinkel, jmorse

Reviewed By: hfinkel, jmorse

Subscribers: jmorse, javed.absar, eraman, kcc, bjope, jmellorcrummey, hfinkel, gbedwell, hiraditya, zzheng, llvm-commits

Tags: #llvm, #debug-info

Differential Revision: https://reviews.llvm.org/D60831

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363046 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-11 10:37:20 +00:00
Benjamin Kramer
600f7b5a8d Revert "[SCEV] Use wrap flags in InsertBinop"
This reverts commit r362687. Miscompiles llvm-profdata during selfhost.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362699 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-06 12:35:46 +00:00
Sam Parker
6892c31cf9 [SCEV] Use wrap flags in InsertBinop
If the given SCEVExpr has no (un)signed flags attached to it, transfer
these to the resulting instruction or use them to find an existing
instruction.

Differential Revision: https://reviews.llvm.org/D61934


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362687 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-06 08:56:26 +00:00
Simon Pilgrim
3748537115 [CostModel][X86] Improve masked load/store AVX1/AVX2 costs
A mixture of internal tests and review of the scheduler models indicates we're overestimating the cost of a masked load, which we're estimating at 4x regular memory ops - more realistic values indicates that its closer to 2x. Masked stores costs are a lot more diverse but 8x is roughly in the middle of the range.

e.g. SandyBridge
defm : X86WriteRes<WriteFMaskedLoad, [SBPort23,SBPort05], 8, [1,2], 3>;
defm : X86WriteRes<WriteFMaskedLoadY, [SBPort23,SBPort05], 9, [1,2], 3>;
defm : X86WriteRes<WriteFMaskedStore, [SBPort4,SBPort01,SBPort23], 5, [1,1,1], 3>;
defm : X86WriteRes<WriteFMaskedStoreY, [SBPort4,SBPort01,SBPort23], 5, [1,1,1], 3>;

e.g. Btver2
defm : X86WriteRes<WriteFMaskedLoad, [JLAGU, JFPU01, JFPX], 6, [1, 2, 2], 1>;
defm : X86WriteRes<WriteFMaskedLoadY, [JLAGU, JFPU01, JFPX], 6, [2, 4, 4], 2>;
defm : X86WriteRes<WriteFMaskedStore, [JSAGU, JFPU01, JFPX], 6, [1, 1, 4], 1>;
defm : X86WriteRes<WriteFMaskedStoreY, [JSAGU, JFPU01, JFPX], 6, [2, 2, 4], 2>;

Differential Revision: https://reviews.llvm.org/D61257

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362338 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-02 20:37:02 +00:00
Craig Topper
8c0b3da9d4 [LoopVectorize] Add FNeg instruction support
Differential Revision: https://reviews.llvm.org/D62510

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362124 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-30 18:19:35 +00:00
Craig Topper
13480b8846 [LoopVectorize] Precommit tests for D62510. NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362060 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-30 06:48:13 +00:00
Sanjay Patel
1a7c631eb2 [LoopVectorizer] fix test file to not run the entire -O3 pipeline
This test file has a long history of edits from changes outside
of vectorization, and it would happen again with the proposal in
D61726.

End-to-end testing shouldn't be happening in a test file that is
specifically checking for vector masked load/store ops.
Larger-scale testing goes in PhaseOrdering or the test-suite.

I've hopefully preserved the intent by taking what was completely
unoptimized IR in some tests and passing that through the -O1
pipeline. That becomes the input IR, and now we just run the loop
vectorizer and verify that the vector masked ops are produced as
expected.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360340 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-09 13:43:22 +00:00
Kostya Serebryany
53dcce4331 revert r360162 as it breaks most of the buildbots
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360190 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-07 20:57:11 +00:00
Orlando Cazalet-Hyams
8c101025dc [DebugInfo@O2][LoopVectorize] pr39024: Vectorized code linenos step through loop even after completion
Summary:
Bug: https://bugs.llvm.org/show_bug.cgi?id=39024

The bug reports that a vectorized loop is stepped through 4 times and each step through the loop seemed to show a different path. I found two problems here:

A) An incorrect line number on a preheader block (for.body.preheader) instruction causes a step into the loop before it begins.
B) Instructions in the middle block have different line numbers which give the impression of another iteration.

In this patch I give all of the middle block instructions the line number of the scalar loop latch terminator branch. This seems to provide the smoothest debugging experience because the vectorized loops will always end on this line before dropping into the scalar loop. To solve problem A I have altered llvm::SplitBlockPredecessors to accommodate loop header blocks.

Reviewers: samsonov, vsk, aprantl, probinson, anemet, hfinkel

Reviewed By: hfinkel

Subscribers: bjope, jmellorcrummey, hfinkel, gbedwell, hiraditya, zzheng, llvm-commits

Tags: #llvm, #debug-info

Differential Revision: https://reviews.llvm.org/D60831

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360162 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-07 15:37:38 +00:00
Keno Fischer
35a5cc8b49 [SCEV] Add explicit representations of umin/smin
Summary:
Currently we express umin as `~umax(~x, ~y)`. However, this becomes
a problem for operands in non-integral pointer spaces, because `~x`
is not something we can compute for `x` non-integral. However, since
comparisons are generally still allowed, we are actually able to
express `umin(x, y)` directly as long as we don't try to express is
as a umax. Support this by adding an explicit umin/smin representation
to SCEV. We do this by factoring the existing getUMax/getSMax functions
into a new function that does all four. The previous two functions were
largely identical.

Reviewed By: sanjoy
Differential Revision: https://reviews.llvm.org/D50167

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@360159 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-07 15:28:47 +00:00
Alina Sbirlea
1e33097313 [PassManagerBuilder] Add option for interleaved loops, for loop vectorize.
Summary:
Match NewPassManager behavior: add option for interleaved loops in the
old pass manager, and use that instead of the flag used to disable loop unroll.
No changes in the defaults.

Reviewers: chandlerc

Subscribers: mehdi_amini, jlebar, dmgreen, hsaito, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61030

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359615 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-30 21:29:20 +00:00
Alina Sbirlea
6ac87016ff Enable LoopVectorization by default.
Summary:
When refactoring vectorization flags, vectorization was disabled by default in the new pass manager.
This patch re-enables is for both managers, and changes the assumptions opt makes, based on the new defaults.
Comments in opt.cpp should clarify the intended use of all flags to enable/disable vectorization.

Reviewers: chandlerc, jgorbe

Subscribers: jlebar, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61091

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359167 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-25 04:49:48 +00:00
Eric Christopher
598198edbc Revert "Temporarily Revert "Add basic loop fusion pass.""
The reversion apparently deleted the test/Transforms directory.

Will be re-reverting again.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358552 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-17 04:52:47 +00:00
Eric Christopher
02cc44c1b9 Temporarily Revert "Add basic loop fusion pass."
As it's causing some bot failures (and per request from kbarton).

This reverts commit r358543/ab70da07286e618016e78247e4a24fcb84077fda.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358546 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-17 02:12:23 +00:00
Florian Hahn
3259337afd [VPlan] Determine Vector Width programmatically.
With this change, the VPlan native path is triggered with the directive:

   #pragma clang loop vectorize(enable)

There is no need to specify the vectorize_width(N) clause.

Patch by Francesco Petrogalli <francesco.petrogalli@arm.com>

Differential Revision: https://reviews.llvm.org/D57598

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357156 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-28 10:37:12 +00:00
Simon Pilgrim
ce23689b40 [SLPVectorizer] reorderInputsAccordingToOpcode - remove non-Instruction canonicalization
Remove attempts to commute non-Instructions to the LHS - the codegen changes appear to rely on chance more than anything else and also have a tendency to fight existing instcombine canonicalization which moves constants to the RHS of commutable binary ops.

This is prep work towards:
(a) reusing reorderInputsAccordingToOpcode for alt-shuffles and removing the similar reorderAltShuffleOperands
(b) improving reordering to optimized cases with commutable and non-commutable instructions to still find splat/consecutive ops.

Differential Revision: https://reviews.llvm.org/D59738

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356913 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-25 15:53:55 +00:00
Craig Topper
8ff8e3360b [InstCombine] Don't transform ((C1 OP zext(X)) & C2) -> zext((C1 OP X) & C2) if either zext or OP has another use.
If they have other users we'll just end up increasing the instruction count.

We might be able to weaken this to only one of them having a single use if we can prove that the and will be removed.

Fixes PR41164.

Differential Revision: https://reviews.llvm.org/D59630

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356690 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-21 17:50:49 +00:00
Nikita Popov
cd6f62dcd4 [ValueTracking] Use computeConstantRange() for unsigned add/sub overflow
Improve computeOverflowForUnsignedAdd/Sub in ValueTracking by
intersecting the computeConstantRange() result into the ConstantRange
created from computeKnownBits(). This allows us to detect some
additional never/always overflows conditions that can't be determined
from known bits.

This revision also adds basic handling for constants to
computeConstantRange(). Non-splat vectors will be handled in a followup.

The signed case will also be handled in a followup, as it needs some
more groundwork.

Differential Revision: https://reviews.llvm.org/D59386

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356489 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-19 17:53:56 +00:00
Sanjoy Das
2d9ad10711 Reland "Relax constraints for reduction vectorization"
Change from original commit: move test (that uses an X86 triple) into the X86
subdirectory.

Original description:
Gating vectorizing reductions on *all* fastmath flags seems unnecessary;
`reassoc` should be sufficient.

Reviewers: tvvikram, mkuper, kristof.beyls, sdesmalen, Ayal

Reviewed By: sdesmalen

Subscribers: dcaballe, huntergr, jmolloy, mcrosier, jlebar, bixia, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57728

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355889 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-12 01:31:44 +00:00
Florian Hahn
b11b60da2d [InterleavedAccessAnalysis] Fix integer overflow in insertMember.
Without checking for integer overflow, invalid members can be added
 e.g. if the calculated key overflows, becomes positive and the largest key.

This fixes
      https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=7560
      https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=13128
      https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=13229

Reviewers: Ayal, anna, hsaito, efriedma

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D55538

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355613 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-07 17:50:16 +00:00
Nikita Popov
29ba81ded0 [ValueTracking] More accurate unsigned sub overflow detection
Second part of D58593.

Compute precise overflow conditions based on all known bits, rather
than just the sign bits. Unsigned a - b overflows iff a < b, and we
can determine whether this always/never happens based on the minimal
and maximal values achievable for a and b subject to the known bits
constraint.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@355109 91177308-0d34-0410-b5e6-96231b3b80d8
2019-02-28 18:04:20 +00:00