864 Commits

Author SHA1 Message Date
Sam Parker
3a4bfa616e [DAGCombine][ARM] Enable extending masked loads
Add generic DAG combine for extending masked loads.

Allow us to generate sext/zext masked loads which can access v4i8,
v8i8 and v4i16 memory to produce v4i32, v8i16 and v4i32 respectively.

Differential Revision: https://reviews.llvm.org/D68337

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375085 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-17 07:55:55 +00:00
Zi Xuan Wu
704914973a recommit: [LoopVectorize][PowerPC] Estimate int and float register pressure separately in loop-vectorize
In loop-vectorize, interleave count and vector factor depend on target register number. Currently, it does not
estimate different register pressure for different register class separately(especially for scalar type,
float type should not be on the same position with int type), so it's not accurate. Specifically,
it causes too many times interleaving/unrolling, result in too many register spills in loop body and hurting performance.

So we need classify the register classes in IR level, and importantly these are abstract register classes,
and are not the target register class of backend provided in td file. It's used to establish the mapping between
the types of IR values and the number of simultaneous live ranges to which we'd like to limit for some set of those types.

For example, POWER target, register num is special when VSX is enabled. When VSX is enabled, the number of int scalar register is 32(GPR),
float is 64(VSR), but for int and float vector register both are 64(VSR). So there should be 2 kinds of register class when vsx is enabled,
and 3 kinds of register class when VSX is NOT enabled.

It runs on POWER target, it makes big(+~30%) performance improvement in one specific bmk(503.bwaves_r) of spec2017 and no other obvious degressions.

Differential revision: https://reviews.llvm.org/D67148


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374634 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-12 02:53:04 +00:00
Sjoerd Meijer
0625648970 [LV] Emitting SCEV checks with OptForSize
When optimising for size and SCEV runtime checks need to be emitted to check
overflow behaviour, the loop vectorizer can run in this assert:

  LoopVectorize.cpp:2699: void llvm::InnerLoopVectorizer::emitSCEVChecks(
  llvm::Loop *, llvm::BasicBlock *): Assertion `!BB->getParent()->hasOptSize()
  && "Cannot SCEV check stride or overflow when opt

We should not generate predicates while optimising for size because
code will be generated for predicates such as these SCEV overflow runtime
checks.

This should fix PR43371.

Differential Revision: https://reviews.llvm.org/D68082

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374166 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-09 13:19:41 +00:00
Jinsong Ji
cf65f7210c Revert "[LoopVectorize][PowerPC] Estimate int and float register pressure separately in loop-vectorize"
Also Revert "[LoopVectorize] Fix non-debug builds after rL374017"

This reverts commit 9f41deccc0e648a006c9f38e11919f181b6c7e0a.
This reverts commit 18b6fe07bcf44294f200bd2b526cb737ed275c04.

The patch is breaking PowerPC internal build, checked with author, reverting
on behalf of him for now due to timezone.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374091 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-08 17:32:56 +00:00
Zi Xuan Wu
ed9288e602 [NFC] Add REQUIRES for r374017 in testcase
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374027 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-08 08:49:15 +00:00
Zi Xuan Wu
ee8d82e802 [LoopVectorize][PowerPC] Estimate int and float register pressure separately in loop-vectorize
In loop-vectorize, interleave count and vector factor depend on target register number. Currently, it does not
estimate different register pressure for different register class separately(especially for scalar type,
float type should not be on the same position with int type), so it's not accurate. Specifically,
it causes too many times interleaving/unrolling, result in too many register spills in loop body and hurting performance.

So we need classify the register classes in IR level, and importantly these are abstract register classes,
and are not the target register class of backend provided in td file. It's used to establish the mapping between
the types of IR values and the number of simultaneous live ranges to which we'd like to limit for some set of those types.

For example, POWER target, register num is special when VSX is enabled. When VSX is enabled, the number of int scalar register is 32(GPR),
float is 64(VSR), but for int and float vector register both are 64(VSR). So there should be 2 kinds of register class when vsx is enabled,
and 3 kinds of register class when VSX is NOT enabled.

It runs on POWER target, it makes big(+~30%) performance improvement in one specific bmk(503.bwaves_r) of spec2017 and no other obvious degressions.

Differential revision: https://reviews.llvm.org/D67148


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374017 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-08 03:28:33 +00:00
Sanjay Patel
93fffbb1bb [LoopVectorize] add test that asserted after cost model change (PR43582); NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373913 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-07 14:48:27 +00:00
Sjoerd Meijer
c2d414b4da [LV] Forced vectorization with runtime checks and OptForSize
When vectorisation is forced with a pragma, we optimise for min size, and we
need to emit runtime memory checks, then allow this code growth and don't run
in an assert like we currently do.

This is the result of D65197 and D66803, and was a use-case not really
considered before. If this now happens, we emit an optimisation remark warning
about the code-size expansion, which can be avoided by not forcing
vectorisation or possibly source-code modifications.

Differential Revision: https://reviews.llvm.org/D67764

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372694 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-24 08:03:34 +00:00
Sjoerd Meijer
4da58da41b [LV] Add ARM MVE tail-folding tests
Now that the vectorizer can do tail-folding (rL367592), and the ARM backend
understands MVE masked loads/stores (rL371932), it's time to add the MVE
tail-folding equivalent of the X86 tests that I added.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371996 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 14:56:26 +00:00
David Green
f343dcd487 [ARM] Masked loads and stores
Masked loads and store fit naturally with MVE, the instructions being easily
predicated. This adds lowering for the simple cases of masked loads and stores.
It does not yet deal with widening/narrowing or pre/post inc, and so is
currently behind an option.

The llvm masked load intrinsic will accept a "passthru" value, dictating the
values used for the zero masked lanes. In MVE the instructions write 0 to the
zero predicated lanes, so we need to match a passthru that isn't 0 (or undef)
with a select instruction to pull in the correct data after the load.

Differential Revision: https://reviews.llvm.org/D67186


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371932 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-15 14:14:47 +00:00
Philip Reames
3c63e9245b Remove a duplicate test
Turns out I'd already added exactly the same test under the name non_unit_stride.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371777 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-12 21:40:15 +00:00
Florian Hahn
a9994f8207 [LV] Update test case after r371768.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371769 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-12 20:07:17 +00:00
Philip Reames
e2b5847a9a Precommit tests for generalization of load dereferenceability in loop
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371747 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-12 17:09:01 +00:00
Philip Reames
f0e52d999f [LV] Support invariant addresses in speculation logic
Implement a TODO from rL371452, and handle loop invariant addresses in predicated blocks. If we can prove that the load is safe to speculate into the header, then we can avoid using a masked.load in favour of a normal load.

This is mostly about vectorization robustness. In the common case, it's generally expected that LICM/LoadStorePromotion would have eliminated such loads entirely.

Differential Revision: https://reviews.llvm.org/D67372



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371745 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-12 16:49:10 +00:00
Philip Reames
295b9f5b21 [Tests] Fix a typo in a test
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371456 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-09 21:33:59 +00:00
Philip Reames
1d5e67fb85 [Tests] Precommit test case for D67372
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371455 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-09 21:32:16 +00:00
Philip Reames
33ce8a44c1 [LoopVectorize] Leverage speculation safety to avoid masked.loads
If we're vectorizing a load in a predicated block, check to see if the load can be speculated rather than predicated.  This allows us to generate a normal vector load instead of a masked.load.

To do so, we must prove that all bytes accessed on any iteration of the original loop are dereferenceable, and that all loads (across all iterations) are properly aligned.  This is equivelent to proving that hoisting the load into the loop header in the original scalar loop is safe.

Note: There are a couple of code motion todos in the code.  My intention is to wait about a day - to be sure this sticks - and then perform the NFC motion without furthe review.

Differential Revision: https://reviews.llvm.org/D66688



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371452 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-09 20:54:13 +00:00
Craig Topper
5a75b69d70 [X86] Replace -mcpu with -mattr on some tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@371260 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-06 21:48:44 +00:00
Bjorn Pettersson
b915bf3d8a [LV] Fix miscompiles by adding non-header PHI nodes to AllowedExit
Summary:
Fold-tail currently supports reduction last-vector-value live-out's,
but has yet to support last-scalar-value live-outs, including
non-header phi's. As it relies on AllowedExit in order to detect
them and bail out we need to add the non-header PHI nodes to
AllowedExit, otherwise we end up with miscompiles.

Solves https://bugs.llvm.org/show_bug.cgi?id=43166

Reviewers: fhahn, Ayal

Reviewed By: fhahn, Ayal

Subscribers: anna, hiraditya, rkruppe, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67074

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370721 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-03 09:33:55 +00:00
Bjorn Pettersson
e101e036b2 [LV] Precommit test case showing miscompile from PR43166. NFC
Summary:  Precommit test case showing miscompile from PR43166.

Reviewers: fhahn, Ayal

Reviewed By: fhahn

Subscribers: rkruppe, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67072

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370720 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-03 09:33:40 +00:00
Ayal Zaks
cbb4ef7bbf [LV] Fold tail by masking - handle reductions
Allow vectorizing loops that have reductions when tail is folded by masking.
A select is introduced in VPlan, choosing between the last value carried by the
loop-exit/live-out instruction of the reduction, and the penultimate value
carried by the reduction phi, according to the "i < n" mask of fold-tail.
This select replaces the last value as the live-out value of the loop.

Differential Revision: https://reviews.llvm.org/D66720

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@370173 91177308-0d34-0410-b5e6-96231b3b80d8
2019-08-28 09:02:23 +00:00
Philip Reames
acc77dfffb Preland test cases for D66688 to make diffs clear.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@369959 91177308-0d34-0410-b5e6-96231b3b80d8
2019-08-26 20:37:06 +00:00
David Green
8039189876 [ARM] Don't pretend we know how to generate MVE VLDn
We don't yet know how to generate these instructions for MVE. And in the case
of VLD3, we don't even have the instruction. For the moment don't tell the
vectoriser that we have VLD4, just to end up serialising the results.

Differential Revision: https://reviews.llvm.org/D66009


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@369101 91177308-0d34-0410-b5e6-96231b3b80d8
2019-08-16 13:06:49 +00:00
Dorit Nuzman
ecfc353229 [LV] fold-tail predication should be respected even with assume_safety
assume_safety implies that loads under "if's" can be safely executed
speculatively (unguarded, unmasked). However this assumption holds only for the
original user "if's", not those introduced by the compiler, such as the
fold-tail "if" that guards us from loading beyond the original loop trip-count.
Currently the combination of fold-tail and assume-safety pragmas results in
ignoring the fold-tail predicate that guards the loads, generating unmasked
loads. This patch fixes this behavior.

Differential Revision: https://reviews.llvm.org/D66106

Reviewers: Ayal, hsaito, fhahn



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@368973 91177308-0d34-0410-b5e6-96231b3b80d8
2019-08-15 07:12:14 +00:00
Dorit Nuzman
6c9a51ee41 [LV] Fold-tail flag
This is the compiler-flag equivalent of the Predicate pragma
(https://reviews.llvm.org/D65197), to direct the vectorizer to fold the
remainder-loop into the main-loop using predication.

Differential Revision: https://reviews.llvm.org/D66108

Reviewers: Ayal, hsaito, fhahn, SjoerdMeije



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@368801 91177308-0d34-0410-b5e6-96231b3b80d8
2019-08-14 05:22:20 +00:00
David Green
6cc3211b70 [ARM] Permit auto-vectorization using MVE
With enough codegen complete, we can now correctly report the number and size
of vector registers for MVE, allowing auto vectorisation. This also allows FP
auto-vectorization for MVE without -Ofast/-ffast-math, due to support for IEEE
FP arithmetic and parity between scalar and vector FP behaviour.

Patch by David Sherwood.

Differential Revision: https://reviews.llvm.org/D63728


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@368529 91177308-0d34-0410-b5e6-96231b3b80d8
2019-08-11 08:42:57 +00:00
Craig Topper
5a370c1ad8 [LoopVectorize][X86] Clamp interleave factor if we have a known constant trip count that is less than VF*interleave
If we know the trip count, we should make sure the interleave factor won't cause the vectorized loop to exceed it.

Improves one of the cases from PR42674

Differential Revision: https://reviews.llvm.org/D65896

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@368215 91177308-0d34-0410-b5e6-96231b3b80d8
2019-08-07 21:44:14 +00:00
Craig Topper
be53a73d95 [LoopVectorize][X86] Add test case for missed vectorization from PR42674.
We do end vectorizing the code, but use an interleave factor that
is too high and causes the vector code to be dead.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@368197 91177308-0d34-0410-b5e6-96231b3b80d8
2019-08-07 19:07:10 +00:00
Hideki Saito
057a373e1c [LV][NFC] Share the LV illegality reporting with LoopVectorize.
Reviewers: hsaito, fhahn, rengolin
 
Reviewed By: rengolin
 
Patch by psamolysov, thanks!
 
Differential Revision: https://reviews.llvm.org/D62997



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@367980 91177308-0d34-0410-b5e6-96231b3b80d8
2019-08-06 06:08:48 +00:00
Jay Foad
51e75e563c [LV] Fix test failure in a Release build.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@367666 91177308-0d34-0410-b5e6-96231b3b80d8
2019-08-02 08:33:41 +00:00
Hideki Saito
9fcaf1b39a Moves the newly added test interleaved-accesses-waw-dependency.ll to X86 subdirectory.
ps4-buildslave1 reported a failure. The test has x86 triple.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@367659 91177308-0d34-0410-b5e6-96231b3b80d8
2019-08-02 07:25:09 +00:00
Hideki Saito
71b98e0841 [LV] Avoid building interleaved group in presence of WAW dependency
Reviewers: hsaito, Ayal, fhahn, anna, mkazantsev

Reviewed By: hsaito

Patch by evrevnov, thanks!

Differential Revision: https://reviews.llvm.org/D63981



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@367654 91177308-0d34-0410-b5e6-96231b3b80d8
2019-08-02 06:31:50 +00:00
Sjoerd Meijer
d527e64640 [LV] Tail-Loop Folding
This allows folding of the scalar epilogue loop (the tail) into the main
vectorised loop body when the loop is annotated with a "vector predicate"
metadata hint. To fold the tail, instructions need to be predicated (masked),
enabling/disabling lanes for the remainder iterations.

Differential Revision: https://reviews.llvm.org/D65197


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@367592 91177308-0d34-0410-b5e6-96231b3b80d8
2019-08-01 18:21:44 +00:00
Florian Hahn
bbc7752ec4 [LoopVectorize] Pass unfiltered list of arguments to getIntrinsicInstCost.
We do not compute the scalarization overhead in getVectorIntrinsicCost
and TTI::getIntrinsicInstrCost requires the full arguments list.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366049 91177308-0d34-0410-b5e6-96231b3b80d8
2019-07-15 08:48:47 +00:00
Florian Hahn
3748d783e0 [LV] Exclude loop-invariant inputs from scalar cost computation.
Loop invariant operands do not need to be scalarized, as we are using
the values outside the loop. We should ignore them when computing the
scalarization overhead.

Fixes PR41294

Reviewers: hsaito, rengolin, dcaballe, Ayal

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D59995

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366030 91177308-0d34-0410-b5e6-96231b3b80d8
2019-07-14 20:12:36 +00:00
Petr Hosek
bd47e3ac42 Revert "[IRBuilder] Fold consistently for or/and whether constant is LHS or RHS"
This reverts commit r365260 which broke the following tests:

    Clang :: CodeGenCXX/cfi-mfcall.cpp
    Clang :: CodeGenObjC/ubsan-nullability.m
    LLVM :: Transforms/LoopVectorize/AArch64/pr36032.ll

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365284 91177308-0d34-0410-b5e6-96231b3b80d8
2019-07-07 22:12:01 +00:00
Philip Reames
531d8c4961 [IRBuilder] Fold consistently for or/and whether constant is LHS or RHS
Without this, we have the unfortunate property that tests are dependent on the order of operads passed the CreateOr and CreateAnd functions.  In actual usage, we'd promptly optimize them away, but it made tests slightly more verbose than they should have been.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365260 91177308-0d34-0410-b5e6-96231b3b80d8
2019-07-06 04:28:00 +00:00
Orlando Cazalet-Hyams
b1ba8a93bc [DebugInfo@O2][LoopVectorize] pr39024: Vectorized code linenos step through loop even after completion
Summary:
Bug: https://bugs.llvm.org/show_bug.cgi?id=39024

The bug reports that a vectorized loop is stepped through 4 times and each step through the loop seemed to show a different path. I found two problems here:

A) An incorrect line number on a preheader block (for.body.preheader) instruction causes a step into the loop before it begins.
B) Instructions in the middle block have different line numbers which give the impression of another iteration.

In this patch I give all of the middle block instructions the line number of the scalar loop latch terminator branch. This seems to provide the smoothest debugging experience because the vectorized loops will always end on this line before dropping into the scalar loop. To solve problem A I have altered llvm::SplitBlockPredecessors to accommodate loop header blocks.

I have set up a separate review D61933 for a fix which is required for this patch.

Reviewers: samsonov, vsk, aprantl, probinson, anemet, hfinkel, jmorse

Reviewed By: hfinkel, jmorse

Subscribers: jmorse, javed.absar, eraman, kcc, bjope, jmellorcrummey, hfinkel, gbedwell, hiraditya, zzheng, llvm-commits

Tags: #llvm, #debug-info

Differential Revision: https://reviews.llvm.org/D60831

llvm-svn: 363046

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363786 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-19 10:50:47 +00:00
Warren Ristow
31868b92df [LV] Suppress vectorization in some nontemporal cases
When considering a loop containing nontemporal stores or loads for
vectorization, suppress the vectorization if the corresponding
vectorized store or load with the aligment of the original scaler
memory op is not supported with the nontemporal hint on the target.

This adds two new functions:
  bool isLegalNTStore(Type *DataType, unsigned Alignment) const;
  bool isLegalNTLoad(Type *DataType, unsigned Alignment) const;

to TTI, leaving the target independent default implementation as
returning true, but with overriding implementations for X86 that
check the legality based on available Subtarget features.

This fixes https://llvm.org/PR40759

Differential Revision: https://reviews.llvm.org/D61764


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363581 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-17 17:20:08 +00:00
Bjorn Pettersson
784000b4c2 [LV] Deny irregular types in interleavedAccessCanBeWidened
Summary:
Avoid that loop vectorizer creates loads/stores of vectors
with "irregular" types when interleaving. An example of
an irregular type is x86_fp80 that is 80 bits, but that
may have an allocation size that is 96 bits. So an array
of x86_fp80 is not bitcast compatible with a vector
of the same type.

Not sure if interleavedAccessCanBeWidened is the best
place for this check, but it solves the problem seen
in the added test case. And it is the same kind of check
that already exists in memoryInstructionCanBeWidened.

Reviewers: fhahn, Ayal, craig.topper

Reviewed By: fhahn

Subscribers: hiraditya, rkruppe, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D63386

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363547 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-17 12:02:24 +00:00
Fangrui Song
1002960b9d [lit] Delete empty lines at the end of lit.local.cfg NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363538 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-17 09:51:07 +00:00
Sam Parker
d021f415d1 [SCEV] Pass NoWrapFlags when expanding an AddExpr
InsertBinop now accepts NoWrapFlags, so pass them through when
expanding a simple add expression.

This is the first re-commit of the functional changes from rL362687,
which was previously reverted.

Differential Revision: https://reviews.llvm.org/D61934


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363364 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-14 09:19:41 +00:00
Sander de Smalen
cdca5bb4d5 Improve reduction intrinsics by overloading result value.
This patch uses the mechanism from D62995 to strengthen the
definitions of the reduction intrinsics by letting the scalar
result/accumulator type be overloaded from the vector element type.

For example:

  ; The LLVM LangRef specifies that the scalar result must equal the
  ; vector element type, but this is not checked/enforced by LLVM.
  declare i32 @llvm.experimental.vector.reduce.or.i32.v4i32(<4 x i32> %a)

This patch changes that into:

  declare i32 @llvm.experimental.vector.reduce.or.v4i32(<4 x i32> %a)

Which has the type-constraint more explicit and causes LLVM to check
the result type with the vector element type.

Reviewers: RKSimon, arsenm, rnk, greened, aemerson

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D62996


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363240 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-13 09:37:38 +00:00
Matt Arsenault
e04248b7f0 LoopDistribute/LAA: Add tests to catch regressions
I broke 2 of these with a patch, but were not covered by existing
tests.

https://reviews.llvm.org/D63035

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363158 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-12 13:15:59 +00:00
Orlando Cazalet-Hyams
f36c1c031e Revert "[DebugInfo@O2][LoopVectorize] pr39024: Vectorized code linenos step through loop even after completion"
This reverts commit 1a0f7a2077b70c9864faa476e15b048686cf1ca7.
See phabricator thread for D60831.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363132 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-12 08:34:51 +00:00
Orlando Cazalet-Hyams
6c60c71a4e [DebugInfo@O2][LoopVectorize] pr39024: Vectorized code linenos step through loop even after completion
Summary:
Bug: https://bugs.llvm.org/show_bug.cgi?id=39024

The bug reports that a vectorized loop is stepped through 4 times and each step through the loop seemed to show a different path. I found two problems here:

A) An incorrect line number on a preheader block (for.body.preheader) instruction causes a step into the loop before it begins.
B) Instructions in the middle block have different line numbers which give the impression of another iteration.

In this patch I give all of the middle block instructions the line number of the scalar loop latch terminator branch. This seems to provide the smoothest debugging experience because the vectorized loops will always end on this line before dropping into the scalar loop. To solve problem A I have altered llvm::SplitBlockPredecessors to accommodate loop header blocks.

I have set up a separate review D61933 for a fix which is required for this patch.

Reviewers: samsonov, vsk, aprantl, probinson, anemet, hfinkel, jmorse

Reviewed By: hfinkel, jmorse

Subscribers: jmorse, javed.absar, eraman, kcc, bjope, jmellorcrummey, hfinkel, gbedwell, hiraditya, zzheng, llvm-commits

Tags: #llvm, #debug-info

Differential Revision: https://reviews.llvm.org/D60831

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363046 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-11 10:37:20 +00:00
Benjamin Kramer
600f7b5a8d Revert "[SCEV] Use wrap flags in InsertBinop"
This reverts commit r362687. Miscompiles llvm-profdata during selfhost.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362699 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-06 12:35:46 +00:00
Sam Parker
6892c31cf9 [SCEV] Use wrap flags in InsertBinop
If the given SCEVExpr has no (un)signed flags attached to it, transfer
these to the resulting instruction or use them to find an existing
instruction.

Differential Revision: https://reviews.llvm.org/D61934


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362687 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-06 08:56:26 +00:00
Nemanja Ivanovic
c0b8c8f0c4 Initial support for IBM MASS vector library
This is the LLVM portion of patch https://reviews.llvm.org/D59881.
The clang portion is to follow.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362568 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-05 01:31:43 +00:00
Simon Pilgrim
3748537115 [CostModel][X86] Improve masked load/store AVX1/AVX2 costs
A mixture of internal tests and review of the scheduler models indicates we're overestimating the cost of a masked load, which we're estimating at 4x regular memory ops - more realistic values indicates that its closer to 2x. Masked stores costs are a lot more diverse but 8x is roughly in the middle of the range.

e.g. SandyBridge
defm : X86WriteRes<WriteFMaskedLoad, [SBPort23,SBPort05], 8, [1,2], 3>;
defm : X86WriteRes<WriteFMaskedLoadY, [SBPort23,SBPort05], 9, [1,2], 3>;
defm : X86WriteRes<WriteFMaskedStore, [SBPort4,SBPort01,SBPort23], 5, [1,1,1], 3>;
defm : X86WriteRes<WriteFMaskedStoreY, [SBPort4,SBPort01,SBPort23], 5, [1,1,1], 3>;

e.g. Btver2
defm : X86WriteRes<WriteFMaskedLoad, [JLAGU, JFPU01, JFPX], 6, [1, 2, 2], 1>;
defm : X86WriteRes<WriteFMaskedLoadY, [JLAGU, JFPU01, JFPX], 6, [2, 4, 4], 2>;
defm : X86WriteRes<WriteFMaskedStore, [JSAGU, JFPU01, JFPX], 6, [1, 1, 4], 1>;
defm : X86WriteRes<WriteFMaskedStoreY, [JSAGU, JFPU01, JFPX], 6, [2, 2, 4], 2>;

Differential Revision: https://reviews.llvm.org/D61257

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362338 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-02 20:37:02 +00:00