709 Commits

Author SHA1 Message Date
Gabor Buella
0aae914817 NFC - Various typo fixes in tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336268 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-04 13:28:39 +00:00
Anastasis Grammenos
fb31c6481f [DebugInfo][LoopVectorize] Preserve DL in generated phi instruction
When creating `phi` instructions to resume at the scalar part of the loop,
copy the DebugLoc from the original phi over to the new one.

Differential Revision: https://reviews.llvm.org/D48769

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336256 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-04 10:16:55 +00:00
Max Kazantsev
6ac04e25a3 [InstCombine] Delay foldICmpUsingKnownBits until simple transforms are done
This patch changes the order of transforms in InstCombineCompares to avoid
performing range-based transforms, which produce complex bit arithmetic,
before simpler things (like folding with constants) are done. See PR37636
for the motivating example.

Differential Revision: https://reviews.llvm.org/D48584
Reviewed By: spatel, lebedev.ri


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336172 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-03 06:23:57 +00:00
Adhemerval Zanella
751c17bfa0 [AArch64] Add custom lowering for v4i8 trunc store
This patch adds a custom trunc store lowering for v4i8 vector types.
Since there is no v.4b register, the v4i8 is promoted to v4i16 (v.4h),
and the default action for v4i8 is to extract each element and issue 4
byte stores.

A better strategy would be to extend the promoted v4i16 to v8i16
(with undef elements) and extract and store the word lane which
represents the v4i8 subvector. The construction:

  define void @foo(<4 x i16> %x, i8* nocapture %p) {
    %0 = trunc <4 x i16> %x to <4 x i8>
    %1 = bitcast i8* %p to <4 x i8>*
    store <4 x i8> %0, <4 x i8>* %1, align 4, !tbaa !2
    ret void
  }

Can be optimized from:

  umov    w8, v0.h[3]
  umov    w9, v0.h[2]
  umov    w10, v0.h[1]
  umov    w11, v0.h[0]
  strb    w8, [x0, #3]
  strb    w9, [x0, #2]
  strb    w10, [x0, #1]
  strb    w11, [x0]
  ret

To:

  xtn     v0.8b, v0.8h
  str     s0, [x0]
  ret

The patch also adjusts the memory cost for autovectorization, so the C
code:

  void foo (const int *src, int width, unsigned char *dst)
  {
    for (int i = 0; i < width; i++)
       *dst++ = *src++;
  }

can be vectorized to:

  .LBB0_4:                                // %vector.body
                                          // =>This Inner Loop Header: Depth=1
        ldr     q0, [x0], #16
        subs    x12, x12, #4            // =4
        xtn     v0.4h, v0.4s
        xtn     v0.8b, v0.8h
        st1     { v0.s }[0], [x2], #4
        b.ne    .LBB0_4

Instead of byte operations.
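
The equivalence between the four byte stores and the single word store can
be checked with a small model. This is a Python sketch rather than LLVM
code: the helper names are hypothetical, and little-endian lane order is
assumed.

```python
import struct

def store_bytewise(lanes):
    """Model the old lowering: truncate each i16 lane and store it as one byte."""
    mem = bytearray(4)
    for i, lane in enumerate(lanes):
        mem[i] = lane & 0xFF              # one strb per lane
    return bytes(mem)

def store_wordwise(lanes):
    """Model the new lowering: narrow all lanes, then store one 32-bit word."""
    narrowed = [lane & 0xFF for lane in lanes]                 # xtn v0.8b, v0.8h
    word = sum(b << (8 * i) for i, b in enumerate(narrowed))   # pack lanes 0..3
    return struct.pack("<I", word)                             # str s0, [x0]

lanes = [0x1234, 0x00FF, 0xABCD, 0x0001]
print(store_bytewise(lanes) == store_wordwise(lanes))
```

Both functions leave the same four bytes in memory, which is why the single
`str s0` can replace the `umov`/`strb` sequence.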


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335735 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-27 13:58:46 +00:00
Florian Hahn
26ac94ddde Revert r335513: [SCEVExp] Advance found insertion point
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335522 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-25 20:55:26 +00:00
Florian Hahn
a972a2e1ff Force vector width for scev-expander-debug.ll test
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335520 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-25 20:40:50 +00:00
Florian Hahn
c1803abd1b [SCEVExp] Advance found insertion point until we find a non-dbg instruction.
This avoids creating unnecessary casts if the IP used to be a dbg info
intrinsic. Fixes PR37727.

Reviewers: vsk, aprantl, sanjoy, efriedma

Reviewed By: vsk, efriedma

Differential Revision: https://reviews.llvm.org/D47874



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335513 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-25 19:17:29 +00:00
Sanjay Patel
baa9ccee64 [LoopVectorize] regenerate full checks; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335257 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-21 16:54:32 +00:00
Diego Caballero
3a32fa4e62 Move redundant-vf2-cost.ll test to X86 directory
redundant-vf2-cost.ll is X86 specific. Moved from
test/Transforms/LoopVectorize/redundant-vf2-cost.ll to
test/Transforms/LoopVectorize/X86/redundant-vf2-cost.ll



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@334854 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-15 18:46:03 +00:00
Diego Caballero
ff160b6333 [LV] Prevent LV to run cost model twice for VF=2
This is a minor fix for LV cost model, where the cost for VF=2 was
computed twice when the vectorization of the loop was forced without
specifying a VF.

Reviewers: xusx595, hsaito, fhahn, mkuper

Reviewed By: hsaito, xusx595

Differential Revision: https://reviews.llvm.org/D48048


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@334840 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-15 16:21:35 +00:00
Roman Shirokiy
42f7ad099a [LV] Fix PR36983. For a given recurrence, fix all phis in exit block
There could be more than one PHI in the exit block using the same loop recurrence.
Don't assume there is only one, and fix each user.

Differential Revision: https://reviews.llvm.org/D47788


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@334271 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-08 08:21:20 +00:00
Sanjay Patel
bd9be7a1ed [TargetLibraryInfo] add mappings from LLVM sin/cos intrinsics to SVML calls
These weren't included in D19544 - probably just an oversight.
D40044 made it more likely that we'll have LLVM math intrinsics rather 
than libcalls, so this bug was more easily exposed.
As the tests/code show, we already have the complete mappings for pow/exp/log.

I don't have any experience with SVML, so I don't know if anything else is 
missing. It's also not clear to me that we should be doing this transform in 
IR rather than DAG/isel, but that's a separate issue.

Differential Revision: https://reviews.llvm.org/D47610


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@334211 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-07 18:21:24 +00:00
Karl-Johan Karlsson
ad388af414 [ConstantFold] Disallow folding vector geps into bitcasts
Summary:
Getelementptr returns a vector of pointers, instead of a single address,
when one or more of its arguments is a vector. In such a case it is not
possible to simplify the expression by inserting a bitcast of operand(0)
into the destination type, as it will create a bitcast between different
sizes.

Reviewers: majnemer, mkuper, mssimpso, spatel

Reviewed By: spatel

Subscribers: lebedev.ri, llvm-commits

Differential Revision: https://reviews.llvm.org/D46379

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@333783 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-01 19:34:35 +00:00
Sanjay Patel
6da40ce2e2 [LoopVectorize, x86] add tests to show missing SVML transforms; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@333707 91177308-0d34-0410-b5e6-96231b3b80d8
2018-05-31 22:31:02 +00:00
Sanjay Patel
5a3ab98eb8 [LoopVectorize, x86] regenerate checks; NFC
I removed the 'fast' flag from the calls because that's not required.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@333695 91177308-0d34-0410-b5e6-96231b3b80d8
2018-05-31 21:30:36 +00:00
Diego Caballero
6152bb94fd [VPlan] Reland r332654 and silence unused func warning
r332654 was reverted due to an unused function warning in
release build. This commit includes the same code with the
warning silenced.

Differential Revision: https://reviews.llvm.org/D44338



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@332860 91177308-0d34-0410-b5e6-96231b3b80d8
2018-05-21 18:14:23 +00:00
Amara Emerson
a7e652ad1a Delete a test that was missed in the revert r332747.
r332747 originally reverted r332654 which added this test.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@332755 91177308-0d34-0410-b5e6-96231b3b80d8
2018-05-18 19:21:40 +00:00
Alexander Ivchenko
1fcee954a4 [X86][CET] Changing -fcf-protection behavior to comply with gcc (LLVM part)
This patch aims to match the changes introduced in gcc by
https://gcc.gnu.org/ml/gcc-cvs/2018-04/msg00534.html. The
IBT feature definition is removed, with the IBT instructions
being freely available on all X86 targets. The shadow stack
instructions are also being made freely available, and the
use of all these CET instructions is controlled by the module
flags derived from the -fcf-protection clang option. The hasSHSTK
option remains since clang uses it to determine availability of
shadow stack instruction intrinsics, but it is no longer directly used.

Comes with a clang patch (D46881).

Patch by mike.dvoretsky

Differential Revision: https://reviews.llvm.org/D46882



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@332705 91177308-0d34-0410-b5e6-96231b3b80d8
2018-05-18 11:58:25 +00:00
Diego Caballero
1f4de6bc0a [LV][VPlan] Build plain CFG with simple VPInstructions for outer loops.
Patch #3 from VPlan Outer Loop Vectorization Patch Series #1
(RFC: http://lists.llvm.org/pipermail/llvm-dev/2017-December/119523.html).

Expected to be NFC for the current inner loop vectorization path. It
introduces the basic algorithm to build the VPlan plain CFG (single-level
CFG, no hierarchical CFG (H-CFG), yet) in the VPlan-native vectorization
path using VPInstructions. It includes:
  - VPlanHCFGBuilder: Main class to build the VPlan H-CFG (plain CFG without nested regions, for now).
  - VPlanVerifier: Main class with utilities to check the consistency of a H-CFG.
  - VPlanBlockUtils: Main class with utilities to manipulate VPBlockBases in VPlan.

Reviewers: rengolin, fhahn, mkuper, mssimpso, a.elovikov, hfinkel, aprantl.

Differential Revision: https://reviews.llvm.org/D44338



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@332654 91177308-0d34-0410-b5e6-96231b3b80d8
2018-05-17 19:24:47 +00:00
Karl-Johan Karlsson
ac455c2fd0 [LV] Add lit testcase for bitcast problem. NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@331878 91177308-0d34-0410-b5e6-96231b3b80d8
2018-05-09 13:34:57 +00:00
Shiva Chen
a8a13bc662 [DebugInfo] Add DILabel metadata and intrinsic llvm.dbg.label.
In order to set breakpoints on labels and list source code around
labels, we need to collect debug information for labels, i.e., the label
name, the function the label belongs to, the line number in the file, and
the address where the label is located. To keep this information in LLVM
IR and to allow the backend to generate debug information correctly,
we create a new kind of metadata for labels, DILabel. The format
of DILabel is

!DILabel(scope: !1, name: "foo", file: !2, line: 3)

We hope to keep as much debug information as possible even when the
code is optimized. So we create a new kind of intrinsic for label
metadata to prevent the metadata from being eliminated with its basic block.
The intrinsic survives as long as we keep it from being optimized out.
The format of the intrinsic is

llvm.dbg.label(metadata !1)

It has only one argument, the DILabel metadata. The
intrinsic immediately follows the label. The backend can get the
label metadata through the intrinsic's parameter.

We also create a DIBuilder API for labels to be used by frontends.
A frontend can use createLabel() to allocate DILabel objects, and use
insertLabel() to insert the llvm.dbg.label intrinsic in LLVM IR.

Differential Revision: https://reviews.llvm.org/D45024

Patch by Hsiangkai Wang.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@331841 91177308-0d34-0410-b5e6-96231b3b80d8
2018-05-09 02:40:45 +00:00
Hideki Saito
3ce25b339f [LV] Fix for PR37248, Broadcast codegen incorrectly assumed vector loop body is single basic block
Summary:
Broadcast code generation emitted instructions in the pre-header, while the instructions they depend on are in the vector loop body.
This resulted in an IL verification error: value used before defined.


Reviewers: rengolin, fhahn, hfinkel

Reviewed By: rengolin, fhahn

Subscribers: dcaballe, Ka-Ka, llvm-commits

Differential Revision: https://reviews.llvm.org/D46302

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@331799 91177308-0d34-0410-b5e6-96231b3b80d8
2018-05-08 18:57:34 +00:00
Daniel Neilson
31d3d2138a [LV] Move test/Transforms/LoopVectorize/pr23997.ll
Summary:
This fixes a build break with r331269.

test/Transforms/LoopVectorize/pr23997.ll

should be in:

test/Transforms/LoopVectorize/X86/pr23997.ll

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@331281 91177308-0d34-0410-b5e6-96231b3b80d8
2018-05-01 16:40:45 +00:00
Daniel Neilson
974321bd67 [LV] Preserve inbounds on created GEPs
Summary:
This is a fix for PR23997.

The loop vectorizer is not preserving the inbounds property of GEPs that it creates.
This is inhibiting some optimizations. This patch preserves the inbounds property in
the case where a load/store is being fed by an inbounds GEP.

Reviewers: mkuper, javed.absar, hsaito

Reviewed By: hsaito

Subscribers: dcaballe, hsaito, llvm-commits

Differential Revision: https://reviews.llvm.org/D46191

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@331269 91177308-0d34-0410-b5e6-96231b3b80d8
2018-05-01 15:35:08 +00:00
Diego Caballero
e8b06de378 [LV][VPlan] Detect outer loops for explicit vectorization.
Patch #2 from VPlan Outer Loop Vectorization Patch Series #1
(RFC: http://lists.llvm.org/pipermail/llvm-dev/2017-December/119523.html).

This patch introduces the basic infrastructure to detect, legality check
and process outer loops annotated with hints for explicit vectorization.
All these changes are protected under the feature flag
-enable-vplan-native-path. This should make this patch NFC for the existing
inner loop vectorizer.

Reviewers: hfinkel, mkuper, rengolin, fhahn, aemerson, mssimpso.

Differential Revision: https://reviews.llvm.org/D42447


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@330739 91177308-0d34-0410-b5e6-96231b3b80d8
2018-04-24 17:04:17 +00:00
Krzysztof Parzyszek
c6388fd9c7 [LV] Introduce TTI::getMinimumVF
The function getMinimumVF(ElemWidth) will return the minimum VF for
a vector with elements of size ElemWidth bits. This value will only
apply to targets for which TTI::shouldMaximizeVectorBandwidth returns
true. The value of 0 indicates that there is no minimum VF.
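
One plausible way a caller could apply this value is sketched below in
Python. The function name `clamp_vf` and the choice to raise the candidate
VF up to the minimum are assumptions for illustration; the commit message
only specifies the return value and when it applies.

```python
def clamp_vf(max_vf, min_vf, maximize_bandwidth):
    """Apply a target minimum VF to a candidate maximum VF.

    min_vf == 0 means the target reports no minimum. The minimum is only
    honoured when the target asks to maximize vector bandwidth.
    """
    if not maximize_bandwidth or min_vf == 0:
        return max_vf
    return max(max_vf, min_vf)
```

With `maximize_bandwidth` false, or with a minimum of 0, the candidate VF
is returned unchanged.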

Differential Revision: https://reviews.llvm.org/D45271


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@330062 91177308-0d34-0410-b5e6-96231b3b80d8
2018-04-13 20:16:32 +00:00
Sebastian Pop
0c137c5660 [InstCombine] reassociate loop invariant GEP chains to enable LICM
This change brings performance of zlib up by 10%. The example below is from a
hot loop in longest_match() from zlib.

do.body:
  %cur_match.addr.0 = phi i32 [ %cur_match, %entry ], [ %2, %do.cond ]
  %idx.ext = zext i32 %cur_match.addr.0 to i64
  %add.ptr = getelementptr inbounds i8, i8* %win, i64 %idx.ext
  %add.ptr2 = getelementptr inbounds i8, i8* %add.ptr, i64 %idx.ext1
  %add.ptr3 = getelementptr inbounds i8, i8* %add.ptr2, i64 -1

In this example %idx.ext1 is loop invariant. It will be moved above the use of
the loop induction variable %idx.ext so that it can be hoisted out of the loop by
LICM. The operands that have dependences carried by the loop will be sunk down
in the GEP chain. This patch will produce the following output:

do.body:
  %cur_match.addr.0 = phi i32 [ %cur_match, %entry ], [ %2, %do.cond ]
  %idx.ext = zext i32 %cur_match.addr.0 to i64
  %add.ptr = getelementptr inbounds i8, i8* %win, i64 %idx.ext1
  %add.ptr2 = getelementptr inbounds i8, i8* %add.ptr, i64 -1
  %add.ptr3 = getelementptr inbounds i8, i8* %add.ptr2, i64 %idx.ext
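
The reassociation does not change the computed address, only which part of
it depends on the loop. The Python sketch below models pointers as plain
integers; the function names are hypothetical, and overflow and `inbounds`
semantics are ignored.

```python
def addresses_original(win, idx_ext1, cur_matches):
    """Original chain: the loop-variant offset is applied first."""
    out = []
    for cur in cur_matches:
        add_ptr = win + cur               # depends on the loop
        add_ptr2 = add_ptr + idx_ext1     # invariant offset, recomputed each time
        out.append(add_ptr2 - 1)
    return out

def addresses_reassociated(win, idx_ext1, cur_matches):
    """Reassociated chain: invariant offsets first, so they can be hoisted."""
    hoisted = win + idx_ext1 - 1          # computed once, outside the loop
    return [hoisted + cur for cur in cur_matches]

print(addresses_original(1000, 7, [3, 5, 9]) == addresses_reassociated(1000, 7, [3, 5, 9]))
```

In the reassociated form only one addition remains inside the loop body.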

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328539 91177308-0d34-0410-b5e6-96231b3b80d8
2018-03-26 16:19:31 +00:00
Evgeny Stupachenko
ca64890b85 Revert r325687 (workaround for PR36032).
Summary:
Revert the r325687 workaround for PR36032 since
a fix was committed in r326154.

Reviewers: sbaranga

Differential Revision: http://reviews.llvm.org/D44768

From: Evgeny Stupachenko <evstupac@gmail.com>
                         <evgeny.v.stupachenko@intel.com>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328257 91177308-0d34-0410-b5e6-96231b3b80d8
2018-03-22 22:04:39 +00:00
Andrei Elovikov
afd766bf9c [LV] Let recordVectorLoopValueForInductionCast to check if IV was created from the cast.
Summary:
It turned out to be error-prone to expect the callers to handle that. It is
better to leave the decision to this routine and have the required data passed
to the function explicitly.

This handles the case that was missed in r322473 and fixes the assert
mentioned in PR36524.

Reviewers: dorit, mssimpso, Ayal, dcaballe

Reviewed By: dcaballe

Subscribers: Ka-Ka, hiraditya, dneilson, hsaito, llvm-commits

Differential Revision: https://reviews.llvm.org/D43812

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@327960 91177308-0d34-0410-b5e6-96231b3b80d8
2018-03-20 09:04:39 +00:00
Renato Golin
5da3464401 [LV] Adding test for r327109
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@327155 91177308-0d34-0410-b5e6-96231b3b80d8
2018-03-09 18:02:36 +00:00
Max Kazantsev
1eba8752d7 [SCEV] Smart range calculation for SCEVUnknown Phis
The range of SCEVUnknown Phi which merges values `X1, X2, ..., XN`
can be evaluated as `U(Range(X1), Range(X2), ..., Range(XN))`.
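
A minimal Python sketch of this rule follows. It assumes non-wrapping
half-open ranges `(lo, hi)` and takes the union as the smallest single
range covering both inputs; the helper names are hypothetical and this is
not the ConstantRange API.

```python
from functools import reduce

def union_range(a, b):
    """Smallest half-open range [lo, hi) covering both a and b."""
    return (min(a[0], b[0]), max(a[1], b[1]))

def phi_range(incoming_ranges):
    """Range of a phi: the union of the ranges of its incoming values."""
    return reduce(union_range, incoming_ranges)

print(phi_range([(0, 4), (10, 12), (2, 6)]))
```

Because the result is a single range, it may also cover values that no
incoming value can take, such as the gap between 6 and 10 in the example.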

Reviewed By: sanjoy
Differential Revision: https://reviews.llvm.org/D43810


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326418 91177308-0d34-0410-b5e6-96231b3b80d8
2018-03-01 06:56:48 +00:00
Sanjay Patel
69cad52177 [ARM] add loop vectorizer test based on 482.sphinx3 from SPEC2006; NFC
This is a slight reduction of one of the benchmarks
that suffered with D43079. Cost model changes should
not cause this test to remain scalarized.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326221 91177308-0d34-0410-b5e6-96231b3b80d8
2018-02-27 18:33:24 +00:00
Adam Nemet
c5865b7982 Make test agnostic to cost model
This was causing bot failures on greendragon

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326169 91177308-0d34-0410-b5e6-96231b3b80d8
2018-02-27 05:41:16 +00:00
Evgeny Stupachenko
f0454dcc5c Fix r326154 buildbots test fail
Summary:

Add specific mtriples to tests added in r326154.

From: Evgeny Stupachenko <evstupac@gmail.com>
                         <evgeny.v.stupachenko@intel.com>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326158 91177308-0d34-0410-b5e6-96231b3b80d8
2018-02-27 01:33:11 +00:00
Evgeny Stupachenko
e47bd78db1 Fix PR36032, PR35432
Summary:

This change fixes an assertion failure in ScalarEvolutionExpander.cpp:
  assert(ExitCount != SE.getCouldNotCompute() && "Invalid loop count");

Reviewers: sbaranga

Differential Revision: http://reviews.llvm.org/D42604

From: Evgeny Stupachenko <evstupac@gmail.com>
                         <evgeny.v.stupachenko@intel.com>


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326154 91177308-0d34-0410-b5e6-96231b3b80d8
2018-02-27 00:17:31 +00:00
Renato Golin
3f10059feb [LV] Move isLegalMasked* functions from Legality to CostModel
All SIMD architectures can emulate masked load/store/gather/scatter
through element-wise condition check, scalar load/store, and
insert/extract. Therefore, bailing out of vectorization as legality
failure, when they return false, is incorrect. We should proceed to cost
model and determine profitability.
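
The emulation described above can be sketched for a masked load as follows.
This is an illustrative Python model with hypothetical names, where memory
is a list and `passthru` supplies the values of disabled lanes.

```python
def emulated_masked_load(memory, base, mask, passthru):
    """Emulate a masked vector load with per-lane scalar operations."""
    result = list(passthru)
    for lane, enabled in enumerate(mask):
        if enabled:                                # element-wise condition check
            result[lane] = memory[base + lane]     # scalar load, then insert
    return result

print(emulated_masked_load([10, 20, 30, 40], 0, [True, False, True, False], [0, 0, 0, 0]))
```

Disabled lanes never touch memory, so the emulation stays valid even when
those addresses would be out of bounds. Whether it is profitable is the
cost model's decision.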

This patch is to address the vectorizer's architectural limitation
described above. As such, I tried to keep the cost model and
vectorize/don't-vectorize behavior nearly unchanged. Cost model tuning
should be done separately.

Please see
http://lists.llvm.org/pipermail/llvm-dev/2018-January/120164.html for
RFC and the discussions.

Closes D43208.

Patch by: Hideki Saito <hideki.saito@intel.com>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326079 91177308-0d34-0410-b5e6-96231b3b80d8
2018-02-26 11:06:36 +00:00
Alexey Bataev
bde3a91811 [LV] Fix test checks, NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325699 91177308-0d34-0410-b5e6-96231b3b80d8
2018-02-21 16:48:23 +00:00
Silviu Baranga
1f383654e0 [SCEV] Temporarily disable loop versioning for the purpose
of turning SCEVUnknowns of PHIs into AddRecExprs.

This feature is now hidden behind the -scev-version-unknown flag.

Fixes PR36032 and PR35432.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325687 91177308-0d34-0410-b5e6-96231b3b80d8
2018-02-21 15:20:32 +00:00
Sanjay Patel
1c629279f1 revert r325515: [TTI CostModel] change default cost of FP ops to 1 (PR36280)
There are too many perf regressions resulting from this, so we need to 
investigate (and add tests for) targets like ARM and AArch64 before 
trying to reinstate.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325658 91177308-0d34-0410-b5e6-96231b3b80d8
2018-02-21 01:42:52 +00:00
Alexey Bataev
1918edb644 [LV] Fix test checks, NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325617 91177308-0d34-0410-b5e6-96231b3b80d8
2018-02-20 19:49:25 +00:00
Sanjay Patel
1fceaab028 [TTI CostModel] change default cost of FP ops to 1 (PR36280)
This change was mentioned at least as far back as:
https://bugs.llvm.org/show_bug.cgi?id=26837#c26
...and I found a real program that is harmed by this: 
Himeno running on AMD Jaguar gets 6% slower with SLP vectorization:
https://bugs.llvm.org/show_bug.cgi?id=36280
...but the change here appears to solve that bug only accidentally.

The div/rem costs for x86 look very wrong in some cases, but that's already true, 
so we can fix those in follow-up patches. There's also evidence that more cost model
changes are needed to solve SLP problems as shown in D42981, but that's an independent 
problem (though the solution may be adjusted after this change is made).

Differential Revision: https://reviews.llvm.org/D43079


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325515 91177308-0d34-0410-b5e6-96231b3b80d8
2018-02-19 16:11:44 +00:00
Mircea Trofin
757ef7ca81 [LV] Fix analyzeInterleaving when -pass-remarks enabled
Summary:
If -pass-remarks=loop-vectorize, atomic ops will be seen by
analyzeInterleaving(), even though canVectorizeMemory() == false. This
is because we are requesting extra analysis instead of bailing out.

In such a case, we end up with a Group in both Load- and StoreGroups,
and then we'll try to access freed memory when traversing LoadGroups after having released the Group when iterating over StoreGroups.

The fix is to include mayWriteToMemory() when validating that two
instructions are the same kind of memory operation.

Reviewers: mssimpso, davidxl

Reviewed By: davidxl

Subscribers: hsaito, fhahn, llvm-commits

Differential Revision: https://reviews.llvm.org/D43064

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324786 91177308-0d34-0410-b5e6-96231b3b80d8
2018-02-10 00:07:45 +00:00
Sanjay Patel
a69986c1fa [LoopVectorize] auto-generate complete checks; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324611 91177308-0d34-0410-b5e6-96231b3b80d8
2018-02-08 15:13:47 +00:00
Mircea Trofin
19265c931f Verify profile data confirms large loop trip counts.
Summary:
Loops with inequality comparers, such as:

   // unsigned bound
   for (unsigned i = 1; i < bound; ++i) {...}

have getSmallConstantMaxTripCount report a large maximum static
trip count - in this case, 0xfffffffe. However, profiling info
may show that the trip count is much smaller, and thus
counter-recommend vectorization.

This change:
- flips loop-vectorize-with-block-frequency on by default.
- validates profiled loop frequency data supports vectorization,
  when static info appears to not counter-recommend it. Absence
  of profile data means we rely on static data, just as we've
  done so far.
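
The decision described above can be sketched in Python. The threshold
value and the function names are hypothetical; only the ordering of the
static and profile checks follows the description.

```python
TINY_TRIP_COUNT = 16  # hypothetical threshold, not taken from the patch

def max_static_trip_count(bits=32):
    """Largest trip count of `for (unsigned i = 1; i < bound; ++i)`."""
    max_bound = (1 << bits) - 1
    return max_bound - 1                  # i takes the values 1 .. max_bound - 1

def worth_vectorizing(static_max_trips, profiled_trips=None,
                      threshold=TINY_TRIP_COUNT):
    """Static info can veto first; profile data, if present, must then agree."""
    if static_max_trips < threshold:
        return False                      # static info counter-recommends
    if profiled_trips is None:
        return True                       # no profile: rely on static data
    return profiled_trips >= threshold

print(hex(max_static_trip_count()))
```

A loop whose static maximum is 0xfffffffe but whose profile shows a
handful of iterations is rejected by the second check.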

Reviewers: twoh, mkuper, davidxl, tejohnson, Ayal

Reviewed By: davidxl

Subscribers: bkramer, llvm-commits

Differential Revision: https://reviews.llvm.org/D42946

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324543 91177308-0d34-0410-b5e6-96231b3b80d8
2018-02-07 23:29:52 +00:00
Max Kazantsev
0170158e0d [NFC] Add tests for PR35743
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324209 91177308-0d34-0410-b5e6-96231b3b80d8
2018-02-05 08:09:49 +00:00
Chad Rosier
451a188a25 [LV] Use Demanded Bits and ValueTracking for reduction type-shrinking
The type-shrinking logic in reduction detection, although narrow in scope, is
also rather ad-hoc, which has led to bugs (e.g., PR35734). This patch modifies
the approach to rely on the demanded bits and value tracking analyses, if
available. We currently perform type-shrinking separately for reductions and
other instructions in the loop. Long-term, we should probably think about
computing minimal bit widths in a more complete way for the loops we want to
vectorize.
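
As a rough illustration of deriving a narrower type from demanded bits, the
Python sketch below rounds the number of demanded bits up to a power of
two. The rounding rule and the function name are assumptions made for this
example and are not taken from the patch.

```python
def shrunk_width(demanded_mask, original_bits=32):
    """Pick a power-of-two bit width that covers every demanded bit."""
    needed = max(demanded_mask.bit_length(), 1)   # highest demanded bit + 1
    width = 1
    while width < needed:
        width *= 2                                # round up to a power of two
    return min(width, original_bits)              # never wider than the original

print(shrunk_width(0xFF))
```

A reduction whose users only demand the low 8 bits could then be computed
in an 8-bit type instead of the original 32-bit one.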

PR35734
Differential Revision: https://reviews.llvm.org/D42309

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324195 91177308-0d34-0410-b5e6-96231b3b80d8
2018-02-04 15:42:24 +00:00
Craig Topper
2f472871bc [X86] Add support for passing 'prefer-vector-width' function attribute into X86Subtarget and exposing via X86's getRegisterWidth TTI interface.
This will cause the vectorizers to do some limiting of the vector widths they create. This is not a strict limit. There are cases I know of in which the loop vectorizer will generate larger vectors.

I've written this in such a way that the interface will only return a properly supported width (0/128/256/512) even if the attribute says something funny like 384 or 10.
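
A Python sketch of such a mapping is shown below. Rounding down to the
largest supported width, and the `max_hw_width` parameter standing in for
the subtarget's capabilities, are assumptions for illustration; the message
only states that the result is one of 0/128/256/512.

```python
SUPPORTED_WIDTHS = (512, 256, 128)

def effective_vector_width(preferred, max_hw_width=512):
    """Map an arbitrary preferred width to a supported register width."""
    for width in SUPPORTED_WIDTHS:
        if width <= max_hw_width and preferred >= width:
            return width                  # largest supported width that fits
    return 0                              # nothing supported fits the request

print(effective_vector_width(384), effective_vector_width(10))
```

Under these assumptions a request of 384 falls back to 256, and a request
of 10 yields 0.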

This has been split from D41895 with the remainder in a follow up commit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@323015 91177308-0d34-0410-b5e6-96231b3b80d8
2018-01-20 00:26:08 +00:00
Craig Topper
36049c5e9c [X86] Use vmovdqu64/vmovdqa64 for unmasked integer vector stores for consistency with loads.
Previously we used 64 for vXi64 stores and 32 for everything else. This change uses 64 for everything, just like we do for loads.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@322820 91177308-0d34-0410-b5e6-96231b3b80d8
2018-01-18 07:44:09 +00:00
Andrei Elovikov
9bfe6e5ae2 [LV] Don't call recordVectorLoopValueForInductionCast for newly-created IV from a trunc.
Summary:
This method is supposed to be called for IVs that have casts in their use-def
chains that are completely ignored after vectorization under PSE. However, for
truncates of such IVs the same InductionDescriptor is used during
creation/widening of both original IV based on PHINode and new IV based on
TruncInst.

This leads to an unintended second call to recordVectorLoopValueForInductionCast
with a VectorLoopVal set to the newly created IV for a trunc and causes an
assert due to an attempt to store new information for an already existing entry
in the map. This is wrong and should not be done.

Fixes PR35773.

Reviewers: dorit, Ayal, mssimpso

Reviewed By: dorit

Subscribers: RKSimon, dim, dcaballe, hsaito, llvm-commits, hiraditya

Differential Revision: https://reviews.llvm.org/D41913

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@322473 91177308-0d34-0410-b5e6-96231b3b80d8
2018-01-15 10:56:07 +00:00
Florian Hahn
f605692d39 [LV] Remove unnecessary DoExtraAnalysis guard (silent bug)
canVectorize is only checking if the loop has a normalized pre-header if DoExtraAnalysis is true.
This doesn't make sense to me because reporting analysis information shouldn't alter legality
checks. This is probably the result of a last minute minor change before committing (?).

Patch by Diego Caballero.

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D40973


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321172 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-20 13:28:38 +00:00