212 Commits

Author SHA1 Message Date
Simon Pilgrim
05803957d4 [SLPVectorizer][X86] Regenerated SEXT/ZEXT cast vectorization tests
Added 256-bit vector test as well

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268811 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-06 22:22:18 +00:00
Simon Pilgrim
ffe5d4f88c [SLPVectorizer][X86] Added BSWAP/BITREVERSE vectorization tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268803 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-06 21:41:55 +00:00
Simon Pilgrim
874ffbec05 [SLPVectorizer][X86] Added CTPOP/CTLZ/CTTZ vectorization tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268800 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-06 21:33:01 +00:00
David Majnemer
419fc9c644 [SLPVectorizer] Add operand bundles to vectorized functions
SLPVectorizing a call site should result in further propagation of its
bundles.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268004 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-29 07:09:51 +00:00
Arch D. Robison
e95eedcc04 [SLPVectorizer] Extend SLP Vectorizer to deal with aggregates.
The refactoring portion part was done as r267748.

http://reviews.llvm.org/D14185



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267899 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-28 16:11:45 +00:00
Matthew Simpson
d0229876a9 [TTI] Add hook for vector extract with extension
This change adds a new hook for estimating the cost of vector extracts followed
by zero- and sign-extensions. The motivating example for this change is the
SMOV and UMOV instructions on AArch64. These instructions move data from vector
to general purpose registers while performing the corresponding extension
(sign-extend for SMOV and zero-extend for UMOV) at the same time. For these
operations, TargetTransformInfo can assume the extensions are free and only
report the cost of the vector extract. The SLP vectorizer has been updated to
make use of the new hook.

Differential Revision: http://reviews.llvm.org/D18523

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267725 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-27 15:20:21 +00:00
Adrian Prantl
4eeaa0da04 [PR27284] Reverse the ownership between DICompileUnit and DISubprogram.
Currently each Function points to a DISubprogram and DISubprogram has a
scope field. For member functions the scope is a DICompositeType. DIScopes
point to the DICompileUnit to facilitate type uniquing.

Distinct DISubprograms (with isDefinition: true) are not part of the type
hierarchy and cannot be uniqued. This change removes the subprograms
list from DICompileUnit and instead adds a pointer to the owning compile
unit to distinct DISubprograms. This would make it easy for ThinLTO to
strip unneeded DISubprograms and their transitively referenced debug info.

Motivation
----------

Materializing DISubprograms is currently the most expensive operation when
doing a ThinLTO build of clang.

We want the DISubprogram to be stored in a separate Bitcode block (or the
same block as the function body) so we can avoid having to expensively
deserialize all DISubprograms together with the global metadata. If a
function has been inlined into another subprogram we need to store a
reference the block containing the inlined subprogram.

Attached to https://llvm.org/bugs/show_bug.cgi?id=27284 is a python script
that updates LLVM IR testcases to the new format.

http://reviews.llvm.org/D19034
<rdar://problem/25256815>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@266446 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-15 15:57:41 +00:00
David Majnemer
8b680c27be [SLPVectorizer] Vectorizing the libm sqrt to llvm's sqrt intrinsic requires nnan
To quote the langref "Unlike sqrt in libm, however, llvm.sqrt has
undefined behavior for negative numbers other than -0.0 (which allows
for better optimization, because there is no need to worry about errno
being set). llvm.sqrt(-0.0) is defined to return -0.0 like IEEE sqrt."

This means that it's unsafe to replace sqrt with llvm.sqrt unless the
call is annotated with nnan.

Thanks to Hal Finkel for pointing this out!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265521 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-06 07:04:53 +00:00
David Majnemer
731666ee90 [SLPVectorizer] Vectorize libcalls of sqrt
We didn't realize that we could transform the libcall into a vectorized
intrinsic.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265493 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-06 00:14:59 +00:00
David Majnemer
3f538b527e [SLPVectorizer] Don't insert an extractelement before a catchswitch
A catchswitch cannot be preceded by another instruction in the same
basic block (other than a PHI node).

Instead, insert the extract element right after the materialization of
the vectorized value.  This isn't optimal but is a reasonable compromise
given the constraints of WinEH.

This fixes PR27163.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265157 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-01 17:28:15 +00:00
Adrian Prantl
7876f64bc3 testcase gardening: update the emissionKind enum to the new syntax. (NFC)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265081 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-01 00:16:49 +00:00
Adrian Prantl
39bb84a097 Move the DebugEmissionKind enum from DIBuilder into DICompileUnit.
This mostly cosmetic patch moves the DebugEmissionKind enum from DIBuilder
into DICompileUnit. DIBuilder is not the right place for this enum to live
in — a metadata consumer should not have to include DIBuilder.h.
I also added a Verifier check that checks that the emission kind of a
DICompileUnit is actually legal.

http://reviews.llvm.org/D18612
<rdar://problem/25427165>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265077 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-31 23:56:58 +00:00
Paul Robinson
45648a4496 Fix tests that used CHECK-NEXT-NOT and CHECK-DAG-NOT.
FileCheck actually doesn't support combo suffixes.

Differential Revision: http://reviews.llvm.org/D17588


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262054 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-26 19:40:34 +00:00
Matthew Simpson
fb46056c3a Reapply commit r259357 with a fix for PR26629
Commit r259357 was reverted because it caused PR26629. We were assuming all
roots of a vectorizable tree could be truncated to the same width, which is not
the case in general. This commit reapplies the patch along with a fix and a new
test case to ensure we don't regress because of this issue again. This should
fix PR26629.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@261212 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-18 14:14:40 +00:00
David Majnemer
fcc16ed65e Revert "Reapply commit r258404 with fix."
This reverts commit r259357, it caused PR26629.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@261137 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-17 19:02:36 +00:00
Matthew Simpson
7f7276a903 Add test case missing from r259357 (NFC)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259385 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-01 19:09:24 +00:00
Matthew Simpson
5c7e8a999b Reapply commit r258404 with fix.
The previous patch caused PR26364. The fix is to ensure that we don't enter a
cycle when iterating over use-def chains.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259357 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-01 13:38:29 +00:00
David Majnemer
208a5cc2b0 Revert "Reapply commit r258404 with fix"
This reverts commit r258929, it caused PR26364.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259148 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-29 02:43:22 +00:00
Matthew Simpson
e470293402 Reapply commit r258404 with fix
This patch is the second attempt to reapply commit r258404. There was bug in
the initial patch and subsequent fix (mentioned below).

The initial patch caused an assertion because we were computing smaller type
sizes for instructions that cannot be demoted. The fix first determines the
instructions that will be demoted, and then applies the smaller type size to
only those instructions.

This should fix PR26239 and PR26307.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258929 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-27 13:43:27 +00:00
Matthew Simpson
9124889505 Revert "Reapply commit r258404 with fix"
This commit exposes a crash in computeKnownBits on the Chromium buildbots.
Reverting to investigate.

Reference: https://llvm.org/bugs/show_bug.cgi?id=26307

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258812 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-26 15:45:49 +00:00
Matthew Simpson
ceb1e843a0 Reapply commit r25804 with fix
We were hitting an assertion because we were computing smaller type sizes for
instructions that cannot be demoted. The fix first determines the instructions
that will be demoted, and then applies the smaller type size to only those
instructions.

This should fix PR26239.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258705 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-25 19:24:29 +00:00
Matthew Simpson
895661cc6d Revert "[SLP] Truncate expressions to minimum required bit width"
This reverts commit r258404.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258408 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-21 17:17:20 +00:00
Matthew Simpson
9549f8f7fa [SLP] Truncate expressions to minimum required bit width
This change attempts to produce vectorized integer expressions in bit widths
that are narrower than their scalar counterparts. The need for demotion arises
especially on architectures in which the small integer types (e.g., i8 and i16)
are not legal for scalar operations but can still be used in vectors. Like
similar work done within the loop vectorizer, we rely on InstCombine to perform
the actual type-shrinking. We use the DemandedBits analysis and
ComputeNumSignBits from ValueTracking to determine the minimum required bit
width of an expression.

Differential revision: http://reviews.llvm.org/D15815

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258404 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-21 16:31:55 +00:00
Matthew Simpson
8fa16f95ff Reapply r257800 with fix
The fix uniques the bundle of getelementptr indices we are about to vectorize
since it's possible for the same index to be used by multiple instructions.
The original commit message is below.

[SLP] Vectorize the index computations of getelementptr instructions.

This patch seeds the SLP vectorizer with getelementptr indices. The primary
motivation in doing so is to vectorize gather-like idioms beginning with
consecutive loads (e.g., g[a[0] - b[0]] + g[a[1] - b[1]] + ...). While these
cases could be vectorized with a top-down phase, seeding the existing bottom-up
phase with the index computations avoids the complexity, compile-time, and
phase ordering issues associated with a full top-down pass. Only bundles of
single-index getelementptrs with non-constant differences are considered for
vectorization.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257918 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-15 18:51:51 +00:00
Matthew Simpson
df4b806e4d Revert "[SLP] Vectorize the index computations of getelementptr instructions."
This reverts commit r257800.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257888 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-15 13:10:46 +00:00
Keno Fischer
621b821be3 Reapply r257105 "[Verifier] Check that debug values have proper size"
I originally reapplied this in 257550, but had to revert again due to bot
breakage. The only change in this version is to allow either the TypeSize
or the TypeAllocSize of the variable to be the one represented in debug info
(hopefully in the future we can figure out how to encode the difference).
Additionally, several bot failures following r257550, were due to
optimizer bugs now fixed in r257787 and r257795.

r257550 commit message was:

```
The follow extra changes were made to test cases:

Manually making the variable be the actual type instead of a pointer
to avoid pointer-size differences in generic code:

    LLVM :: DebugInfo/Generic/2010-03-24-MemberFn.ll
    LLVM :: DebugInfo/Generic/2010-04-06-NestedFnDbgInfo.ll
    LLVM :: DebugInfo/Generic/2010-05-03-DisableFramePtr.ll
    LLVM :: DebugInfo/Generic/varargs.ll

Delete sizing information from debug info for the same reason
(but the presence of the pointer was important to the test case):

    LLVM :: DebugInfo/Generic/restrict.ll
    LLVM :: DebugInfo/Generic/tu-composite.ll
    LLVM :: Linker/type-unique-type-array-a.ll
    LLVM :: Linker/type-unique-simple2.ll

Fixing an incorrect DW_OP_deref

    LLVM :: DebugInfo/Generic/2010-05-03-OriginDIE.ll

Fixing a missing DW_OP_deref

    LLVM :: DebugInfo/Generic/incorrect-variable-debugloc.ll

Additionally, clang should no longer complain during bootstrap should no
longer happen after r257534.

The original commit message was:
``
Summary:
Teach the Verifier to make sure that the storage size given to llvm.dbg.declare
or the value size given to llvm.dbg.value agree with what is declared in
DebugInfo. This is implicitly assumed in a number of passes (e.g. in SROA).
Additionally this catches a number of common mistakes, such as passing a
pointer when a value was intended or vice versa.

One complication comes from stack coloring which modifies the original IR when
it merges allocas in order to make sure that if AA falls back to the IR it gets
the correct result. However, given this new invariant, indiscriminately
replacing one alloca by a different (differently sized one) is no longer valid.
Fix this by just undefing out any use of the alloca in a dbg.declare in this
case.

Additionally, I had to fix a number of test cases. Of particular note:
- I regenerated dbg-changes-codegen-branch-folding.ll from the given source as
  it was affected by the bug fixed in r256077
- two-cus-from-same-file.ll was changed to avoid having a variable-typed debug
  variable as that would depend on the target, even though this test is
  supposed to be generic
- I had to manually declared size/align for reference type. See also the
  discussion for D14275/r253186.
- fpstack-debuginstr-kill.ll required changing `double` to `long double`
- most others were just a question of adding OP_deref
``

```

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257850 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-15 00:46:17 +00:00
Matthew Simpson
bdd1452784 [SLP] Vectorize the index computations of getelementptr instructions.
This patch seeds the SLP vectorizer with getelementptr indices. The primary
motivation in doing so is to vectorize gather-like idioms beginning with
consecutive loads (e.g., g[a[0] - b[0]] + g[a[1] - b[1]] + ...). While these
cases could be vectorized with a top-down phase, seeding the existing bottom-up
phase with the index computations avoids the complexity, compile-time, and
phase ordering issues associated with a full top-down pass. Only bundles of
single-index getelementptrs with non-constant differences are considered for
vectorization.

Differential Revision: http://reviews.llvm.org/D14829

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257800 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-14 20:46:27 +00:00
Keno Fischer
6c1dec62d2 Re-Revert r257105 (Verifier debug info changes)
While I investigate some new buildbot failures. This was originally reapplied
as r257550 and r257558.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257563 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-13 02:31:14 +00:00
Keno Fischer
99220ce3fc Reapply r257105 "[Verifier] Check that debug values have proper size"
The follow extra changes were made to test cases:

Manually making the variable be the actual type instead of a pointer
to avoid pointer-size differences in generic code:

    LLVM :: DebugInfo/Generic/2010-03-24-MemberFn.ll
    LLVM :: DebugInfo/Generic/2010-04-06-NestedFnDbgInfo.ll
    LLVM :: DebugInfo/Generic/2010-05-03-DisableFramePtr.ll
    LLVM :: DebugInfo/Generic/varargs.ll

Delete sizing information from debug info for the same reason
(but the presence of the pointer was important to the test case):

    LLVM :: DebugInfo/Generic/restrict.ll
    LLVM :: DebugInfo/Generic/tu-composite.ll
    LLVM :: Linker/type-unique-type-array-a.ll
    LLVM :: Linker/type-unique-simple2.ll

Fixing an incorrect DW_OP_deref

    LLVM :: DebugInfo/Generic/2010-05-03-OriginDIE.ll

Fixing a missing DW_OP_deref

    LLVM :: DebugInfo/Generic/incorrect-variable-debugloc.ll

Additionally, clang should no longer complain during bootstrap should no
longer happen after r257534.

The original commit message was:
```
Summary:
Teach the Verifier to make sure that the storage size given to llvm.dbg.declare
or the value size given to llvm.dbg.value agree with what is declared in
DebugInfo. This is implicitly assumed in a number of passes (e.g. in SROA).
Additionally this catches a number of common mistakes, such as passing a
pointer when a value was intended or vice versa.

One complication comes from stack coloring which modifies the original IR when
it merges allocas in order to make sure that if AA falls back to the IR it gets
the correct result. However, given this new invariant, indiscriminately
replacing one alloca by a different (differently sized one) is no longer valid.
Fix this by just undefing out any use of the alloca in a dbg.declare in this
case.

Additionally, I had to fix a number of test cases. Of particular note:
- I regenerated dbg-changes-codegen-branch-folding.ll from the given source as
  it was affected by the bug fixed in r256077
- two-cus-from-same-file.ll was changed to avoid having a variable-typed debug
  variable as that would depend on the target, even though this test is
  supposed to be generic
- I had to manually declared size/align for reference type. See also the
  discussion for D14275/r253186.
- fpstack-debuginstr-kill.ll required changing `double` to `long double`
- most others were just a question of adding OP_deref
```

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257550 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-13 00:31:44 +00:00
Keno Fischer
c950114021 Temporarily revert r257105 "[Verifier] Check that debug values have proper size"
Looks like there's a case where clang generates debug info that triggers
the new verifier check. Reverting while investigating.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257107 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-07 22:39:11 +00:00
Keno Fischer
97515eb97b [Verifier] Check that debug values have proper size
Summary:
Teach the Verifier to make sure that the storage size given to llvm.dbg.declare
or the value size given to llvm.dbg.value agree with what is declared in
DebugInfo. This is implicitly assumed in a number of passes (e.g. in SROA).
Additionally this catches a number of common mistakes, such as passing a
pointer when a value was intended or vice versa.

One complication comes from stack coloring which modifies the original IR when
it merges allocas in order to make sure that if AA falls back to the IR it gets
the correct result. However, given this new invariant, indiscriminately
replacing one alloca by a different (differently sized one) is no longer valid.
Fix this by just undefing out any use of the alloca in a dbg.declare in this
case.

Additionally, I had to fix a number of test cases. Of particular note:
- I regenerated dbg-changes-codegen-branch-folding.ll from the given source as
  it was affected by the bug fixed in r256077
- two-cus-from-same-file.ll was changed to avoid having a variable-typed debug
  variable as that would depend on the target, even though this test is
  supposed to be generic
- I had to manually declared size/align for reference type. See also the
  discussion for D14275/r253186.
- fpstack-debuginstr-kill.ll required changing `double` to `long double`
- most others were just a question of adding OP_deref

Reviewers: aprantl
Differential Revision: http://reviews.llvm.org/D14276

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257105 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-07 22:18:37 +00:00
Charlie Turner
35c68def46 [NFC] Update horizontal reduction test cases.
These testcases no longer need to specify -slp-vectorize-hor, since it was
enabled by default in r252733.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255783 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-16 17:22:24 +00:00
Mehdi Amini
7d84928143 Fix SLPVectorizer commutativity reordering
The SLPVectorizer had a very crude way of trying to benefit
from associativity: it tried to optimize for splat/broadcast
or in order to have the same operator on the same side.
This is benefitial to the cost model and allows more vectorization
to occur.
This patch improve the logic and make the detection optimal (locally,
we don't look at the full tree but only at the immediate children).

Should fix https://llvm.org/bugs/show_bug.cgi?id=25247

Reviewers: mzolotukhin

Differential Revision: http://reviews.llvm.org/D13996

From: Mehdi Amini <mehdi.amini@apple.com>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252337 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-06 20:17:51 +00:00
Peter Collingbourne
5f220beefc DI: Reverse direction of subprogram -> function edge.
Previously, subprograms contained a metadata reference to the function they
described. Because most clients need to get or set a subprogram for a given
function rather than the other way around, this created unneeded inefficiency.

For example, many passes needed to call the function llvm::makeSubprogramMap()
to build a mapping from functions to subprograms, and the IR linker needed to
fix up function references in a way that caused quadratic complexity in the IR
linking phase of LTO.

This change reverses the direction of the edge by storing the subprogram as
function-level metadata and removing DISubprogram's function field.

Since this is an IR change, a bitcode upgrade has been provided.

Fixes PR23367. An upgrade script for textual IR for out-of-tree clients is
attached to the PR.

Differential Revision: http://reviews.llvm.org/D14265

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252219 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-05 22:03:56 +00:00
Charlie Turner
63fe1641e4 [SLP] Be more aggressive about reduction width selection.
Summary:
This change could be way off-piste, I'm looking for any feedback on whether it's an acceptable approach.

It never seems to be a problem to gobble up as many reduction values as can be found, and then to attempt to reduce the resulting tree. Some of the workloads I'm looking at have been aggressively unrolled by hand, and by selecting reduction widths that are not constrained by a vector register size, it becomes possible to profitably vectorize. My test case shows such an unrolling which SLP was not vectorizing (on neither ARM nor X86) before this patch, but with it does vectorize.

I measure no significant compile time impact of this change when combined with D13949 and D14063. There are also no significant performance regressions on ARM/AArch64 in SPEC or LNT.

The more principled approach I thought of was to generate several candidate tree's and use the cost model to pick the cheapest one. That seemed like quite a big design change (the algorithms seem very much one-shot), and would likely be a costly thing for compile time. This seemed to do the job at very little cost, but I'm worried I've misunderstood something!

Reviewers: nadav, jmolloy

Subscribers: mssimpso, llvm-commits, aemerson

Differential Revision: http://reviews.llvm.org/D14116

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251428 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-27 17:59:03 +00:00
Charlie Turner
751d6ddd6d [SLP] Try a bit harder to find reduction PHIs
Summary:
Currently, when the SLP vectorizer considers whether a phi is part of a reduction, it dismisses phi's whose incoming blocks are not the same as the block containing the phi. For the patterns I'm looking at, extending this rule to allow phis whose incoming block is a containing loop latch allows me to vectorize certain workloads.

There is no significant compile-time impact, and combined with D13949, no performance improvement measured in ARM/AArch64 in any of SPEC2000, SPEC2006 or LNT.

Reviewers: jmolloy, mcrosier, nadav

Subscribers: mssimpso, nadav, aemerson, llvm-commits

Differential Revision: http://reviews.llvm.org/D14063

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251425 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-27 17:54:16 +00:00
Charlie Turner
b13834ec71 [SLP] Treat SelectInsts as reduction values.
Summary:
Certain workloads, in particular sum-of-absdiff loops, can be vectorized using SLP if it can treat select instructions as reduction values.

The test case is a bit awkward. The AArch64 cost model needs some tuning to not be so pessimistic about selects. I've had to tweak the SLP threshold here.

Reviewers: jmolloy, mzolotukhin, spatel, nadav

Subscribers: nadav, mssimpso, aemerson, llvm-commits

Differential Revision: http://reviews.llvm.org/D13949

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251424 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-27 17:49:11 +00:00
Michael Zolotukhin
ad0be616bd [SLP] Don't vectorize loads of non-packed types (like i1, i2).
Summary:
Given an array of i2 elements, 4 consecutive scalar loads will be lowered to
i8-sized loads and thus will access 4 consecutive bytes in memory. If we
vectorize these loads into a single <4 x i2> load, it'll access only 1 byte in
memory. Hence, we should prohibit vectorization in such cases.

PS: Initial patch was proposed by Arnold.

Reviewers: aschwaighofer, nadav, hfinkel

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D13277

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248943 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-30 21:05:43 +00:00
Erik Eckstein
0d973b565c SLPVectorizer: add a test to check if the minimum region size works.
This is an addition to rL248917.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248923 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-30 17:28:19 +00:00
Erik Eckstein
e2b6175820 SLPVectorizer: limit the scheduling region size per basic block.
Usually large blocks are not a problem. But if a large block (> 10k instructions)
contains many (potential) chains of vector instructions, and those chains are
spread over a wide range of instructions, then scheduling becomes a compile time problem.
This change introduces a limit for the accumulate scheduling region size of a block.
For real-world functions this limit will never be exceeded (it's about 10x larger than
the maximum value seen in the test-suite and external test suite).



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248917 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-30 17:00:44 +00:00
Duncan P. N. Exon Smith
a5ae7c1c9f DI: Require subprogram definitions to be distinct
As a follow-up to r246098, require `DISubprogram` definitions
(`isDefinition: true`) to be 'distinct'.  Specifically, add an assembler
check, a verifier check, and bitcode upgrading logic to combat testcase
bitrot after the `DIBuilder` change.

While working on the testcases, I realized that
test/Linker/subprogram-linkonce-weak-odr.ll isn't relevant anymore.  Its
purpose was to check for a corner case in PR22792 where two subprogram
definitions match exactly and share the same metadata node.  The new
verifier check, requiring that subprogram definitions are 'distinct',
precludes that possibility.

I updated almost all the IR with the following script:

    git grep -l -E -e '= !DISubprogram\(.* isDefinition: true' |
    grep -v test/Bitcode |
    xargs sed -i '' -e 's/= \(!DISubprogram(.*, isDefinition: true\)/= distinct \1/'

Likely some variant of would work for out-of-tree testcases.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246327 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-28 20:26:49 +00:00
Michael Zolotukhin
3dc9abd7c5 [SLP] Add one more test case for propagating 'nontemporal' attributes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245644 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-21 00:08:39 +00:00
Michael Zolotukhin
9cd73adba0 [SLP] Propagate 'nontemporal' attribute into vectorized instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245633 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-20 22:28:15 +00:00
Silviu Baranga
1127a30ffe [CostModel][AArch64] Increase cost of vector insert element and add missing cast costs
Summary:
Increase the estimated costs for insert/extract element operations on
AArch64. This is motivated by results from benchmarking interleaved
accesses.

Add missing costs for zext/sext/trunc instructions and some integer to
floating point conversions. These costs were previously calculated
by scalarizing these operation and were affected by the cost increase of
the insert/extract element operations.

Reviewers: rengolin

Subscribers: mcrosier, aemerson, rengolin, llvm-commits

Differential Revision: http://reviews.llvm.org/D11939

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245226 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-17 16:05:09 +00:00
Duncan P. N. Exon Smith
c61bc48acb DI: Disallow uniquable DICompileUnits
Since r241097, `DIBuilder` has only created distinct `DICompileUnit`s.
The backend is liable to start relying on that (if it hasn't already),
so make uniquable `DICompileUnit`s illegal and automatically upgrade old
bitcode.  This is a nice cleanup, since we can remove an unnecessary
`DenseSet` (and the associated uniquing info) from `LLVMContextImpl`.

Almost all the testcases were updated with this script:

    git grep -e '= !DICompileUnit' -l -- test |
    grep -v test/Bitcode |
    xargs sed -i '' -e 's,= !DICompileUnit,= distinct !DICompileUnit,'

I imagine something similar should work for out-of-tree testcases.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243885 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-03 17:26:41 +00:00
Duncan P. N. Exon Smith
bf2040f00c DI: Remove DW_TAG_arg_variable and DW_TAG_auto_variable
Remove the fake `DW_TAG_auto_variable` and `DW_TAG_arg_variable` tags,
using `DW_TAG_variable` in their place Stop exposing the `tag:` field at
all in the assembly format for `DILocalVariable`.

Most of the testcase updates were generated by the following sed script:

    find test/ -name "*.ll" -o -name "*.mir" |
    xargs grep -l 'DILocalVariable' |
    xargs sed -i '' \
      -e 's/tag: DW_TAG_arg_variable, //' \
      -e 's/tag: DW_TAG_auto_variable, //'

There were only a handful of tests in `test/Assembly` that I needed to
update by hand.

(Note: a follow-up could change `DILocalVariable::DILocalVariable()` to
set the tag to `DW_TAG_formal_parameter` instead of `DW_TAG_variable`
(as appropriate), instead of having that logic magically in the backend
in `DbgVariable`.  I've added a FIXME to that effect.)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243774 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-31 18:58:39 +00:00
Wei Mi
4db6728b3a [SLP vectorizer]: Choose the best consecutive candidate to pair with a store instruction.
The patch changes the SLPVectorizer::vectorizeStores to choose the immediate
succeeding or preceding candidate for a store instruction when it has multiple
consecutive candidates. In this way it has better chance to find more slp
vectorization opportunities.

Differential Revision: http://reviews.llvm.org/D10445


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243666 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-30 17:40:39 +00:00
Sanjay Patel
c1c43c15cc [SLPVectorizer] Try different vectorization factors for store chains
...and set max vector register size based on target 

This patch is based on discussion on the llvmdev mailing list:
http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-July/087405.html

and also solves:
https://llvm.org/bugs/show_bug.cgi?id=17170

Several FIXME/TODO items are noted in comments as potential improvements.

Differential Revision: http://reviews.llvm.org/D10950



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241760 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-08 23:40:55 +00:00
Sanjay Patel
ff7b255377 change CHECK to CHECK-LABEL for more precision
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241422 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-05 23:19:16 +00:00
Sanjay Patel
a92598d339 remove unnecessary test specifications
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241419 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-05 22:37:51 +00:00