Commit Graph

451 Commits

Author SHA1 Message Date
Elena Demikhovsky
407fc99045 Fixed a bug in vectorizing GEP before gather/scatter intrinsic.
Vectorizing GEP was incorrect and broke SSA in some cases.
 
The patch fixes PR27997 https://llvm.org/bugs/show_bug.cgi?id=27997.

Differential revision: http://reviews.llvm.org/D22035



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274735 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-07 06:06:46 +00:00
Michael Kuperstein
c7432f9ad3 [TTI] The cost model should not assume vector casts get completely scalarized
The cost model should not assume vector casts get completely scalarized, since
on targets that have vector support, the common case is a partial split up to
the legal vector size. So, when a vector cast gets split, the resulting casts
end up legal and cheap.

Instead of pessimistically assuming scalarization, base TTI can use the costs
the concrete TTI provides for the split vector, plus a fudge factor to account
for the cost of the split itself. This fudge factor is currently 1 by default,
except on AMDGPU where inserts and extracts are considered free.
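
A minimal sketch of the arithmetic described above, not LLVM's actual code. The function name, its parameters, and the assumption that an over-wide vector is legalized by repeated halving are all hypothetical and chosen only for illustration.

```c
#include <assert.h>

/* Hypothetical cost model (names are illustrative, not LLVM API).
 * num_elts:   element count of the vector being cast
 * legal_elts: element count of the widest legal vector
 * legal_cost: cost the concrete target reports for one legal cast
 * split_cost: fudge factor charged for each split (1 by default) */
unsigned split_cast_cost(unsigned num_elts, unsigned legal_elts,
                         unsigned legal_cost, unsigned split_cost)
{
    assert(num_elts > 0 && legal_elts > 0);
    if (num_elts <= legal_elts)
        return legal_cost; /* already legal: a single cheap cast */

    /* Too wide: pay for one split, then cost each half independently. */
    unsigned lo = num_elts / 2;
    unsigned hi = num_elts - lo;
    return split_cost
         + split_cast_cost(lo, legal_elts, legal_cost, split_cost)
         + split_cast_cost(hi, legal_elts, legal_cost, split_cost);
}
```

Under these assumptions, a 16-element cast with a legal width of 4, a legal cast cost of 1, and a split cost of 1 comes to 7 (three splits plus four legal casts). With a split cost of 0, as the message describes for AMDGPU, it comes to 4.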

Differential Revision: http://reviews.llvm.org/D21251


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274642 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-06 17:30:56 +00:00
Matthew Simpson
59cdf4670f [LV] Don't widen trivial induction variables
We currently always vectorize induction variables. However, if an induction
variable is only used for counting loop iterations or computing addresses with
getelementptr instructions, we don't need to do this. Vectorizing these trivial
induction variables can create vector code that is difficult to simplify later
on. This is especially true when the unroll factor is greater than one, and we
create vector arithmetic when computing step vectors. With this patch, we check
if an induction variable is only used for counting iterations or computing
addresses, and if so, scalarize the arithmetic when computing step vectors
instead. This allows for greater simplification.
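
To make the distinction concrete, here is a sketch of two loops. The function names are hypothetical, and whether the vectorizer actually treats each loop this way depends on the IR it sees after earlier passes.

```c
#include <assert.h>
#include <stddef.h>

/* 'i' only counts iterations and forms the address a[i]:
 * a "trivial" induction variable in the sense of the message. */
void fill_constant(int *a, size_t n, int value)
{
    assert(a != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        a[i] = value;
}

/* Here the value of 'i' itself is stored, so a vectorized loop
 * needs a vector of consecutive values <i, i+1, ...>. */
void fill_iota(int *a, size_t n)
{
    assert(a != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        a[i] = (int)i;
}
```

In the first loop nothing consumes the numeric value of the induction variable as data, so keeping it scalar loses nothing. In the second, a widened induction variable is genuinely needed.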

This patch addresses the suboptimal pointer arithmetic sequence seen in
PR27881.

Reference: https://llvm.org/bugs/show_bug.cgi?id=27881
Differential Revision: http://reviews.llvm.org/D21620

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274627 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-06 14:26:59 +00:00
Matt Arsenault
6bcca1a915 SLPVectorizer: Move propagateMetadata to VectorUtils
This will be re-used by the LoadStoreVectorizer.

Fix handling of range metadata and testcase by Justin Lebar.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274281 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-30 21:17:59 +00:00
Wei Mi
693c332887 Refine the set of UniformAfterVectorization instructions.
Apart from the seed uniform instructions (conditional branches and consecutive
pointer instructions), a dependency should be added to the uniform set only if
it is used solely by existing uniform instructions or by instructions outside
the current loop.

Differential Revision: http://reviews.llvm.org/D21755


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274262 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-30 18:42:56 +00:00
Elena Demikhovsky
f7282f8ee5 Reverted patch 273864
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274115 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-29 10:01:06 +00:00
Artur Pilipenko
48917c9e44 Support arbitrary addrspace pointers in masked load/store intrinsics
This is a resubmission of the 263158 change after fixing the existing problem with intrinsics mangling (see the "LTO and intrinsics mangling" llvm-dev thread for details).

This patch fixes the problem which occurs when loop-vectorize tries to use @llvm.masked.load/store intrinsic for a non-default addrspace pointer. It fails with "Calling a function with a bad signature!" assertion in CallInst constructor because it tries to pass a non-default addrspace pointer to the pointer argument which has default addrspace.

The fix is to add pointer type as another overloaded type to @llvm.masked.load/store intrinsics.

Reviewed By: reames

Differential Revision: http://reviews.llvm.org/D17270


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274043 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-28 18:27:25 +00:00
Artur Pilipenko
be0da39a48 Revert -r273892 "Support arbitrary addrspace pointers in masked load/store intrinsics" since some of the clang tests don't expect to see the updated signatures.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273895 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-27 16:54:33 +00:00
Artur Pilipenko
9227558e8e Support arbitrary addrspace pointers in masked load/store intrinsics
This is a resubmission of the 263158 change after fixing the existing problem with intrinsics mangling (see the "LTO and intrinsics mangling" llvm-dev thread for details).

This patch fixes the problem which occurs when loop-vectorize tries to use @llvm.masked.load/store intrinsic for a non-default addrspace pointer. It fails with "Calling a function with a bad signature!" assertion in CallInst constructor because it tries to pass a non-default addrspace pointer to the pointer argument which has default addrspace.

The fix is to add pointer type as another overloaded type to @llvm.masked.load/store intrinsics.

Reviewed By: reames

Differential Revision: http://reviews.llvm.org/D17270


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273892 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-27 16:29:26 +00:00
Elena Demikhovsky
98cea819c1 Removed extra test from the prev commit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273865 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-27 11:40:49 +00:00
Elena Demikhovsky
1abadbff39 Fixed consecutive memory access detection in Loop Vectorizer.
It did not correctly handle cases without GEP.

The following loop wasn't vectorized:

for (int i=0; i<len; i++)
  *to++ = *from++;

I use getPtrStride() to find the Stride of the memory access and return 0 if the Stride is not 1 or -1.

Re-commit rL273257 - revision: http://reviews.llvm.org/D20789



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273864 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-27 11:19:23 +00:00
Matthew Simpson
1d3ab3afe1 [LV] Preserve order of dependences in interleaved accesses analysis
The interleaved access analysis currently assumes that the inserted run-time
pointer aliasing checks ensure the absence of dependences that would prevent
its instruction reordering. However, this is not the case.

Issues can arise from how code generation is performed for interleaved groups.
For a load group, all loads in the group are essentially moved to the location
of the first load in program order, and for a store group, all stores in the
group are moved to the location of the last store. For groups having members
involved in a dependence relation with any other instruction in the loop, this
reordering can violate the dependence.

This patch teaches the interleaved access analysis how to avoid breaking such
dependences, and should fix PR27626.

An assumption of the original analysis was that the accesses had been collected
in "program order". The analysis was then simplified by visiting the accesses
bottom-up. However, this ordering was never guaranteed for anything other than
single basic block loops. Thus, this patch also enforces the desired ordering.

Reference: https://llvm.org/bugs/show_bug.cgi?id=27626
Differential Revision: http://reviews.llvm.org/D19984

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273687 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-24 15:33:25 +00:00
Elena Demikhovsky
6f247766ed Reverted the previous commit due to an assertion failure
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273258 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-21 12:10:11 +00:00
Elena Demikhovsky
8ca9d2a8ad Fixed consecutive memory access detection in Loop Vectorizer.
It did not correctly handle cases without GEP.

The following loop wasn't vectorized:

for (int i=0; i<len; i++)
  *to++ = *from++;

I use getPtrStride() to find the Stride of the memory access and return 0 if the Stride is not 1 or -1.

Differential revision: http://reviews.llvm.org/D20789



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273257 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-21 11:32:01 +00:00
Michael Kuperstein
932889b1a1 Recommit [LV] Enable vectorization of loops where the IV has an external use
r272715 broke libcxx because it did not correctly handle cases where the
last iteration of one IV is the second-to-last iteration of another.

Original commit message:
Vectorizing loops with "escaping" IVs has been disabled since r190790, due to
PR17179. This re-enables it, with support for external use of both
"post-increment" (last iteration) and "pre-increment" (second-to-last iteration)
IVs.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272742 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-15 00:35:26 +00:00
Michael Kuperstein
f1f9f2c316 Reverting r272715 since it broke libcxx.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272730 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-14 22:30:41 +00:00
Michael Kuperstein
4d190088e9 [LV] Enable vectorization of loops where the IV has an external use
Vectorizing loops with "escaping" IVs has been disabled since r190790, due to
PR17179. This re-enables it, with support for external use of both
"post-increment" (last iteration) and "pre-increment" (second-to-last iteration)
IVs.
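
A small sketch of what an "escaping" induction variable looks like at the source level. The function and parameter names are hypothetical, and the mapping from this C code to "post-increment" and "pre-increment" uses in LLVM IR is approximate, since it depends on how earlier passes shape the loop.

```c
#include <assert.h>
#include <stddef.h>

/* Copies n ints and reports two values derived from the induction variable.
 * *end_index  receives i after the final increment (a post-increment use).
 * *last_index receives the i seen by the last executed iteration
 *             (a pre-increment use); it is 0 when n == 0. */
void copy_and_report(int *dst, const int *src, size_t n,
                     size_t *end_index, size_t *last_index)
{
    assert(end_index != NULL && last_index != NULL);
    size_t i, last = 0;
    for (i = 0; i < n; i++) {
        dst[i] = src[i];
        last = i;
    }
    *end_index = i;     /* value of the IV after the loop */
    *last_index = last; /* value of the IV in the final iteration */
}
```

Both results are read after the loop has finished, so a vectorized version has to reconstruct the correct scalar values on exit.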

Differential Revision: http://reviews.llvm.org/D21048


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272715 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-14 21:27:27 +00:00
Matthew Simpson
570bf9da46 Reapply "[TTI] Refine default cost for interleaved load groups with gaps"
This reapplies commit r272385 with a fix. The build was failing when compiled
with gcc, but not with clang. With the fix, we now get the data layout from the
current TTI implementation, which will hopefully solve the issue.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272395 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-10 14:33:30 +00:00
Matthew Simpson
f8ad75dbd3 Revert "[TTI] Refine default cost for interleaved load groups with gaps"
This reverts commit r272385. This commit broke the build. I'm temporarily
reverting to investigate.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272391 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-10 12:41:33 +00:00
Matthew Simpson
8f626cdbc5 [TTI] Refine default cost for interleaved load groups with gaps
This patch refines the default cost for interleaved load groups having gaps. If
a load group has gaps, the legalized instructions corresponding to the unused
elements will be dead. Thus, we don't need to account for them in the cost
model. Instead, we only need to account for the fraction of legalized loads
that will actually be used.
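
For illustration, a loop that forms an interleaved load group with a gap. The function name and data layout are assumptions made for this sketch only.

```c
#include <assert.h>
#include <stddef.h>

/* Pixels are packed as r, g, b triples. Only r and b are loaded, so the
 * interleaved load group (stride 3) has a gap at member index 1. */
long sum_red_and_blue(const int *rgb, size_t n_pixels)
{
    assert(rgb != NULL || n_pixels == 0);
    long sum = 0;
    for (size_t i = 0; i < n_pixels; i++)
        sum += rgb[3 * i] + rgb[3 * i + 2];
    return sum;
}
```

Because the g values are never used, any legalized load that would only have supplied them is dead, which is the case the refined cost is meant to stop charging for.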

Differential Revision: http://reviews.llvm.org/D20873

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272385 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-10 11:27:51 +00:00
Michael Kuperstein
26a9a137a4 [LV] Use vector phis for some secondary induction variables
Previously, we materialized secondary vector IVs from the primary scalar IV,
by offsetting the primary to match the correct start value, and then broadcasting
it - inside the loop body. Instead, we can use a real vector IV, like we do for
the primary.

This enables using vector IVs for secondary integer IVs whose type matches the
type of the primary.

Differential Revision: http://reviews.llvm.org/D20932


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272283 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-09 18:03:15 +00:00
Andrey Turetskiy
54dfe5f991 Quick fix for the test from rL272014 "[LAA] Improve non-wrapping pointer
detection by handling loop-invariant case" (a couple of buildbots failed).

Patch by Roman Shirokiy.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272019 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-07 15:52:35 +00:00
Andrey Turetskiy
a87b055656 [LAA] Improve non-wrapping pointer detection by handling loop-invariant case.
This fixes PR26314. This patch adds new helper “isNoWrap” with detection of
loop-invariant pointer case.

Patch by Roman Shirokiy.

Ref: https://llvm.org/bugs/show_bug.cgi?id=26314

Differential Revision: http://reviews.llvm.org/D17268


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272014 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-07 14:55:27 +00:00
Michael Kuperstein
6954e6256d [InstCombine] scalarizePHI should not assume the code it sees has been CSE'd
scalarizePHI only looked for phis that have exactly two uses - the "latch"
use, and an extract. Unfortunately, we can not assume all equivalent extracts
are CSE'd, since InstCombine itself may create an extract which is a duplicate
of an existing one. This extends it to handle several distinct extracts from
the same index.

This should fix at least some of the performance regressions from PR27988.

Differential Revision: http://reviews.llvm.org/D20983


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271961 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-06 23:38:33 +00:00
Daniel Berlin
aed422107d Revert "Claim NoAlias if two GEPs index different fields of the same struct"
This reverts commit 2d5d6493f43eb68493a3852b8c226ac9fafdc7eb.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271422 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-01 18:55:32 +00:00
Daniel Berlin
7e49342f2d Claim NoAlias if two GEPs index different fields of the same struct
Patch by Taewook Oh

Summary: Patch for Bug 27478. Make BasicAliasAnalysis claim NoAlias if two GEPs index different fields of the same structure.

Reviewers: hfinkel, dberlin

Subscribers: dberlin, mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D20665

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271415 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-01 18:12:01 +00:00
Michael Kuperstein
01d6c3dbf9 [LV] For some IVs, use vector phis instead of widening in the loop body
Previously, whenever we needed a vector IV, we would create it on the fly,
by splatting the scalar IV and adding a step vector. Instead, we can create a
real vector IV. This tends to save a couple of instructions per iteration.

This only changes the behavior for the most basic case - integer primary
IVs with a constant step.

Differential Revision: http://reviews.llvm.org/D20315


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271410 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-01 17:16:46 +00:00
Tim Northover
5b363367fe Move test to X86 directory: I think it depends on X86 TTI.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271019 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-27 16:56:54 +00:00
Tim Northover
e1ebc76f2d Vectorizer: track non-fast FP instructions through phis when finding reductions.
When we traced through a phi node looking for floating-point reductions, we
forgot whether we'd ever seen an instruction without fast-math flags (that
would block vectorization). This propagates it through to the end.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271015 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-27 16:40:27 +00:00
Hal Finkel
d86e7af14a Look for a loop's starting location in the llvm.loop metadata
Getting accurate locations for loops is important, because those locations are
used by the frontend to generate optimization remarks. Currently, optimization
remarks for loops often appear on the wrong line, often the first line of the
loop body instead of the loop itself. This is confusing because that line might
itself be another loop, or might be somewhere else completely if the body was
an inlined function call. This happens because of the way we find the loop's
starting location. First, we look for a preheader, and if we find one, and its
terminator has a debug location, then we use that. Otherwise, we look for a
location on an instruction in the loop header.

The fallback heuristic is not bad, but will almost always find the beginning of
the body, and not the loop statement itself. The preheader location search
often fails because there's often not a preheader, and even when there is a
preheader, depending on how it was formed, it sometimes carries the location of
some preceding code.

I don't see any good theoretical way to fix this problem. On the other hand,
this seems like a straightforward solution: Put the debug location in the
loop's llvm.loop metadata. A companion Clang patch will cause Clang to insert
llvm.loop metadata with appropriate locations when generating debugging
information. With these changes, our loop remarks have much more accurate
locations.

Differential Revision: http://reviews.llvm.org/D19738

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270771 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-25 21:42:37 +00:00
Sanjay Patel
cab076f44c [x86] avoid code explosion from LoopVectorizer for gather loop (PR27826)
By making pointer extraction from a vector more expensive in the cost model,
we avoid the vectorization of a loop that is very likely to be memory-bound:
https://llvm.org/bugs/show_bug.cgi?id=27826

There are still bugs related to this, so we may need a more general solution
to avoid vectorizing obviously memory-bound loops when we don't have HW gather
support.

Differential Revision: http://reviews.llvm.org/D20601



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270729 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-25 17:27:54 +00:00
Wei Mi
7aaac1e6e2 Recommit r255691 since PR26509 has been fixed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270113 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-19 20:38:03 +00:00
Matthew Simpson
12427adbc9 [LAA] Rename forwarding conflict detection option (NFC)
This patch renames the option enabling the store-to-load forwarding conflict
detection optimization. This change was requested in the review of D20241.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@269668 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-16 17:00:56 +00:00
Matthew Simpson
20ebcfc5d1 [LV] Ensure safe VF for loops with interleaved accesses
The selection of the vectorization factor currently doesn't consider
interleaved accesses. The vectorization factor is based on the maximum safe
dependence distance computed by LAA. However, for loops with interleaved
groups, we should instead base the vectorization factor on the maximum safe
dependence distance divided by the maximum interleave factor of all the
interleaved groups. Interleaved accesses not in a group will be scalarized.
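
The adjustment can be written as a one-line calculation. This is a simplified model rather than LLVM's code: the helper name is hypothetical, and distances are counted in elements here to keep the arithmetic visible.

```c
#include <assert.h>

/* Hypothetical helper (not LLVM API): upper bound on the vectorization
 * factor once interleaved groups are taken into account. */
unsigned max_safe_vf(unsigned max_safe_dep_dist, unsigned max_interleave_factor)
{
    assert(max_interleave_factor > 0);
    return max_safe_dep_dist / max_interleave_factor;
}
```

With a safe dependence distance of 16 elements and no interleaving (factor 1), the bound stays at 16. With a maximum interleave factor of 4, it drops to 4.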

Differential Revision: http://reviews.llvm.org/D20241

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@269659 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-16 15:08:20 +00:00
Sanjay Patel
81a5b32238 [InstCombine] canonicalize* LE/GE vector integer comparisons to LT/GT (PR26701, PR26819)
*We don't currently handle the edge case constants (min/max values), so it's not a complete
canonicalization.

To fully solve the motivating bugs, we need to enhance this to recognize a zero vector
too because that's a ConstantAggregateZero which is a ConstantData, not a ConstantVector
or a ConstantDataVector.

Differential Revision: http://reviews.llvm.org/D17859 




git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@269426 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-13 15:10:46 +00:00
James Molloy
3a22301784 Revert "[VectorUtils] Query number of sign bits to allow more truncations"
This was a fairly simple patch but on closer inspection was seriously flawed and caused PR27690.

This reverts commit r268921.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@269051 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-10 12:27:23 +00:00
Elena Demikhovsky
b6e58d8bd0 [LoopVectorize] Handling induction variable with non-constant step.
Allow vectorization when the step is a loop-invariant variable.
This is the loop example that is getting vectorized after the patch:

 int int_inc;
 int bar(int init, int *restrict A, int N) {

  int x = init;
  for (int i=0;i<N;i++){
    A[i] = x;
    x += int_inc;
  }
  return x;
 }

"x" is an induction variable with *loop-invariant* step.
But it is not a primary induction. Primary induction variable with non-constant step is not handled yet.

Differential Revision: http://reviews.llvm.org/D19258



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@269023 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-10 07:33:35 +00:00
Adam Nemet
1b5ab63915 [LV] Hint at the new loop distribution pragma in optimization remark
When we encounter unsafe memory dependencies, loop distribution could
help.

Even though the diagnostic is in LAA, it is currently emitted only in
the vectorizer.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268987 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-09 23:03:44 +00:00
James Molloy
ec0b9c8745 [VectorUtils] Query number of sign bits to allow more truncations
When deciding if a vector calculation can be done in a smaller bitwidth, use sign bit information from ValueTracking to add more information and allow more truncations.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268921 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-09 14:32:30 +00:00
Silviu Baranga
debb790c77 [LV] Identify more induction PHIs by coercing expressions to AddRecExprs
Summary:
Some PHIs can have expressions that are not AddRecExprs due to the presence
of sext/zext instructions. In order to prevent the Loop Vectorizer from
bailing out when encountering these PHIs, we now coerce the SCEV
expressions to AddRecExprs using SCEV predicates (when possible).

We only do this when the alternative would be to not vectorize.

Reviewers: mzolotukhin, anemet

Subscribers: mssimpso, sanjoy, mzolotukhin, llvm-commits

Differential Revision: http://reviews.llvm.org/D17153

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268633 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-05 15:20:39 +00:00
David Majnemer
a89ddf6e7c [LoopVectorize] Add operand bundles to vectorized functions
Also, do not crash when calculating a cost model for loop-invariant
token values.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268003 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-29 07:09:48 +00:00
Michael Zolotukhin
bf6113b8c0 [PR25281] Remove AAResultsWrapper from preserved analyses of loop vectorizer.
We don't preserve AAResults, because, for one, we don't preserve SCEV-AA.
That fixes PR25281.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267980 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-29 03:31:25 +00:00
Hal Finkel
8628001526 [LoopVectorize] Keep hints from original loop on the vector loop
We need to keep loop hints from the original loop on the new vector loop.
Failure to do this meant that, for example:

  void foo(int *b) {
  #pragma clang loop unroll(disable)
    for (int i = 0; i < 16; ++i)
      b[i] = 1;
  }

this loop would be unrolled. Why? Because we'd vectorize it, thus dropping the
hints that unrolling should be disabled, and then we'd unroll it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267970 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-29 01:27:40 +00:00
Matthew Simpson
facf17cd03 [LV] Reallow positive-stride interleaved load groups with gaps
We previously disallowed interleaved load groups that may cause us to
speculatively access memory out-of-bounds (r261331). We did this by ensuring
each load group had an access corresponding to the first and last member.
Instead of bailing out for these interleaved groups, this patch enables us to
peel off the last vector iteration, ensuring that we execute at least one
iteration of the scalar remainder loop. This solution was proposed in the
review of the previous patch.
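
A sketch of a loop where the out-of-bounds concern arises. The function name is hypothetical, and the comments describe the hazard at the level of the source loop rather than the exact code the vectorizer emits.

```c
#include <assert.h>
#include <stddef.h>

/* Reads members 0 and 1 of each stride-3 group; member 2 is a gap at the
 * end of the group. The caller only needs 3*n - 1 valid elements. */
long sum_first_two_of_three(const int *a, size_t n)
{
    assert(a != NULL || n == 0);
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += a[3 * i] + a[3 * i + 1];
    return sum;
}
```

A wide load covering whole groups would also touch a[3*(n-1) + 2], which may lie past the end of the buffer. Leaving the final iterations to the scalar remainder loop avoids that access.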

Differential Revision: http://reviews.llvm.org/D19487

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267751 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-27 18:21:36 +00:00
Elena Demikhovsky
b7f92d0916 Masked Store in Loop Vectorizer - bugfix
Fixed a bug in loop vectorization with conditional store.

Differential Revision: http://reviews.llvm.org/D19532



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267597 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-26 20:18:04 +00:00
Hal Finkel
681428ed7d [LoopVectorize] Don't consider conditional-load dereferenceability for marked parallel loops
I really thought we were doing this already, but we were not. Given this input:

void Test(int *res, int *c, int *d, int *p) {
  for (int i = 0; i < 16; i++)
    res[i] = (p[i] == 0) ? res[i] : res[i] + d[i];
}

we did not vectorize the loop. Even with "assume_safety" the check that we
don't if-convert conditionally-executed loads (to protect against
data-dependent dereferenceability) was not elided.

One subtlety: As implemented, it will still prefer to use a masked-load
intrinsic (given target support) over the speculated load. The choice here
seems architecture specific; the best option depends on how expensive the
masked load is compared to a regular load. Ideally, using the masked load still
reduces unnecessary memory traffic, and so should be preferred. If we'd rather
do it the other way, flipping the order of the checks is easy.

The LangRef is updated to make explicit that llvm.mem.parallel_loop_access also
implies that if-conversion is okay.

Differential Revision: http://reviews.llvm.org/D19512

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267514 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-26 02:00:36 +00:00
Renato Golin
d30dbb6c8c [ARM] AArch32 v8 NEON is still not IEEE-754 compliant
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@266603 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-18 12:06:47 +00:00
Adrian Prantl
422c22e3d3 Convert this sample-based-profiling testcase to use a NoDebug CU.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@266481 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-15 22:05:38 +00:00
Adrian Prantl
4eeaa0da04 [PR27284] Reverse the ownership between DICompileUnit and DISubprogram.
Currently each Function points to a DISubprogram and DISubprogram has a
scope field. For member functions the scope is a DICompositeType. DIScopes
point to the DICompileUnit to facilitate type uniquing.

Distinct DISubprograms (with isDefinition: true) are not part of the type
hierarchy and cannot be uniqued. This change removes the subprograms
list from DICompileUnit and instead adds a pointer to the owning compile
unit to distinct DISubprograms. This would make it easy for ThinLTO to
strip unneeded DISubprograms and their transitively referenced debug info.

Motivation
----------

Materializing DISubprograms is currently the most expensive operation when
doing a ThinLTO build of clang.

We want the DISubprogram to be stored in a separate Bitcode block (or the
same block as the function body) so we can avoid having to expensively
deserialize all DISubprograms together with the global metadata. If a
function has been inlined into another subprogram we need to store a
reference to the block containing the inlined subprogram.

Attached to https://llvm.org/bugs/show_bug.cgi?id=27284 is a python script
that updates LLVM IR testcases to the new format.

http://reviews.llvm.org/D19034
<rdar://problem/25256815>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@266446 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-15 15:57:41 +00:00
Vedant Kumar
891e9f4096 [test] Require 'asserts' for a test which uses -debug-only
Without this line, bots which run check-all on Release compilers will
break.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@266386 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-14 23:32:40 +00:00