373 Commits

Author SHA1 Message Date
Andrea Di Biagio
2fdf2bfd1f [InstCombine][SSE4a] Fix assertion failure in the insertq/insertqi combining logic.
This fixes a similar issue to the one already fixed by r280804
(revieved in D24256). Revision 280804 fixed the problem with unsafe dyn_casts
in the extrq/extrqi combining logic. However, it turns out that even the
insertq/insertqi logic was affected by the same problem.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280807 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-07 12:47:53 +00:00
Andrea Di Biagio
fd11478c26 [InstCombine][SSE4a] Fix assertion failure caused by unsafe dyn_casts on the operands of extrq/extrqi intrinsic calls.
This patch fixes an assertion failure caused by unsafe dynamic casts on the
constant operands of sse4a intrinsic calls to extrq/extrqi

The combine logic that simplifies sse4a extrq/extrqi intrinsic calls currently
checks if the input operands are constants. Internally, that logic relies on
dyn_casts of values returned by calls to method Constant::getAggregateElement.
However, method getAggregateElemet may return nullptr if the constant element
cannot be retrieved. So, all the dyn_casts can potentially fail. This is what
happens for example if a constexpr value is passed in input to an extrq/extrqi
intrinsic call.

This patch fixes the problem by using a dyn_cast_or_null (instead of a simple
dyn_cast) on the result of each call to Constant::getAggregateElement.

Added reproducible test cases to x86-sse4a.ll.

Differential Revision: https://reviews.llvm.org/D24256


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280804 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-07 12:03:03 +00:00
Dorit Nuzman
cc6b354faf [InstCombine] Preserve llvm.mem.parallel_loop_access metadata when replacing
memcpy with ld/st.

When InstCombine replaces a memcpy with loads+stores it does not copy over the
llvm.mem.parallel_loop_access from the memcpy instruction. This patch fixes
that.

Differential Revision: https://reviews.llvm.org/D23499



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280617 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-04 07:49:39 +00:00
Dorit Nuzman
e99c574578 Test commit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280615 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-04 07:06:00 +00:00
Matt Arsenault
6acb49abca AMDGPU: Do basic folding of class intrinsic
This allows more of the OCML builtin library to be
constant folded.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280586 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-03 07:06:58 +00:00
Amaury Sechet
c43ec38718 Make cltz and cttz zero undef when the operand cannot be zero in InstCombine
Summary: Also add popcount(n) == bitsize(n)  -> n == -1 transformation.

Reviewers: majnemer, spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D23134

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279141 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-18 20:43:50 +00:00
Justin Bogner
7d7a23e700 Replace a few more "fall through" comments with LLVM_FALLTHROUGH
Follow up to r278902. I had missed "fall through", with a space.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278970 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-17 20:30:52 +00:00
Eugene Zelenko
c986ab210b Fix some Clang-tidy modernize and Include What You Use warnings.
Differential revision: https://reviews.llvm.org/D23291


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278364 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-11 17:20:18 +00:00
Sanjay Patel
16fc409d35 fix comment; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278342 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-11 15:23:56 +00:00
Sanjay Patel
583b94db9f use auto* with dyn_cast ; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278340 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-11 15:21:21 +00:00
Sanjay Patel
4636b0c422 getParent()->getParent() == getFunction() ; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278339 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-11 15:16:06 +00:00
Sanjay Patel
4b55580c72 [InstCombine] refactor ctlz/cttz folds (NFCI)
Note that this fold really belongs in InstSimplify.
Refactoring here anyway as an intermediate step because
there's a planned addition to this function in D23134.

Differential Revision: https://reviews.llvm.org/D23223



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277883 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-05 22:42:46 +00:00
Justin Bogner
afba697b6c InstCombine: Replace some never-null pointers with references. NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277792 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-05 01:06:44 +00:00
Vitaly Buka
2328afe0b5 Do not remove empty lifetime.start/lifetime.end ranges
Summary:
Asan stack-use-after-scope check should poison alloca even if there is
no access between start and end.

This is possible for code like this:
for (int i = 0; i < 3; i++) {
  int x;
  p = &x;
}

"Loop Invariant Code Motion" will move "p = &x;" out of the loop, making
start/end range empty.

PR27453

Reviewers: eugenis

Differential Revision: https://reviews.llvm.org/D22842

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277072 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-28 22:59:03 +00:00
Vitaly Buka
3ec908dfba Should be committed as one CL.
This reverts commits r277068 r277067 r277066.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277071 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-28 22:59:01 +00:00
Vitaly Buka
bbc7125e70 Do not remove empty lifetime.start/lifetime.end ranges
Summary:
Asan stack-use-after-scope check should poison alloca even if there is
no access between start and end.

This is possible for code like this:
for (int i = 0; i < 3; i++) {
  int x;
  p = &x;
}

"Loop Invariant Code Motion" will move "p = &x;" out of the loop, making
start/end range empty.

PR27453

Reviewers: eugenis

Differential Revision: https://reviews.llvm.org/D22842

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277068 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-28 22:50:48 +00:00
Vitaly Buka
1e88bcadc3 maned
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277067 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-28 22:50:45 +00:00
Vitaly Buka
e0d8738c93 range
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277066 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-28 22:50:43 +00:00
David Majnemer
84f3bec422 [InstCombine] Masked loads with undef masks can fold to normal loads
We were able to fold masked loads with an all-ones mask to a normal
load.  However, we couldn't turn a masked load with a mask with mixed
ones and undefs into a normal load.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275380 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-14 06:58:42 +00:00
David Majnemer
d2dd1e2fd6 Move a transform from InstCombine to InstSimplify.
This transform doesn't require any new instructions, it can safely live
in InstSimplify.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275344 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-13 23:32:53 +00:00
Sean Silva
e82e4ddb87 Remove dead TLI arg of isKnownNonNull and propagate deadness. NFC.
This actually uncovered a surprisingly large chain of ultimately unused
TLI args.
From what I can gather, this argument is a remnant of when
isKnownNonNull would look at the TLI directly.
The current approach seems to be that InferFunctionAttrs runs early in
the pipeline and uses TLI to annotate the TLI-dependent non-null
information as return attributes.

This also removes the dependence of functionattrs on TLI altogether.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274455 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-02 23:47:27 +00:00
Matt Arsenault
d158fa7c9d InstCombine: Don't strip convergent from intrinsic callsites
Specific instances of intrinsic calls may want to be convergent, such
as certain register reads but the intrinsic declaration is not.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273188 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-20 19:04:44 +00:00
Craig Topper
e2cf2e02a7 [IR] Require ArrayRef of 'uint32_t' instead of 'int' for the mask argument for one of the signatures of CreateShuffleVector. This better emphasises that you can't use it for the -1 as undef behavior.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272491 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-12 00:41:19 +00:00
Benjamin Kramer
04a303b821 Avoid copies of std::strings and APInt/APFloats where we only read from it
As suggested by clang-tidy's performance-unnecessary-copy-initialization.
This can easily hit lifetime issues, so I audited every change and ran the
tests under asan, which came back clean.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272126 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-08 10:01:20 +00:00
Simon Pilgrim
c9bc1a4d8d [InstCombine][AVX2] Add support for simplifying AVX2 per-element shifts to native shifts
Unlike native shifts, the AVX2 per-element shift instructions VPSRAV/VPSRLV/VPSLLV handle out of range shift values (logical shifts set the result to zero, arithmetic shifts splat the sign bit).

If the shift amount is constant we can sometimes convert these instructions to native shifts:

1 - if all shift amounts are in range then the conversion is trivial.
2 - out of range arithmetic shifts can be clamped to the (bitwidth - 1) (a legal shift amount) before conversion.
3 - logical shifts just return zero if all elements have out of range shift amounts.

In addition, UNDEF shift amounts are handled - either as an UNDEF shift amount in a native shift or as an UNDEF in the logical 'all out of range' zero constant special case for logical shifts.

Differential Revision: http://reviews.llvm.org/D19675

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271996 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-07 10:27:15 +00:00
Simon Pilgrim
4d1d561550 [InstCombine][SSE] Add MOVMSK constant folding (PR27982)
This patch adds support for folding undef/zero/constant inputs to MOVMSK instructions.

The SSE/AVX versions can be fully folded, but the MMX version can only handle undef inputs.

Differential Revision: http://reviews.llvm.org/D20998

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271990 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-07 08:18:35 +00:00
Craig Topper
2463f3bdaa [X86] Remove SSE/AVX unaligned store intrinsics as clang no longer uses them. Auto upgrade to native unaligned store instructions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271236 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-30 23:15:56 +00:00
Simon Pilgrim
687467768e [X86][SSE] (Reapplied) Replace (V)PMOVSX and (V)PMOVZX integer extension intrinsics with generic IR (llvm)
This patch removes the llvm intrinsics VPMOVSX and (V)PMOVZX sign/zero extension intrinsics and auto-upgrades to SEXT/ZEXT calls instead. We already did this for SSE41 PMOVSX sometime ago so much of that implementation can be reused.

Reapplied now that the the companion patch (D20684) removes/auto-upgrade the clang intrinsics has been committed.

Differential Revision: http://reviews.llvm.org/D20686

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271131 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-28 18:03:41 +00:00
Simon Pilgrim
3808026456 Revert: r270973 - [X86][SSE] Replace (V)PMOVSX and (V)PMOVZX integer extension intrinsics with generic IR (llvm)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270976 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-27 09:02:25 +00:00
Simon Pilgrim
337a02bbe0 [X86][SSE] Replace (V)PMOVSX and (V)PMOVZX integer extension intrinsics with generic IR (llvm)
This patch removes the llvm intrinsics VPMOVSX and (V)PMOVZX sign/zero extension intrinsics and auto-upgrades to SEXT/ZEXT calls instead. We already did this for SSE41 PMOVSX sometime ago so much of that implementation can be reused.

A companion patch (D20684) removes/auto-upgrade the clang intrinsics.

Differential Revision: http://reviews.llvm.org/D20686

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270973 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-27 08:49:15 +00:00
Craig Topper
4aac6f32a0 [X86] Add the AVX storeu intrinsics to InstCombine and LoopStrengthReduce in the same places that the SSE/SSE2 storeu intrinsics appear.
I don't really know how to test this. Just seemed like we should be consistent.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270819 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-26 04:28:45 +00:00
Arnaud A. de Grandmaison
5e24513c1f [InstCombine] Remove trivially empty va_start/va_end and va_copy/va_end ranges.
When a va_start or va_copy is immediately followed by a va_end (ignoring
debug information or other start/end in between), then it is safe to
remove the pair. As this code shares some commonalities with the lifetime
markers, this has been factored to helper functions.

This InstCombine pattern kicks-in 3 times when running the LLVM test
suite.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@269033 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-10 09:24:49 +00:00
Simon Pilgrim
ef96d1f545 [InstCombine][SSE] Added support to VPERMD/VPERMPS to shuffle combine to accept UNDEF elements.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268206 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-01 20:43:02 +00:00
Simon Pilgrim
c244765b2c [InstCombine][SSE] Added support to VPERMILVAR to shuffle combine to accept UNDEF elements.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268204 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-01 20:22:42 +00:00
Simon Pilgrim
a1c0cfe5e4 [InstCombine][SSE] Added support to PSHUFB to shuffle combine to accept UNDEF elements.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268202 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-01 19:26:21 +00:00
Simon Pilgrim
95730ae9bd [InstCombine][AVX2] Combine VPERMD/VPERMPS intrinsics with constant masks to shufflevector.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268199 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-01 16:41:22 +00:00
Simon Pilgrim
99e30751ef [InstCombine][AVX] VPERMILVAR to shuffle combine to use general aggregate elements. NFCI.
Make use of Constant::getAggregateElement instead of checking constant types - first step towards adding support for UNDEF mask elements.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268158 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-30 07:23:30 +00:00
Simon Pilgrim
63a5b4c179 [InstCombine][SSE] PSHUFB to shuffle combine to use general aggregate elements. NFCI.
Make use of Constant::getAggregateElement instead of checking constant types - first step towards adding support for UNDEF mask elements.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268115 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-29 21:34:54 +00:00
David Majnemer
1929f2aa4d [InstCombine] Propagate operand bundles
We neglected to transfer operand bundles for some transforms.  These
were found via inspection, I'll try to come up with some test cases.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268010 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-29 08:07:20 +00:00
Simon Pilgrim
0a660a2b80 [InstCombine][SSE] Demanded vector elements for scalar intrinsics (Part 1 of 2)
This patch improves support for determining the demanded vector elements through SSE scalar intrinsics:

1 - recognise that we only need the lowest element of the second input for binary scalar operations (and all the elements of the first input)

2 - recognise that the roundss/roundsd intrinsics use the lowest element of the second input and the remaining elements from the first input

Differential Revision: http://reviews.llvm.org/D17490

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267356 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-24 18:12:42 +00:00
Simon Pilgrim
db74d7c300 [InstCombine] Avoid updating argument demanded elements in separate passes.
As discussed on D17490, we should attempt to update an intrinsic's arguments demanded elements in one pass if we can.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267355 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-24 17:57:27 +00:00
Simon Pilgrim
c4d2c7c24e [X86][InstCombine] Tidyup VPERMILVAR -> shufflevector conversion to helper function. NFCI.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267352 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-24 17:23:46 +00:00
Simon Pilgrim
a3ce49106c [X86][InstCombine] Tidyup PSHUFB -> shufflevector conversion to helper function. NFCI.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267351 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-24 17:00:34 +00:00
Sanjay Patel
9078f559b2 [x86, InstCombine] fix masked load pass-through operand to be a zero vector
This bug was introduced with:
http://reviews.llvm.org/rL262269

AVX masked loads are specified to set vector lanes to zero when the high bit of the mask
element for that lane is zero:
"If the mask is 0, the corresponding data element is set to zero in the load form of these
instructions, and unmodified in the store form." --Intel manual

Differential Revision: http://reviews.llvm.org/D19017



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@266148 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-12 23:16:23 +00:00
George Burgess IV
274105b8b5 Add the allocsize attribute to LLVM.
`allocsize` is a function attribute that allows users to request that
LLVM treat arbitrary functions as allocation functions.

This patch makes LLVM accept the `allocsize` attribute, and makes
`@llvm.objectsize` recognize said attribute.

The review for this was split into two patches for ease of reviewing:
D18974 and D14933. As promised on the revisions, I'm landing both
patches as a single commit.

Differential Revision: http://reviews.llvm.org/D14933


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@266032 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-12 01:05:35 +00:00
David Majnemer
479e6941d7 [InstCombine] Add a peephole for redundant assumes
Two or more identical assumes are occasionally next to each other in a
basic block.
While our generic machinery will turn a redundant assume into a no-op,
it is not super cheap.
We can perform a simpler check to achieve the same result for this case.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@265801 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-08 16:37:12 +00:00
Matt Arsenault
e8cd894c28 AMDGPU: Add frexp_exp intrinsic
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264944 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-30 22:28:52 +00:00
Matt Arsenault
f965621c5f AMDGPU: Constant folding for frexp_mant
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264943 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-30 22:28:26 +00:00
Justin Lebar
dd68c9c6c5 [attrs] Handle convergent CallSites.
Summary:
Previously we had a notion of convergent functions but not of convergent
calls.  This is insufficient to correctly analyze calls where the target
is unknown, e.g. indirect calls.

Now a call is convergent if it targets a known-convergent function, or
if it's explicitly marked as convergent.  As usual, we can remove
convergent where we can prove that no convergent operations are
performed in the call.

Originally landed as r261544, then reverted in r261544 for (incidental)
build breakage.  Re-landed here with no changes.

Reviewers: chandlerc, jingyue

Subscribers: llvm-commits, tra, jhen, hfinkel

Differential Revision: http://reviews.llvm.org/D17739

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263481 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-14 20:18:54 +00:00
Sanjay Patel
b1ec9c2aaf [x86, InstCombine] delete x86 SSE2 masked store with zero mask
This follows up on the related AVX instruction transforms, but this
one is too strange to do anything more with. Intel's behavioral
description of this instruction in its Software Developer's Manual
is tragi-comic.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263340 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-12 15:16:59 +00:00