448 Commits

Author SHA1 Message Date
Craig Topper
3900e637a2 [AVX-512][InstCombine] Teach InstCombine to turn packed add/sub/mul/div with rounding intrinsics into normal IR operations if the rounding mode is CUR_DIRECTION.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290559 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-27 00:23:16 +00:00
Simon Pilgrim
915b45f09f [InstCombine][X86] Add DemandedElts support for PMULDQ/PMULUDQ instructions
PMULDQ/PMULUDQ vXi64 instructions only use the even numbered v2Xi32 input elements which SimplifyDemandedVectorElts should try and use.

Differential Revision: https://reviews.llvm.org/D28119

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290554 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-26 23:28:17 +00:00
Craig Topper
a7c4539a02 [AVX-512][InstCombine] Teach InstCombine to turn scalar add/sub/mul/div with rounding intrinsics into normal IR operations if the rounding mode is CUR_DIRECTION.
Summary:
I only do this for unmasked cases for now because isel is failing to fold the mask. I'll try to fix that soon.

I'll do the same thing for packed add/sub/mul/div in a future patch.

Reviewers: delena, RKSimon, zvi, craig.topper

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D27879

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290535 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-26 06:33:19 +00:00
Craig Topper
e6e82fb3ab [AVX-512][InstCombine] Teach InstCombine to converted masked vpermv intrinsics into shufflevector instructions
Summary:
This patch adds support for converting the masked vpermv intrinsics into shufflevector instructions if the indices are constants.

We also need to wrap a select instruction around the shuffle to take care of the masking part. InstCombine will take care of optimizing the select if the mask is constant so I didn't bother checking for that.

Reviewers: zvi, delena, spatel, RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D27825

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290530 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-25 23:58:57 +00:00
George Burgess IV
1ced44b92a [Analysis] Centralize objectsize lowering logic.
We're currently doing nearly the same thing for @llvm.objectsize in
three different places: two of them are missing checks for overflow,
and one of them could subtly break if InstCombine gets much smarter
about removing alloc sites. Seems like a good idea to not do that.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290214 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-20 23:46:36 +00:00
Daniel Jasper
8de3a54f07 Revert @llvm.assume with operator bundles (r289755-r289757)
This creates non-linear behavior in the inliner (see more details in
r289755's commit thread).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@290086 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-19 08:22:17 +00:00
Craig Topper
d941d3f22f [AVX-512][InstCombine] Add masked scalar FMA intrinsics to SimplifyDemandedVectorElts.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289759 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-15 03:49:45 +00:00
Hal Finkel
bffeba468d Remove the AssumptionCache
After r289755, the AssumptionCache is no longer needed. Variables affected by
assumptions are now found by using the new operand-bundle-based scheme. This
new scheme is more computationally efficient, and also we need much less
code...

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289756 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-15 03:02:15 +00:00
Hal Finkel
fe647d2183 Make processing @llvm.assume more efficient by using operand bundles
There was an efficiency problem with how we processed @llvm.assume in
ValueTracking (and other places). The AssumptionCache tracked all of the
assumptions in a given function. In order to find assumptions relevant to
computing known bits, etc. we searched every assumption in the function. For
ValueTracking, that means that we did O(#assumes * #values) work in InstCombine
and other passes (with a constant factor that can be quite large because we'd
repeat this search at every level of recursion of the analysis).

Several of us discussed this situation at the last developers' meeting, and
this implements the discussed solution: Make the values that an assume might
affect operands of the assume itself. To avoid exposing this detail to
frontends and passes that need not worry about it, I've used the new
operand-bundle feature to add these extra call "operands" in a way that does
not affect the intrinsic's signature. I think this solution is relatively
clean. InstCombine adds these extra operands based on what ValueTracking, LVI,
etc. will need and then those passes need only search the users of the values
under consideration. This should fix the computational-complexity problem.

At this point, no passes depend on the AssumptionCache, and so I'll remove
that as a follow-up change.

Differential Revision: https://reviews.llvm.org/D27259

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289755 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-15 02:53:42 +00:00
Craig Topper
b1003083d7 [X86][InstCombine] Handle demanded elements for operand of AVX-512 scalar floating point to integer conversion intrinsics.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289639 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-14 07:46:12 +00:00
Craig Topper
2eef9bcab6 [X86][InstCombine] Teach SimplifyDemandedVectorElts to handle masked scalar add/sub/mul/div/max/min intrinsics better.
Now we can remove these intrinsics if element 0 isn't used. Also fix undef element tracking.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289636 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-14 06:06:58 +00:00
Craig Topper
98bf16eccb [X86][InstCombine] Handle scalar fmadd intrinsics correctly in SimplifyDemandedVectorElts.
Now we pass a modified version of DemandedElts to each operand and we calculate undef elts correctly.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289632 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-14 05:43:05 +00:00
Craig Topper
23156f1924 [X86][InstCombine] Teach SimplifyDemandedVectorElts to handle scalar round intrinsics more correctly.
Now we only pass bit 0 of the DemandedElts to optimize operand 1 as we recurse since the upper bits are unused. Similarly we clear bit 0 for optimizing operand 0.

Also calculate UndefElts correctly.

Simplify InstCombineCalls for these instrinics to just call SimplifyDemandedVectorElts for the call instrution to reuse this support.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289629 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-14 03:17:30 +00:00
Craig Topper
52ed6069ee [X86][InstCombine] Teach SimplifyDemandedVectorElts to handle scalar min/max/cmp intrinsics more correctly.
Now we only pass bit 0 of the DemandedElts to optimize operand 1 as we recurse since the upper bits are unused.

Also calculate UndefElts correctly.

Simplify InstCombineCalls for these instrinics to just call SimplifyDemandedVectorElts for the call instrution to reuse this support.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289628 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-14 03:17:27 +00:00
Craig Topper
f19ce9bb49 [X86][InstCombine] Fix SimplifyDemandedVectorElts to handle frcz scalar intrinsics correctly.
Only the lower bits of the input element are used. And only the lower element can be undef since the upper bits are zeroed.

Have InstCombineCalls call SimplifyDemandedVectorElts for these intrinsics to reuse this support.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289523 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-13 07:45:45 +00:00
Craig Topper
d07981b634 [X86][InstCombine] Teach InstCombineCalls to simplify demanded elements for scalar FMA intrinsics.
These intrinsics don't read the upper bits of their second and third inputs so we can try to simplify them.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289372 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-11 07:42:06 +00:00
Craig Topper
0a2fc781e3 [AVX-512][InstCombine] Teach InstCombineCalls how to simplify demanded for scalar cmp intrinsics with masking and rounding.
These intrinsics don't read the upper elements of their first and second input. These are slightly different the the SSE version which does use the upper bits of its first element as passthru bits since the result goes to an XMM register. For AVX-512 the result goes to a mask register instead.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289371 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-11 07:42:04 +00:00
Craig Topper
d469865e61 [AVX-512][InstCombine] Teach InstCombineCalls how to simplify demanded elements for scalar add,div,mul,sub,max,min intrinsics with masking and rounding.
These intrinsics don't read the upper bits of their second input. And the third input is the passthru for masking and that only uses the lower element as well.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289370 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-11 07:42:01 +00:00
Craig Topper
f3e3617e77 [AVX-512][InstCombine] Add 512-bit vpermilvar intrinsics to InstCombineCalls to match 128 and 256-bit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289354 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-11 01:59:36 +00:00
Craig Topper
7c8796bdf5 [X86][InstCombine] Teach InstCombineCalls to turn pshufb intrinsic into a shufflevector if the indices are constant.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289348 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-11 00:23:50 +00:00
David Majnemer
6634682067 Replace some callers of setTailCall with setTailCallKind
We were a little sloppy with adding tailcall markers.  Be more
consistent by using setTailCallKind instead of setTailCall.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287955 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-25 22:35:09 +00:00
Craig Topper
35b775d13a [InstCombine][AVX-512] Teach InstCombineCalls how to handle the intrinsics for variable shift with 16-bit elements.
This is a straightforward extension of the existing support for 32/64-bit element types. Just needed to add the additional instrinsics to the switches.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287316 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-18 06:04:33 +00:00
Craig Topper
614df99de4 [X86] Remove the scalar intrinsics for fadd/fsub/fdiv/fmul
Summary: These intrinsics have been unused for clang for a while. This patch removes them. We auto upgrade them to extractelements, a scalar operation and then an insertelement. This matches the sequence used by clangs intrinsic file.

Reviewers: zvi, delena, RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D26660

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287083 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-16 05:24:10 +00:00
Craig Topper
e59ef16ac1 [InstCombine][AVX-512] Teach InstCombineCalls to handle the new unmasked AVX-512 variable shift intrinsics.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@286755 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-13 07:26:19 +00:00
Craig Topper
eb14a74d58 [InstCombine][AVX-512] Expand vector shift handling to work on the AVX-512 shift by immediate and shift by single value.
This does not include support for the AVX-512 variable shifts. That will be coming in a future patch.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@286739 91177308-0d34-0410-b5e6-96231b3b80d8
2016-11-13 01:51:55 +00:00
Andrea Di Biagio
2fdf2bfd1f [InstCombine][SSE4a] Fix assertion failure in the insertq/insertqi combining logic.
This fixes a similar issue to the one already fixed by r280804
(revieved in D24256). Revision 280804 fixed the problem with unsafe dyn_casts
in the extrq/extrqi combining logic. However, it turns out that even the
insertq/insertqi logic was affected by the same problem.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280807 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-07 12:47:53 +00:00
Andrea Di Biagio
fd11478c26 [InstCombine][SSE4a] Fix assertion failure caused by unsafe dyn_casts on the operands of extrq/extrqi intrinsic calls.
This patch fixes an assertion failure caused by unsafe dynamic casts on the
constant operands of sse4a intrinsic calls to extrq/extrqi

The combine logic that simplifies sse4a extrq/extrqi intrinsic calls currently
checks if the input operands are constants. Internally, that logic relies on
dyn_casts of values returned by calls to method Constant::getAggregateElement.
However, method getAggregateElemet may return nullptr if the constant element
cannot be retrieved. So, all the dyn_casts can potentially fail. This is what
happens for example if a constexpr value is passed in input to an extrq/extrqi
intrinsic call.

This patch fixes the problem by using a dyn_cast_or_null (instead of a simple
dyn_cast) on the result of each call to Constant::getAggregateElement.

Added reproducible test cases to x86-sse4a.ll.

Differential Revision: https://reviews.llvm.org/D24256


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280804 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-07 12:03:03 +00:00
Dorit Nuzman
cc6b354faf [InstCombine] Preserve llvm.mem.parallel_loop_access metadata when replacing
memcpy with ld/st.

When InstCombine replaces a memcpy with loads+stores it does not copy over the
llvm.mem.parallel_loop_access from the memcpy instruction. This patch fixes
that.

Differential Revision: https://reviews.llvm.org/D23499



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280617 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-04 07:49:39 +00:00
Dorit Nuzman
e99c574578 Test commit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280615 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-04 07:06:00 +00:00
Matt Arsenault
6acb49abca AMDGPU: Do basic folding of class intrinsic
This allows more of the OCML builtin library to be
constant folded.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280586 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-03 07:06:58 +00:00
Amaury Sechet
c43ec38718 Make cltz and cttz zero undef when the operand cannot be zero in InstCombine
Summary: Also add popcount(n) == bitsize(n)  -> n == -1 transformation.

Reviewers: majnemer, spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D23134

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279141 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-18 20:43:50 +00:00
Justin Bogner
7d7a23e700 Replace a few more "fall through" comments with LLVM_FALLTHROUGH
Follow up to r278902. I had missed "fall through", with a space.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278970 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-17 20:30:52 +00:00
Eugene Zelenko
c986ab210b Fix some Clang-tidy modernize and Include What You Use warnings.
Differential revision: https://reviews.llvm.org/D23291


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278364 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-11 17:20:18 +00:00
Sanjay Patel
16fc409d35 fix comment; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278342 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-11 15:23:56 +00:00
Sanjay Patel
583b94db9f use auto* with dyn_cast ; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278340 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-11 15:21:21 +00:00
Sanjay Patel
4636b0c422 getParent()->getParent() == getFunction() ; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278339 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-11 15:16:06 +00:00
Sanjay Patel
4b55580c72 [InstCombine] refactor ctlz/cttz folds (NFCI)
Note that this fold really belongs in InstSimplify.
Refactoring here anyway as an intermediate step because
there's a planned addition to this function in D23134.

Differential Revision: https://reviews.llvm.org/D23223



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277883 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-05 22:42:46 +00:00
Justin Bogner
afba697b6c InstCombine: Replace some never-null pointers with references. NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277792 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-05 01:06:44 +00:00
Vitaly Buka
2328afe0b5 Do not remove empty lifetime.start/lifetime.end ranges
Summary:
Asan stack-use-after-scope check should poison alloca even if there is
no access between start and end.

This is possible for code like this:
for (int i = 0; i < 3; i++) {
  int x;
  p = &x;
}

"Loop Invariant Code Motion" will move "p = &x;" out of the loop, making
start/end range empty.

PR27453

Reviewers: eugenis

Differential Revision: https://reviews.llvm.org/D22842

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277072 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-28 22:59:03 +00:00
Vitaly Buka
3ec908dfba Should be committed as one CL.
This reverts commits r277068 r277067 r277066.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277071 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-28 22:59:01 +00:00
Vitaly Buka
bbc7125e70 Do not remove empty lifetime.start/lifetime.end ranges
Summary:
Asan stack-use-after-scope check should poison alloca even if there is
no access between start and end.

This is possible for code like this:
for (int i = 0; i < 3; i++) {
  int x;
  p = &x;
}

"Loop Invariant Code Motion" will move "p = &x;" out of the loop, making
start/end range empty.

PR27453

Reviewers: eugenis

Differential Revision: https://reviews.llvm.org/D22842

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277068 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-28 22:50:48 +00:00
Vitaly Buka
1e88bcadc3 maned
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277067 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-28 22:50:45 +00:00
Vitaly Buka
e0d8738c93 range
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277066 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-28 22:50:43 +00:00
David Majnemer
84f3bec422 [InstCombine] Masked loads with undef masks can fold to normal loads
We were able to fold masked loads with an all-ones mask to a normal
load.  However, we couldn't turn a masked load with a mask with mixed
ones and undefs into a normal load.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275380 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-14 06:58:42 +00:00
David Majnemer
d2dd1e2fd6 Move a transform from InstCombine to InstSimplify.
This transform doesn't require any new instructions, it can safely live
in InstSimplify.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275344 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-13 23:32:53 +00:00
Sean Silva
e82e4ddb87 Remove dead TLI arg of isKnownNonNull and propagate deadness. NFC.
This actually uncovered a surprisingly large chain of ultimately unused
TLI args.
From what I can gather, this argument is a remnant of when
isKnownNonNull would look at the TLI directly.
The current approach seems to be that InferFunctionAttrs runs early in
the pipeline and uses TLI to annotate the TLI-dependent non-null
information as return attributes.

This also removes the dependence of functionattrs on TLI altogether.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274455 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-02 23:47:27 +00:00
Matt Arsenault
d158fa7c9d InstCombine: Don't strip convergent from intrinsic callsites
Specific instances of intrinsic calls may want to be convergent, such
as certain register reads but the intrinsic declaration is not.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273188 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-20 19:04:44 +00:00
Craig Topper
e2cf2e02a7 [IR] Require ArrayRef of 'uint32_t' instead of 'int' for the mask argument for one of the signatures of CreateShuffleVector. This better emphasises that you can't use it for the -1 as undef behavior.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272491 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-12 00:41:19 +00:00
Benjamin Kramer
04a303b821 Avoid copies of std::strings and APInt/APFloats where we only read from it
As suggested by clang-tidy's performance-unnecessary-copy-initialization.
This can easily hit lifetime issues, so I audited every change and ran the
tests under asan, which came back clean.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272126 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-08 10:01:20 +00:00
Simon Pilgrim
c9bc1a4d8d [InstCombine][AVX2] Add support for simplifying AVX2 per-element shifts to native shifts
Unlike native shifts, the AVX2 per-element shift instructions VPSRAV/VPSRLV/VPSLLV handle out of range shift values (logical shifts set the result to zero, arithmetic shifts splat the sign bit).

If the shift amount is constant we can sometimes convert these instructions to native shifts:

1 - if all shift amounts are in range then the conversion is trivial.
2 - out of range arithmetic shifts can be clamped to the (bitwidth - 1) (a legal shift amount) before conversion.
3 - logical shifts just return zero if all elements have out of range shift amounts.

In addition, UNDEF shift amounts are handled - either as an UNDEF shift amount in a native shift or as an UNDEF in the logical 'all out of range' zero constant special case for logical shifts.

Differential Revision: http://reviews.llvm.org/D19675

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271996 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-07 10:27:15 +00:00