Commit Graph

20086 Commits

Author SHA1 Message Date
Wei Ding
1cfed01e02 AMDGPU : Update TrapCode based on Trap Handler ABI.
Differential Revision: http://reviews.llvm.org/D30232

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295904 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-22 23:22:19 +00:00
Matt Arsenault
c2d34b5027 AMDGPU: Add replacement bfe intrinsics
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295899 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-22 23:04:58 +00:00
Dylan McKay
ec26388916 [AVR] Disable integrated assembler for a few tests
Fixes the build.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295895 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-22 22:41:13 +00:00
Krzysztof Parzyszek
e9d7ca1b92 [Hexagon] Implement @llvm.readcyclecounter()
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295892 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-22 22:28:47 +00:00
Matt Arsenault
206dfa3c0d AMDGPU: Don't add emergency stack slot if all spills are SGPR->VGPR
This should avoid reporting any stack needs to be allocated in the
case where no stack is truly used. An unused stack slot is still
left around in other cases where there are real stack objects
but no spilling occurs.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295891 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-22 22:23:32 +00:00
Krzysztof Parzyszek
7af390a681 [Hexagon] Add intrinsics for masked vector stores
Patch by Harsha Jagasia.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295879 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-22 21:23:09 +00:00
Matt Arsenault
c1d17d5f71 AMDGPU: Don't look at chain users when adjusting writemask
Fixes not adjusting using new intrinsics with chains.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295878 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-22 21:16:41 +00:00
Matt Arsenault
138d429065 AMDGPU: Always allocate emergency stack slot at offset 0
This allows us to ensure that 0 is never a valid pointer
to a user object, and ensures that the offset is always legal
without needing a register to access it. This comes at the cost
of usable offsets and wasted stack space.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295877 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-22 21:05:25 +00:00
Matt Arsenault
1b020b3be5 AMDGPU: Change exp with compr bit printing
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295873 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-22 20:37:12 +00:00
Wei Ding
9b1c9472f5 Revert "AMDGPU : Update TrapCode based on Trap Handler ABI."
This reverts commit r295867.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295871 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-22 20:29:22 +00:00
Wei Ding
d70493f450 AMDGPU : Update TrapCode based on Trap Handler ABI.
Differential Revision: http://reviews.llvm.org/D30232

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295867 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-22 20:05:06 +00:00
Matthias Braun
a0ae75470a Bring back 2>&1 redirection for this test
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295864 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-22 19:16:33 +00:00
Geoff Berry
6b494b30f2 [AArch64] Extend AArch64RedundantCopyElimination to do simple copy propagation.
Summary:
Extend AArch64RedundantCopyElimination to catch cases where the register
that is known to be zero is COPY'd in the predecessor block.  Before
this change, this pass would catch cases like:

      CBZW %W0, <BB#1>
  BB#1:
      %W0 = COPY %WZR // removed

After this change, cases like the one below are also caught:

      %W0 = COPY %W1
      CBZW %W1, <BB#1>
  BB#1:
      %W0 = COPY %WZR // removed

This change results in a 4% increase in static copies removed by this
pass when compiling the llvm test-suite.  It also fixes regressions
caused by doing post-RA copy propagation (a separate change to be put up
for review shortly).

Reviewers: junbuml, mcrosier, t.p.northover, qcolombet, MatzeB

Subscribers: aemerson, rengolin, llvm-commits

Differential Revision: https://reviews.llvm.org/D30113

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295863 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-22 19:10:45 +00:00
Matthias Braun
3f55d742b2 MIRTests: Remove unnecessary 2>&1 redirection
llc mir output goes to stdout nowadays, so the 2>&1 is not necessary
anymore for most tests.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295859 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-22 18:47:41 +00:00
Dan Gohman
61ce026358 [WebAssembly] Configure codegen to legalize f16 values.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295850 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-22 16:28:00 +00:00
Bill Seurer
6ef315bddb [DAGCombiner] revert r295336
r295336 causes a bootstrapped clang to fail for many compilations on
powerpc BE.  See 
http://lab.llvm.org:8011/builders/clang-ppc64be-linux-multistage/builds/2315
for example.

Reverting as per the developer's request.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295849 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-22 16:27:33 +00:00
Igor Breger
8ae3570fa8 [X86][GlobalISel] Initial implementation , select G_ADD gpr, gpr
Summary: Initial implementation for X86InstructionSelector. Handle selection COPY and G_ADD/G_SUB gpr, gpr .

Reviewers: qcolombet, rovka, zvi, ab

Reviewed By: rovka

Subscribers: mgorny, dberris, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D29816

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295824 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-22 12:25:09 +00:00
Simon Pilgrim
0766c65e73 [X86] Regenerate CSE test with codegen instead of just the instruction count
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295819 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-22 10:12:46 +00:00
Roger Ferrer Ibanez
296acceca1 [ARM] Fix constant islands pass.
The pass tries to fix a spill of LR that turns out to be unnecessary.
So it removes the tPOP but forgets to remove tPUSH.
This causes the stack be misaligned upon returning the function.

Thus, remove the tPUSH as well in this case.

Differential Revision: https://reviews.llvm.org/D30207



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295816 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-22 09:06:21 +00:00
Javed Absar
6badf185b6 [ARM] Classification Improvements to ARM Sched-Models. NFCI.
This patch adds missing sched classes for Thumb2 instructions.
This has been missing so far, and as a consequence, machine
scheduler models for individual sub-targets have tended to
be larger than they needed to be. These patches should help
write schedulers better and faster in the future
for ARM sub-targets.

Reviewer: Diana Picus
Differential Revision: https://reviews.llvm.org/D29953



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295811 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-22 07:22:57 +00:00
Craig Topper
544076881b [AVX-512] Allow legacy scalar min/max intrinsics to select EVEX instructions when available
This patch introduces new X86ISD::FMAXS and X86ISD::FMINS opcodes. The legacy intrinsics now lower to this node. As do the AVX-512 masked intrinsics when the rounding mode is CUR_DIRECTION.

I've merged a copy of the tablegen multiclass avx512_fp_scalar into avx512_fp_scalar_sae. avx512_fp_scalar still needs to support CUR_DIRECTION appearing as a rounding mode for X86ISD::FADD_ROUND and others.

Differential revision: https://reviews.llvm.org/D30186

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295810 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-22 06:54:18 +00:00
Matt Arsenault
7d65faa5cc AMDGPU: Add cvt.pkrtz intrinsic
Convert llvm.SI.packf16 test uses

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295797 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-22 00:27:34 +00:00
Matt Arsenault
7d9379397a AMDGPU: Remove some uses of llvm.SI.export in tests
Merge some of the old, smaller tests into more complete versions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295792 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-22 00:02:21 +00:00
Matt Arsenault
6de2a82753 AMDGPU: Remove llvm.AMDGPU.clamp intrinsic
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295789 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-21 23:46:04 +00:00
Matt Arsenault
aac82e218f AMDGPU: Redefine clamp node as clamp 0.0-1.0
Change implementation to use max instead of add.
min/max/med3 do not flush denormals regardless of the mode,
so it is OK to use it whether or not they are enabled.

Also allow using clamp with f16, and use knowledge
of dx10_clamp.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295788 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-21 23:35:48 +00:00
Artem Belevich
b1a24afa7a [NVPTX] Unify vectorization of load/stores of aggregate arguments and return values.
Original code only used vector loads/stores for explicit vector arguments.
It could also do more loads/stores than necessary (e.g v5f32 would
touch 8 f32 values). Aggregate types were loaded one element at a time,
even the vectors contained within.

This change attempts to generalize (and simplify) parameter space
loads/stores so that vector loads/stores can be used more broadly.
Functionality of the patch has been verified by compiling thrust
test suite and manually checking the differences between PTX
generated by llvm with and without the patch.

General algorithm:
* ComputePTXValueVTs() flattens input/output argument into a flat list
  of scalars to load/store and returns their types and offsets.
* VectorizePTXValueVTs() uses that data to create vectorization plan
  which returns an array of flags marking boundaries of vectorized
  load/stores. Scalars are represented as 1-element vectors.
* Code that generates loads/stores implements a simple state machine
  that constructs a vector according to the plan.

Differential Revision: https://reviews.llvm.org/D30011

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295784 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-21 22:56:05 +00:00
Evandro Menezes
bcd633da12 [AArch64] Add test case for fusion of literal generation
Add test case from https://reviews.llvm.org/D28698 that was somehow lost in
transit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295775 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-21 22:16:09 +00:00
Evandro Menezes
d9de925b22 [AArch64] Add test case for fusion of AES crypto operations
Add test case from https://reviews.llvm.org/D28491 that was somehow lost in
transit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295774 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-21 22:16:06 +00:00
Evgeniy Stepanov
2c0dd61dd0 Fix PR31896.
Address of an alias of a global with offset is incorrectly lowered as an address of the global (i.e. ignoring offset).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295762 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-21 20:17:34 +00:00
Matt Arsenault
8a7ccd7129 AMDGPU: Remove dead declarations in tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295757 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-21 19:31:33 +00:00
Matt Arsenault
d47c3f5b20 AMDGPU: Remove dead declarations from MIR tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295755 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-21 19:27:36 +00:00
Matt Arsenault
f2616d2fd3 AMDGPU: Remove llvm.AMDGPU.flbit intrinsic
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295754 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-21 19:27:33 +00:00
Matt Arsenault
bcb6a77aca AMDGPU: Don't use stack space for SGPR->VGPR spills
Before frame offsets are calculated, try to eliminate the
frame indexes used by SGPR spills. Then we can delete them
after.

I think for now we can be sure that no other instruction
will be re-using the same frame indexes. It should be easy
to notice if this assumption ever breaks since everything
asserts if it tries to use a dead frame index later.

The unused emergency stack slot seems to still be left behind,
so an additional 4 bytes is still wasted.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295753 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-21 19:12:08 +00:00
Geoff Berry
fc170d8f5d [CodeGenPrepare] Sink and duplicate more 'and' instructions.
Summary:
Rework the code that was sinking/duplicating (icmp and, 0) sequences
into blocks where they were being used by conditional branches to form
more tbz instructions on AArch64.  The new code is more general in that
it just looks for 'and's that have all icmp 0's as users, with a target
hook used to select which subset of 'and' instructions to consider.
This change also enables 'and' sinking for X86, where it is more widely
beneficial than on AArch64.

The 'and' sinking/duplicating code is moved into the optimizeInst phase
of CodeGenPrepare, where it can take advantage of the fact the
OptimizeCmpExpression has already sunk/duplicated any icmps into the
blocks where they are used.  One minor complication from this change is
that optimizeLoadExt needed to be updated to always mark 'and's it has
determined should be in the same block as their feeding load in the
InsertedInsts set to avoid an infinite loop of hoisting and sinking the
same 'and'.

This change fixes a regression on X86 in the tsan runtime caused by
moving GVNHoist to a later place in the optimization pipeline (see
PR31382).

Reviewers: t.p.northover, qcolombet, MatzeB

Subscribers: aemerson, mcrosier, sebpop, llvm-commits

Differential Revision: https://reviews.llvm.org/D28813

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295746 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-21 18:53:14 +00:00
Simon Pilgrim
0e35fcb104 [X86][AVX512] Update VPBROADCASTQ test to combine from VPERMQ instead of VPERMI2Q.
VPERMI2Q doesn't have shuffle decoding from re-materializable constants.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295736 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-21 17:04:11 +00:00
Simon Pilgrim
d0be5ae522 [X86][AVX] Rename shuffle combine tests to show combined shuffle type. NFCI.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295735 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-21 16:45:31 +00:00
Simon Pilgrim
3f5e9f4627 [X86][AVX2] Fix VPBROADCASTQ folding on 32-bit targets.
As i64 isn't a value type on 32-bit targets, we need to fold the VZEXT_LOAD into VPBROADCASTQ.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295733 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-21 16:41:44 +00:00
Simon Pilgrim
77c8682840 [X86][AVX2] Add AVX512 test targets to AVX2 shuffle combines.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295731 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-21 16:29:28 +00:00
Simon Pilgrim
a77fb3315a [X86][AVX] Add tests showing missed VPBROADCASTQ folding on 32-bit targets.
As i64 isn't a value type on 32-bit targets, we fail to fold the VZEXT_LOAD into VPBROADCASTQ.

Also shows that we're not decoding VPERMIV3 shuffles very well....

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295729 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-21 16:05:35 +00:00
Simon Pilgrim
d14347a186 [X86][SSE] Prefer to combine shuffles to VZEXT over VZEXT_MOVL.
This matches what is already done during shuffle lowering and helps prevent the need for a zero-vector in cases where shuffles match both patterns.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295723 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-21 15:09:00 +00:00
Simon Pilgrim
70dcac1118 [X86][SSE] Added SSE41 shuffle combining test file.
Currently just contains one case where we combine to VZEXT_MOVL instead of VZEXT which would avoid the need for a zero vector to be generated

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295721 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-21 14:51:15 +00:00
Igor Breger
253f60a6d8 [AVX512] Fix EXTRACT_VECTOR_ELT for v2i1/v4i1/v32i1/v64i1 with variable index.
Differential Revision: https://reviews.llvm.org/D30189



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295718 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-21 14:01:25 +00:00
Diana Picus
c247010cec [ARM] GlobalISel: Lower calls to void() functions
For now, we hardcode a BLX instruction, and generate an ADJCALLSTACKDOWN/UP pair
with amount 0.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295716 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-21 11:33:59 +00:00
Craig Topper
4e7e88abb9 [X86] Remove ssse3 intrinsic tests from the avx intrinsics test file.
They are all covered by the SSSE3 intrinsics test with SSSE3, AVX, and AVX512 command lines.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295708 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-21 08:06:08 +00:00
Craig Topper
f12d64f2d3 [X86] Remove sse4.2 intrinsic tests from the avx intrinsics test file. Fix some other consistency issues.
They are all covered by the SSE4.2 intrinsics test with SSE4.2, AVX, and AVX512 command lines.

Merge sse42.ll into the other intrinsics test. Rename sse42_64.ll to be named like other intrinsic tests.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295707 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-21 08:06:05 +00:00
Craig Topper
74f63b561d [X86] Remove sse4.1 intrinsic tests from the avx intrinsics test file.
They are all covered by the SSE4.1 intrinsics test with SSE4.1, AVX, and AVX512 command lines.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295706 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-21 08:06:02 +00:00
Craig Topper
c39fbcaf44 [X86] Remove sse3 intrinsic tests from the avx intrinsics test file.
They are all covered by the SSE3 intrinsics test with SSE2, AVX, and AVX512 command lines.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295705 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-21 08:05:59 +00:00
Craig Topper
f80e7ef629 [X86] Remove aes intrinsic tests from the avx intrinsics test file.
They are all covered by the AES intrinsics test with a legacy command line and an AVX command line.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295702 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-21 07:32:18 +00:00
Craig Topper
068549b757 [X86] Add an AVX command line and regenerate AES intrinsics test using the update_llc_test_checks.py
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295701 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-21 07:32:14 +00:00
Craig Topper
4228b1201f [X86] Remove sse2 intrinsic tests from the avx intrinsics test file.
They are all covered by the SSE2 intrinsics test with SSE2, AVX, and AVX512 command lines.

Also remove an unneeded lfence intrinsic test since it was already covered.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@295700 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-21 07:32:11 +00:00