Commit Graph

17425 Commits

Author SHA1 Message Date
Chih-Hung Hsieh
f94271deae [X86] Accept SELECT op code for x86-64 fp128 type
DAGTypeLegalizer::CanSkipSoftenFloatOperand should allow
SELECT op code for x86_64 fp128 type for MME targets,
so SoftenFloatOperand does not abort on SELECT op code.

Differential Revision: http://reviews.llvm.org/D21758


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275818 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-18 17:20:09 +00:00
Simon Pilgrim
00a0a786d0 [X86][AVX2] Added tests that demonstrate duplicate broadcasts
We don't yet decode broadcasts as a target shuffle

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275808 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-18 16:17:34 +00:00
Krzysztof Parzyszek
a7c00b136c [Hexagon] Enable .cur formation in MISched for Hexagon V60
Schedule a load and its use in the same packet in MISched. Previously,
isResourceAvailable was returning false for dependences in the same
packet, which prevented MISched from packetizing a load and its use in
the same packet for v60.

Patch by Ikhlas Ajbar.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275804 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-18 16:05:27 +00:00
Nemanja Ivanovic
fe4ad6d3ea [PowerPC] Remove redundant direct moves when extracting integers and converting to FP
This patch corresponds to review:
https://reviews.llvm.org/D21354

We use direct moves for extracting integer elements from vectors. We also use
direct moves when converting integers to FP. When these operations are chained,
we get a direct move out of a VSR followed by a direct move back into a VSR.
These are redundant - all we need to do is line up the element and convert.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275796 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-18 15:30:00 +00:00
Krzysztof Parzyszek
56af121d06 [Hexagon] Use timing class info as tie-breaker in machine scheduler
Patch by Sirish Pande.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275794 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-18 15:17:10 +00:00
Krzysztof Parzyszek
98b655feba [Hexagon] HexagonMachineScheduler should account for resources
The machine scheduler needs to account for available resources
more accurately in order to avoid scheduling an instruction that
forces a new packet to be created.

This occurs in two ways: First, an instruction without an available
resource may have a large priority due to other metrics and be
scheduled when there are other instructions with available resources.
Second, an instruction with a non-zero latency may become available
prematurely. In both these cases, we attempt change the priority
in order to allow a better instruction to be scheduled.

Patch by Brendon Cahoon.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275793 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-18 14:52:13 +00:00
Krzysztof Parzyszek
9547556e81 [Hexagon] Fix zero latency instructions with multiple predecessors
An instruction may have multiple predecessors that are candidates
for using .cur. However, only one of them can use .cur in the
packet. When this case occurs, we need to make sure that only
one of the dependences gets a 0 latency value.

Patch by Brendon Cahoon.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275790 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-18 14:23:10 +00:00
Simon Dardis
5ebefb8fd2 [inlineasm] Propagate operand constraints to the backend
When SelectionDAGISel transforms a node representing an inline asm
block, memory constraint information is not preserved. This can cause
constraints to be broken when a memory offset is of the form:

offset + frame index

when the frame is resolved.

By propagating the constraints all the way to the backend, targets can
enforce memory operands of inline assembly to conform to their constraints.

For MIPSR6, some instructions had their offsets reduced to 9 bits from
16 bits such as ll/sc. This becomes problematic when using inline assembly
to perform atomic operations, as an offset can generated that is too big to
encode in the instruction.

Reviewers: dsanders, vkalintris

Differential Review: https://reviews.llvm.org/D21615


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275786 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-18 13:17:31 +00:00
Nicolai Haehnle
0c05ce4746 AMDGPU: Disable AMDGPUPromoteAlloca pass for shader calling conventions.
Summary:
The work item intrinsics are not available for the shader
calling conventions. And even if we did hook them up most
shader stages haves some extra restrictions on the amount
of available LDS.

Reviewers: tstellarAMD, arsenm

Subscribers: nhaehnle, arsenm, llvm-commits, kzhuravl

Differential Revision: https://reviews.llvm.org/D20728

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275779 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-18 09:02:47 +00:00
Diana Picus
7e13fe031b [ARM] Update test to use CHECK-LABEL. NFCI.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275777 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-18 07:48:42 +00:00
Diana Picus
ef2833b8f3 [ARM] Skip inline asm memory operands in DAGToDAGISel
The current logic for handling inline asm operands in DAGToDAGISel interprets
the operands by looking for constants, which should represent the flags
describing the kind of operand we're dealing with (immediate, memory, register
def etc). The operands representing actual data are skipped only if they are
non-const, with the exception of immediate operands which are skipped explicitly
when a flag describing an immediate is found.

The oversight is that memory operands may be const too (e.g. for device drivers
reading a fixed address), so we should explicitly skip the operand following a
flag describing a memory operand. If we don't, we risk interpreting that
constant as a flag, which is definitely not intended.

Fixes PR26038

Differential Revision: https://reviews.llvm.org/D22103

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275776 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-18 07:35:14 +00:00
Craig Topper
c9ba7aa68b [AVX512] Add EVEX versions of scalar ADD/SUB/MUL/DIV to load folding tables.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275775 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-18 06:49:32 +00:00
Craig Topper
4052e7231f [X86] Fix test checks to include leading 'v' on avx mnemonic names.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275774 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-18 06:49:29 +00:00
Diana Picus
d504c85ea6 [ARM] Honour ABI for rem under -O0 for EABI, GNUEABI, Android and Musl
At higher optimization levels, we generate the libcall for DIVREM_Ix, which is
fine: aeabi_{u|i}divmod. At -O0 we generate the one for REM_Ix, which is the
default {u}mod{q|h|s|d}i3.

This commit makes sure that we don't generate REM_Ix calls for ABIs that
don't support them (i.e. where we need to use DIVREM_Ix instead). This is
achieved by bailing out of FastISel, which can't handle non-double multi-reg
returns, and letting the legalization infrastructure expand the REM_Ix calls.

It also updates the divmod-eabi.ll test to run under -O0 as well, and adds some
Windows checks to it to make sure we don't break things for it.

Fixes PR27068

Differential Revision: https://reviews.llvm.org/D21926

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275773 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-18 06:48:25 +00:00
Craig Topper
fefffbf697 [X86] Add VPADD instructions to X86InstrInfo::isAssociativeAndCommutative.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275769 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-18 06:14:54 +00:00
Craig Topper
81c3344bd0 [X86] Add floating point packed logical ops to X86InstrInfo::isAssociativeAndCommutative.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275768 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-18 06:14:50 +00:00
Craig Topper
4388ffce8e [X86] Add AVX512 instructions to X86InstrInfo::isAssociativeAndCommutative.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275767 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-18 06:14:47 +00:00
Craig Topper
3305a40150 [X86] Add AVX512 load opcodes and a couple AVX load opcodes to X86InstrInfo::areLoadsFromSameBasePtr.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275765 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-18 06:14:43 +00:00
Craig Topper
e70f2b66e1 [X86] Add more opcodes to isFrameLoadOpcode/isFrameStoreOpcode. Mainly AVX-512 related.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275764 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-18 06:14:39 +00:00
Craig Topper
0c4677f3cc [AVX512] Use VMOVAPSZ128rr/VMOVAPS256rr for VR128X/VR256X physreg moves when VLX is supported.
Ideally we would use VEX encoded moves instead of EVEX if the high 16 registers aren't referenced, but this a good first step.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275763 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-18 06:14:34 +00:00
Simon Pilgrim
5bf69d80fb [X86][AVX] Added VBROADCASTF128/VBROADCASTI128 tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275713 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-17 17:44:18 +00:00
Simon Pilgrim
0aff5d2239 [X86] Regenerated ctlz/cttz scalar tests for 32/64-bit targets with/without LZCNT/TZCNT support
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275710 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-17 16:15:51 +00:00
Simon Pilgrim
506afb80d9 [X86] Regenerated popcnt scalar tests for 32/64-bit targets with/without POPCNT support
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275709 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-17 16:04:19 +00:00
Elena Demikhovsky
562b267d41 X86: Updated a test file. NFC.
This test shows subotimal code generated for AVX-512 vs PENTIUM4.
The issue will be fixed in an upcomming commit.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275702 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-17 07:03:13 +00:00
Hal Finkel
aa47729233 Disable this-return argument forwarding on ARM/AArch64
r275042 reverted function-attribute inference for the 'returned' attribute
because the feature triggered self-hosting failures on ARM and AArch64. James
Molloy determined that the this-return argument forwarding feature, which
directly ties the returned input argument to the returned value, was the cause.
It seems likely that this forwarding code contains, or triggers, a subtle bug.
Disabling for now until we can track that down.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275677 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-16 07:07:29 +00:00
Yaxun Liu
384c6423e5 Re-commit [AMDGPU] Add metadata for runtime
Attempting to fix lit test failure on ppc.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275676 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-16 05:09:21 +00:00
Matthias Braun
3346c15107 llc: Add support for -run-pass none
This does not schedule any passes besides the ones necessary to
construct and print the machine function. This is useful to test .mir
file reading and printing.

Differential Revision: http://reviews.llvm.org/D22432

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275664 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-16 02:24:59 +00:00
Matthias Braun
7e0a8cbfdc ARM/MIR: Move test from MIR to CodeGen/ARM directory
test/CodeGen/MIR/ARM/ARMLoadStoreDBG.mir is an actual test for the ARM
load store optimization pass and not a test of the mir parser/printer.

It belongs to test/CodeGen/ARM; This also updates the test to use the
new -run-pass llc syntax.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275662 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-16 02:24:13 +00:00
Matthias Braun
4d5c34d999 MIParser: reject subregister indexes on physregs
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275658 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-16 01:36:18 +00:00
Matt Arsenault
e066e581b1 AMDGPU: Fix verifier error from partially undef copy
In this situation:

%VGPR2<def> = BUFFER_LOAD_DWORD_OFFSET %SGPR8_SGPR9_SGPR10_SGPR11,
%VGPR7<def,tied3> = V_MAC_F32_e32 %VGPR0<undef>, %VGPR1<kill>, %VGPR7<kill,tied0>, %EXEC<imp-use>
%VGPR3_VGPR4_VGPR5_VGPR6<def> = COPY %VGPR0_VGPR1_VGPR2_VGPR3
%VGPR4<def> = COPY %VGPR2

The copy for VGPR1 -> VGPR4 was an error from reading undefined VGPR1,
but VGPR4 is defined immediately after this copy.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275635 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-15 22:32:02 +00:00
Michael Kuperstein
467260108e ExpandPostRAPseudos should transfer implicit uses, not only implicit defs
Previously, we would expand:
%BL<def> = COPY %DL<kill>, %EBX<imp-use,kill>, %EBX<imp-def>
Into:
%BL<def> = MOV8rr %DL<kill>, %EBX<imp-def>
Dropping the imp-use on the floor.

That confused CriticalAntiDepBreaker, which (correctly) assumes that if an
instruction defs but doesn't use a register, that register is dead immediately
before the instruction - while in this case, the high lanes of EBX can be very
much alive.

This fixes PR28560.

Differential Revision: https://reviews.llvm.org/D22425


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275634 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-15 22:31:14 +00:00
Matt Arsenault
35290cc53d AMDGPU: Remove brev intrinsic
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275620 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-15 21:27:13 +00:00
Matt Arsenault
5fecfa22e5 AMDGPU: Fix TargetPrefix for remaining r600 intrinsics
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275619 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-15 21:27:08 +00:00
Matt Arsenault
a47e87a336 AMDGPU: Remove AMDGPU.ldexp
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275618 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-15 21:26:56 +00:00
Matt Arsenault
7150fbf236 AMDGPU: Remove legacy rsq.clamped intrinsic
Mesa still has a use of llvm.AMDGPU.rsq.f64 remaining.

Also fix mismatch with non-IEEE rsq selecting to IEEE rsq.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275617 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-15 21:26:52 +00:00
Saleem Abdulrasool
1083a5297a CodeGen: avoid emitting unnecessary CFI
Remove unnecessary clutter in assembly output.  When using SjLj EH, the CFI is
not actually used for anything.  Do not emit the CFI needlessly.  The minor test
adjustments are interesting.  The prologue test was just overzealous matcching.
The interesting case is the LSDA change.  It was originally added to ensure that
various compilations did not mangle the name (it explicitly checked the name!).
However, subsequent cleanups made it more reliant on the CFI to find the name.
Parse the generated code flow to generically find the label still.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275614 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-15 21:10:29 +00:00
Nico Weber
2cef100c63 Teach fast isel about the win64 calling convention.
This mostly just works.

Vectorcall rets are still not supported.

The win64_eh test change is because fast isel doesn't use rsi for temporary
computations, so it doesn't need to be pushed. The test case I'm changing was
originally added to test pushes, but by now there are other test cases in that
file exercising that code path.

https://reviews.llvm.org/D22422


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275607 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-15 20:18:37 +00:00
Vitaly Buka
a6cb7108c4 Revert "[AMDGPU] Add metadata for runtime"
This reverts commit r275566.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275599 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-15 19:14:57 +00:00
Krzysztof Parzyszek
733cec8f05 [Hexagon] Improve patterns with stack-based addressing
- Treat bitwise OR with a frame index as an ADD wherever possible, fold it
  into addressing mode.
- Extend patterns for memops to allow memops with frame indexes as address
  operands.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275569 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-15 15:35:52 +00:00
Nico Weber
6e06ef3736 In dag-optnone.ll, use varargs instead of win64 to fast SDIsel.
The test used to rely on targeting win64 to disable fast isel,
but I'd like to teach fast isel about win64 rets.  Change the
test to use varargs to disable fast isel.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275568 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-15 15:30:18 +00:00
Yaxun Liu
6b0141c6fb [AMDGPU] Add metadata for runtime
Added emitting metadata to elf for runtime.

Runtime requires certain information (metadata) about kernels to be able to execute and query them. Such information is emitted to an elf section as a key-value pair stream.

Differential Revision: https://reviews.llvm.org/D21849

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275566 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-15 14:58:21 +00:00
Simon Pilgrim
c08f35ca62 [X86][AVX] Added shuffle tests for UNPCK+PERMUTE
lowerVectorShuffleAsPermuteAndUnpack could solve this if it worked with 256-bit vectors

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275554 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-15 11:51:46 +00:00
Simon Pilgrim
02e653effb [X86][AVX2] Added a memory version of test_mm256_broadcastsi128_si256
This should lower to vbroadcasti128

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275552 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-15 11:40:27 +00:00
Simon Pilgrim
be745b9c8c [X86][AVX2] Improve lowerShuffleAsRepeatedMaskAndLanePermute permutation of 64-bit sub-lanes
As discussed on PR28136, lowerShuffleAsRepeatedMaskAndLanePermute was attempting to match repeated masks at the 128-bit level and then permute the resultant lanes at the 128-bit (AVX1) or 64-bit (AVX2) sub-lane level.

This change allows us to create the repeated masks at the sub-lane level (and then concat them together to create a 128-bit repeated mask) and then select which sub-lane to permute. This has no effect on the AVX1 codegen.

Fixes PR28136.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275543 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-15 09:49:12 +00:00
James Molloy
cf85ddcdf1 [Thumb-1] Select post-increment load and store where possible
Thumb-1 doesn't have post-inc or pre-inc load or store instructions. However the LDM/STM instructions with writeback can function as post-inc load/store:

  ldm r0!, {r1}  @ load from r0 into r1 and increment r0 by 4

Obviously, this only works if the post increment is 4.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275540 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-15 08:03:56 +00:00
James Molloy
decaafe6c2 [ARM] Prefer indirect calls in minsize mode
... When we emit several calls to the same function in the same basic block.

An indirect call uses a "BLX r0" instruction which has a 16-bit encoding. If many calls are made to the same target, this can enable significant code size reductions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275537 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-15 07:55:21 +00:00
Matt Arsenault
beff7fe056 AMDGPU: Fix not expanding control flow after some kill blocks
Also stop trying to insert skip blocks at end_cf. This
was inserting them at the end of the block which doesn't make
sense. The skip should be inserted at the beginning of the block
right after the end cf. Just remove this for now since no tests
seem to stress this and I think this can be handled more generally
later.

Fixes bug 28550

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275510 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-15 00:58:15 +00:00
Matt Arsenault
011dcf3d90 AMDGPU: Fix trying to skip from a block with no successors
Found while reducing bug 28550

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275509 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-15 00:58:13 +00:00
Matt Arsenault
435a4467a3 AMDGPU: Fix splitting kill blocks with defs before kill
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275508 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-15 00:58:09 +00:00
Simon Pilgrim
db8566250f [X86][AVX2] Allow VPERMPD/VPERMQ shuffles to call combineShuffle (reapplied)
This improves the situation discussed in D19228 where we were forcing VPERMPD/VPERMQ where VPERM2F128/VPERM2I128 would have been better.

This was incorrectly reverted in rL275421 during triage of PR28552.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275497 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-14 23:05:09 +00:00