Commit Graph

19156 Commits

Author SHA1 Message Date
Tim Northover
06bfcf3df9 GlobalISel: simplify MachineIRBuilder interface.
MachineIRBuilder had weird before/after and beginning/end flags for the insert
point. Unfortunately the non-default means that instructions will be inserted
in reverse order which is almost never what anyone wants.

Really, I think we just want (like IRBuilder has) the ability to insert at any
C++ iterator-style point (i.e. before any instruction or before MBB.end()). So
this fixes MIRBuilders to behave like IRBuilders in this respect.
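
A minimal sketch of the ordering problem described above, assuming a basic block can be modelled as a std::list of instruction names. Block, insertBefore and insertAfter are hypothetical names for illustration only; they are not the MachineIRBuilder API.

```cpp
#include <cassert>
#include <iterator>
#include <list>
#include <string>
#include <vector>

// Hypothetical stand-in for a basic block: an ordered list of instruction names.
using Block = std::list<std::string>;

// Insert every new instruction before one fixed position.
// std::list::insert places the element before InsertPt and leaves InsertPt
// valid, so successive insertions keep their emission order.
void insertBefore(Block &MBB, Block::iterator InsertPt,
                  const std::vector<std::string> &NewInsts) {
  for (const std::string &Inst : NewInsts)
    MBB.insert(InsertPt, Inst);
}

// Insert every new instruction directly after one fixed anchor instruction.
// Each insertion lands in front of the previous one, so the order is reversed.
void insertAfter(Block &MBB, Block::iterator Anchor,
                 const std::vector<std::string> &NewInsts) {
  assert(Anchor != MBB.end() && "need a real instruction to insert after");
  for (const std::string &Inst : NewInsts)
    MBB.insert(std::next(Anchor), Inst);
}
```

Inserting before an iterator (including the end iterator) covers every position in the block, which is why a single "insert before" convention is enough.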

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@288980 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-07 21:05:38 +00:00
Michael Kuperstein
30fb413876 [X86] Skip over DEBUG_VALUE while looking for start of call sequence
If we don't skip over DEBUG_VALUEs, we get differences between -g and non-g
code.

This fixes PR31242.

Differential Revision: https://reviews.llvm.org/D27485


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@288965 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-07 19:31:08 +00:00
Michael Kuperstein
3ffda498ec [X86] Do not assume "ri" instructions always have an immediate operand
The second operand of an "ri" instruction may be an immediate, but it may
also be a global variable, so we should not make any assumptions.

This fixes PR31271.

Differential Revision: https://reviews.llvm.org/D27481


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@288964 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-07 19:29:18 +00:00
Simon Pilgrim
aee7c6e2f5 [SelectionDAG] Add knownbits support for vector demandedelts in SMAX/SMIN/UMAX/UMIN opcodes
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@288926 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-07 17:54:00 +00:00
Simon Pilgrim
7b24dd44c8 [X86] Add knownbits vector UMAX test
In preparation for demandedelts support

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@288920 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-07 17:21:13 +00:00
Simon Pilgrim
def95b9c34 [SelectionDAG] Add knownbits support for EXTRACT_VECTOR_ELT opcodes
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@288916 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-07 16:28:21 +00:00
Simon Pilgrim
b23f6fe812 [X86] Add test to show missed opportunities to calculate knownbits in INSERT_VECTOR_ELT
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@288912 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-07 15:27:18 +00:00
Simon Pilgrim
5ab68dcc0b [X86][SSE] Fix vpextrd/vpextrq checks
They were testing for the pre-vex versions

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@288911 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-07 15:10:05 +00:00
Simon Pilgrim
c57b7e50fc [X86][SSE] Force execution domain of 32-bit extractps/pextrd in the stack folding tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@288910 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-07 15:06:14 +00:00
Simon Pilgrim
03f619110f [X86][SSE] Regenerate test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@288906 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-07 13:05:04 +00:00
Dylan McKay
038449d896 [AVR] Expand 'SELECT_CC' nodes wherever possible
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@288905 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-07 12:34:47 +00:00
Simon Pilgrim
d2a4d816a1 [X86][SSE] Consistently set MOVD/MOVQ load/store/move instructions to integer domain
We are being inconsistent with these instructions (and all their variants), with a random mix of them using the default float domain.

Differential Revision: https://reviews.llvm.org/D27419

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@288902 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-07 12:10:49 +00:00
Dylan McKay
f7450dd95f [AVR] Move a pseudo expansion test into a folder
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@288899 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-07 11:21:45 +00:00
Simon Pilgrim
714162bb4f [X86][XOP] Fix VPERMIL2 non-constant pool shuffle decoding (PR31296)
The non-constant pool version of DecodeVPERMIL2PMask was not offsetting correctly for the second input. I've updated the code to match the implementation in the constant-pool version.

Annoyingly, this bug stayed hidden for so long because it's tricky to combine to useful variable shuffle masks that don't become constant-pool entries.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@288898 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-07 11:19:00 +00:00
Dylan McKay
0007d14057 [AVR] Allow loading from stack slots where src and dest registers are identical
Fixes PR 31256

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@288897 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-07 11:08:56 +00:00
Tom Stellard
c53f76cc0b AMDGPU: Add S_SETREG instructions to fix fdiv precision issues.
Patch By: Wei Ding

Summary: This patch fixes the fdiv precision issues.

Reviewers: b-sumner, cfang, wdng, arsenm

Subscribers: kzhuravl, nhaehnle, yaxunl, tony-tye

Differential Revision: https://reviews.llvm.org/D26424

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@288879 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-07 02:42:15 +00:00
Tom Stellard
89ce12495b AMDGPU: Add llvm.amdgcn.interp.mov intrinsic
Reviewers: arsenm, nhaehnle

Subscribers: kzhuravl, wdng, yaxunl, tony-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D26725

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@288865 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-06 23:52:13 +00:00
Matt Arsenault
9bdddbab7d AMDGPU: Fix crash on i16 constant expression
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@288861 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-06 23:18:06 +00:00
Simon Pilgrim
8dc8a8bead [X86][XOP] Add test case for PR31296
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@288858 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-06 22:50:13 +00:00
Eli Friedman
0e51e5e82d [CodeGen] Fix result type for SMULO/UMULO legalization
On some platforms (like MSP430) the second element of the result
structure for SMULO/UMULO may have a shorter type than the one
returned by SetCC. We need to truncate it to the right type, or
else some incorrect code may be generated later on.

This fixes issue https://github.com/rust-lang/rust/issues/37829

Patch by Vadzim Dambrouski!

Differential Revision: https://reviews.llvm.org/D27154



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@288857 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-06 22:49:36 +00:00
Tom Stellard
2fff37f710 AMDGPU/SI: Set correct value for amd_kernel_code_t::kernarg_segment_alignment
Reviewers: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye

Differential Revision: https://reviews.llvm.org/D27416

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@288852 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-06 21:53:10 +00:00
Tom Stellard
4fae32e28b AMDGPU/SI: Don't move copies of immediates to the VALU
Summary:
If we write an immediate to a VGPR and then copy the VGPR to an
SGPR, we can replace the copy with a S_MOV_B32 sgpr, imm, rather than
moving the copy to the SALU.

Reviewers: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye

Differential Revision: https://reviews.llvm.org/D27272

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@288849 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-06 21:13:30 +00:00
Tim Northover
2c23a5b605 GlobalISel: correctly handle small args via memory.
We were rounding size in bits down rather than up, leading to 0-sized slots for
i1 (assert!) and bugs for other types not byte-aligned.
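
A small sketch of the rounding issue, assuming the slot size is derived from a size in bits by integer division. The function names are hypothetical and are not taken from the GlobalISel sources.

```cpp
// Bytes needed to hold Bits bits, rounding up to a whole byte.
constexpr unsigned bytesRoundedUp(unsigned Bits) { return (Bits + 7) / 8; }

// The problematic variant: plain integer division rounds down and drops
// any partial byte.
constexpr unsigned bytesRoundedDown(unsigned Bits) { return Bits / 8; }

// i1 ends up with a 0-sized slot when rounding down, but needs one byte.
static_assert(bytesRoundedDown(1) == 0, "rounding down gives i1 no storage");
static_assert(bytesRoundedUp(1) == 1, "rounding up gives i1 one byte");
```

The same loss affects any type whose width is not a multiple of 8; for example, a 12-bit value needs 2 bytes but rounds down to 1.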

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@288848 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-06 21:02:19 +00:00
Zvi Rackover
42694a3a7e [X86] Prefer reduced width multiplication over pmulld on Silvermont
Summary:
Prefer expansions such as: pmullw,pmulhw,unpacklwd,unpackhwd over pmulld.
On Silvermont [source: Optimization Reference Manual]:
PMULLD has a throughput of 1/11 [instruction/cycles].
PMULHUW/PMULHW/PMULLW have a throughput of 1/2 [instruction/cycles].
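
A hedged sketch of the arithmetic behind the expansion, for a single lane and assuming both operands are known to fit in 16 unsigned bits. It models the low/high 16-bit multiplies and the recombination done by the unpack step with scalar helpers; the helper names are hypothetical and this is not the actual lowering code.

```cpp
#include <cstdint>

// Low 16 bits of the 16x16 product (what one PMULLW lane produces).
constexpr std::uint16_t mulLo16(std::uint16_t A, std::uint16_t B) {
  return static_cast<std::uint16_t>((static_cast<std::uint32_t>(A) * B) & 0xFFFFu);
}

// High 16 bits of the unsigned 16x16 product (what one PMULHUW lane produces).
constexpr std::uint16_t mulHiU16(std::uint16_t A, std::uint16_t B) {
  return static_cast<std::uint16_t>((static_cast<std::uint32_t>(A) * B) >> 16);
}

// Interleave the two halves back into the full 32-bit product.
constexpr std::uint32_t mul32FromHalves(std::uint16_t A, std::uint16_t B) {
  return (static_cast<std::uint32_t>(mulHiU16(A, B)) << 16) | mulLo16(A, B);
}
```

With the throughputs quoted above, one PMULLD costs about 11 cycles, while each of the narrow multiplies costs about 2, so the expansion can come out ahead even though it needs more instructions.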

Fixes pr31202.

Analysis of this issue was done by Fahana Aleen.

Reviewers: wmi, delena, mkuper

Subscribers: RKSimon, llvm-commits

Differential Revision: https://reviews.llvm.org/D27203

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@288844 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-06 19:35:20 +00:00
Simon Pilgrim
6e9255f2d0 [DAGCombine] Add (sext_in_reg (zext x)) -> (sext x) combine
Handle the case where a sign extension has ended up being split into separate stages (typically to get around vector legal ops) and a zext + sext_in_reg gets inserted.
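
A scalar sketch of why the fold is sound, assuming the sext_in_reg width equals the width of x (i8 extended to i32 here). The helper names are hypothetical; they model the node semantics with plain integers rather than SelectionDAG types.

```cpp
#include <cstdint>

// zext i8 -> i32: the upper 24 bits become zero.
constexpr std::uint32_t zext8To32(std::uint8_t X) { return X; }

// sext_in_reg from i8 on an i32 value: replicate bit 7 into bits 31:8.
constexpr std::int32_t sextInReg8(std::uint32_t V) {
  return static_cast<std::int32_t>((V & 0xFFu) ^ 0x80u) - 0x80;
}

// sext i8 -> i32 done in a single step.
constexpr std::int32_t sext8To32(std::uint8_t X) {
  return X < 0x80 ? static_cast<std::int32_t>(X)
                  : static_cast<std::int32_t>(X) - 0x100;
}

// The two-stage form agrees with the one-stage form at the sign boundary.
static_assert(sextInReg8(zext8To32(0x80)) == sext8To32(0x80), "fold holds");
```

Because sext_in_reg only looks at the low 8 bits, the zeros written by the zext never influence the result.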

Differential Revision: https://reviews.llvm.org/D27461

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@288842 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-06 19:09:37 +00:00
Tim Northover
9c3d059fa2 GlobalISel: fall back gracefully when we hit unhandled legalizer default.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@288840 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-06 19:02:15 +00:00
Simon Pilgrim
3074f79595 [SelectionDAG] We can ignore knownbits from an undef shuffle vector index if we don't actually demand that element
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@288839 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-06 18:58:25 +00:00
Tim Northover
3a783f8716 GlobalISel: handle G_SEQUENCE fallbacks gracefully.
There were two problems:
  + AArch64 was reusing random data from its binary op tables, which is
    complete nonsense for G_SEQUENCE.
  + Even when AArch64 gave up and said it couldn't handle G_SEQUENCE,
    the generic code asserted.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@288836 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-06 18:38:38 +00:00
Tim Northover
22c48aa20e GlobalISel: allow G_SELECT instructions for pointers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@288835 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-06 18:38:34 +00:00
Tim Northover
a5cd8a603e GlobalISel: stop the legalizer from trying to handle oddly-sized types.
It'll almost immediately fail because it always tries to halve/double the size
until it finds a legal one. Unfortunately, this triggers an assertion
preventing the DAG fallback from being possible.
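
A rough sketch of why oddly-sized types never converge, assuming (purely for illustration) that the legal scalar sizes are 8, 16, 32 and 64 bits and that halving must be exact. The helpers are hypothetical and do not mirror the real legalizer code.

```cpp
// Assumed set of legal sizes, for illustration only.
constexpr bool isLegalSize(unsigned Bits) {
  return Bits == 8 || Bits == 16 || Bits == 32 || Bits == 64;
}

// Keep halving until a legal size is hit or halving can no longer help.
constexpr bool halvingReachesLegal(unsigned Bits) {
  return isLegalSize(Bits)               ? true
         : (Bits % 2 != 0 || Bits < 8)   ? false
                                         : halvingReachesLegal(Bits / 2);
}

// Keep doubling until a legal size is hit or the largest legal size is passed.
constexpr bool doublingReachesLegal(unsigned Bits) {
  return isLegalSize(Bits)           ? true
         : (Bits == 0 || Bits > 64)  ? false
                                     : doublingReachesLegal(Bits * 2);
}

// A 24-bit type goes 12, 6 when halved and 48, 96 when doubled: never legal.
static_assert(!halvingReachesLegal(24) && !doublingReachesLegal(24),
              "24 bits cannot be legalized by scaling alone");
```

Under these assumptions only power-of-two sizes can ever reach a legal size this way, which matches the decision to stop attempting it for other sizes.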

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@288834 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-06 18:38:29 +00:00
Simon Pilgrim
057100dd3a [X86][SSE] Add knownbits test demonstrating demandedelts not ignoring undef shuffle elements
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@288825 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-06 17:00:47 +00:00
Simon Pilgrim
06ec4e5b99 [X86][SSE] Added vector sext_in_reg combine tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@288819 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-06 15:57:26 +00:00
Simon Pilgrim
42fe8f58c5 [X86] Improve UMAX/UMIN knownbits test
Test the sequential effect of each op

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@288815 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-06 15:17:50 +00:00
Ayman Musa
c700b40b54 [X86][AVX512] Detect repeated constant patterns in BUILD_VECTOR suitable for broadcasting.
Check if a build_vector node includes a repeated constant pattern and replace it with a broadcast of that pattern.
For example:
"build_vector <0, 1, 2, 3, 0, 1, 2, 3>" would be replaced by "broadcast <0, 1, 2, 3>"

Differential Revision: https://reviews.llvm.org/D26802
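
A minimal sketch of detecting a repeated constant pattern like the one in the example above, assuming the build_vector operands are available as a flat list of integer constants. repeatedPatternLength is a hypothetical helper and is not the function added by this commit.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns the length of the shortest prefix whose repetition reproduces the
// whole vector, or Elts.size() when no shorter repeating prefix exists.
std::size_t repeatedPatternLength(const std::vector<int> &Elts) {
  assert(!Elts.empty() && "expected at least one element");
  const std::size_t N = Elts.size();
  for (std::size_t Len = 1; Len < N; ++Len) {
    if (N % Len != 0)
      continue; // The pattern must tile the vector exactly.
    bool Repeats = true;
    for (std::size_t I = Len; I < N && Repeats; ++I)
      Repeats = Elts[I] == Elts[I - Len];
    if (Repeats)
      return Len;
  }
  return N;
}
```

For <0, 1, 2, 3, 0, 1, 2, 3> this reports a pattern of length 4, which is the sub-vector that would be broadcast.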



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@288804 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-06 12:24:14 +00:00
Simon Pilgrim
01544ba3a3 [X86] Add tests to show missed opportunities to calculate knownbits in SMAX/SMIN/UMAX/UMIN
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@288801 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-06 12:12:20 +00:00
Nemanja Ivanovic
3d641da82e [PowerPC] Improvements for BUILD_VECTOR Vol. 4
This is the final patch in the series of patches that improves
BUILD_VECTOR handling on PowerPC. This adds a few peephole optimizations
to remove redundant instructions. It also adds a large test case which
encompasses a large set of code patterns that build vectors - this test
case was the motivator for this series of patches.

Differential Revision: https://reviews.llvm.org/D26066


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@288800 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-06 11:47:14 +00:00
Florian Hahn
c153f037fe [framelowering] Improve tracking of first CS pop instruction.
Summary: This patch makes sure FirstCSPop and MBBI never point to DBG_VALUE instructions, which affected the code generated.

Reviewers: mkuper, aprantl, MatzeB

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D27343

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@288794 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-06 10:24:55 +00:00
Craig Topper
8368f754a7 [X86] Remove another weird scalar sqrt/rcp/rsqrt pattern.
This pattern turned a vector sqrt/rcp/rsqrt operation of sse_load_f32/f64 into the scalar instruction for the operation and put undef into the upper bits. For correctness, the resulting code should still perform the sqrt/rcp/rsqrt on the upper bits after the load is extended, since that's what the operation asked for. In particular, when the upper bits are 0, we need to calculate the sqrt/rcp/rsqrt of the zeroes and keep the result in the upper bits. This implies we should still be using the packed instruction.

The only test case for this pattern is one I just added so there was no coverage of this.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@288784 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-06 08:08:12 +00:00
Craig Topper
10b8bf3510 [X86] Add test case demonstrating a case where a vector sqrt being passed (scalar_to_vector loadf64) uses a scalar sqrt instruction.
This occurs due to a pattern that uses sse_load_f32/f64 with vector sqrt/rcp/rsqrt operations and turns them into scalar instructions. Perhaps for the case where the upper bits come from undef this is ok. I believe a (vzmovl load64) would do the same thing, but those seem to become vzload instead and selectScalarSSELoad doesn't handle that today. In that case we should be performing the vector operation on the zeros in the upper bits, which is not equivalent to using a scalar instruction.

I will remove this pattern in a follow up patch. There appears to be no other test content for it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@288783 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-06 08:08:09 +00:00
Craig Topper
0659581d34 [X86] Regenerate a test using update_llc_test_checks.py
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@288782 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-06 08:08:07 +00:00
Craig Topper
c7da972d1d [X86] Remove bad pattern that caused 128-bit loads being used by scalar sqrt/rcp/rsqrt intrinsics to select the memory form of the corresponding instruction and violate the semantics of the intrinsic.
The intrinsics are supposed to pass the upper bits straight through to their output register. This means we need to make sure we still perform the 128-bit load to get those upper bits to pass to the instruction, since the memory form of the instruction only reads 32 or 64 bits.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@288781 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-06 08:08:04 +00:00
Craig Topper
de4f9d5a4f [X86] Add test case that shows a scalar sqrtsd intrinsic of a 128-bit vector load using the load form of the sqrtsd instruction which violates the intrinsic semantics.
The sqrtsd instruction only loads 64-bits and writes bits 63:0 with the sqrt result. Bits 127:64 are preserved in the destination register. The semantics of the intrinsic indicate bits 127:64 should come from the intrinsic argument which in this case is a 128-bit load. So the generated code should have a 128-bit load and use a register form of sqrtsd.
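
A scalar model of the mismatch, assuming a 128-bit vector can be represented as a pair of doubles. V2F64 and the three functions are hypothetical illustrations of the semantics described above, not compiler code.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical model of a 128-bit register holding two doubles.
struct V2F64 {
  double Lo; // bits 63:0
  double Hi; // bits 127:64
};

// Intrinsic semantics: sqrt of the low element, upper element taken from
// the intrinsic argument.
V2F64 sqrtSdIntrinsic(V2F64 Arg) { return {std::sqrt(Arg.Lo), Arg.Hi}; }

// Memory form of the instruction: only 64 bits are read from memory, and
// the upper element is whatever the destination register already held.
V2F64 sqrtSdMemoryForm(V2F64 Dst, const double *Mem) {
  assert(Mem != nullptr && "memory operand must be valid");
  return {std::sqrt(*Mem), Dst.Hi};
}

// Register form: the low element comes from Src, the upper from Dst.
V2F64 sqrtSdRegisterForm(V2F64 Dst, V2F64 Src) {
  return {std::sqrt(Src.Lo), Dst.Hi};
}
```

Doing the full 128-bit load into a register first and then using the register form with that register as the destination reproduces the intrinsic result; folding the load into the memory form leaves stale data in the upper element.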

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@288780 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-06 08:08:01 +00:00
Craig Topper
64ddc83afd [X86] Correct pattern for VSQRTSSr_Int, VSQRTSDr_Int, VRCPSSr_Int, and VRSQRTSSr_Int to not have an IMPLICIT_DEF on the first input. The semantics of the intrinsic are clear and not undefined.
The intrinsic takes one argument, the lower bits are affected by the operation and the upper bits should be passed through. The instruction itself takes two operands, the high bits of the first operand are passed through and the low bits of the second operand are modified by the operation. To match this to the intrinsic we should pass the single intrinsic input to both operands.

I had to remove the stack folding test for these instructions since they depended on the incorrect behavior. The same register is now used for both inputs so the load can't be folded.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@288779 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-06 08:07:58 +00:00
Craig Topper
25dd36e1c9 [X86] Remove scalar logical op alias instructions. Just use COPY_FROM/TO_REGCLASS and the normal packed instructions instead
Summary:
This patch removes the scalar logical operation alias instructions. We can just use reg class copies and use the normal packed instructions instead. This removes the need for putting these instructions in the execution domain fixing tables as was done recently.

I removed the loadf64_128 and loadf32_128 patterns as DAG combine creates a narrower load for (extractelt (loadv4f32)) before we ever get to isel.

I plan to add similar patterns for AVX512DQ in a future commit to allow use of the larger register class when available.

Reviewers: spatel, delena, zvi, RKSimon

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D27401

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@288771 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-06 04:58:39 +00:00
Matt Arsenault
a079dfc363 AMDGPU: Don't require structured CFG
The structured CFG is just an aid to inserting exec
mask modification instructions, once that is done
we don't really need it anymore. We also
do not analyze blocks with terminators that
modify exec, so this should only be impacting
true branches.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@288744 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-06 01:02:51 +00:00
Weiming Zhao
943496ffc4 Summary: Currently there is no way to disable the deprecated warning from asm like this:
clang  -target arm deprecated-asm.s -c
  deprecated-asm.s:30:9: warning: use of SP or PC in the list is deprecated
       stmia   r4!, {r12-r14}

We need an option that can disable it.

Patched by Yin Ma!

Reviewers: joey, echristo, weimingz

Subscribers: llvm-commits, aemerson

Differential Revision: https://reviews.llvm.org/D27219

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@288734 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-05 23:55:13 +00:00
Tim Northover
416ccca7e0 GlobalISel: avoid looking too closely at PHIs when we bail.
The function used to finish off PHIs by adding the relevant basic blocks can
fail if we're aborting and still don't actually have the needed
MachineBasicBlocks. So avoid trying in that case.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@288727 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-05 23:10:19 +00:00
Tim Northover
9fef274c6a GlobalISel: place constants correctly in the entry block.
When the entry block was empty after arg lowering, we were always placing
constants at the end. This is probably harmless while translating the same
block, but horribly wrong once its terminator has been translated. So switch to
inserting at the beginning.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@288720 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-05 22:40:13 +00:00
Tim Northover
75dfa0e7c6 GlobalISel: handle pointer arguments that get assigned to the stack.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@288717 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-05 22:20:32 +00:00
Tim Northover
e1db4f7b15 GlobalISel: translate constants larger than 64 bits.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@288713 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-05 21:54:17 +00:00