Commit Graph

18162 Commits

Author SHA1 Message Date
Nicolai Haehnle
01a133c760 AMDGPU: Do not clobber SCC in SIWholeQuadMode
Reviewers: arsenm, tstellarAMD, mareko

Subscribers: arsenm, llvm-commits, kzhuravl

Differential Revision: http://reviews.llvm.org/D22198

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281230 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-12 16:25:20 +00:00
James Molloy
0aa0c7d910 Revert "[ARM] Promote small global constants to constant pools"
This reverts commit r281213. It made a bot go bang: http://lab.llvm.org:8011/builders/clang-cmake-armv7-a15-full/builds/14625

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281228 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-12 16:18:23 +00:00
Ahmed Bougacha
59a2759391 [BranchFolding] Unique added live-ins after hoisting code.
We're not supposed to have duplicate live-ins.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281224 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-12 16:05:31 +00:00
Ahmed Bougacha
8faa07b28d [X86] Copy imp-uses when folding tailcall into conditional branch.
r280832 added 32-bit support for emitting conditional tail-calls, but
dropped imp-used parameter registers.  This went unnoticed until
r281113, which added 64-bit support, as this is only exposed with
parameter passing via registers.

Don't drop the imp-used parameters.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281223 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-12 16:05:27 +00:00
Igor Breger
97c0650440 Add select i1 test, reproducer for PR30249.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281218 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-12 15:27:02 +00:00
James Molloy
91db09d0e8 [Thumb] Teach ISel how to lower compares of AND bitmasks efficiently
For the common pattern (CMPZ (AND x, #bitmask), #0), we can do some more efficient instruction selection if the bitmask is one consecutive sequence of set bits (32 - clz(bm) - ctz(bm) == popcount(bm)).

1) If the bitmask touches the LSB, then we can remove all the upper bits and set the flags by doing one LSLS.
2) If the bitmask touches the MSB, then we can remove all the lower bits and set the flags with one LSRS.
3) If the bitmask has popcount == 1 (only one set bit), we can shift that bit into the sign bit with one LSLS and change the condition query from NE/EQ to MI/PL (we could also implement this by shifting into the carry bit and branching on BCC/BCS).
4) Otherwise, we can emit a sequence of LSLS+LSRS to remove the upper and lower zero bits of the mask.

1-3 require only one 16-bit instruction and can elide the CMP. 4 requires two 16-bit instructions but can elide the CMP and doesn't require materializing a complex immediate, so is also a win.
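
As a rough illustration of the case analysis above, the following Python sketch applies the contiguity test (32 - clz(bm) - ctz(bm) == popcount(bm)) and then picks one of the four cases. The function name `classify_mask`, the returned labels, and the order in which overlapping cases are tried are assumptions made for this sketch; they are not taken from the ISel code.

```python
def classify_mask(bm: int, width: int = 32) -> str:
    """Classify a nonzero bitmask into the four cases from the commit message.

    Hypothetical helper: the names and the priority between overlapping
    cases (for example a single bit at the LSB) are assumptions.
    """
    assert 0 < bm < (1 << width)
    popcount = bin(bm).count("1")
    clz = width - bm.bit_length()        # number of leading zero bits
    ctz = (bm & -bm).bit_length() - 1    # number of trailing zero bits

    # One consecutive run of set bits: width - clz - ctz == popcount.
    if width - clz - ctz != popcount:
        return "not contiguous"
    if ctz == 0:
        return "case 1: LSLS"            # mask touches the LSB
    if clz == 0:
        return "case 2: LSRS"            # mask touches the MSB
    if popcount == 1:
        return "case 3: LSLS + MI/PL"    # single bit moved into the sign bit
    return "case 4: LSLS + LSRS"         # run of bits in the middle


for mask in (0x000000FF, 0xFF000000, 0x00000100, 0x00000FF0, 0x00000505):
    print(f"{mask:#010x}: {classify_mask(mask)}")
```

A mask such as 0x00000505 fails the contiguity test, so none of the four cases applies to it.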

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281215 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-12 14:30:48 +00:00
James Molloy
30cc1de5a7 [ARM] Promote small global constants to constant pools
If a constant is unnamed_addr and is only used within one function, we can save
on the code size and runtime cost of an indirection by changing the global's storage
to inside the constant pool. For example, instead of:

      ldr r0, .CPI0
      bl printf
      bx lr
    .CPI0: &format_string
    format_string: .asciz "hello, world!\n"

We can emit:

      adr r0, .CPI0
      bl printf
      bx lr
    .CPI0: .asciz "hello, world!\n"

This can cause significant code size savings when many small strings are used in one
function (4 bytes per string).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281213 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-12 13:42:16 +00:00
Pablo Barrio
ccaa95d2c6 Fix the Thumb test for vfloat intrinsics
Summary:
This test was not testing the intrinsics. A function like this:

define %v4f32 @test_v4f32.floor(%v4f32 %a){
...
        %1 = call %v4f32 @llvm.floor.v4f32(%v4f32 %a)
...
}

is transformed into the following assembly:

_test_v4f32.floor:              @ @test_v4f32.floor
...
        bl _floorf
...

In each function tested, there are two CHECKs: one that checked
for the label and another one for the intrinsic that should be used
inside the function (in our case, "floor"). However, although the
first CHECK was matching the label, the second was not matching the
intrinsic, but the second "floor" in the same line as the label.

This is fixed by making the first CHECK match the entire line.
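
The failure mode described above can be reproduced with a very simplified model of sequential matching, sketched here in Python. This is not FileCheck itself: `run_checks` is a hypothetical helper that does plain substring search, with each pattern searched for starting where the previous match ended, and the two pattern lists are assumptions about what the test's CHECK lines looked like.

```python
def run_checks(output: str, patterns: list) -> list:
    """Return the 1-based line number on which each pattern matches.

    Simplified model (an assumption, not FileCheck's real algorithm):
    every pattern is a plain substring, searched for from the end of
    the previous match.
    """
    pos = 0
    matched_lines = []
    for pat in patterns:
        idx = output.find(pat, pos)
        if idx < 0:
            raise ValueError(f"pattern not found: {pat!r}")
        matched_lines.append(output.count("\n", 0, idx) + 1)
        pos = idx + len(pat)
    return matched_lines


label_line = "_test_v4f32.floor:              @ @test_v4f32.floor"
asm = label_line + "\n        bl _floorf\n"

# Loose label check: the second pattern is satisfied by the comment
# on the label line, so the call is never inspected.
loose = run_checks(asm, ["test_v4f32.floor", "floor"])

# Label check covering the entire line: the second pattern can only
# match on a later line, here the call to _floorf.
strict = run_checks(asm, [label_line, "floor"])

print(loose, strict)
```

In this model the loose variant reports both matches on line 1, while the whole-line variant pushes the second match to line 2.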

Reviewers: jmolloy, rengolin

Subscribers: rengolin, llvm-commits

Differential Revision: https://reviews.llvm.org/D24398

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281211 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-12 13:14:14 +00:00
Tim Northover
5d592ae6b2 GlobalISel: support translation of global addresses.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281207 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-12 12:10:41 +00:00
Tim Northover
39f340dd0f GlobalISel: translate GEP instructions.
Unlike SDag, we use a separate G_GEP instruction (much simplified, only taking
a single byte offset) to preserve the pointer type information through
selection.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281205 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-12 11:20:22 +00:00
Tim Northover
7a92e735b6 GlobalISel: disambiguate types when printing MIR
Some generic instructions have multiple types. While in theory these can always be
discovered by inspecting the single definition of each generic vreg, in
practice those definitions won't always be local and traipsing through a big
function to find them will not be fun.

So this changes MIRPrinter to print out the type of uses as well as defs, if
they're known to be different or not known to be the same.

On the parsing side, we're a little more flexible: provided each register is
given a type in at least one place it's mentioned (and all types are
consistent) we accept the MIR. This doesn't introduce ambiguity but makes
writing tests manually a bit less painful.
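
The parsing rule in the previous paragraph can be sketched as a small consistency check. The Python below is only an illustration under assumed names: `resolve_types`, the `(register, type)` tuples, and the type strings are hypothetical and do not reflect the MIR parser's actual data structures.

```python
def resolve_types(mentions):
    """Resolve one type per register from a list of (register, type) mentions.

    A type of None means the register was mentioned without a type.
    Hypothetical sketch: accept the input only if every register has a
    type in at least one mention and all of its typed mentions agree.
    """
    types = {}
    seen = set()
    for reg, ty in mentions:
        seen.add(reg)
        if ty is None:
            continue
        if types.setdefault(reg, ty) != ty:
            raise ValueError(f"conflicting types for {reg}: {types[reg]} vs {ty}")
    untyped = seen - types.keys()
    if untyped:
        raise ValueError(f"no type given for: {sorted(untyped)}")
    return types


# %0 is typed at its definition only; %1 is typed at its use only.
print(resolve_types([("%0", "s32"), ("%1", None), ("%0", None), ("%1", "s64")]))
```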

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281204 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-12 11:20:10 +00:00
Elena Demikhovsky
d186dd43ad AVX-512: Added a test case that should be optimized in the future. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281196 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-12 06:26:03 +00:00
NAKAMURA Takumi
56e56fbc57 llvm/test/CodeGen/AMDGPU/infinite-loop-evergreen.ll REQUIRES +Asserts.
This might not *crash* with -Asserts. I saw it cause an infinite loop in the codegen.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281190 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-12 04:27:28 +00:00
James Molloy
16d9e13298 [AArch64] Fixup test after r281160
How I missed this locally is beyond me. I suspect llc didn't recompile. This is just changing the CHECK line back to what it was before r280364.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281161 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-11 08:24:04 +00:00
Craig Topper
d3eebe7daf [AVX-512] Add test cases to demonstrate opportunities for commuting vpternlog. Commuting will be added in a future commit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281157 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-11 05:33:43 +00:00
Craig Topper
3679bfe301 [AVX-512] Add VPTERNLOG to load folding tables.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281156 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-11 05:33:40 +00:00
Craig Topper
88c0b88fbf [X86] Side effecting asm in AVX512 integer stack folding test should return 2 x i64 not 8 x i64.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281155 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-11 05:33:38 +00:00
Justin Lebar
877859e49f [NVPTX] Use ldg for explicitly invariant loads.
Summary:
With this change (plus some changes to prevent !invariant from being
clobbered within llvm), clang will be able to model the __ldg CUDA
builtin as an invariant load, rather than as a target-specific llvm
intrinsic.  This will let the optimizer play with these loads --
specifically, we should be able to vectorize them in the load-store
vectorizer.

Reviewers: tra

Subscribers: jholewinski, hfinkel, llvm-commits, chandlerc

Differential Revision: https://reviews.llvm.org/D23477

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281152 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-11 01:39:04 +00:00
Arnold Schwaighofer
10c97f029a It should also be legal to pass a swifterror parameter to a call as a swifterror
argument.

rdar://28233388

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281147 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-10 19:42:53 +00:00
Arnold Schwaighofer
14e6992bf4 We also need to pass swifterror in R12 under swiftcc not only under ccc
rdar://28190687

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281138 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-10 14:16:55 +00:00
Matt Arsenault
3843a1382e AMDGPU: Fix immediate folding logic when shrinking instructions
If the literal is being folded into src0, it doesn't matter
if it's an SGPR because it's being replaced with the literal.

Also fixes initially selecting 32-bit versions of some instructions
which also confused commuting.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281117 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-09 23:32:53 +00:00
Hans Wennborg
b6bf353d37 X86: Fold tail calls into conditional branches also for 64-bit (PR26302)
This extends the optimization in r280832 to also work for 64-bit. The only
quirk is that we can't do this for 64-bit Windows (yet).

Differential Revision: https://reviews.llvm.org/D24423

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281113 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-09 22:37:27 +00:00
Matt Arsenault
d5a5e9043a AMDGPU: Run LoadStoreVectorizer pass by default
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281112 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-09 22:29:28 +00:00
Simon Pilgrim
6bca1ae005 [X86][XOP] Fix VPERMIL2PD mask creation on 32-bit targets
Use getConstVector helper to correctly create v2i64/v4i64 constants on 32-bit targets

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281105 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-09 21:47:21 +00:00
Michael Kuperstein
b0cab112ed [X86] Regenerate test. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281099 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-09 21:36:17 +00:00
Arnold Schwaighofer
65ddbeccd7 Create phi nodes for swifterror values at the end of the phi instructions list
ISel makes assumptions about the order of phi nodes.

rdar://28190150

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281095 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-09 21:18:47 +00:00
Justin Lebar
8458ab1e9c [NVPTX] Implement llvm.fabs.f32, llvm.max.f32, etc.
Summary:
Previously these only worked via NVPTX-specific intrinsics.

This change will allow us to convert these target-specific intrinsics
into the general LLVM versions, allowing existing LLVM passes to reason
about their behavior.

It also gets us some minor codegen improvements as-is, from situations
where we canonicalize code into one of these llvm intrinsics.

Reviewers: majnemer

Subscribers: llvm-commits, jholewinski, tra

Differential Revision: https://reviews.llvm.org/D24300

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281092 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-09 21:07:26 +00:00
Wei Ding
8dae05acf4 AMDGPU : Fix mqsad_u32_u8 instruction incorrect data type.
Differential Revision: http://reviews.llvm.org/D23700

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281081 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-09 19:31:51 +00:00
Tom Stellard
6ecd5004b4 AMDGPU/SI: Make sure llvm.amdgcn.implicitarg.ptr() is 8-byte aligned for HSA
Reviewers: arsenm

Subscribers: arsenm, wdng, nhaehnle, llvm-commits

Differential Revision: https://reviews.llvm.org/D24405

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281080 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-09 19:28:00 +00:00
Chris Dewhurst
b68584d6d7 [Sparc][LEON] Removed the parts of the errata fixes implemented using inline assembly as this is not the desired behaviour for end-users. Small change to a unit test to implement this without requiring the inline assembly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281047 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-09 14:16:51 +00:00
James Molloy
b2500d2186 [ARM] ADD with a negative offset can become SUB for free
So model that directly in TTI::getIntImmCost().

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281044 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-09 13:35:36 +00:00
James Molloy
f53c395722 [ARM] icmp %x, -C can be lowered to a simple ADDS or CMN
Tell TargetTransformInfo about this so ConstantHoisting is informed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281043 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-09 13:35:28 +00:00
Simon Pilgrim
31bcc1eeed [SelectionDAG] Ensure DAG::getZeroExtendInReg is called with a scalar type
Fixes issue with rL280927 identified by Mikael Holmén

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281042 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-09 13:31:52 +00:00
James Molloy
5349cafffb [Thumb] Select (CMPZ X, -C) -> (CMPZ (ADDS X, C), 0)
The CMPZ #0 disappears during peepholing, leaving just a tADDi3, tADDi8 or t2ADDri. This avoids having to materialize the expensive negative constant in Thumb-1, and allows a shrinking from a 32-bit CMN to a 16-bit ADDS in Thumb-2.
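
The equality behind this selection is that, in 32-bit wraparound arithmetic, x equals -C exactly when x + C is zero, so the zero flag set by the ADDS answers the same question as the compare against the negative constant. A minimal Python check of that identity follows; the helper names and the sample values are assumptions for illustration and the code does not model the actual instructions or the other flags.

```python
MASK32 = 0xFFFFFFFF


def eq_direct(x: int, c: int) -> bool:
    """Compare x against the 32-bit two's-complement encoding of -c."""
    return (x & MASK32) == (-c & MASK32)


def eq_via_add(x: int, c: int) -> bool:
    """Add c with 32-bit wraparound and test the result for zero."""
    return ((x + c) & MASK32) == 0


samples = [0, 1, 5, 0x7FFFFFFF, 0x80000000, 0xFFFFFFFB, 0xFFFFFFFF]
agree = all(eq_direct(x, c) == eq_via_add(x, c)
            for x in samples for c in (1, 5, 255))
print(agree)
```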

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281040 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-09 12:52:24 +00:00
Tim Northover
3c6f3f0961 GlobalISel: remove G_TYPE and G_PHI
These instructions were only necessary when type information was stored in the
MachineInstr (because only generic MachineInstrs possessed a type). Now that
it's in MachineRegisterInfo, COPY and PHI work fine.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281037 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-09 11:47:31 +00:00
Tim Northover
59282d3fd2 GlobalISel: move type information to MachineRegisterInfo.
We want each register to have a canonical type, which means the best place to
store this is in MachineRegisterInfo rather than on every MachineInstr that
happens to use or define that register.

Most changes following from this are pretty simple (you need an MRI anyway if
you're going to be doing any transformations, so just check the type there).
But legalization doesn't really want to check redundant operands (when, for
example, a G_ADD only ever has one type) so I've made use of MCInstrDesc's
operand type field to encode these constraints and limit legalization's work.

As an added bonus, more validation is possible, both in MachineVerifier and
MachineIRBuilder (coming soon).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281035 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-09 11:46:34 +00:00
Simon Dardis
80f518b791 Revert "[mips] Fix c.<cc>.<fmt> instruction definition."
This reverts commit r281022. Mips buildbot broke, due to unhandled register
class FCC.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281033 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-09 11:06:01 +00:00
Sam Kolton
442deaa85b [AMDGPU] Assembler: rename amd_kernel_code_t asm names according to spec
Summary:
Also removed duplicate code from AMDGPUTargetAsmStreamer.
This change only changes how amd_kernel_code_t is parsed and printed. No variable names are changed.

Reviewers: vpykhtin, tstellarAMD

Subscribers: arsenm, wdng, nhaehnle

Differential Revision: https://reviews.llvm.org/D24296

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281028 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-09 10:08:02 +00:00
James Molloy
c68299484e [Thumb1] Teach optimizeCompareInstr about thumb1 compares
This avoids us doing a completely unneeded "cmp r0, #0" after a flag-setting instruction if we only care about the Z or C flags.

Add LSL/LSR to the whitelist while we're here and add testing. This code could really do with a spring clean.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281027 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-09 09:51:06 +00:00
Simon Dardis
43cc6f8676 [mips] Fix c.<cc>.<fmt> instruction definition.
As part of this effort, remove MipsFCmp nodes and use tablegen
patterns rather than custom lowering through C++.

Unexpectedly, this improves codesize for microMIPS as previous floating
point setcc expansions would materialize 0 and 1 into GPRs before using
the relevant mov[tf].[sd] instruction. Now $zero is used directly.

Reviewers: dsanders, vkalintiris, zoran.jovanovic

Differential Review: https://reviews.llvm.org/D23118


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281022 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-09 09:22:52 +00:00
Chris Dewhurst
847743b22a [Sparc][LEON] Unit test for CASA instruction supported by some LEON processors added.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281021 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-09 09:08:13 +00:00
Craig Topper
a1223781e3 [AVX-512] Add VPCMP instructions to the load folding tables and make them commutable.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281013 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-09 01:36:10 +00:00
Craig Topper
beb82336a0 [AVX-512] Add more integer vector comparison tests with loads. Some of these show opportunities where we can commute to fold loads.
Commutes will be added in a followup commit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281012 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-09 01:36:04 +00:00
Michael Kuperstein
74990a5bca [X86] Add more baseline tests for "irregular" shuffles. NFC.
This adds more tests for shuffles where the output width does not match
the input width and/or the output is generated from more than two inputs.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281005 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-09 00:49:29 +00:00
Hans Wennborg
d837ce2087 Win64: Don't use REX prefix for direct tail calls
The REX prefix should be used on indirect jmps, but not direct ones.
For direct jumps, the unwinder looks at the offset to determine if
it's inside the current function.

Differential Revision: https://reviews.llvm.org/D24359

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281003 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-08 23:35:10 +00:00
Krzysztof Parzyszek
01b876d45e [RDF] Further improve handling of multiple phis reached from shadows
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280987 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-08 20:48:42 +00:00
Krzysztof Parzyszek
3358dc461b [Hexagon] Expand sext- and zextloads of vector types, not just extloads
Recent change exposed this issue, breaking the Hexagon buildbots.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280973 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-08 17:42:14 +00:00
Matt Arsenault
c0196eb442 AMDGPU: Try to commute when selecting s_addk_i32/s_mulk_i32
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280972 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-08 17:35:41 +00:00
Matt Arsenault
d764af3c4e AMDGPU: Support commuting with immediate in src0
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280970 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-08 17:19:29 +00:00
Renato Golin
86159cb9be Revert "[XRay] ARM 32-bit no-Thumb support in LLVM"
And associated commits, as they broke the Thumb bots.

This reverts commit r280935.
This reverts commit r280891.
This reverts commit r280888.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280967 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-08 17:10:39 +00:00