Commit Graph

17183 Commits

Author SHA1 Message Date
Matt Arsenault
d35aece639 AMDGPU: Implement per-function subtargets
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273940 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-27 20:48:03 +00:00
Matt Arsenault
dca409d5ad AMDGPU: Move subtarget feature checks into passes
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273937 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-27 20:32:13 +00:00
Justin Holewinski
76e2771df0 Only emit extension for zeroext/signext arguments if type is < 32 bits
Reviewers: jingyue, jlebar

Subscribers: jholewinski

Differential Revision: http://reviews.llvm.org/D21756

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273922 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-27 20:22:22 +00:00
Rafael Espindola
bc84e94c11 Teach shouldAssumeDSOLocal about tls.
Fixes a fixme about handling other visibilities.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273921 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-27 20:19:14 +00:00
Matt Arsenault
5123c149e7 AMDGPU: Fix verifier errors with undef vector indices
Also fix pointlessly adding exec to liveins.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273916 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-27 19:57:44 +00:00
Matt Arsenault
bd288e1778 DAGCombiner: Don't narrow volatile vector loads + extract
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273909 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-27 19:31:04 +00:00
Elena Demikhovsky
89fca4c1b1 X86 Lowering - Fixed a crash in ICMP scalar instruction
Fixed a bug in EmitTest() function in combining shl + icmp.

https://llvm.org/bugs/show_bug.cgi?id=28119



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273899 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-27 18:07:16 +00:00
Artur Pilipenko
be0da39a48 Revert -r273892 "Support arbitrary addrspace pointers in masked load/store intrinsics" since some of the clang tests don't expect to see the updated signatures.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273895 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-27 16:54:33 +00:00
Artur Pilipenko
9227558e8e Support arbitrary addrspace pointers in masked load/store intrinsics
This is a resubmittion of 263158 change after fixing the existing problem with intrinsics mangling (see LTO and intrinsics mangling llvm-dev thread for details).

This patch fixes the problem which occurs when loop-vectorize tries to use @llvm.masked.load/store intrinsic for a non-default addrspace pointer. It fails with "Calling a function with a bad signature!" assertion in CallInst constructor because it tries to pass a non-default addrspace pointer to the pointer argument which has default addrspace.

The fix is to add pointer type as another overloaded type to @llvm.masked.load/store intrinsics.

Reviewed By: reames

Differential Revision: http://reviews.llvm.org/D17270


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273892 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-27 16:29:26 +00:00
Simon Pilgrim
17c3914be2 [X86][SSE] Added extra broadcast tests to cover PR28327
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273891 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-27 16:15:37 +00:00
Zhan Jun Liau
ff74d2352e [SystemZ] Avoid generating 2 XOR instructions for (and (xor x, -1), y)
Summary:
Created a pattern to match 64-bit mode (and (xor x, -1), y)
to a shorter sequence of instructions.

Before the change, the canonical form is translated to:
        xihf    %r3, 4294967295
        xilf    %r3, 4294967295
        ngr     %r2, %r3

After the change, the canonical form is translated to:
        ngr     %r3, %r2
        xgr     %r2, %r3

Reviewers: zhanjunl, uweigand

Subscribers: llvm-commits

Author: assem

Committing on behalf of Assem.

Differential Revision: http://reviews.llvm.org/D21693

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273887 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-27 15:55:30 +00:00
Krzysztof Parzyszek
339dc3dc8f [Hexagon] Equally-sized vectors are equivalent in ISel (except vNi1)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273885 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-27 15:08:22 +00:00
Nico Weber
ebad00c746 Revert 273848, it caused PR28329
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273879 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-27 14:36:46 +00:00
Simon Pilgrim
9040edc5de Removed duplicate assertions note
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273874 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-27 13:06:18 +00:00
Hrvoje Varga
b256e8a5b2 [mips][micromips] Implement LD, LLD, LWU, SD, DSRL, DSRL32 and DSRLV instructions
Differential Revision: http://reviews.llvm.org/D16625


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273850 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-27 08:23:28 +00:00
Simon Pilgrim
7c1d489b88 [X86][AVX] Peek through bitcasts to find the source of broadcasts
AVX1 can only broadcast vectors as floats/doubles, so for 256-bit vectors we insert bitcasts if we are shuffling v8i32/v4i64 types. Unfortunately the presence of these bitcasts prevents the current broadcast lowering code from peeking through cases where we have concatenated / extracted vectors to create the 256-bit vectors.

This patch allows us to peek through bitcasts as long as the number of elements doesn't change (i.e. element bitwidth is the same) so the broadcast index is not affected.

Note this bitcast peek is different from the stage later on which doesn't care about the type and is just trying to find a load node.

Differential Revision: http://reviews.llvm.org/D21660

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273848 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-27 07:44:32 +00:00
Rafael Espindola
00fd9cb07c Mips: Fix access to private functions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273843 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-27 03:19:40 +00:00
Jan Vesely
d207fc4c12 AMDGPU/R600: Fix GlobalValue regressions.
Don't cast GV expression to MCSymbolRefExpr. r272705 changed GV to binary
expressions by including offset even if the offset it 0
(we haven't hit this sooner since tested workloads don't include static offsets)
We don't really care about the type of expression, so set it directly.
Fixes: r272705

Consider section relative relocations. Since all const as data is in one boffer section relative is equivalent to abs32.
Fixes: r273166

Differential Revision: http://reviews.llvm.org/D21633

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273785 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-25 18:24:16 +00:00
Konstantin Zhuravlyov
20c7a48718 [AMDGPU] Emit debugger prologue and emit the rest of the debugger fields in the kernel code header
Debugger prologue is emitted if -mattr=+amdgpu-debugger-emit-prologue.

Debugger prologue writes work group IDs and work item IDs to scratch memory at fixed location in the following format:
  - offset 0: work group ID x
  - offset 4: work group ID y
  - offset 8: work group ID z
  - offset 16: work item ID x
  - offset 20: work item ID y
  - offset 24: work item ID z

Set
  - amd_kernel_code_t::debug_wavefront_private_segment_offset_sgpr to scratch wave offset reg
  - amd_kernel_code_t::debug_private_segment_buffer_sgpr to scratch rsrc reg
  - amd_kernel_code_t::is_debug_supported to true if all debugger features are enabled

Differential Revision: http://reviews.llvm.org/D20335


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273769 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-25 03:11:28 +00:00
Tom Stellard
16fa6f1061 AMDGPU/SI: Make sure not to fold offsets into local address space globals
Summary:
Offset folding only works if you are emitting relocations, and we don't
emit relocations for local address space globals.

Reviewers: arsenm, nhaustov

Subscribers: arsenm, llvm-commits, kzhuravl

Differential Revision: http://reviews.llvm.org/D21647

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273765 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-25 01:59:16 +00:00
Matthias Braun
f011e37181 MachineScheduler: Fully compare top/bottom candidates
In bidirectional scheduling this gives more stable results than just
comparing the "reason" fields of the top/bottom node because the reason
field may be higher depending on what other nodes are in the queue.

Differential Revision: http://reviews.llvm.org/D19401

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273755 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-25 00:23:00 +00:00
Matthias Braun
0791b66fef AMDGPU: Define a schedule class for COPY.
COPY was lacking a scheduling class, define it to avoid regressions in
the upcoming change to the bidirectional MachineScheduler. Approved by
tstellar on IRC.

Differential Revision: http://reviews.llvm.org/D21540

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273751 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-24 23:52:11 +00:00
Krzysztof Parzyszek
99719f40ee [Hexagon] Simplify (+fix) instruction selection for indexed loads/stores
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273733 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-24 21:27:17 +00:00
Rafael Espindola
ab8ffadc13 Add support for musl-libc on ARM Linux.
Patch by Lei Zhang!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273726 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-24 21:14:33 +00:00
Rafael Espindola
fddccef6cb Use shouldAssumeDSOLocal in isOffsetFoldingLegal.
This makes it slightly more powerful for dynamic-no-pic.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273704 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-24 18:48:36 +00:00
Kyle Butt
546c8eba34 Codegen: Fix broken assumption in Tail Merge.
Tail merge was making the assumption that a layout successor or
predecessor was always a cfg successor/predecessor. Remove that
assumption. Changes to tests are necessary because the errant cfg edges
were preventing optimizations.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273700 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-24 18:16:36 +00:00
Rafael Espindola
d4e43baeb0 Use FileCheck. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273699 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-24 18:04:39 +00:00
Chad Rosier
238d855bdc [MachineDominatorTree] Add a MDT verifier.
Differential Revision: http://reviews.llvm.org/D21657

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273678 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-24 13:32:22 +00:00
Daniel Sanders
f333491832 [mips] Use --check-prefixes where appropriate. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273669 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-24 12:23:17 +00:00
Matt Arsenault
11c2d4bf28 AMDGPU: Add stub custom CodeGenPrepare pass
This will do various things including ones
CodeGenPrepare does, but with knowledge of uniform
values.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273657 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-24 07:07:55 +00:00
Matt Arsenault
8b2f86f045 AMDGPU: Un-xfail and add tests
Un XFAIL a few tests plus a few more I had lying around
in my tree, which seem to all work now but I don't see tests
that quite test the same things.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273655 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-24 06:58:01 +00:00
Matt Arsenault
9af2418e41 AMDGPU: Remove disable-irstructurizer subtarget feature
The only real reason to use it is for testing, so replace
it with a command line option instead of a potentially function
dependent feature.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273653 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-24 06:30:22 +00:00
Ahmed Bougacha
379e6dc9b4 [ARM] Use aapcs_vfp for ___truncdfhf2 on v7k.
r215348 overrode the f16 libcalls to be soft-float, but
v7k uses the default (hard-float) calling convention.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273631 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-24 00:08:01 +00:00
Kyle Butt
da15505032 Codegen: [X86] preservere memory refs for folded umul_lohi
Memory references were not being propagated for this folded load. This
prevented optimizations like LICM from hoisting the load.

Added test to verify that this allows LICM to proceed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273617 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-23 21:40:35 +00:00
Kyle Butt
ffdf177de1 Codegen: LICM Remove check for exactly 1 register def.
When considering whether to split an instruction with a memory operand
into an explicit load and a register-based instruction, we currently
check that the resulting instruction has exactly 1 def. This prevents 2
important LICM optimizations: compares with memory operands, and double
indirect calls. All the tests and the test-suite pass without the check.
My guess as to original intent is to limit the additional register pressure
created by the new instruction, but given that we only split out a single
register, it is already limited.

The licm-dominance test now checks actual memory loads for hoisting instead of
undef, and it tests compares.
hoist-invariant-load.ll now checks for 2 hoists, the intended hoist, and a bonus
from calling a got-relative function in a loop.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273616 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-23 21:38:49 +00:00
Rafael Espindola
e41c71efea Uses shouldAssumeDSOLocal.
With that SystemZ knows to avoid a GOT for PIE.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273614 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-23 21:18:59 +00:00
Rafael Espindola
205ddae3d3 Convert test to FileCheck.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273609 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-23 20:37:49 +00:00
Michael Kuperstein
f7e90f80e1 [X86] Extract HiPE prologue constants into metadata
X86FrameLowering::adjustForHiPEPrologue() contains a hard-coded offset
into an Erlang Runtime System-internal data structure (the PCB). As the
layout of this data structure is prone to change, this poses problems
for maintaining compatibility.

To address this problem, the compiler can produce this information as
module-level named metadata. For example (where P_NSP_LIMIT is the
offending offset):

!hipe.literals = !{ !2, !3, !4 }
!2 = !{ !"P_NSP_LIMIT", i32 152 }
!3 = !{ !"X86_LEAF_WORDS", i32 24 }
!4 = !{ !"AMD64_LEAF_WORDS", i32 24 }

Patch by Magnus Lang

Differential Revision: http://reviews.llvm.org/D20363


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273593 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-23 18:17:25 +00:00
Pablo Barrio
e8da13b383 [ARM] Lower (select_cc k k (select_cc ~k ~k x)) into (SSAT l_k x)
Summary:
SSAT saturates an integer, making sure that its value lies within
an interval [-k, k]. Since the constant is given to SSAT as the
number of bytes set to one, k + 1 must be a power of 2, otherwise
the optimization is not possible. Also, the select_cc must use <
and > respectively so that they define an interval.

Reviewers: mcrosier, jmolloy, rengolin

Subscribers: aemerson, rengolin, llvm-commits

Differential Revision: http://reviews.llvm.org/D21372

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273581 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-23 16:53:49 +00:00
Artur Pilipenko
525757e9f7 Upgrade other old memset/memcpy signatures in tests causing buildbot failures with rL273568.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273580 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-23 16:34:52 +00:00
Artur Pilipenko
1fa3fc6d89 Fix an old memset signature in 2009-09-01-PostRAProlog.ll test causing a buildbot failure
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273573 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-23 16:07:10 +00:00
Simon Pilgrim
c1faee3baa [X86][AVX512] Added AVX512F vector sign extend tests
Now that Elena has confirmed that PR26474 has been fixed

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273560 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-23 14:01:45 +00:00
Daniel Sanders
43733c2252 [mips] Don't derive the default ABI from the CPU in the backend.
Summary:
The backend has no reason to behave like a driver and should generally do
as it's told (and error out if it can't) instead of trying to figure out
what the API user meant. The default ABI is still derived from the arch
component as a concession to backwards compatibility.

API-users that previously passed an explicit CPU and a triple that was
inconsistent with the CPU (e.g. mips-linux-gnu and mips64r2) may get a
different ABI to what they got before. However, it's expected that there
are no such users on the basis that CodeGen has been asserting that the
triple is consistent with the selected ABI for several releases. API-users
that were consistent or passed '' or 'generic' as the CPU will see no
difference.

Reviewers: sdardis, rafael

Subscribers: rafael, dsanders, sdardis, llvm-commits

Differential Revision: http://reviews.llvm.org/D21466


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273557 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-23 12:42:53 +00:00
Diana Picus
a4a23eae96 [AMDGPU] Remove exit-on-error in test (PR27761)
The exit-on-error flag was necessary in order to avoid an assertion when
handling DYNAMIC_STACKALLOC nodes in SelectionDAGLegalize.

We can avoid the assertion by creating some dummy nodes. This enables us to
remove the exit-on-error flag on the first 2 run lines (SI), but on the third
run line (R600) we would run into another assertion when trying to reserve
indirect registers. This patch also replaces that assertion with an early exit
from the function.

Fixes PR27761.

Differential Revision: http://reviews.llvm.org/D20852

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273550 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-23 09:19:16 +00:00
Craig Topper
12d48c9c94 [AVX512] Remove masked unpack intrinsics and autoupgrade to vectorshuffle and selects.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273543 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-23 07:37:33 +00:00
Matt Arsenault
fddf7f599f AMDGPU: Fix liveness when expanding m0 loop
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273514 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-22 23:40:57 +00:00
Sanjoy Das
700dff7358 [ImplicitNullChecks] Hoist trivial depdendencies if possible
When trying to convert a loading instruction into a FAULTING_LOAD, we
sometimes face code like this:

  if %R10 is not null:
    %R9<def> = MOV32ri Immediate
    %R9<def, tied> = AND32rm %R9, 0x20(%R10)
  else:
    goto TRAP

In these cases we would like to use the AND32rm instruction as the
faulting operation by hoisting the "depedency" def-ing %R9 also above
the control flow, transforming the program into:

  %R9<def> = MOV32ri Immediate
  %R9<def, tied> = FAULTING_LOAD_OP(AND32rm %R9, 0x20(%R10), FailPath: TRAP)

This change teaches ImplicitNullChecks to do the above, when safe.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273501 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-22 22:16:51 +00:00
Rafael Espindola
bf7782c956 Use shouldAssumeDSOLocal.
With this it handle -fPIE.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273499 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-22 22:09:17 +00:00
Changpeng Fang
7cde679f44 AMDGPU/SI: Define an intrinsic to expose ds_swizzle_b32
Reviewers: tstellarAMD, arsenm

Differential Revision: http://reviews.llvm.org/D21533

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273496 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-22 21:33:49 +00:00
Peter Collingbourne
277258478e IR: Introduce Module::global_objects().
This is a convenience iterator that allows clients to enumerate the
GlobalObjects within a Module.

Also start using it in a few places where it is obviously the right thing
to use.

Differential Revision: http://reviews.llvm.org/D21580

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273470 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-22 20:29:42 +00:00