968 Commits

Author SHA1 Message Date
Sjoerd Meijer
c46479857e TargetInstrInfo: add virtual function getInstSizeInBytes
This adds a target hook getInstSizeInBytes to TargetInstrInfo that a lot of
subclasses already implement.

Differential Revision: https://reviews.llvm.org/D22885


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277126 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-29 08:16:16 +00:00
Changpeng Fang
539fec5dc2 AMDGPU/SI: Don't handle a loop if there is no loop at all for a terminator BB.
Differential Revision: http://reviews.llvm.org/D22021

Reviewed by: arsenm

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277073 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-28 23:01:45 +00:00
Matthias Braun
f79c57a412 MachineFunction: Return reference for getFrameInfo(); NFC
getFrameInfo() never returns nullptr so we should use a reference
instead of a pointer.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277017 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-28 18:40:00 +00:00
Wei Ding
ee8c4ca1e1 AMDGPU: Add intrinsics for compare with the full wavefront result
Differential Revision: http://reviews.llvm.org/D22482

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276998 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-28 16:42:13 +00:00
Tom Stellard
04cc0adf58 AMDGPU/SI: Don't use reserved VGPRs for SGPR spilling
Summary:
We were using reserved VGPRs for SGPR spilling and this was causing
some programs with a workgroup size of 1024 to use more than 64
registers, which is illegal.
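
The 64-register limit can be reproduced with a short calculation. This is only a sketch under assumed hardware parameters that the commit message does not state: a wavefront of 64 lanes, 4 SIMDs per compute unit, 256 VGPRs per SIMD lane, and a workgroup that must fit on a single compute unit. The constant and function names are hypothetical.

```python
# Assumed hardware parameters (not taken from the commit message).
WAVE_SIZE = 64             # lanes per wavefront
SIMDS_PER_CU = 4           # SIMD units per compute unit
VGPRS_PER_SIMD_LANE = 256  # VGPR budget shared by all waves on one SIMD


def max_vgprs_per_wave(workgroup_size):
    """Largest per-wave VGPR count that lets the whole workgroup fit on one CU."""
    waves = -(-workgroup_size // WAVE_SIZE)       # ceiling division
    waves_per_simd = -(-waves // SIMDS_PER_CU)    # waves that must share one SIMD
    return VGPRS_PER_SIMD_LANE // waves_per_simd


print(max_vgprs_per_wave(1024))
```

Under these assumptions a workgroup of 1024 needs 16 waves, 4 per SIMD, which leaves 64 VGPRs per wave.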

Reviewers: arsenm, mareko, nhaehnle

Subscribers: nhaehnle, arsenm, llvm-commits, kzhuravl

Differential Revision: https://reviews.llvm.org/D22032

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276980 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-28 14:30:43 +00:00
Nicolai Haehnle
b18ca96c79 AMDGPU: add execfix flag to SI_ELSE
Summary:
SI_ELSE is lowered into two parts:

s_or_saveexec_b64 dst, src (at the start of the basic block)

s_xor_b64 exec, exec, dst (at the end of the basic block)

The idea is that dst contains the exec mask of the preceding IF block. It can
happen that SIWholeQuadMode decides to switch from WQM to Exact mode inside
the basic block that contains SI_ELSE, in which case it introduces an instruction

s_and_b64 exec, exec, s[...]

which masks out bits that can correspond to both the IF and the ELSE paths.
So the resulting sequence must be:

s_or_saveexec_b64 dst, src

s_and_b64 exec, exec, s[...] <-- added by SIWholeQuadMode
s_and_b64 dst, dst, exec <-- added by SILowerControlFlow

s_xor_b64 exec, exec, dst

Whether to add the additional s_and_b64 dst, dst, exec is currently determined
via the ExecModified tracking. With this change, it is instead determined by
an additional flag on SI_ELSE which is set by SIWholeQuadMode.
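
The need for the extra s_and_b64 on dst can be checked by modelling the sequence on plain lane masks. This is a rough sketch, not the pass itself. It assumes the usual semantics (s_or_saveexec_b64 copies exec to dst and then ORs src into exec), and the function and parameter names are hypothetical.

```python
MASK64 = (1 << 64) - 1


def lower_else(exec_then, saved_else, live_mask=None, fix_dst=True):
    """Model the SI_ELSE sequence on 64-bit lane masks; returns the final exec."""
    exec_mask = exec_then & MASK64
    # s_or_saveexec_b64 dst, src  (assumed: dst = exec; exec = src | exec)
    dst = exec_mask
    exec_mask = (saved_else | exec_mask) & MASK64
    if live_mask is not None:
        # s_and_b64 exec, exec, s[...]   <-- added by SIWholeQuadMode
        exec_mask &= live_mask
        if fix_dst:
            # s_and_b64 dst, dst, exec   <-- added by SILowerControlFlow
            dst &= exec_mask
    # s_xor_b64 exec, exec, dst
    exec_mask ^= dst
    return exec_mask


then_lanes, else_lanes, live = 0b0011, 0b1100, 0b0110
print(bin(lower_else(then_lanes, else_lanes)))
print(bin(lower_else(then_lanes, else_lanes, live)))
print(bin(lower_else(then_lanes, else_lanes, live, fix_dst=False)))
```

In the last call dst is left unmasked, so lane 0, an IF lane that the s_and_b64 on exec had already removed, is switched back on by the final s_xor_b64.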

Finally: It also occurred to me that an alternative approach for the long run
is for SILowerControlFlow to unconditionally emit

s_or_saveexec_b64 dst, src

...

s_and_b64 dst, dst, exec
s_xor_b64 exec, exec, dst

and have a pass that detects and cleans up the "redundant AND with exec"
pattern where possible. This could be useful anyway, because we also add
instructions

s_and_b64 vcc, exec, vcc

before s_cbranch_scc (in moveToALU), and those are often redundant. I have
some pending changes to how KILL is lowered that could also benefit from
such a cleanup pass.

In any case, this current patch could help in the short term with the whole
ExecModified business.

Reviewers: tstellarAMD, arsenm

Subscribers: arsenm, llvm-commits, kzhuravl

Differential Revision: https://reviews.llvm.org/D22846

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276972 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-28 11:39:24 +00:00
Matt Arsenault
96ddf547a5 AMDGPU: Turn dead checks into asserts
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276946 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-28 00:32:05 +00:00
Matt Arsenault
b5a809e37c AMDGPU: Remove analyzeImmediate
This no longer uses the more complicated classification
of constants.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276945 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-28 00:32:02 +00:00
Reid Kleckner
12e910f70a Remove MCAsmInfo.h include from TargetOptions.h
TargetOptions wants the ExceptionHandling enum. Move that to
MCTargetOptions.h to avoid transitively including Dwarf.h everywhere in
clang. Now you can add a DWARF tag without a full rebuild of clang
semantic analysis.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276883 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-27 16:03:57 +00:00
Ahmed Bougacha
f15a020711 [GlobalISel] Introduce an instruction selector.
And implement it for AArch64, supporting x/w ADD/OR.

Differential Revision: https://reviews.llvm.org/D22373

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276875 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-27 14:31:55 +00:00
Matt Arsenault
f799c706db AMDGPU: Use rcp for fdiv 1, x with fpmath metadata
Using rcp should be OK for safe math usually, so this
should not be replacing the original fdiv.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276823 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-26 23:25:44 +00:00
Matt Arsenault
8cbfb0914d AMDGPU: Use implicit_def for selecting anyext
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276819 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-26 23:06:33 +00:00
Matt Arsenault
252b5ebfdd AMDGPU/R600: Remove dead custom inserters
The intrinsics for these were removed, so this is dead.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276805 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-26 21:03:38 +00:00
Matt Arsenault
7aeb3e40c1 AMDGPU: Minor AsmPrinter cleanups
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276804 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-26 21:03:36 +00:00
Matt Arsenault
d506595769 AMDGPU: Make AMDGPUMachineFunction fields private
ABIArgOffset is a problem because properly setting the
KernArgSize requires that the reserved area before the
real kernel arguments be correctly aligned, which requires
fixing clover.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276766 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-26 16:45:58 +00:00
Matt Arsenault
ee4cdb7b75 AMDGPU: Add fp legacy instruction intrinsics
This could use some additional optimization work
to use mad/mac legacy.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276764 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-26 16:45:45 +00:00
Jan Vesely
4a44da0c82 AMDGPU: Remove read_workdim intrinsic
Differential revision: https://reviews.llvm.org/D22732

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276682 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-25 20:17:02 +00:00
Matt Arsenault
21e0aa8d55 AMDGPU: Make skip threshold an option
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276680 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-25 19:48:29 +00:00
Matt Arsenault
5895e79530 AMDGPU: Delete dead code
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276675 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-25 19:06:25 +00:00
Joel Jones
8a39975ebd [MC] Provide an MCTargetOptions to implementors of MCAsmBackendCtorTy, NFC
Some targets, notably AArch64 for ILP32, have different relocation encodings
based upon the ABI. This is an enabling change, so a future patch can use the
ABIName from MCTargetOptions to choose which relocations to use. Tested using
check-llvm.

The corresponding change to clang is in: http://reviews.llvm.org/D16538

Patch by: Joel Jones

Differential Revision: https://reviews.llvm.org/D16213


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276654 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-25 17:18:28 +00:00
Matt Arsenault
36cfd1c475 AMDGPU: Delete dead code
This has been dead since r269479

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276518 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-23 07:07:14 +00:00
Tom Stellard
a6b9e20623 Revert "[AMDGPU] Emit read-only data to .rodata for hsa"
This reverts commit r276298.

Data stored in .rodata can have a negative offset from .text, but we
don't support negative values in relocations yet.

This caused a regression in one of the amp conformance tests:
5_Data_Cont/5_2_a_v/5_2_3_m/Assignment/Test.02.01

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276498 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-22 23:46:40 +00:00
Tim Northover
ea26cb1f48 GlobalISel: implement legalization pass, with just one transformation.
This adds the actual MachineLegalizeHelper to do the work and a trivial pass
wrapper that legalizes all instructions in a MachineFunction. Currently the
only transformation supported is splitting up a vector G_ADD into one acting on
smaller vectors.
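
As an illustration of the transformation only (this is not the MachineLegalizeHelper code), splitting a wide vector add into adds on narrower vectors can be sketched on plain Python lists. The function name and the narrow_len parameter are hypothetical.

```python
def split_vector_add(lhs, rhs, narrow_len):
    """Compute lhs + rhs element-wise by adding narrow_len-sized pieces."""
    if len(lhs) != len(rhs):
        raise ValueError("operands must have the same number of elements")
    if narrow_len <= 0 or len(lhs) % narrow_len != 0:
        raise ValueError("narrow_len must evenly divide the vector length")

    result = []
    for start in range(0, len(lhs), narrow_len):
        # Extract one narrow sub-vector from each operand.
        lhs_part = lhs[start:start + narrow_len]
        rhs_part = rhs[start:start + narrow_len]
        # One add on the narrower type.
        narrow_sum = [a + b for a, b in zip(lhs_part, rhs_part)]
        # Concatenate the partial results back into the wide vector.
        result.extend(narrow_sum)
    return result


print(split_vector_add([1, 2, 3, 4], [10, 20, 30, 40], narrow_len=2))
```

The wide result is the same as a single add over all elements; only the size of each individual operation changes.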

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276461 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-22 20:03:43 +00:00
Matt Arsenault
9da217ee1e AMDGPU: Fix groupstaticsize for large LDS
The size can exceed s_movk_i32's limit, and we don't
want to use it this early since it inhibits optimizations.
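
A quick range check shows why a large LDS size does not fit. This sketch assumes that s_movk_i32 takes a 16-bit immediate that is sign-extended to 32 bits; the helper name and the example sizes are hypothetical.

```python
def fits_s_movk_i32(value):
    """True if value is representable as a sign-extended 16-bit immediate."""
    return -(1 << 15) <= value <= (1 << 15) - 1


for lds_bytes in (4096, 32767, 32768, 65536):
    print(lds_bytes, fits_s_movk_i32(lds_bytes))
```

Under that assumption any static LDS size of 32768 bytes or more falls outside the immediate range.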

This should probably be merged to the release branch.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276438 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-22 17:01:33 +00:00
Matt Arsenault
30f0e3e4be AMDGPU: Add HSA dispatch id intrinsic
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276437 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-22 17:01:30 +00:00
Matt Arsenault
c28b821881 AMDGPU: Delete more dead code
Remove dead code from r600 intrinsic removal.
Remove unset members, rename StackSize to be less ambiguous.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276436 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-22 17:01:25 +00:00
Matt Arsenault
7488ab3114 AMDGPU: Fix i1 fp_to_int
R600's i1 fp_to_uint selected but was incorrect according to
what instcombine constant folds to.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276435 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-22 17:01:21 +00:00
Matt Arsenault
7c8be6eeb7 AMDGPU: Don't reinvent transferSuccessorsAndUpdatePHIs
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276434 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-22 17:01:15 +00:00
Konstantin Zhuravlyov
82910c89dd [AMDGPU] Emit read-only data to .rodata for hsa
Differential Revision: https://reviews.llvm.org/D22538


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276298 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-21 15:59:23 +00:00
Konstantin Zhuravlyov
33649d7d0b AMDGPU/SI: Add support for R_AMDGPU_ABS32
Differential Revision: https://reviews.llvm.org/D21646


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276294 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-21 15:29:19 +00:00
Sam Kolton
50e9ffb710 [AMDGPU] Some code cleaning in SIRegisterInfo.td
Reviewers: tstellarAMD, vpykhtin

Subscribers: arsenm, kzhuravl

Differential Revision: https://reviews.llvm.org/D22620

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276274 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-21 13:29:57 +00:00
Matt Arsenault
a9994065f9 AMDGPU: Fix phis from blocks split due to register indexing
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276257 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-21 09:40:57 +00:00
Yaxun Liu
59e8cabf31 AMDGPU: Fix bug causing crash due to invalid opencl version metadata.
Differential Revision: https://reviews.llvm.org/D22526

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276119 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-20 14:38:06 +00:00
Matt Arsenault
63be72069d AMDGPU: Change fdiv lowering based on !fpmath metadata
If 2.5 ulp is acceptable, denormals are not required, and the
operation isn't a reciprocal (which is already handled), replace
it with a faster fdiv.
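
The three conditions can be read as a single predicate. The sketch below only restates the commit message and is not the backend's code; the function and parameter names are hypothetical.

```python
def may_use_fast_fdiv(fpmath_ulp, needs_denormals, is_reciprocal):
    """Decide whether an fdiv may be replaced by the faster lowering.

    fpmath_ulp is the accuracy allowed by !fpmath metadata, or None if absent.
    """
    if is_reciprocal:
        return False  # reciprocals are handled by a separate path
    if needs_denormals:
        return False  # the fast form is only used when denormals are not required
    return fpmath_ulp is not None and fpmath_ulp >= 2.5


print(may_use_fast_fdiv(2.5, needs_denormals=False, is_reciprocal=False))
print(may_use_fast_fdiv(1.0, needs_denormals=False, is_reciprocal=False))
```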

Simplify the lowering tests by using per function
subtarget features.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276051 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-19 23:16:53 +00:00
Davide Italiano
f36cce1574 [AMDGPU] Remove spurious line (should've been removed in r276029).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276030 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-19 21:16:30 +00:00
Davide Italiano
5012465830 [AMDGPU] Remove dead code.
LGTM'd by Matt Arsenault.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276029 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-19 21:10:49 +00:00
Matt Arsenault
1ce58d721f AMDGPU: Only use legal inline immediates with kill pseudo
Only whether the value is negative or positive matters,
so use a constant that doesn't require an instruction to
materialize.

These should really just emit the write exec directly,
but for now stick with the kill pseudo-terminator.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275988 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-19 16:27:56 +00:00
Matt Arsenault
530f0c21c6 AMDGPU/SI: Fix SI scheduler refcount issue
Without this fix, releaseSuccessors when InOrOutBlock is
false could release SUs outside the schedule BasicBlock.

Patch by Axel Davy

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275935 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-19 00:35:22 +00:00
Matt Arsenault
4cead0b564 AMDGPU: Expand register indexing pseudos in custom inserter
This is to help move SILowerControlFlow to before regalloc.
There are a couple of tradeoffs with this. The complete CFG
is visible to more passes, the loop body avoids an extra copy of m0,
vcc isn't required, and immediate offsets can be shrunk into s_movk_i32.

The disadvantage is the register allocator doesn't understand that
the single lane's vector is dead within the loop body, so an extra
register is used to outlive the loop block when expanding the
VGPR -> m0 loop. This also now results in worse waitcnt insertion
before the loop instead of after for pending operations at the point
of the indexing, but that should be fixed by future improvements to
cross block waitcnt insertion.
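
For readers unfamiliar with the VGPR -> m0 loop, the idea can be sketched as a small simulation. It assumes the common pattern in which each iteration reads the index of the first active lane into the scalar m0, services every lane that shares that index, and repeats until no lanes remain. The data layout and names are hypothetical, and this is not the code emitted by the custom inserter.

```python
def indexed_read(regs, lane_indices, exec_mask):
    """Each active lane reads regs[lane_indices[lane]][lane].

    Because m0 is a single scalar, the loop handles one distinct index per
    iteration. Returns ({lane: value}, number_of_iterations).
    """
    result = {}
    remaining = exec_mask
    iterations = 0
    while remaining:
        # Pick the index held by the first (lowest) still-active lane.
        first_lane = (remaining & -remaining).bit_length() - 1
        m0 = lane_indices[first_lane]
        # Find all remaining lanes that want the same index.
        same = 0
        for lane in range(len(lane_indices)):
            if (remaining >> lane) & 1 and lane_indices[lane] == m0:
                same |= 1 << lane
                result[lane] = regs[m0][lane]  # indexed move for this lane
        remaining ^= same  # these lanes are done
        iterations += 1
    return result, iterations


regs = [[r * 10 + lane for lane in range(4)] for r in range(3)]
print(indexed_read(regs, [2, 0, 2, 1], 0b1111))
```

The loop runs once per distinct index among the active lanes, so uniform indices finish in a single iteration.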

v_movreld_b32's operands are now modeled more correctly since vdst
is not a true output. This is kind of a hack to treat vdst as a
use operand. Extra checking is required in the verifier since
I can't seem to get tablegen to emit an implicit operand for a
virtual register.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275934 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-19 00:35:03 +00:00
Matt Arsenault
1b96f3c048 AMDGPU: Remove pointless dyn_cast_or_null
This is already cast above, so it is non-null.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275881 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-18 19:00:07 +00:00
Matt Arsenault
dddc5303e9 AMDGPU: Fix missing switch case warning
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275873 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-18 18:40:51 +00:00
Matt Arsenault
bb09cfd86f AMDGPU: Add intrinsic for s_flbit_i32/v_ffbh_i32
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275871 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-18 18:35:05 +00:00
Matt Arsenault
40ca91a07a AMDGPU/R600: Replace barrier intrinsics
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275870 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-18 18:34:59 +00:00
Matt Arsenault
865e2fa1dc AMDGPU: Remove dead check in AMDGPUPromoteAlloca
This is currently only called with GEP users. A direct
alloca would only happen with current typed pointers
for arrays which are a perverse case.

Also fix crashes on 0 x and 1 x arrays.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275869 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-18 18:34:53 +00:00
Matt Arsenault
797b9ee060 AMDGPU: Remove dead code and redundant check
Non intrinsic calls aren't really handled, and this
IntrinsicInst dyn_cast checks for the function for us.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275868 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-18 18:34:48 +00:00
Nicolai Haehnle
0c05ce4746 AMDGPU: Disable AMDGPUPromoteAlloca pass for shader calling conventions.
Summary:
The work item intrinsics are not available for the shader
calling conventions. And even if we did hook them up most
shader stages have some extra restrictions on the amount
of available LDS.

Reviewers: tstellarAMD, arsenm

Subscribers: nhaehnle, arsenm, llvm-commits, kzhuravl

Differential Revision: https://reviews.llvm.org/D20728

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275779 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-18 09:02:47 +00:00
Yaxun Liu
384c6423e5 Re-commit [AMDGPU] Add metadata for runtime
Attempting to fix lit test failure on ppc.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275676 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-16 05:09:21 +00:00
Matt Arsenault
e066e581b1 AMDGPU: Fix verifier error from partially undef copy
In this situation:

%VGPR2<def> = BUFFER_LOAD_DWORD_OFFSET %SGPR8_SGPR9_SGPR10_SGPR11,
%VGPR7<def,tied3> = V_MAC_F32_e32 %VGPR0<undef>, %VGPR1<kill>, %VGPR7<kill,tied0>, %EXEC<imp-use>
%VGPR3_VGPR4_VGPR5_VGPR6<def> = COPY %VGPR0_VGPR1_VGPR2_VGPR3
%VGPR4<def> = COPY %VGPR2

The copy for VGPR1 -> VGPR4 was an error from reading undefined VGPR1,
but VGPR4 is defined immediately after this copy.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275635 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-15 22:32:02 +00:00
Matt Arsenault
35290cc53d AMDGPU: Remove brev intrinsic
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275620 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-15 21:27:13 +00:00
Matt Arsenault
5fecfa22e5 AMDGPU: Fix TargetPrefix for remaining r600 intrinsics
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275619 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-15 21:27:08 +00:00