Commit Graph

221 Commits

Author SHA1 Message Date
Justin Bogner
b224a73497 CodeGen: print and verify after TargetPassConfig::insertPass by default
In r224059, we started verifying after addPass, but missed doing so on
insertPass. There isn't a good reason for the discrepancy, and
skipping the verifier in these cases causes bugs.

This also exposes a verifier error that was introduced in r249087, but
the verifier doesn't run until after the register coalescer, when the
issue happens to have been resolved. I've skipped the verifier after
SIFixSGPRLiveRangesID to avoid the failures for now and will follow up
with Matt for a proper fix.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@249643 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-08 00:36:22 +00:00
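
The behaviour this commit describes can be pictured with a small model: passes queued through insertPass get the same verifier run as passes added through addPass, unless the caller opts out. This is only a sketch under stated assumptions. PipelineSketch, its methods, and the pass names are hypothetical and do not reproduce the real TargetPassConfig interface.

```cpp
#include <string>
#include <vector>

// Hypothetical model of a pass pipeline; not the LLVM TargetPassConfig API.
class PipelineSketch {
  struct Inserted {
    std::string After; // pass that triggers the insertion
    std::string Pass;  // pass to run right after it
    bool Verify;       // whether to run the verifier afterwards
  };
  std::vector<Inserted> Insertions;
  std::vector<std::string> Order;

public:
  // Record that Pass should run immediately after After.
  void insertPass(const std::string &After, const std::string &Pass,
                  bool Verify = true) {
    Insertions.push_back({After, Pass, Verify});
  }

  // Append Pass, then the verifier, then any passes inserted after Pass.
  // Inserted passes go through the same path, so they are verified too.
  void addPass(const std::string &Pass, bool Verify = true) {
    Order.push_back(Pass);
    if (Verify)
      Order.push_back("verify");
    for (const Inserted &I : Insertions)
      if (I.After == Pass)
        addPass(I.Pass, I.Verify);
  }

  const std::vector<std::string> &order() const { return Order; }
};
```

Passing Verify = false for a single inserted pass models the temporary opt-out mentioned above for SIFixSGPRLiveRangesID.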
Matt Arsenault
2ec781c2fe AMDGPU: Fix missing implicit m0 uses on movrel instructions
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@249577 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-07 17:46:32 +00:00
Matt Arsenault
3eab2d1f26 AMDGPU: Add comment for VOP2b operand class
Because of the constant bus requirement, it is never legal to
use a literal constant for these instructions despite the encoding
allowing it. This was already doing the right thing, but note why.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@249500 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-07 01:36:00 +00:00
Matt Arsenault
d07019533e AMDGPU: Properly register passes
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@249495 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-07 00:42:53 +00:00
Matt Arsenault
3f7c35a966 AMDGPU: Use explicit register size indirect pseudos
This stops using an unknown reg class operand.

Currently build_vector selection has a broken looking check
where it tries to use a VGPR reg class and an SGPR one if it
sees an SGPR use.

When the source operand has an explicit VGPR class,
illegal copies will be inserted that SIFixSGPRCopies will normally
take care of later, which will allow removing the weird check
of build_vector users. Without this change, removing that check
would still emit v_movrels_b32 even though all of the values were
only stored in SGPRs.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@249494 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-07 00:42:51 +00:00
Matt Arsenault
ce4122358f AMDGPU: Remove inferRegClassFromUses / inferRegClassFromDefs
I'm not sure why this would be necessary, and no tests fail with
them removed. Looking at the uses is suspect as well because
the use reg classes will likely change when the users are moved
as a result of moving this instruction.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@249493 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-07 00:42:31 +00:00
Tom Stellard
a152ab62bf AMDGPU/SI: Remove calling convention assertion from LowerFormalArguments()
Summary:
We currently ignore the calling convention, so there is no real reason to
assert on the calling convention of functions.

Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D13367

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@249468 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-06 21:16:34 +00:00
Tom Stellard
63c550368d AMDGPU/SI: Add 64-bit versions of v_nop and v_clrexcp
Summary:
The assembly printing of these is still missing the encoding size
suffix, but this will be fixed in a later commit.

Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D13436

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@249424 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-06 15:57:53 +00:00
Tom Stellard
d0c6de8696 AMDGPU/SI: Add a helper for creating aliases for the _e32 instructions
Summary:
We are currently only using these aliases for VOPC instructions,
but this helper will make it easier to use them everywhere.

These aliases allow for the automatic matching of instructions
with forced 32-bit encoding.  Eventually, we should be able to remove
the custom C++ logic we have for this in the assembler.

Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D13396

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@249330 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-05 17:57:39 +00:00
Tom Stellard
8da370d30b AMDGPU/SI: Remove unused tablegen multiclass
Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D13395

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@249221 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-03 00:29:50 +00:00
Matt Arsenault
29467e755f AMDGPU/SI: Add verifier check for exec reads
Make sure we aren't accidentally not setting
these in the instruction definitions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@249170 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-02 18:58:37 +00:00
Matt Arsenault
fb65d2c241 AMDGPU: Fix unused variable warning in release build
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@249091 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-01 22:40:35 +00:00
Matt Arsenault
a7474fd041 AMDGPU: Move SIFixSGPRLiveRanges to be a regalloc pass
Replace LiveInterval usage with LiveVariables. LiveIntervals
computes far more information than is needed for this pass
which just needs to find if an SGPR is live out of the
defining block.

LiveIntervals are not usually available that early, requiring
computing them twice which is very expensive. The extra run of
LiveIntervals/LiveVariables/SlotIndexes was costing in total
about 5% of compile time.

Continuing to use LiveIntervals is problematic. It seems
there is an option (early-live-intervals) to run the analysis
about where it should go to avoid recomputing LiveVariables,
but it seems to be completely broken with subreg liveness
enabled. There are also problems from trying to recompute
LiveIntervals since this seems to undo LiveVariables
and clear kill flags, causing TwoAddressInstructions
to make bad decisions.

Insert the pass right after live variables and preserve it.
The tricky case to worry about might be phis since
LiveVariables doesn't count a register as live out if
in the successor block it is only used in a phi,
but I don't think this is a concern right now
because SIFixSGPRCopies replaces SGPR phis.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@249087 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-01 22:10:03 +00:00
Matt Arsenault
ac5ec1c051 AMDGPU: Merge if and switch
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@249082 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-01 21:51:59 +00:00
Matt Arsenault
db84edf57d AMDGPU: Remove dead code
There's no point in checking VReg_1 because all uses
of it should already have been removed by SILowerI1Copies.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@249081 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-01 21:51:57 +00:00
Matt Arsenault
ae6db4bdd7 AMDGPU: Make SIInsertWaits about a factor of 4 faster
This was the slowest target custom pass and was spending 80%
of the time in getMinimalPhysRegClass which was called
for every register operand.

Try to use the statically known register class when possible from
the instruction's MCOperandInfo. There are a few pseudo instructions
which are not well behaved with unknown register classes which still
require the expensive physical register class search.

There are a few other possibilities for making this even faster,
such as not inspecting implicit operands. For now those are checked
because it is technically possible to have a scalar load into
exec or vcc which can be implicitly used.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@249079 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-01 21:43:15 +00:00
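
The speedup described above comes from a fast path: use the register class named by the instruction's operand description when there is one, and fall back to the expensive search only when the class is unknown. A minimal sketch of that shape follows. RegClass, OperandInfo, RegClassTable, and the SlowLookups counter are hypothetical stand-ins, not LLVM types.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-ins; none of these are LLVM types.
struct RegClass {
  unsigned SizeInBits;
  unsigned FirstReg;
  unsigned LastReg;
};

struct OperandInfo {
  // Index into the class table when the operand description names a class,
  // or -1 for pseudo instructions whose class is unknown.
  int StaticClassIdx;
};

struct RegClassTable {
  std::vector<RegClass> Classes;
  mutable unsigned SlowLookups = 0; // counts how often the slow path runs

  // Slow path: scan every class for the smallest one containing Reg.
  const RegClass *findMinimalClass(unsigned Reg) const {
    ++SlowLookups;
    const RegClass *Best = nullptr;
    for (const RegClass &RC : Classes) {
      if (Reg < RC.FirstReg || Reg > RC.LastReg)
        continue;
      if (!Best || RC.SizeInBits < Best->SizeInBits)
        Best = &RC;
    }
    return Best;
  }

  // Fast path first: trust the statically known class when it exists.
  const RegClass *classFor(const OperandInfo &Info, unsigned Reg) const {
    if (Info.StaticClassIdx >= 0 &&
        static_cast<std::size_t>(Info.StaticClassIdx) < Classes.size())
      return &Classes[static_cast<std::size_t>(Info.StaticClassIdx)];
    return findMinimalClass(Reg);
  }
};
```

The counter exists only so the sketch can show that operands with a known class never reach the scan.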
Tom Stellard
5246a3643b AMDGPU/SI: Remove assert from AMDGPUOpenCLImageTypeLowering pass
Summary:
Instead of asserting when the kernel metadata is different than we expect,
we should just skip lowering that function.  This fixes assertion
failures with OpenCL argument metadata from older LLVM releases.

Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D13356

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@249073 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-01 21:16:05 +00:00
Tom Stellard
7fe09ce82f AMDGPU: Add MEM_RAT STORE_TYPED.
v2: Add test (Matt).
    Fix capitalization of isEOP (Matt).
    Move pattern to class parameter (Matt).
    Make the instruction available to Cayman (Matt).
    Change name from MEM_RAT WRITE_TYPED to MEM_RAT STORE_TYPED.

Patch by: Zoltan Gilian

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@249042 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-01 17:51:34 +00:00
Tom Stellard
e53ca97014 AMDGPU: Factor out EOP query.
v2: Fix brace placement and capitalization (Matt).

Patch by: Zoltan Gilian

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@249041 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-01 17:51:29 +00:00
Tom Stellard
32dcf8b8e9 AMDGPU/SI: Re-order PreloadedValue enum and number entries based on init order
Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D12451

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248978 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-01 02:02:46 +00:00
Marek Olsak
bc68baa694 AMDGPU/SI: Don't set DATA_FORMAT if ADD_TID_ENABLE is set
to prevent setting a huge stride, because DATA_FORMAT has a different
meaning if ADD_TID_ENABLE is set.

This is a candidate for stable llvm 3.7.

Tested-and-Reviewed-by: Christian König <christian.koenig@amd.com>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248858 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-29 23:37:32 +00:00
Matt Arsenault
e706695c2f AMDGPU: Factor switch into separate function
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248742 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-28 20:54:57 +00:00
Matt Arsenault
3443ffa833 AMDGPU: Fix splitting x16 SMRD loads
When used recursively, this would set the kill flag
on the intermediate step from first splitting
x16 to x8.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248741 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-28 20:54:52 +00:00
Matt Arsenault
33d8695b88 AMDGPU: Fix moving SMRD loads with literal offsets on CI
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248740 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-28 20:54:46 +00:00
Matt Arsenault
9ed2f31125 AMDGPU: Fix splitting SMRD with large offset
The splitting of > 4 dword SMRD instructions
if using an offset in an SGPR instead of an immediate
was not setting the destination register,
resulting in an instruction missing an operand
which would assert later.

Test will be included in a following commit
which fixes a related issue.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248739 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-28 20:54:42 +00:00
Andrew Kaylor
aac3c943f3 Improved the interface of methods commuting operands, improved X86-FMA3 mem-folding&coalescing.
Patch by Slava Klochkov (vyacheslav.n.klochkov@intel.com)

Differential Revision: http://reviews.llvm.org/D11370



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248735 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-28 20:33:22 +00:00
Matt Arsenault
5775e77c0c AMDGPU: Remove hasPostISelHook from most instructions
Since this is only needed for VOP3 and a few other special
case instructions, stop setting it on everything.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248657 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-26 05:06:48 +00:00
Matt Arsenault
4d6cb933eb AMDGPU: Switch over reg class size instead of checking all super classes
This gets isSGPRClass out of my profile of SIFixSGPRCopies.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248656 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-26 04:59:04 +00:00
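
One way to read this change: since a class can only be a subclass of a vector class of the same size, switching on the size selects the single candidate to compare against instead of scanning every vector class. The sketch below illustrates that idea on a toy register file. The bitmask encoding, the two vector classes, and both helper functions are assumptions made for the example, not the code in SIRegisterInfo.

```cpp
#include <cstdint>

// Hypothetical register class: a size plus a bitmask of member registers.
struct RegClass {
  unsigned SizeInBytes;
  std::uint64_t Members;
};

// Toy register file: bits 0-3 are scalar registers, bits 4-7 are vector.
const RegClass VGPR_32{4, 0xF0};
const RegClass VReg_64{8, 0x70}; // first registers of vector pairs
const RegClass *const AllVectorClasses[] = {&VGPR_32, &VReg_64};

// A is a subclass of (or equal to) B if the sizes match and every member
// of A is also a member of B.
inline bool isSubClassEq(const RegClass &A, const RegClass &B) {
  return A.SizeInBytes == B.SizeInBytes && (A.Members & ~B.Members) == 0;
}

// Scan version: compare against every vector class.
inline bool hasVGPRsByScan(const RegClass &RC) {
  for (const RegClass *VC : AllVectorClasses)
    if (isSubClassEq(RC, *VC))
      return true;
  return false;
}

// Switch version: the size picks the only class that could match.
inline bool hasVGPRsBySize(const RegClass &RC) {
  switch (RC.SizeInBytes) {
  case 4:
    return isSubClassEq(RC, VGPR_32);
  case 8:
    return isSubClassEq(RC, VReg_64);
  default:
    return false; // assumption: unknown sizes are treated as non-vector
  }
}
```

Both versions give the same answer on this toy model. The switch version does one comparison per query regardless of how many vector classes exist.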
Matt Arsenault
0a8ee148bb AMDGPU: Don't handle invalid reg classes in helper functions
No tests hit these and it would be better to have checks like
this explicit where they are used.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248655 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-26 04:53:30 +00:00
Saleem Abdulrasool
1af10ebe4d AMDGPU: address -Winconsistent-missing-override
Add missing override.  NFC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248652 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-26 04:34:52 +00:00
Matt Arsenault
32c69dae68 AMDGPU: Set CopyCost of register classes
These require multiple mov instructions to copy,
but the default value is that 1 instruction is needed.
I'm not sure if this actually changes anything.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248651 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-26 04:09:34 +00:00
Matt Arsenault
c03102ae77 AMDGPU: VOP3b definition cleanups
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248647 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-26 02:25:48 +00:00
Matt Arsenault
23663b84b9 AMDGPU: Fix sched model for VOP2b instructions
Trying to use the version with the explicit output operand
would complain because of the missing WriteSALU. I'm not sure
why it doesn't complain about this with the implicit VCC def.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248646 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-26 02:25:45 +00:00
Matt Arsenault
728cde2865 AMDGPU: Construct new buffer instruction when moving SMRD
It's easier to understand creating a full instruction
than the current situation where sometimes a new
instruction is created and sometimes it is awkwardly
mutated in place.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248627 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-25 22:21:19 +00:00
Tom Stellard
1566e71dbd AMDGPU/SI: Use .hsatext section instead of .text for HSA
Reviewers: arsenm, grosbach, rafael

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D12424

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248619 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-25 21:41:28 +00:00
Matt Arsenault
b95c6df392 AMDGPU: Make getNamedOperandIdx declaration readonly
This matches how it is defined in the generated implementation.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248598 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-25 18:09:15 +00:00
Matt Arsenault
a481a619e7 AMDGPU: Disable some passes that are not meaningful
Don't run passes related to stack maps, garbage collection,
exceptions since these aren't useful for GPUs.

There might be a few more to turn off that I'm less sure about
(e.g. ShrinkWrapping) or I'm not sure how to disable
(SafeStack and StackProtector)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248591 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-25 17:41:20 +00:00
Matt Arsenault
9225f01169 AMDGPU: Handle i64->v2i32 loads/stores in PreprocessISelDAG
This fixes a select error when the i64 source was also
bitcasted to v2i32 in the original source.

Instead of awkwardly trying to select the modified source value and
the store, replace before isel begins.

Uses a worklist to avoid possible problems from mutating the DAG,
although it seems to work OK without it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248589 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-25 17:27:08 +00:00
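
The worklist mentioned above is a general pattern: first collect the nodes to rewrite without touching the container, then mutate in a second pass. A minimal sketch follows, assuming a flat vector of nodes in place of a SelectionDAG. Node, preprocess, and the string-typed fields are hypothetical, and only the load case is modeled.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical node; a real SelectionDAG node is far richer.
struct Node {
  std::string Op;
  std::string Type;
};

// Rewrite each i64 load as a v2i32 load followed by a bitcast back to i64.
void preprocess(std::vector<Node> &DAG) {
  // Pass 1: record the positions to rewrite; nothing is mutated here.
  std::vector<std::size_t> Worklist;
  for (std::size_t I = 0; I < DAG.size(); ++I)
    if (DAG[I].Op == "load" && DAG[I].Type == "i64")
      Worklist.push_back(I);

  // Pass 2: mutate from the back so the earlier recorded indices stay valid
  // while new nodes are inserted.
  for (auto It = Worklist.rbegin(); It != Worklist.rend(); ++It) {
    DAG[*It].Type = "v2i32";
    DAG.insert(DAG.begin() + static_cast<std::ptrdiff_t>(*It) + 1,
               Node{"bitcast", "i64"});
  }
}
```

Separating collection from mutation means the loop that finds candidates never iterates over a container that is changing underneath it.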
Matt Arsenault
7a6a7f2409 AMDGPU: Fix recomputing dominator tree unnecessarily
SIFixSGPRCopies does not modify the CFG, but this was
being recomputed before running SIFoldOperands.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248587 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-25 17:21:28 +00:00
Matt Arsenault
323c9fbce2 AMDGPU: Re-justify workaround and fix worked around problem
When buffer resource descriptors were built, the upper two components
of the descriptor were first composed into a 64-bit register because
legalizeOperands assumed all operands had the same register class.
Fix that problem, but keep the workaround. I'm not sure anything
is actually emitting such a REG_SEQUENCE now.

If multiple resource descriptors are set up with different base
pointers, this is copied with a single s_mov_b64. We probably
should fix this better by recognizing a pair of s_mov_b32 later,
but for now delete the dead code.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248585 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-25 17:08:42 +00:00
Matt Arsenault
7ba1878629 AMDGPU: Don't create REG_SEQUENCE with SGPR dest and VGPR sources
This avoids needing to re-legalize the new REG_SEQUENCE.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248584 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-25 17:08:40 +00:00
Matt Arsenault
88ba582891 AMDGPU: Fix not adding exec to defs of cmpx instruction pseudos
This was only set on the final _si/_vi version, but not
on the pseudos most of codegen sees.

No test since these instructions aren't used yet.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248583 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-25 16:58:27 +00:00
Matt Arsenault
4fc498f43c AMDGPU: Improve accuracy of instruction rates for VOPC
These were all using the default 32-bit VALU write class,
but the i64/f64 compares are half rate.

I'm not sure this is really correct, because they are still using
the VALU write class, even though they really write
to the SALU.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248582 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-25 16:58:25 +00:00
Matt Arsenault
29c29e1c0f AMDGPU: Remove unused includes
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248553 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-25 00:28:43 +00:00
Matt Arsenault
d0edb1f758 AMDGPU: Add s_dcache_* instructions
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248533 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-24 19:52:27 +00:00
Matt Arsenault
1348e9d04d AMDGPU: Add cache invalidation instructions.
These are necessary for implementing mem_fence for
OpenCL 2.0.

The VI assembler tests are disabled since it seems to be
using the wrong encoding or opcode.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248532 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-24 19:52:21 +00:00
Matt Arsenault
b10121bd9d Introduce target hook for optimizing register copies
Allow a target to do something other than search for copies
that will avoid cross register bank copies.

Implement for SI by only rewriting the most basic copies,
so it should look through anything like a subregister extract.

I'm not entirely satisfied with this because it seems like
eliminating a reg_sequence that isn't fully used should work
generically for all targets without them having to override
something. However, it seems to be tricky to have a simple
implementation of this without rewriting to invalid kinds
of subregister copies on some targets.

I'm not sure if there is currently a generic way to easily check
if a subregister index would be valid for the current use.
The current set of TargetRegisterInfo::get*Class functions don't
quite behave like I would expect (e.g. getSubClassWithSubReg
returns the maximal register class rather than the minimal), so
I'm not sure how to make the generic test keep searching if
SrcRC:SrcSubReg is a valid replacement for DefRC:DefSubReg. Making
the default implementation to check for simple copies breaks
a variety of ARM and x86 tests by producing illegal subregister uses.

The ARM tests are not actually changed since it should still be using
the same sharesSameRegisterFile implementation, this just relaxes
them to not check for specific registers.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248478 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-24 08:36:14 +00:00
Matt Arsenault
a5e772ea93 AMDGPU: Return after instruction is processed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248476 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-24 07:51:28 +00:00
Matt Arsenault
e7de900cec AMDGPU: Remove another unnecessary check from commuteInstruction
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248475 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-24 07:51:25 +00:00
Matt Arsenault
f248cd4de4 AMDGPU: Add readonly to InstrMapping functions
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248474 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-24 07:51:23 +00:00