190 Commits

Author SHA1 Message Date
Matt Arsenault
c03102ae77 AMDGPU: VOP3b definition cleanups
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248647 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-26 02:25:48 +00:00
Matt Arsenault
23663b84b9 AMDGPU: Fix sched model for VOP2b instructions
Trying to use the version with the explicit output operand
would complain because of the missing WriteSALU. I'm not sure
why it doesn't complain about this with the implicit VCC def.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248646 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-26 02:25:45 +00:00
Matt Arsenault
728cde2865 AMDGPU: Construct new buffer instruction when moving SMRD
It's easier to understand creating a full instruction
than the current situation where sometimes a new
instruction is created and sometimes it is awkwardly
mutated in place.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248627 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-25 22:21:19 +00:00
Tom Stellard
1566e71dbd AMDGPU/SI: Use .hsatext section instead of .text for HSA
Reviewers: arsenm, grosbach, rafael

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D12424

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248619 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-25 21:41:28 +00:00
Matt Arsenault
b95c6df392 AMDGPU: Make getNamedOperandIdx declaration readonly
This matches how it is defined in the generated implementation.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248598 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-25 18:09:15 +00:00
Matt Arsenault
a481a619e7 AMDGPU: Disable some passes that are not meaningful
Don't run passes related to stack maps, garbage collection,
exceptions since these aren't useful for GPUs.

There might be a few more to turn off that I'm less sure about
(e.g. ShrinkWrapping) or I'm not sure how to disable
(SafeStack and StackProtector)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248591 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-25 17:41:20 +00:00
Matt Arsenault
9225f01169 AMDGPU: Handle i64->v2i32 loads/stores in PreprocessISelDAG
This fixes a select error when the i64 source was also
bitcasted to v2i32 in the original source.

Instead of awkwardly trying to select the modified source value and
the store, replace before isel begins.

Uses a worklist to avoid possible problems from mutating the DAG,
although it seems to work OK without it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248589 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-25 17:27:08 +00:00
Matt Arsenault
7a6a7f2409 AMDGPU: Fix recomputing dominator tree unnecessarily
SIFixSGPRCopies does not modify the CFG, but this was
being recomputed before running SIFoldOperands.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248587 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-25 17:21:28 +00:00
Matt Arsenault
323c9fbce2 AMDGPU: Re-justify workaround and fix worked around problem
When buffer resource descriptors were built, the upper two components
of the descriptor were first composed into a 64-bit register because
legalizeOperands assumed all operands had the same register class.
Fix that problem, but keep the workaround. I'm not sure anything
actually is actually emitting such a REG_SEQUENCE now.

If multiple resource descriptors are set up with different base
pointers, this is copied with a single s_mov_b64. We probably
should fix this better by recognizing a pair of s_mov_b32 later,
but for now delete the dead code.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248585 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-25 17:08:42 +00:00
Matt Arsenault
7ba1878629 AMDGPU: Don't create REG_SEQUENCE with SGPR dest and VGPR sources
This avoids needting to re-legalize the new REG_SEQUENCE.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248584 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-25 17:08:40 +00:00
Matt Arsenault
88ba582891 AMDGPU: Fix not adding exec to defs of cmpx instruction pseudos
This was only set on the final _si/_vi version, but not
on the pseudos most of codegen sees.

No test since these instructions aren't used yet.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248583 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-25 16:58:27 +00:00
Matt Arsenault
4fc498f43c AMDGPU: Improve accuracy of instruction rates for VOPC
These were all using the default 32-bit VALU write class,
but the i64/f64 compares are half rate.

I'm not sure this is really correct, because they are still using
the write to VALU write class, even though they really write
to the SALU.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248582 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-25 16:58:25 +00:00
Matt Arsenault
29c29e1c0f AMDGPU: Remove unused includes
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248553 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-25 00:28:43 +00:00
Matt Arsenault
d0edb1f758 AMDGPU: Add s_dcache_* instructions
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248533 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-24 19:52:27 +00:00
Matt Arsenault
1348e9d04d AMDGPU: Add cache invalidation instructions.
These are necessary for implementing mem_fence for
OpenCL 2.0.

The VI assembler tests are disabled since it seems to be
using the wrong encoding or opcode.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248532 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-24 19:52:21 +00:00
Matt Arsenault
b10121bd9d Introduce target hook for optimizing register copies
Allow a target to do something other than search for copies
that will avoid cross register bank copies.

Implement for SI by only rewriting the most basic copies,
so it should look through anything like a subregister extract.

I'm not entirely satisified with this because it seems like
eliminating a reg_sequence that isn't fully used should work
generically for all targets without them having to override
something. However, it seems to be tricky to have a simple
implementation of this without rewriting to invalid  kinds
of subregister copies on some targets.

I'm not sure if there is currently a generic way to easily check
if a subregister index would be valid for the current use.
The current set of TargetRegisterInfo::get*Class functions don't
quite behave like I would expect (e.g. getSubClassWithSubReg
returns the maximal register class rather than the minimal), so
I'm not sure how to make the generic test keep searching if
SrcRC:SrcSubReg is a valid replacement for DefRC:DefSubReg. Making
the default implementation to check for simple copies breaks
a variety of ARM and x86 tests by producing illegal subregister uses.

The ARM tests are not actually changed since it should still be using
the same sharesSameRegisterFile implementation, this just relaxes
them to not check for specific registers.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248478 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-24 08:36:14 +00:00
Matt Arsenault
a5e772ea93 AMDGPU: Return after instruction is processed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248476 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-24 07:51:28 +00:00
Matt Arsenault
e7de900cec AMDGPU: Remove another unnecessary check from commuteInstruction
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248475 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-24 07:51:25 +00:00
Matt Arsenault
f248cd4de4 AMDGPU: Add readonly to InstrMapping functions
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248474 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-24 07:51:23 +00:00
Matt Arsenault
809a684e81 AMDGPU: Fix printing trailing whitespace for mubuf atomics
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248472 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-24 07:51:17 +00:00
Matt Arsenault
bb9c0afde5 AMDGPU: Reduce number of copies emitted
Instead of always inserting a copy in case
the super register is itself a subregister,
only extract to the super reg class if this is
actually the case.

This shouldn't really change codegen, but
makes looking at the output of SIFixSGPRCopies
easier to read.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248467 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-24 07:16:37 +00:00
NAKAMURA Takumi
09c0ea51ca Untabify.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248264 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-22 11:15:07 +00:00
NAKAMURA Takumi
c36e746e98 Reformat blank lines.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248263 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-22 11:14:39 +00:00
NAKAMURA Takumi
6902c8db26 Reformat comment lines.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248262 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-22 11:14:12 +00:00
Matt Arsenault
d89d4bccff AMDGPU: Remove unnecessary check
If the instruction doesn't have enough operands, it
either shouldn't be marked as isCommutable or is malformed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248242 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-22 04:17:45 +00:00
Matt Arsenault
73b56327ef AMDGPU: Move copy handling under switch like other instructions
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248172 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-21 16:27:22 +00:00
Craig Topper
795a06a046 Use makeArrayRef or None to avoid unnecessarily mentioning the ArrayRef type extra times. NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248140 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-21 05:32:41 +00:00
Craig Topper
2e1b727d8d Don't pass StringRefs around by const reference. Pass by value instead per coding standards. NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248136 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-21 00:18:00 +00:00
Matt Arsenault
96465eb3e3 AMDGPU: Remove dead code
getCFGStructurizerRegClass is not used for SI, so
move it into R600 specific stuff.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248087 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-19 06:41:10 +00:00
Eric Christopher
973f7aa32a constify the Function parameter to the TTI creation callback and
propagate to all callers/users/etc.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247864 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-16 23:38:13 +00:00
Sanjay Patel
39490133e4 propagate fast-math-flags on DAG nodes
After D10403, we had FMF in the DAG but disabled by default. Nick reported no crashing errors after some stress testing, 
so I enabled them at r243687. However, Escha soon notified us of a bug not covered by any in-tree regression tests: 
if we don't propagate the flags, we may fail to CSE DAG nodes because differing FMF causes them to not match. There is
one test case in this patch to prove that point.

This patch hopes to fix or leave a 'TODO' for all of the in-tree places where we create nodes that are FMF-capable. I 
did this by putting an assert in SelectionDAG.getNode() to find any FMF-capable node that was being created without FMF
( D11807 ). I then ran all regression tests and test-suite and confirmed that everything passes.

This patch exposes remaining work to get DAG FMF to be fully functional: (1) add the flags to non-binary nodes such as
FCMP, FMA and FNEG; (2) add the flags to intrinsics; (3) use the flags as conditions for transforms rather than the
current global settings.

Differential Revision: http://reviews.llvm.org/D12095



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247815 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-16 16:31:21 +00:00
Daniel Sanders
47b167dd84 Revert r247692: Replace Triple with a new TargetTuple in MCTargetDesc/* and related. NFC.
Eric has replied and has demanded the patch be reverted.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247702 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-15 16:17:27 +00:00
Daniel Sanders
9781f90c7e Re-commit r247683: Replace Triple with a new TargetTuple in MCTargetDesc/* and related. NFC.
Summary:
This is the first patch in the series to migrate Triple's (which are ambiguous)
to TargetTuple's (which aren't).

For the moment, TargetTuple simply passes all requests to the Triple object it
holds. Once it has replaced Triple, it will start to implement the interface in
a more suitable way.

This change makes some changes to the public C++ API. In particular,
InitMCSubtargetInfo(), createMCRelocationInfo(), and createMCSymbolizer()
now take TargetTuples instead of Triples. The other public C++ API's have
been left as-is for the moment to reduce patch size.

This commit also contains a trivial patch to clang to account for the C++ API
change. Thanks go to Pavel Labath for fixing LLDB for me.

Reviewers: rengolin

Subscribers: jyknight, dschuff, arsenm, rampitec, danalbert, srhines, javed.absar, dsanders, echristo, emaste, jholewinski, tberghammer, ted, jfb, llvm-commits, rengolin

Differential Revision: http://reviews.llvm.org/D10969


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247692 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-15 14:08:28 +00:00
Daniel Sanders
a6aa0c3bcc Revert r247684 - Replace Triple with a new TargetTuple ...
LLDB needs to be updated in the same commit.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247686 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-15 13:46:21 +00:00
Daniel Sanders
7b82808e13 Replace Triple with a new TargetTuple in MCTargetDesc/* and related. NFC.
Summary:
This is the first patch in the series to migrate Triple's (which are ambiguous)
to TargetTuple's (which aren't).

For the moment, TargetTuple simply passes all requests to the Triple object it
holds. Once it has replaced Triple, it will start to implement the interface in
a more suitable way.

This change makes some changes to the public C++ API. In particular,
InitMCSubtargetInfo(), createMCRelocationInfo(), and createMCSymbolizer()
now take TargetTuples instead of Triples. The other public C++ API's have
been left as-is for the moment to reduce patch size.

This commit also contains a trivial patch to clang to account for the C++ API
change.

Reviewers: rengolin

Subscribers: jyknight, dschuff, arsenm, rampitec, danalbert, srhines, javed.absar, dsanders, echristo, emaste, jholewinski, tberghammer, ted, jfb, llvm-commits, rengolin

Differential Revision: http://reviews.llvm.org/D10969



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247683 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-15 13:17:40 +00:00
Bruce Mitchener
767c34a919 Fix typos.
Summary: This fixes a variety of typos in docs, code and headers.

Subscribers: jholewinski, sanjoy, arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D12626

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247495 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-12 01:17:08 +00:00
Cong Hou
e5457136e7 Pass BranchProbability/BlockMass by value instead of const& as they are small. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247357 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-10 23:10:42 +00:00
Matt Arsenault
d62d9ca7fa AMDGPU: Simplify debug printing
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247345 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-10 21:51:19 +00:00
Matt Arsenault
3a8d465ddd AMDGPU: Use StringRef value
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247344 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-10 21:51:15 +00:00
Matt Arsenault
3a2cec85a7 AMDGPU/SI: Fix more cases of losing exec operands
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247230 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-10 01:23:28 +00:00
Matt Arsenault
117c014fc5 AMDGPU/SI: Fix creating v_mov_b32s without exec uses
This will be caught by existing tests with a
verifier check to be added in a future commit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247229 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-10 01:06:06 +00:00
Matt Arsenault
92a899b660 AMDGPU: Extract full 64-bit subregister and use subregs
Instead of extracting both 32-bit components from the 128-bit
register. This produces fewer copies and is easier for
the copy peephole optimizer to understand and see the actual uses
as extracts from a reg_sequence.

This avoids needing to handle subregister composing in the
PeepholeOptimizer's ValueTracker for this case.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247162 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-09 17:03:29 +00:00
Matt Arsenault
9a4bcd9558 AMDGPU: Remove unused multiclass argument
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247161 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-09 17:03:18 +00:00
Tom Stellard
6680fc3579 AMDGPU/SI: Fold operands through REG_SEQUENCE instructions
Summary:
This helps mostly when we use add instructions for address calculations
that contain immediates.

Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D12256

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247157 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-09 15:43:26 +00:00
Matt Arsenault
a4f3264d02 AMDGPU: Fix not encoding src2 of VOP3b instructions
Broken by r247074. Should include an assembler test,
but the assembler is currently broken for VOP3b apparently.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247123 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-09 08:39:49 +00:00
Matt Arsenault
7e657c85c6 SelectionDAG: Support Expand of f16 extloads
Currently this hits an assert that extload should
always be supported, which assumes integer extloads.

This moves a hack out of SI's argument lowering and
is covered by existing tests.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247113 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-09 01:12:27 +00:00
Matt Arsenault
fa7378ca6e AMDGPU/SI: Fix input vcc operand for VOP2b instructions
Adds vcc to output string input for e32. Allows option
of using e64 encoding with assembler.

Also fixes these instructions not implicitly reading exec.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247074 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-08 21:15:00 +00:00
Matt Arsenault
55ec9f281c AMDGPU: Mark s_barrier as a high latency instruction
These were marked as WriteSALU, which is low latency.
I'm guessing at the value to use, but it should probably
be considered the highest latency instruction.

I'm not sure this has any actual effect since hasSideEffects
probably is preventing any moving of these.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247060 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-08 19:54:32 +00:00
Matt Arsenault
33f41ffd1f AMDGPU: Fix s_barrier flags
This should be convergent. This is not a
barrier in the isBarrier sense, nor
hasCtrlDep.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247059 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-08 19:54:25 +00:00
Matt Arsenault
626338d11e AMDGPU: Handle sub of constant for DS offset folding
sub C, x - > add (sub 0, x), C for DS offsets.

This is mostly to fix regressions that show up when
SeparateConstOffsetFromGEP is enabled.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247054 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-08 19:34:22 +00:00