197 Commits

Author SHA1 Message Date
Matt Arsenault
69971dbc30 AMDGPU: SIDebuggerInsertNops preserves CFG
This saves an additional run of the DominatorTree and
MachineLoopInfo

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271444 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-02 00:04:22 +00:00
Matt Arsenault
9d102db2c1 AMDGPU: Remove unused address space
Also return a single StringRef instead of building a string.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271296 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-31 16:57:45 +00:00
Rafael Espindola
ac8db59598 Delete Reloc::Default.
Having an enum member named Default is quite confusing: Is it distinct
from the others?

This patch removes that member and instead uses Optional<Reloc> in
places where we have a user input that still hasn't been maped to the
default value, which is now clear has no be one of the remaining 3
options.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@269988 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-18 22:04:49 +00:00
Matt Arsenault
39107ccf80 AMDGPU: Don't run passes that aren't useful
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@269943 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-18 15:41:07 +00:00
Konstantin Zhuravlyov
2147c01e5a [AMDGPU][NFC] Rename SIInsertNops -> SIDebuggerInsertNops
Differential Revision: http://reviews.llvm.org/D20117


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@269098 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-10 18:33:41 +00:00
Matthias Braun
6a6190de10 CodeGen: Move TargetPassConfig from Passes.h to an own header; NFC
Many files include Passes.h but only a fraction needs to know about the
TargetPassConfig class. Move it into an own header. Also rename
Passes.cpp to TargetPassConfig.cpp while we are at it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@269011 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-10 03:21:59 +00:00
Tom Stellard
66eb4d17bb AMDGPU/SI: Add support for AMD code object version 2.
Summary:
Version 2 is now the default.  If you want to emit version 1, use
the amdgcn--amdhsa-amdcov1 triple.

Reviewers: arsenm, kzhuravl

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D19283

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268647 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-05 17:03:33 +00:00
Tom Stellard
6ab99c7ca6 AMDGPU/SI: Enable the post-ra scheduler
Summary:
This includes a hazard recognizer implementation to replace some of
the hazard handling we had during frame index elimination.

Reviewers: arsenm

Subscribers: qcolombet, arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D18602

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268143 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-30 00:23:06 +00:00
Matt Arsenault
44f78713bc AMDGPU/SI: Move post regalloc run of SIShrinkInstructions
Move to addPreEmitPass. This is so it runs after post-RA
scheduling so we can merge s_nops emitted by the scheduler
and hazard recognizer.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@268095 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-29 20:23:42 +00:00
Konstantin Zhuravlyov
1a459df239 [AMDGPU] Insert nop pass: take care of outstanding feedback
- Switch few loops to range-based for loops
- Fix nop insertion at the end of BB
- Fix formatting
- Check for endpgm

Differential Revision: http://reviews.llvm.org/D19380


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267167 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-22 17:04:51 +00:00
Konstantin Zhuravlyov
5d42fbaf4c [AMDGPU] Add insert nops pass based on subtarget features instead of cl::opt
Also,
- Skip pass if machine module does not have debug info
- Minor comment changes
- Added test

Differential Revision: http://reviews.llvm.org/D19079


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@266626 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-18 16:28:23 +00:00
Matt Arsenault
6d955a1d6d AMDGPU: Run SIFoldOperands after PeepholeOptimizer
PeepholeOptimizer cleans up redundant copies, which makes
the operand folding more effective.

shader-db stats:

Totals:
SGPRS: 34200 -> 34336 (0.40 %)
VGPRS: 22118 -> 21655 (-2.09 %)
Code Size: 632144 -> 633460 (0.21 %) bytes
LDS: 11 -> 11 (0.00 %) blocks
Scratch: 10240 -> 11264 (10.00 %) bytes per wave
Max Waves: 8822 -> 8918 (1.09 %)
Wait states: 0 -> 0 (0.00 %)

Totals from affected shaders:
SGPRS: 7704 -> 7840 (1.77 %)
VGPRS: 5169 -> 4706 (-8.96 %)
Code Size: 234444 -> 235760 (0.56 %) bytes
LDS: 2 -> 2 (0.00 %) blocks
Scratch: 0 -> 1024 (0.00 %) bytes per wave
Max Waves: 1188 -> 1284 (8.08 %)
Wait states: 0 -> 0 (0.00 %)

Increases:
SGPRS: 35 (0.01 %)
VGPRS: 1 (0.00 %)
Code Size: 59 (0.02 %)
LDS: 0 (0.00 %)
Scratch: 1 (0.00 %)
Max Waves: 48 (0.02 %)
Wait states: 0 (0.00 %)

Decreases:
SGPRS: 26 (0.01 %)
VGPRS: 54 (0.02 %)
Code Size: 68 (0.03 %)
LDS: 0 (0.00 %)
Scratch: 0 (0.00 %)
Max Waves: 4 (0.00 %)
Wait states: 0 (0.00 %)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@266378 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-14 21:58:24 +00:00
Tom Stellard
65b55414cc AMDGPU: Add skeleton GlobalIsel implementation
Summary:
This adds the necessary target code to be able to run the ir translator.
Lowering function arguments and returns is a nop and there is no support
for RegBankSelect.

Reviewers: arsenm, qcolombet

Subscribers: arsenm, joker.eph, vkalintiris, llvm-commits

Differential Revision: http://reviews.llvm.org/D19077

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@266356 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-14 19:09:28 +00:00
Nicolai Haehnle
d07340545e AMDGPU: Remove SIFixSGPRLiveRanges pass
Summary:
This pass is unnecessary and overly conservative. It was motivated by
situations like

  def %vreg0:SGPR_32
  ...
if-block:
  ..
  def %vreg1:SGPR_32
  ...
else-block:
  ...
  use %vreg0:SGPR_32
  ...

and similar situations with uses after the non-uniform control flow, where
we are not allowed to assign %vreg0 and %vreg1 to the same physical register,
even though in the original, thread/workitem-based CFG, it looks like the
live ranges of these registers do not overlap.

However, by the time register allocation runs, we have moved to a wave-based
CFG that accurately represents the fact that the wave may run through both
the if- and the else-block. So the live ranges of %vreg0 and %vreg1 already
overlap even without the SIFixSGPRLiveRanges pass.

In addition to proving this change correct, I have tested it with Piglit
and a small number of other tests.

Reviewers: arsenm, tstellarAMD

Subscribers: MatzeB, arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D19041

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@266345 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-14 17:42:29 +00:00
Nicolai Haehnle
f0b7f107b9 AMDGPU: Add SIWholeQuadMode pass
Summary:
Whole quad mode is already enabled for pixel shaders that compute
derivatives, but it must be suspended for instructions that cause a
shader to have side effects (i.e. stores and atomics).

This pass addresses the issue by storing the real (initial) live mask
in a register, masking EXEC before instructions that require exact
execution and (re-)enabling WQM where required.

This pass is run before register coalescing so that we can use
machine SSA for analysis.

The changes in this patch expose a problem with the second machine
scheduling pass: target independent instructions like COPY implicitly
use EXEC when they operate on VGPRs, but this fact is not encoded in
the MIR. This can lead to miscompilation because instructions are
moved past changes to EXEC.

This patch fixes the problem by adding use-implicit operands to
target independent instructions. Some general codegen passes are
relaxed to work with such implicit use operands.

Reviewers: arsenm, tstellarAMD, mareko

Subscribers: MatzeB, arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D18162

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263982 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-21 20:28:33 +00:00
Matt Arsenault
7137d0dce6 AMDGPU: R600 code splitting cleanup
Move a few functions only used by R600 to R600 specific code,
fix header macros to stop using R600, mark classes as final.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263204 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-11 08:00:27 +00:00
Tom Stellard
bf86946b60 AMDGPU: Insert two S_NOP instructions for every high level source statement.
Patch by: Konstantin Zhuravlyov

Summary: Tools, such as debugger, need to pause execution based on user input (i.e. breakpoint). In order to do this, two S_NOP instructions are inserted for each high level source statement: one before first isa instruction of high level source statement, and one after last isa instruction of high level source statement. Further, debugger may replace S_NOP instructions with S_TRAP instructions based on user input.

Reviewers: tstellarAMD, arsenm

Subscribers: echristo, dblaikie, arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D17454

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262579 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-03 03:53:29 +00:00
Tom Stellard
98ef447825 AMDGPU/SI: Detect uniform branches and emit s_cbranch instructions
Reviewers: arsenm

Subscribers: mareko, MatzeB, qcolombet, arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D16603

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260765 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-12 23:45:29 +00:00
Matt Arsenault
85b3e06674 AMDGPU: Initialize SILowerControlFlow
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260645 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-12 02:16:10 +00:00
Tom Stellard
739e22cf9d AMDGPU: Fix ordering of CPU and FS parameters in TargetMachine constructors
Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D16863

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259897 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-05 18:29:17 +00:00
Tom Stellard
35cb73cd06 AMDGPU/SI: Correctly initialize SIInsertWaits pass
Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D16724

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259894 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-05 17:42:38 +00:00
Matt Arsenault
ec856e4504 AMDGPU: Skip promote alloca with no optimizations
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259551 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-02 19:32:42 +00:00
Matt Arsenault
084a86c382 AMDGPU: Fix emitting invalid workitem intrinsics for HSA
The AMDGPUPromoteAlloca pass was emitting the read.local.size
calls, which with HSA was incorrectly selected to reading from
the offset mesa uses off of the kernarg pointer.

Error on intrinsics which aren't supported by HSA, and start
emitting the correct IR to read the workgroup size
out of the dispatch pointer.

Also initialize the pass so it can be tested with opt, and
start moving towards not depending on the subtarget as an
argument.

Start emitting errors for the intrinsics not handled with HSA.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259297 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-30 05:19:45 +00:00
Matt Arsenault
de2c3bc98d AMDGPU: Fix default device handling
When no device name is specified, default to kaveri
for HSA since SI is not supported and it woud fail.

Default to "tahiti" instead of "SI" since these are
effectively the same, and tahiti is an actual device.

Move default device handling to the TargetMachine
rather than the AMDGPUSubtarget. The module ISA version
is computed from the device name provided with the target
machine, so the attributes printed by the AsmPrinter were
inconsistent with those computed in the subtarget.

Also remove DevName field from subtarget since it's redundant
with getCPU() in the superclass.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258901 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-27 02:17:49 +00:00
Tom Stellard
72304925ab AMDGPU/SI: Pass whether to use the SI scheduler via Target Attribute
Summary:
Currently the SI scheduler can be selected via command line option,
but it turned out it would be better if it was selectable via a Target Attribute.

This patch adds "si-scheduler" attribute to the backend.

Reviewers: tstellarAMD, echristo

Subscribers: echristo, arsenm

Differential Revision: http://reviews.llvm.org/D16192

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258386 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-21 04:28:34 +00:00
Tom Stellard
b59d538d00 Correctly initialize SIAnnotateControlFlow
Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D16304

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258319 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-20 15:48:27 +00:00
Nicolai Haehnle
cead1b4a6d AMDGPU/SI: Add SI Machine Scheduler
Summary:
It is off by default, but can be used
with --misched=si

Patch by: Axel Davy

Reviewers: arsenm, tstellarAMD, nhaehnle

Subscribers: nhaehnle, solenskiner, arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D11885

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257609 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-13 16:10:10 +00:00
Tom Stellard
6a0d02e088 AMDGPU/SI: Select constant loads with non-uniform addresses to MUBUF instructions
Summary:
We were previously selecting all constant loads to SMRD instructions and legalizing
the SMRDs with non-uniform addresses during the SIFixSGPRCopesPass.

This new solution is more simple and also generates much better code, because
the instruction selector is able to take advantage of all the MUBUF addressing
modes that are legalization pass wasn't able to.

We also no longer need to generate v_add_* instructions when we
have a uniform pointer and a non-uniform offset, as this is now folded into the
MUBUF instruction during instruction selection.

Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D15425

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255672 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-15 20:55:55 +00:00
Tom Stellard
7d2a810fef AMDGPU/SI: Emit constant arrays in the .text section
Summary:
This allows us to remove the END_OF_TEXT_LABEL hack we had been using
and simplifies the fixups used to compute the address of constant
arrays.

Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D15257

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@255204 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-10 02:13:01 +00:00
Matt Arsenault
956f59ab56 AMDGPU: Remove SIPrepareScratchRegs
It does not work because of emergency stack slots.
This pass was supposed to eliminate dummy registers for the
spill instructions, but the register scavenger can introduce
more during PrologEpilogInserter, so some would end up
left behind if they were needed.

The potential for spilling the scratch resource descriptor
and offset register makes doing something like this
overly complicated. Reserve registers to use for the resource
descriptor and use them directly in eliminateFrameIndex.

Also removes creating another scratch resource descriptor
when directly selecting scratch MUBUF instructions.

The choice of which registers are reserved is temporary.
For now it attempts to pick the next available registers
after the user and system SGPRs.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254329 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-30 21:15:53 +00:00
Matt Arsenault
454a57ccb6 AMDGPU: Add pass to detect used kernel features
Mark kernels that use certain features that require user
SGPRs to support with kernel attributes. We need to know
before instruction selection begins because it impacts
the kernel calling convention lowering.

For now this only detects the workitem intrinsics.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252323 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-06 18:01:57 +00:00
Matt Arsenault
01d92b08c8 AMDGPU: Initialize SIFixSGPRCopies so -print-after works
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251995 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-03 22:30:13 +00:00
Matt Arsenault
e008212ebb AMDGPU: Register some more passes so -print-before works
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@250071 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-12 17:43:59 +00:00
Justin Bogner
b224a73497 CodeGen: print and verify after TargetPassConfig::insertPass by default
In r224059, we started verifying after addPass, but missed doing so on
insertPass. There isn't a good reason for the discrepancy, and
skipping the verifier in these cases causes bugs.

This also exposes a verifier error that was introduced in r249087, but
the verifier doesn't run until after the register coalescer, when the
issue happens to have been resolved. I've skipped the verifier after
SIFixSGPRLiveRangesID to avoid the failures for now and will follow up
with Matt for a proper fix.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@249643 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-08 00:36:22 +00:00
Matt Arsenault
d07019533e AMDGPU: Properly register passes
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@249495 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-07 00:42:53 +00:00
Matt Arsenault
a7474fd041 AMDGPU: Move SIFixSGPRLiveRanges to be a regalloc pass
Replace LiveInterval usage with LiveVariables. LiveIntervals
computes far more information than is needed for this pass
which just needs to find if an SGPR is live out of the
defining block.

LiveIntervals are not usually available that early, requiring
computing them twice which is very expensive. The extra run of
LiveIntervals/LiveVariables/SlotIndexes was costing in total
about 5% of compile time.

Continuing to use LiveIntervals is problematic. It seems
there is an option (early-live-intervals) to run the analysis
about where it should go to avoid recomputing LiveVariables,
but it seems to be completely broken with subreg liveness
enabled. There are also problems from trying to recompute
LiveIntervals since this seems to undo LiveVariables
and clearing kill flags, causing TwoAddressInstructions
to make bad decisions.

Insert the pass right after live variables and preserve it.
The tricky case to worry about might be phis since
LiveVariables doesn't count a register as live out if
in the successor block it is only used in a phi,
but I don't think this is a concern right now
because SIFixSGPRCopies replaces SGPR phis.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@249087 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-01 22:10:03 +00:00
Tom Stellard
1566e71dbd AMDGPU/SI: Use .hsatext section instead of .text for HSA
Reviewers: arsenm, grosbach, rafael

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D12424

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248619 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-25 21:41:28 +00:00
Matt Arsenault
a481a619e7 AMDGPU: Disable some passes that are not meaningful
Don't run passes related to stack maps, garbage collection,
exceptions since these aren't useful for GPUs.

There might be a few more to turn off that I'm less sure about
(e.g. ShrinkWrapping) or I'm not sure how to disable
(SafeStack and StackProtector)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248591 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-25 17:41:20 +00:00
Eric Christopher
973f7aa32a constify the Function parameter to the TTI creation callback and
propagate to all callers/users/etc.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247864 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-16 23:38:13 +00:00
Matt Arsenault
60b173f7cb AMDGPU: Make sure to run verifier after SIFixSGPRLiveRanges
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@245769 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-22 00:19:34 +00:00
Tom Stellard
945ad7d241 AMDGPU: Add pass to lower OpenCL image and sampler arguments.
The pass adds new kernel arguments for image attributes, and
resolves calls to dummy attribute and resource id getter functions.

Patch by: Zoltan Gilian

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244372 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-07 23:19:30 +00:00
Matt Arsenault
3aa0d7cb53 AMDGPU/SI: Fix read2 merging into a super register.
If the read2 produced was supposed to be writing into a
super register, it would use the wrong subregister indices.
Fix this by inserting copies, so we only ever write to a vreg_64.
Run the register coalescer again to clean this up, although this
isn't ideal and often does result in an extra move.

Also remove the assert that offset1 > offset0.

There isn't a real reason to not allow this other than a minor
convenience in the compiler, and it doesn't seem worth the effort
of avoiding it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242174 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-14 17:57:36 +00:00
Mehdi Amini
966e6ca1ac Make TargetTransformInfo keeping a reference to the Module DataLayout
DataLayout is no longer optional. It was initialized with or without
a DataLayout, and the DataLayout when supplied could have been the
one from the TargetMachine.

Summary:
This change is part of a series of commits dedicated to have a single
DataLayout during compilation by using always the one owned by the
module.

Reviewers: echristo

Subscribers: jholewinski, llvm-commits, rafael, yaron.keren

Differential Revision: http://reviews.llvm.org/D11021

From: Mehdi Amini <mehdi.amini@apple.com>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241774 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-09 02:08:42 +00:00
Matt Arsenault
b560f9ca2f AMDGPU: Run SIInsertWaits as pre-emit pass
Running this after the scheduler enables scheduling
waits later so other ALU instructions can run while
this would be waiting.

When combined with enabling the post-RA scheduler, this
gives about a ~20% improvement on sgemm.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@241473 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-06 17:02:20 +00:00
Tom Stellard
953c681473 R600 -> AMDGPU rename
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239657 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-13 03:28:10 +00:00
Tom Stellard
b1162b8d4b Revert "AMDGPU: Add core backend files for R600/SI codegen v6"
This reverts commit 4ea70107c5e51230e9e60f0bf58a0f74aa4885ea.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160303 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-16 18:19:53 +00:00
Tom Stellard
23dc769a9b AMDGPU: Add core backend files for R600/SI codegen v6
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@160270 91177308-0d34-0410-b5e6-96231b3b80d8
2012-07-16 14:17:08 +00:00