Commit Graph

96 Commits

Author SHA1 Message Date
Pete Cooper
6d024c616a Revert "Change memcpy/memset/memmove to have dest and source alignments."
This reverts commit r253511.

This likely broke the bots in
http://lab.llvm.org:8011/builders/clang-ppc64-elf-linux2/builds/20202
http://bb.pgr.jp/builders/clang-3stage-i686-linux/builds/3787

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253543 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-19 05:56:52 +00:00
Pete Cooper
8b170f7f29 Change memcpy/memset/memmove to have dest and source alignments.
Note, this was reviewed (and more details are in) http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html

These intrinsics currently have an explicit alignment argument which is
required to be a constant integer.  It represents the alignment of the
source and dest, and so must be the minimum of those.

This change allows source and dest to each have their own alignments
by using the alignment attribute on their arguments.  The alignment
argument itself is removed.

There are a few places in the code for which the code needs to be
checked by an expert as to whether using only src/dest alignment is
safe.  For those places, they currently take the minimum of src/dest
alignments which matches the current behaviour.

For example, code which used to read:
  call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 500, i32 8, i1 false)
will now read:
  call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 8 %dest, i8* align 8 %src, i32 500, i1 false)

For out of tree owners, I was able to strip alignment from calls using sed by replacing:
  (call.*llvm\.memset.*)i32\ [0-9]*\,\ i1 false\)
with:
  $1i1 false)

and similarly for memmove and memcpy.

I then added back in alignment to test cases which needed it.

A similar commit will be made to clang which actually has many differences in alignment as now
IRBuilder can generate different source/dest alignments on calls.

In IRBuilder itself, a new argument was added.  Instead of calling:
  CreateMemCpy(Dst, Src, getInt64(Size), DstAlign, /* isVolatile */ false)
you now call
  CreateMemCpy(Dst, Src, getInt64(Size), DstAlign, SrcAlign, /* isVolatile */ false)

There is a temporary class (IntegerAlignment) which takes the source alignment and rejects
implicit conversion from bool.  This is to prevent isVolatile here from passing its default
parameter to the source alignment.

Note, changes in future can now be made to codegen.  I didn't change anything here, but this
change should enable better memcpy code sequences.

Reviewed by Hal Finkel.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253511 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-18 22:17:24 +00:00
Dan Gohman
7673d242f7 Use TargetRegisterInfo for printing MachineOperand register comments
Several places in AsmPrinter.cpp print comments describing MachineOperand
registers using MCRegisterInfo, which uses MCOperand-oriented names. This
doesn't work for targets that use virtual registers exclusively, as
WebAssembly does, since virtual registers are represented and printed
differently.

This patch preserves what seems to be the spirit of r229978, avoiding the
use of TM.getSubtargetImpl(), while still using MachineOperand-oriented
printing for MachineOperands.

Differential Revision: http://reviews.llvm.org/D14709


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253338 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-17 16:01:28 +00:00
Tom Stellard
b6cc1fddcd Revert "Remove unnecessary call to getAllocatableRegClass"
This reverts commit r252565.

This also includes the revert of the commit mentioned below in order to
avoid breaking tests in AMDGPU:

Revert "AMDGPU: Set isAllocatable = 0 on VS_32/VS_64"

This reverts commit r252674.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252956 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-12 21:43:25 +00:00
Matt Arsenault
259b76dfea AMDGPU: Set isAllocatable = 0 on VS_32/VS_64
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252674 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-11 00:01:32 +00:00
Tom Stellard
136bd632b6 DAGCombiner: Check shouldReduceLoadWidth before combining (and (load), x) -> extload
Reviewers: resistor, arsenm

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D13805

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252349 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-06 21:58:37 +00:00
Matt Arsenault
ade9b95acb AMDGPU: Create emergency stack slots during frame lowering
Test has a bogus verifier error which will be fixed by later commits.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252327 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-06 18:17:45 +00:00
Matt Arsenault
454a57ccb6 AMDGPU: Add pass to detect used kernel features
Mark kernels that use certain features that require user
SGPRs to support with kernel attributes. We need to know
before instruction selection begins because it impacts
the kernel calling convention lowering.

For now this only detects the workitem intrinsics.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252323 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-06 18:01:57 +00:00
Matt Arsenault
bf0ce512c5 AMDGPU: Hack for VS_32 register pressure
For some reason VS_32 ends up factoring into the pressure heuristics
even though we should never see a virtual register with this class.

When SGPRs are reserved for register spilling, this for some reason
triggers reg-crit scheduling.

Setting isAllocatable = 0 may help with this since that seems to remove
it from the default implementation's generated table.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252321 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-06 17:54:43 +00:00
Tom Stellard
65cad952e4 AMDGPU/SI: Emit HSA kernels with symbol type STT_AMDGPU_HSA_KERNEL
Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D13804

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252291 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-06 11:45:14 +00:00
Peter Collingbourne
5f220beefc DI: Reverse direction of subprogram -> function edge.
Previously, subprograms contained a metadata reference to the function they
described. Because most clients need to get or set a subprogram for a given
function rather than the other way around, this created unneeded inefficiency.

For example, many passes needed to call the function llvm::makeSubprogramMap()
to build a mapping from functions to subprograms, and the IR linker needed to
fix up function references in a way that caused quadratic complexity in the IR
linking phase of LTO.

This change reverses the direction of the edge by storing the subprogram as
function-level metadata and removing DISubprogram's function field.

Since this is an IR change, a bitcode upgrade has been provided.

Fixes PR23367. An upgrade script for textual IR for out-of-tree clients is
attached to the PR.

Differential Revision: http://reviews.llvm.org/D14265

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252219 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-05 22:03:56 +00:00
Matt Arsenault
76b6b15dcd AMDGPU: Fix assert when legalizing atomic operands
The operand layout is slightly different for the atomic
opcodes from the usual MUBUF loads and stores.

This should only fix it on SI/CI. VI is still broken
because it still emits the addr64 replacement.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252140 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-05 02:46:56 +00:00
Matt Arsenault
ecbecea873 AMDGPU: Add missing v2f64 fadd tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252117 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-05 01:03:11 +00:00
Matt Arsenault
142cd116fa AMDGPU: Stop assuming vreg for build_vector
This was causing a variety of test failures when v2i64
is added as a legal type.

SIFixSGPRCopies should correctly handle the case of vector inputs
to a scalar reg_sequence, so this isn't necessary anymore. This
was hiding some deficiencies in how reg_sequence is handled later,
but this shouldn't be a problem anymore since the register class
copy of a reg_sequence is now done before the reg_sequence.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251860 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-02 23:30:48 +00:00
Matt Arsenault
896d6554cb AMDGPU: Error on graphics shaders with HSA
I've found myself pointlessly debugging problems from running
graphics tests with an HSA triple a few times, so stop this from
happening again.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251858 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-02 23:23:02 +00:00
Matt Arsenault
ee4932fd12 AMDGPU: Un XFAIL a test
This should probably be merged with one of the other private memory
tests, but it fails on r600.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251856 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-02 23:15:46 +00:00
Matt Arsenault
bd96659e08 AMDGPU: Distribute SGPR->VGPR copies of REG_SEQUENCE
Make the REG_SEQUENCE be a VGPR, and do the register class
copy first.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251855 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-02 23:15:42 +00:00
Marek Olsak
40e7b16d54 AMDGPU/SI: handle undef for llvm.SI.packf16
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251632 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-29 15:29:09 +00:00
Marek Olsak
595eb2bbd1 AMDGPU/SI: use S_OR for fneg (fabs f32)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251631 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-29 15:29:05 +00:00
Marek Olsak
bbc32e3efd AMDGPU/SI: use S_AND for i1 trunc
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251630 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-29 15:05:03 +00:00
Matt Arsenault
caa831cabd AMDGPU: Fix verifier error in SIFoldOperands
There may be other use operands that also need their kill flags cleared.

This happens in a few tests when SIFoldOperands is moved after
PeepholeOptimizer.

PeepholeOptimizer rewrites cases that look like:
%vreg0 = ...
%vreg1 = COPY %vreg0
use %vreg1<kill>
%vreg2 = COPY %vreg0
use %vreg2<kill>

to use the earlier source to
%vreg0 = ...
use %vreg0
use %vreg0

Currently SIFoldOperands sees the copied registers, so there is
only one use. So far I haven't managed to come up with a test
that currently has multiple uses of a foldable VGPR -> VGPR copy.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@250960 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-21 22:37:50 +00:00
Matt Arsenault
08e1ec066d AMDGPU: Stop reserving v[254:255]
This wasn't doing anything useful. They weren't explicitly used
anywhere, and the RegScavenger ignores reserved registers.

This for some reason caused a random scheduling change in the test.
Getting the check lines to pass is too frustrating, and there's probably
not too much value in checking the vector case's operands N times.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@250794 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-20 03:59:58 +00:00
Matt Arsenault
1349eb7659 DAGCombiner: Don't stop finding better chain on 2 aliases
The comment says this was stopped because it was unlikely to be
profitable. This is not true if you want to combine vector loads
with multiple components.

For a simple case that looks like

t0 = load t0 ...
t1 = load t0 ...
t2 = load t0 ...
t3 = load t0 ...

t4 = store t0:1, t0:1
  t5 = store t4, t1:0
    t6 = store t5, t2:0
	  t7 = store t6, t3:0

We want to get all of these stores onto a chain
that is a TokenFactor of these N loads. This mostly
solves the AMDGPU merge-stores.ll regressions
with -combiner-alias-analysis for merging vector
stores of vector loads.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@250138 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-13 00:49:00 +00:00
Matt Arsenault
c08fe15c4f DAGCombiner: Combine extract_vector_elt from build_vector
This basic combine was surprisingly missing.
AMDGPU legalizes many operations in terms of 32-bit vector components,
so not doing this results in many extra copies and subregister extracts
that need to be cleaned up later.

InstCombine already does this for the hasOneUse case. The target hook
is to fix a handful of tests which break (e.g. ARM/vmov.ll) which turn
from a vector materialize repeated immediate instruction to a constant
vector load with more scalar copies from it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@250129 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-12 23:59:50 +00:00
Matt Arsenault
3f7c35a966 AMDGPU: Use explicit register size indirect pseudos
This stops using an unknown reg class operand.

Currently build_vector selection has a broken looking check
where it tries to use a VGPR reg class and an SGPR one if it
sees an SGPR use.

With the source operand has an explicit VGPR class,
illegal copies will be inserted that SIFixSGPRCopies will take care
of normally later, which will allow removing the weird check
of build_vector users. Without this, when removed v_movrels_b32 would
still be emitted even though all of the values were only stored in
SGPRs.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@249494 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-07 00:42:51 +00:00
Tom Stellard
a152ab62bf AMDGPU/SI: Remove calling convention assertion from LowerFormalArguments()
Summary:
We currently ignore the calling convention, so there is no real reason to
assert on the calling convention of functions.

Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D13367

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@249468 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-06 21:16:34 +00:00
Tom Stellard
5246a3643b AMDGPU/SI: Remove assert from AMDGPUOpenCLImageTypeLowering pass
Summary:
Instead of asserting when the kernel metadata is different than we expect,
we should just skip lowering that function.  This fixes assertion
failures with OpenCL argument metadata from older LLVM releases.

Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D13356

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@249073 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-01 21:16:05 +00:00
Tom Stellard
7fe09ce82f AMDGPU: Add MEM_RAT STORE_TYPED.
v2: Add test (Matt).
    Fix capitalization of isEOP (Matt).
    Move pattern to class parameter (Matt).
    Make the instruction available to Cayman (Matt).
    Change name from MEM_RAT WRITE_TYPED to MEM_RAT STORE_TYPED.

Patch by: Zoltan Gilian

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@249042 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-01 17:51:34 +00:00
Matt Arsenault
3443ffa833 AMDGPU: Fix splitting x16 SMRD loads
When used recursively, this would set the kill flag
on the intermediate step from first splitting
x16 to x8.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248741 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-28 20:54:52 +00:00
Matt Arsenault
33d8695b88 AMDGPU: Fix moving SMRD loads with literal offsets on CI
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248740 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-28 20:54:46 +00:00
Matt Arsenault
445a12ee1c AMDGPU: Add testcases
Make sure we are testing moving users
of the moved and split SMRD loads.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248738 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-28 20:54:38 +00:00
Matt Arsenault
ba1978624f AMDGPU: Cleanup test
Run instnamer on it, and rename check prefix.

This is in preparation for adding new testcases to cover
bugs on other subtargets.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248737 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-28 20:54:32 +00:00
Matt Arsenault
23663b84b9 AMDGPU: Fix sched model for VOP2b instructions
Trying to use the version with the explicit output operand
would complain because of the missing WriteSALU. I'm not sure
why it doesn't complain about this with the implicit VCC def.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248646 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-26 02:25:45 +00:00
Tom Stellard
1566e71dbd AMDGPU/SI: Use .hsatext section instead of .text for HSA
Reviewers: arsenm, grosbach, rafael

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D12424

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248619 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-25 21:41:28 +00:00
Matt Arsenault
0a70893bb8 PeepholeOptimizer: Remove redundant copies
If a virtual register is copied and another copy was already
seen, replace with the previous copy. This only handles the
simplest cases for now.

This pattern shows up from various operand restrictions
AMDGPU has which require inserting copies depending
on the register class of the operands.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248611 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-25 20:22:12 +00:00
Matt Arsenault
b10f8d4469 AMDGPU: Add some more tests for literal operands
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248600 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-25 18:21:47 +00:00
Matt Arsenault
9225f01169 AMDGPU: Handle i64->v2i32 loads/stores in PreprocessISelDAG
This fixes a select error when the i64 source was also
bitcasted to v2i32 in the original source.

Instead of awkwardly trying to select the modified source value and
the store, replace before isel begins.

Uses a worklist to avoid possible problems from mutating the DAG,
although it seems to work OK without it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248589 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-25 17:27:08 +00:00
Matt Arsenault
4fc498f43c AMDGPU: Improve accuracy of instruction rates for VOPC
These were all using the default 32-bit VALU write class,
but the i64/f64 compares are half rate.

I'm not sure this is really correct, because they are still using
the write to VALU write class, even though they really write
to the SALU.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248582 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-25 16:58:25 +00:00
Matt Arsenault
d0edb1f758 AMDGPU: Add s_dcache_* instructions
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248533 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-24 19:52:27 +00:00
Matt Arsenault
1348e9d04d AMDGPU: Add cache invalidation instructions.
These are necessary for implementing mem_fence for
OpenCL 2.0.

The VI assembler tests are disabled since it seems to be
using the wrong encoding or opcode.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248532 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-24 19:52:21 +00:00
Matt Arsenault
b10121bd9d Introduce target hook for optimizing register copies
Allow a target to do something other than search for copies
that will avoid cross register bank copies.

Implement for SI by only rewriting the most basic copies,
so it should look through anything like a subregister extract.

I'm not entirely satisified with this because it seems like
eliminating a reg_sequence that isn't fully used should work
generically for all targets without them having to override
something. However, it seems to be tricky to have a simple
implementation of this without rewriting to invalid  kinds
of subregister copies on some targets.

I'm not sure if there is currently a generic way to easily check
if a subregister index would be valid for the current use.
The current set of TargetRegisterInfo::get*Class functions don't
quite behave like I would expect (e.g. getSubClassWithSubReg
returns the maximal register class rather than the minimal), so
I'm not sure how to make the generic test keep searching if
SrcRC:SrcSubReg is a valid replacement for DefRC:DefSubReg. Making
the default implementation to check for simple copies breaks
a variety of ARM and x86 tests by producing illegal subregister uses.

The ARM tests are not actually changed since it should still be using
the same sharesSameRegisterFile implementation, this just relaxes
them to not check for specific registers.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248478 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-24 08:36:14 +00:00
Matt Arsenault
809a684e81 AMDGPU: Fix printing trailing whitespace for mubuf atomics
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248472 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-24 07:51:17 +00:00
Matt Arsenault
bb9c0afde5 AMDGPU: Reduce number of copies emitted
Instead of always inserting a copy in case
the super register is itself a subregister,
only extract to the super reg class if this is
actually the case.

This shouldn't really change codegen, but
makes looking at the output of SIFixSGPRCopies
easier to read.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248467 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-24 07:16:37 +00:00
Matthias Braun
6404850444 LiveIntervalAnalysis: Avoid multiple connected liveness components
We may have subregister defs which are unused but not discovered and
cleaned up prior to liveness analysis. This creates multiple connected
components in the resulting live range which are forbidden in the
MachineVerifier because they would unnecesarily constrain the register
allocator. Rewrite those dead definitions to define a newly created
virtual register.

Differential Revision: http://reviews.llvm.org/D13035

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248335 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-22 22:37:44 +00:00
Simon Pilgrim
41fc6f18d7 [DAGCombiner] Improve FMA support for interpolation patterns
This patch adds support for combining patterns such as (FMUL(FADD(1.0, x), y)) and (FMUL(FSUB(x, 1.0), y)) to their FMA equivalents.

This is useful in particular for linear interpolation cases such as (FADD(FMUL(x, t), FMUL(y, FSUB(1.0, t))))

Differential Revision: http://reviews.llvm.org/D13003


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248210 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-21 20:32:48 +00:00
Matt Arsenault
3f80d148b1 DAGCombiner: Replace store of FP constant after attemping store merges
If storing multiple FP constants, some subset of the stores
would be replaced with integers due to visit order, so
MergeConsecutiveStores would only partially merge
these.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248169 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-21 15:59:46 +00:00
Matt Arsenault
124f7d9b25 AMDGPU: Add failing testcase for live interval construction
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248067 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-19 00:03:56 +00:00
Tom Stellard
6680fc3579 AMDGPU/SI: Fold operands through REG_SEQUENCE instructions
Summary:
This helps mostly when we use add instructions for address calculations
that contain immediates.

Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D12256

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247157 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-09 15:43:26 +00:00
Matt Arsenault
7e657c85c6 SelectionDAG: Support Expand of f16 extloads
Currently this hits an assert that extload should
always be supported, which assumes integer extloads.

This moves a hack out of SI's argument lowering and
is covered by existing tests.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247113 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-09 01:12:27 +00:00
Matt Arsenault
626338d11e AMDGPU: Handle sub of constant for DS offset folding
sub C, x - > add (sub 0, x), C for DS offsets.

This is mostly to fix regressions that show up when
SeparateConstOffsetFromGEP is enabled.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247054 91177308-0d34-0410-b5e6-96231b3b80d8
2015-09-08 19:34:22 +00:00