Commit Graph

1218 Commits

Author SHA1 Message Date
Matt Arsenault
ac5efca3f0 AMDGPU: Use 1/2pi inline imm on VI
I'm guessing at how it is supposed to be printed

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285490 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-29 04:05:06 +00:00
Tom Stellard
b15bbca10c AMDGPU/SI: Don't use non-0 waitcnt values when waiting on Flat instructions
Summary:
Flat instruction can return out of order, so we need always need to wait
for all the outstanding flat operations.

Reviewers: tony-tye, arsenm

Subscribers: kzhuravl, wdng, nhaehnle, llvm-commits, yaxunl

Differential Revision: https://reviews.llvm.org/D25998

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285479 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-28 23:53:48 +00:00
Matt Arsenault
0f0ebbd2d9 AMDGPU: Fix instruction flags for s_endpgm
Set isReturn, remove hasSideEffects. Also remove
hasCtrlDep, I'm not really sure what that does.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285476 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-28 23:00:38 +00:00
Matt Arsenault
d6028cdcc7 AMDGPU: Add definitions for scalar store instructions
Also add glc bit to the scalar loads since they exist on VI
and change the caching behavior.

This currently has an assembler bug where the glc bit is incorrectly
accepted on SI/CI which do not have it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285463 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-28 21:55:15 +00:00
Matt Arsenault
be4e1c4eb0 AMDGPU: Rename glc operand type
While trying to add the glc bit to SMEM instructions on VI
with the new refactoring I ran into some kind of shadowing
problem for the glc operand when using the pseudoinstruction
as a multiclass parameter.

Everywhere that currently uses it defines the operand to have the same
name as its type, i.e. glc:$glc which works. For some reason now it
conflicts, and its up evaluating to the wrong thing. For the
real encoding classes,

let Inst{16} = !if(ps.has_glc, glc, ?); was not being evaluated
and still visible in the Inst initializer in the expanded td file.
In other cases I got a a different error about an illegal operand
where this was using { 0 } initializer from the bits<1> glc initializer
instead of evaluating it as false in the if.

For consistency all of the operand types should probably
be captialized to avoid conflicting with the variable names
unless somebody has a better idea of how to fix this.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285462 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-28 21:55:08 +00:00
Matt Arsenault
6cabc8f486 AMDGPU: Diagnose using too many SGPRs
This is possible when using inline asm.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285447 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-28 20:31:47 +00:00
Matt Arsenault
2d7bc6b1e1 AMDGPU: Fix using incorrect private resource with no allocation
It's possible to have a use of the private resource descriptor or
scratch wave offset registers even though there are no allocated
stack objects. This would result in continuing to use the maximum
number reserved registers. This could go over the number of SGPRs
available on VI, or violate the SGPR limit requested by
the function attributes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285435 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-28 19:43:31 +00:00
Tom Stellard
a911f5ff01 AMDGPU/SI: Handle hazard with s_rfe_b64
Reviewers: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye

Differential Revision: https://reviews.llvm.org/D25638

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285368 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-27 23:50:21 +00:00
Tom Stellard
8434132101 AMDGPU/SI: Handle hazard with sgpr lane selects for v_{read,write}lane
Reviewers: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, tony-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D25637

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285367 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-27 23:42:29 +00:00
Tom Stellard
0b23373c51 AMDGPU/SI: Fix unused variable warning on non-debug builds
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285363 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-27 23:28:03 +00:00
Tom Stellard
5480a2423d AMDGPU/SI: Handle hazard with > 8 byte VMEM stores
Reviewers: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, tony-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D25577

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285359 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-27 23:05:31 +00:00
Tom Stellard
79758d450e AMDGPU/SI: Handle s_setreg hazard in GCNHazardRecognizer
Reviewers: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye

Differential Revision: https://reviews.llvm.org/D25528

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285338 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-27 20:39:09 +00:00
Nicolai Haehnle
ce0a5230f5 AMDGPU: Fix SILoadStoreOptimizer when writes cannot be merged due register dependencies
Summary:
When finding a match for a merge and collecting the instructions that must
be moved, keep in mind that the instruction we merge might actually use one
of the defs that are being moved.

Fixes piglit spec/arb_enhanced_layouts/execution/component-layout/vs-tcs-load-output[-indirect].

The fact that the ds_read in the test case is not eliminated suggests that
there might be another problem related to alias analysis, but that's a
separate problem: this pass should still work correctly even when earlier
optimization passes missed something or were disabled.

Reviewers: tstellarAMD, arsenm

Subscribers: kzhuravl, wdng, yaxunl, llvm-commits, tony-tye

Differential Revision: https://reviews.llvm.org/D25829

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285273 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-27 08:15:07 +00:00
Yaxun Liu
c93e472923 AMDGPU: Refactor processor definition to use ISA version features
Add missing ISA versions 7.0.2/8.0.4/8.1.0. to backend.

Refactor processor definition to use ISA version features.

Fixed ISA version for stoney.

Based on Laurent Morichetti's patch.

Differential Revision: https://reviews.llvm.org/D25919

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285210 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-26 16:37:56 +00:00
Matt Arsenault
f63894ba9e Reapply "AMDGPU: Don't use offen if it is 0"
This reverts r283003

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285203 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-26 15:08:16 +00:00
Matt Arsenault
44de456371 AMDGPU: Fix counting si_mask_branch as 4 bytes
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285202 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-26 14:53:54 +00:00
Tom Stellard
1a633d108a AMDGPU/SI: Don't emit multi-dword flat memory ops when they might access scratch
Summary:
A single flat memory operations that might access the scratch buffer
can only access MaxPrivateElementSize bytes.

Reviewers: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, tony-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D25788

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285198 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-26 14:38:47 +00:00
Peter Collingbourne
80e2a2f817 Target: Change various section classifiers in TargetLoweringObjectFile to take a GlobalObject.
These functions are about classifying a global which will actually be
emitted, so it does not make sense for them to take a GlobalValue which may
for example be an alias.

Change the Mach-O object writer and the Hexagon, Lanai and MIPS backends to
look through aliases before using TargetLoweringObjectFile interfaces. These
are functional changes but all appear to be bug fixes.

Differential Revision: https://reviews.llvm.org/D25917

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@285006 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-24 19:23:39 +00:00
Nicolai Haehnle
e07033aa42 AMDGPU: Fix Two Address problems with v_movreld
Summary:
The v_movreld machine instruction is used with three operands that are
in a sense tied to each other (the explicit VGPR_32 def and the implicit
VGPR_NN def and use). There is no way to express that using the currently
available operand bits, and indeed there are cases where the Two Address
instructions pass does the wrong thing.

This patch introduces a new set of pseudo instructions that are identical
in intended semantics as v_movreld, but they only have two tied operands.

Having to add a new set of pseudo instructions is admittedly annoying, but
it's a fairly straightforward and solid approach. The only alternative I
see is to try to teach the Two Address instructions pass about Three Address
instructions, and I'm afraid that's trickier and is going to end up more
fragile.

Note that v_movrels does not suffer from this problem, and so this patch
does not touch it.

This fixes several GL45-CTS.shaders.indexing.* tests.

Reviewers: tstellarAMD, arsenm

Subscribers: kzhuravl, wdng, yaxunl, llvm-commits, tony-tye

Differential Revision: https://reviews.llvm.org/D25633

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284980 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-24 14:56:02 +00:00
Konstantin Zhuravlyov
25a30cdacb [AMDGPU] Perform uchar to float combine for ISD::SINT_TO_FP
This will prevent following regression when enabling i16 support (D18049):
  test/CodeGen/AMDGPU/cvt_f32_ubyte.ll

Differential Revision: https://reviews.llvm.org/D25805


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284891 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-21 22:10:03 +00:00
Tom Stellard
a0d7657789 AMDGPU/SI: Fix crash caused by r284267
Reviewers: arsenm, nhaehnle

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, tony-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D25782

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284875 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-21 20:25:11 +00:00
Artem Tamazov
3398735496 [AMDGPU][mc] Fix ds_min/max[_rtn]_f32 - extra source operand removed.
Fixes Bug 28215. Lit tests updated.

Differential Revision: https://reviews.llvm.org/D25837

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284825 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-21 14:49:22 +00:00
Konstantin Zhuravlyov
37962abba1 [AMDGPU] Make note record name a static const member of target streamer
Differential Revision: https://reviews.llvm.org/D25746


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284760 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-20 18:22:36 +00:00
Konstantin Zhuravlyov
e0e811c4ce [AMDGPU] Emit constant address space data in .rodata section and use relocations instead of fixups (amdhsa only)
Differential Revision: https://reviews.llvm.org/D25693


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284759 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-20 18:12:38 +00:00
Sanjay Patel
928f047b68 [Target] remove TargetRecip class; 2nd try
This is a retry of r284495 which was reverted at r284513 due to use-after-scope bugs
caused by faulty usage of StringRef.

This version also renames a pair of functions:
getRecipEstimateDivEnabled()
getRecipEstimateSqrtEnabled()
as suggested by Eric Christopher.

original commit msg:

[Target] remove TargetRecip class; move reciprocal estimate isel functionality to TargetLowering

This is a follow-up to https://reviews.llvm.org/D24816 - where we changed reciprocal estimates to be function attributes
rather than TargetOptions.

This patch is intended to be a structural, but not functional change. By moving all of the
TargetRecip functionality into TargetLowering, we can remove all of the reciprocal estimate
state, shield the callers from the string format implementation, and simplify/localize the
logic needed for a target to enable this.

If a function has a "reciprocal-estimates" attribute, those settings may override the target's
default reciprocal preferences for whatever operation and data type we're trying to optimize.
If there's no attribute string or specific setting for the op/type pair, just use the target
default settings.

As noted earlier, a better solution would be to move the reciprocal estimate settings to IR
instructions and SDNodes rather than function attributes, but that's a multi-step job that
requires infrastructure improvements. I intend to work on that, but it's not clear how long
it will take to get all the pieces in place.

Differential Revision: https://reviews.llvm.org/D25440



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284746 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-20 16:55:45 +00:00
Valery Pykhtin
446cd5eefe [AMDGPU] add fcopysign(f64, f32) pattern
Differential revision: https://reviews.llvm.org/D25827

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284743 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-20 16:17:54 +00:00
Benjamin Kramer
06d5a1641d Do a sweep over move ctors and remove those that are identical to the default.
All of these existed because MSVC 2013 was unable to synthesize default
move ctors. We recently dropped support for it so all that error-prone
boilerplate can go.

No functionality change intended.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284721 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-20 12:20:28 +00:00
Wei Ding
cc8ca50286 AMDGPU : Add a function to enable and disable IEEEBit for SC and shader
respectively.

Differential Revision: http://reviews.llvm.org/D25789

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284655 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-19 22:34:49 +00:00
Krzysztof Parzyszek
735fbf86f3 [AMDGPU] Stop using MCRegisterClass::getSize()
Differential Review: https://reviews.llvm.org/D24675


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284619 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-19 17:40:36 +00:00
Sanjay Patel
bbcb21daf0 revert r284495: [Target] remove TargetRecip class
There's something wrong with the StringRef usage while parsing the attribute string.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284513 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-18 18:36:49 +00:00
Sanjay Patel
5800d6e9a7 [Target] remove TargetRecip class; move reciprocal estimate isel functionality to TargetLowering
This is a follow-up to D24816 - where we changed reciprocal estimates to be function attributes
rather than TargetOptions.

This patch is intended to be a structural, but not functional change. By moving all of the
TargetRecip functionality into TargetLowering, we can remove all of the reciprocal estimate
state, shield the callers from the string format implementation, and simplify/localize the
logic needed for a target to enable this.

If a function has a "reciprocal-estimates" attribute, those settings may override the target's
default reciprocal preferences for whatever operation and data type we're trying to optimize.
If there's no attribute string or specific setting for the op/type pair, just use the target
default settings.

As noted earlier, a better solution would be to move the reciprocal estimate settings to IR
instructions and SDNodes rather than function attributes, but that's a multi-step job that
requires infrastructure improvements. I intend to work on that, but it's not clear how long
it will take to get all the pieces in place.

Differential Revision: https://reviews.llvm.org/D25440


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284495 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-18 17:05:05 +00:00
Konstantin Zhuravlyov
18560f1ee0 [AMDGPU] Mark .note section SHF_ALLOC so lld creates a segment for it
Differential Revision: https://reviews.llvm.org/D25694


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284435 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-17 22:40:15 +00:00
Tom Stellard
0d8c32f076 AMDGPU/SI: LowerParameter() should be computing align based on memory type
Reviewers: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, tony-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D25203

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284398 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-17 16:56:19 +00:00
Tom Stellard
de864533c8 AMDGPU/SI: Fix LowerParameter() for i16 arguments
Summary:
If we are loading an i16 value from a 32-bit memory location, then
we need to be able to truncate the loaded value to i16.

Reviewers: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, tony-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D25198

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284397 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-17 16:21:45 +00:00
Tom Stellard
6b339ba8ac AMDGPU/SI: Handle s_getreg hazard in GCNHazardRecognizer
Reviewers: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye

Differential Revision: https://reviews.llvm.org/D25526

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284298 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-15 00:58:14 +00:00
Tom Stellard
18cea770a0 AMDGPU/SI: Use new SimplifyDemandedBits helper for multi-use operations
Summary:
We are using this helper for our 24-bit arithmetic combines, so we are now able to eliminate multi-use operations that mask the high-bits of 24-bit inputs (e.g. and x, 0xffffff)

Reviewers: arsenm, nhaehnle

Subscribers: tony-tye, arsenm, kzhuravl, wdng, nhaehnle, llvm-commits, yaxunl

Differential Revision: https://reviews.llvm.org/D24672

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284267 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-14 19:14:29 +00:00
Tom Stellard
a9c6165732 AMDGPU/SI: Don't allow unaligned scratch access
Summary: The hardware doesn't support this.

Reviewers: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, llvm-commits, tony-tye

Differential Revision: https://reviews.llvm.org/D25523

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284257 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-14 18:10:39 +00:00
Nicolai Haehnle
877e3beed6 AMDGPU: Select 64-bit {ADD,SUB}{C,E} nodes
Summary:
This will be used for 64-bit MULHU, which is in turn used for the 64-bit
divide-by-constant optimization (see D24822).

Reviewers: arsenm, tstellarAMD

Subscribers: kzhuravl, wdng, yaxunl, llvm-commits, tony-tye

Differential Revision: https://reviews.llvm.org/D25289

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284224 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-14 10:30:00 +00:00
Nicolai Haehnle
5d662038f2 AMDGPU: Fix use-after-frees
Reviewers: arsenm, tstellarAMD

Subscribers: kzhuravl, wdng, yaxunl, tony-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D25312

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284215 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-14 09:03:04 +00:00
Konstantin Zhuravlyov
a91924b28a [AMDGPU] Emit 32-bit lo/hi got and pc relative variant kinds for external and global address space variables
Differential Revision: https://reviews.llvm.org/D25562


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284196 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-14 04:37:34 +00:00
Konstantin Zhuravlyov
84208dd954 [AMDGPU] Add 32-bit lo/hi got and pc relative variant kinds and emit appropriate relocations
Differential Revision: https://reviews.llvm.org/D25548


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284195 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-14 04:21:32 +00:00
Nirav Dave
080559c6d3 Revert "In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled."
This reverts commit r284151 which appears to be triggering a LTO
failures on Hexagon

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284157 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-13 20:23:25 +00:00
Nirav Dave
19dc709f4b In visitSTORE, always use FindBetterChain, rather than only when UseAA is enabled.
Retrying after upstream changes.

   Simplify Consecutive Merge Store Candidate Search

   Now that address aliasing is much less conservative, push through
   simplified store merging search which only checks for parallel stores
   through the chain subgraph. This is cleaner as the separation of
   non-interfering loads/stores from the store-merging logic.

   Whem merging stores, search up the chain through a single load, and
   finds all possible stores by looking down from through a load and a
   TokenFactor to all stores visited. This improves the quality of the
   output SelectionDAG and generally the output CodeGen (with some
   exceptions).

   Additional Minor Changes:

       1. Finishes removing unused AliasLoad code
       2. Unifies the the chain aggregation in the merged stores across
       code paths
       3. Re-add the Store node to the worklist after calling
       SimplifyDemandedBits.
       4. Increase GatherAllAliasesMaxDepth from 6 to 18. That number is
       arbitrary, but seemed sufficient to not cause regressions in
       tests.

   This finishes the change Matt Arsenault started in r246307 and
   jyknight's original patch.

   Many tests required some changes as memory operations are now
   reorderable. Some tests relying on the order were changed to use
   volatile memory operations

   Noteworthy tests:

    CodeGen/AArch64/argument-blocks.ll -
      It's not entirely clear what the test_varargs_stackalign test is
      supposed to be asserting, but the new code looks right.

    CodeGen/AArch64/arm64-memset-inline.lli -
    CodeGen/AArch64/arm64-stur.ll -
    CodeGen/ARM/memset-inline.ll -

      The backend now generates *worse* code due to store merging
      succeeding, as we do do a 16-byte constant-zero store efficiently.

    CodeGen/AArch64/merge-store.ll -
      Improved, but there still seems to be an extraneous vector insert
      from an element to itself?

    CodeGen/PowerPC/ppc64-align-long-double.ll -
      Worse code emitted in this case, due to the improved store->load
      forwarding.

    CodeGen/X86/dag-merge-fast-accesses.ll -
    CodeGen/X86/MergeConsecutiveStores.ll -
    CodeGen/X86/stores-merging.ll -
    CodeGen/Mips/load-store-left-right.ll -
      Restored correct merging of non-aligned stores

    CodeGen/AMDGPU/promote-alloca-stored-pointer-value.ll -
      Improved. Correctly merges buffer_store_dword calls

    CodeGen/AMDGPU/si-triv-disjoint-mem-access.ll -
      Improved. Sidesteps loading a stored value and
      merges two stores

    CodeGen/X86/pr18023.ll -
      This test has been removed, as it was asserting incorrect
      behavior. Non-volatile stores *CAN* be moved past volatile loads,
      and now are.

    CodeGen/X86/vector-idiv.ll -
    CodeGen/X86/vector-lzcnt-128.ll -
      It's basically impossible to tell what these tests are actually
      testing. But, looks like the code got better due to the memory
      operations being recognized as non-aliasing.

    CodeGen/X86/win32-eh.ll -
      Both loads of the securitycookie are now merged.

    CodeGen/AMDGPU/vgpr-spill-emergency-stack-slot-compute.ll -
      This test appears to work but no longer exhibits the spill behavior.

Reviewers: arsenm, hfinkel, tstellarAMD, jyknight, nhaehnle

Subscribers: wdng, nhaehnle, nemanjai, arsenm, weimingz, niravd, RKSimon, aemerson, qcolombet, dsanders, resistor, tstellarAMD, t.p.northover, spatel

Differential Revision: https://reviews.llvm.org/D14834

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284151 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-13 19:20:16 +00:00
Matt Arsenault
c31d80dbf5 AMDGPU: Assume spilling will occur at -O0
Because everything live is spilled at the end of a
block by fast regalloc, assume this will happen and
avoid the copies of the resource descriptor.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284119 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-13 13:10:00 +00:00
Matt Arsenault
3339279d4a AMDGPU: Fix truncate to bool warnings
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284116 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-13 12:45:16 +00:00
Matt Arsenault
7b6a558f24 AMDGPU: Initial implementation of VGPR indexing mode
This is the most basic handling of the indirect access
pseudos using GPR indexing mode. This currently only enables
the mode for a single v_mov_b32 and then disables it.
This is much more complicated to use than the movrel instructions,
so a new optimization pass is probably needed to fold the access
into the uses and keep the mode enabled for them.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284031 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-12 18:49:05 +00:00
Matt Arsenault
7e2ade4213 AMDGPU: Add instruction definitions for VGPR indexing
VI added a second method of indexing into VGPRs
besides using v_movrel*

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284027 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-12 18:00:51 +00:00
Tom Stellard
6589a4ece1 AMDGPU/SI: Change mimg intrinsic signatures
This makes more fields overridable and removes redundant bits.

Patch by: Changpeng Fang

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@284024 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-12 16:35:29 +00:00
Konstantin Zhuravlyov
c7a23a58d8 [AMDGPU] Refactor waitcnt encoding
- Refactor bit packing/unpacking
- Calculate bit mask given bit shift and bit width
- Introduce function for decoding bits of waitcnt
- Introduce function for encoding bits of waitcnt
- Introduce function for getting waitcnt mask (instead of using bare numbers)
- Introduce function fot getting max waitcnt(s) (instead of using bare numbers)

Differential Revision: https://reviews.llvm.org/D25298


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283919 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-11 18:58:22 +00:00
Changpeng Fang
2683a2de40 AMDGPU/SI: Update ISA version numbers for Tonga and Polaris10/11.
Differential Revision:
  http://reviews.llvm.org/D25454

Reviewers:
  tstellarAMD

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283893 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-11 16:00:47 +00:00