Commit Graph

1010 Commits

Author SHA1 Message Date
Nirav Dave
acc2c1d71d Elide stores which are overwritten without being observed.
Summary:
In SelectionDAG, when a store is immediately chained to another store
to the same address, elide the first store as it has no observable
effects. This is causes small improvements dealing with intrinsics
lowered to stores.

Test notes:

* Many testcases overwrite store addresses multiple times and needed
  minor changes, mainly making stores volatile to prevent the
  optimization from optimizing the test away.

* Many X86 test cases optimized out instructions associated with
  associated with va_start.

* Note that test_splat in CodeGen/AArch64/misched-stp.ll no longer has
  dependencies to check and can probably be removed and potentially
  replaced with another test.

Reviewers: rnk, john.brawn

Subscribers: aemerson, rengolin, qcolombet, jyknight, nemanjai, nhaehnle, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D33206

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303198 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-16 19:43:56 +00:00
Dmitry Preobrazhensky
232c3d52ea [AMDGPU][MC] Corrected several VI opcodes to avoid printing _e64
See bug 32936: https://bugs.llvm.org//show_bug.cgi?id=32936

Reviewers: artem.tamazov, vpykhtin

Differential Revision: https://reviews.llvm.org/D33123

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@303070 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-15 14:28:23 +00:00
Changpeng Fang
ed4c8077b0 AMDGPU/SI: Don't promote to vector if the load/store is volatile.
Summary:
  We should not change volatile loads/stores in promoting alloca to vector.

Reviewers:
  arsenm

Differential Revision:
  http://reviews.llvm.org/D33107

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302943 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-12 20:31:12 +00:00
Tom Stellard
593d52aaad AMDGPU: Add lit.local.cfg to disable global-isel tests when global-isel is disabled
This should fix bots broken by r302919.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302928 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-12 17:59:30 +00:00
Tom Stellard
f366f4cc57 AMDGPU/GlobalISel: Mark 32-bit integer constants as legal
Reviewers: arsenm

Reviewed By: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D33115

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302919 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-12 16:46:46 +00:00
Matt Arsenault
9f4e5a06c6 AMDGPU: Remove tfe bit from flat instruction definitions
We don't use it and it was removed in gfx9, and the encoding
bit repurposed.

Additionally actually using it requires changing the output register
class, which wasn't done anyway.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302814 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-11 17:38:33 +00:00
Matt Arsenault
2bbb56fd75 AMDGPU: Pull fneg out of extract_vector_elt
This allows folding source modifiers in more f16 cases.
Makes it easier to select per-component packed neg modifiers.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302813 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-11 17:26:25 +00:00
Dmitry Preobrazhensky
0e72980cc2 [AMDGPU][MC] Corrected v_madak/madmk to avoid printing "_e32" in disassembler output
See bug 32927: https://bugs.llvm.org//show_bug.cgi?id=32927

Reviewers: vpykhtin, artem.tamazov, arsenm

Differential Revision: https://reviews.llvm.org/D32913

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302648 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-10 13:00:28 +00:00
Kannan Narayanan
96d48fac54 [AMDGPU] In the new waitcnt insertion pass, use getHeader
instead of getTopBlock to find the loop header.

Differential Revision: https://reviews.llvm.org/D32831



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302290 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-05 21:10:17 +00:00
Matthias Braun
0cb25a2a10 MIParser/MIRPrinter: Compute block successors if not explicitely specified
- MIParser: If the successor list is not specified successors will be
  added based on basic block operands in the block and possible
  fallthrough.

- MIRPrinter: Adds a new `simplify-mir` option, with that option set:
  Skip printing of block successor lists in cases where the
  parser is guaranteed to reconstruct it. This means we still print the
  list if some successor cannot be determined (happens for example for
  jump tables), if the successor order changes or branch probabilities
  being unequal.

Differential Revision: https://reviews.llvm.org/D31262

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302289 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-05 21:09:30 +00:00
Konstantin Zhuravlyov
d2ff9194d6 AMDGPU/AMDHSA: Set COMPUTE_PGM_RSRC2:LDS_SIZE to 0
This field is populated by the CP

Differential Revision: https://reviews.llvm.org/D32619


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302277 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-05 20:13:55 +00:00
Marek Olsak
24aaeeb480 AMDGPU: GFX9 GS and HS shaders always have the scratch wave offset in SGPR5
Reviewers: arsenm, nhaehnle

Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D32645

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302200 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-04 22:25:20 +00:00
Chad Rosier
fd8f24ed83 [DAGCombine] Transform (fadd A, (fmul B, -2.0)) -> (fsub A, (fadd B, B)).
Differential Revision: http://reviews.llvm.org/D32596

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302153 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-04 14:14:44 +00:00
Matt Arsenault
5c95b810cb AMDGPU: Don't promote alloca to LDS for leaf functions
LDS use in leaf functions not currently handled.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301958 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-02 18:33:18 +00:00
Matt Arsenault
822c8e2ad2 AMDGPU: Make intrinsics speculatable
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301937 91177308-0d34-0410-b5e6-96231b3b80d8
2017-05-02 16:57:44 +00:00
Matt Arsenault
23450e5997 AMDGPU: Fix copies from physical registers in SIFixSGPRCopies
This would assert when there were multiple defs of
a physical register.

We just need to move all of the users of it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301730 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-29 01:26:34 +00:00
Marek Olsak
007530e1a2 AMDGPU: Add new amdgcn.init.exec intrinsics
v2: More tests, bug fixes, cosmetic changes.

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, llvm-commits, t-tye

Differential Revision: https://reviews.llvm.org/D31762

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301677 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-28 20:21:58 +00:00
Konstantin Zhuravlyov
c75cdfc65b AMDGPU: Fix ValueKind code object metadata for images
Differential Revision: https://reviews.llvm.org/D32504


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301360 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-25 20:38:26 +00:00
Matt Arsenault
47fbd9bc5d Revert "StructurizeCFG: Directly invert cmp instructions"
This reverts commit r300732. This breaks a few tests.
I think the problem is related to adding more uses of
the condition that don't yet exist at this point.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301242 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-24 20:25:01 +00:00
Matt Arsenault
38bd5524b0 AMDGPU: Select scratch mubuf offsets when pointer is a constant
In call sequence setups, there may not be a frame index base
and the pointer is a constant offset from the frame
pointer / scratch wave offset register.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301230 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-24 19:40:59 +00:00
Stanislav Mekhanoshin
49a37e6bb1 [AMDGPU] Merge M0 initializations
Merges equivalent initializations of M0 and hoists them into a common
dominator block. Technically the same code can be used with any
register, physical or virtual.

Differential Revision: https://reviews.llvm.org/D32279

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301228 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-24 19:37:54 +00:00
Yaxun Liu
76c532ddba CodeGen: Add a hook for getFenceOperandTy
Currently the operand type for ATOMIC_FENCE assumes value type of a pointer in address space 0.
This is fine for most targets. However for amdgcn target, the size of pointer in address space 0
depends on triple environment. For amdgiz environment, it is 64 bit but for other environment it is
32 bit. On the other hand, amdgcn target expects 32 bit fence operands independent of the target
triple environment. Therefore a hook is need in target lowering for getting the fence operand type.

This patch has no effect on targets other than amdgcn.

Differential Revision: https://reviews.llvm.org/D32186


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301215 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-24 18:26:27 +00:00
Matt Arsenault
666020a37d AMDGPU: Move trap lowering to DAG
Fixes traps in any block besides the entry block,
and fixes depending on a live-in physical register
by using a virtual register copy.

Also happens to stop emitting a nop in the case
debug trap is not supported.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301206 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-24 17:49:13 +00:00
Nicolai Haehnle
c3187b408e AMDGPU: Move v_readlane lane select from VGPR to SGPR
Summary:
Fix a compiler bug when the lane select happens to end up in a VGPR.

Clarify the semantic of the corresponding intrinsic to be that of
the corresponding GLSL: the lane select must be uniform across a
wave front, otherwise results are undefined.

Reviewers: arsenm

Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D32343

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301197 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-24 17:17:36 +00:00
Nicolai Haehnle
1c1f7ef631 AMDGPU: Fix crash when scheduling non-memory SMRD instructions
Summary: Fixes piglit spec/arb_shader_clock/execution/*

Reviewers: arsenm

Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye

Differential Revision: https://reviews.llvm.org/D32345

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301191 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-24 16:53:52 +00:00
David Blaikie
dae36df6c6 Fix test from polluting the source tree
(though this seems like a "does this not crash" test - which isn't very
good. Should be fixed)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301071 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-22 07:53:40 +00:00
Konstantin Zhuravlyov
8c373cc5e6 AMDGPU: Temporarily disable packed inlinable literals (v2f16, v2i16)
Differential Revision: https://reviews.llvm.org/D32361


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301028 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-21 19:45:22 +00:00
Konstantin Zhuravlyov
ac73bb1e6a AMDGPU: Fix S_PACK_HH_B32_B16
- We really ought to zero out lower 16 bits

Differential Revision: https://reviews.llvm.org/D32356


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301026 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-21 19:35:05 +00:00
Yaxun Liu
5255fef87c [AMDGPU] Handle SI_MASKED_UNREACHABLE in instruction emitter
SI_MASKED_UNREACHABLE does not have machine instruction encoding.
It needs special handling in AMDGPUAsmPrinter::EmitInstruction like some
other pseudo instructions.

This patch fixes compilation failure of RadeonRays.

Differential Revision: https://reviews.llvm.org/D32364


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301025 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-21 19:32:02 +00:00
Konstantin Zhuravlyov
989643fa78 AMDGPU: Do not lower fast unsafe div for safe, f32, with fp32 denormals
Differential Revision: https://reviews.llvm.org/D32085


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301023 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-21 19:25:33 +00:00
Yaxun Liu
1baa360f32 CodeGen: Let frame index value type match alloca addr space
Recently alloca address space has been added to data layout. Due to this
change, pointer returned by alloca may have different size as pointer in
address space 0.

However, currently the value type of frame index is assumed to be of the
same size as pointer in address space 0.

This patch fixes that.

Most targets assume alloca returning pointer in address space 0, which
is the default alloca address space. Therefore it is NFC for them.

AMDGCN target with amdgiz environment requires this change since it
assumes alloca returning pointer to addr space 5 and its size is 32,
which is different from the size of pointer in addr space 0 which is 64.

Differential Revision: https://reviews.llvm.org/D32021


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300864 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-20 18:15:34 +00:00
Matt Arsenault
ac9b651ab6 AMDGPU: Custom lower illegal small select types
Promote them to i32 vectors to avoid unpacking and re-packing
the vectors.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300754 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-19 20:53:07 +00:00
Matt Arsenault
0e1e60b73a AMDGPU: Don't emit amd_kernel_code_t for callable functions
This is inserted directly in the text section. The relocation
for the function ends up resolving to the beginning of the
amd_kernel_code_t header rather than the actual function
entry point.

Also skip some of the comments for initialization
that only makes sense for kernels.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300736 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-19 19:38:10 +00:00
Matt Arsenault
fe6b2045f8 StructurizeCFG: Directly invert cmp instructions
The most common case for a branch condition is
a single use compare. Directly invert the branch
predicate rather than adding a lot of xor i1 true
which the DAG will have to fold later.

This produces nicer to read structurizer output.

This produces some random changes in codegen
due to the DAG swapping branch conditions itself,
and then does a poor job of dealing with those
inverts.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300732 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-19 18:29:07 +00:00
Matt Arsenault
902e7e59d1 AMDGPU: Don't align callable functions to 256
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300720 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-19 17:42:39 +00:00
Matt Arsenault
610621c4ba AMDGPU: Change DivergenceAnalysis for function arguments
Stop assuming all functions are kernels.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300719 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-19 17:42:34 +00:00
Matt Arsenault
74ac54ec5b AMDGPU: Use MachineRegisterInfo to find max used register
Avoid looping through program to determine register counts.
This avoids needing to look at regmask operands.

Also fixes some counting errors with flat_scr when there
are no stack objects.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300482 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-17 19:48:30 +00:00
Stanislav Mekhanoshin
d8c6515dbf [AMDGPU] set read_only access qualifier for pointers
If a kernel's pointer argument is known to be readonly
set access qualifier accordingly. This allows RT not to
flush caches before dispatches.

Differential Revision: https://reviews.llvm.org/D32091

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300362 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-14 19:11:40 +00:00
Stanislav Mekhanoshin
b02882850b [AMDGPU] added SIInstrInfo::getAddNoCarry() helper
Addressed rest of post submit comments from D31993.

Differential Revision: https://reviews.llvm.org/D32057

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300288 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-14 00:33:44 +00:00
Konstantin Zhuravlyov
2b1bb6d8f8 AMDGPU/GFX9: Do not use v_pack_b32_f16 when packing
Differential Revision: https://reviews.llvm.org/D31819


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300275 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-13 23:17:00 +00:00
Stanislav Mekhanoshin
881e9f3177 [AMDGPU] Combine DS operations with offsets bigger than byte
In many cases ds operations can be combined even if offsets do not
fit into 8 bit encoding. What it takes is to adjust base address.

Differential Revision: https://reviews.llvm.org/D31993

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300227 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-13 17:53:07 +00:00
Wei Ding
24e7b0fb5c AMDGPU : Fix common dominator of two incoming blocks terminates with uniform branch issue.
Differential Revision: http://reviews.llvm.org/D31350

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300142 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-12 23:51:47 +00:00
Matt Arsenault
ab28f3b39e AMDGPU: Fix invalid copies when copying i1 to phys reg
Insert a VReg_1 virtual register so the i1 workaround pass
can handle it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300113 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-12 21:58:23 +00:00
Stanislav Mekhanoshin
bb9002fbb2 [AMDGPU] Generate range metadata for workitem id
If workgroup size is known inform llvm about range returned by local
id  and local size queries.

Differential Revision: https://reviews.llvm.org/D31804

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300102 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-12 20:48:56 +00:00
Sam Kolton
9a421935fb [AMDGPU] SDWA: make pass global
Summary: Remove checks for basic blocks.

Reviewers: vpykhtin, rampitec, arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye

Differential Revision: https://reviews.llvm.org/D31935

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300040 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-12 09:36:05 +00:00
Kannan Narayanan
d3302ddc52 [AMDGPU] Add a new pass to insert waitcnts. Leave under an option for testing.
Based on comments in https://reviews.llvm.org/D31161.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300023 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-12 03:25:12 +00:00
Matt Arsenault
8c86ad544b AMDGPU: Insert wait at start of callee functions
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300000 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-11 22:29:31 +00:00
Matt Arsenault
56db90276b AMDGPU: Refactor SIMachineFunctionInfo slightly
Prepare for handling non-entry functions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299999 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-11 22:29:28 +00:00
Matt Arsenault
938bfaf893 AMDGPU: Refactor argument lowering
Split into smaller functions and prepare for handling
non-entry functions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299998 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-11 22:29:24 +00:00
Matt Arsenault
651ac56097 AMDGPU: Fix folding reg_sequence into copy to phys reg
This was producing an illegal reg_sequence defining
a physical register with virtual register inputs.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299997 91177308-0d34-0410-b5e6-96231b3b80d8
2017-04-11 22:29:19 +00:00