Commit Graph

1613 Commits

Author SHA1 Message Date
Stanislav Mekhanoshin
ee8b410dd7 [AMDGPU] Add address space based alias analysis pass
This is direct port of HSAILAliasAnalysis pass, just cleaned for
style and renamed.

Differential Revision: https://reviews.llvm.org/D31103

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298172 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-17 23:56:58 +00:00
Matt Arsenault
6cf5553d32 AMDGPU: Fix broken condition in hazard recognizer
Fixes bug 32248.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298125 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-17 21:36:28 +00:00
Matt Arsenault
a754032661 AMDGPU: Fix handling of constant phi input loop conditions
If the loop condition was an i1 phi with a constantexpr input, this
would add a loop intrinsic fed by a phi dependent on a call to
if.break in the same block. Insert the call in the loop header.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298121 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-17 20:52:21 +00:00
Matt Arsenault
33eb1b078c AMDGPU: Cleanup control flow intrinsics
Move backend internal intrinsics along with the rest of the
normal intrinsics, and use the Intrinsic::getDeclaration
API instead of manually constructing the type list.

It's surprising this was working before. fdiv.fast had
the wrong number of parameters. The control flow intrinsic
declaration attributes were not being applied, and
their types were inconsistent. The actual IR use types
did not match the declaration, and were closer to the
types used for the patterns. The brcond lowering
was changing the types, so introduce new nodes for those.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298119 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-17 20:41:45 +00:00
Stanislav Mekhanoshin
17dcd3dc69 Only unswitch loops with uniform conditions
Loop unswitching can be extremely harmful for a SIMT target. In case
if hoisted condition is not uniform a SIMT machine will execute both
clones of a loop sequentially. Therefor LoopUnswitch checks if the
condition is non-divergent.

Since DivergenceAnalysis adds an expensive PostDominatorTree analysis
not needed for non-SIMT targets a new option is added to avoid unneded
analysis initialization. The method getAnalysisUsage is called when
TargetTransformInfo is not yet available and we cannot use it here.
For that reason a new field DivergentTarget is added to PassManagerBuilder
to control the behavior and set this field from a target.

Differential Revision: https://reviews.llvm.org/D30796

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298104 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-17 17:13:41 +00:00
Stanislav Mekhanoshin
1537930ea1 [AMDGPU] Run always inliner early in opt
We can mark functions to always inline early in the opt. Since we do not have
call support this early inlining creates opportunities for inter-procedural
optimizations which would not occur otherwise.

Differential Revision: https://reviews.llvm.org/D31016

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297958 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-16 16:11:46 +00:00
Matt Arsenault
d0064ed89e AMDGPU: Allow sinking of addressing modes for atomic_inc/dec
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297913 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-15 23:15:12 +00:00
Matt Arsenault
0c52bece01 AMDGPU: Fix unnecessary ands when packing f16 vectors
computeKnownBits didn't handle fp_to_fp16 to report
the high bits as 0. ARM maps the generic node to an instruction
that does not modify the high bits of the register, so introduce
a target node where the high bits are known 0.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297873 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-15 19:04:26 +00:00
Matt Arsenault
15db330072 AMDGPU: Minor SIAnnotateControlFlow cleanups
Newline fixes, early return, range loops.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297865 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-15 18:00:12 +00:00
Sanjay Patel
27c606791d Cyle -> Cycle; NFCI
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297846 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-15 15:37:42 +00:00
Simon Pilgrim
c12050bd3a Reverted unintended commit
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297841 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-15 14:47:30 +00:00
Simon Pilgrim
a6a5484b63 Fix Wint-in-bool-context warning (PR32248)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297840 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-15 14:38:19 +00:00
Matt Arsenault
6e1ede8ee8 AMDGPU: Re-use TM.getNullPointerValue
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297662 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-13 20:18:14 +00:00
Matt Arsenault
32cb946c46 AMDGPU: Treat 0 as private null pointer in addrspacecast lowering
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297658 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-13 19:47:31 +00:00
Matt Arsenault
a8ffe4b37c AMDGPU: Remove packf16 intrinsic
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297557 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-11 05:51:16 +00:00
Matt Arsenault
dbe625a311 AMDGPU: Keep track of modifiers when converting v_mac to v_mad
Since v_max_f32_e64/v_max_f16_e64 can be folded if the target
instruction supports the clamp bit, we also need to maintain
modifiers when converting v_mac to v_mad.

This fixes a rendering issue with Dirt Rally because a v_mac
instruction with the clamp bit set was converted to a v_mad
but that bit was lost during the conversion.

Fixes: e184e01dd7 ("AMDGPU: Fold FP clamp as modifier bit")

Patch by Samuel Pitoiset <samuel.pitoiset@gmail.com>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297556 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-11 05:40:40 +00:00
Stanislav Mekhanoshin
3081264dbe [AMDGPU] Remove getBidirectionalReasonRank
This method inverts the Reason field of a scheduling candidate.
It does right comparison between RegCritical and RegExcess, but
everything else is broken. In fact it can prefer less strong reason
such as Weak over RegCritical because Weak > -RegCritical.

The CandReason enum is properly sorted, so just remove artificial
ranking.

Differential Revision: https://reviews.llvm.org/D30557

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297536 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-11 00:29:27 +00:00
Konstantin Zhuravlyov
38890eb0c2 [AMDGPU] Split R600/SI getFrameIndexReference and emit stack object offsets for SI
Differential Revision: https://reviews.llvm.org/D29674


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297499 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-10 19:39:07 +00:00
Yaxun Liu
ab0e0e2181 Rename PT_NOTE namespace name used in AMDGPUPTNote.h
Patch by Guansong Zhang.

Differential Revision: https://reviews.llvm.org/D30750


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297498 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-10 19:35:43 +00:00
Changpeng Fang
1970df176f AMDGPU/SI: Disable unrolling in the loop vectorizer if the loop is not vectorized.
Reviewers:
  arsenm

Differential Revision:
  http://reviews.llvm.org/D30719

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297328 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-09 00:07:00 +00:00
Matt Arsenault
f06f68a796 AMDGPU: Don't wait at end of block with a trivial successor
If there is only one successor, and that successor only
has one predecessor the wait can obviously be delayed until
uses or the end of the next block. This avoids code quality
regressions when there are trivial fallthrough blocks inserted
for structurization.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297251 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-08 01:06:58 +00:00
Matt Arsenault
fc8387b8d1 AMDGPU: Constant fold rcp node
When doing arcp optimization with a constant denominator,
this was leaving behind rcps with constant inputs.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297248 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-08 00:48:46 +00:00
Changpeng Fang
2e729706f1 AMDGPU/SI: Do not insert EndCf in an unreachable block
Reviewers:
  arsenm

Differential Revision:
  http://reviews.llvm.org/D22025

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297243 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-07 23:29:36 +00:00
Daniel Sanders
35c6dd2400 Recommit: [globalisel] Change LLT constructor string into an LLT-based object that knows how to generate it.
Summary:
This will allow future patches to inspect the details of the LLT. The implementation is now split between
the Support and CodeGen libraries to allow TableGen to use this class without introducing layering concerns.

Thanks to Ahmed Bougacha for finding a reasonable way to avoid the layering issue and providing the version of this patch without that problem.

The problem with the previous commit appears to have been that TableGen was including CodeGen/LowLevelType.h instead of Support/LowLevelTypeImpl.h.

Reviewers: t.p.northover, qcolombet, rovka, aditya_nandakumar, ab, javed.absar

Subscribers: arsenm, nhaehnle, mgorny, dberris, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D30046



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297241 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-07 23:20:35 +00:00
Daniel Sanders
428e17c613 Revert r297177: Change LLT constructor string into an LLT-based object ...
More module problems. This time it only showed up in the stage 2 compile of
clang-x86_64-linux-selfhost-modules-2 but not the stage 1 compile.

Somehow, this change causes the build to need Attributes.gen before it's been
generated.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297188 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-07 19:21:23 +00:00
Daniel Sanders
86bbf4372b [globalisel] Change LLT constructor string into an LLT-based object that knows how to generate it.
Summary:
This will allow future patches to inspect the details of the LLT. The implementation is now split between
the Support and CodeGen libraries to allow TableGen to use this class without introducing layering concerns.

Thanks to Ahmed Bougacha for finding a reasonable way to avoid the layering issue and providing the version of this patch without that problem.

Reviewers: t.p.northover, qcolombet, rovka, aditya_nandakumar, ab, javed.absar

Subscribers: arsenm, nhaehnle, mgorny, dberris, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D30046

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297177 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-07 18:32:25 +00:00
Konstantin Zhuravlyov
58580c59ae Revert "AMDGPU: Set MCAsmInfo::PointerSize"
It breaks line tables because the patch is not complete, working on a complete one at the moment

This reverts commit r294031.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297118 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-07 04:44:33 +00:00
Jan Vesely
ec8e013baa AMDGPU/R600: Fix ALU clause markers use detection
also exit early on kill instead of redefinition.

Differential Revision: https://reviews.llvm.org/D30230

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297060 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-06 20:10:05 +00:00
Krzysztof Parzyszek
88a7ff46b1 Make TargetInstrInfo::isPredicable take a const reference, NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296901 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-03 18:30:54 +00:00
Dmitry Preobrazhensky
23bacbc32a [AMDGPU][MC] Fix for Bug 30829 + LIT tests
Added code to check constant bus restrictions for VOP formats (only one SGPR value or literal-constant may be used by the instruction).
Note that the same checks are performed by SIInstrInfo::verifyInstruction (used by lowering code).
Added LIT tests.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296873 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-03 14:31:06 +00:00
Matt Arsenault
003f1a56c5 AMDGPU: Fix missing dominator tree dependency
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296842 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-02 23:50:51 +00:00
Matt Arsenault
1a0fc1885e AMDGPU: Fix types for VOP_I16_I16_I16
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296523 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-28 21:31:45 +00:00
Matt Arsenault
743da63164 AMDGPU: Add definition for v_swap_b32
This is somewhat tricky because there are two
pairs of tied operands, and it isn't allowed to be
VOP3 encoded.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296519 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-28 21:09:04 +00:00
Matt Arsenault
0911246281 AMDGPU: Add definition for v_xad_u32
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296515 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-28 20:27:30 +00:00
Matt Arsenault
c1a133abee AMDGPU: Add ds_nop to assembler
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296513 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-28 20:15:46 +00:00
Matt Arsenault
efc7556475 AMDGPU: Add definitions for ds_{read|write}_b{96|128}
It's not clear to me if this is always better than
doing ds_write2_b64 This adds the constraint of
a 128-bit register input instead of a pair of
64-bit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296512 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-28 20:15:43 +00:00
Stanislav Mekhanoshin
0248798c22 [AMDGPU] Add second pass of the scheduler
If during scheduling we have identified that we cannot keep optimistic
occupancy increase critical register pressure limit and try scheduling
of the whole function again. In this case blocks with smaller pressure
will have a chance for better scheduling.

Differential Revision: https://reviews.llvm.org/D30442

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296506 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-28 19:20:33 +00:00
Stanislav Mekhanoshin
004075214e [AMDGPU] New method to estimate register pressure
This change introduces new method to estimate register pressure in
GCNScheduler. Standard RPTracker gives huge error due to the following
reasons:

1. It does not account for live-ins or live-outs if value is not used
in the region itself. That creates a huge error in a very common case
if there are a lot of live-thu registers.
2. It does not properly count subregs.
3. It assumes a register used as an input operand can be reused as an
output. This is not always possible by itself, this is not what RA
will finally do in many cases for various reasons not limited to RA's
inability to do so, and this is not so if the value is actually a
live-thu.

In addition we can now see clear separation between live-in pressure
which we cannot change with the scheduling and tentative pressure
which we can change.

Differential Revision: https://reviews.llvm.org/D30439

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296491 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-28 17:22:39 +00:00
Konstantin Zhuravlyov
b1d063dce5 [AMDGPU] Change amd_kernel_code_t's minor version to 1
- We do emit amd_kernel_code_t v1.1

Differential Revision: https://reviews.llvm.org/D30433


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296489 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-28 17:17:52 +00:00
Stanislav Mekhanoshin
b57bf30b47 [AMDGPU] Fix read-undef flags when schedule is reverted
If two subregs of the same register are defined and we need to revert
schedule changing def order, we will end up with both instructions
having def,read-undef flags because adjustLaneLiveness() will only set
this flag but will not remove it.

Fix this by removing read-undef flags before calling adjustLaneLiveness.

Differential Revision: https://reviews.llvm.org/D30428

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296484 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-28 16:26:27 +00:00
Daniel Sanders
1e598cbf73 Revert r296474 - [globalisel] Change LLT constructor string into an LLT subclass that knows how to generate it.
There's a circular dependency that's only revealed when LLVM_ENABLE_MODULES=1.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296478 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-28 15:00:27 +00:00
Daniel Sanders
e0180ef4b8 [globalisel] Change LLT constructor string into an LLT subclass that knows how to generate it.
Summary:
This will allow future patches to inspect the details of the LLT. The implementation is now split between
the Support and CodeGen libraries to allow TableGen to use this class without introducing layering concerns.

Thanks to Ahmed Bougacha for finding a reasonable way to avoid the layering issue and providing the version of this patch without that problem.

Reviewers: t.p.northover, qcolombet, rovka, aditya_nandakumar, ab, javed.absar

Subscribers: arsenm, nhaehnle, mgorny, dberris, llvm-commits, kristof.beyls

Differential Revision: https://reviews.llvm.org/D30046

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296474 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-28 14:21:31 +00:00
Matt Arsenault
dd2186aaab AMDGPU: Use v_med3_{f16|i16|u16}
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296401 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-27 22:40:39 +00:00
Matt Arsenault
27f4f2f4bc AMDGPU: Support v2i16/v2f16 packed operations
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296396 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-27 22:15:25 +00:00
Matt Arsenault
563a987b91 AMDGPU: Add some of the new gfx9 VOP3 instructions
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296382 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-27 21:04:41 +00:00
Matt Arsenault
a4e4156e12 AMDGPU: Support inlineasm for packed instructions
Add packed types as legal so they may be used with inlineasm.
Keep all operations expanded for now.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296379 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-27 20:52:10 +00:00
Matt Arsenault
132ab30572 AMDGPU: Don't fold immediate if clamp/omod are set
Doesn't fix any practical problems because clamp/omod
are currently folded after peephole optimizer.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296375 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-27 20:21:31 +00:00
Matt Arsenault
dd23defd5c AMDGPU: Fold omod into instructions
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296372 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-27 19:35:42 +00:00
Matt Arsenault
29df731fe5 AMDGPU: Add f16 to shader calling conventions
Mostly useful for writing tests for f16 features.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296370 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-27 19:24:47 +00:00
Matt Arsenault
87fd70245a AMDGPU: Add VOP3P instruction format
Add a few non-VOP3P but instructions related to packed.

Includes hack with dummy operands for the benefit of the assembler

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@296368 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-27 18:49:11 +00:00