Commit Graph

3261 Commits

Author SHA1 Message Date
Nick Desaulniers
a943056482 [AsmPrinter] refactor to remove remove AsmVariant. NFC
Summary:
The InlineAsm::AsmDialect is only required for X86; no architecture
makes use of it and as such it gets passed around between arch-specific
and general code while being unused for all architectures but X86.

Since the AsmDialect is queried from a MachineInstr, which we also pass
around, remove the additional AsmDialect parameter and query for it deep
in the X86AsmPrinter only when needed/as late as possible.

This refactor should help later planned refactors to AsmPrinter, as this
difference in the X86AsmPrinter makes it harder to make AsmPrinter more
generic.

Reviewers: craig.topper

Subscribers: jholewinski, arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, javed.absar, sbc100, jgravelle-google, eraman, hiraditya, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, llvm-commits, peter.smith, srhines

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60488

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358101 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-10 16:38:43 +00:00
Tom Stellard
0f50aefa77 AMDGPU/GlobalISel: Implement call lowering for shaders returning values
Reviewers: arsenm, nhaehnle

Subscribers: kzhuravl, jvesely, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, volkan, llvm-commits

Differential Revision: https://reviews.llvm.org/D57166

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357964 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-09 02:26:03 +00:00
Nikita Popov
cdee3b7f23 Reapply [ValueTracking] Support min/max selects in computeConstantRange()
Add support for min/max flavor selects in computeConstantRange(),
which allows us to fold comparisons of a min/max against a constant
in InstSimplify. This fixes an infinite InstCombine loop, with the
test case taken from D59378.

Relative to the previous iteration, this contains some adjustments for
AMDGPU med3 tests: The AMDGPU target runs InstSimplify prior to codegen,
which ends up constant folding some existing med3 tests after this
change. To preserve these tests a hidden -amdgpu-scalar-ir-passes option
is added, which allows disabling scalar IR passes (that use InstSimplify)
for testing purposes.

Differential Revision: https://reviews.llvm.org/D59506

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357870 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-07 17:22:16 +00:00
Stanislav Mekhanoshin
c7b87faaf6 [AMDGPU] Sort out and rename multiple CI/VI predicates
Differential Revision: https://reviews.llvm.org/D60346

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357835 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-06 09:20:48 +00:00
Stanislav Mekhanoshin
5d206b6327 [AMDGPU] Add MachineDCE pass after RenameIndependentSubregs
Detect dead lanes can create some dead defs. Then RenameIndependentSubregs
will break a REG_SEQUENCE which may use these dead defs. At this point
a dead instruction can be removed but we do not run a DCE anymore.

MachineDCE was only running before live variable analysis. The patch
adds a mean to preserve LiveIntervals and SlotIndexes in case it works
past this.

Differential Revision: https://reviews.llvm.org/D59626

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357805 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-05 20:11:32 +00:00
Stanislav Mekhanoshin
a0163213a9 [AMDGPU] predicate and feature refactoring
We have done some predicate and feature refactoring lately but
did not upstream it. This is to sync.

Differential revision: https://reviews.llvm.org/D60292

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357791 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-05 18:24:34 +00:00
Matt Arsenault
6c7dd5967a AMDGPU/GlobalISel: Fix non-power-of-2 select
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357762 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-05 14:03:04 +00:00
Matt Arsenault
7a6ef30046 AMDGPU: Split block for si_end_cf
Relying on no spill or other code being inserted before this was
precarious. It relied on code diligently checking isBasicBlockPrologue
which is likely to be forgotten.

Ideally this could be done earlier, but this doesn't work because of
phis. Any other instruction can't be placed before them, so we have to
accept the position being incorrect during SSA.

This avoids regressions in the fast register allocator rewrite from
inverting the direction.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357634 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-03 20:53:20 +00:00
Matt Arsenault
cd3a8dda10 AMDGPU: Assume ECC is enabled by default if supported
The test should really be checking for the property directly in the
code object headers, but there are problems with this. I don't see
this directly represented in the text form, and for the binary
emission this is depending on a function level subtarget feature to
emit a global flag.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357558 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-03 01:58:57 +00:00
Matt Arsenault
581e35dc2d AMDGPU: Remove unnecessary subtarget get
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357542 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-03 00:01:05 +00:00
Matt Arsenault
14c192770d AMDGPU: Fix names for generation features
We should overall stop using these, but the uppercase name didn't
work. Any feature string is converted to lowercase, so these
could never be found in the table.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357541 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-03 00:01:03 +00:00
Sander de Smalen
74a4fc7f9d Enforce StackID definition in PEI
There are various places in LLVM where the definition of StackID is not
properly honoured, for example in PEI where objects with a StackID > 0 are
allocated on the default stack (StackID0). This patch enforces that PEI
only considers allocating objects to StackID 0.

Reviewers: arsenm, thegameg, MatzeB

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D60062


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357460 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-02 09:46:52 +00:00
Neil Henning
fcc236c268 [AMDGPU] Pre-allocate WWM registers to reduce VGPR pressure.
This change incorporates an effort by Connor Abbot to change how we deal
with WWM operations potentially trashing valid values in inactive lanes.

Previously, the SIFixWWMLiveness pass would work out which registers
were being trashed within WWM regions, and ensure that the register
allocator did not have any values it was depending on resident in those
registers if the WWM section would trash them. This worked perfectly
well, but would cause sometimes severe register pressure when the WWM
section resided before divergent control flow (or at least that is where
I mostly observed it).

This fix instead runs through the WWM sections and pre allocates some
registers for WWM. It then reserves these registers so that the register
allocator cannot use them. This results in a significant register
saving on some WWM shaders I'm working with (130 -> 104 VGPRs, with just
this change!).

Differential Revision: https://reviews.llvm.org/D59295

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357400 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-01 15:19:52 +00:00
Liang Zou
d39afb2c98 fix typo: "\t" => " "
Reviewers: llvm.org, Jim

Reviewed By: Jim

Subscribers: arsenm, jvesely, nhaehnle, rupprecht, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59983

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357365 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-31 14:49:00 +00:00
Matt Arsenault
234b3a117e AMDGPU: Remove dx10-clamp from subtarget features
Since this can be set with s_setreg*, it should not be a subtarget
property. Set a default based on the calling convention, and Introduce
a new amdgpu-dx10-clamp attribute to override this if desired.

Also introduce a new amdgpu-ieee attribute to match.

The values need to match to allow inlining. I think it is OK for the
caller's dx10-clamp attribute to override the callee, but there
doesn't appear to be the infrastructure to do this currently without
definining the attribute in the generic Attributes.td.

Eventually the calling convention lowering will need to insert a mode
switch somewhere for these.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357302 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-29 19:14:54 +00:00
Dmitry Preobrazhensky
a33ed9cfec [AMDGPU][MC] Corrected conversion rules for inlinable constants to match rules for literals
See bug 40806: https://bugs.llvm.org/show_bug.cgi?id=40806

Reviewers: artem.tamazov, arsenm

Differential Revision: https://reviews.llvm.org/D59786

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357262 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-29 14:50:20 +00:00
Dmitry Preobrazhensky
8617a128fc [AMDGPU][MC] Corrected handling of tied src for atomic return MUBUF opcodes
See bug 40917: https://bugs.llvm.org/show_bug.cgi?id=40917

Reviewers: artem.tamazov, arsenm

Differential Revision: https://reviews.llvm.org/D59878

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357249 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-29 12:16:04 +00:00
Konstantin Zhuravlyov
142ef796c0 AMDGPU: Make sram-ecc off by default for Vega20
Differential Revision: https://reviews.llvm.org/D59718


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357247 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-29 12:04:18 +00:00
Clement Courbet
f88426dac5 [ScheduleDAG] Move Topo and addEdge to base class.
Some DAG mutations can only be applied to `ScheduleDAGMI`, and have to
internally cast a `ScheduleDAGInstrs` to `ScheduleDAGMI`.

There is nothing actually specific to `ScheduleDAGMI` in `Topo`.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357239 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-29 08:33:05 +00:00
Matt Arsenault
768023a783 AMDGPU/GlobalISel: Insert waterfall loop for vector indexing
The register index can only really be an SGPR. Lie that a VGPR index
is legal, and then rewrite the instruction in a waterfall loop to
handle the index.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357235 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-29 03:54:56 +00:00
Reid Kleckner
f290220c22 Delay initialization of three static global maps, NFC
This avoids allocating a few KB of heap memory on startup, and instead
allocates these maps lazily. I noticed this while profiling LLD.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357192 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-28 17:33:41 +00:00
Matt Arsenault
a88dcdbff2 AMDGPU: Make exec mask optimzations more resistant to block splits
Also improve the check for SALU instructions to also ignore
implicit_def and other fake instructions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357170 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-28 14:01:39 +00:00
Justin Bogner
c466f32ecf [LegalizeVectorTypes] Allow single loads and stores for more short vectors
When lowering a load or store for TypeWidenVector, the type legalizer
would use a single load or store if the associated integer type was legal
or promoted. E.g. it loads a v4i8 as an i32 if i32 is legal/promotable.
(See https://reviews.llvm.org/rL236528 for reference.)

This applies that behaviour to vector types. If the vector type is
TypePromoteInteger, the element type is going to be TypePromoteInteger
as well, which will lead to have a single promoting load rather than N
individual promoting loads. For instance, if we have a v3i1, we would
now have a load of v4i1 instead of 3 loads of i1.

Patch by Guillaume Marques. Thanks!

Differential Revision: https://reviews.llvm.org/D56201

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357120 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-27 20:35:56 +00:00
Matt Arsenault
50359ffe2c Reapply "AMDGPU: Scavenge register instead of findUnusedReg"
This reapplies r356149, using the correct overload of findUnusedReg
which passes the current iterator.

This worked most of the time, because the scavenger iterator was moved
at the end of the frame index loop in PEI. This would fail if the
spill was the first instruction. This was further hidden by the fact
that the scavenger wasn't passed in for normal frame index
elimination.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357098 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-27 17:31:29 +00:00
Matt Arsenault
0755a8d19c AMDGPU: Enable the scavenger for large frames
Another test is needed for the case where the scavenge fail, but
there's another issue with that which needs an additional fix.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357093 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-27 17:14:32 +00:00
Matt Arsenault
44ee20512e AMDGPU: Add additional MIR tests for exec mask optimizations
Also includes one example of how this transform is unsound. This isn't
verifying the copies are used in the control flow intrinisic patterns.

Also add option to disable exec mask opt pass. Since this pass is
unsound, it may be useful to turn it off until it is fixed.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357091 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-27 16:58:30 +00:00
Matt Arsenault
99267d8b98 AMDGPU: Skip debug_instr when collapsing end_cf
Based on how these are inserted, I doubt this was causing a problem in
practice.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357090 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-27 16:58:27 +00:00
Matt Arsenault
502b049dcb AMDGPU: Fix missing scc implicit def on s_andn2_b64_term
Introduce new helper class to copy properties directly from the base
instruction.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357089 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-27 16:58:22 +00:00
Matt Arsenault
68048d45d3 AMDGPU: Don't hardcode num defs for MUBUF instructions
This shouldn't change anything since the no-ret atomics are selected
later.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357084 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-27 16:12:29 +00:00
Matt Arsenault
17f72131c2 AMDGPU: wave_barrier is not isBarrier
This is not a control flow instruction, so should not be marked as
isBarrier. This fixes a verifier error if followed by unreachable.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357081 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-27 15:54:45 +00:00
Matt Arsenault
f46cbbae71 AMDGPU: Fix areLoadsFromSameBasePtr for DS atomics
The offset operand index is different for atomics.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357073 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-27 15:41:00 +00:00
Dmitry Preobrazhensky
7d583cafa6 Revert of 357063 [AMDGPU][MC] Corrected handling of tied src for atomic return MUBUF opcodes
Reason: the change was mistakenly committed before review

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357066 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-27 13:49:52 +00:00
Dmitry Preobrazhensky
f49a51bff0 [AMDGPU][MC] Corrected handling of tied src for atomic return MUBUF opcodes
See bug 40917: https://bugs.llvm.org/show_bug.cgi?id=40917

Reviewers: artem.tamazov, arsenm

Differential Revision: https://reviews.llvm.org/D59305

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357063 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-27 13:07:41 +00:00
Matt Arsenault
b2435c6b83 Revert "AMDGPU: Scavenge register instead of findUnusedReg"
This reverts r356149.

This is crashing on rocBLAS.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356958 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-25 21:41:40 +00:00
Matt Arsenault
70b1947802 AMDGPU: Remove unnecessary check for isFullCopy
Subregister indexes are not used for physical register operands, so
isFullCopy is implied by the physical register check.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356956 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-25 21:28:53 +00:00
Matt Arsenault
d8ebe2971a AMDGPU: Set hasSideEffects 0 on _term instructions
These were defaulting to true, but they are just wrappers around bit
operations. This avoids regressions in the exec mask optimization
passes in a future commit.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356952 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-25 21:10:12 +00:00
Konstantin Zhuravlyov
ba71b832d8 AMDGPU: Add support for cross address space synchronization scopes
Differential Revision: https://reviews.llvm.org/D59517


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356946 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-25 20:50:21 +00:00
Matt Arsenault
6375a63fcb AMDGPU: Preserve LiveIntervals in WQM
This seems to already be done, but wasn't marked.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356922 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-25 16:47:42 +00:00
Alina Sbirlea
5bfa50e0ac [AliasAnalysis] Second prototype to cache BasicAA / anyAA state.
Summary:
Adding contained caching to AliasAnalysis. BasicAA is currently the only one using it.

AA changes:
- This patch is pulling the caches from BasicAAResults to AAResults, meaning the getModRefInfo call benefits from the IsCapturedCache as well when in "batch mode".
- All AAResultBase implementations add the QueryInfo member to all APIs. AAResults APIs maintain wrapper APIs such that all alias()/getModRefInfo call sites are unchanged.
- AA now provides a BatchAAResults type as a wrapper to AAResults. It keeps the AAResults instance and a QueryInfo instantiated to batch mode. It delegates all work to the AAResults instance with the batched QueryInfo. More API wrappers may be needed in BatchAAResults; only the minimum needed is currently added.

MemorySSA changes:
- All walkers are now templated on the AA used (AliasAnalysis=AAResults or BatchAAResults).
- At build time, we optimize uses; now we create a local walker (lives only as long as OptimizeUses does) using BatchAAResults.
- All Walkers have an internal AA and only use that now, never the AA in MemorySSA. The Walkers receive the AA they will use when built.

- The walker we use for queries after the build is instantiated on AliasAnalysis and is built after building MemorySSA and setting AA.
- All static methods doing walking are now templated on AliasAnalysisType if they are used both during build and after. If used only during build, the method now only takes a BatchAAResults. If used only after build, the method now takes an AliasAnalysis.

Subscribers: sanjoy, arsenm, jvesely, nhaehnle, jlebar, george.burgess.iv, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59315

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356783 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-22 17:22:19 +00:00
Tim Renouf
795a6bf867 [AMDGPU] Use three- and five-dword result type in image ops
Some image ops return three or five dwords.  Previously, we modeled that
with a 4 or 8 dword register class.  The register allocator could
cleverly spot that some subregs were dead and allocate something else
there, but that caused the de-optimization that waitcnt insertion would
think that the result was used immediately.

This commit allows such an image op to have a result with a three or
five dword result, avoiding the above de-optimization.

Differential Revision: https://reviews.llvm.org/D58905

Change-Id: I3651211bbd7ed22721ee7b9fefd7bcc60a809d8b

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356757 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-22 15:21:11 +00:00
Tim Renouf
c89bf6fdf3 [AMDGPU] Implemented dwordx3 variants of buffer/tbuffer load/store intrinsics
Now we have vec3 MVTs, this commit implements dwordx3 variants of the
buffer intrinsics.

On gfx6, a dwordx3 buffer load intrinsic is implemented as a dwordx4
instruction, and a dwordx3 buffer store intrinsic is not supported.
We need to support the dwordx3 load intrinsic because it is generated by
subtarget-unaware code in InstCombine.

Differential Revision: https://reviews.llvm.org/D58904

Change-Id: I016729d8557b98a52f529638ae97c340a5922a4e

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356755 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-22 14:58:02 +00:00
Tim Renouf
7684aab92a [AMDGPU] Added v5i32 and v5f32 register classes
They are not used by anything yet, but a subsequent commit will start
using them for image ops that return 5 dwords.

Differential Revision: https://reviews.llvm.org/D58903

Change-Id: I63e1904081e39a6d66e4eb96d51df25ad399d271

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356735 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-22 10:11:21 +00:00
Tim Renouf
a047778b62 [AMDGPU] Support for v3i32/v3f32
Added support for dwordx3 for most load/store types, but not DS, and not
intrinsics yet.

SI (gfx6) does not have dwordx3 instructions, so they are not enabled
there.

Some of this patch is from Matt Arsenault, also of AMD.

Differential Revision: https://reviews.llvm.org/D58902

Change-Id: I913ef54f1433a7149da8d72f4af54dbb13436bd9

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356659 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-21 12:01:21 +00:00
Tim Renouf
2107f31cc3 [AMDGPU] Do not generate spurious PAL metadata
My previous fix rL356591 "[AMDGPU] Added MsgPack format PAL metadata"
accidentally caused a spurious PAL metadata .note record to be emitted
for any AMDGPU output. That caused failures in the lld test
amdgpu-relocs.s. Fixed.

Differential Revision: https://reviews.llvm.org/D59613

Change-Id: Ie04a2aaae890dcd490f22c89edf9913a77ce070e

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356621 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-20 22:02:09 +00:00
Michael Liao
2213c42310 [AMDGPU] Fix dependency on BinaryFormat
Summary: - The linking is broken when this library is built as shared one.

Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59610

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356617 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-20 21:22:27 +00:00
Matt Arsenault
f2fd578499 AMDGPU: Don't look for constant in insert/extract_vector_elt regbankselect
The constantness shouldn't change the register bank choice. We also
don't need to restrict this to only indexing VGPRs, since it's
possible to index SGPRs (but SelectionDAG made using this
difficult). Allow directly indexing SGPRs when appropriate.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356611 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-20 20:41:34 +00:00
Michael Liao
4dc029cbf0 [AMDGPU] Fix clamp bit DAG operand
Summary:
- Should use `targetconstant` instead of `constant` operand for clamp
  bit, which is expected as an immediate operand. Under certain
  conditions, such as a common `i1 false` constant is used in other
  place and selected before the instruction with clamp bit, register
  operand may be added instead of immediate one. Use `targetcosntant` to
  enforce that.

Subscribers: arsenm, kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D59608

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356608 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-20 20:18:56 +00:00
Konstantin Zhuravlyov
19d5a90f6d AMDHSA: Fix COMPUTE_PGM_RSRC2.USER_SGPR calculation when parsing ISA assembly
It must match https://llvm.org/docs/AMDGPUUsage.html#initial-kernel-execution-state

Differential Revision: https://reviews.llvm.org/D59570


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356603 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-20 19:44:47 +00:00
Tim Renouf
2b2c7f795d [AMDGPU] Added MsgPack format PAL metadata
Summary:
PAL metadata now supports both the old linear reg=val pairs format and
the new MsgPack format.

The MsgPack format uses YAML as its textual representation. On output to
YAML, a mnemonic name is provided for some hardware registers.

Differential Revision: https://reviews.llvm.org/D57028

Change-Id: I2bbaabaaca4b3574f7e03b80fbef7c7a69d06a94

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356591 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-20 18:47:21 +00:00
Tim Renouf
d1e92f3b2a [AMDGPU] Factored PAL metadata handling out into its own class
Summary:
This commit introduces a new AMDGPUPALMetadata class that:
* is inside the AMDGPU target;
* keeps an in-memory representation of PAL metadata;
* provides a method to read the frontend-supplied metadata from LLVM IR;
* provides methods for the asm printer to set metadata items;
* provides methods to write the metadata as a binary blob to put in a
  .note record or as an asm directive;
* provides a method to read the metadata as a binary blob from a .note
  record.

Because llvm-readobj cannot call directly into a target, I had to remove
llvm-readobj's ability to dump PAL metadata, pending a resolution to
https://reviews.llvm.org/D52821

Differential Revision: https://reviews.llvm.org/D57027

Change-Id: I756dc830894fcb6850324cdcfa87c0120eb2cf64

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356582 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-20 17:42:00 +00:00