98 Commits

Author SHA1 Message Date
Matt Arsenault
fdc698b726 AMDGPU: Erase redundant redefs of m0 in SIFoldOperands
Only handle simple inter-block redefs of m0 to the same value. This
avoids interference from redefs of m0 in SILoadStoreOptimzer. I was
initially teaching that pass to ignore redefs of m0, but having them
not exist beforehand is much simpler.

This is in preparation for deleting the current special m0 handling in
SIFixSGPRCopies to allow the register coalescer to handle the
difficult cases.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375449 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-21 19:53:46 +00:00
Matt Arsenault
2df5f8ca5d AMDGPU: Increase vcc liveness scan threshold
Avoids a test regression in a future patch. Also add debug printing on
this case, so I waste less time debugging folds in the future.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375367 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-20 17:44:17 +00:00
Matt Arsenault
41cd9eb045 AMDGPU: Don't fold copies to physregs
In a future patch, this will help cleanup m0 handling.

The register coalescer handles copies from a register that
materializes an immediate, but doesn't handle move immediates
itself. The virtual register uses will often be allocated to the same
register, so there end up being no real copy.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374257 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-09 22:51:42 +00:00
Alexander Timofeev
864a620db8 [AMDGPU] SIFoldOperands should not fold register acrocc the EXEC definition
Reviewers: rampitec

      Differential Revision: https://reviews.llvm.org/D67662

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373221 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-30 15:31:17 +00:00
Stanislav Mekhanoshin
6edc006701 [AMDGPU] gfx10 v_fmac_f16 operand folding
Fold immediates into v_fmac_f16.

Differential Revision: https://reviews.llvm.org/D68037

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372906 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-25 18:40:20 +00:00
Stanislav Mekhanoshin
c3fd6a13de [AMDGPU] w/a for gfx908 mfma SrcC literal HW bug
gfx908 ignores an mfma if SrcC is a literal.

Differential Revision: https://reviews.llvm.org/D66670

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@369816 91177308-0d34-0410-b5e6-96231b3b80d8
2019-08-23 22:09:58 +00:00
Alexander Timofeev
7822759c5f [AMDGPU] Prevent VGPR copies from moving across the EXEC mask definitions
Differential Revision: https://reviews.llvm.org/D63731
Reviewers: qcolombet, rampitec

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@369532 91177308-0d34-0410-b5e6-96231b3b80d8
2019-08-21 15:15:04 +00:00
Daniel Sanders
57a8129407 Apply llvm-prefer-register-over-unsigned from clang-tidy to LLVM
Summary:
This clang-tidy check is looking for unsigned integer variables whose initializer
starts with an implicit cast from llvm::Register and changes the type of the
variable to llvm::Register (dropping the llvm:: where possible).

Partial reverts in:
X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister
X86FixupLEAs.cpp - Some functions return unsigned and arguably should be MCRegister
X86FrameLowering.cpp - Some functions return unsigned and arguably should be MCRegister
HexagonBitSimplify.cpp - Function takes BitTracker::RegisterRef which appears to be unsigned&
MachineVerifier.cpp - Ambiguous operator==() given MCRegister and const Register
PPCFastISel.cpp - No Register::operator-=()
PeepholeOptimizer.cpp - TargetInstrInfo::optimizeLoadInstr() takes an unsigned&
MachineTraceMetrics.cpp - MachineTraceMetrics lacks a suitable constructor

Manual fixups in:
ARMFastISel.cpp - ARMEmitLoad() now takes a Register& instead of unsigned&
HexagonSplitDouble.cpp - Ternary operator was ambiguous between unsigned/Register
HexagonConstExtenders.cpp - Has a local class named Register, used llvm::Register instead of Register.
PPCFastISel.cpp - PPCEmitLoad() now takes a Register& instead of unsigned&

Depends on D65919

Reviewers: arsenm, bogner, craig.topper, RKSimon

Reviewed By: arsenm

Subscribers: RKSimon, craig.topper, lenary, aemerson, wuzish, jholewinski, MatzeB, qcolombet, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, wdng, nhaehnle, sbc100, jgravelle-google, kristof.beyls, hiraditya, aheejin, kbarton, fedor.sergeev, javed.absar, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, tpr, PkmX, jocewei, jsji, Petar.Avramovic, asbirlea, Jim, s.egerton, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D65962

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@369041 91177308-0d34-0410-b5e6-96231b3b80d8
2019-08-15 19:22:08 +00:00
Tim Renouf
1b0576fa1b [AMDGPU] Fix to 'Fold readlane from copy of SGPR or imm'
That change (r363670) could leave a copy from vgpr to sgpr. Fixed.

Differential Revision: https://reviews.llvm.org/D66133

Change-Id: I00c3fe6fda2e8e1e36f53195b881b1449c777ea4

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@368736 91177308-0d34-0410-b5e6-96231b3b80d8
2019-08-13 18:57:55 +00:00
Daniel Sanders
c7a3c5c5d1 Finish moving TargetRegisterInfo::isVirtualRegister() and friends to llvm::Register as started by r367614. NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@367633 91177308-0d34-0410-b5e6-96231b3b80d8
2019-08-01 23:27:28 +00:00
Jay Foad
7f18dddc3d [AMDGPU] Fix DPP combiner check for exec modification
Summary:
r363675 changed the exec modification helper function, now called
execMayBeModifiedBeforeUse, so that if no UseMI is specified it checks
all instructions in the basic block, even beyond the last use. That
meant that the DPP combiner no longer worked in any basic block that
ended with a control flow instruction, and in particular it didn't work
on code sequences generated by the atomic optimizer.

Fix it by reinstating the old behaviour but in a new helper function
execMayBeModifiedBeforeAnyUse, and limiting the number of instructions
scanned.

Reviewers: arsenm, vpykhtin

Subscribers: kzhuravl, nemanjai, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, kbarton, MaskRay, jfb, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D64393

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365910 91177308-0d34-0410-b5e6-96231b3b80d8
2019-07-12 15:59:40 +00:00
Stanislav Mekhanoshin
6644a1885f [AMDGPU] gfx908 mfma support
Differential Revision: https://reviews.llvm.org/D64584

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365824 91177308-0d34-0410-b5e6-96231b3b80d8
2019-07-11 21:19:33 +00:00
Nicolai Haehnle
90f5eff5ac AMDGPU: Write LDS objects out as global symbols in code generation
Summary:
The symbols use the processor-specific SHN_AMDGPU_LDS section index
introduced with a previous change. The linker is then expected to resolve
relocations, which are also emitted.

Initially disabled for HSA and PAL environments until they have caught up
in terms of linker and runtime loader.

Some notes:

- The llvm.amdgcn.groupstaticsize intrinsics can no longer be lowered
  to a constant at compile times, which means some tests can no longer
  be applied.

  The current "solution" is a terrible hack, but the intrinsic isn't
  used by Mesa, so we can keep it for now.

- We no longer know the full LDS size per kernel at compile time, which
  means that we can no longer generate a relevant error message at
  compile time. It would be possible to add a check for the size of
  individual variables, but ultimately the linker will have to perform
  the final check.

Change-Id: If66dbf33fccfbf3609aefefa2558ac0850d42275

Reviewers: arsenm, rampitec, t-tye, b-sumner, jsjodin

Subscribers: qcolombet, kzhuravl, jvesely, wdng, yaxunl, dstuttard, tpr, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D61494

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364297 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-25 11:52:30 +00:00
Matt Arsenault
68420a23e0 AMDGPU: Fold frame index into MUBUF
This matters for byval uses outside of the entry block, which appear
as copies.

Previously, the only folding done was during selection, which could
not see the underlying frame index. For any uses outside the entry
block, the frame index was materialized in the entry block relative to
the global scratch wave offset.

This may produce worse code in cases where the offset ends up not
fitting in the MUBUF offset field. A better heuristic would be helpfu
for extreme frames.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364185 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-24 14:53:56 +00:00
Matt Arsenault
bff29f6333 AMDGPU: Fix folding immediate into readfirstlane through reg_sequence
The def instruction for the vreg may not match, because it may be
folding through a reg_sequence. The assert was overly conservative and
not necessary. It's not actually important if DefMI really defined the
register, because the fold that will be done cares about the def of
the value that will be folded.

For some reason copies aren't making it through the reg_sequence,
although they should.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363876 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-19 20:44:15 +00:00
Matt Arsenault
9f1f314a53 AMDGPU: Change API for checking for exec modification
Invert the name and return value to better reflect the imprecise
nature.

Force passing in the DefMI, since it's known in the 2 users and could
possibly fail for an arbitrary vreg.

Allow specifying a specific user instruction. Scan through use
instructions, instead of use operands. Add scan thresholds instead of
searching infinitely.

Stop using a set to track seen uses. I didn't understand this usage,
or why it would not check the last use. I don't think the use list has
any particular order.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363675 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-18 12:48:36 +00:00
Matt Arsenault
ccd9c0a5d7 AMDGPU: Fold readlane from copy of SGPR or imm
These may be inserted to assert uniformity somewhere.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363670 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-18 12:23:46 +00:00
Matt Arsenault
09e40fba68 AMDGPU: Remove unnecessary check for virtual register
The copy was found by searching the uses of a virtual register, so
it's already known to be virtual.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363669 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-18 12:23:45 +00:00
Matt Arsenault
bec96285e2 AMDGPU: Support shrinking add with FI in SIFoldOperands
Avoids test regression in a future patch

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359898 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-03 15:21:53 +00:00
Matt Arsenault
6db8359b59 AMDGPU: Replace shrunk instruction with dummy implicit_def
This was broken if the original operand was killed. The kill flag
would appear on both instructions, and fail the verifier. Keep the
kill flag, but remove the operands from the old instruction. This has
an added benefit of really reducing the use count for future folds.

Ideally the pass would be structured more like what PeepholeOptimizer
does to avoid this hack to avoid breaking instruction iterators.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359891 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-03 14:40:10 +00:00
Matt Arsenault
8edcece164 AMDGPU: Fix incorrect commute with sub when folding immediates
When a fold of an immediate into a sub/subrev required shrinking the
instruction, the wrong VOP2 opcode was used. This was using the VOP2
equivalent of the original instruction, not the commuted instruction
with the inverted opcode.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359883 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-03 13:42:56 +00:00
Stanislav Mekhanoshin
ffc5401cfb [AMDGPU] gfx1010 allows VOP3 to have a literal
Differential Revision: https://reviews.llvm.org/D61413

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@359756 91177308-0d34-0410-b5e6-96231b3b80d8
2019-05-02 04:01:39 +00:00
Michael Liao
77d3eb784a [AMDGPU] Fix an issue in op_sel_hi skipping.
Summary:
- Only apply packed literal `op_sel_hi` skipping on operands requiring
  packed literals. Even an instruction is `packed`, it may have operand
  requiring non-packed literal, such as `v_dot2_f32_f16`.

Reviewers: rampitec, arsenm, kzhuravl

Subscribers: jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D60978

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@358922 91177308-0d34-0410-b5e6-96231b3b80d8
2019-04-22 22:05:49 +00:00
Matt Arsenault
234b3a117e AMDGPU: Remove dx10-clamp from subtarget features
Since this can be set with s_setreg*, it should not be a subtarget
property. Set a default based on the calling convention, and Introduce
a new amdgpu-dx10-clamp attribute to override this if desired.

Also introduce a new amdgpu-ieee attribute to match.

The values need to match to allow inlining. I think it is OK for the
caller's dx10-clamp attribute to override the callee, but there
doesn't appear to be the infrastructure to do this currently without
definining the attribute in the generic Attributes.td.

Eventually the calling convention lowering will need to insert a mode
switch somewhere for these.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@357302 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-29 19:14:54 +00:00
Tim Renouf
0b9f636469 [AMDGPU] Asm/disasm v_cndmask_b32_e64 with abs/neg source modifiers
This commit allows v_cndmask_b32_e64 with abs, neg source
modifiers on src0, src1 to be assembled and disassembled.

This does appear to be allowed, even though they are floating point
modifiers and the operand type is b32.

To do this, I added src0_modifiers and src1_modifiers to the
MachineInstr, which involved fixing up several places in codegen and mir
tests.

Differential Revision: https://reviews.llvm.org/D59191

Change-Id: I69bf4a8c73ebc65744f6110bb8fc4e937d79fbea

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356398 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-18 19:25:39 +00:00
Stanislav Mekhanoshin
a4638a6a3a [AMDGPU] Silence gcc 7 warnings
Differential Revision: https://reviews.llvm.org/D59330

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@356100 91177308-0d34-0410-b5e6-96231b3b80d8
2019-03-13 21:15:52 +00:00
Chandler Carruth
6b547686c5 Update the file headers across all of the LLVM projects in the monorepo
to reflect the new license.

We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@351636 91177308-0d34-0410-b5e6-96231b3b80d8
2019-01-19 08:50:56 +00:00
Alexander Timofeev
bf5049eaf1 [AMDGPU] Fix scalar operand folding bug that causes SHOC performance regression.
Detailed description: SIFoldOperands::foldInstOperand iterates over the
operand uses calling the function that changes def-use iteratorson the
way. As a result loop exits immediately when def-use iterator is
changed. Hence, the operand is folded to the very first use instruction
only. This makes VGPR live along the whole basic block and increases
register pressure significantly. The performance drop observed in SHOC
DeviceMemory test is caused by this bug.

Proposed fix: collect uses to separate container for further processing
in another loop.

Testing: make check-llvm
SHOC performance test.

Reviewers: rampitec, ronlieb

Differential Revision: https://reviews.llvm.org/D56161

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@350350 91177308-0d34-0410-b5e6-96231b3b80d8
2019-01-03 19:55:32 +00:00
Stanislav Mekhanoshin
45657b2c3b [AMDGPU] Fold copy (copy vgpr)
This allows to reduce a number of used VGPRs in some cases.

Differential Revision: https://reviews.llvm.org/D52577

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@343249 91177308-0d34-0410-b5e6-96231b3b80d8
2018-09-27 18:55:20 +00:00
Alexander Timofeev
c851af3c8a [AMDGPU] Preliminary patch for divergence driven instruction selection. Operands Folding 1.
Reviewers: rampitec

Differential revision: https://reviews/llvm/org/D51316

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@341068 91177308-0d34-0410-b5e6-96231b3b80d8
2018-08-30 13:55:04 +00:00
Fangrui Song
18fbbd8771 [AMDGPU] Fix -Wunused-variable when -DLLVM_ENABLE_ASSERTIONS=off
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340868 91177308-0d34-0410-b5e6-96231b3b80d8
2018-08-28 19:19:03 +00:00
Matt Arsenault
0e21fb6c46 AMDGPU: Force shrinking of add/sub even if the carry is used
The original motivating example uses a 64-bit add, so the carry
is used. Insert a copy from VCC. This may allow shrinking of
the used carry instruction. At worst, we are replacing a
mov to materialize the constant with a copy of vcc.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340862 91177308-0d34-0410-b5e6-96231b3b80d8
2018-08-28 18:44:16 +00:00
Matt Arsenault
ac00e8c95b AMDGPU: Shrink insts to fold immediates
This needs to be done in the SSA fold operands
pass to be effective, so there is a bit of overlap
with SIShrinkInstructions but I don't think this
is practically avoidable.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@340859 91177308-0d34-0410-b5e6-96231b3b80d8
2018-08-28 18:34:24 +00:00
Matt Arsenault
10398322af AMDGPU: Check NSZ MI flag when folding omod
I'm not sure the exact nsz flag combination that
is OK. I think as long as it's on either, this is OK.
For now just check it on the omod multiply.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@339513 91177308-0d34-0410-b5e6-96231b3b80d8
2018-08-12 08:44:25 +00:00
Matt Arsenault
273374717e AMDGPU: Fold v_lshl_or_b32 with 0 src0
Appears from expansion of some packed cases.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@339025 91177308-0d34-0410-b5e6-96231b3b80d8
2018-08-06 15:40:20 +00:00
Tom Stellard
1d6fd076a3 AMDGPU: Refactor Subtarget classes
Summary:
This is a follow-up to r335942.
- Merge SISubtarget into AMDGPUSubtarget and rename to GCNSubtarget
- Rename AMDGPUCommonSubtarget to AMDGPUSubtarget
- Merge R600Subtarget::Generation and GCNSubtarget::Generation into
  AMDGPUSubtarget::Generation.

Reviewers: arsenm, jvesely

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D49037

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336851 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-11 20:59:01 +00:00
Tom Stellard
cba2181e77 AMDGPU: Separate R600 and GCN TableGen files
Summary:
We now have two sets of generated TableGen files, one for R600 and one
for GCN, so each sub-target now has its own tables of instructions,
registers, ISel patterns, etc.  This should help reduce compile time
since each sub-target now only has to consider information that
is specific to itself.  This will also help prevent the R600
sub-target from slowing down new features for GCN, like disassembler
support, GlobalISel, etc.

Reviewers: arsenm, nhaehnle, jvesely

Reviewed By: arsenm

Subscribers: MatzeB, kzhuravl, wdng, mgorny, yaxunl, dstuttard, tpr, t-tye, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D46365

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335942 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-28 23:47:12 +00:00
Tom Stellard
f02d6fd47c AMDGPU: Remove #include "MCTargetDesc/AMDGPUMCTargetDesc.h" from common headers
Summary:
MCTargetDesc/AMDGPUMCTargetDesc.h contains enums for all the instuction
and register defintions, which are huge so we only want to include
them where needed.

This will also make it easier if we want to split the R600 and GCN
definitions into separate tablegenerated files.

I was unable to remove AMDGPUMCTargetDesc.h from SIMachineFunctionInfo.h
because it uses some enums from the header to initialize default values
for the SIMachineFunction class, so I ended up having to remove includes of
SIMachineFunctionInfo.h from headers too.

Reviewers: arsenm, nhaehnle

Reviewed By: nhaehnle

Subscribers: MatzeB, kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D46272

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@332930 91177308-0d34-0410-b5e6-96231b3b80d8
2018-05-22 02:03:23 +00:00
Nicola Zaghen
0818e789cb Rename DEBUG macro to LLVM_DEBUG.
The DEBUG() macro is very generic so it might clash with other projects.
The renaming was done as follows:
- git grep -l 'DEBUG' | xargs sed -i 's/\bDEBUG\s\?(/LLVM_DEBUG(/g'
- git diff -U0 master | ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM
- Manual change to APInt
- Manually chage DOCS as regex doesn't match it.

In the transition period the DEBUG() macro is still present and aliased
to the LLVM_DEBUG() one.

Differential Revision: https://reviews.llvm.org/D43624



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@332240 91177308-0d34-0410-b5e6-96231b3b80d8
2018-05-14 12:53:11 +00:00
Matt Arsenault
ac9b3ef76a AMDGPU: Add Vega12 and Vega20
Changes by
  Matt Arsenault
  Konstantin Zhuravlyov

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@331215 91177308-0d34-0410-b5e6-96231b3b80d8
2018-04-30 19:08:16 +00:00
Stanislav Mekhanoshin
dbe20f7ece [AMDGPU] Truncate packed inline constant
If a packed inline constant is sign extended it must be truncated
after the shift. I.e. a constant (0xH0000, 0xHBC00), will be represented
as 0xFFFFFFFFBC000000 in the IR because the immediate is sign extended
to 64 bit. After the value shifted right by 16 to use it in a low part
with op_sel_hi it becomes 0xFFFFFFFFBC00 and does not qualify as inline
constant any longer.

Fixed the error and added verification code. Without the fix and with
the verification bug is causing pk_max_f16_literal.ll to fail.

Differential Revision: https://reviews.llvm.org/D45987

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@330752 91177308-0d34-0410-b5e6-96231b3b80d8
2018-04-24 18:17:55 +00:00
Stanislav Mekhanoshin
8aac337938 [AMDGPU] Use packed literals with zero either lower or hi part
Differential Revision: https://reviews.llvm.org/D45790

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@330365 91177308-0d34-0410-b5e6-96231b3b80d8
2018-04-19 21:16:50 +00:00
Stanislav Mekhanoshin
eca9ecf8f9 [AMDGPU] Enabled v2.16 literals for VOP3P
Literal encoding needs op_sel_hi to select low 16 bit in this case.

Differential Revision: https://reviews.llvm.org/D45745

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@330230 91177308-0d34-0410-b5e6-96231b3b80d8
2018-04-17 23:09:05 +00:00
Matt Arsenault
5c56853ab7 AMDGPU: Fix crash when constant folding with physreg operand
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@327209 91177308-0d34-0410-b5e6-96231b3b80d8
2018-03-10 16:05:35 +00:00
Matt Arsenault
0bd3aded70 AMDGPU: Don't crash when trying to fold implicit operands
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324550 91177308-0d34-0410-b5e6-96231b3b80d8
2018-02-08 01:12:46 +00:00
Matthias Braun
d318139827 MachineFunction: Return reference from getFunction(); NFC
The Function can never be nullptr so we can return a reference.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320884 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-15 22:22:58 +00:00
Matthias Braun
fa621d294f Rename LiveIntervalAnalysis.h to LiveIntervals.h
Headers/Implementation files should be named after the class they
declare/define.

Also eliminated an `#include "llvm/CodeGen/LiveIntervalAnalysis.h"` in
favor of `class LiveIntarvals;`

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320546 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-13 02:51:04 +00:00
Francis Visoiu Mistrih
fd11bc0813 [CodeGen] Use MachineOperand::print in the MIRPrinter for MO_Register.
Work towards the unification of MIR and debug output by refactoring the
interfaces.

For MachineOperand::print, keep a simple version that can be easily called
from `dump()`, and a more complex one which will be called from both the
MIRPrinter and MachineInstr::print.

Add extra checks inside MachineOperand for detached operands (operands
with getParent() == nullptr).

https://reviews.llvm.org/D40836

* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/kill: ([^ ]+) ([^ ]+)<def> ([^ ]+)/kill: \1 def \2 \3/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/kill: ([^ ]+) ([^ ]+) ([^ ]+)<def>/kill: \1 \2 def \3/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/kill: def ([^ ]+) ([^ ]+) ([^ ]+)<def>/kill: def \1 \2 def \3/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/<def>//g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<kill>/killed \1/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<imp-use,kill>/implicit killed \1/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<dead>/dead \1/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<def[ ]*,[ ]*dead>/dead \1/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<imp-def[ ]*,[ ]*dead>/implicit-def dead \1/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<imp-def>/implicit-def \1/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<imp-use>/implicit \1/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<internal>/internal \1/g'
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" -o -name "*.s" \) -type f -print0 | xargs -0 sed -i '' -E 's/([^ ]+)<undef>/undef \1/g'

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320022 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-07 10:40:31 +00:00
Francis Visoiu Mistrih
7384652668 [CodeGen] Print "%vreg0" as "%0" in both MIR and debug output
As part of the unification of the debug format and the MIR format, avoid
printing "vreg" for virtual registers (which is one of the current MIR
possibilities).

Basically:

* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E "s/%vreg([0-9]+)/%\1/g"
* grep -nr '%vreg' . and fix if needed
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E "s/ vreg([0-9]+)/ %\1/g"
* grep -nr 'vreg[0-9]\+' . and fix if needed

Differential Revision: https://reviews.llvm.org/D40420

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319427 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-30 12:12:19 +00:00
Francis Visoiu Mistrih
a4ec08b6fd [CodeGen] Print register names in lowercase in both MIR and debug output
As part of the unification of the debug format and the MIR format,
always print registers as lowercase.

* Only debug printing is affected. It now follows MIR.

Differential Revision: https://reviews.llvm.org/D40417

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319187 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-28 17:15:09 +00:00