Commit Graph

1296 Commits

Author SHA1 Message Date
Alexander Timofeev
23476c4f80 [AMDGPU] Prevent Machine Copy Propagation from replacing live copy with the dead one
Differential revision: https://reviews.llvm.org/D38754

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317884 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-10 12:21:10 +00:00
Yaxun Liu
bd2cfd038d [AMDGPU] Fix pointer info for lowering load/store for r600 for amdgiz environment
r600 uses dummy pointer info for lowering load/store. Since dummy pointer info
assumes address space 0, this causes isel failure when temporary load/store SDNodes
are generated for amdgiz environment.

Since the offest is not constant, FixedStack pseudo source value cannot be used
to create the pointer info. This patch creates pointer info using llvm undef value.
At least this provides correct address space so that isel can be done correctly.

Differential Revision: https://reviews.llvm.org/D39698


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317862 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-10 02:03:28 +00:00
Yaxun Liu
d3bd6cf5cc [AMDGPU] Fix pointer info for pseudo source for r600
The pointer info for pseudo source for r600 is not correct when
alloca addr space is not 0, which causes invalid SDNode for r600---amdgiz.

This patch fixes that.

Differential Revision: https://reviews.llvm.org/D39670


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317861 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-10 01:53:24 +00:00
Marek Olsak
9960086c11 AMDGPU: Merge BUFFER_STORE_DWORD_OFFEN/OFFSET into x2, x4
Summary:
Only 56 shaders (out of 48486) are affected.

Totals from affected shaders (changed stats only):
SGPRS: 2420 -> 2460 (1.65 %)
Spilled VGPRs: 94 -> 112 (19.15 %)
Scratch size: 524 -> 528 (0.76 %) dwords per thread
Code Size: 187400 -> 184992 (-1.28 %) bytes

One DiRT Showdown shader spills 6 more VGPRs.
One Grid Autosport shader spills 12 more VGPRs.

The other 54 shaders only have a decrease in code size.
(I'm ignoring the SGPR noise)

Reviewers: arsenm, nhaehnle

Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye

Differential Revision: https://reviews.llvm.org/D39012

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317755 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-09 01:52:55 +00:00
Marek Olsak
aa75d4aeb0 AMDGPU: Lower buffer store and atomic intrinsics manually
Summary:
Without this, SIMemoryLegalizer inserts s_waitcnt vmcnt(0) before every
buffer store and atomic instruction.

Reviewers: arsenm, nhaehnle

Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye

Differential Revision: https://reviews.llvm.org/D39060

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317754 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-09 01:52:48 +00:00
Marek Olsak
e79e4fb9f1 AMDGPU: Merge BUFFER_LOAD_DWORD_OFFSET into x2, x4
Summary: Only 3 (out of 48486) shaders are affected.

Reviewers: arsenm, nhaehnle

Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D38951

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317753 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-09 01:52:36 +00:00
Marek Olsak
79b91c25d7 AMDGPU: Merge BUFFER_LOAD_DWORD_OFFEN into x2, x4
Summary:
-9.9% code size decrease in affected shaders.

Totals (changed stats only):
SGPRS: 2151462 -> 2170646 (0.89 %)
VGPRS: 1634612 -> 1640288 (0.35 %)
Spilled SGPRs: 8942 -> 8940 (-0.02 %)
Code Size: 52940672 -> 51727288 (-2.29 %) bytes
Max Waves: 373066 -> 371718 (-0.36 %)

Totals from affected shaders:
SGPRS: 283520 -> 302704 (6.77 %)
VGPRS: 227632 -> 233308 (2.49 %)
Spilled SGPRs: 3966 -> 3964 (-0.05 %)
Code Size: 12203080 -> 10989696 (-9.94 %) bytes
Max Waves: 44070 -> 42722 (-3.06 %)

Reviewers: arsenm, nhaehnle

Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye

Differential Revision: https://reviews.llvm.org/D38950

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317752 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-09 01:52:30 +00:00
Marek Olsak
1a1019d019 AMDGPU: Merge S_BUFFER_LOAD_DWORD_IMM into x2, x4
Summary:
Only constant offsets (*_IMM opcodes) are merged.
It reuses code for LDS load/store merging.
It relies on the scheduler to group loads.

The results are mixed, I think they are mostly positive. Most shaders are
affected, so here are total stats only:

 SGPRS: 2072198 -> 2151462 (3.83 %)
 VGPRS: 1628024 -> 1634612 (0.40 %)
 Spilled SGPRs: 7883 -> 8942 (13.43 %)
 Spilled VGPRs: 97 -> 101 (4.12 %)
 Scratch size: 1488 -> 1492 (0.27 %) dwords per thread
 Code Size: 60222620 -> 52940672 (-12.09 %) bytes
 Max Waves: 374337 -> 373066 (-0.34 %)

There is 13.4% increase in SGPR spilling, DiRT Showdown spills a few more
VGPRs (now 37), but 12% decrease in code size.

These are the new stats for SGPR spilling. We already spill a lot SGPRs,
so it's uncertain whether more spilling will make any difference since
SGPRs are always spilled to VGPRs:

 SGPR SPILLING APPS   Shaders SpillSGPR AvgPerSh
 alien_isolation         2938       100      0.0
 batman_arkham_origins    589         6      0.0
 bioshock-infinite       1769         4      0.0
 borderlands2            3968        22      0.0
 counter_strike_glob..   1142        60      0.1
 deus_ex_mankind_div..   1410        79      0.1
 dirt-showdown            533         4      0.0
 dirt_rally               364      1163      3.2
 divinity                1052         2      0.0
 dota2                   1747         7      0.0
 f1-2015                  776      1515      2.0
 grid_autosport          1767      1505      0.9
 hitman                  1413       273      0.2
 left_4_dead_2           1762         4      0.0
 life_is_strange         1296        26      0.0
 mad_max                  358        96      0.3
 metro_2033_redux        2670        60      0.0
 payday2                 1362        22      0.0
 portal                   474         3      0.0
 saints_row_iv           1704         8      0.0
 serious_sam_3_bfe        392      1348      3.4
 shadow_of_mordor        1418        12      0.0
 shadow_warrior          3956       239      0.1
 talos_principle          324      1735      5.4
 thea                     172        17      0.1
 tomb_raider             1449       215      0.1
 total_war_warhammer      242        56      0.2
 ue4_effects_cave         295        55      0.2
 ue4_elemental            572        12      0.0
 unigine_tropics          210        56      0.3
 unigine_valley           278       152      0.5
 victor_vran             1262        84      0.1
 yofrankie                 82         2      0.0

Reviewers: arsenm, nhaehnle

Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye

Differential Revision: https://reviews.llvm.org/D38949

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317751 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-09 01:52:23 +00:00
Marek Olsak
c94ae749c0 AMDGPU: Fold immediate offset into BUFFER_LOAD_DWORD lowered from SMEM
Summary:
-5.3% code size in affected shaders.

Changed stats only:

48486 shaders in 30489 tests
Totals:
SGPRS: 2086406 -> 2072430 (-0.67 %)
VGPRS: 1626872 -> 1627960 (0.07 %)
Spilled SGPRs: 7865 -> 7912 (0.60 %)
Code Size: 60978060 -> 60188764 (-1.29 %) bytes
Max Waves: 374530 -> 374342 (-0.05 %)

Totals from affected shaders:
SGPRS: 299664 -> 285688 (-4.66 %)
VGPRS: 233844 -> 234932 (0.47 %)
Spilled SGPRs: 3959 -> 4006 (1.19 %)
Code Size: 14905272 -> 14115976 (-5.30 %) bytes
Max Waves: 46202 -> 46014 (-0.41 %)

Reviewers: arsenm, nhaehnle

Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye

Differential Revision: https://reviews.llvm.org/D38915

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317750 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-09 01:52:17 +00:00
Matt Arsenault
a932c3f118 AMDGPU: Set correct sched model on v_mad_u64_u32
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317645 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-08 00:48:25 +00:00
Bjorn Pettersson
4cbab70b62 [MIRPrinter] Use %subreg.xxx syntax for subregister index operands
Summary:
Print %subreg.<subregidxname> instead of just the subregister
index when printing immediate operands corresponding to subreg
indices in INSERT_SUBREG, EXTRACT_SUBREG, SUBREG_TO_REG and
REG_SEQUENCE.

Reviewers: qcolombet, MatzeB

Reviewed By: MatzeB

Subscribers: nhaehnle, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D39696

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317513 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-06 21:46:06 +00:00
Matt Arsenault
bd04b64cd1 AMDGPU: Select v_mad_u64_u32 and v_mad_i64_i32
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317492 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-06 17:04:37 +00:00
Yaxun Liu
ec1f0cc416 [AMDGPU] Change alloca addr space of r600 to 5 for amdgiz environment
Differential Revision: https://reviews.llvm.org/D39657


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317479 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-06 14:32:33 +00:00
Yaxun Liu
13a223e6e6 [AMDGPU] Fix assertion due to assuming pointer in default addr space is 32 bit
The backend assumes pointer in default addr space is 32 bit, which is not
true for the new addr space mapping and causes assertion for unresolved
functions.

This patch fixes that.

Differential Revision: https://reviews.llvm.org/D39643


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317476 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-06 13:01:33 +00:00
Yaxun Liu
35934f8100 [AMDGPU] Remove hardcoded address space value from AMDGPULibFunc
AMDGPULibFunc hardcodes address space values of the old address space mapping,
which causes invalid addrspacecast instructions and undefined functions in
APPSDK sample MonteCarloAsianDP.

This patch fixes that.

Differential Revision: https://reviews.llvm.org/D39616


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317409 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-04 17:37:43 +00:00
Marek Olsak
0ce6825d9e AMDGPU: Select s_buffer_load_dword with a non-constant SGPR offset
Summary:
Apps that benefit:
- alien isolation
- bioshock infinite
- civilization: beyond earth
- company of heroes 2
- dirt showdown
- dota 2
- F1 2015
- grid autosport
- hitman
- legend of grimrock
- serious sam 3: bfe
- shadow warrior
- talos principle
- total war: warhammer
- UE4 demos: effects cave, elemental, sun temple

Reviewers: arsenm, nhaehnle

Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye

Differential Revision: https://reviews.llvm.org/D38914

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317038 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-31 21:06:42 +00:00
Yaxun Liu
a52756b2c9 [AMDGPU] Emit metadata for hidden arguments for kernel enqueue
Identifies kernels which performs device side kernel enqueues and emit
metadata for the associated hidden kernel arguments. Such kernels are
marked with calls-enqueue-kernel function attribute by
AMDGPUOpenCLEnqueueKernelLowering pass and later on
hidden kernel arguments metadata HiddenDefaultQueue and
HiddenCompletionAction are emitted for them.

Differential Revision: https://reviews.llvm.org/D39255


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316907 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-30 14:30:28 +00:00
Tom Stellard
f94e4f3ed7 AMDGPU/GlobalISel: Mark 32-bit G_FADD as legal
Reviewers: arsenm

Reviewed By: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D38439

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316815 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-27 23:57:41 +00:00
Matt Arsenault
aa108d865d DAG: Fold fma (fneg x), K, y -> fma x, -K, y
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316753 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-27 09:06:07 +00:00
Konstantin Zhuravlyov
fdd275fc25 AMDGPU: Commit missing fence-barrier test
This should have been committed with memory model implementation


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316680 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-26 17:54:09 +00:00
Marek Olsak
b25352f371 AMDGPU: Handle s_buffer_load_dword hazard on SI
Reviewers: arsenm, nhaehnle

Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye

Differential Revision: https://reviews.llvm.org/D39171

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316666 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-26 14:43:02 +00:00
Alexander Richardson
5d1f72d3b5 Fix CodeGen/AMDGPU/fcanonicalize-elimination.ll on FreeBSD 11.0
Summary:
On FreeBSD11.0 the FileCheck NOT string "1.0" will be matched by
`.amd_amdgpu_isa "amdgcn-unknown-freebsd11.0--gfx802"` at the end of the
file. Add a CHECK for that directive to avoid failing the test.

Reviewers: rampitec, kzhuravl

Reviewed By: rampitec, kzhuravl

Subscribers: emaste, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits, krytarowski

Differential Revision: https://reviews.llvm.org/D39306

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316616 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-25 21:44:21 +00:00
Konstantin Zhuravlyov
a3d8d8b25f AMDGPU: Cleanup memory legalizer load/store tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316590 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-25 17:04:46 +00:00
Konstantin Zhuravlyov
e62a0cb491 AMDGPU/NFC: Rename memory legalizer tests:
- memory-legalizer-atomic-load.ll -> memory-legalizer-load.ll
  - memory-legalizer-atomic-store.ll -> memory-legalizer-store.ll


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316586 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-25 16:45:28 +00:00
Daniil Fukalov
65f1735a62 [inlineasm] Fix crash when number of matched input constraint operands overflows signed char
In a case when number of output constraint operands that has matched input operands
doesn't fit to signed char, TargetLowering::ParseConstraints() can try to access
ConstraintOperands (that is std::vector) with negative index.

Reviewers: rampitec, arsenm

Differential Review: https://reviews.llvm.org/D39125

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316574 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-25 12:51:32 +00:00
Matt Arsenault
7bb5f9ead4 DAG: Fix creating select with wrong condition type
This code added in r297930 assumed that it could create
a select with a condition type that is just an integer
bitcast of the selected type. For AMDGPU any vselect is
going to be scalarized (although the vector types are legal),
and all select conditions must be i1 (the same as getSetCCResultType).

This logic doesn't really make sense to me, but there's
never really been a consistent policy in what the select
condition mask type is supposed to be. Try to extend
the logic for skipping the transform for condition types
that aren't setccs. It doesn't seem quite right to me though,
but checking conditions that seem more sensible (like whether the
vselect is going to be expanded) doesn't work since this
seems to depend on that also.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316554 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-25 07:14:07 +00:00
Justin Bogner
edab757966 MIR: Print the register class or bank in vreg defs
This updates the MIRPrinter to include the regclass when printing
virtual register defs, which is already valid syntax for the
parser. That is, given 64 bit %0 and %1 in a "gpr" regbank,

  %1(s64) = COPY %0(s64)

would now be written as

  %1:gpr(s64) = COPY %0(s64)

While this change alone introduces a bit of redundancy with the
registers block, it allows us to update the tests to be more concise
and understandable and brings us closer to being able to remove the
registers block completely.

Note: We generally only print the class in defs, but there is one
exception. If there are uses without any defs whatsoever, we'll print
the class on all uses. I'm not completely convinced this comes up in
meaningful machine IR, but for now the MIRParser and MachineVerifier
both accept that kind of stuff, so we don't want to have a situation
where we can print something we can't parse.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316479 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-24 18:04:54 +00:00
Marek Olsak
4fda278e9b AMDGPU: Add new intrinsic llvm.amdgcn.kill(i1)
Summary:
Kill the thread if operand 0 == false.
llvm.amdgcn.wqm.vote can be applied to the operand.

Also allow kill in all shader stages.

Reviewers: arsenm, nhaehnle

Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D38544

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316427 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-24 10:27:13 +00:00
Marek Olsak
7525c08782 AMDGPU: Add llvm.amdgcn.wqm.vote intrinsic
Reviewers: arsenm, nhaehnle

Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye

Differential Revision: https://reviews.llvm.org/D38543

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316426 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-24 10:26:59 +00:00
Matt Arsenault
367cfd84c3 AMDGPU: Fix default range in non-kernel functions
The range should be assumed to be the hardware maximum
if a workitem intrinsic is used in a callable function
which does not know the restricted limit of the calling
kernel.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316346 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-23 17:09:35 +00:00
Justin Bogner
ba2fa173d9 Canonicalize a large number of mir tests using update_mir_test_checks
This converts a large and somewhat arbitrary set of tests to use
update_mir_test_checks. I ran the script on all of the tests I expect
to need to modify for an upcoming mir syntax change and kept the ones
that obviously didn't change the tests in ways that might make it
harder to understand.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316137 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-18 23:18:12 +00:00
Konstantin Zhuravlyov
28cb7901b7 AMDGPU: Rename MaxFlatWorkgroupSize to MaxFlatWorkGroupSize for consistency
Differential Revision: https://reviews.llvm.org/D38957


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316097 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-18 17:31:09 +00:00
Wei Ding
607acf30af AMDGPU : Fix an error for the llvm.cttz implementation.
Differential Revision: http://reviews.llvm.org/D39014

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316037 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-17 21:49:52 +00:00
Konstantin Zhuravlyov
cecf102e0c AMDGPU: Start generating metadata for MaxFlatWorkGroupSize
Differential Revision: https://reviews.llvm.org/D38958


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316024 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-17 20:03:21 +00:00
Mark Searles
df60d8e59e Use the return value of UpdateNodeOperands(); in some cases, UpdateNodeOperands() modifies the node in-place and using the return value isn’t strictly necessary. However, it does not necessarily modify the node, but may return a resultant node if it already exists in the DAG. See comments in UpdateNodeOperands(). In that case, the return value must be used to avoid such scenarios as an infinite loop (node is assumed to have been updated, so added back to the worklist, and re-processed; however, node hasn’t changed so it is once again passed to UpdateNodeOperands(), assumed modified, added back to worklist; cycle infinitely repeats).
Differential Revision: https://reviews.llvm.org/D38466

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315957 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-16 23:38:53 +00:00
Alexander Timofeev
f6eb7ff7f8 [AMDGPU] : revert r315908
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315916 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-16 16:57:37 +00:00
Alexander Timofeev
7811640b9a [AMDGPU] Prevent Machine Copy Propagation from replacing live copy with the dead one
Differential revision: https://reviews.llvm.org/D38754

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315908 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-16 14:35:29 +00:00
Konstantin Zhuravlyov
73363df463 AMDGPU: Temporary disable pal metadata check line in llvm-readobj test
It fails on mips


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315837 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-14 23:42:11 +00:00
Konstantin Zhuravlyov
5556d8485b AMDGPU: Bring HSA metadata on par with the specification
Differential Revision: https://reviews.llvm.org/D38753


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315821 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-14 19:03:51 +00:00
Konstantin Zhuravlyov
7032e50fbc llvm-readobj: Print AMDGPU note contents
Differential Revision: https://reviews.llvm.org/D38752


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315819 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-14 18:21:42 +00:00
Konstantin Zhuravlyov
3b76e7aa12 AMDGPU: Cleanup elf-notes.ll test
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315816 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-14 17:36:53 +00:00
Konstantin Zhuravlyov
5376db4ce8 llvm-readobj: Print AMDGPU note type names
Differential Revision: https://reviews.llvm.org/D38751


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315813 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-14 16:43:46 +00:00
Konstantin Zhuravlyov
473d951406 AMDGPU: Do not emit deprecated notes for code object v3
Differential Revision: https://reviews.llvm.org/D38749


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315810 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-14 15:59:07 +00:00
Konstantin Zhuravlyov
eb211af057 AMDGPU: Add support for isa version note
- Emit NT_AMD_AMDGPU_ISA
  - Add assembler parsing for isa version directive
    - If isa version directive does not match command line arguments, then return error

Differential Revision: https://reviews.llvm.org/D38748


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315808 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-14 15:40:33 +00:00
Matt Arsenault
93f0e0c1d1 AMDGPU: Implement hasBitPreservingFPLogic
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315754 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-13 21:10:22 +00:00
Matt Arsenault
b1763ab78e AMDGPU: Look for src mods before fp_extend
When selecting modifiers for mad_mix instructions,
look at fneg/fabs that occur before the conversion.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315748 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-13 20:45:49 +00:00
Matt Arsenault
9a6875264b AMDGPU: Implement isFPExtFoldable
This helps match v_mad_mix* in some cases.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315744 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-13 20:18:59 +00:00
Wei Ding
29de9d738e Implement custom lowering for ISD::CTTZ_ZERO_UNDEF and ISD::CTTZ.
Differential Revision: http://reviews.llvm.org/D37348

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315610 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-12 19:37:14 +00:00
Tim Renouf
fadd83b09c [AMDGPU] For amdpal, widen interpolation mode workaround
Summary:
The interpolation mode workaround ensures that at least one
interpolation mode is enabled in PSInputAddr. It does not also check
PSInputEna on the basis that the user might enable bits in that
depending on run-time state.

However, for amdpal os type, the user does not enable some bits after
compilation based on run-time states; the register values being
generated here are the final ones set in the hardware. Therefore, apply
the workaround to PSInputAddr and PSInputEnable together. (The case
where a bit is set in PSInputAddr but not in PSInputEnable is where the
frontend set up an input arg for a particular interpolation mode, but
nothing uses that input arg. Really we should have an earlier pass that
removes such an arg.)

Reviewers: arsenm, nhaehnle, dstuttard

Subscribers: kzhuravl, wdng, yaxunl, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D37758

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315591 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-12 16:16:41 +00:00
Konstantin Zhuravlyov
257828766c AMDGPU/NFC: Minor clean ups in PAL metadata
- Move PAL metadata definitions to AMDGPUMetadata
  - Make naming consistent with HSA metadata

Differential Revision: https://reviews.llvm.org/D38745


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315523 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-11 22:41:09 +00:00