archived-llvm

mirror of https://github.com/RPCS3/llvm.git synced 2026-01-31 01:25:19 +01:00

Author	SHA1	Message	Date
Alexander Timofeev	23476c4f80	[AMDGPU] Prevent Machine Copy Propagation from replacing live copy with the dead one Differential revision: https://reviews.llvm.org/D38754 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317884 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-10 12:21:10 +00:00
Yaxun Liu	bd2cfd038d	[AMDGPU] Fix pointer info for lowering load/store for r600 for amdgiz environment r600 uses dummy pointer info for lowering load/store. Since dummy pointer info assumes address space 0, this causes isel failure when temporary load/store SDNodes are generated for amdgiz environment. Since the offest is not constant, FixedStack pseudo source value cannot be used to create the pointer info. This patch creates pointer info using llvm undef value. At least this provides correct address space so that isel can be done correctly. Differential Revision: https://reviews.llvm.org/D39698 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317862 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-10 02:03:28 +00:00
Yaxun Liu	d3bd6cf5cc	[AMDGPU] Fix pointer info for pseudo source for r600 The pointer info for pseudo source for r600 is not correct when alloca addr space is not 0, which causes invalid SDNode for r600---amdgiz. This patch fixes that. Differential Revision: https://reviews.llvm.org/D39670 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317861 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-10 01:53:24 +00:00
Marek Olsak	9960086c11	AMDGPU: Merge BUFFER_STORE_DWORD_OFFEN/OFFSET into x2, x4 Summary: Only 56 shaders (out of 48486) are affected. Totals from affected shaders (changed stats only): SGPRS: 2420 -> 2460 (1.65 %) Spilled VGPRs: 94 -> 112 (19.15 %) Scratch size: 524 -> 528 (0.76 %) dwords per thread Code Size: 187400 -> 184992 (-1.28 %) bytes One DiRT Showdown shader spills 6 more VGPRs. One Grid Autosport shader spills 12 more VGPRs. The other 54 shaders only have a decrease in code size. (I'm ignoring the SGPR noise) Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D39012 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317755 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-09 01:52:55 +00:00
Marek Olsak	aa75d4aeb0	AMDGPU: Lower buffer store and atomic intrinsics manually Summary: Without this, SIMemoryLegalizer inserts s_waitcnt vmcnt(0) before every buffer store and atomic instruction. Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D39060 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317754 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-09 01:52:48 +00:00
Marek Olsak	e79e4fb9f1	AMDGPU: Merge BUFFER_LOAD_DWORD_OFFSET into x2, x4 Summary: Only 3 (out of 48486) shaders are affected. Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D38951 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317753 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-09 01:52:36 +00:00
Marek Olsak	79b91c25d7	AMDGPU: Merge BUFFER_LOAD_DWORD_OFFEN into x2, x4 Summary: -9.9% code size decrease in affected shaders. Totals (changed stats only): SGPRS: 2151462 -> 2170646 (0.89 %) VGPRS: 1634612 -> 1640288 (0.35 %) Spilled SGPRs: 8942 -> 8940 (-0.02 %) Code Size: 52940672 -> 51727288 (-2.29 %) bytes Max Waves: 373066 -> 371718 (-0.36 %) Totals from affected shaders: SGPRS: 283520 -> 302704 (6.77 %) VGPRS: 227632 -> 233308 (2.49 %) Spilled SGPRs: 3966 -> 3964 (-0.05 %) Code Size: 12203080 -> 10989696 (-9.94 %) bytes Max Waves: 44070 -> 42722 (-3.06 %) Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D38950 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317752 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-09 01:52:30 +00:00
Marek Olsak	1a1019d019	AMDGPU: Merge S_BUFFER_LOAD_DWORD_IMM into x2, x4 Summary: Only constant offsets (*_IMM opcodes) are merged. It reuses code for LDS load/store merging. It relies on the scheduler to group loads. The results are mixed, I think they are mostly positive. Most shaders are affected, so here are total stats only: SGPRS: 2072198 -> 2151462 (3.83 %) VGPRS: 1628024 -> 1634612 (0.40 %) Spilled SGPRs: 7883 -> 8942 (13.43 %) Spilled VGPRs: 97 -> 101 (4.12 %) Scratch size: 1488 -> 1492 (0.27 %) dwords per thread Code Size: 60222620 -> 52940672 (-12.09 %) bytes Max Waves: 374337 -> 373066 (-0.34 %) There is 13.4% increase in SGPR spilling, DiRT Showdown spills a few more VGPRs (now 37), but 12% decrease in code size. These are the new stats for SGPR spilling. We already spill a lot SGPRs, so it's uncertain whether more spilling will make any difference since SGPRs are always spilled to VGPRs: SGPR SPILLING APPS Shaders SpillSGPR AvgPerSh alien_isolation 2938 100 0.0 batman_arkham_origins 589 6 0.0 bioshock-infinite 1769 4 0.0 borderlands2 3968 22 0.0 counter_strike_glob.. 1142 60 0.1 deus_ex_mankind_div.. 1410 79 0.1 dirt-showdown 533 4 0.0 dirt_rally 364 1163 3.2 divinity 1052 2 0.0 dota2 1747 7 0.0 f1-2015 776 1515 2.0 grid_autosport 1767 1505 0.9 hitman 1413 273 0.2 left_4_dead_2 1762 4 0.0 life_is_strange 1296 26 0.0 mad_max 358 96 0.3 metro_2033_redux 2670 60 0.0 payday2 1362 22 0.0 portal 474 3 0.0 saints_row_iv 1704 8 0.0 serious_sam_3_bfe 392 1348 3.4 shadow_of_mordor 1418 12 0.0 shadow_warrior 3956 239 0.1 talos_principle 324 1735 5.4 thea 172 17 0.1 tomb_raider 1449 215 0.1 total_war_warhammer 242 56 0.2 ue4_effects_cave 295 55 0.2 ue4_elemental 572 12 0.0 unigine_tropics 210 56 0.3 unigine_valley 278 152 0.5 victor_vran 1262 84 0.1 yofrankie 82 2 0.0 Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D38949 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317751 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-09 01:52:23 +00:00
Marek Olsak	c94ae749c0	AMDGPU: Fold immediate offset into BUFFER_LOAD_DWORD lowered from SMEM Summary: -5.3% code size in affected shaders. Changed stats only: 48486 shaders in 30489 tests Totals: SGPRS: 2086406 -> 2072430 (-0.67 %) VGPRS: 1626872 -> 1627960 (0.07 %) Spilled SGPRs: 7865 -> 7912 (0.60 %) Code Size: 60978060 -> 60188764 (-1.29 %) bytes Max Waves: 374530 -> 374342 (-0.05 %) Totals from affected shaders: SGPRS: 299664 -> 285688 (-4.66 %) VGPRS: 233844 -> 234932 (0.47 %) Spilled SGPRs: 3959 -> 4006 (1.19 %) Code Size: 14905272 -> 14115976 (-5.30 %) bytes Max Waves: 46202 -> 46014 (-0.41 %) Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D38915 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317750 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-09 01:52:17 +00:00
Matt Arsenault	a932c3f118	AMDGPU: Set correct sched model on v_mad_u64_u32 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317645 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-08 00:48:25 +00:00
Bjorn Pettersson	4cbab70b62	[MIRPrinter] Use %subreg.xxx syntax for subregister index operands Summary: Print %subreg.<subregidxname> instead of just the subregister index when printing immediate operands corresponding to subreg indices in INSERT_SUBREG, EXTRACT_SUBREG, SUBREG_TO_REG and REG_SEQUENCE. Reviewers: qcolombet, MatzeB Reviewed By: MatzeB Subscribers: nhaehnle, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D39696 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317513 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-06 21:46:06 +00:00
Matt Arsenault	bd04b64cd1	AMDGPU: Select v_mad_u64_u32 and v_mad_i64_i32 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317492 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-06 17:04:37 +00:00
Yaxun Liu	ec1f0cc416	[AMDGPU] Change alloca addr space of r600 to 5 for amdgiz environment Differential Revision: https://reviews.llvm.org/D39657 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317479 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-06 14:32:33 +00:00
Yaxun Liu	13a223e6e6	[AMDGPU] Fix assertion due to assuming pointer in default addr space is 32 bit The backend assumes pointer in default addr space is 32 bit, which is not true for the new addr space mapping and causes assertion for unresolved functions. This patch fixes that. Differential Revision: https://reviews.llvm.org/D39643 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317476 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-06 13:01:33 +00:00
Yaxun Liu	35934f8100	[AMDGPU] Remove hardcoded address space value from AMDGPULibFunc AMDGPULibFunc hardcodes address space values of the old address space mapping, which causes invalid addrspacecast instructions and undefined functions in APPSDK sample MonteCarloAsianDP. This patch fixes that. Differential Revision: https://reviews.llvm.org/D39616 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317409 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-04 17:37:43 +00:00
Marek Olsak	0ce6825d9e	AMDGPU: Select s_buffer_load_dword with a non-constant SGPR offset Summary: Apps that benefit: - alien isolation - bioshock infinite - civilization: beyond earth - company of heroes 2 - dirt showdown - dota 2 - F1 2015 - grid autosport - hitman - legend of grimrock - serious sam 3: bfe - shadow warrior - talos principle - total war: warhammer - UE4 demos: effects cave, elemental, sun temple Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D38914 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317038 91177308-0d34-0410-b5e6-96231b3b80d8	2017-10-31 21:06:42 +00:00
Yaxun Liu	a52756b2c9	[AMDGPU] Emit metadata for hidden arguments for kernel enqueue Identifies kernels which performs device side kernel enqueues and emit metadata for the associated hidden kernel arguments. Such kernels are marked with calls-enqueue-kernel function attribute by AMDGPUOpenCLEnqueueKernelLowering pass and later on hidden kernel arguments metadata HiddenDefaultQueue and HiddenCompletionAction are emitted for them. Differential Revision: https://reviews.llvm.org/D39255 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316907 91177308-0d34-0410-b5e6-96231b3b80d8	2017-10-30 14:30:28 +00:00
Tom Stellard	f94e4f3ed7	AMDGPU/GlobalISel: Mark 32-bit G_FADD as legal Reviewers: arsenm Reviewed By: arsenm Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, igorb, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D38439 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316815 91177308-0d34-0410-b5e6-96231b3b80d8	2017-10-27 23:57:41 +00:00
Matt Arsenault	aa108d865d	DAG: Fold fma (fneg x), K, y -> fma x, -K, y git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316753 91177308-0d34-0410-b5e6-96231b3b80d8	2017-10-27 09:06:07 +00:00
Konstantin Zhuravlyov	fdd275fc25	AMDGPU: Commit missing fence-barrier test This should have been committed with memory model implementation git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316680 91177308-0d34-0410-b5e6-96231b3b80d8	2017-10-26 17:54:09 +00:00
Marek Olsak	b25352f371	AMDGPU: Handle s_buffer_load_dword hazard on SI Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye Differential Revision: https://reviews.llvm.org/D39171 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316666 91177308-0d34-0410-b5e6-96231b3b80d8	2017-10-26 14:43:02 +00:00
Alexander Richardson	5d1f72d3b5	Fix CodeGen/AMDGPU/fcanonicalize-elimination.ll on FreeBSD 11.0 Summary: On FreeBSD11.0 the FileCheck NOT string "1.0" will be matched by `.amd_amdgpu_isa "amdgcn-unknown-freebsd11.0--gfx802"` at the end of the file. Add a CHECK for that directive to avoid failing the test. Reviewers: rampitec, kzhuravl Reviewed By: rampitec, kzhuravl Subscribers: emaste, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits, krytarowski Differential Revision: https://reviews.llvm.org/D39306 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316616 91177308-0d34-0410-b5e6-96231b3b80d8	2017-10-25 21:44:21 +00:00
Konstantin Zhuravlyov	a3d8d8b25f	AMDGPU: Cleanup memory legalizer load/store tests git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316590 91177308-0d34-0410-b5e6-96231b3b80d8	2017-10-25 17:04:46 +00:00
Konstantin Zhuravlyov	e62a0cb491	AMDGPU/NFC: Rename memory legalizer tests: - memory-legalizer-atomic-load.ll -> memory-legalizer-load.ll - memory-legalizer-atomic-store.ll -> memory-legalizer-store.ll git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316586 91177308-0d34-0410-b5e6-96231b3b80d8	2017-10-25 16:45:28 +00:00
Daniil Fukalov	65f1735a62	[inlineasm] Fix crash when number of matched input constraint operands overflows signed char In a case when number of output constraint operands that has matched input operands doesn't fit to signed char, TargetLowering::ParseConstraints() can try to access ConstraintOperands (that is std::vector) with negative index. Reviewers: rampitec, arsenm Differential Review: https://reviews.llvm.org/D39125 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316574 91177308-0d34-0410-b5e6-96231b3b80d8	2017-10-25 12:51:32 +00:00
Matt Arsenault	7bb5f9ead4	DAG: Fix creating select with wrong condition type This code added in r297930 assumed that it could create a select with a condition type that is just an integer bitcast of the selected type. For AMDGPU any vselect is going to be scalarized (although the vector types are legal), and all select conditions must be i1 (the same as getSetCCResultType). This logic doesn't really make sense to me, but there's never really been a consistent policy in what the select condition mask type is supposed to be. Try to extend the logic for skipping the transform for condition types that aren't setccs. It doesn't seem quite right to me though, but checking conditions that seem more sensible (like whether the vselect is going to be expanded) doesn't work since this seems to depend on that also. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316554 91177308-0d34-0410-b5e6-96231b3b80d8	2017-10-25 07:14:07 +00:00
Justin Bogner	edab757966	MIR: Print the register class or bank in vreg defs This updates the MIRPrinter to include the regclass when printing virtual register defs, which is already valid syntax for the parser. That is, given 64 bit %0 and %1 in a "gpr" regbank, %1(s64) = COPY %0(s64) would now be written as %1:gpr(s64) = COPY %0(s64) While this change alone introduces a bit of redundancy with the registers block, it allows us to update the tests to be more concise and understandable and brings us closer to being able to remove the registers block completely. Note: We generally only print the class in defs, but there is one exception. If there are uses without any defs whatsoever, we'll print the class on all uses. I'm not completely convinced this comes up in meaningful machine IR, but for now the MIRParser and MachineVerifier both accept that kind of stuff, so we don't want to have a situation where we can print something we can't parse. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316479 91177308-0d34-0410-b5e6-96231b3b80d8	2017-10-24 18:04:54 +00:00
Marek Olsak	4fda278e9b	AMDGPU: Add new intrinsic llvm.amdgcn.kill(i1) Summary: Kill the thread if operand 0 == false. llvm.amdgcn.wqm.vote can be applied to the operand. Also allow kill in all shader stages. Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D38544 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316427 91177308-0d34-0410-b5e6-96231b3b80d8	2017-10-24 10:27:13 +00:00
Marek Olsak	7525c08782	AMDGPU: Add llvm.amdgcn.wqm.vote intrinsic Reviewers: arsenm, nhaehnle Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D38543 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316426 91177308-0d34-0410-b5e6-96231b3b80d8	2017-10-24 10:26:59 +00:00
Matt Arsenault	367cfd84c3	AMDGPU: Fix default range in non-kernel functions The range should be assumed to be the hardware maximum if a workitem intrinsic is used in a callable function which does not know the restricted limit of the calling kernel. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316346 91177308-0d34-0410-b5e6-96231b3b80d8	2017-10-23 17:09:35 +00:00
Justin Bogner	ba2fa173d9	Canonicalize a large number of mir tests using update_mir_test_checks This converts a large and somewhat arbitrary set of tests to use update_mir_test_checks. I ran the script on all of the tests I expect to need to modify for an upcoming mir syntax change and kept the ones that obviously didn't change the tests in ways that might make it harder to understand. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316137 91177308-0d34-0410-b5e6-96231b3b80d8	2017-10-18 23:18:12 +00:00
Konstantin Zhuravlyov	28cb7901b7	AMDGPU: Rename MaxFlatWorkgroupSize to MaxFlatWorkGroupSize for consistency Differential Revision: https://reviews.llvm.org/D38957 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316097 91177308-0d34-0410-b5e6-96231b3b80d8	2017-10-18 17:31:09 +00:00
Wei Ding	607acf30af	AMDGPU : Fix an error for the llvm.cttz implementation. Differential Revision: http://reviews.llvm.org/D39014 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316037 91177308-0d34-0410-b5e6-96231b3b80d8	2017-10-17 21:49:52 +00:00
Konstantin Zhuravlyov	cecf102e0c	AMDGPU: Start generating metadata for MaxFlatWorkGroupSize Differential Revision: https://reviews.llvm.org/D38958 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316024 91177308-0d34-0410-b5e6-96231b3b80d8	2017-10-17 20:03:21 +00:00
Mark Searles	df60d8e59e	Use the return value of UpdateNodeOperands(); in some cases, UpdateNodeOperands() modifies the node in-place and using the return value isn’t strictly necessary. However, it does not necessarily modify the node, but may return a resultant node if it already exists in the DAG. See comments in UpdateNodeOperands(). In that case, the return value must be used to avoid such scenarios as an infinite loop (node is assumed to have been updated, so added back to the worklist, and re-processed; however, node hasn’t changed so it is once again passed to UpdateNodeOperands(), assumed modified, added back to worklist; cycle infinitely repeats). Differential Revision: https://reviews.llvm.org/D38466 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315957 91177308-0d34-0410-b5e6-96231b3b80d8	2017-10-16 23:38:53 +00:00
Alexander Timofeev	f6eb7ff7f8	[AMDGPU] : revert r315908 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315916 91177308-0d34-0410-b5e6-96231b3b80d8	2017-10-16 16:57:37 +00:00
Alexander Timofeev	7811640b9a	[AMDGPU] Prevent Machine Copy Propagation from replacing live copy with the dead one Differential revision: https://reviews.llvm.org/D38754 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315908 91177308-0d34-0410-b5e6-96231b3b80d8	2017-10-16 14:35:29 +00:00
Konstantin Zhuravlyov	73363df463	AMDGPU: Temporary disable pal metadata check line in llvm-readobj test It fails on mips git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315837 91177308-0d34-0410-b5e6-96231b3b80d8	2017-10-14 23:42:11 +00:00
Konstantin Zhuravlyov	5556d8485b	AMDGPU: Bring HSA metadata on par with the specification Differential Revision: https://reviews.llvm.org/D38753 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315821 91177308-0d34-0410-b5e6-96231b3b80d8	2017-10-14 19:03:51 +00:00
Konstantin Zhuravlyov	7032e50fbc	llvm-readobj: Print AMDGPU note contents Differential Revision: https://reviews.llvm.org/D38752 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315819 91177308-0d34-0410-b5e6-96231b3b80d8	2017-10-14 18:21:42 +00:00
Konstantin Zhuravlyov	3b76e7aa12	AMDGPU: Cleanup elf-notes.ll test git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315816 91177308-0d34-0410-b5e6-96231b3b80d8	2017-10-14 17:36:53 +00:00
Konstantin Zhuravlyov	5376db4ce8	llvm-readobj: Print AMDGPU note type names Differential Revision: https://reviews.llvm.org/D38751 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315813 91177308-0d34-0410-b5e6-96231b3b80d8	2017-10-14 16:43:46 +00:00
Konstantin Zhuravlyov	473d951406	AMDGPU: Do not emit deprecated notes for code object v3 Differential Revision: https://reviews.llvm.org/D38749 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315810 91177308-0d34-0410-b5e6-96231b3b80d8	2017-10-14 15:59:07 +00:00
Konstantin Zhuravlyov	eb211af057	AMDGPU: Add support for isa version note - Emit NT_AMD_AMDGPU_ISA - Add assembler parsing for isa version directive - If isa version directive does not match command line arguments, then return error Differential Revision: https://reviews.llvm.org/D38748 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315808 91177308-0d34-0410-b5e6-96231b3b80d8	2017-10-14 15:40:33 +00:00
Matt Arsenault	93f0e0c1d1	AMDGPU: Implement hasBitPreservingFPLogic git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315754 91177308-0d34-0410-b5e6-96231b3b80d8	2017-10-13 21:10:22 +00:00
Matt Arsenault	b1763ab78e	AMDGPU: Look for src mods before fp_extend When selecting modifiers for mad_mix instructions, look at fneg/fabs that occur before the conversion. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315748 91177308-0d34-0410-b5e6-96231b3b80d8	2017-10-13 20:45:49 +00:00
Matt Arsenault	9a6875264b	AMDGPU: Implement isFPExtFoldable This helps match v_mad_mix* in some cases. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315744 91177308-0d34-0410-b5e6-96231b3b80d8	2017-10-13 20:18:59 +00:00
Wei Ding	29de9d738e	Implement custom lowering for ISD::CTTZ_ZERO_UNDEF and ISD::CTTZ. Differential Revision: http://reviews.llvm.org/D37348 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315610 91177308-0d34-0410-b5e6-96231b3b80d8	2017-10-12 19:37:14 +00:00
Tim Renouf	fadd83b09c	[AMDGPU] For amdpal, widen interpolation mode workaround Summary: The interpolation mode workaround ensures that at least one interpolation mode is enabled in PSInputAddr. It does not also check PSInputEna on the basis that the user might enable bits in that depending on run-time state. However, for amdpal os type, the user does not enable some bits after compilation based on run-time states; the register values being generated here are the final ones set in the hardware. Therefore, apply the workaround to PSInputAddr and PSInputEnable together. (The case where a bit is set in PSInputAddr but not in PSInputEnable is where the frontend set up an input arg for a particular interpolation mode, but nothing uses that input arg. Really we should have an earlier pass that removes such an arg.) Reviewers: arsenm, nhaehnle, dstuttard Subscribers: kzhuravl, wdng, yaxunl, t-tye, llvm-commits Differential Revision: https://reviews.llvm.org/D37758 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315591 91177308-0d34-0410-b5e6-96231b3b80d8	2017-10-12 16:16:41 +00:00
Konstantin Zhuravlyov	257828766c	AMDGPU/NFC: Minor clean ups in PAL metadata - Move PAL metadata definitions to AMDGPUMetadata - Make naming consistent with HSA metadata Differential Revision: https://reviews.llvm.org/D38745 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315523 91177308-0d34-0410-b5e6-96231b3b80d8	2017-10-11 22:41:09 +00:00

1 2 3 4 5 ...

1296 Commits