archived-llvm

mirror of https://github.com/RPCS3/llvm.git synced 2026-01-31 01:25:19 +01:00

Author	SHA1	Message	Date
Yaxun Liu	73f6582e91	CodeGen: Fix assertion in ScheduleDAGMILive::scheduleMI due to llvm.dbg.value Fix a bug in ScheduleDAGMILive::scheduleMI which causes BotRPTracker not tracking CurrentBottom in some rare cases involving llvm.dbg.value. This issues causes amdgcn target to assert when compiling some user codes with -g. Differential Revision: https://reviews.llvm.org/D42394 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@323214 91177308-0d34-0410-b5e6-96231b3b80d8	2018-01-23 16:04:53 +00:00
Mark Searles	fb6bc12a56	[AMDGPU] SI Load Store Optimizer: When merging with offset, use V_ADD_{I\|U}32_e64 - Change inserted add ( V_ADD_{I\|U}32_e32 ) to _e64 version ( V_ADD_{I\|U}32_e64 ) so that the add uses a vreg for the carry; this prevents inserted v_add from killing VCC; the _e64 version doesn't accept a literal in its encoding, so we need to introduce a mov instr as well to get the imm into a register. - Change pass name to "SI Load Store Optimizer"; this removes the '/', which complicates scripts. Differential Revision: https://reviews.llvm.org/D42124 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@323153 91177308-0d34-0410-b5e6-96231b3b80d8	2018-01-22 21:46:43 +00:00
Daniel Neilson	afa2e7e6a6	Remove alignment argument from memcpy/memmove/memset in favour of alignment attributes (Step 1) Summary: This is a resurrection of work first proposed and discussed in Aug 2015: http://lists.llvm.org/pipermail/llvm-dev/2015-August/089384.html and initially landed (but then backed out) in Nov 2015: http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html The @llvm.memcpy/memmove/memset intrinsics currently have an explicit argument which is required to be a constant integer. It represents the alignment of the dest (and source), and so must be the minimum of the actual alignment of the two. This change is the first in a series that allows source and dest to each have their own alignments by using the alignment attribute on their arguments. In this change we: 1) Remove the alignment argument. 2) Add alignment attributes to the source & dest arguments. We, temporarily, require that the alignments for source & dest be equal. For example, code which used to read: call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 100, i32 4, i1 false) will now read call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 4 %dest, i8* align 4 %src, i32 100, i1 false) Downstream users may have to update their lit tests that check for @llvm.memcpy/memmove/memset call/declaration patterns. The following extended sed script may help with updating the majority of your tests, but it does not catch all possible patterns so some manual checking and updating will be required. s~declare void @llvm\.mem(set\|cpy\|move)\.p([^(])$(.), i32, i1$~declare void @llvm.mem\1.p\2(\3, i1)~g s~call void @llvm\.memset\.p([^(])i8$i8([^])\ (.), i8 (.), i8 (.), i32 [01], i1 ([^)])$~call void @llvm.memset.p\1i8(i8\2* \3, i8 \4, i8 \5, i1 \6)~g s~call void @llvm\.memset\.p([^(])i16$i8([^])\ (.), i8 (.), i16 (.), i32 [01], i1 ([^)])$~call void @llvm.memset.p\1i16(i8\2* \3, i8 \4, i16 \5, i1 \6)~g s~call void @llvm\.memset\.p([^(])i32$i8([^])\ (.), i8 (.), i32 (.), i32 [01], i1 ([^)])$~call void @llvm.memset.p\1i32(i8\2* \3, i8 \4, i32 \5, i1 \6)~g s~call void @llvm\.memset\.p([^(])i64$i8([^])\ (.), i8 (.), i64 (.), i32 [01], i1 ([^)])$~call void @llvm.memset.p\1i64(i8\2* \3, i8 \4, i64 \5, i1 \6)~g s~call void @llvm\.memset\.p([^(])i128$i8([^])\ (.), i8 (.), i128 (.), i32 [01], i1 ([^)])$~call void @llvm.memset.p\1i128(i8\2* \3, i8 \4, i128 \5, i1 \6)~g s~call void @llvm\.memset\.p([^(])i8$i8([^])\ (.), i8 (.), i8 (.), i32 ([0-9]), i1 ([^)])$~call void @llvm.memset.p\1i8(i8\2 align \6 \3, i8 \4, i8 \5, i1 \7)~g s~call void @llvm\.memset\.p([^(])i16$i8([^])\ (.), i8 (.), i16 (.), i32 ([0-9]), i1 ([^)])$~call void @llvm.memset.p\1i16(i8\2 align \6 \3, i8 \4, i16 \5, i1 \7)~g s~call void @llvm\.memset\.p([^(])i32$i8([^])\ (.), i8 (.), i32 (.), i32 ([0-9]), i1 ([^)])$~call void @llvm.memset.p\1i32(i8\2 align \6 \3, i8 \4, i32 \5, i1 \7)~g s~call void @llvm\.memset\.p([^(])i64$i8([^])\ (.), i8 (.), i64 (.), i32 ([0-9]), i1 ([^)])$~call void @llvm.memset.p\1i64(i8\2 align \6 \3, i8 \4, i64 \5, i1 \7)~g s~call void @llvm\.memset\.p([^(])i128$i8([^])\ (.), i8 (.), i128 (.), i32 ([0-9]), i1 ([^)])$~call void @llvm.memset.p\1i128(i8\2 align \6 \3, i8 \4, i128 \5, i1 \7)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i8$i8([^])\ (.), i8([^])\ (.), i8 (.), i32 [01], i1 ([^)])$~call void @llvm.mem\1.p\2i8(i8\3 \4, i8\5* \6, i8 \7, i1 \8)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i16$i8([^])\ (.), i8([^])\ (.), i16 (.), i32 [01], i1 ([^)])$~call void @llvm.mem\1.p\2i16(i8\3 \4, i8\5* \6, i16 \7, i1 \8)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i32$i8([^])\ (.), i8([^])\ (.), i32 (.), i32 [01], i1 ([^)])$~call void @llvm.mem\1.p\2i32(i8\3 \4, i8\5* \6, i32 \7, i1 \8)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i64$i8([^])\ (.), i8([^])\ (.), i64 (.), i32 [01], i1 ([^)])$~call void @llvm.mem\1.p\2i64(i8\3 \4, i8\5* \6, i64 \7, i1 \8)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i128$i8([^])\ (.), i8([^])\ (.), i128 (.), i32 [01], i1 ([^)])$~call void @llvm.mem\1.p\2i128(i8\3 \4, i8\5* \6, i128 \7, i1 \8)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i8$i8([^])\ (.), i8([^])\ (.), i8 (.), i32 ([0-9]), i1 ([^)])$~call void @llvm.mem\1.p\2i8(i8\3* align \8 \4, i8\5* align \8 \6, i8 \7, i1 \9)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i16$i8([^])\ (.), i8([^])\ (.), i16 (.), i32 ([0-9]), i1 ([^)])$~call void @llvm.mem\1.p\2i16(i8\3* align \8 \4, i8\5* align \8 \6, i16 \7, i1 \9)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i32$i8([^])\ (.), i8([^])\ (.), i32 (.), i32 ([0-9]), i1 ([^)])$~call void @llvm.mem\1.p\2i32(i8\3* align \8 \4, i8\5* align \8 \6, i32 \7, i1 \9)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i64$i8([^])\ (.), i8([^])\ (.), i64 (.), i32 ([0-9]), i1 ([^)])$~call void @llvm.mem\1.p\2i64(i8\3* align \8 \4, i8\5* align \8 \6, i64 \7, i1 \9)~g s~call void @llvm\.mem(cpy\|move)\.p([^(])i128$i8([^])\ (.), i8([^])\ (.), i128 (.), i32 ([0-9]), i1 ([^)])$~call void @llvm.mem\1.p\2i128(i8\3* align \8 \4, i8\5* align \8 \6, i128 \7, i1 \9)~g The remaining changes in the series will: Step 2) Expand the IRBuilder API to allow creation of memcpy/memmove with differing source and dest alignments. Step 3) Update Clang to use the new IRBuilder API. Step 4) Update Polly to use the new IRBuilder API. Step 5) Update LLVM passes that create memcpy/memmove calls to use the new IRBuilder API, and those that use use MemIntrinsicInst::[get\|set]Alignment() to use getDestAlignment() and getSourceAlignment() instead. Step 6) Remove the single-alignment IRBuilder API for memcpy/memmove, and the MemIntrinsicInst::[get\|set]Alignment() methods. Reviewers: pete, hfinkel, lhames, reames, bollu Reviewed By: reames Subscribers: niosHD, reames, jholewinski, qcolombet, jfb, sanjoy, arsenm, dschuff, dylanmckay, mehdi_amini, sdardis, nemanjai, david2050, nhaehnle, javed.absar, sbc100, jgravelle-google, eraman, aheejin, kbarton, JDevlieghere, asb, rbar, johnrusso, simoncook, jordy.potman.lists, apazos, sabuasal, llvm-commits Differential Revision: https://reviews.llvm.org/D41675 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@322965 91177308-0d34-0410-b5e6-96231b3b80d8	2018-01-19 17:13:12 +00:00
Matthias Braun	10c6c76ce2	Move tests to the correct place test/CodeGen/MIR is for testing the MIR parser/printer. Tests for passes and targets belong to test/CodeGen/TARGETNAME. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@322925 91177308-0d34-0410-b5e6-96231b3b80d8	2018-01-19 06:08:15 +00:00
Changpeng Fang	1e426c2a34	AMDGPU/SI: Add d16 support for image intrinsics. Summary: This patch implements d16 support for image load, image store and image sample intrinsics. Reviewers: Matt, Brian. Differential Revision: https://reviews.llvm.org/D3991 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@322903 91177308-0d34-0410-b5e6-96231b3b80d8	2018-01-18 22:08:53 +00:00
Daniil Fukalov	a520636d37	[AMDGPU] add LDS f32 intrinsics added llvm.amdgcn.atomic.{add\|min\|max}.f32 intrinsics to allow generate ds_{add\|min\|max}[_rtn]_f32 instructions needed for OpenCL float atomics in LDS Reviewed by: arsenm Differential Revision: https://reviews.llvm.org/D37985 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@322656 91177308-0d34-0410-b5e6-96231b3b80d8	2018-01-17 14:05:05 +00:00
Stanislav Mekhanoshin	f20c58e01b	[AMDGPU] Add HW_REG_SH_MEM_BASES symbolic name for s_getreg_b32 Differential Revision: https://reviews.llvm.org/D41617 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@322500 91177308-0d34-0410-b5e6-96231b3b80d8	2018-01-15 18:49:15 +00:00
Tim Renouf	bc95ef3252	[AMDGPU] stop image_store being moved illegally Summary: A recent change 321556: AMDGPU: Remove mayLoad/hasSideEffects from MIMG stores can allow the machine instruction scheduler to move an image store past an image load using the same descriptor. V2: Fixed by marking image ops as mayAlias and isAliased. This may be overly conservative, and we may need to revisit. V3: Reverted test change done on 321556. Reviewers: arsenm, nhaehnle, dstuttard Subscribers: llvm-commits, t-tye, yaxunl, wdng, kzhuravl Differential Revision: https://reviews.llvm.org/D41969 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@322419 91177308-0d34-0410-b5e6-96231b3b80d8	2018-01-12 22:57:24 +00:00
Changpeng Fang	d6fbd6ac45	AMDGPU/SI: Add d16 support for buffer intrinsics. Differential Revision: https://reviews.llvm.org/D38906 Reviewers: Matt and Brian. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@322402 91177308-0d34-0410-b5e6-96231b3b80d8	2018-01-12 21:12:19 +00:00
Rafael Espindola	1e1801ccb0	Make internal/private GVs implicitly dso_local. While updating clang tests for having clang set dso_local I noticed that: - There are a lot of tests to update. - Many of the updates are redundant. They are redundant because a GV is "obviously dso_local". This patch starts formalizing that a bit by requiring that internal and private GVs be dso_local too. Since they all are, we don't have to print dso_local to the textual representation, making it a bit more compact and easier to read. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@322317 91177308-0d34-0410-b5e6-96231b3b80d8	2018-01-11 22:15:05 +00:00
Puyan Lotfi	b931375195	[MIR] Repurposing '$' sigil used by external symbols. Replacing with '&'. Planning to add support for named vregs. This puts is in a conundrum since physregs are named as well. To rectify this we need to use a sigil other than '%' for physregs in MIR. We've settled on using '$' for physregs but first we must repurpose it from external symbols using it, which is what this commit is all about. We think '&' will have familiar semantics for C/C++ users. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@322146 91177308-0d34-0410-b5e6-96231b3b80d8	2018-01-10 00:56:48 +00:00
Tim Renouf	ebe2b0b004	[SelectionDAG] Fixed f16-from-vector promotion problem Summary: In the case of an fp_extend of v1f16 to v1f32 where the v1f16 is the result of a bitcast from i16, avoid creating an illegal fp16_to_fp where the input is not a vector and the result is a v1f32. V2: The fix is now to avoid vector scalarization creating a v1->scalar bitcast. Reviewers: srhines, t.p.northover Subscribers: nhaehnle, llvm-commits, dstuttard, t-tye, yaxunl, wdng, kzhuravl, arsenm Differential Revision: https://reviews.llvm.org/D41126 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@322120 91177308-0d34-0410-b5e6-96231b3b80d8	2018-01-09 21:36:25 +00:00
Tim Renouf	0eaae26af3	[AMDGPU] Fixed incorrect uniform branch condition Summary: I had a case where multiple nested uniform ifs resulted in code that did v_cmp comparisons, combining the results with s_and_b64, s_or_b64 and s_xor_b64 and using the resulting mask in s_cbranch_vccnz, without first ensuring that bits for inactive lanes were clear. There was already code for inserting an "s_and_b64 vcc, exec, vcc" to clear bits for inactive lanes in the case that the branch is instruction selected as s_cbranch_scc1 and is then changed to s_cbranch_vccnz in SIFixSGPRCopies. I have added the same code into SILowerControlFlow for the case that the branch is instruction selected as s_cbranch_vccnz. This de-optimizes the code in some cases where the s_and is not needed, because vcc is the result of a v_cmp, or multiple v_cmp instructions combined by s_and/s_or. We should add a pass to re-optimize those cases. Reviewers: arsenm, kzhuravl Subscribers: wdng, yaxunl, t-tye, llvm-commits, dstuttard, timcorringham, nhaehnle Differential Revision: https://reviews.llvm.org/D41292 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@322119 91177308-0d34-0410-b5e6-96231b3b80d8	2018-01-09 21:34:43 +00:00
Francis Visoiu Mistrih	93f5fcdff3	[CodeGen] Don't print register classes in -debug output Since register classes and banks are already printed with the register definition, don't print it at the end of every instruction anymore. This follows MIR in this regard and is another step to the unification of the two formats. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@322086 91177308-0d34-0410-b5e6-96231b3b80d8	2018-01-09 15:39:44 +00:00
Matt Arsenault	aca5381bd1	StructurizeCFG: Fix broken backedge detection The work order was changed in r228186 from SCC order to RPO with an arbitrary sorting function. The sorting function attempted to move inner loop nodes earlier. This was was apparently relying on an assumption that every block in a given loop / the same loop depth would be seen before visiting another loop. In the broken testcase, a block outside of the loop was encountered before moving onto another block in the same loop. The testcase would then structurize such that one blocks unconditional successor could never be reached. Revert to plain RPO for the analysis phase. This fixes detecting edges as backedges that aren't really. The processing phase does use another visited set, and I'm unclear on whether the order there is as important. An arbitrary order doesn't work, and triggers some infinite loops. The reversed RPO list seems to work and is closer to the order that was used before, minus the arbitary custom sorting. A few of the changed tests now produce smaller code, and a few are slightly worse looking. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321751 91177308-0d34-0410-b5e6-96231b3b80d8	2018-01-03 18:45:37 +00:00
Philip Reames	3516b9f977	2nd attempt at "fixing" amdgpu tests after r321575 The test needs to be changed; it was exercising UB and that likely wasn't the intent of the test author. I simply removed the checks because I have absolutely no idea what this test was trying to accomplish. With multiple check patterns, no explanation, and no familiarity on my part with the ISA a true fix is going to have to come from someone familiar with the target. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321591 91177308-0d34-0410-b5e6-96231b3b80d8	2017-12-31 03:34:36 +00:00
Philip Reames	7f276b9a88	Test fix after r321575 The test in question was checking for a particular intepretation of undefined behavior. Relax the test to check that we simply don't crash. Sorry for the breakage, I don't generally build AMDGPU locally and just saw the failure this morning. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321589 91177308-0d34-0410-b5e6-96231b3b80d8	2017-12-30 18:42:37 +00:00
Matt Arsenault	8923569a3d	AMDGPU: Remove mayLoad/hasSideEffects from MIMG stores Atomics still have hasSideEffects set on them because of the mess that is the memory properties. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321556 91177308-0d34-0410-b5e6-96231b3b80d8	2017-12-29 17:18:18 +00:00
Mark Searles	a7ce6d3cc0	[AMDGPU] Turn off MergeConsecutiveStores() before Instruction Selection for AMDGPU. Commit `dbbb6c5fc3` turned on MergeConsecutiveStores() before Instruction Selection for all targets. Enough AMDGPU compiles go into an infinite loop ( MergeConsecutiveStores() merges two stores; LegalizeStoreOps() un-merges; MergeConsecutiveStores() re-merges, etc. ) to warrant turning it off until the issues can be addressed. Differential Revision: https://reviews.llvm.org/D41377 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321100 91177308-0d34-0410-b5e6-96231b3b80d8	2017-12-19 19:26:23 +00:00
Sean Fertile	b71c6a9b39	[Memcpy Loop Lowering] Remove the fixed int8 lowering. Switch over to the lowering that uses target supplied operand types. Differential Revision: https://reviews.llvm.org/D41201 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320989 91177308-0d34-0410-b5e6-96231b3b80d8	2017-12-18 15:31:14 +00:00
Yaxun Liu	06d39e2dc8	Recommit CodeGen: Fix assertion in machine inst sheduler due to llvm.dbg.value The regression on ppc64 was not due to this commit. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320788 91177308-0d34-0410-b5e6-96231b3b80d8	2017-12-15 03:56:57 +00:00
Yaxun Liu	92d81a8d46	Revert CodeGen: Fix assertion in machine inst sheduler due to llvm.dbg.value This commit might have caused regression on ppc64. Revert it to verify that. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320712 91177308-0d34-0410-b5e6-96231b3b80d8	2017-12-14 16:12:04 +00:00
Yaxun Liu	084f87947f	CodeGen: Fix assertion in machine inst sheduler due to llvm.dbg.value Two issues were found about machine inst scheduler when compiling ProRender with -g for amdgcn target: GCNScheduleDAGMILive::schedule tries to update LiveIntervals for DBG_VALUE, which it should not since DBG_VALUE is not mapped in LiveIntervals. when DBG_VALUE is the last instruction of MBB, ScheduleDAGInstrs::buildSchedGraph and ScheduleDAGMILive::scheduleMI does not move RPTracker properly, which causes assertion. This patch fixes that. Differential Revision: https://reviews.llvm.org/D41132 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320650 91177308-0d34-0410-b5e6-96231b3b80d8	2017-12-13 22:38:09 +00:00
Geoff Berry	3b391fe80e	[MachineOperand][MIR] Add isRenamable to MachineOperand. Summary: Add isRenamable() predicate to MachineOperand. This predicate can be used by machine passes after register allocation to determine whether it is safe to rename a given register operand. Register operands that aren't marked as renamable may be required to be assigned their current register to satisfy constraints that are not captured by the machine IR (e.g. ABI or ISA constraints). Reviewers: qcolombet, MatzeB, hfinkel Subscribers: nemanjai, mcrosier, javed.absar, llvm-commits Differential Revision: https://reviews.llvm.org/D39400 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320503 91177308-0d34-0410-b5e6-96231b3b80d8	2017-12-12 17:53:59 +00:00
Matt Arsenault	c009e3de76	LSR: Check more intrinsic pointer operands git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320424 91177308-0d34-0410-b5e6-96231b3b80d8	2017-12-11 21:38:43 +00:00
Konstantin Zhuravlyov	88dc958162	AMDGPU/GCN: Bring processors in sync with AMDGPUUsage - Add gfx704 - Change bonaire to gfx704 - Remove gfx804 - Remove gfx901 - Remove gfx903 Differential Revision: https://reviews.llvm.org/D40046 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320194 91177308-0d34-0410-b5e6-96231b3b80d8	2017-12-08 20:52:28 +00:00
Matt Arsenault	75d7256af4	AMDGPU: image_getlod and image_getresinfo do not read memory git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320187 91177308-0d34-0410-b5e6-96231b3b80d8	2017-12-08 20:00:57 +00:00
Konstantin Zhuravlyov	9621d21ae2	AMDGPU: Report Arg's Value name in metadata if kernel_arg_name metadata is not available Differential Revision: https://reviews.llvm.org/D40924 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320176 91177308-0d34-0410-b5e6-96231b3b80d8	2017-12-08 19:22:12 +00:00
Mark Searles	82e0652b95	[AMDGPU] Revert "[AMDGPU] Add options for waitcnt pass debugging; add instr count in debug output." Patch caused a buildbot failure; http://lab.llvm.org:8011/builders/lld-x86_64-darwin13/builds/15733/steps/build_Lld/logs/stdio : lib/Target/AMDGPU/SIInsertWaitcnts.cpp:396:11: error: private field 'InstCnt' is not used [-Werror,-Wunused-private-field] int32_t InstCnt = 0; ^ 1 error generated. " This reverts commit `71627f7901`. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320086 91177308-0d34-0410-b5e6-96231b3b80d8	2017-12-07 21:14:41 +00:00
Mark Searles	71627f7901	[AMDGPU] Add options for waitcnt pass debugging; add instr count in debug output. -amdgpu-waitcnt-forcezero={1\|0} Force all waitcnt instrs to be emitted as s_waitcnt vmcnt(0) expcnt(0) lgkmcnt(0) -amdgpu-waitcnt-forceexp=<n> Force emit a s_waitcnt expcnt(0) before the first <n> instrs -amdgpu-waitcnt-forcelgkm=<n> Force emit a s_waitcnt lgkmcnt(0) before the first <n> instrs -amdgpu-waitcnt-forcevm=<n> Force emit a s_waitcnt vmcnt(0) before the first <n> instrs Differential Revision: https://reviews.llvm.org/D40091 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320084 91177308-0d34-0410-b5e6-96231b3b80d8	2017-12-07 20:36:39 +00:00
Mark Searles	5e07001873	[AMDGPU] Add GCNHazardRecognizer::checkInlineAsmHazards() and GCNHazardRecognizer::checkVALUHazardsHelper(). checkInlineAsmHazards() checks INLINEASM for hazards that we particularly care about (so not exhaustive); this patch adds a check for INLINEASM that defs vregs that hold data-to-be stored by immediately preceding store of more than 8 bytes. If the instr were not within an INLINEASM, this scenario would be handled by checkVALUHazard(). Add checkVALUHazardsHelper(), which will be called by both checkVALUHazards() and checkInlineAsmHazards(). Differential Revision: https://reviews.llvm.org/D40098 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320083 91177308-0d34-0410-b5e6-96231b3b80d8	2017-12-07 20:34:25 +00:00
Francis Visoiu Mistrih	fd11bc0813	[CodeGen] Use MachineOperand::print in the MIRPrinter for MO_Register. Work towards the unification of MIR and debug output by refactoring the interfaces. For MachineOperand::print, keep a simple version that can be easily called from `dump()`, and a more complex one which will be called from both the MIRPrinter and MachineInstr::print. Add extra checks inside MachineOperand for detached operands (operands with getParent() == nullptr). https://reviews.llvm.org/D40836 * find . $ -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" -o -name ".s" $ -type f -print0 \| xargs -0 sed -i '' -E 's/kill: ([^ ]+) ([^ ]+)<def> ([^ ]+)/kill: \1 def \2 \3/g' find . $ -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" -o -name ".s" $ -type f -print0 \| xargs -0 sed -i '' -E 's/kill: ([^ ]+) ([^ ]+) ([^ ]+)<def>/kill: \1 \2 def \3/g' find . $ -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" -o -name ".s" $ -type f -print0 \| xargs -0 sed -i '' -E 's/kill: def ([^ ]+) ([^ ]+) ([^ ]+)<def>/kill: def \1 \2 def \3/g' find . $ -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" -o -name ".s" $ -type f -print0 \| xargs -0 sed -i '' -E 's/<def>//g' find . $ -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" -o -name ".s" $ -type f -print0 \| xargs -0 sed -i '' -E 's/([^ ]+)<kill>/killed \1/g' find . $ -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" -o -name ".s" $ -type f -print0 \| xargs -0 sed -i '' -E 's/([^ ]+)<imp-use,kill>/implicit killed \1/g' find . $ -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" -o -name ".s" $ -type f -print0 \| xargs -0 sed -i '' -E 's/([^ ]+)<dead>/dead \1/g' find . $ -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" -o -name ".s" $ -type f -print0 \| xargs -0 sed -i '' -E 's/([^ ]+)<def[ ],[ ]dead>/dead \1/g' find . $ -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" -o -name ".s" $ -type f -print0 \| xargs -0 sed -i '' -E 's/([^ ]+)<imp-def[ ],[ ]dead>/implicit-def dead \1/g' find . $ -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" -o -name ".s" $ -type f -print0 \| xargs -0 sed -i '' -E 's/([^ ]+)<imp-def>/implicit-def \1/g' find . $ -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" -o -name ".s" $ -type f -print0 \| xargs -0 sed -i '' -E 's/([^ ]+)<imp-use>/implicit \1/g' find . $ -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" -o -name ".s" $ -type f -print0 \| xargs -0 sed -i '' -E 's/([^ ]+)<internal>/internal \1/g' find . $ -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" -o -name "*.s" $ -type f -print0 \| xargs -0 sed -i '' -E 's/([^ ]+)<undef>/undef \1/g' git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320022 91177308-0d34-0410-b5e6-96231b3b80d8	2017-12-07 10:40:31 +00:00
Zvi Rackover	e67db33ecd	AMDGPU Tests: Change a case to be run with -O0 D40231 requires to run case with -O0 to prevent InstructionSimplify from transforming an extractelement with undef index. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319907 91177308-0d34-0410-b5e6-96231b3b80d8	2017-12-06 17:40:09 +00:00
Matt Arsenault	dc79c646e7	AMDGPU: Fix SDWA crash on inline asm This was only searching for explicit defs, and asserting for any implicit or variadic instruction defs, like inline asm. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319826 91177308-0d34-0410-b5e6-96231b3b80d8	2017-12-05 20:32:01 +00:00
Matt Arsenault	30bcf2f789	AMDGPU: Fix infinite loop with dbg_value Surprisingly SIOptimizeExecMaskingPreRA can infinite loop in some case with DBG_VALUE. Most tests using dbg_value are run at -O0, so don't run this pass. This seems to only happen when the value argument is undef. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319808 91177308-0d34-0410-b5e6-96231b3b80d8	2017-12-05 18:23:17 +00:00
Matt Arsenault	a50e269370	AMDGPU: Fix crash when scheduling DBG_VALUE This calls handleMove with a DBG_VALUE instruction, which isn't tracked by LiveIntervals. I'm not sure this is the correct place to fix this. The generic scheduler seems to have more deliberate region selection that skips dbg_value. The test is also really hard to reduce. I haven't been able to figure out what exactly causes this particular case to try moving the dbg_value. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319732 91177308-0d34-0410-b5e6-96231b3b80d8	2017-12-05 03:09:23 +00:00
Jan Vesely	f68b9beeb9	AMDGPU/EG: Add a new FeatureFMA and use it to selectively enable FMA instruction Only used by pre-GCN targets v2: fix predicate setting for FMA_Common Differential Revision: https://reviews.llvm.org/D40692 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319712 91177308-0d34-0410-b5e6-96231b3b80d8	2017-12-04 23:07:28 +00:00
Jan Vesely	85c02734d1	AMDGPU: Disable fp64 support on pre GCN asics It's not implemented. Passing +fp64-fp16-denormal feature enables fp64 even on asics that don't support it v2: fix hasFP64 query Differential Revision: https://reviews.llvm.org/D39931 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319709 91177308-0d34-0410-b5e6-96231b3b80d8	2017-12-04 22:57:29 +00:00
Matt Arsenault	79f2fee592	AMDGPU: Fix creating invalid copy when adjusting dmask Move the entire optimization to one place. Before it was possible to adjust dmask without changing the register class of the output instruction, since they were done in separate places. Fix all lane sizes and move all of the optimization into the DAG folding. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319705 91177308-0d34-0410-b5e6-96231b3b80d8	2017-12-04 22:18:27 +00:00
Francis Visoiu Mistrih	ca0df55065	[CodeGen] Unify MBB reference format in both MIR and debug output As part of the unification of the debug format and the MIR format, print MBB references as '%bb.5'. The MIR printer prints the IR name of a MBB only for block definitions. * find . $ -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" $ -type f -print0 \| xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)->getNumber/" << printMBBReference(\1)/g' find . $ -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" $ -type f -print0 \| xargs -0 sed -i '' -E 's/BB#" << ([a-zA-Z0-9_]+)\.getNumber/" << printMBBReference(\1)/g' * find . $ -name ".txt" -o -name ".s" -o -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" $ -type f -print0 \| xargs -0 sed -i '' -E 's/BB#([0-9]+)/%bb.\1/g' * grep -nr 'BB#' and fix Differential Revision: https://reviews.llvm.org/D40422 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319665 91177308-0d34-0410-b5e6-96231b3b80d8	2017-12-04 17:18:51 +00:00
Sam Kolton	cf56d5d898	[AMDGPU] SDWA: add support for PRESERVE into SDWA peephole. Summary: Reviewers: arsenm, vpykhtin, rampitec Subscribers: kzhuravl, wdng, nhaehnle, mgorny, yaxunl, dstuttard, tpr, t-tye Differential Revision: https://reviews.llvm.org/D37817 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319662 91177308-0d34-0410-b5e6-96231b3b80d8	2017-12-04 16:22:32 +00:00
Yaxun Liu	8b47f7bedd	CodeGen: Fix SelectionDAGISel::LowerArguments for sret addr space SelectionDAGISel::LowerArguments assumes sret addr space is 0, which is not true for amdgcn---amdgiz target. This patch fixes that. Differential Revision: https://reviews.llvm.org/D40255 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319630 91177308-0d34-0410-b5e6-96231b3b80d8	2017-12-03 03:31:45 +00:00
Yaxun Liu	dcc00b1fac	CodeGen: Fix pointer info in SplitVecOp_EXTRACT_VECTOR_ELT/SplitVecRes_INSERT_VECTOR_ELT Two issues found when doing codegen for splitting vector with non-zero alloca addr space: DAGTypeLegalizer::SplitVecRes_INSERT_VECTOR_ELT/SplitVecOp_EXTRACT_VECTOR_ELT uses dummy pointer info for creating SDStore. Since one pointer operand contains multiply and add, InferPointerInfo is unable to infer the correct pointer info, which ends up with a dummy pointer info for the target to lower store and results in isel failure. The fix is to introduce MachinePointerInfo::getUnknownStack to represent MachinePointerInfo which is known in alloca address space but without other information. TargetLowering::getVectorElementPointer uses value type of pointer in addr space 0 for multiplication of index and then add it to the pointer. However the pointer may be in an addr space which has different size than addr space 0. The fix is to use the pointer value type for index multiplication. Differential Revision: https://reviews.llvm.org/D39758 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319622 91177308-0d34-0410-b5e6-96231b3b80d8	2017-12-02 22:13:22 +00:00
Alexander Timofeev	b0efc4fd66	[AMDGPU] SiFixSGPRCopies should not modify non-divergent PHI Differential revision: https://reviews.llvm.org/D40556 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319534 91177308-0d34-0410-b5e6-96231b3b80d8	2017-12-01 11:56:34 +00:00
Matt Arsenault	1182bea00f	AMDGPU: Use carry-less adds in FI elimination git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319501 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-30 23:42:30 +00:00
Matt Arsenault	421983a9de	AMDGPU: Use gfx9 carry-less add/sub instructions git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319491 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-30 22:51:26 +00:00
Francis Visoiu Mistrih	e6b89910eb	[CodeGen] Always use `printReg` to print registers in both MIR and debug output As part of the unification of the debug format and the MIR format, always use `printReg` to print all kinds of registers. Updated the tests using '_' instead of '%noreg' until we decide which one we want to be the default one. Differential Revision: https://reviews.llvm.org/D40421 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319445 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-30 16:12:24 +00:00
Francis Visoiu Mistrih	7384652668	[CodeGen] Print "%vreg0" as "%0" in both MIR and debug output As part of the unification of the debug format and the MIR format, avoid printing "vreg" for virtual registers (which is one of the current MIR possibilities). Basically: * find . $ -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" $ -type f -print0 \| xargs -0 sed -i '' -E "s/%vreg([0-9]+)/%\1/g" * grep -nr '%vreg' . and fix if needed * find . $ -name ".mir" -o -name ".cpp" -o -name ".h" -o -name ".ll" $ -type f -print0 \| xargs -0 sed -i '' -E "s/ vreg([0-9]+)/ %\1/g" * grep -nr 'vreg[0-9]\+' . and fix if needed Differential Revision: https://reviews.llvm.org/D40420 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319427 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-30 12:12:19 +00:00
Matt Arsenault	1de313a465	AMDGPU: Allow negative MUBUF vaddr for gfx9 GFX9 does not enable bounds checking for the resource descriptors used for private access, so it should be OK to use vaddr with a potentially negative value. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319393 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-30 00:52:40 +00:00
Matt Arsenault	c7ecbb28c3	AMDGPU: Use stricter regexes for add instructions Match the entire _co as one optional piece rather than a set of characters to match multiple times. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319275 91177308-0d34-0410-b5e6-96231b3b80d8	2017-11-29 02:25:14 +00:00

1 2 3 4 5 ...

1381 Commits