archived-llvm

mirror of https://github.com/RPCSX/llvm.git synced 2026-01-31 01:05:23 +01:00

Author	SHA1	Message	Date
Matt Arsenault	e52dfc95ef	AMDGPU: Move cndmask pseudo to be isel pseudo There's only one use of this for the convenience of a pattern. I think v_mov_b64_pseudo should also be moved, but SIFoldOperands does currently make use of it. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279901 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-27 01:00:37 +00:00
Tom Stellard	c6ee33f19c	AMDGPU/SI: Canonicalize offset order for merged DS instructions Summary: If the scheduler clusters the loads, then the offsets will be sorted, but it is possible for the scheduler to scheduler loads together without out explicitly clustering them, which would give us non-sorted offsets. Also, we will want to do this if we move the load/store optimizer before the scheduler. Reviewers: arsenm Subscribers: arsenm, llvm-commits, kzhuravl Differential Revision: https://reviews.llvm.org/D23776 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279870 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-26 21:36:47 +00:00
Matt Arsenault	f9a7ed710d	Replace subregister uses when processing tied operands This was for some reason skipping operands that are subregisters instead of keeping the same subregister index. v_movreld_b32 expects src0 to be the subregister of the tied super register use/def. e.g. v_movreld_b32 v0, v9, <imp-def, tied3> v[0:3], <imp-use, tied2> v[0:3] was being replaced with v[4:7] = copy v[0:3] v_movreld_b32 v0, v9, <imp-def, tied3> v[4:7], <imp-use, tied2> v[4:7], which really writes to v[0:3] git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279804 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-26 06:31:32 +00:00
Changpeng Fang	fb717b4b98	AMDGCN/SI: Implement readlane/readfirstlane intrinsics Summary: This patch implements readlane/readfirstlane intrinsics. TODO: need to define a new register class to consider the case that the source could be a vector register or M0. Reviewed by: arsenm and tstellarAMD Differential Revision: http://reviews.llvm.org/D22489 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279660 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-24 20:35:23 +00:00
Wei Ding	25d826e483	AMDGPU : Add V_SAD_U32 instruction pattern. Differential Revision: http://reviews.llvm.org/D23069 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279629 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-24 14:59:47 +00:00
Krzysztof Parzyszek	31a5f885bf	Create subranges for new intervals resulting from live interval splitting The register allocator can split a live interval of a register into a set of smaller intervals. After the allocation of registers is complete, the rewriter will modify the IR to replace virtual registers with the corres- ponding physical registers. At this stage, if a register corresponding to a subregister of a virtual register is used, the rewriter will check if that subregister is undefined, and if so, it will add the <undef> flag to the machine operand. The function verifying liveness of the subregis- ter would assume that it is undefined, unless any of the subranges of the live interval proves otherwise. The problem is that the live intervals created during splitting do not have any subranges, even if the original parent interval did. This could result in the <undef> flag placed on a register that is actually defined. Differential Revision: http://reviews.llvm.org/D21189 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279625 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-24 13:37:55 +00:00
Matthias Braun	66489736bf	MIRParser/MIRPrinter: Compute isSSA instead of printing/parsing it. Specifying isSSA is an extra line at best and results in invalid MI at worst. Compute the value instead. Differential Revision: http://reviews.llvm.org/D22722 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279600 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-24 01:32:41 +00:00
Matt Arsenault	7517ed227a	AMDGPU: Split SILowerControlFlow into two pieces Do most of the lowering in a pre-RA pass. Keep the skip jump insertion late, plus a few other things that require more work to move out. One concern I have is now there may be COPY instructions which do not have the necessary implicit exec uses if they will be lowered to v_mov_b32. This has a positive effect on SGPR usage in shader-db. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279464 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-22 19:33:16 +00:00
Matthias Braun	0e6fefdf1f	Revert "RegScavenging: Add scavengeRegisterBackwards()" The ppc64 multistage bot fails on this. This reverts commit r279124. Also Revert "CodeGen: Add/Factor out LiveRegUnits class; NFCI" because it depends on the previous change This reverts commit r279171. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279199 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-19 03:03:24 +00:00
Tom Stellard	2b3323e046	AMDGPU/SI: Fix a test in wqm.ll to always use s_cbranch_vcc* Summary: We need to use floating-point compares to ensure that s_cbranch_vcc* instructions are always generated. With integer compares, future optimizations could cause s_cbranch_scc* to be generated instead. Reviewers: arsenm, nhaehnle Subscribers: llvm-commits, kzhuravl Differential Revision: https://reviews.llvm.org/D23401 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279148 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-18 21:21:53 +00:00
Wei Ding	443f72b62d	AMDGPU : Fix QSAD and MQSAD instructions' incorrect data type. Differential Revision: http://reviews.llvm.org/D23689 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279126 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-18 19:51:14 +00:00
Matthias Braun	f65766df39	RegScavenging: Add scavengeRegisterBackwards() Re-apply r276044 with off-by-1 instruction fix for the reload placement. This is a variant of scavengeRegister() that works for enterBasicBlockEnd()/backward(). The benefit of the backward mode is that it is not affected by incomplete kill flags. This patch also changes PrologEpilogInserter::doScavengeFrameVirtualRegs() to use the register scavenger in backwards mode. Differential Revision: http://reviews.llvm.org/D21885 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279124 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-18 19:47:59 +00:00
Valery Pykhtin	bc2ba716f4	[AMDGPU] add s_incperflevel/s_decperflevel intrinsics. Differential revision: https://reviews.llvm.org/D23666 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@279106 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-18 18:06:20 +00:00
Jan Vesely	fc94e66b10	AMDGPU/R600: Convert buffer id to VTX_READ input Use patterns instead of multiple instructions Add buffer id to asm string https://reviews.llvm.org/D22650 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278749 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-15 21:38:30 +00:00
Matt Arsenault	8f1b18be38	AMDGPU: Don't fold subregister extracts into tied operands git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278676 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-15 16:18:36 +00:00
James Molloy	bd7c3fb3bf	[LSR] Don't try and create post-inc expressions on non-rotated loops If a loop is not rotated (for example when optimizing for size), the latch is not the backedge. If we promote an expression to post-inc form, we not only increase register pressure and add a COPY for that IV expression but for all IVs! Motivating testcase: void f(float a, float b, float c, int n) { while (n-- > 0) c++ = a++ + b++; } It's imperative that the pointer increments be located in the latch block and not the header block; if not, we cannot use post-increment loads and stores and we have to keep both the post-inc and pre-inc values around until the end of the latch which bloats register usage. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278658 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-15 07:53:03 +00:00
Mehdi Amini	804c815e77	Revert "Revert "Invariant start/end intrinsics overloaded for address space"" This reverts commit 32fc6488e48eafc0ca1bac1bd9cbf0008224d530. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278609 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-13 23:31:24 +00:00
Mehdi Amini	2eb84a568a	Revert "Invariant start/end intrinsics overloaded for address space" This reverts commit r276447. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278608 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-13 23:27:32 +00:00
Matt Arsenault	b24aaff187	AMDGPU: Fix missing test for addressing mode with odd offsets Add test if the constant offset looks unaligned. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278589 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-13 01:43:51 +00:00
Wei Ding	2daf966c6f	AMDGPU : Add intrinsic for instruction v_cvt_pk_u8_f32 Differential Revision: http://reviews.llvm.org/D23336 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278403 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-11 20:34:48 +00:00
Matt Arsenault	d751c97ce5	AMDGPU: Fix crashes on memory functions git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278369 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-11 17:31:42 +00:00
Wei Ding	523717c1fe	AMDGPU : Fix SAD related instruction LIT tests function atttibute issues. Differential Revision: http://reviews.llvm.org/D23133 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278360 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-11 17:14:17 +00:00
Wei Ding	9bcebab62b	AMDGPU : Add LLVM intrinsics for SAD related instructions. Differential Revision: http://reviews.llvm.org/D23133 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278354 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-11 16:33:53 +00:00
Changpeng Fang	4a0373d17a	AMDGPU/SI: Implement amdgcn image intrinsics with sampler Summary: This patch define and implement amdgcn image intrinsics with sampler. 1. define vdata type to be llvm_anyfloat_ty, address type to be llvm_anyfloat_ty, and rsrc type to be llvm_anyint_ty. As a result, we expect the intrinsics name to have three suffixes to overload each of these three types; 2. D128 as well as two other flags are implied in the three types, for example, if you use v8i32 as resource type, then r128 is 0! 3. don't expose TFE flag, and other flags are exposed in the instruction order: unrm, glc, slc, lwe and da. Differential Revision: http://reviews.llvm.org/D22838 Reviewed by: arsenm and tstellarAMD git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278291 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-10 21:15:30 +00:00
Matt Arsenault	34c6b123f7	AMDGPU: Change insertion point of si_mask_branch Insert before the skip branch if one is created. This is a somewhat more natural placement relative to the skip branches, and makes it possible to implement analyzeBranch for skip blocks. The test changes are mostly due to a quirk where the block label is not emitted if there is a terminator that is not also a branch. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278273 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-10 19:11:42 +00:00
Nicolai Haehnle	be7124c9bf	LiveIntervalAnalysis: fix a crash in repairOldRegInRange Summary: See the new test case for one that was (non-deterministically) crashing on trunk and deterministically hit the assertion that I added in D23302. Basically, the machine function contains a sequence DS_WRITE_B32 %vreg4, %vreg14:sub0, ... DS_WRITE_B32 %vreg4, %vreg14:sub0, ... %vreg14:sub1<def> = COPY %vreg14:sub0 and SILoadStoreOptimizer::mergeWrite2Pair merges the two DS_WRITE_B32 instructions into one before calling repairIntervalsInRange. Now repairIntervalsInRange wants to repair %vreg14, in particular, and ends up trying to repair %vreg14:sub1 as well, but that only becomes active _after_ the range that is to be repaired, hence the crash due to LR.find(...) == LR.begin() at the start of repairOldRegInRange. I believe that just skipping those subrange is fine, but again, not too familiar with that code. Reviewers: MatzeB, kparzysz, tstellarAMD Subscribers: llvm-commits, MatzeB Differential Revision: https://reviews.llvm.org/D23303 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278268 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-10 18:51:14 +00:00
Marek Olsak	716b378a48	AMDGPU/SI: Increase SGPR limit to 96 on Tonga/Iceland Summary: This is the setting of the Vulkan closed source driver. It decreases the max wave count from 10 to 8. 26010 shaders in 14650 tests Totals: VGPRS: 829593 -> 808440 (-2.55 %) Spilled SGPRs: 81878 -> 42226 (-48.43 %) Spilled VGPRs: 367 -> 358 (-2.45 %) Scratch VGPRs: 1764 -> 1748 (-0.91 %) dwords per thread Code Size: 36677864 -> 35923932 (-2.06 %) bytes There is a massive decrease in SGPR spilling in general and -7.4% spilled VGPRs for DiRT Showdown (= SGPRs spilled to scratch?) Reviewers: arsenm, tstellarAMD, nhaehnle Subscribers: arsenm, llvm-commits, kzhuravl Differential Revision: https://reviews.llvm.org/D23034 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277867 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-05 21:23:29 +00:00
Yaxun Liu	77990e6e79	[OpenCL] Add missing tests for getOCLTypeName Adding missing tests for OCL type names for half, float, double, char, short, long, and unknown. Patch by Aaron En Ye Shi. Differential Revision: https://reviews.llvm.org/D22964 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277759 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-04 19:45:00 +00:00
Matt Arsenault	579ced7d96	AMDGPU: Fix a slow test by using basic regalloc This just tests that the register limit isn't exceeded, so the regisetr allocation doesn't need to be great.' The critically slow part is all in greedy RA, so switch to basic. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277700 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-04 07:04:54 +00:00
Matthias Braun	8669eef6f1	RenameIndependentSubregs: Fix liveness query in rewriteOperands() rewriteOperands() always performed liveness queries at the base index rather than the RegSlot/Base as apropriate for the machine operand. This could lead to illegal rewriting in some cases. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277661 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-03 22:37:47 +00:00
Matt Arsenault	94166e75ac	AMDGPU: fdiv -1, x -> rcp -x git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277535 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-02 22:25:04 +00:00
Nicolai Haehnle	87d298325f	AMDGPU: Stay in WQM for non-intrinsic stores Summary: Two types of stores are possible in pixel shaders: stores to memory that are explicitly requested at the API level, and stores that are an implementation detail of register spilling or lowering of arrays. For the first kind of store, we must ensure that helper pixels have no effect and hence WQM must be disabled. The second kind of store must always be executed, because the written value may be loaded again in a way that is relevant for helper pixels as well -- and there are no externally visible effects anyway. This is a candidate for the 3.9 release branch. Reviewers: arsenm, tstellarAMD, mareko Subscribers: arsenm, kzhuravl, llvm-commits Differential Revision: https://reviews.llvm.org/D22675 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277504 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-02 19:31:14 +00:00
Nicolai Haehnle	a8461187eb	AMDGPU: Track physical registers in SIWholeQuadMode Summary: There are cases where uniform branch conditions are computed in VGPRs, and we didn't correctly mark those as WQM. The stray change in basic-branch.ll is because invoking the LiveIntervals analysis leads to the detection of a dead register that would otherwise not be seen at -O0. This is a candidate for the 3.9 branch, as it fixes a possible hang. Reviewers: arsenm, tstellarAMD, mareko Subscribers: arsenm, llvm-commits, kzhuravl Differential Revision: https://reviews.llvm.org/D22673 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277500 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-02 19:17:37 +00:00
Matt Arsenault	4fd45ebabd	AMDGPU: Fix shouldConvertConstantLoadToIntImm behavior This should really be true for any immediate, not just inline ones. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277260 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-30 01:40:36 +00:00
Changpeng Fang	539fec5dc2	AMDGPU/SI: Don't handle a loop if there is no loop at all for a terminator BB. Differential Revision: http://reviews.llvm.org/D22021 Reviewed by: arsenm git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277073 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-28 23:01:45 +00:00
Wei Ding	ee8c4ca1e1	AMDGPU : Add intrinsics for compare with the full wavefront result Differential Revision: http://reviews.llvm.org/D22482 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276998 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-28 16:42:13 +00:00
Nicolai Haehnle	b18ca96c79	AMDGPU: add execfix flag to SI_ELSE Summary: SI_ELSE is lowered into two parts: s_or_saveexec_b64 dst, src (at the start of the basic block) s_xor_b64 exec, exec, dst (at the end of the basic block) The idea is that dst contains the exec mask of the preceding IF block. It can happen that SIWholeQuadMode decides to switch from WQM to Exact mode inside the basic block that contains SI_ELSE, in which case it introduces an instruction s_and_b64 exec, exec, s[...] which masks out bits that can correspond to both the IF and the ELSE paths. So the resulting sequence must be: s_or_savexec_b64 dst, src s_and_b64 exec, exec, s[...] <-- added by SIWholeQuadMode s_and_b64 dst, dst, exec <-- added by SILowerControlFlow s_xor_b64 exec, exec, dst Whether to add the additional s_and_b64 dst, dst, exec is currently determined via the ExecModified tracking. With this change, it is instead determined by an additional flag on SI_ELSE which is set by SIWholeQuadMode. Finally: It also occured to me that an alternative approach for the long run is for SILowerControlFlow to unconditionally emit s_or_saveexec_b64 dst, src ... s_and_b64 dst, dst, exec s_xor_b64 exec, exec, dst and have a pass that detects and cleans up the "redundant AND with exec" pattern where possible. This could be useful anyway, because we also add instructions s_and_b64 vcc, exec, vcc before s_cbranch_scc (in moveToALU), and those are often redundant. I have some pending changes to how KILL is lowered that could also benefit from such a cleanup pass. In any case, this current patch could help in the short term with the whole ExecModified business. Reviewers: tstellarAMD, arsenm Subscribers: arsenm, llvm-commits, kzhuravl Differential Revision: https://reviews.llvm.org/D22846 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276972 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-28 11:39:24 +00:00
Matt Arsenault	f799c706db	AMDGPU: Use rcp for fdiv 1, x with fpmath metadata Using rcp should be OK for safe math usually, so this should not be replacing the original fdiv. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276823 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-26 23:25:44 +00:00
Matt Arsenault	c43677a11d	AMDGPU: Add more tests for LDS size with occupancy git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276821 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-26 23:15:59 +00:00
Matthias Braun	ad0f5f6b52	MIRParser: Use dot instead of colon to mark subregisters Change the syntax to use `%0.sub8` to denote a subregister. This seems like a more natural fit to denote subregisters; I also plan to introduce a new ":classname" syntax in upcoming patches to denote the register class of a vreg. Note that this commit disallows plain identifiers to start with a '.' character. This shouldn't affect anything as external names/IR references are all prefixed with '$'/'%', plain identifiers are only used for instruction names, register mask names and subreg indexes. Differential Revision: https://reviews.llvm.org/D22390 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276815 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-26 21:49:34 +00:00
Tim Northover	d96170e773	GlobalISel: omit braces on MachineInstr types when there's only one. Tidies up the representation a bit in the common case. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276772 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-26 17:28:01 +00:00
Matt Arsenault	cc67a0a36a	AMDGPU: Add missing tests for xnack option for HSA git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276765 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-26 16:45:50 +00:00
Matt Arsenault	ee4cdb7b75	AMDGPU: Add fp legacy instruction intrinsics This could use some additional optimization work to use mad/mac legacy. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276764 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-26 16:45:45 +00:00
Jan Vesely	4a44da0c82	AMDGPU: Remove read_workdim intrinsic Differential revision: https://reviews.llvm.org/D22732 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276682 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-25 20:17:02 +00:00
Matt Arsenault	9b4a967989	AMDGPU: Fix missing verify-machineinstrs in control flow test git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276679 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-25 19:39:06 +00:00
Tom Stellard	a6b9e20623	Revert "[AMDGPU] Emit read-only data to .rodata for hsa" This reverts commit r276298. Data stored in .rodata can have a negative offset from .text, but we don't support negative values in relocations yet. This caused a regression in one of the amp conformance tests: 5_Data_Cont/5_2_a_v/5_2_3_m/Assignment/Test.02.01 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276498 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-22 23:46:40 +00:00
Tim Northover	3921674c30	GlobalISel: allow multiple types on MachineInstrs. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276481 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-22 22:13:36 +00:00
Anna Thomas	80ee170cb3	Invariant start/end intrinsics overloaded for address space Summary: The llvm.invariant.start and llvm.invariant.end intrinsics currently support specifying invariant memory objects only in the default address space. With this change, these intrinsics are overloaded for any adddress space for memory objects and we can use these llvm invariant intrinsics in non-default address spaces. Example: llvm.invariant.start.p1i8(i64 4, i8 addrspace(1)* %ptr) This overloaded intrinsic is needed for representing final or invariant memory in managed languages. Reviewers: apilipenko, reames Subscribers: llvm-commits git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276447 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-22 17:49:40 +00:00
Matt Arsenault	c5a5706d17	AMDGPU: Remove redundant test git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276439 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-22 17:01:36 +00:00
Matt Arsenault	9da217ee1e	AMDGPU: Fix groupstaticsize for large LDS The size can exceed s_movk_i32's limit, and we don't want to use it this early since it inhibits optimizations. This should probably be merged to the release branch. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276438 91177308-0d34-0410-b5e6-96231b3b80d8	2016-07-22 17:01:33 +00:00

1 2 3 4 5 ...

542 Commits