RPCSX/llvm - llvm - Gitea: Git with a cup of tea

RPCSX/llvm

mirror of https://github.com/RPCSX/llvm.git synced 2025-01-26 14:25:18 +00:00

Author	SHA1	Message	Date
Matt Arsenault	f51a2196a5	DAGCombiner: Relax sqrt NaN folding check This is OK for +0 since compares to +/-0 give the same result. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262125 91177308-0d34-0410-b5e6-96231b3b80d8	2016-02-27 09:38:05 +00:00
Matt Arsenault	788be52946	AMDGPU: Add s_sleep intrinsic git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262120 91177308-0d34-0410-b5e6-96231b3b80d8	2016-02-27 08:53:52 +00:00
Matt Arsenault	a164276e20	AMDGPU: Implement readcyclecounter This matches the behavior of the HSAIL clock instruction. s_realmemtime is used if the subtarget supports it, and falls back to s_memtime if not. Also introduces new intrinsics for each of s_memtime / s_memrealtime. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262119 91177308-0d34-0410-b5e6-96231b3b80d8	2016-02-27 08:53:46 +00:00
Nikolay Haustov	1c038cf2fa	[AMDGPU] Assembler: Basic support for MIMG Add parsing and printing of image operands. Matches legacy sp3 assembler. Change image instruction order to have data/image/sampler operands in the beginning. This is needed because optional operands in MC are always last. Update SITargetLowering for new order. Add basic MC test. Update CodeGen tests. Review: http://reviews.llvm.org/D17574 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@261995 91177308-0d34-0410-b5e6-96231b3b80d8	2016-02-26 09:51:05 +00:00
Matthias Braun	e0761c4899	MachineCopyPropagation: Catch copies of the form A<-B;A<-B Differential Revision: http://reviews.llvm.org/D17475 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@261966 91177308-0d34-0410-b5e6-96231b3b80d8	2016-02-26 03:18:55 +00:00
Matt Arsenault	4a5938727a	AMDGPU: Add failing testcase for register coalescer git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@261592 91177308-0d34-0410-b5e6-96231b3b80d8	2016-02-22 23:45:42 +00:00
Matt Arsenault	e518fdc6db	AMDGPU: Fix alignments in test I don't think this test was intending to test unaligned load/store. Change it to use the natural alignment to avoid regressing. Also adds missing SI checks. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@261571 91177308-0d34-0410-b5e6-96231b3b80d8	2016-02-22 21:04:23 +00:00
Matt Arsenault	2f65ff664c	AMDGPU/R600: Implement allowsMisalignedMemoryAccess This avoids some test regressions in a future commit when unaligned operations are expanded when they have custom lowering. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@261570 91177308-0d34-0410-b5e6-96231b3b80d8	2016-02-22 21:04:16 +00:00
Tom Stellard	090553bea6	AMDGPU/SI: Use v_readfirstlane to legalize SMRD with VGPR base pointer Summary: Instead of trying to replace SMRD instructions with a VGPR base pointer with an equivalent MUBUF instruction, we now copy the base pointer to SGPRs using v_readfirstlane. This is safe to do, because any load selected as an SMRD instruction has been proven to have a uniform base pointer, so each thread in the wave will have the same pointer value in VGPRs. This will fix some errors on VI from trying to replace SMRD instructions with addr64-enabled MUBUF instructions that don't exist. Reviewers: arsenm, cfang, nhaehnle Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D17305 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@261385 91177308-0d34-0410-b5e6-96231b3b80d8	2016-02-20 00:37:25 +00:00
Tom Stellard	aced110517	AMDGPU/SI: Fix s_waitcnt insertion for flat instructions Summary: This was broken in r260694 which swapped the address and data operands for flat store instructions. The code in SIInsertWaits assumes that the data operand always comes before the address operand, so we need to add a special case for flat. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D17366 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@261330 91177308-0d34-0410-b5e6-96231b3b80d8	2016-02-19 15:33:13 +00:00
Nicolai Haehnle	c93ef07817	AMDGPU/SI: add llvm.amdgcn.image.load/store[.mip] intrinsics Summary: These correspond to IMAGE_LOAD/STORE[_MIP] and are going to be used by Mesa for the GL_ARB_shader_image_load_store extension. IMAGE_LOAD is already matched by llvm.SI.image.load. That intrinsic has a legacy name and pretends not to read memory. Differential Revision: http://reviews.llvm.org/D17276 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@261224 91177308-0d34-0410-b5e6-96231b3b80d8	2016-02-18 16:44:18 +00:00
Matt Arsenault	626ceb277f	AMDGPU: Prepare for reducing private element size. Tests for the new scalarize all private access options will be included with a future commit. The only functional change is to make the split/scalarize behavior for private access of > 4 element vectors to be consistent with the flat/global handling. This makes the spilling worse in the two changed tests. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260804 91177308-0d34-0410-b5e6-96231b3b80d8	2016-02-13 04:18:53 +00:00
Tom Stellard	224ee47ca1	AMDGPU/SI: Add llvm.amdgcn.mov.dpp intrinsic This intrinsic will be used to expose dpp functionality to higher-level languages. It will map to the dpp version of v_mov_b32. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260792 91177308-0d34-0410-b5e6-96231b3b80d8	2016-02-13 02:09:49 +00:00
Matt Arsenault	53ea122b3d	AMDGPU: Add intrinsics for sin/cos These provide direct access to the hardware instruction without the unit version required like llvm.sin/llvm.cos lowering requires. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260782 91177308-0d34-0410-b5e6-96231b3b80d8	2016-02-13 01:19:56 +00:00
Matt Arsenault	a4c1dc826a	AMDGPU: Rename intrinsic to better match instruction name Also fixes missing f32 test. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260780 91177308-0d34-0410-b5e6-96231b3b80d8	2016-02-13 01:03:00 +00:00
Tom Stellard	98ef447825	AMDGPU/SI: Detect uniform branches and emit s_cbranch instructions Reviewers: arsenm Subscribers: mareko, MatzeB, qcolombet, arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D16603 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260765 91177308-0d34-0410-b5e6-96231b3b80d8	2016-02-12 23:45:29 +00:00
Tom Stellard	abf168408a	[AMDGPU] Assembler: Swap operands of flat_store instructions to match AMD assembler Historically, AMD internal sp3 assembler has flat_store* addr, data format. To match existing code and to enable reuse, change LLVM definitions to match. Also update MC and CodeGen tests. Differential Revision: http://reviews.llvm.org/D16927 Patch by: Nikolay Haustov git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260694 91177308-0d34-0410-b5e6-96231b3b80d8	2016-02-12 17:57:54 +00:00
Changpeng Fang	3a0161ac77	AMDGPU/SI: Annotate Loops with Constant Condition in SIAnnotateControlFlow pass. Summary: It is possible that the loop condition can be a boolean constant (infinite loop, for example). So we sould handle constant condition in annotating a loop. This patch adds this functionality to support annotating constant condition. Reviewers: tstellarAMD, arsenm Subscribers: llvm-commits, arsenm Differential Revision: http://reviews.llvm.org/D15093 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260692 91177308-0d34-0410-b5e6-96231b3b80d8	2016-02-12 17:11:04 +00:00
Matt Arsenault	c3aa2775c2	AMDGPU: Set flat_scratch from flat_scratch_init reg This was hardcoded to the static private size, but this would be missing the offset and additional size for someday when we have dynamic sizing. Also stops always initializing flat_scratch even when unused. In the future we should stop emitting this unless flat instructions are used to access private memory. For example this will initialize it almost always on VI because flat is used for global access. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260658 91177308-0d34-0410-b5e6-96231b3b80d8	2016-02-12 06:31:30 +00:00
Matt Arsenault	e3601c75c9	AMDGPU: Set element_size in private resource descriptor Introduce a subtarget feature for this, and leave the default with the current behavior which assumes up to 16-byte loads/stores can be used. The field also seems to have the ability to be set to 2 bytes, but I'm not sure what that would be used for. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260651 91177308-0d34-0410-b5e6-96231b3b80d8	2016-02-12 02:40:47 +00:00
Nicolai Haehnle	e22674efca	AMDGPU: Quick fix for extreme slowness in spill-scavenge-offset.ll test Summary: Also, some cosmetic fixes. Reviewers: arsenm, tstellarAMD Subscribers: qcolombet, llvm-commits Differential Revision: http://reviews.llvm.org/D17161 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260625 91177308-0d34-0410-b5e6-96231b3b80d8	2016-02-12 00:05:34 +00:00
Tom Stellard	dfa41e52a9	AMDGPU/SI: Make sure MIMG descriptors and samplers stay in SGPRs Summary: It's possible to have resource descriptors and samplers stored in VGPRs, either by a VMEM instruction or in the case of samplers, floating-point calculations. When this happens, we need to use v_readfirstlane to copy these values back to sgprs. Reviewers: mareko, arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D17102 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260599 91177308-0d34-0410-b5e6-96231b3b80d8	2016-02-11 21:45:07 +00:00
Matt Arsenault	d581c66591	AMDGPU: Fix constant bus use check with subregisters If the two operands to an instruction were both subregisters of the same super register, it would incorrectly think this counted as the same constant bus use. This fixes the verifier error in fmin_legacy.ll which was missing -verify-machineinstrs. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260495 91177308-0d34-0410-b5e6-96231b3b80d8	2016-02-11 06:15:39 +00:00
Matt Arsenault	fae18e933b	AMDGPU: Remove some old intrinsic uses from tests git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260493 91177308-0d34-0410-b5e6-96231b3b80d8	2016-02-11 06:02:01 +00:00
Nicolai Haehnle	ac2300f5ed	AMDGPU: Release the scavenged offset register during VGPR spill Summary: This fixes a crash where subsequent spills would be unable to scavenge a register. In particular, it fixes a crash in piglit's spec@glsl-1.50@execution@geometry@max-input-components (the test still has a shader that fails to compile because of too many SGPR spills, but at least it doesn't crash any more). This is a candidate for the release branch. Reviewers: arsenm, tstellarAMD Subscribers: qcolombet, arsenm Differential Revision: http://reviews.llvm.org/D16558 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260427 91177308-0d34-0410-b5e6-96231b3b80d8	2016-02-10 20:13:58 +00:00
Matt Arsenault	60a32b5936	AMDGPU: Remove bfi and bfm intrinsics Nothing is using them. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260123 91177308-0d34-0410-b5e6-96231b3b80d8	2016-02-08 19:06:01 +00:00
Matt Arsenault	98d69cc318	SelectionDAG: Lower some range metadata to AssertZext If a range has a lower bound of 0, add an AssertZext from the nearest floor power of two. This allows operations with some workitem intrinsics with known maximum ranges to use fast 24-bit multiplies. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260109 91177308-0d34-0410-b5e6-96231b3b80d8	2016-02-08 16:28:19 +00:00
Matt Arsenault	c0410c6002	AMDGPU: Account for LDS alignment The current situation isn't great, because the amount of padding requires is determined by the inverse order of the first encountered use. We should eventually somehow sort these to minimize wasted space. Another problem is the alignment of kernel arguments isn't respected. The group_segment_alignment is always emitted as the default 16, and typed arguments with higher alignments or an explicitly set alignment are also ignored. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259912 91177308-0d34-0410-b5e6-96231b3b80d8	2016-02-05 19:47:29 +00:00
Matt Arsenault	d1d0a1a39d	AMDGPU: Preserve alignments on new created globals Also switch to internal linkage, and include the name of the function in the name. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259911 91177308-0d34-0410-b5e6-96231b3b80d8	2016-02-05 19:47:23 +00:00
Jonas Paulsson	8e9339f574	[ScheduleDAGInstrs::buildSchedGraph()] Handling of memory dependecies rewritten. Recommited, after some fixing with test cases. Updated test cases: test/CodeGen/AArch64/arm64-misched-memdep-bug.ll test/CodeGen/AArch64/tailcall_misched_graph.ll Temporarily disabled test cases: test/CodeGen/AMDGPU/split-vector-memoperand-offsets.ll test/CodeGen/PowerPC/ppc64-fastcc.ll (partially updated) test/CodeGen/PowerPC/vsx-fma-m.ll test/CodeGen/PowerPC/vsx-fma-sp.ll http://reviews.llvm.org/D8705 Reviewers: Hal Finkel, Andy Trick. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259673 91177308-0d34-0410-b5e6-96231b3b80d8	2016-02-03 17:52:29 +00:00
Matt Arsenault	f1f2dd4ca2	AMDGPU: Do not promote allocas with non-inbounds GEPs If we can't assume the pointer value isn't within the bounds of the object, it seems risky to try to replace the pointer calculations. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259573 91177308-0d34-0410-b5e6-96231b3b80d8	2016-02-02 21:16:12 +00:00
Matt Arsenault	551787639e	AMDGPU: Handle promoting memmove Also add missing tests for the others. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259558 91177308-0d34-0410-b5e6-96231b3b80d8	2016-02-02 20:28:10 +00:00
Matt Arsenault	ec856e4504	AMDGPU: Skip promote alloca with no optimizations git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259551 91177308-0d34-0410-b5e6-96231b3b80d8	2016-02-02 19:32:42 +00:00
Matt Arsenault	53b80ebb13	AMDGPU: Whitelist handled intrinsics We shouldn't crash on unhandled intrinsics. Also simplify failure handling in loop. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259546 91177308-0d34-0410-b5e6-96231b3b80d8	2016-02-02 19:18:53 +00:00
Matt Arsenault	374613d697	AMDGPU: Use inbounds when calculating workitem offset When promoting allocas to LDS, we know we are indexing into a specific area just created, and the calculation will also never overflow. Also emit some of the muls as nsw nuw, because instcombine infers this already from the range metadata. I think putting this on the other adds and muls might be OK too, but I'm not 100% sure. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259545 91177308-0d34-0410-b5e6-96231b3b80d8	2016-02-02 19:18:48 +00:00
Oliver Stannard	9ed9eb72f4	Refactor backend diagnostics for unsupported features Re-commit of r258951 after fixing layering violation. The BPF and WebAssembly backends had identical code for emitting errors for unsupported features, and AMDGPU had very similar code. This merges them all into one DiagnosticInfo subclass, that can be used by any backend. There should be minimal functional changes here, but some AMDGPU tests have been updated for the new format of errors (it used a slightly different format to BPF and WebAssembly). The AMDGPU error messages will now benefit from having precise source locations when debug info is available. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259498 91177308-0d34-0410-b5e6-96231b3b80d8	2016-02-02 13:52:43 +00:00
Matt Arsenault	084a86c382	AMDGPU: Fix emitting invalid workitem intrinsics for HSA The AMDGPUPromoteAlloca pass was emitting the read.local.size calls, which with HSA was incorrectly selected to reading from the offset mesa uses off of the kernarg pointer. Error on intrinsics which aren't supported by HSA, and start emitting the correct IR to read the workgroup size out of the dispatch pointer. Also initialize the pass so it can be tested with opt, and start moving towards not depending on the subtarget as an argument. Start emitting errors for the intrinsics not handled with HSA. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259297 91177308-0d34-0410-b5e6-96231b3b80d8	2016-01-30 05:19:45 +00:00
Matt Arsenault	9a30bf46a7	AMDGPU: Stop checking intrinsics not used by HSA for dispatch-ptr Only the dispatch.ptr intrinsic is supposed to be used now to get the workgroup size, and the read.local.size intrinsics do not work correctly. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259296 91177308-0d34-0410-b5e6-96231b3b80d8	2016-01-30 05:10:59 +00:00
Matt Arsenault	27e98d1c21	AMDGPU: Add new amdgcn workitem intrinsics These use the correct prefix and follow the HSA naming convention rather than the config register option names. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259293 91177308-0d34-0410-b5e6-96231b3b80d8	2016-01-30 04:25:19 +00:00
Matt Arsenault	3d679fa973	AMDGPU: Remove 24-bit intrinsics The known bit matching code seems to work reasonably well, so these shouldn't really be needed. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259180 91177308-0d34-0410-b5e6-96231b3b80d8	2016-01-29 10:05:16 +00:00
Matt Arsenault	e26a9f0de4	AMDGPU: Match fmed3 patterns with legacy fmin/fmax git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259090 91177308-0d34-0410-b5e6-96231b3b80d8	2016-01-28 20:53:48 +00:00
Matt Arsenault	6a2bf372b8	AMDGPU: Match some med3 patterns git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259089 91177308-0d34-0410-b5e6-96231b3b80d8	2016-01-28 20:53:42 +00:00
Matt Arsenault	ed6685cf17	AMDGPU: Set DX10Clamp bit git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259088 91177308-0d34-0410-b5e6-96231b3b80d8	2016-01-28 20:53:35 +00:00
Oliver Stannard	b95072ef89	Revert r259035, it introduces a cyclic library dependency git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259045 91177308-0d34-0410-b5e6-96231b3b80d8	2016-01-28 13:19:47 +00:00
Oliver Stannard	ef19a274ad	Add backend dignostic printer for unsupported features Re-commit of r258951 after fixing layering violation. The related LLVM patch adds a backend diagnostic type for reporting unsupported features, this adds a printer for them to clang. In the case where debug location information is not available, I've changed the printer to report the location as the first line of the function, rather than the closing brace, as the latter does not give the user any information. This also affects optimisation remarks. Differential Revision: http://reviews.llvm.org/D16590 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259035 91177308-0d34-0410-b5e6-96231b3b80d8	2016-01-28 10:07:27 +00:00
NAKAMURA Takumi	c1aeea845d	Revert r258951 (and r258950), "Refactor backend diagnostics for unsupported features" It broke layering violation in LLVMIR. clang r258950 "Add backend dignostic printer for unsupported features" llvm r258951 "Refactor backend diagnostics for unsupported features" git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259016 91177308-0d34-0410-b5e6-96231b3b80d8	2016-01-28 04:41:32 +00:00
Oliver Stannard	bf8415a84d	Refactor backend diagnostics for unsupported features The BPF and WebAssembly backends had identical code for emitting errors for unsupported features, and AMDGPU had very similar code. This merges them all into one DiagnosticInfo subclass, that can be used by any backend. There should be minimal functional changes here, but some AMDGPU tests have been updated for the new format of errors (it used a slightly different format to BPF and WebAssembly). The AMDGPU error messages will now benefit from having precise source locations when debug info is available. The implementation of DiagnosticInfoUnsupported::print must be in lib/Codegen rather than in the existing file in lib/IR/ to avoid introducing a dependency from IR to CodeGen. Differential Revision: http://reviews.llvm.org/D16590 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258951 91177308-0d34-0410-b5e6-96231b3b80d8	2016-01-27 17:30:33 +00:00
Marek Olsak	73be6ab813	AMDGPU/SI: Stoney has only 16 LDS banks Summary: This is a candidate for stable, along with all patches that add the "stoney" processor. Reviewers: tstellarAMD Subscribers: arsenm Differential Revision: http://reviews.llvm.org/D16485 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258922 91177308-0d34-0410-b5e6-96231b3b80d8	2016-01-27 11:19:45 +00:00
Matt Arsenault	de2c3bc98d	AMDGPU: Fix default device handling When no device name is specified, default to kaveri for HSA since SI is not supported and it woud fail. Default to "tahiti" instead of "SI" since these are effectively the same, and tahiti is an actual device. Move default device handling to the TargetMachine rather than the AMDGPUSubtarget. The module ISA version is computed from the device name provided with the target machine, so the attributes printed by the AsmPrinter were inconsistent with those computed in the subtarget. Also remove DevName field from subtarget since it's redundant with getCPU() in the superclass. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258901 91177308-0d34-0410-b5e6-96231b3b80d8	2016-01-27 02:17:49 +00:00
Matt Arsenault	ce00361269	AMDGPU: Make v32i8/v64i8 illegal types Old intrinsics were forcing these, but they have now all been removed. This fixes large i8 vector operations generally being broken. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258788 91177308-0d34-0410-b5e6-96231b3b80d8	2016-01-26 04:43:48 +00:00

1 2 3 4 5

228 Commits