RPCSX/llvm

mirror of https://github.com/RPCSX/llvm.git synced 2025-01-09 13:41:35 +00:00

Author	SHA1	Message	Date
Matt Arsenault	115244a728	AMDGPU: Fix kernel argument alignment impacting stack size Don't use AllocateStack because kernel arguments have nothing to do with the stack. The ensureMaxAlignment call was still changing the stack alignment. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273080 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-18 05:15:53 +00:00
Matt Arsenault	863cff46f2	AMDGPU: Temporarily select trap to s_endpgm This should select to s_trap, but that requires additional work to set up and enable the trap handler. For now emit s_endpgm so bugpoint stops getting stuck on the unsupported call to abort. Emit a warning that this will only terminate the wave and not really trap. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273062 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-17 22:27:03 +00:00
Matt Arsenault	310a3752c0	AMDGPU: Remove llvm.SI.tid intrinsic Mesa doesn't emit this for llvm >= 3.8 anymore. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273050 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-17 21:18:41 +00:00
Matt Arsenault	11e5e3bbe1	AMDGPU: Disable scheduling in some slow tests Disabling the pre-RA scheduler on large-work-group-registers causes it to be ~50% slower. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272860 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-16 00:56:47 +00:00
Nicolai Haehnle	682fc3e780	AMDGPU: Fix MUBUF offset bugs affecting llvm.amdgcn.buffer.* intrinsics Summary: This fixes two related bugs. First, the generic optimization passes unfortunately generate negative constant offsets but the hardware treats SOffset as an unsigned value. Second, there is a hardware bug on SI and CI, where address clamping in MUBUF instructions does not work correctly when SOffset is larger than the buffer size. This patch works around this bug by never using SOffset. An alternative workaround would be to do the clamping manually when SOffset is too large, but generating the required code sequence during instruction selection would be rather involved, and in any case the resulting code would probably be worse. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96360 Reviewers: arsenm, tstellarAMD Subscribers: arsenm, llvm-commits, kzhuravl Differential Revision: http://reviews.llvm.org/D21326 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272761 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-15 07:13:05 +00:00
Matt Arsenault	6af03e5068	AMDGPU: Run pointer optimization passes git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272736 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-15 00:11:01 +00:00
Marek Olsak	760c36c5ae	AMDGPU/SI: Set INDEX_STRIDE for scratch coalescing Summary: Mesa and other users must set this to enable coalescing: - STRIDE = 0 - SWIZZLE_ENABLE = 1 This makes one particular compute shader 8x faster. Reviewers: tstellarAMD, arsenm Subscribers: arsenm, kzhuravl Differential Revision: http://reviews.llvm.org/D21136 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272556 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-13 16:05:57 +00:00
Tom Stellard	4ee3d0cb4d	AMDGPU/SI: Don't use fixup_si_rodata for scratch rsrc relocations Summary: We need to set the fixup type to FK_Data_4 for the SCRATCH_RSRC_DWORD[01] symbols, since these require absolute relocations, and fixup_si_rodata is for relative relocations. Reviewers: arsenm, kzhuravl Subscribers: arsenm, kzhuravl, llvm-commits Differential Revision: http://reviews.llvm.org/D21153 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272417 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-10 19:26:38 +00:00
Matt Arsenault	dbaa4b4486	AMDGPU: v_cndmask_b32 does not def vcc Fixes verifier errors after SIShrinkInstructions. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272351 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-10 00:18:41 +00:00
Tom Stellard	60f588f570	AMDGPU/SI: Make sure to emit TargetConstant nodes when matching ds_permute Summary: This fixes a bug with ds_permute instructions where if it was passed a constant address, then the offset operand would get assigned a register operand instead of an immediate. Reviewers: scchan, arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D19994 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272349 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-10 00:01:04 +00:00
Matt Arsenault	4080a06a24	AMDGPU: Fix flat atomics The flat atomics could already be selected, but only when using flat instructions for global memory. Add patterns for flat addresses. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272345 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-09 23:42:54 +00:00
Matt Arsenault	bada556f73	AMDGPU: Fix i64 global cmpxchg This was using extract_subreg sub0 to extract the low register of the result instead of sub0_sub1, producing an invalid copy. There doesn't seem to be a way to use the compound subreg indices in tablegen since those are generated, so manually select it. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272344 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-09 23:42:48 +00:00
Matt Arsenault	003d842e7f	AMDGPU: Fix missing and broken check lines in atomic tests git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272343 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-09 23:42:44 +00:00
Wei Ding	39ce7152a2	AMDGPU/SI: Fix 32-bit fdiv lowering We were using the fast fdiv lowering for all division, implementation of IEEE754 fdiv is added. http://reviews.llvm.org/D20557 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272292 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-09 19:17:15 +00:00
Jan Vesely	406c47ff89	SelectionDAG: Implement expansion of {S,U}MIN/MAX in integer legalization Fixes {u,}long_{min,max,clamp} opencl piglit regressions on EG. Reviewers: arsenm Differential Revision: http://reviews.llvm.org/D17898 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272272 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-09 16:04:00 +00:00
Nicolai Haehnle	2ac1fa00c9	AMDGPU: Add amdgpu-ps-wqm-outputs function attributes Summary: The presence of this attribute indicates that VGPR outputs should be computed in whole quad mode. This will be used by Mesa for prolog pixel shaders, so that derivatives can be taken of shader inputs computed by the prolog, fixing a bug. The generated code could certainly be improved: if a prolog pixel shader is used (which isn't common in modern OpenGL - they're used for gl_Color, polygon stipples, and forcing per-sample interpolation), Mesa will use this attribute unconditionally, because it has to be conservative. So WQM may be used in the prolog when it isn't really needed, and furthermore a silly back-and-forth switch is likely to happen at the boundary between prolog and main shader parts. Fixing this is a bit involved: we'd first have to add a mechanism by which LLVM writes the WQM-related input requirements to the main shader part binary, and then Mesa specializes the prolog part accordingly. At that point, we may as well just compile a monolithic shader... Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95130 Reviewers: arsenm, tstellarAMD, mareko Subscribers: arsenm, llvm-commits, kzhuravl Differential Revision: http://reviews.llvm.org/D20839 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272063 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-07 21:37:17 +00:00
Eric Christopher	c2a7f10882	Revert "Differential Revision: http://reviews.llvm.org/D20557 " Author: Wei Ding <wei.ding2@amd.com> Date: Tue Jun 7 19:04:44 2016 +0000 Differential Revision: http://reviews.llvm.org/D20557 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272044 91177308-0d34-0410-b5e6-96231b3b80d8 as it was breaking the bots. This reverts commit r272044. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272056 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-07 20:27:12 +00:00
Wei Ding	e2d1122183	Differential Revision: http://reviews.llvm.org/D20557 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272044 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-07 19:04:44 +00:00
Matt Arsenault	bc1b8d5b49	AMDGPU: Fix constantexpr addrspacecasts If we had a constant group address space cast the queue pointer wasn't enabled for the function, resulting in a crash on noreg later. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271935 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-06 20:03:31 +00:00
Artem Tamazov	7049ac906c	[AMDGPU][llvm-mc] v_cndmask_b32: src2 is mandatory; do not enforce VOP2 when src2 == VCC. Another step for unification llvm assembler/disassembler with sp3. Besides, CodeGen output is a bit improved, thus changes in CodeGen tests. Assembler/Disassembler tests updated/added. Differential Revision: http://reviews.llvm.org/D20796 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271900 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-06 15:23:43 +00:00
Matt Arsenault	29d0ea4bc8	AMDGPU: Cleanup load tests There are a lot of different kinds of loads to test for, and these were scattered around inconsistently with some redundancy. Try to comprehensively test all loads in a consistent way. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271571 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-02 19:54:26 +00:00
Matt Arsenault	747c0a6e8b	AMDGPU: Temporary fix for broken store combine git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271567 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-02 19:00:55 +00:00
Matt Arsenault	9d6fa96f46	AMDGPU: Fix crashes on unknown processor name If the processor name failed to parse for amdgcn, the resulting output would have R600 ISA in it. If the processor name was missing or invalid for R600, the wavefront size would not be set and there would be crashes from missing itinerary data. Fixes crashes in future commit caused by dividing by the unset/0 wavefront size. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271561 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-02 18:37:16 +00:00
Matthias Braun	1cd242fe11	CodeGen: Refactor renameDisconnectedComponents() as a pass Refactor LiveIntervals::renameDisconnectedComponents() to be a pass. Also change the name to "RenameIndependentSubregs": - renameDisconnectedComponents() worked on a MachineFunction at a time so it is a natural candidate for a machine function pass. - The algorithm is testable with a .mir test now. - This also fixes a problem where the lazy renaming as part of the MachineScheduler introduced IMPLICIT_DEF instructions after the number of nodes in a region was counted, leading to a mismatch. Differential Revision: http://reviews.llvm.org/D20507 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271345 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-31 22:38:06 +00:00
Matt Arsenault	c3eeba0f4c	AMDGPU: Cleanup vector insert/extract tests This mostly makes sure that 3-vector dynamic inserts and extracts are covered. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271082 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-28 00:51:06 +00:00
Matt Arsenault	14cb586d5e	AMDGPU: Add fract intrinsic Remove broken patterns matching it. This was matching the unsafe math pattern and expanding the fix for the buggy instruction from the pattern. The problems are also on CI. Remove the workarounds and only use fract with unsafe math or from the intrinsic. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@271078 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-28 00:19:52 +00:00
Changpeng Fang	faf7289db3	AMDGPU/SI: Enable load-store-opt by default. Summary: Enable load-store-opt by default, and update LIT tests. Reviewers: arsenm Differential Revision: http://reviews.llvm.org/D20694 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270894 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-26 19:35:29 +00:00
Diana Picus	f46038dc53	[AMDGPU] Remove exit-on-error flag from test (PR27762) Similar to r269948, but for argument lowering. Fixes PR27762 Differential Revision: http://reviews.llvm.org/D20430 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270856 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-26 15:24:55 +00:00
Matt Arsenault	2997ae6e3e	AMDGPU: Fix v2i64/v2f64 bitcasts These operations tend to get promoted away to v4i32 so this doesn't happen often. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270740 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-25 18:07:36 +00:00
Matt Arsenault	211d1cd5a3	AMDGPU: Fix missing br_cc i1 test coverage Also un xfail a test. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270739 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-25 17:58:27 +00:00
Matt Arsenault	53d233a178	AMDGPU: Make vectorization defeating test changes Simplifies test updates in the future. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270736 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-25 17:42:39 +00:00
Matt Arsenault	068cdecac2	AMDGPU: Fix inconsistent lowering of select of vectors f32 vectors would use a sequence of BFI instructions instead of unrolled cmp + select. This was better in the case of a VALU select with SGPR inputs, but we don't have a way of dealing with that in the DAG. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270731 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-25 17:34:58 +00:00
Konstantin Zhuravlyov	d7b9b912dd	[AMDGPU][NFC] Rename ReserveTrapVGPRs -> ReserveRegs Differential Revision: http://reviews.llvm.org/D20081 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270594 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-24 18:37:18 +00:00
Matt Arsenault	03ca6fb151	AMDGPU: Define priorities for register classes Allocating larger register classes first should give better allocation results (and more importantly for myself, make the lit tests more stable with respect to scheduler changes). Patch by Matthias Braun git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270312 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-21 03:55:07 +00:00
Matt Arsenault	be522c6214	AMDGPU: Cleanup lowering actions These are kind of a mess and hard to follow, particularly for loads and stores. Fix various redundant, unnecessary and dead settings. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270307 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-21 02:27:49 +00:00
Matt Arsenault	4e5b30a0a9	AMDGPU: Fix high bits after division optimization This is essentially doing a 24-bit signed division with FP. We need to truncate to the N bit result. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270305 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-21 01:53:33 +00:00
Matt Arsenault	6416e4c521	AMDGPU: Fix verifier error when spilling SGPRs The current SGPR spilling test does not stress this because it is using s_buffer_load instructions to increase SGPR pressure and spill, but their output operands have the same SReg_32_XM0 constraint. This fixes an error when the SReg_32 output from most instructions is spilled. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270301 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-21 00:53:42 +00:00
Matt Arsenault	9d922be248	AMDGPU: Handle cbranch vccz/vccnz git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270297 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-21 00:29:40 +00:00
Matt Arsenault	dcb6543de5	AMDGPU: Implement ReverseBranchCondition git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270296 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-21 00:29:34 +00:00
Matt Arsenault	f91238f391	AMDGPU: Implement AnalyzeBranch Original patch by Tom Stellard git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270295 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-21 00:29:27 +00:00
Matthias Braun	6054e84d82	LiveIntervalAnalysis: Rework constructMainRangeFromSubranges() We now use LiveRangeCalc::extendToUses() instead of a specially designed algorithm in constructMainRangeFromSubranges(): - The original motivation for constructMainRangeFromSubranges() were differences between the main liverange and subranges because of hidden dead definitions. This case however cannot happen anymore with the DetectDeadLaneMasks pass in place. - It simplifies the code. - This fixes a longstanding bug where we did not properly create new SSA values on merging control flow (the MachineVerifier missed most of these cases). - Move constructMainRangeFromSubranges() to LiveIntervalAnalysis and LiveRangeCalc to better match the implementation/available helper functions. This re-applies r269016. The fixes from r270290 and r270259 should avoid the machine verifier problems this time. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270291 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-20 23:14:56 +00:00
Matthias Braun	d8eb7dec3e	MachineVerifier: subregs do not require defs/valnos on every path It is fine for subregister ranges to be undefined on some CFG paths as we may have a "vregX:other_subreg<read-undef> =" def on that path. We do not (and should not) have live segments for the subregister ranges. The MachineVerifier should not complain about this. This is a slight variant of http://llvm.org/PR27705 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270290 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-20 23:02:13 +00:00
Matthias Braun	2a73788c72	LiveIntervalAnalysis: Fix missing defs in renameDisconnectedComponents(). Fix renameDisconnectedComponents() creating vreg uses that can be reached from function begin without having a definition (or explicit live-in). Fix this by inserting IMPLICIT_DEF instructions before control-flow joins as necessary. Removes an assert from MachineScheduler because we may now get additional IMPLICIT_DEF when preparing the scheduling policy. This fixes the underlying problem of http://llvm.org/PR27705 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270259 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-20 19:46:13 +00:00
Matt Arsenault	44aaff08ed	AMDGPU: Fix promote alloca for pointer loads If the load has a pointer type, we don't want to change its type. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270000 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-18 23:20:24 +00:00
Matt Arsenault	3cd52aec7c	AMDGPU: Other sizes of popcnt are fast We can chain bcnt instructions together, so any width popcnt is pretty fast. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@269950 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-18 16:10:19 +00:00
Matt Arsenault	5d9f8fb9d4	AMDGPU: Fix assert when erroring on a call For some reason an assert is now hit when a valid chain is not returned, so return the entry chain. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@269948 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-18 16:10:11 +00:00
Matt Arsenault	41cf920df5	AMDGPU: Handle alloca promoting with null operands If the second pointer in a multi-pointer instruction is a constant, we can replace the type. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@269945 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-18 15:57:21 +00:00
Matt Arsenault	c33f9cd287	AMDGPU: Fix a few slightly broken tests Fix minor bugs and uses of undef which break when pointer related optimization passes are run. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@269944 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-18 15:48:44 +00:00
Jan Vesely	350e40ffb2	AMDGPU/R600: Use correct number of vector elements when lowering private loads Reviewer: tstellardAMD, arsenm Subscribers: arsenm, kzhuravl, llvm-commits Differential Revision: http://reviews.llvm.org/D20032 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@269725 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-16 23:56:32 +00:00
Matt Arsenault	abc9f47dfe	AMDGPU: Add some private element size tests git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@269712 91177308-0d34-0410-b5e6-96231b3b80d8	2016-05-16 22:17:27 +00:00

409 Commits