RPCS3/llvm

mirror of https://github.com/RPCS3/llvm.git synced 2025-05-16 10:26:23 +00:00

Author	SHA1	Message	Date
Matt Arsenault	6939475a93	AMDGPU: Cleanup immediate folding code Move code down to use, reorder to avoid hard to follow immediate folding logic. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287818 91177308-0d34-0410-b5e6-96231b3b80d8	2016-11-23 21:51:07 +00:00
Matt Arsenault	ef97727654	AMDGPU: Fix debug printing The uint8_t was printed as a char which didn't really work. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@287817 91177308-0d34-0410-b5e6-96231b3b80d8	2016-11-23 21:51:05 +00:00
Konstantin Zhuravlyov	9027123253	[AMDGPU] Add f16 support (VI+) Differential Revision: https://reviews.llvm.org/D25975 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@286753 91177308-0d34-0410-b5e6-96231b3b80d8	2016-11-13 07:01:11 +00:00
Matt Arsenault	5f58f8ecb6	AMDGPU: Don't fold undef uses or copies with implicit uses git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283476 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-06 18:12:13 +00:00
Matt Arsenault	de50b32fba	AMDGPU: Remove leftover implicit operands when folding immediates When constant folding an operation to a copy or an immediate mov, the implicit uses/defs of the old instruction were left behind, e.g. replacing v_or_b32 left the implicit exec use on the new copy. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283471 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-06 17:54:30 +00:00
Mehdi Amini	67f335d992	Use StringRef in Pass/PassManager APIs (NFC) git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283004 91177308-0d34-0410-b5e6-96231b3b80d8	2016-10-01 02:56:57 +00:00
Matt Arsenault	0f7125844e	AMDGPU: Support folding FrameIndex operands This avoids test regressions in a future commit. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281491 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-14 15:51:33 +00:00
Matt Arsenault	8bc95d0a47	AMDGPU: Improve splitting 64-bit bit ops by constants This addresses a TODO to handle operations besides and. This also starts eliminating no-op operations with a constant that can emerge later. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281488 91177308-0d34-0410-b5e6-96231b3b80d8	2016-09-14 15:19:03 +00:00
Matt Arsenault	8f1b18be38	AMDGPU: Don't fold subregister extracts into tied operands git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278676 91177308-0d34-0410-b5e6-96231b3b80d8	2016-08-15 16:18:36 +00:00
Duncan P. N. Exon Smith	567409db69	CodeGen: Use MachineInstr& in TargetInstrInfo, NFC This is mostly a mechanical change to make TargetInstrInfo API take MachineInstr& (instead of MachineInstr* or MachineBasicBlock::iterator) when the argument is expected to be a valid MachineInstr. This is a general API improvement. Although it would be possible to do this one function at a time, that would demand a quadratic amount of churn since many of these functions call each other. Instead I've done everything as a block and just updated what was necessary. This is mostly mechanical fixes: adding and removing `*` and `&` operators. The only non-mechanical change is to split ARMBaseInstrInfo::getOperandLatencyImpl out from ARMBaseInstrInfo::getOperandLatency. Previously, the latter took a `MachineInstr*` which it updated to the instruction bundle leader; now, the latter calls the former either with the same `MachineInstr&` or the bundle leader. As a side effect, this removes a bunch of MachineInstr* to MachineBasicBlock::iterator implicit conversions, a necessary step toward fixing PR26753. Note: I updated WebAssembly, Lanai, and AVR (despite being off-by-default) since it turned out to be easy. I couldn't run tests for AVR since llc doesn't link with it turned on. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@274189 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-30 00:01:54 +00:00
Matt Arsenault	759ed7e410	AMDGPU: Cleanup subtarget handling. Split AMDGPUSubtarget into amdgcn/r600 specific subclasses. This removes most of the static_casting of the basic codegen classes everywhere, and tries to restrict the features visible on the wrong target. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273652 91177308-0d34-0410-b5e6-96231b3b80d8	2016-06-24 06:30:11 +00:00
Andrew Kaylor	c7ca1302cf	Add optimization bisect opt-in calls for AMDGPU passes Differential Revision: http://reviews.llvm.org/D19450 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267485 91177308-0d34-0410-b5e6-96231b3b80d8	2016-04-25 22:23:44 +00:00
Matt Arsenault	33e18796f1	AMDGPU: Fix passes depending on dominator tree for no reason git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260494 91177308-0d34-0410-b5e6-96231b3b80d8	2016-02-11 06:15:34 +00:00
Marek Olsak	bc61f352af	AMDGPU/SI: Fix a bug in SIFoldOperands Summary: ret.ll will contain a test for this Reviewers: tstellarAMD, arsenm Subscribers: arsenm Differential Revision: http://reviews.llvm.org/D16029 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257590 91177308-0d34-0410-b5e6-96231b3b80d8	2016-01-13 11:44:29 +00:00
Nicolai Haehnle	702b589510	AMDGPU/SI: Fold operands with sub-registers Summary: Multi-dword constant loads generated unnecessary moves from SGPRs into VGPRs, increasing the code size and VGPR pressure. These moves are now folded away. Note that this lack of operand folding was not a problem for VMEM loads, because COPY nodes from VReg_Nnn to VGPR32 are eliminated by the register coalescer. Some tests are updated, note that the fsub.ll test explicitly checks that the move is elided. With the IR generated by current Mesa, the changes are obviously relatively minor: 7063 shaders in 3531 tests Totals: SGPRS: 351872 -> 352560 (0.20 %) VGPRS: 199984 -> 200732 (0.37 %) Code Size: 9876968 -> 9881112 (0.04 %) bytes LDS: 91 -> 91 (0.00 %) blocks Scratch: 1779712 -> 1767424 (-0.69 %) bytes per wave Wait states: 295164 -> 295337 (0.06 %) Totals from affected shaders: SGPRS: 65784 -> 66472 (1.05 %) VGPRS: 38064 -> 38812 (1.97 %) Code Size: 1993828 -> 1997972 (0.21 %) bytes LDS: 42 -> 42 (0.00 %) blocks Scratch: 795648 -> 783360 (-1.54 %) bytes per wave Wait states: 54026 -> 54199 (0.32 %) Reviewers: tstellarAMD, arsenm, mareko Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D15875 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257074 91177308-0d34-0410-b5e6-96231b3b80d8	2016-01-07 17:10:29 +00:00
Matt Arsenault	caa831cabd	AMDGPU: Fix verifier error in SIFoldOperands There may be other use operands that also need their kill flags cleared. This happens in a few tests when SIFoldOperands is moved after PeepholeOptimizer. PeepholeOptimizer rewrites cases that look like: %vreg0 = ... %vreg1 = COPY %vreg0 use %vreg1<kill> %vreg2 = COPY %vreg0 use %vreg2<kill> to use the earlier source to %vreg0 = ... use %vreg0 use %vreg0 Currently SIFoldOperands sees the copied registers, so there is only one use. So far I haven't managed to come up with a test that currently has multiple uses of a foldable VGPR -> VGPR copy. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@250960 91177308-0d34-0410-b5e6-96231b3b80d8	2015-10-21 22:37:50 +00:00
Andrew Kaylor	aac3c943f3	Improved the interface of methods commuting operands, improved X86-FMA3 mem-folding&coalescing. Patch by Slava Klochkov (vyacheslav.n.klochkov@intel.com) Differential Revision: http://reviews.llvm.org/D11370 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248735 91177308-0d34-0410-b5e6-96231b3b80d8	2015-09-28 20:33:22 +00:00
Matt Arsenault	7a6a7f2409	AMDGPU: Fix recomputing dominator tree unnecessarily SIFixSGPRCopies does not modify the CFG, but this was being recomputed before running SIFoldOperands. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@248587 91177308-0d34-0410-b5e6-96231b3b80d8	2015-09-25 17:21:28 +00:00
Matt Arsenault	117c014fc5	AMDGPU/SI: Fix creating v_mov_b32s without exec uses This will be caught by existing tests with a verifier check to be added in a future commit. git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247229 91177308-0d34-0410-b5e6-96231b3b80d8	2015-09-10 01:06:06 +00:00
Tom Stellard	6680fc3579	AMDGPU/SI: Fold operands through REG_SEQUENCE instructions Summary: This helps mostly when we use add instructions for address calculations that contain immediates. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D12256 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247157 91177308-0d34-0410-b5e6-96231b3b80d8	2015-09-09 15:43:26 +00:00
Tom Stellard	127a3d74f1	AMDGPU/SI: Fix some invalid assumptions when folding 64-bit immediates Summary: We were assuming that if the use operand had a sub-register that the immediate was 64-bits, but this was breaking the case of folding a 64-bit immediate into another 64-bit instruction. Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D12255 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246354 91177308-0d34-0410-b5e6-96231b3b80d8	2015-08-29 01:58:21 +00:00
Tom Stellard	0554ee323c	AMDGPU/SI: Factor operand folding code into its own function Reviewers: arsenm Subscribers: arsenm, llvm-commits Differential Revision: http://reviews.llvm.org/D12254 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246353 91177308-0d34-0410-b5e6-96231b3b80d8	2015-08-28 23:45:19 +00:00
Tom Stellard	f5be357d37	AMDGPU/SI: Select mad patterns to v_mac_f32 The two-address instruction pass will convert these back to v_mad_f32 if necessary. Differential Revision: http://reviews.llvm.org/D11060 git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@242038 91177308-0d34-0410-b5e6-96231b3b80d8	2015-07-13 15:47:57 +00:00
Tom Stellard	953c681473	R600 -> AMDGPU rename git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239657 91177308-0d34-0410-b5e6-96231b3b80d8	2015-06-13 03:28:10 +00:00

74 Commits