llvm/lib/Target/AMDGPU
Marek Olsak 716b378a48 AMDGPU/SI: Increase SGPR limit to 96 on Tonga/Iceland
Summary:
This is the setting of the Vulkan closed source driver.

It decreases the max wave count from 10 to 8.

26010 shaders in 14650 tests
Totals:
VGPRS: 829593 -> 808440 (-2.55 %)
Spilled SGPRs: 81878 -> 42226 (-48.43 %)
Spilled VGPRs: 367 -> 358 (-2.45 %)
Scratch VGPRs: 1764 -> 1748 (-0.91 %) dwords per thread
Code Size: 36677864 -> 35923932 (-2.06 %) bytes

There is a massive decrease in SGPR spilling in general and -7.4% spilled
VGPRs for DiRT Showdown (= SGPRs spilled to scratch?)

Reviewers: arsenm, tstellarAMD, nhaehnle

Subscribers: arsenm, llvm-commits, kzhuravl

Differential Revision: https://reviews.llvm.org/D23034

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277867 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-05 21:23:29 +00:00
..
AsmParser [AMDGPU] Assembler: fix row_bcast parsing 2016-07-14 14:50:35 +00:00
Disassembler AMDGPU: Expand register indexing pseudos in custom inserter 2016-07-19 00:35:03 +00:00
InstPrinter AMDGPU: Remove unnecessary string usage in AsmPrinter 2016-07-05 22:06:56 +00:00
MCTargetDesc MC] Provide an MCTargetOptions to implementors of MCAsmBackendCtorTy, NFC 2016-07-25 17:18:28 +00:00
TargetInfo Remove autoconf support 2016-01-26 21:29:08 +00:00
Utils [AMDGPU] Enable absolute expression initializer for amd_kernel_code_t fields. 2016-06-23 14:13:06 +00:00
AMDGPU.h AMDGPU: Change fdiv lowering based on !fpmath metadata 2016-07-19 23:16:53 +00:00
AMDGPU.td AMDGPU: Add feature for unaligned access 2016-07-01 23:03:44 +00:00
AMDGPUAlwaysInlinePass.cpp Cloning: Clean up the interface to the CloneFunction function. 2016-05-10 20:23:24 +00:00
AMDGPUAnnotateKernelFeatures.cpp AMDGPU: Add HSA dispatch id intrinsic 2016-07-22 17:01:30 +00:00
AMDGPUAnnotateUniformValues.cpp Add optimization bisect opt-in calls for AMDGPU passes 2016-04-25 22:23:44 +00:00
AMDGPUAsmPrinter.cpp [OpenCL] Add missing tests for getOCLTypeName 2016-08-04 19:45:00 +00:00
AMDGPUAsmPrinter.h AMDGPU: Delete more dead code 2016-07-22 17:01:25 +00:00
AMDGPUCallingConv.td AMDGPU: Fix kernel argument alignment impacting stack size 2016-06-18 05:15:53 +00:00
AMDGPUCallLowering.cpp AMDGPU: Add skeleton GlobalIsel implementation 2016-04-14 19:09:28 +00:00
AMDGPUCallLowering.h AMDGPU: Add skeleton GlobalIsel implementation 2016-04-14 19:09:28 +00:00
AMDGPUCodeGenPrepare.cpp AMDGPU: Use rcp for fdiv 1, x with fpmath metadata 2016-07-26 23:25:44 +00:00
AMDGPUFrameLowering.cpp MachineFunction: Return reference for getFrameInfo(); NFC 2016-07-28 18:40:00 +00:00
AMDGPUFrameLowering.h AMDGPU: Move R600 only pieces into R600 classes 2016-07-09 18:11:15 +00:00
AMDGPUInstrInfo.cpp AMDGPU: Move R600 only pieces into R600 classes 2016-07-09 18:11:15 +00:00
AMDGPUInstrInfo.h AMDGPU: Move R600 only pieces into R600 classes 2016-07-09 18:11:15 +00:00
AMDGPUInstrInfo.td AMDGPU : Add intrinsics for compare with the full wavefront result 2016-07-28 16:42:13 +00:00
AMDGPUInstructions.td AMDGPU: Fix i1 fp_to_int 2016-07-22 17:01:21 +00:00
AMDGPUIntrinsicInfo.cpp AMDGPU: Change fdiv lowering based on !fpmath metadata 2016-07-19 23:16:53 +00:00
AMDGPUIntrinsicInfo.h AMDGPU: Change fdiv lowering based on !fpmath metadata 2016-07-19 23:16:53 +00:00
AMDGPUIntrinsics.td AMDGPU: Remove read_workdim intrinsic 2016-07-25 20:17:02 +00:00
AMDGPUISelDAGToDAG.cpp MachineFunction: Return reference for getFrameInfo(); NFC 2016-07-28 18:40:00 +00:00
AMDGPUISelLowering.cpp [X86] Heuristic to selectively build Newton-Raphson SQRT estimation 2016-08-04 12:47:28 +00:00
AMDGPUISelLowering.h [X86] Heuristic to selectively build Newton-Raphson SQRT estimation 2016-08-04 12:47:28 +00:00
AMDGPUMachineFunction.cpp AMDGPU: Make AMDGPUMachineFunction fields private 2016-07-26 16:45:58 +00:00
AMDGPUMachineFunction.h AMDGPU: Make AMDGPUMachineFunction fields private 2016-07-26 16:45:58 +00:00
AMDGPUMCInstLower.cpp AMDGPU/SI: Add support for R_AMDGPU_GOTPCREL 2016-07-13 14:23:33 +00:00
AMDGPUMCInstLower.h AMDGPU: R600 code splitting cleanup 2016-03-11 08:00:27 +00:00
AMDGPUOpenCLImageTypeLoweringPass.cpp [NFC] Header cleanup 2016-04-18 09:17:29 +00:00
AMDGPUPromoteAlloca.cpp AMDGPU: Remove pointless dyn_cast_or_null 2016-07-18 19:00:07 +00:00
AMDGPURegisterInfo.cpp AMDGPU: Move R600 only pieces into R600 classes 2016-07-09 18:11:15 +00:00
AMDGPURegisterInfo.h AMDGPU: Move R600 only pieces into R600 classes 2016-07-09 18:11:15 +00:00
AMDGPURegisterInfo.td AMDGPU: Set SubRegIndex size and offset 2015-07-30 17:03:11 +00:00
AMDGPURuntimeMetadata.h Re-commit [AMDGPU] Add metadata for runtime 2016-07-16 05:09:21 +00:00
AMDGPUSubtarget.cpp AMDGPU: Delete dead code 2016-07-25 19:06:25 +00:00
AMDGPUSubtarget.h AMDGPU/SI: Increase SGPR limit to 96 on Tonga/Iceland 2016-08-05 21:23:29 +00:00
AMDGPUTargetMachine.cpp [GlobalISel] Introduce an instruction selector. 2016-07-27 14:31:55 +00:00
AMDGPUTargetMachine.h AMDGPU: Delete more dead code 2016-07-22 17:01:25 +00:00
AMDGPUTargetObjectFile.cpp Revert "[AMDGPU] Emit read-only data to .rodata for hsa" 2016-07-22 23:46:40 +00:00
AMDGPUTargetObjectFile.h AMDGPU/SI: Add support for AMD code object version 2. 2016-05-05 17:03:33 +00:00
AMDGPUTargetTransformInfo.cpp AMDGPU: Implement getLoadStoreVecRegBitWidth 2016-07-01 00:56:27 +00:00
AMDGPUTargetTransformInfo.h [TTI] The cost model should not assume vector casts get completely scalarized 2016-07-06 17:30:56 +00:00
AMDILCFGStructurizer.cpp AMDGPU: Remove implicit iterator conversions, NFC 2016-07-08 19:16:05 +00:00
AMDKernelCodeT.h [AMDGPU] fix amd_kernel_code_t bit field position as per spec (added missing reserved fields) 2016-02-24 10:54:25 +00:00
CaymanInstructions.td AMDGPU/R600: Add PatFrags for selecting the correct vtx id for loads 2016-07-05 00:12:51 +00:00
CIInstructions.td [AMDGPU] refactor DS instruction definitions. NFC. 2016-08-01 14:21:30 +00:00
CMakeLists.txt AMDGPU/R600: Delete/rename intrinsics no longer used by mesa 2016-07-14 05:47:17 +00:00
DSInstructions.td [AMDGPU] refactor DS instruction definitions. NFC. 2016-08-01 14:21:30 +00:00
EvergreenInstructions.td AMDGPU/R600: Replace barrier intrinsics 2016-07-18 18:34:59 +00:00
GCNHazardRecognizer.cpp AMDGPU: Cleanup subtarget handling. 2016-06-24 06:30:11 +00:00
GCNHazardRecognizer.h AMDGPU: Cleanup subtarget handling. 2016-06-24 06:30:11 +00:00
LLVMBuild.txt AMDGPU: Prune AMDGPUAsmParser in libdeps. 2016-07-09 07:54:27 +00:00
Processors.td AMDGPU: Fix crashes on unknown processor name 2016-06-02 18:37:16 +00:00
R600ClauseMergePass.cpp AMDGPU: Remove implicit iterator conversions, NFC 2016-07-08 19:16:05 +00:00
R600ControlFlowFinalizer.cpp AMDGPU: Delete more dead code 2016-07-22 17:01:25 +00:00
R600Defines.h AMDGPU: R600 code splitting cleanup 2016-03-11 08:00:27 +00:00
R600EmitClauseMarkers.cpp AMDGPU: Remove implicit iterator conversions, NFC 2016-07-08 19:16:05 +00:00
R600ExpandSpecialInstrs.cpp AMDGPU: Delete more dead code 2016-07-22 17:01:25 +00:00
R600FrameLowering.cpp AMDGPU: Cleanup subtarget handling. 2016-06-24 06:30:11 +00:00
R600FrameLowering.h AMDGPU: Cleanup subtarget handling. 2016-06-24 06:30:11 +00:00
R600InstrFormats.td
R600InstrInfo.cpp [AMDGPU] Fix lifetime of SmallVector temporaries. 2016-07-30 11:31:16 +00:00
R600InstrInfo.h AMDGPU/R600: Delete dead code. 2016-07-15 21:26:46 +00:00
R600Instructions.td AMDGPU: Fix TargetPrefix for remaining r600 intrinsics 2016-07-15 21:27:08 +00:00
R600Intrinsics.td AMDGPU: Fix TargetPrefix for remaining r600 intrinsics 2016-07-15 21:27:08 +00:00
R600ISelLowering.cpp AMDGPU/R600: Remove dead custom inserters 2016-07-26 21:03:38 +00:00
R600ISelLowering.h AMDGPU: Fix i1 fp_to_int 2016-07-22 17:01:21 +00:00
R600MachineFunctionInfo.cpp AMDGPU: Delete more dead code 2016-07-22 17:01:25 +00:00
R600MachineFunctionInfo.h AMDGPU: Delete more dead code 2016-07-22 17:01:25 +00:00
R600MachineScheduler.cpp CodeGen: Use MachineInstr& in TargetInstrInfo, NFC 2016-06-30 00:01:54 +00:00
R600MachineScheduler.h AMDGPU: Cleanup subtarget handling. 2016-06-24 06:30:11 +00:00
R600OptimizeVectorRegisters.cpp AMDGPU: Remove implicit iterator conversions, NFC 2016-07-08 19:16:05 +00:00
R600Packetizer.cpp CodeGen: Use MachineInstr& in TargetInstrInfo, NFC 2016-06-30 00:01:54 +00:00
R600RegisterInfo.cpp AMDGPU: Move R600 only pieces into R600 classes 2016-07-09 18:11:15 +00:00
R600RegisterInfo.h AMDGPU: Move R600 only pieces into R600 classes 2016-07-09 18:11:15 +00:00
R600RegisterInfo.td
R600Schedule.td AMDGPU: Fix trailing whitespace 2016-06-10 02:18:02 +00:00
R700Instructions.td
SIAnnotateControlFlow.cpp AMDGPU/SI: Don't handle a loop if there is no loop at all for a terminator BB. 2016-07-28 23:01:45 +00:00
SIDebuggerInsertNops.cpp AMDGPU: Cleanup subtarget handling. 2016-06-24 06:30:11 +00:00
SIDefines.h AMDGPU: Stay in WQM for non-intrinsic stores 2016-08-02 19:31:14 +00:00
SIFixControlFlowLiveIntervals.cpp Revert "AMDGPU: Remove unused control flow intrinsic" 2016-07-09 17:18:39 +00:00
SIFixSGPRCopies.cpp Revert "AMDGPU: Remove unused control flow intrinsic" 2016-07-09 17:18:39 +00:00
SIFoldOperands.cpp CodeGen: Use MachineInstr& in TargetInstrInfo, NFC 2016-06-30 00:01:54 +00:00
SIFrameLowering.cpp MachineFunction: Return reference for getFrameInfo(); NFC 2016-07-28 18:40:00 +00:00
SIFrameLowering.h [AMDGPU] Emit debugger prologue and emit the rest of the debugger fields in the kernel code header 2016-06-25 03:11:28 +00:00
SIInsertWaits.cpp AMDGPU: Remove implicit iterator conversions, NFC 2016-07-08 19:16:05 +00:00
SIInstrFormats.td AMDGPU: Stay in WQM for non-intrinsic stores 2016-08-02 19:31:14 +00:00
SIInstrInfo.cpp MachineFunction: Return reference for getFrameInfo(); NFC 2016-07-28 18:40:00 +00:00
SIInstrInfo.h AMDGPU: Stay in WQM for non-intrinsic stores 2016-08-02 19:31:14 +00:00
SIInstrInfo.td AMDGPU: Stay in WQM for non-intrinsic stores 2016-08-02 19:31:14 +00:00
SIInstructions.td AMDGPU: Stay in WQM for non-intrinsic stores 2016-08-02 19:31:14 +00:00
SIIntrinsics.td AMDGPU: Change fdiv lowering based on !fpmath metadata 2016-07-19 23:16:53 +00:00
SIISelLowering.cpp LoadStoreVectorizer: Remove TargetBaseAlign. Keep alignment for stack adjustments. 2016-08-04 16:38:44 +00:00
SIISelLowering.h AMDGPU: Remove analyzeImmediate 2016-07-28 00:32:02 +00:00
SILoadStoreOptimizer.cpp AMDGPU: Move subtarget feature checks into passes 2016-06-27 20:32:13 +00:00
SILowerControlFlow.cpp AMDGPU: add execfix flag to SI_ELSE 2016-07-28 11:39:24 +00:00
SILowerI1Copies.cpp AMDGPU: Cleanup subtarget handling. 2016-06-24 06:30:11 +00:00
SIMachineFunctionInfo.cpp MachineFunction: Return reference for getFrameInfo(); NFC 2016-07-28 18:40:00 +00:00
SIMachineFunctionInfo.h AMDGPU: Make AMDGPUMachineFunction fields private 2016-07-26 16:45:58 +00:00
SIMachineScheduler.cpp AMDGPU/SI: Fix SI scheduler refcount issue 2016-07-19 00:35:22 +00:00
SIMachineScheduler.h AMDGPU: R600 code splitting cleanup 2016-03-11 08:00:27 +00:00
SIRegisterInfo.cpp MachineFunction: Return reference for getFrameInfo(); NFC 2016-07-28 18:40:00 +00:00
SIRegisterInfo.h AMDGPU/SI: Don't use reserved VGPRs for SGPR spilling 2016-07-28 14:30:43 +00:00
SIRegisterInfo.td [AMDGPU] Some code cleaning in SIRegisterInfo.td 2016-07-21 13:29:57 +00:00
SISchedule.td AMDGPU: Define a schedule class for COPY. 2016-06-24 23:52:11 +00:00
SIShrinkInstructions.cpp AMDGPU: Expand register indexing pseudos in custom inserter 2016-07-19 00:35:03 +00:00
SITypeRewriter.cpp AMDGPU: Add a shader calling convention 2016-04-06 19:40:20 +00:00
SIWholeQuadMode.cpp AMDGPU: Stay in WQM for non-intrinsic stores 2016-08-02 19:31:14 +00:00
VIInstrFormats.td [AMDGPU] refactor DS instruction definitions. NFC. 2016-08-01 14:21:30 +00:00
VIInstructions.td [AMDGPU] refactor DS instruction definitions. NFC. 2016-08-01 14:21:30 +00:00