llvm/NVPTX at d8214db0868f228026fa6ab1d5ca6c76ab3b5ce0 - llvm

RPCS3/llvm

mirror of https://github.com/RPCS3/llvm.git synced 2024-12-23 12:40:17 +00:00


Jingyue Wu 75d77cd179 [MachineSink] Use the real post dominator tree	2014-10-15 03:27:43 +00:00

Summary:
Fixes a FIXME in MachineSinking. Instead of using the simple heuristics in isPostDominatedBy, use the real MachinePostDominatorTree and MachineLoopInfo. The old heuristics caused instructions to sink unnecessarily, and might create register pressure.

This is the second try of the fix. The first one (D4814) caused a performance regression due to failing to sink instructions out of loops (PR21115). This patch fixes PR21115 by sinking an instruction from a deeper loop to a shallower one regardless of whether the target block post-dominates the source.

Thanks Alexey Volkov for reporting PR21115!

Test Plan:
Added a NVPTX codegen test to verify that our change prevents the backend from over-sinking. It also shows the unnecessary register pressure caused by over-sinking.

Added an X86 test to verify we can sink instructions out of loops regardless of the dominance relationship. This test is reduced from Alexey's test in PR21115.

Updated an affected test in X86.

Also ran SPEC CINT2006 and llvm-test-suite for compilation time and runtime performance. Results are attached separately in the review thread.

Reviewers: Jiangning, resistor, hfinkel

Reviewed By: hfinkel

Subscribers: hfinkel, bruno, volkalexey, llvm-commits, meheff, eliben, jholewinski

Differential Revision: http://reviews.llvm.org/D5633

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@219773 91177308-0d34-0410-b5e6-96231b3b80d8
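The sinking rule described in the commit message above can be illustrated with a small standalone sketch. This is not the code from MachineSink: `BlockInfo` and `worthSinking` are hypothetical names, the post-dominance answer is passed in as a plain boolean instead of being queried from MachinePostDominatorTree, and the treatment of a post-dominating target as unprofitable is an assumption inferred from the message's remark about unnecessary sinking and register pressure.

```cpp
#include <cassert>

// Hypothetical stand-in for a machine basic block; only its loop depth is modeled.
struct BlockInfo {
    unsigned loopDepth;  // 0 means the block is not inside any loop
};

// Returns true if moving an instruction from `from` to `to` looks worthwhile.
// `toPostDominatesFrom` stands in for a post-dominator tree query in a real pass.
bool worthSinking(const BlockInfo& from, const BlockInfo& to,
                  bool toPostDominatesFrom) {
    // Leaving a deeper loop for a shallower one is accepted regardless of
    // post-dominance (the PR21115 case from the commit message).
    if (from.loopDepth > to.loopDepth)
        return true;
    // Assumption: if the target runs on every path leaving the source, the
    // instruction executes either way, so sinking only stretches live ranges.
    if (toPostDominatesFrom)
        return false;
    // Otherwise the target runs only on some paths, so sinking can skip work.
    return true;
}
```

For example, `worthSinking({2}, {1}, true)` accepts the move out of the inner loop, while `worthSinking({1}, {1}, true)` rejects it as over-sinking.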
access-non-generic.ll
add-128bit.ll
addrspacecast-gvar.ll
addrspacecast.ll
aggr-param.ll
annotations.ll
arg-lowering.ll	[NVPTX] Clean up argument lowering code and properly handle alignment for structs and vectors	2014-06-27 18:35:44 +00:00
arithmetic-fp-sm20.ll	[NVPTX] Improve handling of FP fusion	2014-07-17 18:10:09 +00:00
arithmetic-int.ll	[NVPTX] Rename registers %fl -> %fd and %rl -> %rd	2014-07-16 16:26:58 +00:00
atomics.ll	Add some tests for NVPTX lowering of cmpxchg	2014-07-21 22:54:44 +00:00
bfe.ll	[NVPTX] Add isel patterns for bit-field extract (bfe)	2014-06-27 18:35:27 +00:00
bug17709.ll
call-with-alloca-buffer.ll	[NVPTX] Rename registers %fl -> %fd and %rl -> %rd	2014-07-16 16:26:58 +00:00
callchain.ll
calling-conv.ll
compare-int.ll	[NVPTX] Rename registers %fl -> %fd and %rl -> %rd	2014-07-16 16:26:58 +00:00
constant-vectors.ll
convert-fp.ll	[NVPTX] Rename registers %fl -> %fd and %rl -> %rd	2014-07-16 16:26:58 +00:00
convert-int-sm20.ll	[NVPTX] Rename registers %fl -> %fd and %rl -> %rd	2014-07-16 16:26:58 +00:00
ctlz.ll
ctpop.ll
cttz.ll
div-ri.ll
envreg.ll
fast-math.ll
fma-disable.ll
fma.ll	[NVPTX] Improve handling of FP fusion	2014-07-17 18:10:09 +00:00
fp16.ll	NVPTX: support direct f16 <-> f64 conversions via intrinsics.	2014-07-18 08:30:10 +00:00
fp-contract.ll	[NVPTX] Improve handling of FP fusion	2014-07-17 18:10:09 +00:00
fp-literals.ll	[NVPTX] Improve handling of FP fusion	2014-07-17 18:10:09 +00:00
generic-to-nvvm.ll
global-ordering.ll
gvar-init.ll	[NVPTX] Error out if initializer is given for variable in an address space that does not support initialization	2014-06-27 18:36:01 +00:00
half.ll	NVPTX: support fpext/fptrunc to and from f16.	2014-07-18 13:01:43 +00:00
i1-global.ll
i1-int-to-fp.ll
i1-param.ll
i8-param.ll
imad.ll	[NVPTX] Implement fma and imad contraction as target DAGCombiner patterns	2014-06-27 18:35:37 +00:00
implicit-def.ll	[NVPTX] Improve handling of FP fusion	2014-07-17 18:10:09 +00:00
inline-asm.ll	[NVPTX] Add 'b' asm constraint	2014-06-27 18:36:06 +00:00
intrin-nocapture.ll
intrinsic-old.ll	[NVPTX] Rename registers %fl -> %fd and %rl -> %rd	2014-07-16 16:26:58 +00:00
intrinsics.ll	[NVPTX] Rename registers %fl -> %fd and %rl -> %rd	2014-07-16 16:26:58 +00:00
isspacep.ll	[NVPTX] Add support for isspacep instruction	2014-06-27 18:35:24 +00:00
ld-addrspace.ll	[NVPTX] Rename registers %fl -> %fd and %rl -> %rd	2014-07-16 16:26:58 +00:00
ld-generic.ll	[NVPTX] Rename registers %fl -> %fd and %rl -> %rd	2014-07-16 16:26:58 +00:00
ldparam-v4.ll
ldu-i8.ll	[NVPTX] Make the alignment an explicit argument to ldu/ldg	2014-08-29 15:30:20 +00:00
ldu-ldg.ll	[NVPTX] Make the alignment an explicit argument to ldu/ldg	2014-08-29 15:30:20 +00:00
ldu-reg-plus-offset.ll	[NVPTX] Make the alignment an explicit argument to ldu/ldg	2014-08-29 15:30:20 +00:00
lit.local.cfg
load-sext-i1.ll
local-stack-frame.ll	[NVPTX] Rename registers %fl -> %fd and %rl -> %rd	2014-07-16 16:26:58 +00:00
machine-sink.ll	[MachineSink] Use the real post dominator tree	2014-10-15 03:27:43 +00:00
managed.ll	[NVPTX] Add support for .managed variables for UVM	2014-06-27 18:35:58 +00:00
misaligned-vector-ldst.ll	[NVPTX] Honor alignment on vector loads/stores	2014-07-16 19:45:35 +00:00
module-inline-asm.ll
mulwide.ll	[NVPTX] Add some extra tests for mul.wide to test non-power-of-two source types	2014-07-23 20:23:49 +00:00
noduplicate-syncthreads.ll
nvvm-reflect.ll	[NVPTX] Add reflect intrinsic (better than matching by function name)	2014-06-27 18:36:11 +00:00
param-align.ll
pr13291-i1-store.ll	[NVPTX] Rename registers %fl -> %fd and %rl -> %rd	2014-07-16 16:26:58 +00:00
pr16278.ll
pr17529.ll
ptx-version-30.ll
ptx-version-31.ll
refl1.ll
rotate.ll	[NVPTX] Add support for efficient rotate instructions on SM 3.2+	2014-06-27 18:35:33 +00:00
rsqrt.ll
sched1.ll
sched2.ll
sext-in-reg.ll
sext-params.ll
shift-parts.ll	[NVPTX] Add support for [SHL,SRA,SRL]_PARTS	2014-06-27 18:35:40 +00:00
simple-call.ll
sm-version-20.ll
sm-version-21.ll
sm-version-30.ll
sm-version-35.ll
st-addrspace.ll	[NVPTX] Rename registers %fl -> %fd and %rl -> %rd	2014-07-16 16:26:58 +00:00
st-generic.ll	[NVPTX] Rename registers %fl -> %fd and %rl -> %rd	2014-07-16 16:26:58 +00:00
surf-read-cuda.ll	[NVPTX] Add more surface/texture intrinsics, including CUDA unified texture fetch	2014-07-17 11:59:04 +00:00
surf-read.ll
surf-write-cuda.ll	[NVPTX] Add more surface/texture intrinsics, including CUDA unified texture fetch	2014-07-17 11:59:04 +00:00
surf-write.ll
symbol-naming.ll
tex-read-cuda.ll	[NVPTX] Add more surface/texture intrinsics, including CUDA unified texture fetch	2014-07-17 11:59:04 +00:00
tex-read.ll	[NVPTX] Add more surface/texture intrinsics, including CUDA unified texture fetch	2014-07-17 11:59:04 +00:00
texsurf-queries.ll	[NVPTX] Flag surface/texture query instructions with IsTexSurfQuery	2014-07-17 14:51:33 +00:00
tuple-literal.ll
vec8.ll
vec-param-load.ll
vector-args.ll
vector-call.ll	[NVPTX] Add missing .v4 qualifier on vector store instruction	2014-07-17 16:58:56 +00:00
vector-compare.ll
vector-loads.ll
vector-select.ll
vector-stores.ll
weak-global.ll	[NVPTX] Emit .weak linkage for link_once, weak, available_externally, and common linkage	2014-06-27 18:35:56 +00:00
weak-linkage.ll