llvm/test/CodeGen/NVPTX
Artem Belevich a710d0d215 [NVPTX] Improve lowering of byval args of device functions.
Avoid unnecessary spills of such vars to local space on SASS level and
pointer space conversion.

Instead, make a local copy with appropriate addrspacecasts and let
LLVM optimize them away when possible.

This allows loading value of the argument using [symbol+offset]
instead of converting argument to general space pointer and using it
for indexing (which also implicitly converts param space pointer to
local space one on SASS level and triggers copying of argument into
local space in the process).

This reduces call overhead, uses less registers and reduces overall
SASS size by 2-4%.

Differential Review: http://reviews.llvm.org/D21421

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273313 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-21 20:30:26 +00:00
..
access-non-generic.ll [NVPTX] Adds a new address space inference pass. 2016-03-20 20:59:20 +00:00
add-128bit.ll Revert revisions r234755, r234759, r234760 2015-04-13 17:47:15 +00:00
addrspacecast-gvar.ll [NVPTX] Handle addrspacecast constant expressions in aggregate initializers 2015-04-28 17:18:30 +00:00
addrspacecast.ll [opaque pointer type] Add textual IR support for explicit type parameter to load instruction 2015-02-27 21:17:42 +00:00
aggr-param.ll [NVPTX] Fix emitting aggregate parameters 2014-01-28 18:35:29 +00:00
alias.ll [CUDA] Die gracefully when trying to output an LLVM alias. 2016-01-23 21:12:20 +00:00
annotations.ll IR: Make metadata typeless in assembly 2014-12-15 19:07:53 +00:00
arg-lowering.ll [NVPTX] Clean up argument lowering code and properly handle alignment for structs and vectors 2014-06-27 18:35:44 +00:00
arithmetic-fp-sm20.ll [NVPTX] Improve handling of FP fusion 2014-07-17 18:10:09 +00:00
arithmetic-int.ll [NVPTX] expand mul_lohi to mul_lo and mul_hi 2016-01-22 19:47:26 +00:00
atomics.ll Add some tests for NVPTX lowering of cmpxchg 2014-07-21 22:54:44 +00:00
bfe.ll [NVPTX] Add isel patterns for bit-field extract (bfe) 2014-06-27 18:35:27 +00:00
branch-fold.ll Roll forward r242871 2015-07-29 18:59:09 +00:00
bug17709.ll Remove the linker_private and linker_private_weak linkages. 2014-03-13 23:18:37 +00:00
bug21465.ll [NVPTX] Improve lowering of byval args of device functions. 2016-06-21 20:30:26 +00:00
bug22246.ll [NVPTX] Generate a more optimal sequence for select of i1 2015-01-26 19:52:20 +00:00
bug22322.ll [opaque pointer type] Add textual IR support for explicit type parameter to load instruction 2015-02-27 21:17:42 +00:00
bug26185-2.ll [NVPTX] Fix sign/zero-extending ldg/ldu instruction selection 2016-05-02 18:12:02 +00:00
bug26185.ll [NVPTX] Handle ldg created from sign-/zero-extended load 2016-04-05 12:38:01 +00:00
bypass-div.ll Use 32-bit divides instead of 64-bit divides where possible. 2015-08-11 22:16:34 +00:00
call-with-alloca-buffer.ll Add NVPTXPeephole pass to reduce unnecessary address cast 2015-06-24 20:20:16 +00:00
callchain.ll [NVPTX] Fix handling of indirect calls 2013-11-15 12:30:04 +00:00
calling-conv.ll IR: Make metadata typeless in assembly 2014-12-15 19:07:53 +00:00
combine-min-max.ll [NVPTX] Let NVPTX backend detect integer min and max patterns. 2015-08-26 23:22:02 +00:00
compare-int.ll [NVPTX] Rename registers %fl -> %fd and %rl -> %rd 2014-07-16 16:26:58 +00:00
constant-vectors.ll [NVPTX] Make constant vector test case endian-independent 2013-09-19 13:14:44 +00:00
convergent-mir-call.ll [NVPTX] Use different, convergent MIs for convergent calls. 2016-03-01 19:24:03 +00:00
convert-fp.ll [NVPTX] Rename registers %fl -> %fd and %rl -> %rd 2014-07-16 16:26:58 +00:00
convert-int-sm20.ll [NVPTX] Rename registers %fl -> %fd and %rl -> %rd 2014-07-16 16:26:58 +00:00
ctlz.ll [NVPTX] Add support for cttz/ctlz/ctpop 2013-06-28 17:58:07 +00:00
ctpop.ll Convert CodeGen/*/*.ll tests to use the new CHECK-LABEL for easier debugging. No functionality change and all tests pass after conversion. 2013-07-13 20:38:47 +00:00
cttz.ll [NVPTX] Add support for cttz/ctlz/ctpop 2013-06-28 17:58:07 +00:00
debug-file-loc.ll [PR27284] Reverse the ownership between DICompileUnit and DISubprogram. 2016-04-15 15:57:41 +00:00
disable-opt.ll [NVPTX] Disable performance optimizations when OptLevel==None 2016-02-04 04:15:36 +00:00
div-ri.ll [NVPTX] Add missing patterns for div.approx with immediate denominator 2014-01-21 14:40:05 +00:00
envreg.ll [NVPTX] Add support for envreg reads 2014-06-27 18:35:21 +00:00
extloadv.ll [NVPTX] expand extload/truncstore for vectors of floats 2015-07-01 21:32:42 +00:00
fast-math.ll [NVPTX] Use approximate FP ops when unsafe-fp-math is used, and append 2013-07-22 12:18:04 +00:00
fma-assoc.ll SelectionDAG: Prefer to combine multiplication with less uses for fma 2015-08-11 19:21:46 +00:00
fma-disable.ll
fma.ll Check that the TLI callback enableAggressiveFMAFusion has the desired effect on FMA folding. 2015-01-14 15:36:28 +00:00
fp16.ll [opaque pointer type] Add textual IR support for explicit type parameter to load instruction 2015-02-27 21:17:42 +00:00
fp-contract.ll [NVPTX] Improve handling of FP fusion 2014-07-17 18:10:09 +00:00
fp-literals.ll [NVPTX] Improve handling of FP fusion 2014-07-17 18:10:09 +00:00
function-align.ll [NVPTXAsmPrinter] do not print .align on function headers 2015-03-12 01:50:30 +00:00
generic-to-nvvm.ll [opaque pointer type] Add textual IR support for explicit type parameter to load instruction 2015-02-27 21:17:42 +00:00
global-addrspace.ll [NVPTX] Allow undef value as global initializer 2015-08-22 05:40:26 +00:00
global-ctor-empty.ll [CUDA] Die if we ask the NVPTX backend to emit a global ctor/dtor. 2016-01-30 01:07:38 +00:00
global-ctor.ll [CUDA] Die if we ask the NVPTX backend to emit a global ctor/dtor. 2016-01-30 01:07:38 +00:00
global-dtor.ll [CUDA] Die if we ask the NVPTX backend to emit a global ctor/dtor. 2016-01-30 01:07:38 +00:00
global-ordering.ll
global-visibility.ll [NVPTX] Do not emit .hidden or .protected directives as they are not allowed by PTX. 2016-01-15 23:57:53 +00:00
globals_init.ll The constant initialization for globals in NVPTX is generated as an 2015-06-09 16:29:34 +00:00
globals_lowering.ll Force relocation mode to be default, regardless of what is passed to the backend. 2015-06-30 17:18:00 +00:00
gvar-init.ll [NVPTX] Error out if initializer is given for variable in an address space that does not support initialization 2014-06-27 18:36:01 +00:00
half.ll [opaque pointer type] Add textual IR support for explicit type parameter to load instruction 2015-02-27 21:17:42 +00:00
i1-global.ll [opaque pointer type] Add textual IR support for explicit type parameter to load instruction 2015-02-27 21:17:42 +00:00
i1-int-to-fp.ll [NVPTX] Add missing patterns for i1 [s,u]int_to_fp 2013-08-06 14:13:34 +00:00
i1-param.ll IR: Make metadata typeless in assembly 2014-12-15 19:07:53 +00:00
i8-param.ll [opaque pointer type] Add textual IR support for explicit type parameter to load instruction 2015-02-27 21:17:42 +00:00
imad.ll [NVPTX] Implement fma and imad contraction as target DAGCombiner patterns 2014-06-27 18:35:37 +00:00
implicit-def.ll [NVPTX] Improve handling of FP fusion 2014-07-17 18:10:09 +00:00
inline-asm.ll [NVPTX] Add 'b' asm constraint 2014-06-27 18:36:06 +00:00
intrin-nocapture.ll Reapply 239795 - [InstCombine] Propagate non-null facts to call parameters 2015-06-16 20:24:25 +00:00
intrinsic-old.ll [NVPTX] Added NVVMIntrRange pass 2016-05-26 17:02:56 +00:00
intrinsics.ll [NVPTX] Added missing test case for llvm.nvvm.sqrt.f NVPTX intrinsic 2015-06-23 18:22:17 +00:00
isspacep.ll [NVPTX] Add support for isspacep instruction 2014-06-27 18:35:24 +00:00
ld-addrspace.ll [opaque pointer type] Add textual IR support for explicit type parameter to load instruction 2015-02-27 21:17:42 +00:00
ld-generic.ll [opaque pointer type] Add textual IR support for explicit type parameter to load instruction 2015-02-27 21:17:42 +00:00
ldparam-v4.ll [NVPTX] Fix off-by-one error when creating the VT list for an SDNode 2013-12-05 12:58:00 +00:00
ldu-i8.ll [NVPTX] Make the alignment an explicit argument to ldu/ldg 2014-08-29 15:30:20 +00:00
ldu-ldg.ll [NVPTX] Make the alignment an explicit argument to ldu/ldg 2014-08-29 15:30:20 +00:00
ldu-reg-plus-offset.ll [opaque pointer type] Add textual IR support for explicit type parameter to getelementptr instruction 2015-02-27 19:29:02 +00:00
lit.local.cfg Reduce verbiage of lit.local.cfg files 2014-06-09 22:42:55 +00:00
load-sext-i1.ll [opaque pointer type] Add textual IR support for explicit type parameter to load instruction 2015-02-27 21:17:42 +00:00
load-with-non-coherent-cache.ll [NVPTX] Use LDG for pointer induction variables. 2015-08-05 23:11:57 +00:00
local-stack-frame.ll [NVPTX] Move NVPTXPeephole after NVPTXPrologEpilogPass 2015-07-01 20:08:06 +00:00
loop-vectorize.ll [NVPTX] declare no vector registers 2015-07-10 04:31:56 +00:00
lower-aggr-copies.ll Revert "Change memcpy/memset/memmove to have dest and source alignments." 2015-11-19 05:56:52 +00:00
lower-alloca.ll Add NVPTXLowerAlloca pass to convert alloca'ed memory to local address 2015-06-17 22:31:02 +00:00
lower-kernel-ptr-arg.ll [NVPTX] Improve lowering of byval args of device functions. 2016-06-21 20:30:26 +00:00
machine-sink.ll [opaque pointer type] Add textual IR support for explicit type parameter to load instruction 2015-02-27 21:17:42 +00:00
MachineSink-call.ll [NVPTX] Annotate call machine instructions as calls. 2016-02-17 17:46:50 +00:00
MachineSink-convergent.ll [NVPTX] Test that MachineSink won't sink across llvm.cuda.syncthreads. 2016-02-17 17:46:52 +00:00
managed.ll IR: Make metadata typeless in assembly 2014-12-15 19:07:53 +00:00
misaligned-vector-ldst.ll [opaque pointer type] Add textual IR support for explicit type parameter to load instruction 2015-02-27 21:17:42 +00:00
module-inline-asm.ll [NVPTX] Add support for module-scope inline asm 2013-07-01 13:00:14 +00:00
mulwide.ll [NVPTX] Add some extra tests for mul.wide to test non-power-of-two source types 2014-07-23 20:23:49 +00:00
noduplicate-syncthreads.ll [opaque pointer type] Add textual IR support for explicit type parameter to load instruction 2015-02-27 21:17:42 +00:00
nounroll.ll [opaque pointer type] Add textual IR support for explicit type parameter to load instruction 2015-02-27 21:17:42 +00:00
nvcl-param-align.ll [NVPTX] Fix bugs related to isSingleValueType 2014-12-17 17:59:04 +00:00
nvvm-reflect-module-flag.ll [NVPTX] Read __CUDA_FTZ from module flags in NVVMReflect. 2016-04-01 01:09:07 +00:00
nvvm-reflect.ll Add support for __nvvm_reflect changes in libdevice in CUDA-7.0 2015-03-19 17:05:35 +00:00
param-align.ll
pr13291-i1-store.ll [NVPTX] roll forward r239082 2015-06-04 21:28:26 +00:00
pr16278.ll [opaque pointer type] Add textual IR support for explicit type parameter to load instruction 2015-02-27 21:17:42 +00:00
pr17529.ll [opaque pointer type] Add textual IR support for explicit type parameter to getelementptr instruction 2015-02-27 19:29:02 +00:00
refl1.ll [opaque pointer type] Add textual IR support for explicit type parameter to load instruction 2015-02-27 21:17:42 +00:00
reg-copy.ll [NVPTX] allow register copy between float and int 2015-08-01 18:02:12 +00:00
rotate.ll [NVPTX] Add support for efficient rotate instructions on SM 3.2+ 2014-06-27 18:35:33 +00:00
rsqrt.ll Convert CodeGen/*/*.ll tests to use the new CHECK-LABEL for easier debugging. No functionality change and all tests pass after conversion. 2013-07-13 20:38:47 +00:00
sched1.ll [opaque pointer type] Add textual IR support for explicit type parameter to load instruction 2015-02-27 21:17:42 +00:00
sched2.ll [opaque pointer type] Add textual IR support for explicit type parameter to load instruction 2015-02-27 21:17:42 +00:00
sext-in-reg.ll Convert CodeGen/*/*.ll tests to use the new CHECK-LABEL for easier debugging. No functionality change and all tests pass after conversion. 2013-07-13 20:38:47 +00:00
sext-params.ll [NVPTX] Handle signext/zeroext attributes properly 2013-07-01 12:58:58 +00:00
shfl.ll [NVPTX] Add intrinsics for shfl instructions. 2016-06-09 20:04:08 +00:00
shift-parts.ll [opaque pointer type] Add textual IR support for explicit type parameter to load instruction 2015-02-27 21:17:42 +00:00
simple-call.ll [opaque pointer type] Add textual IR support for explicit type parameter to load instruction 2015-02-27 21:17:42 +00:00
sm-version-20.ll
sm-version-21.ll
sm-version-30.ll [NVPTX] Associate a minimum PTX version for each SM architecture 2015-03-30 19:30:55 +00:00
sm-version-32.ll [NVPTX] Associate a minimum PTX version for each SM architecture 2015-03-30 19:30:55 +00:00
sm-version-35.ll [NVPTX] Associate a minimum PTX version for each SM architecture 2015-03-30 19:30:55 +00:00
sm-version-37.ll [NVPTX] Associate a minimum PTX version for each SM architecture 2015-03-30 19:30:55 +00:00
sm-version-50.ll [NVPTX] Associate a minimum PTX version for each SM architecture 2015-03-30 19:30:55 +00:00
sm-version-52.ll [NVPTX] Associate a minimum PTX version for each SM architecture 2015-03-30 19:30:55 +00:00
sm-version-53.ll [NVPTX] Associate a minimum PTX version for each SM architecture 2015-03-30 19:30:55 +00:00
speculative-execution-divergent-target.ll Move divergent-target test into CodeGen/NVPTX because it requires an NVPTX target. 2016-04-15 01:20:52 +00:00
st-addrspace.ll [NVPTX] Rename registers %fl -> %fd and %rl -> %rd 2014-07-16 16:26:58 +00:00
st-generic.ll [NVPTX] Rename registers %fl -> %fd and %rl -> %rd 2014-07-16 16:26:58 +00:00
surf-read-cuda.ll [NVPTX] roll forward r239082 2015-06-04 21:28:26 +00:00
surf-read.ll IR: Make metadata typeless in assembly 2014-12-15 19:07:53 +00:00
surf-write-cuda.ll IR: Make metadata typeless in assembly 2014-12-15 19:07:53 +00:00
surf-write.ll IR: Make metadata typeless in assembly 2014-12-15 19:07:53 +00:00
symbol-naming.ll Have a single way for creating unique value names. 2015-11-22 00:16:24 +00:00
TailDuplication-convergent.ll Don't tail-duplicate blocks that contain convergent instructions. 2016-02-22 17:50:52 +00:00
tex-read-cuda.ll [NVPTX] roll forward r239082 2015-06-04 21:28:26 +00:00
tex-read.ll IR: Make metadata typeless in assembly 2014-12-15 19:07:53 +00:00
texsurf-queries.ll IR: Make metadata typeless in assembly 2014-12-15 19:07:53 +00:00
tuple-literal.ll
vec8.ll [NVPTX] Fix logic error in loading vector parameters of more than 4 components 2013-11-11 19:28:16 +00:00
vec-param-load.ll Fix non-deterministic SDNodeOrder-dependent codegen 2014-01-12 14:09:17 +00:00
vector-args.ll [NVPTX] Add support for vectorized function return values 2013-06-28 17:57:55 +00:00
vector-call.ll Fix a bunch of trivial cases of 'CHECK[^:]*$' in the tests. NFCI 2015-08-10 19:01:27 +00:00
vector-compare.ll [opaque pointer type] Add textual IR support for explicit type parameter to load instruction 2015-02-27 21:17:42 +00:00
vector-global.ll [NVPTX] Fix bugs related to isSingleValueType 2014-12-17 17:59:04 +00:00
vector-loads.ll [opaque pointer type] Add textual IR support for explicit type parameter to load instruction 2015-02-27 21:17:42 +00:00
vector-return.ll [NVPTX] aligned byte-buffers for vector return types 2014-10-25 03:46:16 +00:00
vector-select.ll [opaque pointer type] Add textual IR support for explicit type parameter to load instruction 2015-02-27 21:17:42 +00:00
vector-stores.ll Add a target legalize hook for SplitVectorOperand (again) 2013-07-26 13:28:29 +00:00
weak-global.ll [opaque pointer type] Add textual IR support for explicit type parameter to load instruction 2015-02-27 21:17:42 +00:00
weak-linkage.ll [NVPTX] Do not emit .weak symbols for NVPTX 2014-12-01 21:16:17 +00:00