We don't want to prevent inlining because of target-cpu and -features
attributes that were added to newer versions of LLVM/Clang: There are
no incompatible functions in PTX, ptxas will throw errors in such cases.
Differential Revision: https://reviews.llvm.org/D47691
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@334904 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
They've been deprecated in favor of UADDO/ADDCARRY or USUBO/SUBCARRY for a while.
Target that uses these opcodes are changed in order to ensure their behavior doesn't change.
Reviewers: efriedma, craig.topper, dblaikie, bkramer
Subscribers: jholewinski, arsenm, jyknight, sdardis, nemanjai, nhaehnle, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, jordy.potman.lists, apazos, sabuasal, niosHD, jrtc27, zzheng, edward-jones, mgrang, atanasyan, llvm-commits
Differential Revision: https://reviews.llvm.org/D47422
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@333748 91177308-0d34-0410-b5e6-96231b3b80d8
This reapplies commits: r330271, r330592, r330779.
[DEBUG] Initial adaptation of NVPTX target for debug info emission.
Summary:
Patch adds initial emission of the debug info for NVPTX target.
Currently, only .file and .loc directives are emitted, everything else is
commented out to not break the compilation of Cuda.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@332689 91177308-0d34-0410-b5e6-96231b3b80d8
The DEBUG() macro is very generic so it might clash with other projects.
The renaming was done as follows:
- git grep -l 'DEBUG' | xargs sed -i 's/\bDEBUG\s\?(/LLVM_DEBUG(/g'
- git diff -U0 master | ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM
- Manual change to APInt
- Manually chage DOCS as regex doesn't match it.
In the transition period the DEBUG() macro is still present and aliased
to the LLVM_DEBUG() one.
Differential Revision: https://reviews.llvm.org/D43624
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@332240 91177308-0d34-0410-b5e6-96231b3b80d8
Const/local/shared address spaces are all < 4GB and we can always use
32-bit pointers to access them. This has substantial performance impact
on kernels that uses shared memory for intermediary results.
The feature is disabled by default.
Differential Revision: https://reviews.llvm.org/D46147
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@331941 91177308-0d34-0410-b5e6-96231b3b80d8
We've been running doxygen with the autobrief option for a couple of
years now. This makes the \brief markers into our comments
redundant. Since they are a visual distraction and we don't want to
encourage more \brief markers in new code either, this patch removes
them all.
Patch produced by
for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done
Differential Revision: https://reviews.llvm.org/D46290
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@331272 91177308-0d34-0410-b5e6-96231b3b80d8
This appears to have some issues associated with the file directive output
causing multiple global symbols with the name "file" to be emitted into a
startup section. I'm investigating more specific causes and working with the
original author.
This reverts commit r330271.
Also Revert "[DEBUGINFO, NVPTX] Add the test for the debug info of the local"
This reverts commit r330592 and the follow up of 330779 as the testcase is dependent upon r330271.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@331237 91177308-0d34-0410-b5e6-96231b3b80d8
Since PTX has grown a <2 x half> datatype vectorization has become more
important. The late LoadStoreVectorizer intentionally only does loads
and stores, but now arithmetic has to be vectorized for optimal
throughput too.
This is still very limited, SLP vectorization happily creates <2 x half>
if it's a legal type but there's still a lot of register moving
happening to get that fed into a vectorized store. Overall it's a small
performance win by reducing the amount of arithmetic instructions.
I haven't really checked what the loop vectorizer does to PTX code, the
cost model there might need some more tweaks. I didn't see it causing
harm though.
Differential Revision: https://reviews.llvm.org/D46130
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@331035 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Patch adds initial emission of the debug info for NVPTX target.
Currently, only .file and .loc directives are emitted, everything else is
commented out to not break the compilation of Cuda.
Reviewers: echristo, jlebar, tra, jholewinski
Subscribers: mgorny, aprantl, JDevlieghere, llvm-commits
Differential Revision: https://reviews.llvm.org/D41827
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@330271 91177308-0d34-0410-b5e6-96231b3b80d8
When NVPTX TARGET_BUILTIN specifies sm_XX or ptxYY as required feature,
consider those features available if we're compiling for GPU >= sm_XX or have
enabled PTX version >= ptxYY.
Differential Revision: https://reviews.llvm.org/D45061
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@329829 91177308-0d34-0410-b5e6-96231b3b80d8
Makes it easier to see mistakes such as the one fixed in r329178 and makes
the different target CMakeLists more consistent.
Also remove some stale-looking comments from the Nios2 target cmakefile.
No intended behavior change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@329181 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This change declare that PostRAMachineSinking and ShrinkWrap require NoVRegs
property, so now the MachineFunctionPass can enforce this check.
These passes are disabled in NVPTX & WebAssembly.
Reviewers: dschuff, jlebar, tra, jgravelle-google, MatzeB, sebpop, thegameg, mcrosier
Reviewed By: dschuff, thegameg
Subscribers: jholewinski, jfb, sbc100, aheejin, sunfish, llvm-commits
Differential Revision: https://reviews.llvm.org/D45183
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@329095 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Make NVPTX require structured CFG. Added a temporary flag to
"roll back" the behavior for easy deployment.
Combined with D45008, this fixes several internal Nvidia GPU test
failures that we suspect to be ptxas miscompiles (PR27738).
Reviewers: jlebar
Subscribers: jholewinski, sanjoy, llvm-commits, hiraditya
Differential Revision: https://reviews.llvm.org/D45070
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328885 91177308-0d34-0410-b5e6-96231b3b80d8
Currently EVT is in the IR layer only because of Function.cpp needing a very small piece of the functionality of EVT::getEVTString(). The rest of EVT is used in codegen making CodeGen a better place for it.
The previous code converted a Type* to EVT and then called getEVTString. This was only expected to handle the primitive types from Type*. Since there only a few primitive types, we can just print them as strings directly.
Differential Revision: https://reviews.llvm.org/D45017
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328806 91177308-0d34-0410-b5e6-96231b3b80d8
This is used by llvm tblgen as well as by LLVM Targets, so the only
common place is Support for now. (maybe we need another target for these
sorts of things - but for now I'm at least making them correct & we can
make them better if/when people have strong feelings)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328395 91177308-0d34-0410-b5e6-96231b3b80d8
This way we can support address-space specific variants without explicitly
encoding the space in the name of the intrinsic. Less intrinsics to deal with ->
less boilerplate.
Added a bit of tablegen magic to match/replace an intrinsics with a pointer
argument in particular address space with the space-specific instruction
variant.
Updated tests to use non-default address spaces.
Differential Revision: https://reviews.llvm.org/D43268
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328006 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
After D43914, loads from global variables in addrspace(1) happen with
ld.global. But since they're constants, even better would be to use
ld.global.nc, aka ldg.
Reviewers: tra
Subscribers: jholewinski, sanjoy, hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D43915
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326390 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
NVPTXGenericToNVVM was using target-specific intrinsics to do address
space casts. Using the addrspacecast instruction is (a lot) simpler.
But it also has the advantage of being understandable to other passes.
In particular, InferAddrSpaces is able to understand these address space
casts and remove them in most cases.
Reviewers: tra
Subscribers: jholewinski, sanjoy, hiraditya, llvm-commits
Differential Revision: https://reviews.llvm.org/D43914
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326389 91177308-0d34-0410-b5e6-96231b3b80d8
NVPTX stopped supporting GPUs older than sm_20 (Fermi) quite a while back.
Removal of support of pre-Fermi GPUs made a lot of predicates in the NVPTX
backend pointless as they can't ever be false any more.
It's time to retire them. NFC intended.
Differential Revision: https://reviews.llvm.org/D43843
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@326349 91177308-0d34-0410-b5e6-96231b3b80d8
This avoids playing games with pseudo pass IDs and avoids using an
unreliable MRI::isSSA() check to determine whether register allocation
has happened.
Note that this renames:
- MachineLICMID -> EarlyMachineLICM
- PostRAMachineLICMID -> MachineLICMID
to be consistent with the EarlyTailDuplicate/TailDuplicate naming.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@322927 91177308-0d34-0410-b5e6-96231b3b80d8
Re-land r321234. It had to be reverted because it broke the shared
library build. The shared library build broke because there was a
missing LLVMBuild dependency from lib/Passes (which calls
TargetMachine::getTargetIRAnalysis) to lib/Target. As far as I can
tell, this problem was always there but was somehow masked
before (perhaps because TargetMachine::getTargetIRAnalysis was a
virtual function).
Original commit message:
This makes the TargetMachine interface a bit simpler. We still need
the std::function in TargetIRAnalysis to avoid having to add a
dependency from Analysis to Target.
See discussion:
http://lists.llvm.org/pipermail/llvm-dev/2017-December/119749.html
I avoided adding all of the backend owners to this review since the
change is simple, but let me know if you feel differently about this.
Reviewers: echristo, MatzeB, hfinkel
Reviewed By: hfinkel
Subscribers: jholewinski, jfb, arsenm, dschuff, mcrosier, sdardis, nemanjai, nhaehnle, javed.absar, sbc100, jgravelle-google, aheejin, kbarton, llvm-commits
Differential Revision: https://reviews.llvm.org/D41464
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321375 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This makes the TargetMachine interface a bit simpler. We still need
the std::function in TargetIRAnalysis to avoid having to add a
dependency from Analysis to Target.
See discussion:
http://lists.llvm.org/pipermail/llvm-dev/2017-December/119749.html
I avoided adding all of the backend owners to this review since the
change is simple, but let me know if you feel differently about this.
Reviewers: echristo, MatzeB, hfinkel
Reviewed By: hfinkel
Subscribers: jholewinski, jfb, arsenm, dschuff, mcrosier, sdardis, nemanjai, nhaehnle, javed.absar, sbc100, jgravelle-google, aheejin, kbarton, llvm-commits
Differential Revision: https://reviews.llvm.org/D41464
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321234 91177308-0d34-0410-b5e6-96231b3b80d8
Rather than adding more bits to express every
MMO flag you could want, just directly use the
MMO flags. Also fixes using a bunch of bool arguments to
getMemIntrinsicNode.
On AMDGPU, buffer and image intrinsics should always
have MODereferencable set, but currently there is no
way to do that directly during the initial intrinsic
lowering.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320746 91177308-0d34-0410-b5e6-96231b3b80d8
PTX requires that identifiers consist only of [a-zA-Z0-9_$]. The
existing pass already ensured this for globals and this patch adds
the cleanup for functions with local linkage.
However, there was a different problem in the case of collisions
of the adjusted name: The ValueSymbolTable then automatically
appended ".N" with increasing Ns to get a unique name while helping
the ABI demangling. Special case this behavior to omit the dots and
append N directly. This will always give us legal names according
to the PTX requirements.
Differential Revision: https://reviews.llvm.org/D40573
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319657 91177308-0d34-0410-b5e6-96231b3b80d8
As part of the unification of the debug format and the MIR format, avoid
printing "vreg" for virtual registers (which is one of the current MIR
possibilities).
Basically:
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E "s/%vreg([0-9]+)/%\1/g"
* grep -nr '%vreg' . and fix if needed
* find . \( -name "*.mir" -o -name "*.cpp" -o -name "*.h" -o -name "*.ll" \) -type f -print0 | xargs -0 sed -i '' -E "s/ vreg([0-9]+)/ %\1/g"
* grep -nr 'vreg[0-9]\+' . and fix if needed
Differential Revision: https://reviews.llvm.org/D40420
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319427 91177308-0d34-0410-b5e6-96231b3b80d8
All these headers already depend on CodeGen headers so moving them into
CodeGen fixes the layering (since CodeGen depends on Target, not the
other way around).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@318490 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Make it possible to feed runtime information back to tablegen to enable
profile-guided tablegen-eration, detection of untested tablegen definitions, etc.
Being a cross-compiler by nature, LLVM will potentially collect data for multiple
architectures (e.g. when running 'ninja check'). We therefore need a way for
TableGen to figure out what data applies to the backend it is generating at the
time. This patch achieves that by including the name of the 'def X : Target ...'
for the backend in the TargetRegistry.
Reviewers: qcolombet
Reviewed By: qcolombet
Subscribers: jholewinski, arsenm, jyknight, aditya_nandakumar, sdardis, nemanjai, ab, nhaehnle, t.p.northover, javed.absar, qcolombet, llvm-commits, fedor.sergeev
Differential Revision: https://reviews.llvm.org/D39742
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@318352 91177308-0d34-0410-b5e6-96231b3b80d8
It's needed to model the fact that they do access data from other threads in a
warp and thus can't be CSE'd.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@318173 91177308-0d34-0410-b5e6-96231b3b80d8