2728 Commits

Author SHA1 Message Date
Matt Arsenault
8f2b72e7d1 AMDGPU: Use existing function to check for VGPRs
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337621 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-20 21:20:36 +00:00
Matt Arsenault
06b493f7f0 Reapply "AMDGPU: Fix handling of alignment padding in DAG argument lowering"
Reverts r337079 with fix for msan error.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337535 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-20 09:05:08 +00:00
Farhana Aleen
ac8c393bd5 [AMDGPU] Support a fdot2 pattern.
Summary: Optimize fma((float)S0.x, (float)S1.x, fma((float)S0.y, (float)S1.y, z))
                   -> fdot2((v2f16)S0, (v2f16)S1, (float)z)
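
As a rough illustration of why the rewrite above is sound, the sketch below evaluates the nested-fma form next to a plain two-element dot product with an accumulator. `Float2`, `nestedFma`, and `dot2Reference` are hypothetical names, and plain floats stand in for the converted f16 lanes; this is not the DAG combine itself. The two forms agree exactly only when no intermediate rounding occurs.

```cpp
#include <cmath>

// Hypothetical stand-in for a v2f16 operand whose lanes were widened to float.
struct Float2 {
  float x;
  float y;
};

// The source pattern: fma(S0.x, S1.x, fma(S0.y, S1.y, z)).
float nestedFma(Float2 S0, Float2 S1, float Z) {
  return std::fma(S0.x, S1.x, std::fma(S0.y, S1.y, Z));
}

// What a two-element dot product with an accumulator computes mathematically.
float dot2Reference(Float2 S0, Float2 S1, float Z) {
  return S0.x * S1.x + S0.y * S1.y + Z;
}
```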

Author: FarhanaAleen

Reviewed By: rampitec, b-sumner

Subscribers: AMDGPU

Differential Revision: https://reviews.llvm.org/D49146

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337198 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-16 18:19:59 +00:00
Mark Searles
a720958169 [AMDGPU][Waitcnt] Re-apply fix "comparison of integers of different signs" build error
Re-apply "[AMDGPU][Waitcnt] fix "comparison of integers of different signs" build error"
( fe0a456510131f268e388c4a18a92f575c0db183 ), which was inadvertently reverted via
2b2ee080f0164485562593b1b87291a48cea4a9a .

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337156 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-16 10:21:36 +00:00
Mark Searles
249b255c78 run post-RA hazard recognizer pass late
Memory legalizer, waitcnt, and shrink passes can perturb the instructions,
which means that the post-RA hazard recognizer pass should run after them.
Otherwise, one of those passes may invalidate the work done by the hazard
recognizer. Note that this has the adverse side effect that any consecutive
S_NOP 0's emitted by the hazard recognizer will not be shrunk into a
single S_NOP <N>. This should be addressed in a follow-on patch.
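
A minimal sketch of the kind of shrinking the note above refers to. It assumes that S_NOP N stands for N + 1 wait states and that a single instruction can encode at most 8 of them; both are assumptions made for this example. `Inst` and `mergeNops` are hypothetical names, not part of the AMDGPU backend.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical instruction record: Imm is only meaningful when IsNop is true,
// in which case the instruction stands for Imm + 1 wait states.
struct Inst {
  bool IsNop;
  unsigned Imm;
};

// Collapse each run of consecutive nops into as few nops as possible.
// MaxStates is the assumed per-instruction limit on wait states.
std::vector<Inst> mergeNops(const std::vector<Inst> &In, unsigned MaxStates = 8) {
  assert(MaxStates > 0);
  std::vector<Inst> Out;
  unsigned Pending = 0; // wait states accumulated in the current run

  auto Flush = [&] {
    while (Pending > 0) {
      unsigned Chunk = std::min(Pending, MaxStates);
      Out.push_back({true, Chunk - 1});
      Pending -= Chunk;
    }
  };

  for (const Inst &I : In) {
    if (I.IsNop) {
      Pending += I.Imm + 1;
      continue;
    }
    Flush();          // a real instruction ends the run
    Out.push_back(I);
  }
  Flush();
  return Out;
}
```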

Differential Revision: https://reviews.llvm.org/D49288

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337154 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-16 10:02:41 +00:00
Mark Searles
356f9421a1 Revert "[AMDGPU][Waitcnt] fix "comparison of integers of different signs" build error"
This reverts commit fe0a456510131f268e388c4a18a92f575c0db183.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337153 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-16 10:02:40 +00:00
Evgeniy Stepanov
1382a3a7e8 Revert "AMDGPU: Fix handling of alignment padding in DAG argument lowering"
This reverts commit r337021.

WARNING: MemorySanitizer: use-of-uninitialized-value
    #0 0x1415cd65 in void write_signed<long>(llvm::raw_ostream&, long, unsigned long, llvm::IntegerStyle) /code/llvm-project/llvm/lib/Support/NativeFormatting.cpp:95:7
    #1 0x1415c900 in llvm::write_integer(llvm::raw_ostream&, long, unsigned long, llvm::IntegerStyle) /code/llvm-project/llvm/lib/Support/NativeFormatting.cpp:121:3
    #2 0x1472357f in llvm::raw_ostream::operator<<(long) /code/llvm-project/llvm/lib/Support/raw_ostream.cpp:117:3
    #3 0x13bb9d4 in llvm::raw_ostream::operator<<(int) /code/llvm-project/llvm/include/llvm/Support/raw_ostream.h:210:18
    #4 0x3c2bc18 in void printField<unsigned int, &(amd_kernel_code_s::amd_kernel_code_version_major)>(llvm::StringRef, amd_kernel_code_s const&, llvm::raw_ostream&) /code/llvm-project/llvm/lib/Target/AMDGPU/Utils/AMDKernelCodeTUtils.cpp:78:23
    #5 0x3c250ba in llvm::printAmdKernelCodeField(amd_kernel_code_s const&, int, llvm::raw_ostream&) /code/llvm-project/llvm/lib/Target/AMDGPU/Utils/AMDKernelCodeTUtils.cpp:104:5
    #6 0x3c27ca3 in llvm::dumpAmdKernelCode(amd_kernel_code_s const*, llvm::raw_ostream&, char const*) /code/llvm-project/llvm/lib/Target/AMDGPU/Utils/AMDKernelCodeTUtils.cpp:113:5
    #7 0x3a46e6c in llvm::AMDGPUTargetAsmStreamer::EmitAMDKernelCodeT(amd_kernel_code_s const&) /code/llvm-project/llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUTargetStreamer.cpp:161:3
    #8 0xd371e4 in llvm::AMDGPUAsmPrinter::EmitFunctionBodyStart() /code/llvm-project/llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp:204:26

[...]

Uninitialized value was created by an allocation of 'KernelCode' in the stack frame of function '_ZN4llvm16AMDGPUAsmPrinter21EmitFunctionBodyStartEv'
    #0 0xd36650 in llvm::AMDGPUAsmPrinter::EmitFunctionBodyStart() /code/llvm-project/llvm/lib/Target/AMDGPU/AMDGPUAsmPrinter.cpp:192

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337079 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-14 01:20:53 +00:00
Tom Stellard
37f081b80d AMDGPU/GlobalISel: Implement select() for 32-bit @llvm.minnum and @llvm.maxnum
Reviewers: arsenm, nhaehnle

Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, llvm-commits, t-tye

Differential Revision: https://reviews.llvm.org/D46172

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337056 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-13 22:16:03 +00:00
Tom Stellard
504eed44cd AMDGPU/GlobalISel: Implement select() for @llvm.amdgcn.exp
Reviewers: arsenm, nhaehnle

Subscribers: kzhuravl, wdng, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D45882

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337046 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-13 21:05:14 +00:00
Matt Arsenault
9c21e67b4c AMDGPU: Properly handle shader inputs with split arguments
This needs to refer to arguments by their original argument
index, not the argument split index which depends on what
the type splitting decides to do.

Also avoid incrementing PSInputNum for each split piece.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337022 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-13 16:40:37 +00:00
Matt Arsenault
e61b6779e4 AMDGPU: Fix handling of alignment padding in DAG argument lowering
This was completely broken if there was ever a struct argument, as
this information is thrown away during the argument analysis.

The offsets as passed in to LowerFormalArguments are not useful,
as they partially depend on the legalized result register type,
and they don't consider the alignment in the first place.

Ignore the Ins array, and instead figure out from the raw IR type
what we need to do. This seems to fix the padding computation
if the DAG lowering is forced (and stops breaking arguments
following padded arguments if the arguments were only partially
lowered in the IR)
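
To make the padding computation concrete, here is a small sketch of deriving argument offsets from each argument's size and alignment rather than from legalized pieces. `ArgType`, `alignTo`, and `computeArgOffsets` are hypothetical helpers written for this example (LLVM has its own `alignTo`); the sizes and alignments would come from the data layout in practice.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical description of one argument's in-memory type.
struct ArgType {
  std::uint64_t Size;  // bytes occupied by the argument
  std::uint64_t Align; // required alignment in bytes
};

// Round Value up to the next multiple of Align.
std::uint64_t alignTo(std::uint64_t Value, std::uint64_t Align) {
  assert(Align != 0);
  return (Value + Align - 1) / Align * Align;
}

// Lay the arguments out back to back, inserting padding where needed.
std::vector<std::uint64_t> computeArgOffsets(const std::vector<ArgType> &Args) {
  std::vector<std::uint64_t> Offsets;
  std::uint64_t Offset = 0;
  for (const ArgType &A : Args) {
    Offset = alignTo(Offset, A.Align); // padding before this argument
    Offsets.push_back(Offset);
    Offset += A.Size;
  }
  return Offsets;
}
```

For an i32 followed by an i64 and an i8, this yields offsets 0, 8, and 16: four bytes of padding precede the i64.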

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337021 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-13 16:40:25 +00:00
Matt Arsenault
4835c04d29 AMDGPU: Fix assert in truncate combine with vectors
The piece above probably has the same problem, but I need
to try to come up with a test for it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336935 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-12 19:40:16 +00:00
Tom Stellard
469c8eebd9 AMDGPU/SI: Initialize InstrInfo before TargetLoweringInfo in GCNSubtarget
SITargetLowering queries SIInstrInfo in its constructor, so SIInstrInfo
must be initialized first.  This fixes msan buildbot failures and was
introduced by r336851.
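
The underlying C++ rule is that non-static data members are initialized in declaration order, regardless of the order written in the constructor's initializer list. The sketch below uses hypothetical stand-in types (`InstrInfo`, `Lowering`, `Subtarget`), not the real GCNSubtarget members, to show the safe ordering.

```cpp
// Hypothetical stand-in for the instruction info object.
struct InstrInfo {
  int NumOpcodes = 42;
};

// Hypothetical stand-in for a lowering object that queries InstrInfo
// in its constructor.
struct Lowering {
  int SeenOpcodes;
  explicit Lowering(const InstrInfo &II) : SeenOpcodes(II.NumOpcodes) {}
};

struct Subtarget {
  // Declaration order is initialization order: II is fully constructed
  // before TL reads from it. Swapping these two declarations would make
  // TL read an uninitialized object.
  InstrInfo II;
  Lowering TL;

  Subtarget() : II(), TL(II) {}
};
```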

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336861 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-11 22:15:15 +00:00
Tom Stellard
d2f9dac4f9 AMDGPU: Remove duplicate call to initializeSubtargetDependencies()
This was added in r336851.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336853 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-11 21:12:03 +00:00
Tom Stellard
1d6fd076a3 AMDGPU: Refactor Subtarget classes
Summary:
This is a follow-up to r335942.
- Merge SISubtarget into AMDGPUSubtarget and rename to GCNSubtarget
- Rename AMDGPUCommonSubtarget to AMDGPUSubtarget
- Merge R600Subtarget::Generation and GCNSubtarget::Generation into
  AMDGPUSubtarget::Generation.

Reviewers: arsenm, jvesely

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D49037

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336851 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-11 20:59:01 +00:00
Konstantin Zhuravlyov
04804b2f45 AMDGPU/NFC: Use already available explicit kernarg size instead of calculating it again when filling out the metadata.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336825 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-11 17:27:17 +00:00
Richard Trieu
2c1b42aecb Fix -Wmismatched-tags warning
class -> struct in forward declaration.
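
For context, -Wmismatched-tags fires when a type is declared with one class-key and defined with the other. The sketch below shows the consistent form with a hypothetical type name (`KernelInfo`); the commit does not say which declaration was affected.

```cpp
// Forward declaration uses the same class-key as the definition below.
// Writing "class KernelInfo;" here would still compile, but it triggers
// -Wmismatched-tags.
struct KernelInfo;

int getSize(const KernelInfo &KI);

struct KernelInfo {
  int Size;
};

int getSize(const KernelInfo &KI) { return KI.Size; }
```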


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336733 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-10 22:09:33 +00:00
Scott Linder
47362da967 [AMDGPU] Fix layering issue with AMDGPUHSAMetadataStreamer (NFC)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336722 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-10 20:07:22 +00:00
Scott Linder
5c37ae1e46 [AMDGPU] Refactor HSAMetadataStream::emitKernel (NFC)
Move all metadata construction into AMDGPUHSAMetadataStreamer.

Differential Revision: https://reviews.llvm.org/D48176


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336707 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-10 17:31:32 +00:00
Konstantin Zhuravlyov
febd9a0f02 AMDGPU: Make hidden argument metadata consistent with amdgpu-implicitarg-num-bytes attribute

Differential Revision: https://reviews.llvm.org/D49096

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336697 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-10 16:12:51 +00:00
Matt Arsenault
e07c9538b5 Reapply "AMDGPU: Force inlining if LDS global address is used"
This reverts commit r336623

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336675 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-10 14:03:41 +00:00
Vlad Tsyrklevich
3bda4ed0ec Revert "AMDGPU: Force inlining if LDS global address is used"
This reverts commit r336587; it was causing test failures on the
sanitizer bots.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336623 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-10 00:46:07 +00:00
Mark Searles
28793aedf6 [AMDGPU][Waitcnt] fix "comparison of integers of different signs" build error
Build error on Android; reported and fixed (thanks) by Mauro Rossi <issor.oruam@gmail.com>

Fixes the following build error:

external/llvm/lib/Target/AMDGPU/SIInsertWaitcnts.cpp:1903:61:
error: comparison of integers of different signs:
'typename iterator_traits<__wrap_iter<MachineBasicBlock **> >::difference_type'
(aka 'int') and 'unsigned int' [-Werror,-Wsign-compare]
                      BlockWaitcntProcessedSet.end(), &MBB) < Count)) {
                      ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^ ~~~~~
1 error generated.
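
The diagnostic arises because std::count returns the iterator's signed difference_type, which is then compared against an unsigned value. One common way to silence it is an explicit cast once the count is known to be non-negative. This is a sketch of that general pattern with a hypothetical helper (`seenFewerThan`), not necessarily the change made in D49089.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Returns true if Key occurs fewer than Count times in Seen.
bool seenFewerThan(const std::vector<int> &Seen, int Key, unsigned Count) {
  // std::count yields a signed difference_type.
  auto N = std::count(Seen.begin(), Seen.end(), Key);
  assert(N >= 0);
  // Compare in an unsigned type so both operands have the same signedness.
  return static_cast<std::size_t>(N) < Count;
}
```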

Differential Revision: https://reviews.llvm.org/D49089

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336588 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-09 19:28:14 +00:00
Matt Arsenault
12d30e1e27 AMDGPU: Force inlining if LDS global address is used
These won't work for the foreseeable future. These aren't allowed
from OpenCL, but IPO optimizations can make them appear.

Also directly set the attributes on functions, regardless
of the linkage rather than cloning functions like before.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336587 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-09 19:22:22 +00:00
Tom Stellard
3f928c753c AMDGPU: Fix UBSan error caused by r335942
Summary: Fixes PR38071.

Reviewers: arsenm, dstenb

Reviewed By: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D48979

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336448 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-06 17:16:17 +00:00
Matt Arsenault
e5d3d15134 AMDGPU/GlobalISel: Implement custom kernel arg lowering
Avoid using allocateKernArg / AssignFn. We do not want any
of the type splitting properties of normal calling convention
lowering.

For now at least this exists alongside the IR argument lowering
pass. This is necessary to handle struct padding correctly while
some arguments are still skipped by the IR argument lowering
pass.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336373 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-05 17:01:20 +00:00
Ryan Taylor
e941c76442 [AMDGPU] Add VALU to V_INTERP Instructions
Wait states are not properly being inserted after buffer_store for v_interp instructions.

Add VALU to V_INTERP instructions so that the GCNHazardRecognizer can
check and insert the appropriate wait states when needed.

Differential Revision: https://reviews.llvm.org/D48772

Change-Id: Id540c9b074fc69b5c1de6b182276aa089c74aa64

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336339 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-05 12:02:07 +00:00
Piotr Padlewski
c2f24d9ea8 Implement strip.invariant.group
Summary:
This patch introduces a new intrinsic,
strip.invariant.group, that was described in the
RFC: Devirtualization v2

Reviewers: rsmith, hfinkel, nlopes, sanjoy, amharc, kuhar

Subscribers: arsenm, nhaehnle, JDevlieghere, hiraditya, xbolva00, llvm-commits

Differential Revision: https://reviews.llvm.org/D47103

Co-authored-by: Krzysztof Pszeniczny <krzysztof.pszeniczny@gmail.com>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336073 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-02 04:49:30 +00:00
Tom Stellard
8eb696d509 AMDGPU/GlobalISel: Make IMPLICIT_DEF of all sizes < 512 legal.
Summary:
We could split sizes that are not a power of two into smaller sized
G_IMPLICIT_DEF instructions, but this ends up generating
G_MERGE_VALUES instructions which we then have to handle in the instruction
selector.  Since G_IMPLICIT_DEF is really a no-op it's easier just to
keep everything that can fit into a register legal.

Reviewers: arsenm

Reviewed By: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D48777

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336041 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-30 04:09:44 +00:00
Matt Arsenault
eac8acfa94 AMDGPU: Don't use struct type for argument layout
This was introducing unnecessary padding after the explicit
arguments, depending on the alignment of the total struct type.
Also has the side effect of avoiding creating an extra GEP for
the offset from the base kernel argument to the explicit kernel
argument offset.
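
The padding in question is the tail padding a struct type carries so that its size is a multiple of its alignment. The sketch below uses a hypothetical two-member argument struct (`KernArgs`) to show the gap between where the last explicit argument ends and the struct's allocation size.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical explicit kernel arguments modeled as a struct.
struct KernArgs {
  alignas(8) std::int64_t A; // offset 0, 8 bytes
  std::int32_t B;            // offset 8, 4 bytes
};

// Byte offset just past the last explicit argument.
constexpr std::size_t explicitArgEnd() {
  return offsetof(KernArgs, B) + sizeof(std::int32_t);
}

// Size of the struct type, rounded up to its 8-byte alignment.
constexpr std::size_t structAllocSize() { return sizeof(KernArgs); }

// The struct view reports 4 bytes that no explicit argument occupies.
static_assert(structAllocSize() - explicitArgEnd() == 4,
              "tail padding after the explicit arguments");
```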

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335999 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-29 17:31:42 +00:00
Stanislav Mekhanoshin
0e1a98e255 [AMDGPU] Enable LICM in the BE pipeline
This allows hoisting the code that computes the reciprocal of a
loop-invariant denominator in integer division after the codegen
prepare expansion.

Differential Revision: https://reviews.llvm.org/D48604

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335988 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-29 16:26:53 +00:00
Tom Stellard
cba2181e77 AMDGPU: Separate R600 and GCN TableGen files
Summary:
We now have two sets of generated TableGen files, one for R600 and one
for GCN, so each sub-target now has its own tables of instructions,
registers, ISel patterns, etc.  This should help reduce compile time
since each sub-target now only has to consider information that
is specific to itself.  This will also help prevent the R600
sub-target from slowing down new features for GCN, like disassembler
support, GlobalISel, etc.

Reviewers: arsenm, nhaehnle, jvesely

Reviewed By: arsenm

Subscribers: MatzeB, kzhuravl, wdng, mgorny, yaxunl, dstuttard, tpr, t-tye, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D46365

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335942 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-28 23:47:12 +00:00
Stanislav Mekhanoshin
0e7f42b639 [AMDGPU] Early expansion of 32 bit udiv/urem
This allows hoisting of common code, for instance if the denominator
is loop invariant. The current change is the expansion only; adding LICM
to the target pass list will be a separate patch. With this patch,
changes to codegen are minor, as the expansion is similar to the one
in the DAG. The DAG expansion must still remain for R600.
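
To illustrate the benefit, the sketch below divides many numerators by one loop-invariant denominator: the reciprocal is computed once outside the loop, and each iteration does a multiply plus a fix-up. It uses a double-precision reciprocal for simplicity and is not the expansion AMDGPUCodeGenPrepare emits; `divideAll` is a hypothetical name.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Divide every element of Nums by the loop-invariant denominator D.
std::vector<std::uint32_t> divideAll(const std::vector<std::uint32_t> &Nums,
                                     std::uint32_t D) {
  assert(D != 0);
  // Loop-invariant part: computed once, outside the loop.
  const double Recip = 1.0 / static_cast<double>(D);

  std::vector<std::uint32_t> Out;
  Out.reserve(Nums.size());
  for (std::uint32_t N : Nums) {
    // Estimate the quotient with a multiply; it is off by at most one.
    std::uint64_t Q = static_cast<std::uint64_t>(static_cast<double>(N) * Recip);
    // Correct the estimate using exact integer arithmetic.
    if (Q * D > N)
      --Q;
    else if ((Q + 1) * D <= N)
      ++Q;
    Out.push_back(static_cast<std::uint32_t>(Q));
  }
  return Out;
}
```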

Differential Revision: https://reviews.llvm.org/D48586

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335868 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-28 15:59:18 +00:00
Stanislav Mekhanoshin
8746f255cd [AMDGPU] Overload llvm.amdgcn.fmad.ftz to support f16
Differential Revision: https://reviews.llvm.org/D48677

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335866 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-28 15:24:46 +00:00
Matt Arsenault
90f8cc80db AMDGPU: Remove MFI::ABIArgOffset
We have too many mechanisms for tracking the various offsets
used for kernel arguments, so remove one. There's still a lot of
confusion with these because there are two different "implicit"
argument areas located at the beginning and end of the kernarg
segment.

Additionally, the offset was determined based on the memory
size of the split element types. This would break in a future
commit where v3i32 is decomposed into separate i32 pieces.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335830 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-28 10:18:55 +00:00
Matt Arsenault
fe32d1437e AMDGPU: Error on calls from graphics shaders
In principle nothing should stop these from working, but
work is necessary to create an ABI for dealing with the stack
related registers.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335829 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-28 10:18:36 +00:00
Matt Arsenault
2f8e5b3099 AMDGPU: Fix AMDGPUCodeGenPrepare using uninitialized AMDGPUAS struct
Not sure how this wasn't noticed before.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335828 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-28 10:18:23 +00:00
Matt Arsenault
5f4316253c AMDGPU: Fix assert on aggregate type kernel arguments
Just fix the crash for now by not doing the optimization since
figuring out how to properly convert the bits for an arbitrary
struct is a pain.

Also fix a crash when there is only an empty struct argument.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335827 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-28 10:18:11 +00:00
Stanislav Mekhanoshin
bc547571e7 [AMDGPU] Convert rcp to rcp_iflag
If the source of an rcp instruction is the result of a conversion from
an integer, convert it into an rcp_iflag instruction. No FP exception
other than division by zero can occur if a single-precision rcp
argument is a representation of an integral number.

Differential Revision: https://reviews.llvm.org/D48569

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335742 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-27 15:33:33 +00:00
Konstantin Zhuravlyov
fe1e773676 AMDGPU/NFC: Fix typo in comment
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335707 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-27 05:36:03 +00:00
Konstantin Zhuravlyov
784d2a8499 AMDGPU: Silence unused warnings in waitcnt insertion pass in release build
Differential Revision: https://reviews.llvm.org/D48607


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335669 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-26 21:33:38 +00:00
Stanislav Mekhanoshin
aac117a9ba [AMDGPU] Add llvm.amdgcn.fmad.ftz intrinsic
This intrinsic selects v_mad_f32 regardless of fp32 denorm support.

Differential Revision: https://reviews.llvm.org/D48573

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335654 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-26 20:04:19 +00:00
Matt Arsenault
a2ba13d731 AMDGPU: Add pass to lower kernel arguments to loads
This replaces most argument uses with loads, but for
now not all.

The code in SelectionDAG for calling convention lowering
is actively harmful for amdgpu_kernel. It attempts to
split the argument types into register legal types, which
results in low quality code for arbitrary types. Since
all kernel arguments are passed in memory, we just want the
raw types.
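
Conceptually, an argument passed in memory is just a typed load from a fixed byte offset in the kernarg segment. The sketch below models that on the host with a byte buffer. `loadKernArg` and `storeKernArg` are hypothetical helpers for illustration and say nothing about how the pass emits its loads.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Read an argument of its raw type T from the segment at a byte offset.
template <typename T>
T loadKernArg(const std::vector<std::uint8_t> &Segment, std::size_t Offset) {
  assert(Offset + sizeof(T) <= Segment.size());
  T Value;
  std::memcpy(&Value, Segment.data() + Offset, sizeof(T));
  return Value;
}

// Write an argument into the segment; stands in for the host-side setup.
template <typename T>
void storeKernArg(std::vector<std::uint8_t> &Segment, std::size_t Offset,
                  const T &Value) {
  assert(Offset + sizeof(T) <= Segment.size());
  std::memcpy(Segment.data() + Offset, &Value, sizeof(T));
}
```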

I've tried a couple of methods of mitigating this in SelectionDAG,
but it's easier to just bypass this problem altogether. It's
possible to hack around the problem in the initial lowering,
but the real problem is the DAG then expects to be able to use
CopyToReg/CopyFromReg for uses of the arguments outside the block.

Exposing the argument loads in the IR also has the advantage
that the LoadStoreVectorizer can merge them.

I'm not sure what the best approach to dealing with the IR
argument list is. The patch as-is just leaves the IR arguments
in place, so all the existing code will still compute the same
kernarg size and pointlessly lower the arguments.

Arguably the frontend should emit kernels with an empty argument
list in the first place. Alternatively a dummy array could be
inserted as a single argument just to reserve space.

This does have some disadvantages. Local pointer kernel arguments can
no longer have AssertZext placed on them as the equivalent !range
metadata is not valid on pointer-typed loads. This is mostly bad
for SI which needs to know about the known bits in order to use the
DS instruction offset, so in this case this is not done.

More importantly, this skips noalias arguments since this pass
does not yet convert this to the equivalent !alias.scope and !noalias
metadata. Producing this metadata correctly seems to be tricky,
although this logically is the same as inlining into a function which
doesn't exist. Additionally, exposing these loads to the vectorizer
may result in degraded aliasing information if a pointer load is
merged with another argument load.

I'm also not entirely sure this is preserving the current clover
ABI, although I would greatly prefer if it would stop widening
arguments and match the HSA ABI. As-is I think it is extending
< 4-byte arguments to 4-bytes but doesn't align them to 4-bytes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335650 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-26 19:10:00 +00:00
Matt Arsenault
529b26551c AMDGPU/GlobalISel: Add support for llvm.amdgcn.kernarg.segment.ptr
Note a normal select test is not currently possible because this
relies on input registers tracked in SIMachineFunctionInfo which
are not currently serializable in MIR, but this does work end-to-end
from the IR.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335490 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-25 16:17:48 +00:00
Matt Arsenault
b34fc164bd AMDGPU: Remove commented out code
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335486 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-25 15:42:20 +00:00
Matt Arsenault
c4c340a047 AMDGPU/GlobalISel: Fix G_IMPLICIT_DEF for pointers
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335485 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-25 15:42:12 +00:00
Matt Arsenault
d3d2228f8c AMDGPU: Respect align argument parameter
This should avoid relying on the pointee type
to get the alignment, particularly since pointee
types are supposed to be removed at some point.

Also fixes not getting the alignment for unsized types.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335478 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-25 14:29:04 +00:00
Reid Kleckner
00f00bab00 [AMDGPU] Update includes for intrinsic changes :(
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335409 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-23 03:05:39 +00:00
Reid Kleckner
af7c445dfa [IR] Split Intrinsics.inc into enums and implementations
Implements PR34259

Intrinsics.h is a very popular header. Most LLVM TUs care about things
like dbg_value, but they don't care how they are implemented. After I
split these out, IntrinsicImpl.inc is 1.7 MB, so this saves each LLVM TU
from scanning 1.7 MB of source that gets pre-processed away.

It also means we can modify intrinsic properties without triggering a
full rebuild, but that's probably less of a win.

I think the next best thing to do would be to split out the target
intrinsics into their own header. Very, very few TUs care about
target-specific intrinsics. It's very hard to split up the target
independent intrinsics like llvm.expect, assume, and dbg.value, though.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335407 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-23 02:02:38 +00:00
Matt Arsenault
ae175dfe4a AMDGPU: Add patterns for i32/i64 local atomic load/store
Not sure why the 32/64 split is needed in the atomic_load
store hierarchies. The regular PatFrags do this, but we don't
do it for the existing handling for global.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335325 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-22 08:39:52 +00:00