Commit Graph

1269 Commits

Author SHA1 Message Date
Marek Olsak
4fda278e9b AMDGPU: Add new intrinsic llvm.amdgcn.kill(i1)
Summary:
Kill the thread if operand 0 == false.
llvm.amdgcn.wqm.vote can be applied to the operand.

Also allow kill in all shader stages.

Reviewers: arsenm, nhaehnle

Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D38544

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316427 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-24 10:27:13 +00:00
Marek Olsak
7525c08782 AMDGPU: Add llvm.amdgcn.wqm.vote intrinsic
Reviewers: arsenm, nhaehnle

Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye

Differential Revision: https://reviews.llvm.org/D38543

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316426 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-24 10:26:59 +00:00
Matt Arsenault
367cfd84c3 AMDGPU: Fix default range in non-kernel functions
The range should be assumed to be the hardware maximum
if a workitem intrinsic is used in a callable function
which does not know the restricted limit of the calling
kernel.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316346 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-23 17:09:35 +00:00
Justin Bogner
ba2fa173d9 Canonicalize a large number of mir tests using update_mir_test_checks
This converts a large and somewhat arbitrary set of tests to use
update_mir_test_checks. I ran the script on all of the tests I expect
to need to modify for an upcoming mir syntax change and kept the ones
that obviously didn't change the tests in ways that might make it
harder to understand.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316137 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-18 23:18:12 +00:00
Konstantin Zhuravlyov
28cb7901b7 AMDGPU: Rename MaxFlatWorkgroupSize to MaxFlatWorkGroupSize for consistency
Differential Revision: https://reviews.llvm.org/D38957


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316097 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-18 17:31:09 +00:00
Wei Ding
607acf30af AMDGPU : Fix an error for the llvm.cttz implementation.
Differential Revision: http://reviews.llvm.org/D39014

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316037 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-17 21:49:52 +00:00
Konstantin Zhuravlyov
cecf102e0c AMDGPU: Start generating metadata for MaxFlatWorkGroupSize
Differential Revision: https://reviews.llvm.org/D38958


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316024 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-17 20:03:21 +00:00
Mark Searles
df60d8e59e Use the return value of UpdateNodeOperands(); in some cases, UpdateNodeOperands() modifies the node in-place and using the return value isn’t strictly necessary. However, it does not necessarily modify the node, but may return a resultant node if it already exists in the DAG. See comments in UpdateNodeOperands(). In that case, the return value must be used to avoid such scenarios as an infinite loop (node is assumed to have been updated, so added back to the worklist, and re-processed; however, node hasn’t changed so it is once again passed to UpdateNodeOperands(), assumed modified, added back to worklist; cycle infinitely repeats).
Differential Revision: https://reviews.llvm.org/D38466

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315957 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-16 23:38:53 +00:00
Alexander Timofeev
f6eb7ff7f8 [AMDGPU] : revert r315908
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315916 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-16 16:57:37 +00:00
Alexander Timofeev
7811640b9a [AMDGPU] Prevent Machine Copy Propagation from replacing live copy with the dead one
Differential revision: https://reviews.llvm.org/D38754

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315908 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-16 14:35:29 +00:00
Konstantin Zhuravlyov
73363df463 AMDGPU: Temporary disable pal metadata check line in llvm-readobj test
It fails on mips


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315837 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-14 23:42:11 +00:00
Konstantin Zhuravlyov
5556d8485b AMDGPU: Bring HSA metadata on par with the specification
Differential Revision: https://reviews.llvm.org/D38753


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315821 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-14 19:03:51 +00:00
Konstantin Zhuravlyov
7032e50fbc llvm-readobj: Print AMDGPU note contents
Differential Revision: https://reviews.llvm.org/D38752


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315819 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-14 18:21:42 +00:00
Konstantin Zhuravlyov
3b76e7aa12 AMDGPU: Cleanup elf-notes.ll test
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315816 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-14 17:36:53 +00:00
Konstantin Zhuravlyov
5376db4ce8 llvm-readobj: Print AMDGPU note type names
Differential Revision: https://reviews.llvm.org/D38751


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315813 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-14 16:43:46 +00:00
Konstantin Zhuravlyov
473d951406 AMDGPU: Do not emit deprecated notes for code object v3
Differential Revision: https://reviews.llvm.org/D38749


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315810 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-14 15:59:07 +00:00
Konstantin Zhuravlyov
eb211af057 AMDGPU: Add support for isa version note
- Emit NT_AMD_AMDGPU_ISA
  - Add assembler parsing for isa version directive
    - If isa version directive does not match command line arguments, then return error

Differential Revision: https://reviews.llvm.org/D38748


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315808 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-14 15:40:33 +00:00
Matt Arsenault
93f0e0c1d1 AMDGPU: Implement hasBitPreservingFPLogic
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315754 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-13 21:10:22 +00:00
Matt Arsenault
b1763ab78e AMDGPU: Look for src mods before fp_extend
When selecting modifiers for mad_mix instructions,
look at fneg/fabs that occur before the conversion.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315748 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-13 20:45:49 +00:00
Matt Arsenault
9a6875264b AMDGPU: Implement isFPExtFoldable
This helps match v_mad_mix* in some cases.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315744 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-13 20:18:59 +00:00
Wei Ding
29de9d738e Implement custom lowering for ISD::CTTZ_ZERO_UNDEF and ISD::CTTZ.
Differential Revision: http://reviews.llvm.org/D37348

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315610 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-12 19:37:14 +00:00
Tim Renouf
fadd83b09c [AMDGPU] For amdpal, widen interpolation mode workaround
Summary:
The interpolation mode workaround ensures that at least one
interpolation mode is enabled in PSInputAddr. It does not also check
PSInputEna on the basis that the user might enable bits in that
depending on run-time state.

However, for amdpal os type, the user does not enable some bits after
compilation based on run-time states; the register values being
generated here are the final ones set in the hardware. Therefore, apply
the workaround to PSInputAddr and PSInputEnable together. (The case
where a bit is set in PSInputAddr but not in PSInputEnable is where the
frontend set up an input arg for a particular interpolation mode, but
nothing uses that input arg. Really we should have an earlier pass that
removes such an arg.)

Reviewers: arsenm, nhaehnle, dstuttard

Subscribers: kzhuravl, wdng, yaxunl, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D37758

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315591 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-12 16:16:41 +00:00
Konstantin Zhuravlyov
257828766c AMDGPU/NFC: Minor clean ups in PAL metadata
- Move PAL metadata definitions to AMDGPUMetadata
  - Make naming consistent with HSA metadata

Differential Revision: https://reviews.llvm.org/D38745


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315523 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-11 22:41:09 +00:00
Konstantin Zhuravlyov
44bc30dd6d AMDGPU/NFC: Rename code object metadata as HSA metadata
- Rename AMDGPUCodeObjectMetadata to AMDGPUMetadata (PAL metadata will be included in this file in the follow up change)
  - Rename AMDGPUCodeObjectMetadataStreamer to AMDGPUHSAMetadataStreamer
  - Introduce HSAMD namespace
  - Other minor name changes in function and test names


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315522 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-11 22:18:53 +00:00
Matt Arsenault
2d1082f2c0 AMDGPU: Fix missing skipFunction calls
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315361 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-10 20:48:36 +00:00
Matt Arsenault
16961e0820 AMDGPU: Fix failure to select branch with optnone
opt-bisect/optnone disable the AMDGPUUniformAnnotateValues pass.
The heuristic in the custom selector for brcond deferred the
branch uniformity check to the pattern, which would fail.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315360 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-10 20:34:49 +00:00
Yaxun Liu
091c043b90 [AMDGPU] Lower enqueued blocks and generate runtime metadata
This patch adds a post-linking pass which replaces the function pointer of enqueued
block kernel with a global variable (runtime handle) and adds
runtime-handle attribute to the enqueued block kernel.

In LLVM CodeGen the runtime-handle metadata will be translated to
RuntimeHandle metadata in code object. Runtime allocates a global buffer
for each kernel with RuntimeHandel metadata and saves the kernel address
required for the AQL packet into the buffer. __enqueue_kernel function
in device library knows that the invoke function pointer in the block
literal is actually runtime handle and loads the kernel address from it
and puts it into AQL packet for dispatching.

This cannot be done in FE since FE cannot create a unique global variable
with external linkage across LLVM modules. The global variable with internal
linkage does not work since optimization passes will try to replace loads
of the global variable with its initialization value.

Differential Revision: https://reviews.llvm.org/D38610


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315352 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-10 19:39:48 +00:00
David Stuttard
60e48ff092 [DAGCombine] Fix for shuffle to vector extend for non power 2 vectors
Summary:
See https://llvm.org/PR33743 for more details

It seems that for non-power of 2 vector sizes, the algorithm can produce
non-matching sizes for input and result causing an assert.

This usually isn't a problem as the isAnyExtend check will weed these out, but
in some cases (most often with lots of undefined values for the mask indices) it
can pass this check for non power of 2 vectors.

Adding in an extra check that ensures that bit size will match for the result
and input (as required)

Subscribers: nhaehnle

Differential Revision: https://reviews.llvm.org/D35241

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315307 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-10 12:45:45 +00:00
Nicolai Haehnle
73de2f1810 AMDGPU: Split MUBUF offset into aligned components
Summary:
Atomic buffer operations do not work (and trap on gfx9) when the
components are unaligned, even if their sum is aligned.

Previously, we generated an offset of 4156 without an SGPR by
splitting it as 4095 + 61 (immediate + inline constant). The
highest offset for which we can do this correctly is 4156 = 4092 + 64.

Fixes dEQP-GLES31.functional.ssbo.atomic.*

Reviewers: arsenm

Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye

Differential Revision: https://reviews.llvm.org/D37850

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315302 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-10 12:22:23 +00:00
Stanislav Mekhanoshin
287608ccc3 [AMDGPU] New 64 bit div/rem expansion
Old expansion was 20 VGPRs, 78 SGPRs and ~380 instructions.
This expansion is 11 VGPRs, 12 SGPRs and ~120 instructions.

Passes OpenCL conformance test_integer_ops quick_[u]long_math

Differential Revision: https://reviews.llvm.org/D38607

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315081 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-06 17:24:45 +00:00
Davide Italiano
300b37a6ad [PassManager] Run global optimizations after the inliner.
The inliner performs some kind of dead code elimination as it goes,
but there are cases that are not really caught by it. We might
at some point consider teaching the inliner about them, but it
is OK for now to run GlobalOpt + GlobalDCE in tandem as their
benefits generally outweight the cost, making the whole pipeline
faster.

This fixes PR34652.

Differential Revision: https://reviews.llvm.org/D38154

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314997 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-05 18:06:37 +00:00
Matt Arsenault
16b0d7598e AMDGPU: Set v2i32 any_extend to expand
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314993 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-05 17:38:30 +00:00
Konstantin Zhuravlyov
9b90153e76 AMDGPU: Add and set AMDGPU-specific e_flags
Differential Revision: https://reviews.llvm.org/D38556


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314987 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-05 16:19:18 +00:00
Matt Arsenault
5eacfa4990 AMDGPU: Do not fold clamp instructions when sources are different
Patch by hakzsam (Samuel Pitoiset)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314951 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-05 00:13:17 +00:00
Matt Arsenault
b151df8f6d AMDGPU: Fix not accounting for instruction size in bundles
These were counted as 0. Fixes branch limit exceeded errors
in some large programs.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314944 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-04 22:59:12 +00:00
Konstantin Zhuravlyov
f1bd8070e1 AMDGPU: Correctly set EI_OSABI based on the os
Differential Revision: https://reviews.llvm.org/D38555


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314943 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-04 22:44:13 +00:00
Konstantin Zhuravlyov
b922e2ff02 AMDGPU: Expand setcc for v2i32 and v4i32
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314852 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-03 21:31:24 +00:00
Tim Renouf
924d87d4be [AMDGPU] implemented pal metadata
Summary:
For the amdpal OS type:

We write an AMDGPU_PAL_METADATA record in the .note section in the ELF
(or as an assembler directive). It contains key=value pairs of 32 bit
ints. It is a merge of metadata from codegen of the shaders, and
metadata provided by the frontend as _amdgpu_pal_metadata IR metadata.
Where both sources have a key=value with the same key, the two values
are ORed together.

This .note record is part of the amdpal ABI and will be documented in
docs/AMDGPUUsage.rst in a future commit.

Eventually the amdpal OS type will stop generating the .AMDGPU.config
section once the frontend has safely moved over to using the .note
records above instead of .AMDGPU.config.

Reviewers: arsenm, nhaehnle, dstuttard

Subscribers: kzhuravl, wdng, yaxunl, llvm-commits, t-tye

Differential Revision: https://reviews.llvm.org/D37753

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314829 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-03 19:03:52 +00:00
Alexander Timofeev
673f1ddc20 [AMDGPU] Avoid predicated execution of the basic blocks containing scalar
instructions.

Differential revision: https://reviews.llvm.org/D38293

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314828 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-03 18:55:36 +00:00
Geoff Berry
c3ef7ae13a Revert "Re-enable "[MachineCopyPropagation] Extend pass to do COPY source forwarding""
This reverts commit r314729.

Another bug has been encountered in an out-of-tree target reported by Quentin.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314814 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-03 16:59:13 +00:00
Geoff Berry
d990d28864 Re-enable "[MachineCopyPropagation] Extend pass to do COPY source forwarding"
Issues addressed since original review:
- Avoid bug in regalloc greedy/machine verifier when forwarding to use
  in an instruction that re-defines the same virtual register.
- Fixed bug when forwarding to use in EarlyClobber instruction slot.
- Fixed incorrect forwarding to register definitions that showed up in
  explicit_uses() iterator (e.g. in INLINEASM).
- Moved removal of dead instructions found by
  LiveIntervals::shrinkToUses() outside of loop iterating over
  instructions to avoid instructions being deleted while pointed to by
  iterator.
- Fixed ARMLoadStoreOptimizer bug exposed by this change in r311907.
- The pass no longer forwards COPYs to physical register uses, since
  doing so can break code that implicitly relies on the physical
  register number of the use.
- The pass no longer forwards COPYs to undef uses, since doing so
  can break the machine verifier by creating LiveRanges that don't
  end on a use (since the undef operand is not considered a use).

  [MachineCopyPropagation] Extend pass to do COPY source forwarding

  This change extends MachineCopyPropagation to do COPY source forwarding.

  This change also extends the MachineCopyPropagation pass to be able to
  be run during register allocation, after physical registers have been
  assigned, but before the virtual registers have been re-written, which
  allows it to remove virtual register COPY LiveIntervals that become dead
  through the forwarding of all of their uses.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314729 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-02 22:01:37 +00:00
Matt Arsenault
c31f9e0d71 AMDGPU: Fix potentially incorrectly matching check lines
These check lines are supposed to make sure the new d16
load instructions aren't used, but the expected instruction
name is a prefix of the incorrect instruction name.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314714 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-02 20:31:16 +00:00
Stanislav Mekhanoshin
319e85781b Eliminate ftrunc if source is know to be rounded
Differential Revision: https://reviews.llvm.org/D38421

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314688 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-02 16:57:07 +00:00
Stanislav Mekhanoshin
8b431a9469 [AMDGPU] Set fast-math flags on functions given the options
We have a single library build without relaxation options.
When inlined library functions remove fast math attributes
from the functions they are integrated into.

This patch sets relaxation attributes on the functions after
linking provided corresponding relaxation options are given.
Math instructions inside the inlined functions remain to have
no fast flags, but inlining does not prevent fast math
transformations of a surrounding caller code anymore.

Differential Revision: https://reviews.llvm.org/D38325

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314568 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-29 23:40:19 +00:00
Yaxun Liu
a46819d67a CodeGen: Fix pointer info in expandUnalignedLoad/Store
Currently expandUnalignedLoad/Store uses place holder pointer info for temporary memory operand
in stack, which does not have correct address space. This causes unaligned private double16 load/store to be
lowered to flat_load instead of buffer_load for amdgcn target.

This fixes failures of OpenCL conformance test basic/vload_private/vstore_private on target amdgcn---amdgizcl.

Differential Revision: https://reviews.llvm.org/D35361


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314566 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-29 23:31:14 +00:00
Nicolai Haehnle
da47d9d5d0 AMDGPU: VALU carry-in and v_cndmask condition cannot be EXEC
The hardware will only forward EXEC_LO; the high 32 bits will be zero.

Additionally, inline constants do not work. At least,

   v_addc_u32_e64 v0, vcc, v0, v1, -1

which could conceivably be used to combine (v0 + v1 + 1) into a single
instruction, acts as if all carry-in bits are zero.

The llvm.amdgcn.ps.live test is adjusted; it would be nice to combine

   s_mov_b64 s[0:1], exec
   v_cndmask_b32_e64 v0, v1, v2, s[0:1]

into

   v_mov_b32 v0, v3

but it's not particularly high priority.

Fixes dEQP-GLES31.functional.shaders.helper_invocation.value.*

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314522 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-29 15:37:31 +00:00
Tim Renouf
8ba98f908f [AMDGPU] calling conventions for AMDPAL OS type
Summary:
This commit adds comments on how the AMDPAL OS type overloads the
existing AMDGPU_ calling conventions used by Mesa, and adds a couple of
new ones.

Reviewers: arsenm, nhaehnle, dstuttard

Subscribers: mehdi_amini, kzhuravl, wdng, yaxunl, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D37752

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314502 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-29 09:51:22 +00:00
Tim Renouf
2532de2a09 [AMDGPU] AMDPAL scratch buffer support
Summary:
Added support for scratch (including spilling) for OS type amdpal:
generates code to set up the scratch descriptor if it is needed.

With amdpal, the scratch resource descriptor is loaded from offset 0 of
the global information table. The low 32 bits of the address of the
global information table is passed in s0.

Added amdgpu-git-ptr-high function attribute to hard-wire the high 32
bits of the address of the global information table. If the function
attribute is not specified, or is 0xffffffff, then the backend generates
code to use the high 32 bits of pc.

The documentation for the AMDPAL ABI will be added in a later commit.

Subscribers: arsenm, kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, t-tye

Differential Revision: https://reviews.llvm.org/D37483

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314501 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-29 09:49:35 +00:00
Tim Renouf
41ca6917d8 [Triple] Add AMDPAL operating system type
Summary:
This operating system type represents the AMDGPU PAL runtime, and will
be required by the AMDGPU backend in order to generate correct code for
this runtime.

Currently it generates the same code as not specifying an OS at all.
That will change in future commits.

Patch from Tim Corringham.

Subscribers: arsenm, nhaehnle

Differential Revision: https://reviews.llvm.org/D37380

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314500 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-29 09:48:12 +00:00
Matt Arsenault
ae3e3c01b9 AMDGPU: Add option to stress calls
This inverts the behavior of the AlwaysInline pass to mark
every function not already marked alwaysinline as noinline.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313865 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-21 07:00:48 +00:00