202 Commits

Author SHA1 Message Date
Sameer Sahasrabuddhe
116128c1c0 [AMDGPU] restore r342722 which was reverted with r342743
[AMDGPU] lower-switch in preISel as a workaround for legacy DA

Summary:
The default target of the switch instruction may sometimes be an
"unreachable" block, when it is guaranteed that one of the cases is
always taken. The dominator tree concludes that such a switch
instruction does not have an immediate post dominator. This confuses
divergence analysis, which is unable to propagate sync dependence to
the targets of the switch instruction.

As a workaround, the AMDGPU target now invokes lower-switch as a
preISel pass. LowerSwitch is designed to handle the unreachable
default target correctly, allowing the divergence analysis to locate
the correct immediate dominator of the now-lowered switch.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342956 91177308-0d34-0410-b5e6-96231b3b80d8
2018-09-25 09:39:21 +00:00
Sameer Sahasrabuddhe
0ccb4cd734 revert changes from r342722
"[AMDGPU] lower-switch in preISel as a workaround for legacy DA"

This broke regression tests. The first breakage was noticed here:
http://lab.llvm.org:8011/builders/lld-x86_64-freebsd/builds/23549


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342743 91177308-0d34-0410-b5e6-96231b3b80d8
2018-09-21 16:31:51 +00:00
Sameer Sahasrabuddhe
5b5e532790 [AMDGPU] lower-switch in preISel as a workaround for legacy DA
Summary:
The default target of the switch instruction may sometimes be an
"unreachable" block, when it is guaranteed that one of the cases is
always taken. The dominator tree concludes that such a switch
instruction does not have an immediate post dominator. This confuses
divergence analysis, which is unable to propagate sync dependence to
the targets of the switch instruction.

As a workaround, the AMDGPU target now invokes lower-switch as a
preISel pass. LowerSwitch is designed to handle the unreachable
default target correctly, allowing the divergence analysis to locate
the correct immediate dominator of the now-lowered switch.

Reviewers: arsenm, nhaehnle

Reviewed By: nhaehnle

Subscribers: kzhuravl, jvesely, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, llvm-commits, simoll

Differential Revision: https://reviews.llvm.org/D52221

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@342722 91177308-0d34-0410-b5e6-96231b3b80d8
2018-09-21 11:26:55 +00:00
Matt Arsenault
c8c005cb52 AMDGPU: Stop forcing internalize at -O0
This doesn't really matter if clang is always emitting
the visibility as hidden by default.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@341168 91177308-0d34-0410-b5e6-96231b3b80d8
2018-08-31 06:02:36 +00:00
Matt Arsenault
7e212e4168 AMDGPU: Remove remnants of old address space mapping
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@341165 91177308-0d34-0410-b5e6-96231b3b80d8
2018-08-31 05:49:54 +00:00
Mark Searles
249b255c78 run post-RA hazard recognizer pass late
Memory legalizer, waitcnt, and shrink  passes can perturb the instructions,
which means that the post-RA hazard recognizer pass should run after them.
Otherwise, one of those passes may invalidate the work done by the hazard
recognizer. Note that this has adverse side-effect that any consecutive
S_NOP 0's, emitted by the hazard recognizer, will not be shrunk into a
single S_NOP <N>. This should be addressed in a follow-on patch.

Differential Revision: https://reviews.llvm.org/D49288

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@337154 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-16 10:02:41 +00:00
Tom Stellard
1d6fd076a3 AMDGPU: Refactor Subtarget classes
Summary:
This is a follow-up to r335942.
- Merge SISubtarget into AMDGPUSubtarget and rename to GCNSubtarget
- Rename AMDGPUCommonSubtarget to AMDGPUSubtarget
- Merge R600Subtarget::Generation and GCNSubtarget::Generation into
  AMDGPUSubtarget::Generation.

Reviewers: arsenm, jvesely

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, t-tye, javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D49037

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336851 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-11 20:59:01 +00:00
Matt Arsenault
e07c9538b5 Reapply "AMDGPU: Force inlining if LDS global address is used"
This reverts commit r336623

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336675 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-10 14:03:41 +00:00
Vlad Tsyrklevich
3bda4ed0ec Revert "AMDGPU: Force inlining if LDS global address is used"
This reverts commit r336587, it was causing test failures on the
sanitizer bots.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336623 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-10 00:46:07 +00:00
Matt Arsenault
12d30e1e27 AMDGPU: Force inlining if LDS global address is used
These won't work for the forseeable future. These aren't allowed
from OpenCL, but IPO optimizations can make them appear.

Also directly set the attributes on functions, regardless
of the linkage rather than cloning functions like before.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@336587 91177308-0d34-0410-b5e6-96231b3b80d8
2018-07-09 19:22:22 +00:00
Stanislav Mekhanoshin
0e1a98e255 [AMDGPU] Enable LICM in the BE pipeline
This allows to hoist code portion to compute reciprocal of loop
invariant denominator in integer division after codegen prepare
expansion.

Differential Revision: https://reviews.llvm.org/D48604

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335988 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-29 16:26:53 +00:00
Matt Arsenault
a2ba13d731 AMDGPU: Add pass to lower kernel arguments to loads
This replaces most argument uses with loads, but for
now not all.

The code in SelectionDAG for calling convention lowering
is actively harmful for amdgpu_kernel. It attempts to
split the argument types into register legal types, which
results in low quality code for arbitary types. Since
all kernel arguments are passed in memory, we just want the
raw types.

I've tried a couple of methods of mitigating this in SelectionDAG,
but it's easier to just bypass this problem alltogether. It's
possible to hack around the problem in the initial lowering,
but the real problem is the DAG then expects to be able to use
CopyToReg/CopyFromReg for uses of the arguments outside the block.

Exposing the argument loads in the IR also has the advantage
that the LoadStoreVectorizer can merge them.

I'm not sure the best approach to dealing with the IR
argument list is. The patch as-is just leaves the IR arguments
in place, so all the existing code will still compute the same
kernarg size and pointlessly lowers the arguments.

Arguably the frontend should emit kernels with an empty argument
list in the first place. Alternatively a dummy array could be
inserted as a single argument just to reserve space.

This does have some disadvantages. Local pointer kernel arguments can
no longer have AssertZext placed  on them as the equivalent !range
metadata is not valid on pointer  typed loads. This is mostly bad
for SI which needs to know about the known bits in order to use the
DS instruction offset, so in this case this is not done.

More importantly, this skips noalias arguments since this pass
does not yet convert this to the equivalent !alias.scope and !noalias
metadata. Producing this metadata correctly seems to be tricky,
although this logically is the same as inlining into a function which
doesn't exist. Additionally, exposing these loads to the vectorizer
may result in degraded aliasing information if a pointer load is
merged with another argument load.

I'm also not entirely sure this is preserving the current clover
ABI, although I would greatly prefer if it would stop widening
arguments and match the HSA ABI. As-is I think it is extending
< 4-byte arguments to 4-bytes but doesn't align them to 4-bytes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@335650 91177308-0d34-0410-b5e6-96231b3b80d8
2018-06-26 19:10:00 +00:00
Stanislav Mekhanoshin
e5240df77a [AMDGPU] Construct memory clauses before RA
Memory clauses are formed into bundles in presence of xnack.
Their source operands are marked as early-clobber.

This allows to allocate distinct source and destination registers
within a clause and prevent breaking the clause with s_nop in the
hazard recognizer.

Clauses are undone before post-RA scheduler to allow some rescheduling,
which will not break the clause since artificial edges are created in
the dag to keep memory operations together. Yet this allows a better
ILP in some cases.

Differential Revision: https://reviews.llvm.org/D47511

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@333691 91177308-0d34-0410-b5e6-96231b3b80d8
2018-05-31 20:13:51 +00:00
Tom Stellard
e0c801c31a AMDGPU: Split AMDGPUTTI into GCNTTI and R600TTI
Reviewers: arsenm, nhaehnle

Reviewed By: arsenm

Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, llvm-commits, t-tye

Differential Revision: https://reviews.llvm.org/D47359

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@333605 91177308-0d34-0410-b5e6-96231b3b80d8
2018-05-30 22:55:35 +00:00
Matt Arsenault
c40f49e881 AMDGPU: Fix typo in option description
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@333457 91177308-0d34-0410-b5e6-96231b3b80d8
2018-05-29 19:35:46 +00:00
Mark Searles
d900d78107 [AMDGPU][Waitcnt] Remove obsolete waitcnt option
With the removal of the old waitcnt pass, the '-enable-si-insert-waitcnts' option is obsolete. Remove it.

Differential Revision: https://reviews.llvm.org/D47378

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@333303 91177308-0d34-0410-b5e6-96231b3b80d8
2018-05-25 20:24:08 +00:00
Matt Arsenault
cdbb0ae52b AMDGPU: Add pass to optimize reqd_work_group_size
Eliminate loads from the dispatch packet when they will have
a known value.

Also pattern match the code used by the library to handle partial
workgroup dispatches, which isn't necessary if reqd_work_group_size
is used.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@332771 91177308-0d34-0410-b5e6-96231b3b80d8
2018-05-18 21:35:00 +00:00
Matt Arsenault
0d44e5b362 AMDGPU: Rename OpenCL lowering pass to be R600 specific.
This pass is
  a) broken.
  b) r600 specific.

Fixing (a) is a bit more non-trivial, but fixing (b)
is easy. Move this pass to being R600 only for now.

This pass does pass all the unit tests, however clang
no longer generates code that looks like the unit test
input, so fixing the pass requires fixing the tests and
the pass as one, and checking it works with clang still.

Patch by Dave Airlie

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@332196 91177308-0d34-0410-b5e6-96231b3b80d8
2018-05-13 10:04:48 +00:00
Mark Searles
ac95aa5361 [AMDGPU][Waitcnt] Remove the old waitcnt pass
Remove the old waitcnt pass ( si-insert-waits ), which is no longer maintained
and getting crufty

Differential Revision: https://reviews.llvm.org/D46448

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@331641 91177308-0d34-0410-b5e6-96231b3b80d8
2018-05-07 14:43:28 +00:00
Adrian Prantl
26b584c691 Remove \brief commands from doxygen comments.
We've been running doxygen with the autobrief option for a couple of
years now. This makes the \brief markers into our comments
redundant. Since they are a visual distraction and we don't want to
encourage more \brief markers in new code either, this patch removes
them all.

Patch produced by

  for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done

Differential Revision: https://reviews.llvm.org/D46290

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@331272 91177308-0d34-0410-b5e6-96231b3b80d8
2018-05-01 15:54:18 +00:00
Tom Stellard
70b7270658 AMDGPU: Initialize GlobalISel passes
Summary:
This fixes AMDGPU GlobalISel test failures when enabling the AMDGPU
target without any other targets that use GlobalISel.

Reviewers: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, rovka, kristof.beyls, dstuttard, tpr, llvm-commits, t-tye

Differential Revision: https://reviews.llvm.org/D45353

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@329588 91177308-0d34-0410-b5e6-96231b3b80d8
2018-04-09 16:09:13 +00:00
Matt Arsenault
afaac4ad30 AMDGPU: Set natural stack alignment in DataLayout
Only 4 byte alignment is ever useful, so increasing anything
beyond this may require realigning the stack.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328656 91177308-0d34-0410-b5e6-96231b3b80d8
2018-03-27 19:26:40 +00:00
David Blaikie
fe42bd50da Move TargetLoweringObjectFile from CodeGen to Target to fix layering
It's implemented in Target & include from other Target headers, so the
header should be in Target.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@328392 91177308-0d34-0410-b5e6-96231b3b80d8
2018-03-23 23:58:19 +00:00
Yaxun Liu
2930e5c52d [AMDGPU] Change constant addr space to 4
Differential Revision: https://reviews.llvm.org/D43170


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@325030 91177308-0d34-0410-b5e6-96231b3b80d8
2018-02-13 18:00:25 +00:00
Matt Arsenault
6f2da0b6ad Reapply "AMDGPU: Add 32-bit constant address space"
This reverts r324494 and reapplies r324487.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324747 91177308-0d34-0410-b5e6-96231b3b80d8
2018-02-09 16:57:57 +00:00
Rafael Espindola
c952538085 Revert "AMDGPU: Add 32-bit constant address space"
This reverts commit r324487.

It broke clang tests.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324494 91177308-0d34-0410-b5e6-96231b3b80d8
2018-02-07 18:09:35 +00:00
Marek Olsak
3afd566557 AMDGPU: Add 32-bit constant address space
Note: This is a candidate for LLVM 6.0, because it was planned to be
      in that release but was delayed due to a long review period.

Merge conflict in release_60 - resolution:
    Add "-p6:32:32" into the second (non-amdgiz) string.

Only scalar loads support 32-bit pointers. An address in a VGPR will
fail to compile. That's OK because the results of loads will only be used
in places where VGPRs are forbidden.

Updated AMDGPUAliasAnalysis and used SReg_64_XEXEC.
The tests cover all uses cases we need for Mesa.

Reviewers: arsenm, nhaehnle

Subscribers: kzhuravl, wdng, yaxunl, dstuttard, tpr, t-tye, llvm-commits

Differential Revision: https://reviews.llvm.org/D41651

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324487 91177308-0d34-0410-b5e6-96231b3b80d8
2018-02-07 16:01:00 +00:00
Mark Searles
4fe7c58adf [AMDGPU] Suppress redundant waitcnt instrs.
1. Run the memory legalizer prior to the waitcnt pass; keep the policy that the waitcnt pass does not remove any waitcnts within the incoming IR.

2. The waitcnt pass doesn't (yet) track waitcnts that exist prior to the waitcnt pass (it just skips over them); because the waitcnt pass is ignorant of them, it may insert a redundant waitcnt. To avoid this, check the prev instr. If it and the to-be-inserted waitcnt are the same, suppress the insertion. We keep the existing waitcnt under the assumption that whomever, e.g., the memory legalizer, inserted it knows what they were doing.

3. Follow-on work: teach the waitcnt pass to record the pre-existing waitcnts for better waitcnt production.

Differential Revision: https://reviews.llvm.org/D42854

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324440 91177308-0d34-0410-b5e6-96231b3b80d8
2018-02-07 02:21:21 +00:00
Yaxun Liu
a2386c3b26 [AMDGPU] Switch to the new addr space mapping by default
This requires corresponding clang change.

Differential Revision: https://reviews.llvm.org/D40955


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@324101 91177308-0d34-0410-b5e6-96231b3b80d8
2018-02-02 16:07:16 +00:00
Hiroshi Inoue
d1b456b6d1 [NFC] fix trivial typos in comments and documents
"to to" -> "to"



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@323628 91177308-0d34-0410-b5e6-96231b3b80d8
2018-01-29 05:17:03 +00:00
Matthias Braun
09008367fc Split MachineLICM into EarlyMachineLICM and MachineLICM; NFC
This avoids playing games with pseudo pass IDs and avoids using an
unreliable MRI::isSSA() check to determine whether register allocation
has happened.

Note that this renames:
- MachineLICMID -> EarlyMachineLICM
- PostRAMachineLICMID -> MachineLICMID
to be consistent with the EarlyTailDuplicate/TailDuplicate naming.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@322927 91177308-0d34-0410-b5e6-96231b3b80d8
2018-01-19 06:46:10 +00:00
Sanjoy Das
b166e6723e (Re-landing) Expose a TargetMachine::getTargetTransformInfo function
Re-land r321234.  It had to be reverted because it broke the shared
library build.  The shared library build broke because there was a
missing LLVMBuild dependency from lib/Passes (which calls
TargetMachine::getTargetIRAnalysis) to lib/Target.  As far as I can
tell, this problem was always there but was somehow masked
before (perhaps because TargetMachine::getTargetIRAnalysis was a
virtual function).

Original commit message:

This makes the TargetMachine interface a bit simpler.  We still need
the std::function in TargetIRAnalysis to avoid having to add a
dependency from Analysis to Target.

See discussion:
http://lists.llvm.org/pipermail/llvm-dev/2017-December/119749.html

I avoided adding all of the backend owners to this review since the
change is simple, but let me know if you feel differently about this.

Reviewers: echristo, MatzeB, hfinkel

Reviewed By: hfinkel

Subscribers: jholewinski, jfb, arsenm, dschuff, mcrosier, sdardis, nemanjai, nhaehnle, javed.absar, sbc100, jgravelle-google, aheejin, kbarton, llvm-commits

Differential Revision: https://reviews.llvm.org/D41464

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321375 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-22 18:21:59 +00:00
Sanjoy Das
0a1aae823b Revert "Expose a TargetMachine::getTargetTransformInfo function"
This reverts commit r321234.  It breaks the -DBUILD_SHARED_LIBS=ON build.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321243 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-21 02:34:39 +00:00
Sanjoy Das
0950f32a89 Expose a TargetMachine::getTargetTransformInfo function
Summary:
This makes the TargetMachine interface a bit simpler.  We still need
the std::function in TargetIRAnalysis to avoid having to add a
dependency from Analysis to Target.

See discussion:
http://lists.llvm.org/pipermail/llvm-dev/2017-December/119749.html

I avoided adding all of the backend owners to this review since the
change is simple, but let me know if you feel differently about this.

Reviewers: echristo, MatzeB, hfinkel

Reviewed By: hfinkel

Subscribers: jholewinski, jfb, arsenm, dschuff, mcrosier, sdardis, nemanjai, nhaehnle, javed.absar, sbc100, jgravelle-google, aheejin, kbarton, llvm-commits

Differential Revision: https://reviews.llvm.org/D41464

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@321234 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-21 01:06:58 +00:00
Valery Pykhtin
aeb7444c9b AMDGPU: Partial ILP scheduler port from SelectionDAG to SchedulingDAG (experimental)
Differential revision: https://reviews.llvm.org/D39897

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@318649 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-20 14:35:53 +00:00
David Blaikie
e3a9b4ce3a Fix a bunch more layering of CodeGen headers that are in Target
All these headers already depend on CodeGen headers so moving them into
CodeGen fixes the layering (since CodeGen depends on Target, not the
other way around).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@318490 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-17 01:07:10 +00:00
Yaxun Liu
ec1f0cc416 [AMDGPU] Change alloca addr space of r600 to 5 for amdgiz environment
Differential Revision: https://reviews.llvm.org/D39657


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317479 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-06 14:32:33 +00:00
Benjamin Kramer
63e6891819 [AMDGPU] Clean up symbols in the global namespace.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317051 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-31 23:21:30 +00:00
Matthias Braun
9385cf15d1 Revert "TargetMachine: Merge TargetMachine and LLVMTargetMachine"
Reverting to investigate layering effects of MCJIT not linking
libCodeGen but using TargetMachine::getNameWithPrefix() breaking the
lldb bots.

This reverts commit r315633.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315637 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-12 22:57:28 +00:00
Matthias Braun
a063107f8d TargetMachine: Merge TargetMachine and LLVMTargetMachine
Merge LLVMTargetMachine into TargetMachine.

- There is no in-tree target anymore that just implements TargetMachine
  but not LLVMTargetMachine.
- It should still be possible to stub out all the various functions in
  case a target does not want to use lib/CodeGen
- This simplifies the code and avoids methods ending up in the wrong
  interface.

Differential Revision: https://reviews.llvm.org/D38489

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315633 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-12 22:28:54 +00:00
Matt Arsenault
d341fb0564 AMDGPU: Fix incorrect selection of pseudo-branches
These should only be used if the machine structurizer is enabled.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315357 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-10 20:22:07 +00:00
Yaxun Liu
091c043b90 [AMDGPU] Lower enqueued blocks and generate runtime metadata
This patch adds a post-linking pass which replaces the function pointer of enqueued
block kernel with a global variable (runtime handle) and adds
runtime-handle attribute to the enqueued block kernel.

In LLVM CodeGen the runtime-handle metadata will be translated to
RuntimeHandle metadata in code object. Runtime allocates a global buffer
for each kernel with RuntimeHandel metadata and saves the kernel address
required for the AQL packet into the buffer. __enqueue_kernel function
in device library knows that the invoke function pointer in the block
literal is actually runtime handle and loads the kernel address from it
and puts it into AQL packet for dispatching.

This cannot be done in FE since FE cannot create a unique global variable
with external linkage across LLVM modules. The global variable with internal
linkage does not work since optimization passes will try to replace loads
of the global variable with its initialization value.

Differential Revision: https://reviews.llvm.org/D38610


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315352 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-10 19:39:48 +00:00
Stanislav Mekhanoshin
8b431a9469 [AMDGPU] Set fast-math flags on functions given the options
We have a single library build without relaxation options.
When inlined library functions remove fast math attributes
from the functions they are integrated into.

This patch sets relaxation attributes on the functions after
linking provided corresponding relaxation options are given.
Math instructions inside the inlined functions remain to have
no fast flags, but inlining does not prevent fast math
transformations of a surrounding caller code anymore.

Differential Revision: https://reviews.llvm.org/D38325

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314568 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-29 23:40:19 +00:00
Stanislav Mekhanoshin
b5a9104224 [AMDGPU] Fixed memory leak with inliner replaced
Delete inliner before replacing it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313723 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-20 06:34:28 +00:00
Stanislav Mekhanoshin
2e5d75b42d [AMDGPU] Fix regression in test clang/test/CodeGen/backend-unsupported-error.ll
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313718 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-20 06:10:15 +00:00
Stanislav Mekhanoshin
fbf0e1603c [AMDGPU] Port of HSAIL inliner
Differential Revision: https://reviews.llvm.org/D36849

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313714 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-20 04:25:58 +00:00
Matt Arsenault
4b385be048 AMDGPU: Run internalize symbols at -O0
The relocations used for externally visible functions
aren't supported, so the direct call emitted ends
up hitting a linker error.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313616 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-19 07:40:11 +00:00
Stanislav Mekhanoshin
9643e6bd78 [AMDGPU] Ported and adopted AMDLibCalls pass
The pass does simplifications of well known AMD library calls.
If given -amdgpu-prelink option it works in a pre-link mode which
allows to reference new library functions which will be linked in
later.

In addition it also used to process traditional AMD option
-fuse-native which allows to replace some of the functions with
their fast native implementations from the library.

The necessary glue to pass the prelink option and translate
-fuse-native is to be added to the driver.

Differential Revision: https://reviews.llvm.org/D36436

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@310731 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-11 16:42:09 +00:00
Tom Stellard
39aad0ab08 AMDGPU: Move R600 parts of AMDGPUISelDAGToDAG into their own class
Summary: This refactoring is required in order to split the R600 and GCN tablegen files.

Reviewers: arsenm

Subscribers: kzhuravl, wdng, nhaehnle, yaxunl, dstuttard, tpr, llvm-commits, t-tye

Differential Revision: https://reviews.llvm.org/D36286

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@310336 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-08 04:57:55 +00:00
Matt Arsenault
f1b4625e0b AMDGPU: Remove redundant opt level check
addOptimizedRegAlloc isn't used for -O0 already.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@310275 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-07 18:12:48 +00:00