47 Commits

Author SHA1 Message Date
Yaxun Liu
ab3be33d40 [AMDGPU] Get address space mapping by target triple environment
As we introduced target triple environment amdgiz and amdgizcl, the address
space values are no longer enums. We have to decide the value by target triple.

The basic idea is to use struct AMDGPUAS to represent address space values.
For address space values which are not depend on target triple, use static
const members, so that they don't occupy extra memory space and is equivalent
to a compile time constant.

Since the struct is lightweight and cheap, it can be created on the fly at
the point of usage. Or it can be added as member to a pass and created at
the beginning of the run* function.

Differential Revision: https://reviews.llvm.org/D31284


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298846 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-27 14:04:01 +00:00
George Burgess IV
3479ed63a6 Let llvm.objectsize be conservative with null pointers
This adds a parameter to @llvm.objectsize that makes it return
conservative values if it's given null.

This fixes PR23277.

Differential Revision: https://reviews.llvm.org/D28494


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298430 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-21 20:08:59 +00:00
Reid Kleckner
6707770d48 Rename AttributeSet to AttributeList
Summary:
This class is a list of AttributeSetNodes corresponding the function
prototype of a call or function declaration. This class used to be
called ParamAttrListPtr, then AttrListPtr, then AttributeSet. It is
typically accessed by parameter and return value index, so
"AttributeList" seems like a more intuitive name.

Rename AttributeSetImpl to AttributeListImpl to follow suit.

It's useful to rename this class so that we can rename AttributeSetNode
to AttributeSet later. AttributeSet is the set of attributes that apply
to a single function, argument, or return value.

Reviewers: sanjoy, javed.absar, chandlerc, pete

Reviewed By: pete

Subscribers: pete, jholewinski, arsenm, dschuff, mehdi_amini, jfb, nhaehnle, sbc100, void, llvm-commits

Differential Revision: https://reviews.llvm.org/D31102

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298393 91177308-0d34-0410-b5e6-96231b3b80d8
2017-03-21 16:57:19 +00:00
Stanislav Mekhanoshin
a1d4ee75a4 [AMDGPU] Account workgroup size in LDS occupancy limits
Functions matching LDS use to occupancy return results for a workgroup
of 64 workitems. The numbers has to be adjusted for bigger workgroups.
For example a workgroup of size 256 already occupies 4 waves just by
itself. Given that all numbers of LDS use in the compiler are per
workgroup, occupancy shall be multiplied by 4 in this case. Each 64
workitems still limited by the same number, but 4 subrgoups 64 workitems
each can afford 4 times more LDS to get the same occupancy.

In addition change initializes LDS size in the subtarget to a real value
for SI+ targets. This is required since LDS size is a variable in these
calculations.

Differential Revision: https://reviews.llvm.org/D29423

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293837 91177308-0d34-0410-b5e6-96231b3b80d8
2017-02-01 22:59:50 +00:00
Matthias Braun
88d207542b Cleanup dump() functions.
We had various variants of defining dump() functions in LLVM. Normalize
them (this should just consistently implement the things discussed in
http://lists.llvm.org/pipermail/cfe-dev/2014-January/034323.html

For reference:
- Public headers should just declare the dump() method but not use
  LLVM_DUMP_METHOD or #if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
- The definition of a dump method should look like this:
  #if !defined(NDEBUG) || defined(LLVM_ENABLE_DUMP)
  LLVM_DUMP_METHOD void MyClass::dump() {
    // print stuff to dbgs()...
  }
  #endif

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@293359 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-28 02:02:38 +00:00
Changpeng Fang
6562a033a3 AMDGPU/SI: Give up in promote alloca when a pointer may be captured.
Differential Revision:
  http://reviews.llvm.org/D28970

Reviewer:
  Matt

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292966 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-24 19:06:28 +00:00
Eugene Zelenko
68c521d030 [AMDGPU] Fix some Clang-tidy modernize and Include What You Use warnings; other minor fixes (NFC).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@292623 91177308-0d34-0410-b5e6-96231b3b80d8
2017-01-20 17:52:16 +00:00
Matt Arsenault
f4fb506ab9 AMDGPU: Fix AMDGPUPromoteAlloca breaking addrspacecasts
The users of the addrspacecast were having their types incorrectly
changed, producing invalid bitcasts between address spaces.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@289307 91177308-0d34-0410-b5e6-96231b3b80d8
2016-12-10 00:52:50 +00:00
Mehdi Amini
67f335d992 Use StringRef in Pass/PassManager APIs (NFC)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283004 91177308-0d34-0410-b5e6-96231b3b80d8
2016-10-01 02:56:57 +00:00
Konstantin Zhuravlyov
1f99c41083 [AMDGPU] Wave and register controls
- Implemented amdgpu-flat-work-group-size attribute
- Implemented amdgpu-num-active-waves-per-eu attribute
- Implemented amdgpu-num-sgpr attribute
- Implemented amdgpu-num-vgpr attribute
- Dynamic LDS constraints are in a separate patch

Patch by Tom Stellard and Konstantin Zhuravlyov

Differential Revision: https://reviews.llvm.org/D21562



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@280747 91177308-0d34-0410-b5e6-96231b3b80d8
2016-09-06 20:22:28 +00:00
David Majnemer
975248e4fb Use the range variant of find instead of unpacking begin/end
If the result of the find is only used to compare against end(), just
use is_contained instead.

No functionality change is intended.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@278433 91177308-0d34-0410-b5e6-96231b3b80d8
2016-08-11 22:21:41 +00:00
Matt Arsenault
1b96f3c048 AMDGPU: Remove pointless dyn_cast_or_null
This is already casted above so non-null

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275881 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-18 19:00:07 +00:00
Matt Arsenault
865e2fa1dc AMDGPU: Remove dead check in AMDGPUPromoteAlloca
This is currently only called with GEP users. A direct
alloca would only happen with current typed pointers
for arrays which are a perverse case.

Also fix crashes on 0 x and 1 x arrays.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275869 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-18 18:34:53 +00:00
Matt Arsenault
797b9ee060 AMDGPU: Remove dead code and redundant check
Non intrinsic calls aren't really handled, and this
IntrinsicInst dyn_cast checks for the function for us.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275868 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-18 18:34:48 +00:00
Nicolai Haehnle
0c05ce4746 AMDGPU: Disable AMDGPUPromoteAlloca pass for shader calling conventions.
Summary:
The work item intrinsics are not available for the shader
calling conventions. And even if we did hook them up most
shader stages haves some extra restrictions on the amount
of available LDS.

Reviewers: tstellarAMD, arsenm

Subscribers: nhaehnle, arsenm, llvm-commits, kzhuravl

Differential Revision: https://reviews.llvm.org/D20728

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@275779 91177308-0d34-0410-b5e6-96231b3b80d8
2016-07-18 09:02:47 +00:00
Matt Arsenault
dca409d5ad AMDGPU: Move subtarget feature checks into passes
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@273937 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-27 20:32:13 +00:00
Peter Collingbourne
63b34cdf34 IR: Introduce local_unnamed_addr attribute.
If a local_unnamed_addr attribute is attached to a global, the address
is known to be insignificant within the module. It is distinct from the
existing unnamed_addr attribute in that it only describes a local property
of the module rather than a global property of the symbol.

This attribute is intended to be used by the code generator and LTO to allow
the linker to decide whether the global needs to be in the symbol table. It is
possible to exclude a global from the symbol table if three things are true:
- This attribute is present on every instance of the global (which means that
  the normal rule that the global must have a unique address can be broken without
  being observable by the program by performing comparisons against the global's
  address)
- The global has linkonce_odr linkage (which means that each linkage unit must have
  its own copy of the global if it requires one, and the copy in each linkage unit
  must be the same)
- It is a constant or a function (which means that the program cannot observe that
  the unique-address rule has been broken by writing to the global)

Although this attribute could in principle be computed from the module
contents, LTO clients (i.e. linkers) will normally need to be able to compute
this property as part of symbol resolution, and it would be inefficient to
materialize every module just to compute it.

See:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160509/356401.html
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20160516/356738.html
for earlier discussion.

Part of the fix for PR27553.

Differential Revision: http://reviews.llvm.org/D20348

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@272709 91177308-0d34-0410-b5e6-96231b3b80d8
2016-06-14 21:01:22 +00:00
Matt Arsenault
44aaff08ed AMDGPU: Fix promote alloca for pointer loads
If the load has a pointer type, we don't want to change
its type.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@270000 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-18 23:20:24 +00:00
Matt Arsenault
41cf920df5 AMDGPU: Handle alloca promoting with null operands
If the second pointer in a multi-pointer instruction is
a constant, we can replace the type.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@269945 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-18 15:57:21 +00:00
Matt Arsenault
7985e4be56 AMDGPU: Fix promote alloca pass creating huge arrays
This was assuming it could use all memory before, which is
a bad decision because it restricts occupancy.

By default, only try to use enough space that could reduce
occupancy to 7, an arbitrarily chosen limit.

Based on the exist LDS usage, try to round up to the limit
in the current tier instead of further hurting occupancy.
This isn't ideal, because it doesn't accurately know how much
space is going to be used for alignment padding.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@269708 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-16 21:19:59 +00:00
Matt Arsenault
8adf4ebcf3 AMDGPU: Fix breaking IR on instructions with multiple pointer operands
The promote alloca pass would attempt to promote an alloca with
a select, icmp, or phi user, even though the other operand was
from a non-promotable source, producing a select on two different
pointer types.

Only do this if we know that both operands derive from the same
alloca. In the future we should be able to relax this to an alloca
which will also be promoted.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@269265 91177308-0d34-0410-b5e6-96231b3b80d8
2016-05-12 01:58:58 +00:00
Matt Arsenault
3ba7927b46 AMDGPU: Fix mishandling array allocations when promoting alloca
The canonical form for allocas is a single allocation of the array type.
In case we see a non-canonical array alloca, make sure we aren't
replacing this with an array N times smaller.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267916 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-28 18:38:48 +00:00
Matt Arsenault
38099e5394 AMDGPU: Account for globals in AMDGPUPromoteAlloca pass
Patch by Bas Nieuwenhuizen

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267791 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-27 21:05:08 +00:00
Andrew Kaylor
c7ca1302cf Add optimization bisect opt-in calls for AMDGPU passes
Differential Revision: http://reviews.llvm.org/D19450



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@267485 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-25 22:23:44 +00:00
Tom Stellard
510a2b9622 AMDGPU: allow specifying a workgroup size that needs to fit in a compute unit
Summary:
For GL_ARB_compute_shader we need to support workgroup sizes of at least 1024. However, if we want to allow large workgroup sizes, we may need to use less registers, as we have to run more waves per SIMD.

This patch adds an attribute to specify the maximum work group size the compiled program needs to support. It defaults, to 256, as that has no wave restrictions.

Reducing the number of registers available is done similarly to how the registers were reserved for chips with the sgpr init bug.

Reviewers: mareko, arsenm, tstellarAMD, nhaehnle

Subscribers: FireBurn, kerberizer, llvm-commits, arsenm

Differential Revision: http://reviews.llvm.org/D18340

Patch By: Bas Nieuwenhuizen

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@266337 91177308-0d34-0410-b5e6-96231b3b80d8
2016-04-14 16:27:07 +00:00
Matt Arsenault
fc80e900d8 AMDGPU: Promote alloca should skip volatiles
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264214 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-23 23:17:29 +00:00
Matt Arsenault
6da276af59 AMDGPU: Don't use InstVisitor for AMDGPUPromoteAlloca
Frontend authors are strongly encouraged to keep allocas
in the entry block, so don't bother visiting every instruction
in the other blocks of the function.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263206 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-11 08:20:50 +00:00
Matt Arsenault
420f9c1154 AMDGPU: Remove a fixme for ptrrtoint handling
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262854 91177308-0d34-0410-b5e6-96231b3b80d8
2016-03-07 21:12:46 +00:00
Matt Arsenault
d1d0a1a39d AMDGPU: Preserve alignments on new created globals
Also switch to internal linkage, and include the name of the function in
the name.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259911 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-05 19:47:23 +00:00
Matt Arsenault
f1f2dd4ca2 AMDGPU: Do not promote allocas with non-inbounds GEPs
If we can't assume the pointer value isn't within the bounds
of the object, it seems risky to try to replace the pointer
calculations.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259573 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-02 21:16:12 +00:00
Matt Arsenault
551787639e AMDGPU: Handle promoting memmove
Also add missing tests for the others.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259558 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-02 20:28:10 +00:00
Matt Arsenault
ec856e4504 AMDGPU: Skip promote alloca with no optimizations
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259551 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-02 19:32:42 +00:00
Matt Arsenault
ed9c1ce27a AMDGPU: Minor cleanups for AMDGPUPromoteAlloca
Mostly convert to use range loops.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259550 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-02 19:32:35 +00:00
Matt Arsenault
d01172110e AMDGPU: Report AMDGPUPromoteAlloca changed the function
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259547 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-02 19:18:57 +00:00
Matt Arsenault
53b80ebb13 AMDGPU: Whitelist handled intrinsics
We shouldn't crash on unhandled intrinsics.
Also simplify failure handling in loop.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259546 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-02 19:18:53 +00:00
Matt Arsenault
374613d697 AMDGPU: Use inbounds when calculating workitem offset
When promoting allocas to LDS, we know we are indexing
into a specific area just created, and the calculation
will also never overflow.

Also emit some of the muls as nsw nuw, because instcombine
infers this already from the range metadata. I think
putting this on the other adds and muls might be OK too,
but I'm not 100% sure.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259545 91177308-0d34-0410-b5e6-96231b3b80d8
2016-02-02 19:18:48 +00:00
Matt Arsenault
084a86c382 AMDGPU: Fix emitting invalid workitem intrinsics for HSA
The AMDGPUPromoteAlloca pass was emitting the read.local.size
calls, which with HSA was incorrectly selected to reading from
the offset mesa uses off of the kernarg pointer.

Error on intrinsics which aren't supported by HSA, and start
emitting the correct IR to read the workgroup size
out of the dispatch pointer.

Also initialize the pass so it can be tested with opt, and
start moving towards not depending on the subtarget as an
argument.

Start emitting errors for the intrinsics not handled with HSA.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@259297 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-30 05:19:45 +00:00
Matt Arsenault
584bbb20e9 AMDGPU: Fix crash with invariant markers
The promote alloca pass didn't handle these intrinsics and crashed.
These intrinsics should accept any address space, but for now just
erase them to avoid breaking.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@258537 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-22 19:47:54 +00:00
Manuel Jacob
75e1cfb035 GlobalValue: use getValueType() instead of getType()->getPointerElementType().
Reviewers: mjacob

Subscribers: jholewinski, arsenm, dsanders, dblaikie

Patch by Eduard Burtescu.

Differential Revision: http://reviews.llvm.org/D16260


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257999 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-16 20:30:46 +00:00
Pete Cooper
6d024c616a Revert "Change memcpy/memset/memmove to have dest and source alignments."
This reverts commit r253511.

This likely broke the bots in
http://lab.llvm.org:8011/builders/clang-ppc64-elf-linux2/builds/20202
http://bb.pgr.jp/builders/clang-3stage-i686-linux/builds/3787

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253543 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-19 05:56:52 +00:00
Pete Cooper
8b170f7f29 Change memcpy/memset/memmove to have dest and source alignments.
Note, this was reviewed (and more details are in) http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20151109/312083.html

These intrinsics currently have an explicit alignment argument which is
required to be a constant integer.  It represents the alignment of the
source and dest, and so must be the minimum of those.

This change allows source and dest to each have their own alignments
by using the alignment attribute on their arguments.  The alignment
argument itself is removed.

There are a few places in the code for which the code needs to be
checked by an expert as to whether using only src/dest alignment is
safe.  For those places, they currently take the minimum of src/dest
alignments which matches the current behaviour.

For example, code which used to read:
  call void @llvm.memcpy.p0i8.p0i8.i32(i8* %dest, i8* %src, i32 500, i32 8, i1 false)
will now read:
  call void @llvm.memcpy.p0i8.p0i8.i32(i8* align 8 %dest, i8* align 8 %src, i32 500, i1 false)

For out of tree owners, I was able to strip alignment from calls using sed by replacing:
  (call.*llvm\.memset.*)i32\ [0-9]*\,\ i1 false\)
with:
  $1i1 false)

and similarly for memmove and memcpy.

I then added back in alignment to test cases which needed it.

A similar commit will be made to clang which actually has many differences in alignment as now
IRBuilder can generate different source/dest alignments on calls.

In IRBuilder itself, a new argument was added.  Instead of calling:
  CreateMemCpy(Dst, Src, getInt64(Size), DstAlign, /* isVolatile */ false)
you now call
  CreateMemCpy(Dst, Src, getInt64(Size), DstAlign, SrcAlign, /* isVolatile */ false)

There is a temporary class (IntegerAlignment) which takes the source alignment and rejects
implicit conversion from bool.  This is to prevent isVolatile here from passing its default
parameter to the source alignment.

Note, changes in future can now be made to codegen.  I didn't change anything here, but this
change should enable better memcpy code sequences.

Reviewed by Hal Finkel.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253511 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-18 22:17:24 +00:00
Duncan P. N. Exon Smith
3d5dddde47 AMDGPU: Remove implicit ilist iterator conversions, NFC
One of the changes in lib/Target/AMDGPU/AMDGPUMCInstLower.cpp was a new
one.  Previously, bundle iterators and single-instruction iterators
could be compared to each other (comparing on underlying pointers).
I changed a comparison from using `MBB->end()` to using
`MBB->instr_end()`, since both end iterators should point at the some
place anyway.

I don't think the implicit conversion between the two iterator types is
a good idea since it's fairly easy to accidentally compare to the wrong
thing (they aren't always end iterators).  Otherwise I would have just
added the conversion.

Even with that, no there should be functionality change here.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@250218 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-13 20:07:10 +00:00
Matt Arsenault
ca675427d2 AMDGPU: Produce error on dynamic_stackalloc
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246048 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-26 18:37:13 +00:00
Craig Topper
84bbcfe200 De-constify pointers to Type since they can't be modified. NFC
This was already done in most places a while ago. This just fixes the ones that crept in over time.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243842 91177308-0d34-0410-b5e6-96231b3b80d8
2015-08-01 22:20:21 +00:00
Matt Arsenault
03b49c843a AMDGPU: Don't try to use LDS/vector for private if pointer value stored
If the pointer is the store's value operand, this would produce
a broken module. Make sure the use is actually for the pointer operand.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243462 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-28 18:47:00 +00:00
Matt Arsenault
7a1c02d1f7 AMDGPU: Fix crash if called function is a bitcast
getCalledFunction() is null, so this would crash. Replace
crash with an error on unsupported call.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@243461 91177308-0d34-0410-b5e6-96231b3b80d8
2015-07-28 18:29:14 +00:00
Tom Stellard
953c681473 R600 -> AMDGPU rename
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@239657 91177308-0d34-0410-b5e6-96231b3b80d8
2015-06-13 03:28:10 +00:00