33667 Commits

Author SHA1 Message Date
James Y Knight
c1b271062b Fix gold test after r256465.
That commit added a new pass, and this test is sensitive to what the
first pass after verify is called.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256532 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-29 03:48:37 +00:00
Eric Christopher
8d5b76e5d4 Accept dwarf version 5 for CIE versions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256527 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-28 23:02:42 +00:00
Artyom Skrobov
1714cbd44c [Thumb] Fix assembler error 'cannot honor width suffix pop {lr}'
Summary:
* avoid generating POP {LR} in Thumb1 epilogues
* combine MOV LR, Rx + BX LR -> BX Rx in a peephole optimization pass
* combine POP {LR} + B + BX LR -> POP {PC} on v5T+
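
The second bullet above (MOV LR, Rx + BX LR -> BX Rx) can be sketched as a toy peephole over a list of instruction tuples. This is only an illustration in Python, not the pass from the patch: the tuple encoding and the function name `combine_mov_lr_bx_lr` are assumptions made for this example, and the sketch only matches the two instructions when they are adjacent.

```python
def combine_mov_lr_bx_lr(instrs):
    """Rewrite ('MOV', 'LR', Rx) directly followed by ('BX', 'LR') into ('BX', Rx).

    Instructions are modelled as plain tuples (a hypothetical encoding).
    """
    out = []
    i = 0
    while i < len(instrs):
        cur = instrs[i]
        nxt = instrs[i + 1] if i + 1 < len(instrs) else None
        is_mov_to_lr = len(cur) == 3 and cur[0] == "MOV" and cur[1] == "LR"
        if is_mov_to_lr and nxt == ("BX", "LR"):
            # Branch through the source register directly; the copy into LR
            # is no longer needed because BX LR was its only consumer here.
            out.append(("BX", cur[2]))
            i += 2
        else:
            out.append(cur)
            i += 1
    return out


epilogue = [("POP", "R3"), ("MOV", "LR", "R3"), ("BX", "LR")]
rewritten = combine_mov_lr_bx_lr(epilogue)
```

A real peephole would also have to confirm that nothing else reads LR between the two instructions; the adjacency requirement here sidesteps that check.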

Test cases by Ana Pazos

Differential Revision: http://reviews.llvm.org/D15707

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256523 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-28 21:40:45 +00:00
Sanjay Patel
a5063429b2 [x86] lower calls to fmin and llvm.minnum.* using minss/minsd/minps/minpd (PR24475)
This is a follow-on to:
http://reviews.llvm.org/rL255700
http://reviews.llvm.org/rL256454
http://reviews.llvm.org/rL256510



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256522 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-28 21:16:55 +00:00
Manuel Jacob
912373de69 [RS4GC] Fix rematerialization of bitcast of bitcast.
Summary:
Previously, only the outer (last) bitcast was rematerialized, resulting in a
use of the unrelocated inner (first) bitcast after the statepoint.  See the
test case for an example.

Reviewers: igor-laevsky, reames

Subscribers: reames, alex, llvm-commits, sanjoy

Differential Revision: http://reviews.llvm.org/D15789

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256520 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-28 20:14:05 +00:00
Elena Demikhovsky
84f6badccc Implemented cost model for masked gather and scatter operations
The cost is calculated for all X86 targets. When gather/scatter instruction
is not supported we calculate the cost of scalar sequence.

Differential revision: http://reviews.llvm.org/D15677



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256519 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-28 20:10:59 +00:00
Sanjay Patel
5fb72b3250 [x86] lower calls to fmax and llvm.maxnum.* using maxps/maxpd (PR24475)
This is a follow-on to:
http://reviews.llvm.org/rL255700
http://reviews.llvm.org/rL256454



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256510 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-28 19:20:19 +00:00
Sanjay Patel
3e50893e8e Specify triple so 'make check' passes on darwin x86-64
The check lines were added with:
http://reviews.llvm.org/rL256458
http://reviews.llvm.org/rL256460

but on a darwin target, the output looks like:
  ## InlineAsm Start
  rorq  %rdi
  ## InlineAsm End
  ## InlineAsm Start
  rorq  %rsi
  ## InlineAsm End
  leaq  (%rsi,%rdi), %rax
  retq




git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256507 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-28 18:28:44 +00:00
Roman Divacky
d1070bbc62 Support clrex instruction on ARMv6k. Patch by Andrew Turner.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256505 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-28 17:47:23 +00:00
Michael Kuperstein
651ff526af [X86] Better support for the MCU psABI (LLVM part)
This adds support for the MCU psABI in a way different from r251223 and r251224,
basically reverting most of these two patches. The problem with the approach
taken in r251223/4 is that it only handled libcalls that originated from the backend.
However, the mid-end also inserts quite a few libcalls and assumes these use the
platform's default calling convention.

The previous patch tried to insert inregs when necessary both in the FE and,
somewhat hackily, in the CG. Instead, we now define a new default calling convention
for the MCU, which doesn't use inreg marking at all, similarly to what x86-64 does.

Differential Revision: http://reviews.llvm.org/D15054

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256494 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-28 14:39:21 +00:00
Asaf Badouh
199a1320b7 [X86][AVX512] Lower broadcast sub vector to vector intrinsics
Lower broadcast<type>x<vector> to shuffles.
There are two cases:
1. src is 128 bits and dest is 512 bits: in this case we lower it to a shuffle with imm = 0.
2. src is 256 bits and dest is 512 bits: in this case we lower it to a shuffle with imm = 01000100b (0x44), which broadcasts the 256-bit source: ymm[0,1,2,3] => zmm[0,1,2,3,0,1,2,3]. The result is then masked with the passthru value (if it is a masked op).
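
The two immediates can be checked with a small Python model of a 512-bit shuffle that selects 128-bit lanes. The commit message does not name the shuffle instruction, so this sketch assumes VSHUFF32X4/VSHUFF64X2-style selection (two 2-bit selectors into the first source, then two into the second); the function name is made up for the example.

```python
def shuffle_128bit_lanes(src1, src2, imm8):
    """Model lane selection of a 512-bit, 128-bit-granular shuffle.

    src1 and src2 are lists of four 128-bit lanes. Destination lanes 0 and 1
    are picked from src1, lanes 2 and 3 from src2, two immediate bits each.
    """
    assert len(src1) == 4 and len(src2) == 4
    return [
        src1[imm8 & 0b11],
        src1[(imm8 >> 2) & 0b11],
        src2[(imm8 >> 4) & 0b11],
        src2[(imm8 >> 6) & 0b11],
    ]


# The source is placed in the low lanes of a 512-bit register and used as
# both shuffle inputs; "L2"/"L3" stand for the unused upper lanes.
src = ["L0", "L1", "L2", "L3"]
broadcast_128 = shuffle_128bit_lanes(src, src, 0x00)
broadcast_256 = shuffle_128bit_lanes(src, src, 0x44)
```

With imm = 0 every destination lane is lane 0 of the source, and with imm = 0x44 the destination is lanes 0,1,0,1, which matches the ymm => zmm pattern described in case 2.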



Differential Revision: http://reviews.llvm.org/D15790



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256490 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-28 08:26:26 +00:00
Asaf Badouh
518acfca44 [X86][AVX512] add fp scalar broadcast intrinsics
Differential Revision: http://reviews.llvm.org/D15790


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256489 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-28 08:09:25 +00:00
Craig Topper
7112ca8ed8 [AVX512] Bring vmovq instruction names into alignment with the AVX and SSE names. Add a missing encoding to disassembler and assembler.
I believe this also fixes a case where a 64-bit memory form that is documented as being unsupported in 32-bit mode was able to be selected there.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256483 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-28 06:11:42 +00:00
Igor Breger
3f202fdf9e AVX512: Change VPMOVB2M DAG lowering, use CVT2MASK node instead of TRUNCATE.
Fix lowering of TRUNCATE from vector to vector of i1: use LSB and not MSB.
Implement VPMOVB/W/D/Q2M intrinsic.
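
The distinction between the two nodes can be sketched in Python for byte elements: truncation to i1 keeps the least significant bit, while VPMOVB2M builds the mask from the most significant bit of each byte. The function names are illustrative only and are not LLVM APIs.

```python
def truncate_to_i1(elements):
    """Truncate each integer element to i1: keep only the low bit."""
    return [e & 1 for e in elements]


def cvt2mask_bytes(elements):
    """Model VPMOVB2M: each mask bit is the top (sign) bit of a byte."""
    return [(e >> 7) & 1 for e in elements]


lanes = [0x01, 0x80, 0xFF, 0x00]
trunc_bits = truncate_to_i1(lanes)
mask_bits = cvt2mask_bytes(lanes)
```

The two results differ for 0x01 and 0x80, which is why one node cannot stand in for the other.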

Differential Revision: http://reviews.llvm.org/D15675

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256470 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-27 13:56:16 +00:00
Chandler Carruth
d97964253e [attrs] Extract the pure inference of function attributes into
a standalone pass.

There is no call graph or even interesting analysis for this part of
function attributes -- it is literally inferring attributes based on the
target library identification. As such, we can do it using a much
simpler module pass that just walks the declarations. This can also
happen much earlier in the pass pipeline which has benefits for any
number of other passes.

In the process, I've cleaned up one particular aspect of the logic which
was necessary in order to separate the two passes cleanly. It now counts
inferred attributes independently rather than just counting all the
inferred attributes as one, and the counts are more clearly explained.

The two test cases we had for this code path are both ... woefully
inadequate and copies of each other. I've kept the superset test and
updated it. We need more testing here, but I had to pick somewhere to
stop fixing everything broken I saw here.

Differential Revision: http://reviews.llvm.org/D15676

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256466 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-27 08:41:34 +00:00
Chandler Carruth
6a1ce8ecd6 [attrs] Split off the forced attributes utility into its own pass that
is (by default) run much earlier than FunctionAttrs proper.

This allows forcing optnone or other widely impactful attributes. It is
also a bit simpler as the force attribute behavior needs no specific
iteration order.

I've added the pass into the default module pass pipeline and LTO pass
pipeline which mirrors where function attrs itself was being run.

Differential Revision: http://reviews.llvm.org/D15668

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256465 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-27 08:13:45 +00:00
David Majnemer
13f5e7d35c Make the test properly constrained
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256460 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-27 06:26:41 +00:00
David Majnemer
2b72c52b3f Try to pacify buildbot
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256458 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-27 06:18:48 +00:00
NAKAMURA Takumi
b6019f26bc Prune the feature "tls". No one is using it since TLS is enabled for Cygwin.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256457 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-27 06:14:33 +00:00
David Majnemer
1342075082 [X86, Win64] Use a frame pointer if pushf is emitted
A frame pointer must be used if stack pointer is modified after the
prologue.  LLVM will emit pushf/popf if we need to save/restore the
FLAGS register, requiring us to have a frame pointer for the function.

There is a small twist: this sequence might exist in user code via
inline-assembly.  For now, conservatively assume that such functions
require a frame pointer.  For real world justification, please see
clang's implementation of __readeflags.

This fixes PR25945.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256456 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-27 06:07:26 +00:00
David Majnemer
7de4c3d125 [WinEH] Add comments explaining the EH tables
This aids in debugging WinEH; similar functionality is present for
DWARF EH.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256455 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-27 06:07:12 +00:00
Sanjay Patel
39227c5849 [x86] lower calls to llvm.maxnum.v4f32 using maxps
This is a follow-on to:
http://reviews.llvm.org/rL255700



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256454 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-26 21:44:55 +00:00
Benjamin Kramer
1c6a388c07 Fix safepoint intrinsic signatures in test.
Should bring back the bots after r256443.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256450 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-26 11:40:48 +00:00
Chen Li
955318d58d [gc.statepoint] Change gc.statepoint intrinsic's return type to token type instead of i32 type
Summary: This patch changes gc.statepoint intrinsic's return type to token type instead of i32 type. Using token types prevents LLVM from merging different gc.statepoint nodes into PHI nodes, which would cause further problems with gc relocations. The patch also changes how gc.relocate and gc.result look for their corresponding gc.statepoint on the unwind path. The current implementation uses the selector value extracted from a { i8*, i32 } landingpad as a hook to find the gc.statepoint, while the patch directly uses a token type landingpad (http://reviews.llvm.org/D15405) to find the gc.statepoint.

Reviewers: sanjoy, JosephTremoulet, pgavlin, igor-laevsky, mjacob

Subscribers: reames, mjacob, sanjoy, llvm-commits

Differential Revision: http://reviews.llvm.org/D15662

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256443 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-26 07:54:32 +00:00
Craig Topper
d8ff987182 Add test case for r256433. "[X86] Fix shuffle decoding for variable VPERMIL to be tolerant of the Constant type not matching due to folding in the constant pool and to get VPERMILPD correct."
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256435 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-26 04:58:05 +00:00
Craig Topper
7d5d4dd66a Revert r256432 "Test"
This is the test case for r256433, but it got committed incorrectly in my local repo.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256434 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-26 04:56:51 +00:00
Craig Topper
eef8544f1d Test
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256432 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-26 04:50:01 +00:00
Dan Gohman
005cc9c500 [WebAssembly] Fix handling of COPY instructions in WebAssemblyRegStackify.
Move RegStackify after coalescing and teach it to use LiveIntervals instead
of depending on SSA form. This avoids a problem where a register in a COPY
instruction is stackified and then subsequently coalesced with a register
that is not stackified.

This also puts it after the scheduler, which allows us to simplify the
EXPR_STACK constraint, as we no longer have instructions being reordered
after stackification and before coloring.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256402 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-25 00:31:02 +00:00
Sanjay Patel
75759ab3e9 [InstCombine] transform more extract/insert pairs into shuffles (PR2109)
This is an extension of the shuffle combining from r203229:
http://reviews.llvm.org/rL203229

The idea is to widen a short input vector with undef elements so the
existing shuffle transform for extract/insert can kick in.

The motivation is to finally solve PR2109:
https://llvm.org/bugs/show_bug.cgi?id=2109

For that example, the IR becomes:

%1 = bitcast <2 x i32>* %P to <2 x float>*
%ld1 = load <2 x float>, <2 x float>* %1, align 8
%2 = shufflevector <2 x float> %ld1, <2 x float> undef, <4 x i32> <i32 0, i32 1, i32 undef, i32 undef>
%i2 = shufflevector <4 x float> %A, <4 x float> %2, <4 x i32> <i32 0, i32 1, i32 4, i32 5>
ret <4 x float> %i2

And x86 SSE output improves from:

movq	(%rdi), %xmm1           ## xmm1 = mem[0],zero
movdqa	%xmm1, %xmm2
shufps	$229, %xmm2, %xmm2      ## xmm2 = xmm2[1,1,2,3]
shufps	$48, %xmm0, %xmm1       ## xmm1 = xmm1[0,0],xmm0[3,0]
shufps	$132, %xmm1, %xmm0      ## xmm0 = xmm0[0,1],xmm1[0,2]
shufps	$32, %xmm0, %xmm2       ## xmm2 = xmm2[0,0],xmm0[2,0]
shufps	$36, %xmm2, %xmm0       ## xmm0 = xmm0[0,1],xmm2[2,0]
retq

To the almost optimal:

movhpd	(%rdi), %xmm0

Note: There's a tension in the existing transform related to generating
arbitrary shufflevector masks. We avoid that in other places in InstCombine
because we're scared that codegen can't handle strange masks, but it looks
like we're ok with producing those here. I purposely chose weird insert/extract
indexes for the regression tests to see the effect in these cases. 
For PowerPC+Altivec, AArch64, and X86+SSE/AVX, I think the codegen is equal or
better for these examples.
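
The two shufflevector instructions in the IR above can be traced with a small Python model. This is only a sketch of the mask semantics (indices select from the concatenation of both operands, undef stays undef); the element names are placeholders.

```python
UNDEF = None  # stands in for an undef element or an undef mask index


def shufflevector(a, b, mask):
    """Select elements from a ++ b by index; undef mask entries give undef."""
    assert len(a) == len(b)
    concat = list(a) + list(b)
    return [UNDEF if m is UNDEF else concat[m] for m in mask]


# %2: widen the loaded <2 x float> to <4 x float> with undef elements.
ld1 = ["p0", "p1"]
widened = shufflevector(ld1, [UNDEF, UNDEF], [0, 1, UNDEF, UNDEF])

# %i2: keep the low half of %A and take the two loaded elements for the rest.
A = ["a0", "a1", "a2", "a3"]
result = shufflevector(A, widened, [0, 1, 4, 5])
```

The final vector holds the low two elements of %A followed by the two loaded floats, which is the pattern a single movhpd can produce.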

Differential Revision: http://reviews.llvm.org/D15096



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256394 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-24 21:17:56 +00:00
Asaf Badouh
5c7343b3a6 [X86][PKU] Add {RD,WR}PKRU encoding
Differential Revision: http://reviews.llvm.org/D15711

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256366 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-24 08:25:00 +00:00
Elena Demikhovsky
52ebd43338 AVX-512: Kreg set 0/1 optimization
The patterns that set a mask register to 0/1
KXOR %kn, %kn, %kn / KXNOR %kn, %kn, %kn
are replaced with
KXOR %k0, %k0, %kn / KXNOR %k0, %k0, %kn - AVX-512 targets optimization.

KNL does not recognize dependency-breaking idioms for mask registers,
so kxnor %k1, %k1, %k2 has a RAW dependence on %k1.
Using %k0 as the undef input register is a performance heuristic based
on the assumption that %k0 is used less frequently than the other mask
registers, since it is not usable as a write mask.
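
The rewrite is safe because the result of XOR or XNOR of a register with itself does not depend on the register's contents. A minimal Python check, assuming 16-bit mask registers for illustration:

```python
MASK16 = 0xFFFF  # width of the modelled mask register (assumed 16 bits here)


def kxor(a, b):
    """Bitwise XOR of two mask values."""
    return (a ^ b) & MASK16


def kxnor(a, b):
    """Bitwise XNOR of two mask values."""
    return ~(a ^ b) & MASK16


# Whatever value the input register holds, x ^ x is all zeros and
# ~(x ^ x) is all ones, so any register can serve as the input.
zeros = {kxor(k, k) for k in (0x0000, 0x1234, 0xFFFF)}
ones = {kxnor(k, k) for k in (0x0000, 0x1234, 0xFFFF)}
```

Since the value is the same for any input, the choice of %k0 only affects which false dependency the hardware sees, not the result.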

Differential Revision: http://reviews.llvm.org/D15739



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256365 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-24 08:12:22 +00:00
Igor Breger
d7d8cb8af1 AVX512: VPMOVM2B/W/D/Q intrinsic implementation.
Differential Revision: http://reviews.llvm.org//D15747

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256364 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-24 07:11:53 +00:00
Tom Stellard
abdebe3a07 AMDGPU/SI: Fix encoding of flat instructions on VI
Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D15735

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256360 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-24 03:18:18 +00:00
JF Bastien
c2a3785141 WebAssembly: remove 'external' from test
Summary: Linker testing was sad at seeing an unresolved external symbol. For now don't do that: it's valid, but we're not playing with multi-file linking yet, and the LLVM tests are used as hacky sanity tests for single-file linking (the GCC torture tests are much better for this purpose). Another solution would be to use '.extern' to make the intent explicit (don't single-file link this, there's an unresolved symbol). Some assemblers use '.extern' while others ignore it, so we wouldn't really be inventing anything new.

Reviewers: sunfish, kripken

Subscribers: jfb, llvm-commits, dschuff

Differential Revision: http://reviews.llvm.org/D15753

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256353 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-23 23:56:13 +00:00
Philip Reames
da801219ba [Statepoints] Use Indirect operands for spill slots
Teach the statepoint lowering code to emit Indirect stackmap entries for spills inserted by StatepointLowering (i.e. SelectionDAG), but Direct stackmap entries for in-IR allocas which represent manual stack slots. This is what the docs call for (http://llvm.org/docs/StackMaps.html#stack-map-format), but we've been emitting both as Direct. This was pointed out recently on the mailing list as a bug. It also blocks http://reviews.llvm.org/D15632, which extends the lowering to handle vector-of-pointers, since only Indirect references can encode a variable-sized slot.

To implement this, I introduced a new flag on the StackObject class used to maintain information about stack slots. I originally considered (and prototyped in http://reviews.llvm.org/D15632) the idea of using the existing isSpillSlot flag, but ended up deciding that was a bit too risky and that the cost of adding a new flag was low. Having the new flag will also allow us, in the future, to emit better comments in verbose assembly which indicate where a particular stack spill around a call comes from (deopt, gc, regalloc).
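
The choice between the two location kinds can be sketched in Python. The numeric values follow the location encoding in the stack map format documentation linked above; the function and its `is_statepoint_spill` parameter are hypothetical names for this illustration, since the commit message does not give the name of the new flag.

```python
from enum import IntEnum


class LocationKind(IntEnum):
    """Location kinds as encoded in the stack map format."""
    REGISTER = 1
    DIRECT = 2       # the address Reg + Offset itself (e.g. an alloca)
    INDIRECT = 3     # the value stored at [Reg + Offset] (a spilled value)
    CONSTANT = 4
    CONST_INDEX = 5


def location_for_stack_slot(is_statepoint_spill):
    """Pick the stackmap location kind for a stack slot.

    Spills created during statepoint lowering hold the value itself, so they
    are Indirect; in-IR allocas are referenced by address, so they are Direct.
    """
    if is_statepoint_spill:
        return LocationKind.INDIRECT
    return LocationKind.DIRECT


spill_kind = location_for_stack_slot(True)
alloca_kind = location_for_stack_slot(False)
```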

Differential Revision: http://reviews.llvm.org/D15759



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256352 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-23 23:44:28 +00:00
Adrian Prantl
8eb4aa5936 llvm-dwarfdump: Add support for dumping .dSYM bundles.
This replicates the logic of Darwin dwarfdump for manually opening up
.dSYM bundles without introducing any new dependencies.
<rdar://problem/20491670>

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256350 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-23 21:51:13 +00:00
Simon Pilgrim
023935c3d9 [X86][AVX] Only shuffle the lower half of vectors if the upper half is undefined
First step towards making better use of AVX's implicit zeroing of the upper half of a 256-bit vector by instructions that only act on the lower 128-bit vector - discussed on D14151.

As well as the fact that 128-bit shuffle instructions are generally more capable, this can be performant for older CPUs with 128-bit ALUs (e.g. Jaguar, Sandy Bridge) that must treat 256-bit vectors as multiple micro-ops.

Moved the similar subvector extraction shuffle combines from PerformShuffleCombine256 to lowerVectorShuffle as well.

Note: I've avoided combining shuffles that reference elements from the upper halves of the input vectors - this may be reviewed in future work as well (AVX1 would probably always gain, but AVX2 does have some cross-lane shuffle instructions).
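
The condition described above can be sketched as a predicate over a shuffle mask. This is a Python illustration under stated assumptions, not the code from lowerVectorShuffle: undef mask entries are written as -1, indices 0..N-1 refer to the first input and N..2N-1 to the second, and the function name is invented for the example.

```python
UNDEF = -1  # undef mask entry


def can_shuffle_lower_half_only(mask):
    """Return True if the shuffle only needs to compute the lower half.

    Requires the upper half of the result to be entirely undef and the lower
    half to reference only lower-half elements of either input vector.
    """
    n = len(mask)
    half = n // 2
    if any(m != UNDEF for m in mask[half:]):
        return False
    for m in mask[:half]:
        if m == UNDEF:
            continue
        # m % n is the element's position inside its own input vector.
        if (m % n) >= half:
            return False
    return True


ok = can_shuffle_lower_half_only([0, 5, UNDEF, UNDEF])
uses_upper_input = can_shuffle_lower_half_only([0, 3, UNDEF, UNDEF])
defined_upper = can_shuffle_lower_half_only([0, 1, 2, UNDEF])
```

The second case is rejected because it reads element 3 from the upper half of the first input, which corresponds to the shuffles the note says were deliberately left alone.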

Differential Revision: http://reviews.llvm.org/D15477

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256332 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-23 13:10:07 +00:00
David Majnemer
07a42297a7 [OperandBundles] Have GlobalsModRef play nice with operand bundles
A call site's use of a Value might not correspond to an argument
operand but to a bundle operand.
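
The point can be illustrated with a toy model of a call site in Python. The `CallSite` class and `classify_use` helper are hypothetical and only mimic the idea; they are not the LLVM classes.

```python
from dataclasses import dataclass, field


@dataclass
class CallSite:
    """Toy call site: ordinary arguments plus tagged operand bundles."""
    args: list
    bundles: dict = field(default_factory=dict)  # tag -> list of operands

    def operands(self):
        """Yield every operand as (kind, position-or-tag, value)."""
        ops = [("arg", i, v) for i, v in enumerate(self.args)]
        for tag, values in self.bundles.items():
            ops += [("bundle", tag, v) for v in values]
        return ops


def classify_use(call, value):
    """List where a value is used at a call site, and in which role."""
    return [(kind, where) for kind, where, v in call.operands() if v == value]


call = CallSite(args=["%x"], bundles={"deopt": ["%y"]})
use_of_x = classify_use(call, "%x")
use_of_y = classify_use(call, "%y")
```

A pass that treats every use at the call as an argument would compute a meaningless argument index for %y, which is the kind of assumption these fixes remove.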

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256329 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-23 09:58:46 +00:00
David Majnemer
575121d310 [OperandBundles] Have TailCallElim play nice with operand bundles
A call site's use of a Value might not correspond to an argument
operand but to a bundle operand.

This fixes PR25928.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256328 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-23 09:58:43 +00:00
David Majnemer
ec185d074d [OperandBundles] Have InstCombine play nice with operand bundles
Don't assume a call's use corresponds to an argument operand, it might
correspond to a bundle operand.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256327 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-23 09:58:41 +00:00
David Majnemer
688b5df9e2 [OperandBundles] Have DeadArgElim play nice with operand bundles
A call site's use of a Value might not correspond to an argument
operand but to a bundle operand.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256326 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-23 09:58:36 +00:00
Igor Breger
d34ae248c0 AVX512BW: Enable packed word shift for 512bit vector. Enable lowering scalar immediate shift for v64i8. Fix predicate for AVX1/2 shifts.
Differential Revision: http://reviews.llvm.org/D15713

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256324 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-23 08:06:50 +00:00
David Majnemer
0f16f3c826 [WinEH] Don't visit the same catchswitch twice
We visited the same catchswitch twice because it was both the child of
another funclet and the predecessor of a cleanuppad.

Instead, change the numbering algorithm to only recurse if the unwind
destination of the inner funclet agrees with the unwind destination of
the catchswitch.

This fixes PR25926.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256317 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-23 03:59:04 +00:00
Paul Robinson
073a62bc1d Form reform for MCDwarf.
MCDwarf emits a canned abbreviation table, but was not emitting proper
forms for DWARF version 4, which is the default after r249655.

Differential Revision: http://reviews.llvm.org/D15732


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256313 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-23 01:57:31 +00:00
Manuel Jacob
982e9b7f0d [RS4GC] Fix base pair printing for constants.
Previously, "%" + name of the value was printed for each derived and base
pointer.  This is correct for instructions, but wrong for e.g. globals.
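
In textual LLVM IR, local values such as instructions are written with a "%" prefix and global values with "@". A minimal Python sketch of prefix selection when printing a base pair; the helper names and the output format are made up for this example and do not reproduce the actual debug output.

```python
def operand_name(name, is_global):
    """Prefix a value name with the IR sigil matching its kind."""
    return ("@" if is_global else "%") + name


def format_base_pair(derived, derived_is_global, base, base_is_global):
    """Format a derived/base pointer pair (hypothetical layout)."""
    return "derived {} base {}".format(
        operand_name(derived, derived_is_global),
        operand_name(base, base_is_global),
    )


line = format_base_pair("ptr.gep", False, "global_table", True)
```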


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256305 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-23 00:19:45 +00:00
Changpeng Fang
89e60598f6 AMDGPU/SI: Use flat for global load/store when targeting HSA
Summary:
  For some reason, executing an MUBUF instruction with the addr64
  bit set and a zero base pointer in the resource descriptor causes
  the memory operation to be dropped when the shader is executed using
  the HSA runtime.

  This kind of MUBUF instruction is commonly used when the pointer is
  stored in VGPRs.  The base pointer field in the resource descriptor
  is set to zero and the pointer is stored in the vaddr field.

  This patch resolves the issue by only using flat instructions for
  global memory operations when targeting HSA. This is an overly
  conservative fix as all other configurations of MUBUF instructions
  appear to work.

  NOTE: re-committed after fixing a failure in CodeGen/AMDGPU/llvm.dbg.value.ll

Reviewers: tstellarAMD

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D15543

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256282 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-22 20:55:23 +00:00
Rafael Espindola
86e3cfb9dc Also add unnamed_addr to functions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256281 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-22 20:43:30 +00:00
Rafael Espindola
937ba4cad8 Delete dead GlobalAliases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256276 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-22 19:50:22 +00:00
Rafael Espindola
a00544a653 Revert "AMDGPU/SI: Use flat for global load/store when targeting HSA"
This reverts commit r256273.

It broke CodeGen/AMDGPU/llvm.dbg.value.ll

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256275 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-22 19:46:44 +00:00
Changpeng Fang
808f9643e6 AMDGPU/SI: Use flat for global load/store when targeting HSA
Summary:
  For some reason, executing an MUBUF instruction with the addr64
  bit set and a zero base pointer in the resource descriptor causes
  the memory operation to be dropped when the shader is executed using
  the HSA runtime.

  This kind of MUBUF instruction is commonly used when the pointer is
  stored in VGPRs.  The base pointer field in the resource descriptor
  is set to zero and the pointer is stored in the vaddr field.

  This patch resolves the issue by only using flat instructions for
  global memory operations when targeting HSA. This is an overly
  conservative fix as all other configurations of MUBUF instructions
  appear to work.

Reviewers: tstellarAMD

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D15543

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@256273 91177308-0d34-0410-b5e6-96231b3b80d8
2015-12-22 19:32:28 +00:00