15436 Commits

Author SHA1 Message Date
Krzysztof Parzyszek
5c562302c2 [Hexagon] Implement RDF-based post-RA optimizations
- Handle simple cases of register copies (what current RDF CP allows).
- Hexagon-specific dead code elimination: handles dead address updates
  in post-increment instructions.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257504 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-12 19:09:01 +00:00
Tom Stellard
e395458a4f AMDGPU: Emit note directive for HSA even if there are no functions
Reviewers: arsenm, echristo

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D16010

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257488 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-12 17:18:17 +00:00
Daniel Sanders
1c140c48ae [mips] Correct operand order in DSP's mthi/mtlo
Summary: The result register is the second operand as per the other mt* instructions.

Reviewers: vkalintiris

Subscribers: llvm-commits, dsanders

Differential Revision: http://reviews.llvm.org/D15993

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257478 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-12 15:15:14 +00:00
Robert Lougher
e5716c4e3a The isel pattern that selects the memory-register form of VCVTPH2PS
(64 to 128-bit) matches against the pattern fragment 'vzmovl_v2i64'
(a zero-extended 64-bit load).

However, a change in r248784 teaches the instruction combiner that only
the lower 64 bits of the input to a 128-bit vcvtph2ps are used.  This means
the instruction combiner will ordinarily optimize away the upper 64-bit
insertelement instruction in the zero-extension and so we no longer select
the memory-register form.  To fix this a new pattern has been added.

Differential Revision: http://reviews.llvm.org/D16067


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257470 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-12 11:48:25 +00:00
Igor Breger
d5839e5e84 AVX512: VPMOVAPS/PD and VPMOVUPS/PD (load) intrinsic implementation.
Differential Revision: http://reviews.llvm.org/D16042

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257463 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-12 10:02:32 +00:00
Manman Ren
70af20f0b3 CXX_FAST_TLS calling convention: performance improvement for x86-64.
This is the same change on x86-64 as r255821 on AArch64.
rdar://9001553


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257428 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-12 01:08:46 +00:00
Manman Ren
9f927315d8 CXX_FAST_TLS calling convention: performance improvement for ARM.
This is the same change on ARM as r255821 on AArch64.
rdar://9001553


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257424 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-12 00:47:18 +00:00
Manman Ren
27e49b014c CXX_FAST_TLS calling convention: Add support for ARM on Darwin.
rdar://9001553


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257417 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-11 23:50:43 +00:00
Dan Gohman
d73b41ae22 [WebAssembly] Define WebAssembly-specific relocation codes.
Currently WebAssembly has two kinds of relocations; data addresses and
function addresses. This adds ELF relocations for them, as well as an
MC symbol kind to indicate which type of relocation is needed.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257416 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-11 23:38:05 +00:00
Rafael Espindola
8e7d481847 Remove a bogus assert.
There is no reason the value being printed has to be positive.
Fixes pr25802.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257412 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-11 23:21:45 +00:00
Matt Arsenault
6e3a667705 AMDGPU: Implement {{s|u}}int_to_fp i64 -> f32
The old lowering for uint_to_fp failed opencl conformance.
It might be OK for fast math mode, but I'm not sure.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257393 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-11 22:01:48 +00:00
Matt Arsenault
ea5802f212 AMDGPU: Cleanup udiv test
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257387 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-11 21:18:40 +00:00
Matt Arsenault
7717a8b940 AMDGPU: Fix crash with dispatch.ptr intrinsic with non-HSA target
It might be better to let this be a select failure instead.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257386 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-11 21:18:33 +00:00
Ahmed Bougacha
94dab4de4b [X86] Add AVX512 testcase for r248965/PR24512.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257385 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-11 21:16:21 +00:00
Matt Arsenault
3f2e0d9a1f AMDGPU: int_to_fp test cleanups
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257354 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-11 17:02:10 +00:00
Matt Arsenault
68f559ea61 AMDGPU: Fix ctlz combine for sub 32-bit types
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257353 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-11 17:02:06 +00:00
Matt Arsenault
f12a12cd25 AMDGPU: Pattern match ffbh pattern to instruction.
The hardware instruction's output on 0 is -1 rather than 32.
Eliminate a test and select to -1. This removes an extra instruction
from the compatibility function with HSAIL's firstbit instruction.
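
As a rough illustration of the test-and-select idiom described above, the sketch below models it in plain C++. The names `ctlz32` and `firstbit_u32` are hypothetical and are not taken from the patch; the only facts used from the message are that a count-leading-zeros of 0 yields 32 while the hardware instruction yields -1.

```cpp
#include <cstdint>

// Portable count-leading-zeros with the convention ctlz(0) == 32.
constexpr std::int32_t ctlz32(std::uint32_t x) {
    std::int32_t n = 0;
    for (std::uint32_t bit = 0x80000000u; bit != 0 && (x & bit) == 0; bit >>= 1)
        ++n;
    return n;
}

// Hypothetical firstbit-style helper: a compare against zero plus a select
// to -1 wrapped around ctlz. Per the message above, the hardware instruction
// already returns -1 for a zero input, so this whole expression can map to
// that single instruction.
constexpr std::int32_t firstbit_u32(std::uint32_t x) {
    return x == 0 ? -1 : ctlz32(x);
}
```

The two functions differ only at a zero input, which is why the extra compare and select become redundant once the pattern is matched.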

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257352 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-11 17:02:00 +00:00
Matt Arsenault
01a6cb6ce3 AMDGPU: Custom lower i64 ctlz
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257348 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-11 16:50:29 +00:00
Matt Arsenault
3bbc287300 LegalizeDAG: Expand ctlz with ctlz_zero_undef if legal
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257345 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-11 16:37:46 +00:00
Daniel Sanders
58a84d9b9c [mips] Never select JAL for calls to an absolute immediate address.
Summary:
It actually takes an offset into the current PC-region.

This fixes the 'expr' command in lldb.
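
To illustrate why an absolute immediate address is not generally a valid JAL operand, here is a minimal sketch of how a MIPS32 JAL target is formed: the upper four bits come from the address of the delay slot and the low 28 bits from the 26-bit instruction field shifted left by two. The helper names `jalTarget` and `jalCanReach` are hypothetical and do not appear in the patch.

```cpp
#include <cstdint>

// Address reached by a JAL located at `pc` whose 26-bit field is `field26`.
// The top 4 bits are inherited from the delay-slot address (pc + 4).
constexpr std::uint32_t jalTarget(std::uint32_t pc, std::uint32_t field26) {
    return ((pc + 4) & 0xF0000000u) | ((field26 & 0x03FFFFFFu) << 2);
}

// True if `target` is word-aligned and lies in the same 256 MiB region as
// the delay slot of a JAL at `pc`, so the instruction can encode it.
constexpr bool jalCanReach(std::uint32_t pc, std::uint32_t target) {
    return (target & 3u) == 0 &&
           ((pc + 4) & 0xF0000000u) == (target & 0xF0000000u);
}
```

A call from 0x80001000 to the absolute address 0x00400000 fails the `jalCanReach` check, which is the kind of case where a register-indirect call is needed instead.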

Reviewers: vkalintiris, jaydeep, bhushan

Subscribers: dsanders, llvm-commits

Differential Revision: http://reviews.llvm.org/D16054


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257339 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-11 15:57:46 +00:00
Junmo Park
f756623784 [BranchFolding] Set correct mem refs (2nd try)
This is a recommit of r257253 which was reverted in r257270.
The previous test case could fail on some targets because it ran opt with the O3 option.

Original Summary:
Merge MBBICommon and MBBI's MMOs.

Differential Revision: http://reviews.llvm.org/D15990


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257317 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-11 07:15:38 +00:00
Craig Topper
172de01e7c [AVX-512] Make spacing between comma and {sae} operand consistent in asm strings.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257299 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-11 00:44:52 +00:00
Elena Demikhovsky
d6de44078b Optimized instruction sequence for sitofp operation on X86-32
Optimized sitofp i64 %x to double. The current sequence

movl %ecx, 8(%esp) 
movl %edx, 12(%esp) 
fildll 8(%esp)

is replaced with:

movd %ecx, %xmm0 
movd %edx, %xmm1 
punpckldq %xmm1, %xmm0 
movq %xmm0, 8(%esp)

Differential Revision: http://reviews.llvm.org/D15946



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257285 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-10 09:41:22 +00:00
Michael Zuckerman
e496b80d03 [AVX512] add PRORVQ and PRORVD Intrinsic
Differential Revision: http://reviews.llvm.org/D15955



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257283 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-10 09:16:41 +00:00
Joseph Tremoulet
12b6cd2e54 [WinEH] Disallow cyclic unwinds
Summary:
Funclet-based EH personalities/tables likely can't handle these, and they
can't be generated at source, so make them officially illegal in IR as
well.


Reviewers: andrew.w.kaylor, rnk, majnemer

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D15963

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257274 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-10 04:31:05 +00:00
Joseph Tremoulet
c2d8241d67 [WinEH] Verify consistent funclet unwind exits
Summary:
A funclet EH pad may be exited by an unwind edge, which may be a
cleanupret exiting its cleanuppad, an invoke exiting a funclet, or an
unwind out of a nested funclet transitively exiting its parent.  Funclet
EH personalities require all such exceptional exits from a given funclet to
have the same unwind destination, and EH preparation / state numbering /
table generation implicitly depends on this.  Formalize it as a rule of
the IR in the LangRef and verifier.


Reviewers: rnk, majnemer, andrew.w.kaylor

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D15962

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257273 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-10 04:30:02 +00:00
Michael Zolotukhin
5dade21735 Revert "[BranchFolding] Set correct mem refs"
This reverts commit 1ff11017d2669b933b29fcbb6451cfcda34ad693.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257270 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-09 23:53:16 +00:00
Simon Pilgrim
95d397cf33 [X86][AVX] Match broadcast loads through a bitcast
AVX1 v8i32/v4i64 shuffles are bitcasted to v8f32/v4f64, this patch peeks through any bitcast to check for a load node to allow broadcasts to occur.

This is a re-commit of r257055 after r257264 fixed 32-bit broadcast loads of i64 scalars.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257266 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-09 20:59:39 +00:00
Simon Pilgrim
362488a724 [X86][AVX] Add support for i64 broadcast loads on 32-bit targets
Added 32-bit AVX1/AVX2 broadcast tests.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257264 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-09 19:59:27 +00:00
Junmo Park
1ff11017d2 [BranchFolding] Set correct mem refs
Merge MBBICommon and MBBI's MMOs.

Differential Revision: http://reviews.llvm.org/D15990


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257253 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-09 07:30:13 +00:00
Sanjay Patel
47436fa2d1 [DAGCombiner] don't dereference an operand that doesn't exist (PR26070)
The bug was introduced with changes for x86-64 fp128:
http://reviews.llvm.org/rL254653

I don't know why an x86 change is here, so I'll follow up in:
http://reviews.llvm.org/D15134

Should fix:
https://llvm.org/bugs/show_bug.cgi?id=26070



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257200 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-08 19:53:24 +00:00
Weiming Zhao
432ca7460b RBIT Instruction only available for ARMv6t2 and above.
Summary:
r255334 matches bit-reverse pattern in InstCombine and generates calls to Intrinsic::bitreverse.

RBIT instruction is only available for ARMv6t2 and above. This patch has the intrinsic expanded during legalization for ARMv4 and ARMv5.

Patch by Z. Zheng <zhaoshiz@codeaurora.org>

Reviewers: apazos, jmolloy, weimingz

Subscribers: aemerson, rengolin, llvm-commits

Differential Revision: http://reviews.llvm.org/D15932

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257188 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-08 18:43:41 +00:00
Pirama Arumuga Nainar
0de8d7820f Do not ASSERTZEXT for i16 result of bitcast from f16 operand
Summary:
During legalization of i16, do not ASSERTZEXT the result of FP_TO_FP16.
Directly return an FP_TO_FP16 node with return type as the
promote-to-type of i16.

This patch also removes extraneous length check.  This legalization
should be valid even if integer and float types are of different
lengths.

This patch breaks a hard-float test for fp16 args.  The test is changed
to allow a vmov to zero-out the top bits, and also ensure that the
return value is in an FP register.

Reviewers: ab, jmolloy

Subscribers: srhines, llvm-commits

Differential Revision: http://reviews.llvm.org/D15438

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257184 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-08 17:46:05 +00:00
David Majnemer
dc706797f9 [WinEH] CatchHandler which don't have catch objects in StackColoring
StackColoring rewrites the frame indices of operations involving
allocas if it can find that the lifetimes of two objects do not overlap.
MSVC EH needs to be kept aware of this in the event that a
catch object has moved around.  However, we represent the non-existence
of a catch object with a sentinel frame index (INT_MAX).  This sentinel
also happens to be the EmptyKey of the SlotRemap DenseMap.  Testing for
whether or not we need to translate the frame index fails in this case
because we call the count method on the DenseMap with the EmptyKey,
leading to assertions.  Instead, check if it is our sentinel value
before trying to look into the DenseMap.
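
The ordering of the two checks can be sketched as follows. This is only an illustration under stated assumptions: `std::unordered_map` stands in for the DenseMap so the example is self-contained, and `NoCatchObject` and `remapCatchObjectIndex` are hypothetical names, not the identifiers used in the patch.

```cpp
#include <climits>
#include <unordered_map>

// Sentinel frame index meaning "this handler has no catch object"
// (INT_MAX, as stated in the message above).
constexpr int NoCatchObject = INT_MAX;

// Translate a catch-object frame index through the slot remapping.
// The sentinel is tested first, so it is never used as a lookup key;
// with a DenseMap whose EmptyKey equals the sentinel, that lookup is
// what triggered the assertion.
int remapCatchObjectIndex(int frameIndex,
                          const std::unordered_map<int, int> &slotRemap) {
    if (frameIndex == NoCatchObject)
        return frameIndex;
    auto it = slotRemap.find(frameIndex);
    return it == slotRemap.end() ? frameIndex : it->second;
}
```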

This fixes PR26073.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257182 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-08 17:24:47 +00:00
Tom Stellard
d7ef3dae86 AMDGPU/SI: Emit global variable sizes when targeting HSA
Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D15952

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257173 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-08 14:50:28 +00:00
Tom Stellard
54fa7b1f76 AMDGPU: Emit functions sizes
Reviewers: arsenm

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D15951

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257172 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-08 14:50:23 +00:00
David Majnemer
204e31b7ab [WinEH] Update WinEHFuncInfo if StackColoring merges allocas
Windows EH keeps track of which frame index corresponds to a catchpad
in order to inform the runtime where the catch parameter should be
initialized.  LLVM's optimizations are able to prove that the memory
used by the catch parameter can be reused with another memory
optimization, changing its frame index.

We need to keep WinEHFuncInfo up to date with respect to this or we will
miscompile/assert.

This fixes PR26069.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257158 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-08 08:03:55 +00:00
Craig Topper
bf82c317d8 [X86] Don't print the aliased version of CVTSD2SI64rm. This appears to be a mistake I made years ago.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257149 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-08 06:09:18 +00:00
Kyle Butt
e51b530b12 Add call sequence start and end for __tls_get_addr
This is a fix for bug http://llvm.org/bugs/show_bug.cgi?id=25839.

For a PIC TLS variable access in a function, prologue (mflr followed by std and
stdu) gets scheduled after a tls_get_addr call. tls_get_addr messed up LR but
no one saves/restores it.

Also added a test for saving/restoring clobbered registers when calling __tls_get_addr.

Patch by Tim Shen

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257137 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-08 02:06:19 +00:00
Eric Christopher
d0fdbdba37 Add some testing for thumb1 and thumb2 inline asm immediate constraints
and fix a couple of bugs on inspection.

Also fixes PR26061.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257122 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-08 00:34:44 +00:00
JF Bastien
af01328140 WebAssembly: use .skip instead of .zero directive
.zero is confusing when used with two arguments. Documentation:

  This directive emits SIZE 0-valued bytes.  SIZE must be an absolute
  expression.  This directive is actually an alias for the '.skip'
  directive so it can take an optional second argument of the value to
  store in the bytes instead of zero.  Using '.zero' in this way would be
  confusing however.

Ref: https://sourceware.org/bugzilla/show_bug.cgi?id=18353

Hexagon and Sparc do the same, and it's all the same to WebAssembly so
let's pick the less confusing of the two.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257111 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-07 23:18:29 +00:00
Keno Fischer
c950114021 Temporarily revert r257105 "[Verifier] Check that debug values have proper size"
Looks like there's a case where clang generates debug info that triggers
the new verifier check. Reverting while investigating.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257107 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-07 22:39:11 +00:00
Keno Fischer
97515eb97b [Verifier] Check that debug values have proper size
Summary:
Teach the Verifier to make sure that the storage size given to llvm.dbg.declare
or the value size given to llvm.dbg.value agree with what is declared in
DebugInfo. This is implicitly assumed in a number of passes (e.g. in SROA).
Additionally this catches a number of common mistakes, such as passing a
pointer when a value was intended or vice versa.

One complication comes from stack coloring which modifies the original IR when
it merges allocas in order to make sure that if AA falls back to the IR it gets
the correct result. However, given this new invariant, indiscriminately
replacing one alloca by a different (differently sized one) is no longer valid.
Fix this by just undefing out any use of the alloca in a dbg.declare in this
case.

Additionally, I had to fix a number of test cases. Of particular note:
- I regenerated dbg-changes-codegen-branch-folding.ll from the given source as
  it was affected by the bug fixed in r256077
- two-cus-from-same-file.ll was changed to avoid having a variable-typed debug
  variable as that would depend on the target, even though this test is
  supposed to be generic
- I had to manually declare size/align for reference type. See also the
  discussion for D14275/r253186.
- fpstack-debuginstr-kill.ll required changing `double` to `long double`
- most others were just a question of adding OP_deref

Reviewers: aprantl
Differential Revision: http://reviews.llvm.org/D14276

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257105 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-07 22:18:37 +00:00
Derek Schuff
d9b4137f9f [WebAssembly] Support combining GEP and FrameIndex offsets in memory operand offset field
Previously we only supported putting the FI into memory operand offset
fields if there was nothing there already. Now combine them.

Differential Revision: http://reviews.llvm.org/D15941

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257084 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-07 18:55:52 +00:00
Dan Gohman
181f7cc0f3 [WebAssembly] Use the default private label prefixes.
The MC assembler doesn't like using the empty string as a private label
prefix because then it treats all labels as private. This commit reverts
back to the default prefix, which is .L, which is common in ELF targets
and consistent with the LLVM name mangler.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257083 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-07 18:49:53 +00:00
Nicolai Haehnle
702b589510 AMDGPU/SI: Fold operands with sub-registers
Summary:
Multi-dword constant loads generated unnecessary moves from SGPRs into VGPRs,
increasing the code size and VGPR pressure. These moves are now folded away.

Note that this lack of operand folding was not a problem for VMEM loads,
because COPY nodes from VReg_Nnn to VGPR32 are eliminated by the register
coalescer.

Some tests are updated, note that the fsub.ll test explicitly checks that
the move is elided.

With the IR generated by current Mesa, the changes are obviously relatively
minor:

7063 shaders in 3531 tests
Totals:
SGPRS: 351872 -> 352560 (0.20 %)
VGPRS: 199984 -> 200732 (0.37 %)
Code Size: 9876968 -> 9881112 (0.04 %) bytes
LDS: 91 -> 91 (0.00 %) blocks
Scratch: 1779712 -> 1767424 (-0.69 %) bytes per wave
Wait states: 295164 -> 295337 (0.06 %)

Totals from affected shaders:
SGPRS: 65784 -> 66472 (1.05 %)
VGPRS: 38064 -> 38812 (1.97 %)
Code Size: 1993828 -> 1997972 (0.21 %) bytes
LDS: 42 -> 42 (0.00 %) blocks
Scratch: 795648 -> 783360 (-1.54 %) bytes per wave
Wait states: 54026 -> 54199 (0.32 %)

Reviewers: tstellarAMD, arsenm, mareko

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D15875

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257074 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-07 17:10:29 +00:00
Nicolai Haehnle
64f913f14f AMDGPU/SI: xnack_mask is always reserved on VI
Summary:
Somehow, I first interpreted the docs as saying space for xnack_mask is only
reserved when XNACK is enabled via SH_MEM_CONFIG. I felt uneasy about this and
went back to actually test what is happening, and it turns out that xnack_mask
is always reserved at least on Tonga and Carrizo, in the sense that flat_scr
is always fixed below the SGPRs that are used to implement xnack_mask, whether
or not they are actually used.

I confirmed this by writing a shader using inline assembly to tease out the
aliasing between flat_scratch and regular SGPRs. For example, on Tonga, where
we fix the number of SGPRs to 80, s[74:75] aliases flat_scratch (so
xnack_mask is s[76:77] and vcc is s[78:79]).
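
The Tonga example can be written down as a small calculation. This is a sketch that assumes the three special register pairs are always packed at the top of the allocated SGPR range in the order given above; the struct and function names are hypothetical, and only the 80-SGPR case is taken from the measurements described here.

```cpp
// Low register number of each special 64-bit SGPR pair.
struct SpecialSgprs {
    unsigned flatScratchLo;
    unsigned xnackMaskLo;
    unsigned vccLo;
};

// Assumed layout: vcc occupies the top pair, xnack_mask the pair below it,
// and flat_scratch the pair below that.
constexpr SpecialSgprs specialSgprLayout(unsigned totalSgprs) {
    return {totalSgprs - 6, totalSgprs - 4, totalSgprs - 2};
}
```

For a total of 80 this gives s[74:75], s[76:77] and s[78:79], matching the aliasing observed on Tonga.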

This patch changes both the calculation of the total number of SGPRs and the
various register reservations to account for this.

It ought to be possible to use the gap left by xnack_mask when the feature
isn't used, but this patch doesn't try to do that. (Note that the same applies
to vcc.)

Note that previously, even before my earlier change in r256794, the SGPRs that
alias to xnack_mask could end up being used as well when flat_scr was unused
and the total number of SGPRs happened to fall on the right alignment
(e.g. highest regular SGPR being used s29 and VCC used would lead to number
of SGPRs being 32, where s28 and s29 alias with xnack_mask). So if there
were some conflict due to such aliasing, we should have noticed that already.

Reviewers: arsenm, tstellarAMD

Subscribers: arsenm, llvm-commits

Differential Revision: http://reviews.llvm.org/D15898

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257073 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-07 17:10:20 +00:00
Michael Zuckerman
496a771bba [avx512] Fix test avx512bw-intrinsics.ll
Change the CHECK label into AVX512BW
And fix declare label of llvm.x86.avx512.mask.psrav32_hi



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257071 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-07 16:25:42 +00:00
Michael Zuckerman
6c7a788883 [AVX512] add PSLLW and PSLLV Intrinsic
Differential Revision: http://reviews.llvm.org/D15889


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257070 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-07 16:02:51 +00:00
Nico Weber
0a765136e6 Revert r257055, it caused PR26064.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@257066 91177308-0d34-0410-b5e6-96231b3b80d8
2016-01-07 15:01:46 +00:00