Commit Graph

14890 Commits

Author SHA1 Message Date
Joseph Tremoulet
b0d280a588 [WinEH] Fix establisher param reg in CLR funclets
Summary:
The CLR's personality routine passes the pointer to the establisher frame
in RCX, not RDX.

Reviewers: pgavlin, majnemer, rnk

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14343

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252135 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-05 02:20:07 +00:00
Matt Arsenault
ecbecea873 AMDGPU: Add missing v2f64 fadd tests
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252117 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-05 01:03:11 +00:00
Quentin Colombet
aa1e5aa9df [x86] Teach the shrink-wrapping hooks to do the proper thing with Win64.
Win64 has some strict requirements for the epilogue. As a result, we disable
shrink-wrapping for Win64 unless the block that gets the epilogue is already an
exit block.

Fixes PR24193.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252088 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-04 22:37:28 +00:00
Simon Pilgrim
0a019f7162 [X86][SSE] Add general memory folding for (V)INSERTPS instruction
This patch improves the memory folding of the inserted float element for the (V)INSERTPS instruction.

The existing implementation occurs in the DAGCombiner and relies on the narrowing of a whole vector load into a scalar load (and then converted into a vector) to (hopefully) allow folding to occur later on. Not only has this proven problematic for debug builds, it also prevents other memory folds (notably stack reloads) from happening.

This patch removes the old implementation and moves the folding code to the X86 foldMemoryOperand handler. A new private 'special case' function - foldMemoryOperandCustom - has been added to deal with memory folding of instructions that can't just use the lookup tables - (V)INSERTPS is the first of several that could be done.

It also tweaks the memory operand folding code with an additional pointer offset that allows existing memory addresses to be modified, in this case to convert the vector address to the explicit address of the scalar element that will be inserted.

Unlike the previous implementation, we now set the insertion source index to zero. Although this is ignored for the (V)INSERTPSrm version, anything that relied on shuffle decodes (such as unfolding of insertps loads) was incorrectly calculating the source address. I've added a test for this at insertps-unfold-load-bug.ll

Differential Revision: http://reviews.llvm.org/D13988

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252074 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-04 20:48:09 +00:00
Andrew Kaylor
80a2456665 Created new X86 FMA3 opcodes (FMA*_Int) that are used now for lowering of scalar FMA intrinsics.
Patch by Slava Klochkov 

The key difference between FMA* and FMA*_Int opcodes is that FMA*_Int opcodes are handled more conservatively. It is illegal to commute the 1st operand of FMA*_Int instructions as the upper bits of scalar FMA intrinsic result must be taken from the 1st operand, but such commute transformation would change those upper bits and invalidate the intrinsic's result.

Reviewers: Quentin Colombet, Elena Demikhovsky

Differential Revision: http://reviews.llvm.org/D13710



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252060 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-04 18:10:41 +00:00
James Molloy
447c9ea9e1 [ARM] Combine CMOV into BFI where possible
If we have a CMOV, OR and AND combination such as:
  if (x & CN)
    y |= CM;

And:
  * CN is a single bit;
  * All bits covered by CM are known zero in y;

Then we can convert this to a sequence of BFI instructions. This will always be a win if CM is a single bit, will always be no worse than the TST & OR sequence if CM is two bits, and for thumb will be no worse if CM is three bits (due to the extra IT instruction).
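
The equivalence claimed above can be checked with a small C++ model. This is only a sketch, not the ARM backend code: `bfi`, `orIfSet`, and `orIfSetViaBfi` are hypothetical helper names, and `bfi` merely imitates the BFI instruction (copy the low `width` bits of the source into the destination starting at bit `lsb`). It assumes, as the message does, that CN is a single bit and that every bit of CM is zero in y.

```cpp
#include <cassert>
#include <cstdint>

// Software model of ARM BFI: insert the low `width` bits of `src`
// into `dst`, starting at bit position `lsb`.
std::uint32_t bfi(std::uint32_t dst, std::uint32_t src,
                  unsigned lsb, unsigned width) {
  assert(width >= 1 && lsb < 32 && lsb + width <= 32);
  const std::uint32_t ones = (width == 32) ? ~0u : ((1u << width) - 1u);
  const std::uint32_t field = ones << lsb;
  return (dst & ~field) | ((src << lsb) & field);
}

// Reference form from the commit message: if (x & CN) y |= CM;
std::uint32_t orIfSet(std::uint32_t x, std::uint32_t y,
                      std::uint32_t cn, std::uint32_t cm) {
  return (x & cn) ? (y | cm) : y;
}

// Branch-free form: move the tested bit of x down to bit 0, then issue
// one single-bit BFI per set bit of CM. Valid only when the bits of CM
// are known to be zero in y, so inserting a 0 leaves y unchanged.
std::uint32_t orIfSetViaBfi(std::uint32_t x, std::uint32_t y,
                            unsigned cnBit, std::uint32_t cm) {
  assert(cnBit < 32);
  assert((y & cm) == 0);
  const std::uint32_t tested = x >> cnBit;
  for (unsigned p = 0; p < 32; ++p)
    if (cm & (1u << p))
      y = bfi(y, tested, p, 1);
  return y;
}
```

One BFI is needed per set bit of CM, which is why the message bounds the profitable cases by the number of bits in CM.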

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252057 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-04 16:55:07 +00:00
Michael Kuperstein
c628c1e0b4 [X86] DAGCombine should not introduce FILD in soft-float mode
The x86 "sitofp i64 to double" dag combine, in 32-bit mode, lowers sitofp 
directly to X86ISD::FILD (or FILD_FLAG). This should not be done in soft-float mode.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252042 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-04 11:17:53 +00:00
Igor Laevsky
7e6636cb71 [StatepointLowering] Remove distinction between call and invoke safepoints
There is no point in handling invoke safepoints differently from call
safepoints. All relevant decisions can be made by looking at whether
gc.result and gc.relocate lie in the same basic block. This change
allows lowering call safepoints whose relocates and results are in
different basic blocks. See the test case for an example.

Differential Revision: http://reviews.llvm.org/D14158



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252028 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-04 01:16:10 +00:00
Derek Schuff
5c3718f501 Address nit
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252004 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-03 22:40:45 +00:00
Derek Schuff
05d7d32e12 [WebAssembly] Support wasm select operator
Summary:
Add support for wasm's select operator, and lower LLVM's select DAG node
to it.

Reviewers: sunfish

Subscribers: dschuff, llvm-commits, jfb

Differential Revision: http://reviews.llvm.org/D14295

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252002 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-03 22:40:40 +00:00
Simon Pilgrim
be22715ca8 [X86][AVX] Tweaked shuffle stack folding tests
To avoid alternative lowerings.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251986 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-03 21:58:35 +00:00
Simon Pilgrim
6c02d22686 [X86][AVX512] Fixed shuffle test name to match shuffle
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251984 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-03 21:39:30 +00:00
Simon Pilgrim
91c642526e [X86][XOP] Add support for the matching of the VPCMOV bit select instruction
XOP has the VPCMOV instruction that performs the common vector bit select operation OR( AND( SRC1, SRC3 ), AND( SRC2, ~SRC3 ) )

This patch adds tablegen pattern matching for this instruction.
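
The bit select expression quoted above can be written as a scalar C++ function to make the operand roles concrete. This is a sketch on one 64-bit lane, not the tablegen pattern from the patch; `bitSelect` and `checkBitSelect` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// OR(AND(src1, mask), AND(src2, ~mask)): each result bit comes from
// src1 where the mask bit is 1 and from src2 where it is 0.
std::uint64_t bitSelect(std::uint64_t src1, std::uint64_t src2,
                        std::uint64_t mask) {
  return (src1 & mask) | (src2 & ~mask);
}

// Small usage check: the high byte is taken from src1, the low byte from src2.
void checkBitSelect() {
  assert(bitSelect(0xAAAAu, 0x5555u, 0xFF00u) == 0xAA55u);
}
```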

Differential Revision: http://reviews.llvm.org/D8841

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251975 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-03 20:27:01 +00:00
Rafael Espindola
1cc68742af Remove unnecessary dependency on section and string positions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251964 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-03 19:24:17 +00:00
Michael Kuperstein
a64f17562f [X86] Generate .cfi_adjust_cfa_offset correctly when pushing arguments
When push instructions are being used to pass function arguments on
the stack, and either EH or debugging are enabled, we need to generate
.cfi_adjust_cfa_offset directives appropriately. For (synch) EH, it is
enough for the CFA offset to be correct at every call site, while
for debugging we want to be correct after every push.

Darwin does not support this well, so don't use pushes whenever it
would be required.

Differential Revision: http://reviews.llvm.org/D13767

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251904 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-03 08:17:25 +00:00
Matt Arsenault
142cd116fa AMDGPU: Stop assuming vreg for build_vector
This was causing a variety of test failures when v2i64
is added as a legal type.

SIFixSGPRCopies should correctly handle the case of vector inputs
to a scalar reg_sequence, so this isn't necessary anymore. This
was hiding some deficiencies in how reg_sequence is handled later,
but this shouldn't be a problem anymore since the register class
copy of a reg_sequence is now done before the reg_sequence.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251860 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-02 23:30:48 +00:00
Matt Arsenault
896d6554cb AMDGPU: Error on graphics shaders with HSA
I've found myself pointlessly debugging problems from running
graphics tests with an HSA triple a few times, so stop this from
happening again.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251858 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-02 23:23:02 +00:00
Matt Arsenault
ee4932fd12 AMDGPU: Un XFAIL a test
This should probably be merged with one of the other private memory
tests, but it fails on r600.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251856 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-02 23:15:46 +00:00
Matt Arsenault
bd96659e08 AMDGPU: Distribute SGPR->VGPR copies of REG_SEQUENCE
Make the REG_SEQUENCE be a VGPR, and do the register class
copy first.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251855 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-02 23:15:42 +00:00
Cong Hou
b18412ca5b In MachineBlockPlacement, filter cold blocks off the loop chain when profile data is available.
In the current BB placement algorithm, a loop chain always contains all loop blocks. This has a drawback that cold blocks in the loop may be inserted on a hot function path, hence increasing branch cost and also reducing icache locality.

Consider a simple example shown below:

A
|
B⇆C
|
D

When B->C is quite cold, the best BB-layout should be A,B,D,C. But the current implementation produces A,C,B,D.

This patch filters those cold blocks off from the loop chain by comparing the ratio:

LoopBBFreq / LoopFreq

to 20%: if it is less than 20%, we don't include this BB in the loop chain. Here LoopFreq is the frequency of the loop when we reduce the loop into a single node. In general, we have more cold blocks when the loop has few iterations, and vice versa.
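
The 20% filter described above could be sketched as follows. This is an illustration under stated assumptions, not the MachineBlockPlacement implementation: `Block`, `filterColdBlocks`, and the plain integer frequencies are hypothetical, and LoopFreq is taken as an input rather than computed.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for a basic block with its profile frequency.
struct Block {
  std::string name;
  std::uint64_t freq;
};

// Keep only blocks whose frequency is at least 20% of the loop frequency.
// The test freq / loopFreq < 20% is done in integers as
// freq < ceil(loopFreq / 5), which avoids both rounding and overflow.
std::vector<Block> filterColdBlocks(const std::vector<Block>& loopBlocks,
                                    std::uint64_t loopFreq) {
  const std::uint64_t threshold = loopFreq / 5 + (loopFreq % 5 != 0 ? 1 : 0);
  std::vector<Block> chain;
  for (const Block& block : loopBlocks)
    if (block.freq >= threshold)
      chain.push_back(block);
  return chain;
}
```

For the example in the message, with made-up frequencies B = 110, C = 10 and LoopFreq = 100, C falls below the threshold and is left out of the loop chain, so it can be laid out after D.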


Differential revision: http://reviews.llvm.org/D11662




git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251833 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-02 21:24:00 +00:00
James Y Knight
74615f5548 Fix two issues in MergeConsecutiveStores:
1) PR25154. This is basically a repeat of PR18102, which was fixed in
r200201, and broken again by r234430. The latter changed which of the
store nodes was merged into from the first to the last. Thus, we now
also need to prefer merging a later store at a given address into the
target node, instead of an earlier one.

2) While investigating that, I also realized I'd introduced a bug in
r236850. There, I removed a check for alignment -- not realizing that
nothing except the alignment check was ensuring that none of the stores
were overlapping! This is a really bogus way to ensure there are no
aliased stores.

A better solution to both of these issues is likely to always use the
code added in the 'if (UseAA)' branches which rearrange the chain based
on a more principled analysis. I'll look into whether that can be used
always, but in the interest of getting things back to working, I think a
minimal change makes sense.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251816 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-02 18:48:08 +00:00
Nemanja Ivanovic
ee10ab5e95 Fix for bootstrap bug introduced in r244921
This revision introduced an issue that only affects a bootstrapped compiler
when it is printing the ASM. It turns out that the new code path taken due to
legalizing a scalar_to_vector of i64 -> v2i64 exposes a missing check in a
micro optimization to change a load followed by a scalar_to_vector into a
load and splat instruction on PPC.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251798 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-02 14:01:11 +00:00
Igor Breger
9aaefc3baa AVX512: Implemented encoding and intrinsics for VBROADCASTI32x2 and VBROADCASTF32x2 instructions.
Differential Revision: http://reviews.llvm.org/D14216

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251781 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-02 07:39:36 +00:00
Craig Topper
417a782401 [X86] Don't pass a scale value of 0 to scatter/gather intrinsics. This causes the code emitter to throw an assertion if we try to encode it. Need to add a check to fail isel for this, but for now avoid testing it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251779 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-02 07:24:37 +00:00
Elena Demikhovsky
77ab058bf8 AVX-512: Optimized SIMD truncate operations for AVX512F set.
Optimized <8 x i32> to <8 x i16>
<4 x i64> to <4 x i32>
<16 x i16> to <16 x i8>
All these operations now use the AVX512F set (KNL). Before this change, they were implemented with the AVX2 set.


Differential Revision: http://reviews.llvm.org/D14108



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251764 91177308-0d34-0410-b5e6-96231b3b80d8
2015-11-01 11:45:47 +00:00
JF Bastien
677d6a3a87 [WebAssembly] Fix import statement
Summary:
Imports should be generated like (param i32 f32...) not (param i32) (param f32) ...
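
The difference between the two formats can be shown with a small formatter. This is a sketch only, not the WebAssembly backend's printer: `formatParams` is a hypothetical name, and returning an empty string for an empty parameter list is an assumption the message does not cover.

```cpp
#include <string>
#include <vector>

// Emit all parameter types inside a single "(param ...)" group,
// e.g. "(param i32 f32)" rather than "(param i32) (param f32)".
std::string formatParams(const std::vector<std::string>& types) {
  if (types.empty())
    return "";  // Assumption: no group is printed when there are no params.
  std::string out = "(param";
  for (const std::string& type : types) {
    out += ' ';
    out += type;
  }
  out += ')';
  return out;
}
```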

Author: binji
Reviewers: jfb
Subscribers: jfb, dschuff

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251714 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-30 16:41:21 +00:00
Tim Northover
9c59c5bea7 ARM: add extra test for watchOS ABI
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251705 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-30 16:29:44 +00:00
Weiming Zhao
57ece5c685 Revert "[ARM] Remove XFAIL on test/CodeGen/Generic/MachineBranchProb.ll"
Summary:
This reverts commit 79c37e1a4f.
    
    This test passes locally but fails on the community buildbot. So we will let it
    XFAIL for now.

Patched by Mandeep Singh Grang (mgrang@codeaurora.org)

Reviewers: kparzysz, weimingz

Subscribers: aemerson, rengolin, llvm-commits

Differential Revision: http://reviews.llvm.org/D14189

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251664 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-29 22:34:59 +00:00
Simon Pilgrim
d97fb0831c [X86][SSE] Added load+sext tests for 16i1->16i8 and 32i1->32i8
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251661 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-29 22:19:21 +00:00
Simon Pilgrim
d63f887b3e [X86][SSE] Shuffle blends with zero
This patch generalizes the zeroing of vector elements with the BLEND instructions. Currently a zero vector will only blend if the shuffled elements are correctly inline; this patch recognises when a vector input is zero (or zeroable) and modifies a local copy of the shuffle mask to support a blend. As a zeroable vector input may not be all zeroes, the zeroable vector is regenerated if necessary.

Differential Revision: http://reviews.llvm.org/D14050

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251659 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-29 22:11:28 +00:00
Weiming Zhao
79c37e1a4f [ARM] Remove XFAIL on test/CodeGen/Generic/MachineBranchProb.ll
Summary: Refer to PR23377. This test was XFAIL'ed for Hexagon as well as ARM. But it has now started passing for ARM.

Reviewers: hans, rengolin, aemerson, kparzysz

Subscribers: aemerson, llvm-commits, rengolin

Differential Revision: http://reviews.llvm.org/D14155

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251652 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-29 20:51:54 +00:00
Jonas Paulsson
ea81d62d37 [SystemZ] Make the CCRegs regclass non-allocatable.
This was discovered to be necessary while running memchr-01.ll with
-verify-machineinstrs, because it is not allowed to have a phys reg live
across block boundaries while on SSA form, if the register is
allocatable (except in the entry block and landing pads).

In this test case, stringRRE pseudos are expanded after isel by adding
a loop block which produces a live out CC register. To make the test
pass, it was also necessary to not say that the StringRRELoop pseudo uses
R0L; this is only true for the StringRRE opcode.

-verify-machineinstrs added to memchr-01.ll test.

New test case int-cmp-51.ll to test that MachineCSE can eliminate
an identical compare (which it couldn't do before).

Reviewed by Ulrich Weigand

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251634 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-29 16:13:55 +00:00
Marek Olsak
40e7b16d54 AMDGPU/SI: handle undef for llvm.SI.packf16
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251632 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-29 15:29:09 +00:00
Marek Olsak
595eb2bbd1 AMDGPU/SI: use S_OR for fneg (fabs f32)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251631 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-29 15:29:05 +00:00
Marek Olsak
bbc32e3efd AMDGPU/SI: use S_AND for i1 trunc
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251630 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-29 15:05:03 +00:00
Zoran Jovanovic
1a6d92e562 [mips] wrong opcode for ll/sc instructions on mipsr6 when -integrated-as is used
Summary:
This commit fixes the wrong opcodes for ll and sc instructions on r6 architectures, which were generated in the method MipsTargetLowering::emitAtomicBinary.

Author: Jelena.Losic

Reviewers: dsanders

Subscribers: dsanders, llvm-commits

Differential Revision: http://reviews.llvm.org/D13593


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251629 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-29 14:40:19 +00:00
Vasileios Kalintiris
8bbd2ffc7c [mips] Check the register class before replacing materializations of zero with $zero in microMIPS.
Summary:
The microMIPS register class GPRMM16 does not contain the $zero register.
However, MipsSEDAGToDAGISel::replaceUsesWithZeroReg() would replace uses
of the $dst register:

  [d]addiu, $dst, $zero, 0

with the $zero register, without checking for membership in the register
class of the target machine operand.
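
The missing check described above amounts to a set-membership test before the substitution. The following is a sketch only, not the code in MipsSEDAGToDAGISel: `RegClass` and `operandAfterZeroFold` are hypothetical names, and any register lists passed in are illustrative rather than the real class definitions.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical stand-in for a register class: a name plus its members.
struct RegClass {
  std::string name;
  std::vector<std::string> regs;

  bool contains(const std::string& reg) const {
    return std::find(regs.begin(), regs.end(), reg) != regs.end();
  }
};

// Replace a use of `dst` (materialized as "[d]addiu $dst, $zero, 0") with
// $zero only if the operand's register class actually contains $zero.
std::string operandAfterZeroFold(const RegClass& operandClass,
                                 const std::string& dst) {
  return operandClass.contains("$zero") ? "$zero" : dst;
}
```

With a class that lacks `$zero`, as the message says of GPRMM16, the function leaves the original `$dst` register in place.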

Reviewers: dsanders

Subscribers: llvm-commits, dsanders

Differential Revision: http://reviews.llvm.org/D13984

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251622 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-29 10:17:16 +00:00
Jonas Paulsson
0b88d3e7ee [MachineVerifier] Analyze MachineMemOperands for mem-to-mem moves.
Since the verifier will give false reports if it incorrectly thinks MI is
loading or storing using an FI, it is necessary to scan memoperands and
find out how the FI is used in the instruction. This should be relatively
rare.

Needed to make CodeGen/SystemZ/spill-01.ll pass, which now runs with this flag.

Reviewed by Quentin Colombet.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251620 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-29 08:28:35 +00:00
JF Bastien
0c4ea613c4 [WebAssembly] Update opcode name format for conversions
Summary:
Conversion opcode name format should be f64.convert_u/i64 not f64_convert_u

Author: s3ththompson
Reviewers: jfb
Subscribers: sunfish, jfb, llvm-commits, dschuff
Differential Revision: http://reviews.llvm.org/D14160

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251613 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-29 04:10:52 +00:00
Hal Finkel
b93a3a1757 [PowerPC] Recurse through constants when looking for TLS globals
We cannot form ctr-based loops around function calls, including calls to
__tls_get_addr used for PIC TLS variables. References to such TLS variables,
however, might be buried within constant expressions, and so we need to search
the entire constant expression to be sure that no references to such TLS
variables exist.

Fixes PR25256, reported by Eric Schweitz. This is a slightly-modified version
of the patch suggested by Eric in the bug report, and a test case I created.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251582 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-28 23:43:00 +00:00
Hal Finkel
75c3afbe05 [PowerPC] Don't return unsupported register classes for asm constraints
As a follow-up to r251566, do the same for the other optionally-supported
register classes (mostly for vector registers). Don't return an unavailable
register class (which would cause an assert later), but fail cleanly when
provided an unsupported inline asm constraint.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251575 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-28 23:03:45 +00:00
Tim Northover
ed754ee4a7 ARM: add support for WatchOS's compact unwind information.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251573 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-28 22:56:36 +00:00
Tim Northover
7b7ff9e152 ARM: teach backend about WatchOS and TvOS libcalls.
The most substantial changes are again for watchOS: libcalls are hard-float if
needed and sincos has a different calling convention.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251571 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-28 22:51:16 +00:00
Tim Northover
26541ec6e9 ARM: add backend support for the ABI used in WatchOS
At the LLVM level this ABI is essentially a minimal modification of AAPCS to
support 16-byte alignment for vector types and the stack.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251570 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-28 22:46:43 +00:00
Hal Finkel
16624bec2f [PowerPC] Cleanly reject asm crbit constraint with -crbits
When crbits are disabled, cleanly reject the constraint (rather than returning
the register class only to cause an assert later).

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251566 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-28 22:25:52 +00:00
Hal Finkel
ca04c7f8e3 [PowerPC] Fix CodeGen/PowerPC/crbit-asm.ll test for -O1
Add the crbits processor feature so that the test can be run at -O1, etc.
regardless of the default crbits setting.

Fixes PR23778.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251548 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-28 19:58:02 +00:00
JF Bastien
a2446f8d17 WebAssembly: disable some loop-idiom recognition
memset/memcpy aren't fully supported yet. We should invert this test
once they are supported.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251534 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-28 17:50:23 +00:00
Hal Finkel
e60bc4e8b8 [PowerPC] Replace cntlz[.] with cntlzw[.]
cntlz is the old POWER mnemonic. cntlzw is the PowerPC mnemonic.

This change fixes an issue when -no-integrated-as is used: the opcode cntlz is
unrecognized by gas.

Alias the POWER mnemonic cntlz[.] to the PowerPC mnemonic cntlzw[.].
This is done because the POWER cntlz mnemonic has been used by LLVM for
a very long time. We need to make sure that assembly programs
that use cntlz[.] do not break with this change.

Change PowerPC tests to reflect the insn change from cntlz to cntlzw.
Add assembly test to verify cntlz[.] is encoded correctly.

Patch by Tom Rix!

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251489 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-28 03:26:45 +00:00
Sanjoy Das
0c67849a49 [SelectionDAG] Don't inspect !range metadata for extended loads
Summary:
Don't call `computeKnownBitsFromRangeMetadata` for extended loads --
this can cause a mismatch between the width of the !range metadata and
the width of the APInt's accumulating `KnownZero` (and `KnownOne` in the
future).  This isn't a problem now, but will be after a future change.

Note: this can be made more aggressive in the future.

Reviewers: nlewycky

Subscribers: llvm-commits

Differential Revision: http://reviews.llvm.org/D14107

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251486 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-28 03:20:10 +00:00
Simon Pilgrim
d694139c3a [X86][AVX512] Test UNPCK with non-sequential scalars
Missing tests for r251297

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251453 91177308-0d34-0410-b5e6-96231b3b80d8
2015-10-27 21:18:45 +00:00