25849 Commits

Author SHA1 Message Date
Robin Morisset
217b38e19a Fix typos in comments, NFC
Summary: Just fixing comments, no functional change.

Test Plan: N/A

Reviewers: jfb

Subscribers: mcrosier, llvm-commits

Differential Revision: http://reviews.llvm.org/D5130

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216784 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-29 21:53:01 +00:00
Reid Kleckner
9436574d1b musttail: Forward regparms of variadic functions on x86_64
Summary:
If a variadic function body contains a musttail call, then we copy all
of the remaining register parameters into virtual registers in the
function prologue. We track the virtual registers through the function
body, and add them as additional registers to pass to the call. Because
this is all done in virtual registers, the register allocator usually
gives us good code. If the function does a call, however, it will have
to spill and reload all argument registers (ew).

Forwarding regparms on x86_32 is not implemented because most compilers
don't support varargs in 32-bit with regparms.

Reviewers: majnemer

Subscribers: aemerson, llvm-commits

Differential Revision: http://reviews.llvm.org/D5060

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216780 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-29 21:42:08 +00:00
Reid Kleckner
dae28732f4 Verifier: Don't reject varargs callee cleanup functions
We've rejected these kinds of functions since r28405 in 2006 because
it's impossible to lower the return of a callee cleanup varargs
function. However there are lots of legal ways to leave such a function
without returning, such as aborting. Today we can leave a function with
a musttail call to another function with the correct prototype, and
everything works out.

I'm removing the verifier check declaring that a normal return from such
a function is UB.

Reviewed By: nlewycky

Differential Revision: http://reviews.llvm.org/D5059

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216779 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-29 21:25:28 +00:00
Louis Gerbarg
6393b3a677 Remove spurious mask operations from AArch64 add->compares on 16 and 8 bit values
This patch checks for DAG patterns that are an add or a sub followed by a
compare on 16 and 8 bit inputs. Since AArch64 does not support those types
natively they are legalized into 32 bit values, which means that mask operations
are inserted into the DAG to emulate overflow behaviour. In many cases those
masks do not change the result of the processing and just introduce a dependent
operation, often in the middle of a hot loop.

This patch detects the relevent DAG patterns and then tests to see if the
transforms are equivalent with and without the mask, removing the mask if
possible. The exact mechanism of this patch was discusses in
http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-July/074444.html

There is a reasonably good chance there are missed oppurtunities due to similiar
(but not identical) DAG patterns that could be funneled into this test, adding
them should be simple if we see test cases.

Tests included.

rdar://13754426

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216776 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-29 21:00:22 +00:00
Reid Kleckner
1469e29334 X86: Fix conflict over ESI between base register and rep;movsl
The new solution is to not use this lowering if there are any dynamic
allocas in the current function. We know up front if there are dynamic
allocas, but we don't know if we'll need to create stack temporaries
with large alignment during lowering. Conservatively assume that we will
need such temporaries.

Reviewed By: hans

Differential Revision: http://reviews.llvm.org/D5128

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216775 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-29 20:50:31 +00:00
Robin Morisset
66b380b6b6 Relax the constraint more in MemoryDependencyAnalysis.cpp
Even loads/stores that have a stronger ordering than monotonic can be safe.
The rule is no release-acquire pair on the path from the QueryInst, assuming that
the QueryInst is not atomic itself.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216771 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-29 20:32:58 +00:00
Jean-Luc Duprat
1da4862939 Tablegen fixes for new syntax when initializing bits from variables.
Followup to r215086.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216757 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-29 19:41:04 +00:00
Juergen Ributzka
d8835d09ec [FastISel][AArch64] Fix an incorrect kill flag due to a bug in SelectTrunc.
When we select a trunc instruction we don't emit any code if the type is already
i32 or smaller. This is because the instruction that uses the truncated value
will deal with it.

This behavior can incorrectly transfer a kill flag, which was meant for the
result of the truncate, onto the source register.

%2 = trunc i32 %1 to i16
... = ... %2                -> ... = ... vreg1 <kill>
... = ... %1                   ... = ... vreg1

This commit fixes this by emitting a COPY instruction, so that the result and
source register are distinct virtual registers.

This fixes rdar://problem/18178188.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216750 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-29 17:58:16 +00:00
Tilmann Scheller
59758c4337 [ARM] Add Thumb-2 code size optimization test for ASR (register).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216746 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-29 17:19:00 +00:00
Tilmann Scheller
b1424d72ca [ARM] Add Thumb-2 code size optimization test for ASR (immediate).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216744 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-29 17:02:28 +00:00
Matt Arsenault
0d90962f29 Make fabs safe to speculatively execute
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216736 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-29 16:01:17 +00:00
Matt Arsenault
f4d57e7874 R600/SI: Use mad for fsub + fmul
We can use a negate source modifier to match
this for fsub.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216735 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-29 16:01:14 +00:00
Tim Northover
1e77dc84c4 AArch64: only try to get operand of a known node.
A bug in r216725 meant we tried to discover the type of a SETCC before
confirming the node actually was a SETCC.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216734 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-29 15:34:58 +00:00
Frederic Riss
d1c544306e Remove unnecessary regex in test pattern per dblaikie suggestion.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216733 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-29 15:32:15 +00:00
Jingyue Wu
87a2b36cf6 [NVPTX] Make the alignment an explicit argument to ldu/ldg
Summary:
Instead of specifying the alignment as metadata which may be destroyed by
transformation passes, make the alignment the second argument to ldu/ldg
intrinsic calls.

Test Plan:
ldu-ldg.ll
ldu-i8.ll
ldu-reg-plus-offset.ll

Reviewers: eliben, meheff, jholewinski

Reviewed By: meheff, jholewinski

Subscribers: jholewinski, llvm-commits

Differential Revision: http://reviews.llvm.org/D5093

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216731 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-29 15:30:20 +00:00
Tilmann Scheller
c5484a2704 [ARM] Make Thumb-2 code size optimization test more strict.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216729 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-29 15:13:35 +00:00
Tilmann Scheller
f238c1844e [ARM] Add a first test for the Thumb-2 code size optimization pass.
While working on a Thumb-2 code size optimization I just realized that we don't have any regression tests for it.

So here's a first test case, I plan to increase the coverage over time.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216728 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-29 15:04:40 +00:00
Tim Northover
1f70cb9c14 AArch64: skip select/setcc combine in complex case.
In an llvm-stress generated test, we were trying to create a v0iN type and
asserting when that failed. This case could probably be handled by the
function, but not without added complexity and the situation it arises in is
sufficiently odd that there's probably no benefit anyway.

Should fix PR20775.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216725 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-29 13:05:18 +00:00
Frederic Riss
622de3b2eb Use DwarfDebug::attachLowHighPC for the compilation unit DIE.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216719 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-29 09:00:26 +00:00
Robert Khasanov
37e671e894 [SKX] Enable lowering of integer CMP operations.
Added new types to Legalizer.
Fixed getSetCCResultType function
Added lowering tests.

Reviewed by Elena Demikhovsky.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216717 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-29 08:46:04 +00:00
Job Noorman
d2323fc295 Do not assume the value passed to memset is an i32.
The code in SelectionDAG::getMemset for some reason assumes the value passed to
memset is an i32. This breaks the generated code for targets that only have
registers smaller than 32 bits because the value might get split into multiple
registers by the calling convention. See the test for the MSP430 target included
in the patch for an example.

This patch ensures that nothing is assumed about the type of the value. Instead,
the type is taken from the selected overload of the llvm.memset intrinsic.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216716 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-29 08:23:53 +00:00
Jiangning Liu
3cd73a5ded [AArch64] Fix some failures exposed by value type v4f16 and v8f16.
1) Add some missing bitcast patterns for v8f16.
2) Add type promotion for operand of ld/st operations.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216706 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-29 01:31:42 +00:00
Alexey Samsonov
79e9b30b11 Introduce -DLLVM_USE_SANITIZER=Undefined CMake option to build UBSan-ified version of LLVM/Clang.
I've fixed most of the simple bugs and currently "check-llvm" test suite
has 26 failures, and "check-clang" suite has 5 failures.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216701 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-29 00:50:36 +00:00
Juergen Ributzka
cf45151b2c [FastISel][AArch64] Don't fold instructions that are not in the same basic block.
This fix checks first if the instruction to be folded (e.g. sign-/zero-extend,
or shift) is in the same machine basic block as the instruction we are folding
into.

Not doing so can result in incorrect code, because the value might not be
live-out of the basic block, where the value is defined.

This fixes rdar://problem/18169495.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216700 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-29 00:19:21 +00:00
David Majnemer
6acfc54706 Revert two GEP-related InstCombine commits
This reverts commit r216523 and r216598; people have reported
regressions.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216698 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-29 00:06:43 +00:00
Reid Kleckner
8f58e02646 Don't promote byval pointer arguments when padding matters
Don't promote byval pointer arguments when when their size in bits is
not equal to their alloc size in bits. This can happen for x86_fp80,
where the size in bits is 80 but the alloca size in bits in 128.
Promoting these types can break passing unions of x86_fp80s and other
types.

Patch by Thomas Jablin!

Reviewed By: rnk

Differential Revision: http://reviews.llvm.org/D5057

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216693 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-28 22:42:00 +00:00
Jim Grosbach
0d34b1ed26 AArch64: More correctly constrain target vector extend lowering.
The AArch64 target lowering for [zs]ext of vectors is set up to handle
input simple types and expects the generic SDag path to do something reasonable
with anything that's not a simple type. The code, however, was only
checking that the result type was a simple type and assuming that
implied that the source type would also be a simple type. That's not a
valid assumption, as operations like "zext <1 x i1> %0 to <1 x i32>"
demonstrate. The fix is to simply explicitly validate the source type
as well as the result type.

PR20791

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216689 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-28 22:08:28 +00:00
Rafael Espindola
4bb535027e On MachO, don't put non-private constants in mergeable sections.
On MachO, putting a symbol that doesn't start with a 'L' or 'l' in one of the
__TEXT,__literal* sections prevents the linker from merging the context of the
section.

Since private GVs are the ones the get mangled to start with 'L' or 'l', we now
only put those on the __TEXT,__literal* sections.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216682 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-28 20:13:31 +00:00
Sanjay Patel
cf9661c6f9 Fix a logic bug in x86 vector codegen: sext (zext (x) ) != sext (x) (PR20472).
Remove a block of code from LowerSIGN_EXTEND_INREG() that was added with:
http://llvm.org/viewvc/llvm-project?view=revision&revision=177421

And caused:
http://llvm.org/bugs/show_bug.cgi?id=20472 (more analysis here)
http://llvm.org/bugs/show_bug.cgi?id=18054

The testcases confirm that we (1) don't remove a zext op that is necessary and (2) generate
a pmovz instead of punpck if SSE4.1 is available. Although pmovz is 1 byte longer, it allows 
folding of the load, and so saves 3 bytes overall.

Differential Revision: http://reviews.llvm.org/D4909



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216679 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-28 18:59:22 +00:00
Erik Eckstein
c84ba857ea Fix: SLPVectorizer tried to move an instruction which was replaced by a vector instruction.
For a detailed description of the problem see the comment in the test file.
The problematic moveBefore() calls are not required anymore because the new
scheduling algorithm ensures a correct ordering anyway.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216656 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-28 07:04:02 +00:00
David Xu
5ca793561e Generate CMN when comparing a short int with minus
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216651 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-28 04:59:53 +00:00
Chandler Carruth
da3a293313 [x86] Clean up some tests to use FileCheck and combine two into a single
file.

Changing code that is covered by these tests is just too hard to debug
currently, and now it will be clear the nature of the changes.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216643 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-28 03:41:28 +00:00
David Majnemer
b11fff1d8a InstSimplify: Move a transform from InstCombine to InstSimplify
Several combines involving icmp (shl C2, %X) C1 can be simplified
without introducing any new instructions.  Move them to InstSimplify;
while we are at it, make them more powerful.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216642 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-28 03:34:28 +00:00
Juergen Ributzka
4a76317ebb [FastISel] Undo phi node updates when falling-back to SelectionDAG.
The included test case would fail, because the MI PHI node would have two
operands from the same predecessor.

This problem occurs when a switch instruction couldn't be selected. This happens
always, because there is no default switch support for FastISel to begin with.

The problem was that FastISel would first add the operand to the PHI nodes and
then fall-back to SelectionDAG, which would then in turn add the same operands
to the PHI nodes again.

This fix removes these duplicate PHI node operands by reseting the
PHINodesToUpdate to its original state before FastISel tried to select the
instruction.

This fixes <rdar://problem/18155224>.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216640 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-28 02:06:55 +00:00
Juergen Ributzka
d24494d672 [FastISel]
Currently instructions are folded very aggressively for AArch64 into the memory
operation, which can lead to the use of killed operands:
  %vreg1<def> = ADDXri %vreg0<kill>, 2
  %vreg2<def> = LDRBBui %vreg0, 2
  ... = ... %vreg1 ...

This usually happens when the result is also used by another non-memory
instruction in the same basic block, or any instruction in another basic block.

This fix teaches hasTrivialKill to not only check the LLVM IR that the value has
a single use, but also to check if the register that represents that value has
already been used. This can happen when the instruction with the use was folded
into another instruction (in this particular case a load instruction).

This fixes rdar://problem/18142857.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216634 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-28 00:09:46 +00:00
Juergen Ributzka
a26b1bdcc8 Revert "[FastISel][AArch64] Don't fold instructions too aggressively into the memory operation."
Quentin pointed out that this is not the correct approach and there is a better and easier solution.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216632 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-27 23:09:40 +00:00
Juergen Ributzka
1f5263e43f [FastISel][AArch64] Don't fold instructions too aggressively into the memory operation.
Currently instructions are folded very aggressively into the memory operation,
which can lead to the use of killed operands:
  %vreg1<def> = ADDXri %vreg0<kill>, 2
  %vreg2<def> = LDRBBui %vreg0, 2
  ... = ... %vreg1 ...

This usually happens when the result is also used by another non-memory
instruction in the same basic block, or any instruction in another basic block.

If the computed address is used by only memory operations in the same basic
block, then it is safe to fold them. This is because all memory operations will
fold the address computation and the original computation will never be emitted.

This fixes rdar://problem/18142857.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216629 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-27 22:52:33 +00:00
Juergen Ributzka
ccf53013cd [FastISel][AArch64] Fix simplify address when the address comes from a shift.
When the address comes directly from a shift instruction then the address
computation cannot be folded into the memory instruction, because the zero
register is not available as a base register. Simplify addess needs to emit the
shift instruction and use the result as base register.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216621 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-27 21:38:33 +00:00
Juergen Ributzka
d445e4acdb [FastISel][AArch64] Use the zero register for stores.
Use the zero register directly when possible to avoid an unnecessary register
copy and a wasted register at -O0. This also uses integer stores to store a
positive floating-point zero. This saves us from materializing the positive zero
in a register and then storing it.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216617 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-27 21:04:52 +00:00
Reid Kleckner
2ab3b563da X86 MC: Handle instructions like fxsave that match multiple operand sizes
Instructions like 'fxsave' and control flow instructions like 'jne'
match any operand size. The loop I added to the Intel syntax matcher
assumed that using a different size would give a different instruction.
Now it handles the case where we get the same instruction for different
memory operand sizes.

This also allows us to remove the hack we had for unsized absolute
memory operands, because we can successfully match things like 'jnz'
without reporting ambiguity.  Removing this hack uncovered test case
involving 'fadd' that was ambiguous. The memory operand could have been
single or double precision.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216604 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-27 20:10:38 +00:00
David Majnemer
8ee308f499 InstCombine: Combine gep X, (Y-X) to Y
We try to perform this transform in InstSimplify but we aren't always
able to.  Sometimes, we need to insert a bitcast if X and Y don't have
the same time.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216598 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-27 20:08:37 +00:00
David Majnemer
48164ed24d InstSimplify: Don't simplify gep X, (Y-X) to Y if types differ
It's incorrect to perform this simplification if the types differ.
A bitcast would need to be inserted for this to work.

This fixes PR20771.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216597 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-27 20:08:34 +00:00
Nico Weber
1c768816d7 Reland r216439 215441, majnemer has a real fix for PR20771.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216586 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-27 20:06:19 +00:00
Nico Weber
b2f71836eb Revert r216439 (and r216441, else the former doesn't revert cleanly).
It caused PR 20771. I'll land a test on the clang side.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216582 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-27 20:00:13 +00:00
David Majnemer
d8e448bd27 InstSimplify: Compute comparison ranges for left shift instructions
'shl nuw CI, x' produces [CI, CI << CLZ(CI)]
'shl nsw CI, x' produces [CI << CLO(CI)-1, CI] if CI is negative
'shl nsw CI, x' produces [CI, CI << CLZ(CI)-1] if CI is non-negative

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216570 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-27 18:03:46 +00:00
Oliver Stannard
5e487f8dc7 Teach the AArch64 backend about v4f16 and v8f16
This teaches the AArch64 backend to deal with the operations required
to deal with the operations on v4f16 and v8f16 which are exposed by
NEON intrinsics, plus the add, sub, mul and div operations.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216555 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-27 16:16:04 +00:00
Michael Zolotukhin
b8c95a89e6 [SLP] Re-enable vectorization of GEP expressions (re-apply r210342 with a fix).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216549 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-27 15:01:18 +00:00
Chandler Carruth
7e3dc40fab [x86] Fix a regression introduced with r213897 for 32-bit targets where
we stopped efficiently lowering sextload using the SSE41 instructions
for that operation.

This is a consequence of a bad predicate I used thinking of the memory
access needs. The code actually handles the cases where the predicate
doesn't apply, and handles them much better. =] Simple fix and a test
case added. Fixes PR20767.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216538 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-27 11:39:47 +00:00
Chandler Carruth
963a5e6c61 [SDAG] Re-instate r215611 with a fix to a pesky X86 DAG combine.
This combine is essentially combining target-specific nodes back into target
independent nodes that it "knows" will be combined yet again by a target
independent DAG combine into a different set of target-independent nodes that
are legal (not custom though!) and thus "ok". This seems... deeply flawed. The
crux of the problem is that we don't combine un-legalized shuffles that are
introduced by legalizing other operations, and thus we don't see a very
profitable combine opportunity. So the backend just forces the input to that
combine to re-appear.

However, for this to work, the conditions detected to re-form the unlegalized
nodes must be *exactly* right. Previously, failing this would have caused poor
code (if you're lucky) or a crasher when we failed to select instructions.
After r215611 we would fall back into the legalizer. In some cases, this just
"fixed" the crasher by produces bad code. But in the test case added it caused
the legalizer and the dag combiner to iterate forever.

The fix is to make the alignment checking in the x86 side of things match the
alignment checking in the generic DAG combine exactly. This isn't really a
satisfying or principled fix, but it at least make the code work as intended.
It also highlights that it would be nice to detect the availability of under
aligned loads for a given type rather than bailing on this optimization. I've
left a FIXME to document this.

Original commit message for r215611 which covers the rest of the chang:
  [SDAG] Fix a case where we would iteratively legalize a node during
  combining by replacing it with something else but not re-process the
  node afterward to remove it.

  In a truly remarkable stroke of bad luck, this would (in the test case
  attached) end up getting some other node combined into it without ever
  getting re-processed. By adding it back on to the worklist, in addition
  to deleting the dead nodes more quickly we also ensure that if it
  *stops* being dead for any reason it makes it back through the
  legalizer. Without this, the test case will end up failing during
  instruction selection due to an and node with a type we don't have an
  instruction pattern for.

It took many million runs of the shuffle fuzz tester to find this.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216537 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-27 11:22:16 +00:00
Robert Khasanov
e79a94a839 [SKX] Added new versions of cmp instructions in avx512_icmp_cc multiclass, added VL multiclass.
Added encoding tests


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216532 91177308-0d34-0410-b5e6-96231b3b80d8
2014-08-27 09:34:37 +00:00