Fix for PR20648 - http://llvm.org/bugs/show_bug.cgi?id=20648
This patch checks the operands of a vselect to see if all values are constants.
If yes, bail out of any further attempts to create a blend or shuffle because
SelectionDAGLegalize knows how to turn this kind of vselect into a single load.
This already happens for machines without SSE4.1, so the added checks just send
more targets down that path.
Differential Revision: http://reviews.llvm.org/D4934
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216121 91177308-0d34-0410-b5e6-96231b3b80d8
The goal of the patch is to implement section 3.2.3 of the AMD64 ABI
correctly. The controlling sentence is, "The size of each argument gets
rounded up to eightbytes. Therefore the stack will always be eightbyte
aligned." The equivalent sentence in the i386 ABI page 37 says, "At all
times, the stack pointer should point to a word-aligned area." For both
architectures, the stack pointer is not being rounded up to the nearest
eightbyte or word between the last normal argument and the first
variadic argument.
Patch by Thomas Jablin!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216119 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: This fixes http://llvm.org/bugs/show_bug.cgi?id=19530.
The problem is that X86ISelLowering erroneously thought the third call
was eligible for tail call elimination.
It would have been if it's return value was actually the one returned
by the calling function, but here that is not the case and
additional values are being returned.
Test Plan: Test case from the original bug report is included.
Reviewers: rafael
Reviewed By: rafael
Subscribers: rafael, llvm-commits
Differential Revision: http://reviews.llvm.org/D4968
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216117 91177308-0d34-0410-b5e6-96231b3b80d8
Note: This was originally reverted to track down a buildbot error. Reapply
without any modifications.
Original commit message:
In the large code model for X86 floating-point constants are placed in the
constant pool and materialized by loading from it. Since the constant pool
could be far away, a PC relative load might not work. Therefore we first
materialize the address of the constant pool with a movabsq and then load
from there the floating-point value.
Fixes <rdar://problem/17674628>.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216012 91177308-0d34-0410-b5e6-96231b3b80d8
Note: This was originally reverted to track down a buildbot error. Reapply
without any modifications.
Original commit message:
This mostly affects the i64 value type, which always resulted in an 15byte
mobavsq instruction to materialize any constant. The custom code checks the
value of the immediate and tries to use a different and smaller mov
instruction when possible.
This fixes <rdar://problem/17420988>.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216010 91177308-0d34-0410-b5e6-96231b3b80d8
Group: Floating Point XMM and YMM instructions.
Sub-group: Other instructions.
<rdar://problem/15607571>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215923 91177308-0d34-0410-b5e6-96231b3b80d8
Group: Floating Point XMM and YMM instructions.
Sub-group: Math instructions.
<rdar://problem/15607571>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215921 91177308-0d34-0410-b5e6-96231b3b80d8
ARM in particular is getting dangerously close to exceeding 32 bits worth of
possible subtarget features. When this happens, various parts of MC start to
fail inexplicably as masks get truncated to "unsigned".
Mostly just refactoring at present, and there's probably no way to test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215887 91177308-0d34-0410-b5e6-96231b3b80d8
It should remove dosens of lines in handling instrinsics (in a huge switch) and give an easy way to add new intrinsics.
I did not completed to move al intrnsics to the table, I'll do this in the upcomming commits.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215826 91177308-0d34-0410-b5e6-96231b3b80d8
MSVC gives this awesome diagnostic:
..\lib\Target\X86\X86ISelLowering.cpp(7085) : error C2971: 'llvm::VariadicFunction1' : template parameter 'Func' : 'isShuffleEquivalentImpl' : a local variable cannot be used as a non-type argument
..\include\llvm/ADT/VariadicFunction.h(153) : see declaration of 'llvm::VariadicFunction1'
..\lib\Target\X86\X86ISelLowering.cpp(7061) : see declaration of 'isShuffleEquivalentImpl'
Using an anonymous namespace makes the problem go away.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215744 91177308-0d34-0410-b5e6-96231b3b80d8
the new shuffle lowering and an implementation for v4 shuffles.
This allows us to handle non-half-crossing shuffles directly for v4
shuffles, both integer and floating point. This currently misses places
where we could perform the blend via UNPCK instructions, but otherwise
generates equally good or better code for the test cases included to the
existing vector shuffle lowering. There are a few cases that are
entertainingly better. ;]
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215702 91177308-0d34-0410-b5e6-96231b3b80d8
BLENDPS, BLENDPD, and PBLENDW instructions into pretty shuffle comments.
These will be used in my next commit as part of test cases for AVX
shuffles which can directly use blend in more places.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215701 91177308-0d34-0410-b5e6-96231b3b80d8
elements of a shuffle mask and simplify how it works. No functionality
changed now that the bug that was here has been fixed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215696 91177308-0d34-0410-b5e6-96231b3b80d8
target-specific shuffl DAG combines.
We were recognizing the paired shuffles backwards. This code needs to be
replaced anyways as we have the same functionality elsewhere, but I'll
do the refactoring in a follow-up, this is the minimal fix to the
behavior.
In addition to fixing miscompiles with the new vector shuffle lowering,
it also causes the canonicalization to kick in much better, selecting
the smaller encoding variants in lots of places in the new AVX path.
This still isn't quite ideal as we don't need both the shufpd and the
punpck instructions, but that'll get fixed in a follow-up patch.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215690 91177308-0d34-0410-b5e6-96231b3b80d8
broken logic for merging shuffle masks in the face of SM_SentinelZero
mask operands.
While these are '-1' they don't mean 'undef' the way '-1' means in the
pre-legalized shuffle masks. Instead, they mean that the shuffle
operation is forcibly zeroing that lane. Reflect this and explicitly
handle it in a bunch of places. In one place the effect is equivalent
but much more clear. In the rest it was really weirdly broken.
Also, rewrite the entire merging thing to be a more directy operation
with a single loop and just doing math to map the indices through the
various masks.
Also add a bunch of asserts to try to make in extremely clear what the
different masks can possibly look like.
Finally, add some comments to clarify that we're merging shuffle masks
*up* here rather than *down* as we do everywhere else, and thus the
logic is quite confusing.
Thanks to several different people for sending test cases, and for
Robert Khasanov for an initial attempt at fixing.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215687 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts:
r215595 "[FastISel][X86] Add large code model support for materializing floating-point constants."
r215594 "[FastISel][X86] Use XOR to materialize the "0" value."
r215593 "[FastISel][X86] Emit more efficient instructions for integer constant materialization."
r215591 "[FastISel][AArch64] Make use of the zero register when possible."
r215588 "[FastISel] Let the target decide first if it wants to materialize a constant."
r215582 "[FastISel][AArch64] Cleanup constant materialization code. NFCI."
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215673 91177308-0d34-0410-b5e6-96231b3b80d8