dag-combine optimization to implement the ext-load efficiently (using shuffles).
For example the type <4 x i8> is stored in memory as i32, but it needs to
find its way into a <4 x i32> register. Previously we scalarized the memory
access, now we use shuffles.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139995 91177308-0d34-0410-b5e6-96231b3b80d8
maxps and maxpd). This broke the sse41-blend.ll testcase by causing
maxpd to be produced rather than a cmp+blend pair, which is the reason
I tweaked it. Gives a small speedup on doduc with dragonegg when the
GCC vectorizer is used.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139986 91177308-0d34-0410-b5e6-96231b3b80d8
are declared with load patterns. This fix the crash in PR10941. No testcases,
since a fold is triggered and then converted back to the register form
afterwards.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139953 91177308-0d34-0410-b5e6-96231b3b80d8
This PR basically reports a problem where a crash in generated code
happened due to %rbp being clobbered:
pushq %rbp
movq %rsp, %rbp
....
vmovmskps %ymm12, %ebp
....
movq %rbp, %rsp
popq %rbp
ret
Since Eric's r123367 commit, the default stack alignment for x86 32-bit
has changed to be 16-bytes. Since then, the MaxStackAlignmentHeuristicPass
hasn't been really used, but with AVX it becomes useful again, since per
ABI compliance we don't always align the stack to 256-bit, but only when
there are 256-bit incoming arguments.
ReserveFP was only used by this pass, but there's no RA target hook that
uses getReserveFP() to check for the presence of FP (since nothing was
triggering the pass to run, the uses of getReserveFP() were removed
through time without being noticed). Change this pass to use
setForceFramePointer, which is properly called by MachineFunction
hasFP method.
The testcase is very big and dependent on RA, not sure if it's worth
adding to test/CodeGen/X86.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139939 91177308-0d34-0410-b5e6-96231b3b80d8
The leaveIntvAfter() function normally inserts a back-copy after the
requested instruction, making the back-copy kill the live range.
In spill mode, try to insert the back-copy before the last use instead.
That means the last use becomes the kill instead of the back-copy. This
lowers the register pressure because the last use can now redefine the
same register it was reading.
This will also improve compile time: The back-copy isn't a kill, so
hoisting it in hoistCopiesForSize() won't force a recomputation of the
source live range. Similarly, if the back-copy isn't hoisted by the
splitter, the spiller will not attempt hoisting it locally.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139883 91177308-0d34-0410-b5e6-96231b3b80d8
If the source register is live after the copy being spilled, there is no
point to hoisting it. Hoisting inside a basic block only serves to
resolve interferences by shortening the live range of the source.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139882 91177308-0d34-0410-b5e6-96231b3b80d8
gold plugin is built with Large File Support (sizeof(off_t) == 64 on i686)
and the rest of LLVM is built w/o Large File Support
(sizeof(off_t) == 32 on i686) which corrupts the stack.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139873 91177308-0d34-0410-b5e6-96231b3b80d8
When -split-spill-mode is enabled, spill hoisting is performed by
SplitKit instead of by InlineSpiller. This hidden command line option
is for testing the splitter spill mode.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@139845 91177308-0d34-0410-b5e6-96231b3b80d8