The changes in test/CodeGen/X86/machine-cp.ll are just due to
scheduling differences after some logic instructions were reassociated.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247516 91177308-0d34-0410-b5e6-96231b3b80d8
Renamed to lowerVectorShuffleAsPermuteAndUnpack to make it clear that it lowers to more than just a UNPCK instruction.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247513 91177308-0d34-0410-b5e6-96231b3b80d8
Use the function attribute "stackrealign" to determine when stack
realignment should be forced.
With this commit, we can now force stack realignment when doing LTO and
do so on a per-function basis. Also, add a new cl::opt option
"stackrealign" to CommandFlags.h which is used to force stack
realignment via llc's command line.
Out-of-tree projects currently using -force-align-stack to force stack
realignment should instead attach the "stackrealign" attribute to the
relevant functions in the IR.
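For out-of-tree projects, a minimal sketch of that migration using the
C++ API (illustrative only; the helper name is made up):

#include "llvm/IR/Function.h"
#include "llvm/IR/Module.h"

// Hypothetical helper: attach the "stackrealign" attribute to every
// function in a module, replacing a global -force-align-stack flag.
static void forceStackRealignment(llvm::Module &M) {
  for (llvm::Function &F : M)
    F.addFnAttr("stackrealign");
}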
Differential Revision: http://reviews.llvm.org/D11814
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247450 91177308-0d34-0410-b5e6-96231b3b80d8
The Win32 EH runtime caller does not preserve EBP, even though it does
preserve the CSRs (EBX, ESI, EDI) for us. The result was that each
finally funclet call would leave the frame pointer off by 12 bytes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247348 91177308-0d34-0410-b5e6-96231b3b80d8
Except for the changes that defined virtual destructors as =default,
because those ran into problems with GCC 4.7 when overriding methods
that weren't noexcept.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247298 91177308-0d34-0410-b5e6-96231b3b80d8
All of the complexity is in cleanupret, and it mostly follows the same
codepaths as catchret, except it doesn't take a return value in RAX.
This small example now compiles and executes successfully on win32:
extern "C" int printf(const char *, ...) noexcept;
struct Dtor {
~Dtor() { printf("~Dtor\n"); }
};
void has_cleanup() {
Dtor o;
throw 42;
}
int main() {
try {
has_cleanup();
} catch (int) {
printf("caught it\n");
}
}
Don't try to put the cleanup in the same function as the catch, or Bad
Things will happen.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247219 91177308-0d34-0410-b5e6-96231b3b80d8
The 32-bit tables don't actually contain PC range data, so emitting them
is incredibly simple.
The 64-bit tables, on the other hand, use the same table for state
numbering as well as label ranges. This makes things more difficult, so
it will be implemented later.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247192 91177308-0d34-0410-b5e6-96231b3b80d8
With subregister liveness enabled we can detect the case where only
parts of a register are live in; this is expressed as a 32-bit lane mask.
The current code only kept registers in the live-in list and therefore
enumerated all subregisters affected by the lane mask. This turned out to
be too conservative, as a subregister may also cover additional parts of
the lane mask which are not live. Expressing a given lane mask by
enumerating a minimal set of subregisters is computationally expensive,
so the best solution is to simply change the live-in list to store the
lane masks as well. This will reduce memory usage for targets using
subregister liveness and slightly increase it for other targets.
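As an illustrative sketch (this mirrors, but is not, the actual LLVM
data structure), a live-in entry now pairs the register with the lane
mask describing which parts of it are live:

#include <cstdint>
#include <vector>

// Sketch: storing a lane mask per live-in register avoids expanding a
// partially live register into an enumerated set of subregisters.
struct LiveInEntry {
  unsigned PhysReg;   // physical register number
  uint32_t LaneMask;  // set bits mark the live lanes of PhysReg
};
using LiveInList = std::vector<LiveInEntry>;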
Differential Revision: http://reviews.llvm.org/D12442
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247171 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
32-bit funclets have short prologues that allocate enough stack for the
largest call in the whole function. The runtime saves CSRs for the
funclet. It doesn't restore CSRs after we finally transfer control back
to the parent function via a CATCHRET, but that's a separate issue.
32-bit funclets also have to adjust the incoming EBP value, which is
what llvm.x86.seh.recoverframe does in the old model.
64-bit funclets need to spill CSRs as normal. For simplicity, this just
spills the same set of CSRs as the parent function, rather than trying
to compute different CSR sets for the parent function and each funclet.
64-bit funclets also allocate enough stack space for the largest
outgoing call frame, like 32-bit.
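A conceptual sketch of the sizing rule described above (not the actual
prologue-emission code; the helper name is made up):

#include "llvm/CodeGen/MachineFrameInfo.h"

// Sketch: a funclet's prologue allocates the parent function's largest
// outgoing call-frame size, so any call made from the funclet already
// has enough stack available.
static unsigned funcletAllocSize(const llvm::MachineFrameInfo &MFI) {
  return MFI.getMaxCallFrameSize();
}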
Reviewers: majnemer
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D12546
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247092 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: This patch modifies X86TargetLowering::LowerVASTART so that
struct va_list is initialized with 32-bit pointers in x32. It also
includes tests that call @llvm.va_start() for x32.
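For illustration, this is the shape of source code affected: va_start
compiles to a call to @llvm.va_start, which is lowered through
X86TargetLowering::LowerVASTART (example function made up, not taken
from the tests):

#include <cstdarg>

// Example varargs function; under x32, the pointers va_start writes
// into the va_list are now initialized as 32-bit pointers.
int sum(int n, ...) {
  va_list ap;
  va_start(ap, n);  // becomes @llvm.va_start
  int total = 0;
  for (int i = 0; i < n; ++i)
    total += va_arg(ap, int);
  va_end(ap);
  return total;
}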
Patch by João Porto
Subscribers: llvm-commits, hjl.tools
Differential Revision: http://reviews.llvm.org/D12346
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247069 91177308-0d34-0410-b5e6-96231b3b80d8
The old implementation assumed LP64, which is broken for x32. Specifically,
MOV8rm_NOREX and MOV8mr_NOREX, when selected, would cause a 'Cannot emit
physreg copy instruction' error message to be reported.
This patch also enables the h-register*.ll tests for x32.
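For illustration (made-up function, not one of the tests), this is the
kind of code that exercises those moves: extracting the second byte of
a value can be selected as a move out of a high-byte register
(AH/BH/CH/DH), which must use the _NOREX variants so no REX prefix is
emitted:

// Bits 8-15 can be read via an h-register (e.g. movb %ah, ...), which
// is only encodable without a REX prefix.
unsigned char second_byte(unsigned x) {
  return (x >> 8) & 0xff;
}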
Differential Revision: http://reviews.llvm.org/D12336
Patch by João Porto
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@247058 91177308-0d34-0410-b5e6-96231b3b80d8
This prevents MC clients from getting COFF.h, which conflicts with
winnt.h macros. Also a minor IWYU cleanup. Now the only public headers
including COFF.h are in Object, and they actually need it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246784 91177308-0d34-0410-b5e6-96231b3b80d8
We used to accept (and even test, and generate) 16-byte alignment
for 32-byte nontemporal stores, but they require 32-byte alignment,
per the SDM. Found by inspection.
Instead of hardcoding 16 in the PatFrag, check for natural alignment.
Also fix the autoupgrade and the various tests.
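A minimal sketch of the new predicate (illustrative; the helper name is
made up, and this is C++ rather than the actual PatFrag definition):

#include "llvm/CodeGen/SelectionDAGNodes.h"

// Sketch: require natural alignment, i.e. the store's alignment must be
// at least the store size of its value type, instead of a hardcoded 16.
static bool isAlignedNonTemporalStore(const llvm::StoreSDNode *St) {
  llvm::EVT VT = St->getMemoryVT();
  return St->isNonTemporal() &&
         St->getAlignment() >= VT.getStoreSize();
}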
Also, use explicit -mattr instead of -mcpu: I stared at the output
for several minutes wondering why I got 2x movntps for the unaligned
case (which is the ideal output, but needs some work: see FIXME),
until I remembered corei7-avx implies +slow-unaligned-mem-32.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246733 91177308-0d34-0410-b5e6-96231b3b80d8
We can chain other fragments to avoid repeating conditions.
This also fixes a potential bug (that realistically can't happen)
where we would match indexed nontemporal stores for i32/i64.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246719 91177308-0d34-0410-b5e6-96231b3b80d8
X86FastISel has been using the wrong register class for VBLENDVPS, which
produces a VR128 and needs an extra copy to the target register. The
problem was already hit by the existing test cases when using
> llvm-lit -Dllc="llc -verify-machineinstrs"
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246461 91177308-0d34-0410-b5e6-96231b3b80d8
Make the arrays 'static const' instead of just 'static'. Post-commit review
comment from Roman Divacky on IRC. NFC.
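For illustration (made-up array, not the actual code), the change has
this shape:

// Before: static unsigned OpcTable[] = {...}; the table was mutable.
// After: adding const lets the table live in read-only memory.
static const unsigned OpcTable[] = {1, 2, 3};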
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@246376 91177308-0d34-0410-b5e6-96231b3b80d8