Found during fuzz testing: 32-bit x86 targets were legalizing a <2 x i1> compare result to <2 x i32> when <2 x i64> was expected.
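As a rough illustration (not the original fuzz input), a C++ reproducer
using clang's vector extensions would exercise this path:

typedef long long v2i64 __attribute__((vector_size(16)));

// Elementwise compare of two <2 x i64> vectors; in IR this is an icmp
// yielding <2 x i1>, which legalization must widen back to <2 x i64>
// (not <2 x i32>) to form the 0/-1 lane-mask result.
v2i64 cmp_eq(v2i64 a, v2i64 b) {
  return a == b;
}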
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264085 91177308-0d34-0410-b5e6-96231b3b80d8
We were just completely ignoring the types when determining whether we could
safely emit a libcall as a tail call. This is clearly wrong.
Theoretically, we could dig deeper looking for incidental matches (much like
the generic code in Analysis.cpp does), but it's probably not worth it for the
few libcalls that exist.
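A minimal sketch of the intended rule (simplified types and a
hypothetical helper name, not the actual SelectionDAG code):

enum class Ty { Void, I32, I64, F32, F64, Ptr };

// Only emit the libcall as a tail call when the return types actually
// line up; we deliberately skip digging for incidental matches the way
// the generic code in Analysis.cpp does for ordinary calls.
bool mayEmitLibcallAsTailCall(Ty callerRet, Ty libcallRet) {
  return callerRet == libcallRet;
}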
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264084 91177308-0d34-0410-b5e6-96231b3b80d8
When you have multiple LCSSA (single-operand) PHIs that are converted
into two-operand PHIs due to versioning, only assert that the PHI
currently being converted has a single operand. I.e. we don't want to
check PHIs that were converted earlier in the loop.
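A simplified sketch of the corrected loop (illustrative types, not the
LoopVersioning source):

#include <cassert>
#include <vector>

struct Phi { std::vector<int> operands; };

// Convert each LCSSA (single-operand) phi into a two-operand phi. The
// assertion must apply to the phi currently being converted, not to
// phis converted on earlier iterations, which already have two
// operands.
void addVersionedIncoming(std::vector<Phi> &exitPhis, int versionedVal) {
  for (Phi &p : exitPhis) {
    assert(p.operands.size() == 1 && "expected LCSSA phi");
    p.operands.push_back(versionedVal);
  }
}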
Fixes PR27023.
Thanks to Karl-Johan Karlsson for the minimized testcase!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264081 91177308-0d34-0410-b5e6-96231b3b80d8
Improve vector extension of vectors on hardware without dedicated VSEXT/VZEXT instructions.
We already convert these to SIGN_EXTEND_VECTOR_INREG/ZERO_EXTEND_VECTOR_INREG, but we can improve this further by using the legalizer instead of prematurely splitting into legal vectors in the combine, since that splitting only helps when lowering to VSEXT/VZEXT.
This removes a lot of unnecessary any_extend + mask patterns (fix for PR25718).
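For illustration, a source-level pattern of the kind affected (clang
vector extensions; the exact shapes in the PR may differ):

typedef signed char v8i8 __attribute__((vector_size(8)));
typedef int v8i32 __attribute__((vector_size(32)));

// Sign-extend each i8 lane to i32. On targets without dedicated
// VSEXT/VZEXT instructions this previously lowered through any_extend
// + mask; handing SIGN_EXTEND_VECTOR_INREG to the legalizer avoids
// that.
v8i32 sext_lanes(v8i8 v) {
  return __builtin_convertvector(v, v8i32);
}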
Reapplied with a fix for PR26953 (missing vector widening legalization).
Differential Revision: http://reviews.llvm.org/D17932
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264062 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Also renamed li_simm7 to li16_imm since it's not a simm7 and has an unusual
encoding (it's a uimm7 except that 0x7f represents -1).
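A small sketch of that decoding rule (hypothetical helper, not the
TableGen definition):

// li16 immediate field: 7 bits, read as unsigned, except that the
// all-ones pattern 0x7f encodes -1.
int decodeLi16Imm(unsigned field) {
  return (field & 0x7f) == 0x7f ? -1 : static_cast<int>(field & 0x7f);
}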
Reviewers: vkalintiris
Subscribers: dsanders, llvm-commits
Differential Revision: http://reviews.llvm.org/D18145
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264056 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
We can't check the error message for this one because there's another lw/sw
available that covers a larger range. We therefore check the transition
between the two sizes.
Reviewers: vkalintiris
Subscribers: llvm-commits, dsanders
Differential Revision: http://reviews.llvm.org/D18144
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264054 91177308-0d34-0410-b5e6-96231b3b80d8
This is a bug fix.
For rerolled loops, the ScalarEvolution trip count remained unchanged, which led to incorrect behavior in subsequent passes.
My patch simply resets the ScalarEvolution info for the rerolled loop, forcing ScalarEvolution to re-evaluate it the next time it is requested.
I also added a verifier call in the existing test to make sure no invalid ScalarEvolution data remains. Without my fix, this test would fail with -verify-scev.
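A sketch of the core of the fix (assuming LLVM's
ScalarEvolution::forgetLoop API; the surrounding pass code is elided):

#include "llvm/Analysis/LoopInfo.h"
#include "llvm/Analysis/ScalarEvolution.h"
using namespace llvm;

// After rerolling, drop all cached ScalarEvolution state for the loop
// so the (now different) trip count is recomputed on the next query.
static void invalidateRerolledLoop(ScalarEvolution &SE, Loop *L) {
  SE.forgetLoop(L);
}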
Differential Revision: http://reviews.llvm.org/D18316
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264051 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
After this change, deopt operand bundles can be lowered directly by
SelectionDAG into STATEPOINT instructions (which are then lowered to a
call or a sequence of nops, with an associated __llvm_stackmaps entry).
This obviates the need to round-trip deoptimization state through
gc.statepoint via RewriteStatepointsForGC.
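For reference, a sketch of how a frontend would attach deopt state that
this path now lowers (written against a recent LLVM C++ API; names here
are illustrative):

#include "llvm/IR/IRBuilder.h"
#include <vector>
using namespace llvm;

// Attach deoptimization state to a call via a "deopt" operand bundle.
// With this change, SelectionDAG lowers such calls to STATEPOINTs
// directly, without round-tripping through RewriteStatepointsForGC.
static CallInst *callWithDeoptState(IRBuilder<> &B, FunctionCallee Callee,
                                    ArrayRef<Value *> Args,
                                    ArrayRef<Value *> DeoptState) {
  OperandBundleDef Deopt("deopt", std::vector<Value *>(DeoptState.begin(),
                                                       DeoptState.end()));
  return B.CreateCall(Callee, Args, {Deopt});
}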
Reviewers: reames, atrick, majnemer, JosephTremoulet, pgavlin
Subscribers: sanjoy, mcrosier, majnemer, llvm-commits
Differential Revision: http://reviews.llvm.org/D18257
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264015 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Without tree pruning, clang has 2,667,552 coverage points.
With only dominators pruning: 1,515,586.
With both dominators & predominators pruning: 1,340,534.
Resubmit of r262103.
Differential Revision: http://reviews.llvm.org/D18341
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@264003 91177308-0d34-0410-b5e6-96231b3b80d8
If we have a BB with only MemoryDefs, live-in calculations will ignore
it. This means we get results like this:
define void @foo(i8* %p) {
; 1 = MemoryDef(liveOnEntry)
  store i8 0, i8* %p
  br i1 undef, label %if.then, label %if.end

if.then:
; 2 = MemoryDef(1)
  store i8 1, i8* %p
  br label %if.end

if.end:
; 3 = MemoryDef(1)
  store i8 2, i8* %p
  ret void
}
...When there should be a MemoryPhi in the `if.end` BB.
This patch fixes that behavior.
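After the fix, the `if.end` block gets a MemoryPhi merging the two
defs, so the annotations look roughly like (exact numbering may vary):

if.end:
; 3 = MemoryPhi({entry,1},{if.then,2})
; 4 = MemoryDef(3)
  store i8 2, i8* %p
  ret void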
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263991 91177308-0d34-0410-b5e6-96231b3b80d8
I added -march=hexagon to force using Hexagon target when testing
locally, and I forgot to take it out.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263990 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Whole quad mode is already enabled for pixel shaders that compute
derivatives, but it must be suspended for instructions that cause a
shader to have side effects (i.e. stores and atomics).
This pass addresses the issue by storing the real (initial) live mask
in a register, masking EXEC before instructions that require exact
execution and (re-)enabling WQM where required.
This pass is run before register coalescing so that we can use
machine SSA for analysis.
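As a rough functional model of that save/mask/restore dance
(hypothetical helper, not the pass itself):

#include <cstdint>

// Execute one side-effecting instruction with exact execution: save
// the WQM EXEC mask, switch EXEC to the real live mask, run the
// instruction, then restore WQM.
template <typename Inst>
void runExact(uint64_t &exec, uint64_t liveMask, Inst inst) {
  const uint64_t wqmMask = exec; // whole-quad-mode lanes
  exec = liveMask;               // exact: only truly live lanes
  inst();                        // e.g. a store or atomic
  exec = wqmMask;                // re-enable WQM
}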
The changes in this patch expose a problem with the second machine
scheduling pass: target independent instructions like COPY implicitly
use EXEC when they operate on VGPRs, but this fact is not encoded in
the MIR. This can lead to miscompilation because instructions are
moved past changes to EXEC.
This patch fixes the problem by adding use-implicit operands to
target independent instructions. Some general codegen passes are
relaxed to work with such implicit use operands.
Reviewers: arsenm, tstellarAMD, mareko
Subscribers: MatzeB, arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D18162
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263982 91177308-0d34-0410-b5e6-96231b3b80d8
- R10 and R11 are not reserved registers.
- Check for reserved registers when finding unused caller-saved registers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263977 91177308-0d34-0410-b5e6-96231b3b80d8
In executable and shared object ELF files, relocations contain the final virtual address rather than a section offset, so the value is adjusted to display the section offset.
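The adjustment amounts to subtracting the containing section's load
address; a minimal sketch:

#include <cstdint>

// In ET_EXEC/ET_DYN files, r_offset holds a virtual address; convert
// it to a section-relative offset for display.
uint64_t relocSectionOffset(uint64_t rOffset, uint64_t sectionVAddr) {
  return rOffset - sectionVAddr;
}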
Differential revision: http://reviews.llvm.org/D15965
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263971 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
When control flow is implemented using the exec mask, the compiler will
insert branch instructions to skip over the masked section when exec is
zero if the section contains more than a certain number of instructions.
The previous code counted only instructions in successor blocks; this
patch modifies the code to count instructions in all blocks between
the start and the end of the branch.
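Schematically (hypothetical structure, not the actual AMDGPU code):

#include <cstddef>
#include <vector>

// Decide whether the skip branch is worthwhile by counting
// instructions in every block between the start and the end of the
// masked section, not just in the successor blocks.
bool needsSkipBranch(const std::vector<std::vector<int>> &blocksInRange,
                     std::size_t threshold) {
  std::size_t count = 0;
  for (const auto &block : blocksInRange) {
    count += block.size();
    if (count > threshold)
      return true; // long enough to pay for the skip when exec == 0
  }
  return false;
}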
Reviewers: nhaehnle, arsenm
Subscribers: arsenm, llvm-commits
Differential Revision: http://reviews.llvm.org/D18282
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263969 91177308-0d34-0410-b5e6-96231b3b80d8
This introduces a custom lowering for ISD::SETCCE (introduced in r253572)
that allows us to emit a short code sequence for 64-bit compares.
Before:
push {r7, lr}
cmp r0, r2
mov.w r0, #0
mov.w r12, #0
it hs
movhs r0, #1
cmp r1, r3
it ge
movge.w r12, #1
it eq
moveq r12, r0
cmp.w r12, #0
bne .LBB1_2
@ BB#1: @ %bb1
bl f
pop {r7, pc}
.LBB1_2: @ %bb2
bl g
pop {r7, pc}
After:
push {r7, lr}
subs r0, r0, r2
sbcs.w r0, r1, r3
bge .LBB1_2
@ BB#1: @ %bb1
bl f
pop {r7, pc}
.LBB1_2: @ %bb2
bl g
pop {r7, pc}
Saves around 80KB in Chromium's libchrome.so.
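A plausible C++ source for the sequences above (the labels %bb1/%bb2
correspond to the calls to f and g; the actual test case may differ):

void f();
void g();

// 64-bit signed compare on 32-bit ARM: with the SETCCE lowering this
// becomes subs/sbcs plus one conditional branch instead of the cmp/it
// ladder shown in the "Before" listing.
void cmp64(long long a, long long b) {
  if (a < b)
    f(); // %bb1
  else
    g(); // %bb2
}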
Some notes on this patch:
- I don't much like the ARMISD::BRCOND and ARMISD::CMOV combines I
introduced (nothing else needs them). However, they are necessary in
order to avoid poor codegen, and they seem similar to existing combines
in other backends (e.g. X86 combines (brcond (cmp (setcc Compare))) to
(brcond Compare)).
- No support for Thumb-1. This is in principle possible, but we'd need
to implement ARMISD::SUBE for Thumb-1.
Differential Revision: http://reviews.llvm.org/D15256
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@263962 91177308-0d34-0410-b5e6-96231b3b80d8