This was left implicit and never ever checked, which means we could have a CMPZ against some non-zero value and we were carrying on with BFI conversion regardless.
Caught by Oliver Stannard using csmith; regression test added.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253195 91177308-0d34-0410-b5e6-96231b3b80d8
MCRelaxableFragment previously kept a copy of MCSubtargetInfo and
MCInst to enable re-encoding the MCInst later during relaxation. A copy
of MCSubtargetInfo (instead of a reference or pointer) was needed
because the feature bits could be modified by the parser.
This commit replaces the MCSubtargetInfo copy in MCRelaxableFragment
with a constant reference to MCSubtargetInfo. The copies of
MCSubtargetInfo are kept in MCContext, and the target parsers are now
responsible for asking MCContext to provide a copy whenever the feature
bits of MCSubtargetInfo have to be toggled.
With this patch, I saw a 4% reduction in peak memory usage when I
compiled verify-uselistorder.lto.bc using llc.
rdar://problem/21736951
Differential Revision: http://reviews.llvm.org/D14346
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253127 91177308-0d34-0410-b5e6-96231b3b80d8
ISD::BITREVERSE matches "rbit" completely, so remove ARMISD::RBIT and mark ISD::BITREVERSE as legal, adding a test for lowering.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@253047 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This patch changes ARMV5, ARMV5E, ARMV6SM, ARMV6HL, ARMV7, ARMV7L,
ARMV7HL, ARMV7EM to be treated as aliases for the corresponding
standard architectures, instead of as actual architectures.
Reviewers: rengolin
Subscribers: aemerson, llvm-commits, rengolin
Differential Revision: http://reviews.llvm.org/D14577
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252903 91177308-0d34-0410-b5e6-96231b3b80d8
I completely misunderstood what ARMISD::CMPZ means. It's not "compare equal to zero", it's "compare, only setting the zero/Z flag". It can either be equal-to-zero or not-equal-to-zero, and we weren't checking what sense it was.
If it's equal-to-zero, we can swap the operands around and pretend like it is not-equal-to-zero, which is both a bug fix and lets us handle more cases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252891 91177308-0d34-0410-b5e6-96231b3b80d8
I missed the side-effects of ParseBFI in my previous attempt (r252748).
Thanks dblaikie for the suggestion of adding a void use of the unused
variable instead.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252751 91177308-0d34-0410-b5e6-96231b3b80d8
If we have a chain of BFIs, we may be able to combine several together into one merged BFI. We can do this if the "from" bits from one BFI OR'd with the "from" bits from the other BFI form a contiguous range, and the same with the "to" bits.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252740 91177308-0d34-0410-b5e6-96231b3b80d8
ARM V6T2 has instructions for efficient count-leading/trailing-zeros, so this should be
considered a cheap operation (and therefore fair game for speculation) for any ARM V6T2
implementation.
The net result of allowing this speculation for the regression tests in this patch is
that we get this code:
ctlz:
clz r0, r0
bx lr
cttz:
rbit r0, r0
clz r0, r0
bx lr
Instead of:
ctlz:
cmp r0, #0
moveq r0, #32
clzne r0, r0
bx lr
cttz:
cmp r0, #0
moveq r0, #32
rbitne r0, r0
clzne r0, r0
bx lr
This will help solve a general speculation/despeculation problem noted in PR24818:
https://llvm.org/bugs/show_bug.cgi?id=24818
Differential Revision: http://reviews.llvm.org/D14469
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252639 91177308-0d34-0410-b5e6-96231b3b80d8
Added fixes for stage2 failures: CMOV is not commutable; commuting the operands results in the condition being flipped! d'oh!
Original commit message:
If we have a CMOV, OR and AND combination such as:
if (x & CN)
y |= CM;
And:
* CN is a single bit;
* All bits covered by CM are known zero in y;
Then we can convert this to a sequence of BFI instructions. This will always be a win if CM is a single bit, will always be no worse than the TST & OR sequence if CM is two bits, and for thumb will be no worse if CM is three bits (due to the extra IT instruction).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252606 91177308-0d34-0410-b5e6-96231b3b80d8
"GCC requires the freestanding environment provide memcpy, memmove, memset
and memcmp": https://gcc.gnu.org/onlinedocs/gcc-5.2.0/gcc/Standards.html
Hence in GNUEABI targets LLVM should not convert 'memops' to their equivalent
'__aeabi_memops'. This convertion violates GCC contract.
The -meabi flag controls whether or not LLVM will modify 'memops' in GNUEABI
targets.
Without -meabi: use the triple default EABI.
With -meabi=default: use the triple default EABI.
With -meabi=gnu: use 'memops'.
With -meabi=4 or -meabi=5: use '__aeabi_memops'.
With -meabi set to an unknown value: same as -meabi=default.
Patch by Vinicius Tinti.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252462 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
The CLR's personality routine passes these in rdx/edx, not rax/eax.
Make getExceptionPointerRegister a virtual method parameterized by
personality function to allow making this distinction.
Similarly make getExceptionSelectorRegister a virtual method parameterized
by personality function, for symmetry.
Reviewers: pgavlin, majnemer, rnk
Subscribers: jyknight, dsanders, llvm-commits
Differential Revision: http://reviews.llvm.org/D14344
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252383 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
In this implementation, LiveIntervalAnalysis invents a few register
masks on basic block boundaries that preserve no registers. The nice
thing about this is that it prevents the prologue inserter from thinking
it needs to spill all XMM CSRs, because it doesn't see any explicit
physreg defs in the MI.
Reviewers: MatzeB, qcolombet, JosephTremoulet, majnemer
Subscribers: MatzeB, llvm-commits
Differential Revision: http://reviews.llvm.org/D14407
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252318 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This review is related to another review request http://reviews.llvm.org/D11268, does the same and merely fixes a couple of issues with it.
D11268 is quite old and has merge conflicts against the current trunk.
This request
- rebases D11268 onto the new trunk;
- resolves the merge conflicts;
- fixes the prologue_end tests, which do not pass due to the subprogram definitions not marked as distinct.
Reviewers: echristo, rengolin, kubabrecka
Subscribers: aemerson, rengolin, jyknight, dsanders, llvm-commits, asl
Differential Revision: http://reviews.llvm.org/D14338
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252177 91177308-0d34-0410-b5e6-96231b3b80d8
We can conservatively know that CMOV's known bits are the intersection of known bits for each of its operands. This helps PerformCMOVToBFICombine find more opportunities.
I tried hard to create a testcase for this and failed - we have to sufficiently confuse DAG.computeKnownBits which can see through all the cheap tricks I tried to narrow my larger testcase down :(
This code is actually exercised in CodeGen/ARM/bfi.ll, there's just no functional difference because DAG.computeKnownBits gets the right answer in that case.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252168 91177308-0d34-0410-b5e6-96231b3b80d8
This brings back the behavior from before r252090 for out of range symbols.
Should bring some arm bots back.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252119 91177308-0d34-0410-b5e6-96231b3b80d8
The generic infrastructure already did a lot of work to decide if the
fixup value is know or not. It doesn't make sense to reimplement a very
basic case: same fragment.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252090 91177308-0d34-0410-b5e6-96231b3b80d8
If we have a CMOV, OR and AND combination such as:
if (x & CN)
y |= CM;
And:
* CN is a single bit;
* All bits covered by CM are known zero in y;
Then we can convert this to a sequence of BFI instructions. This will always be a win if CM is a single bit, will always be no worse than the TST & OR sequence if CM is two bits, and for thumb will be no worse if CM is three bits (due to the extra IT instruction).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@252057 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
ARMv6KZ cores were set up incorrectly in ARM.td; also, the SMI mnemonic
(the old name for SMC, as defined in ARMv6KZ) wasn't supported.
Reviewers: jmolloy, rengolin
Subscribers: aemerson, rengolin, llvm-commits
Differential Revision: http://reviews.llvm.org/D14154
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251627 91177308-0d34-0410-b5e6-96231b3b80d8
The most substantial changes are again for watchOS: libcalls are hard-float if
needed and sincos has a different calling convention.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251571 91177308-0d34-0410-b5e6-96231b3b80d8
At the LLVM level this ABI is essentially a minimal modification of AAPCS to
support 16-byte alignment for vector types and the stack.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251570 91177308-0d34-0410-b5e6-96231b3b80d8
These MachO file directives are used by linkers and other tools to provide
compatibility information, much like the existing .ios_version_min and
.macosx_version_min.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251569 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This patch handles assembly and disassembly, but not codegen, as of yet.
Additionally, it fixes a bug whereby SP and PC as shifted-reg operands
were treated as predictable in ARMv7 Thumb; and it enables the tests
for invalid and unpredictable instructions to run on both ARMv7 and ARMv8.
Reviewers: jmolloy, rengolin
Subscribers: aemerson, rengolin, llvm-commits
Differential Revision: http://reviews.llvm.org/D14141
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251516 91177308-0d34-0410-b5e6-96231b3b80d8
This also lets us remove the versions of the functions that took a statically sized array as we can rely on ArrayRef implicit conversion now.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251490 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: After D13851 landed, we saw backend crashes when compiling the reduced test case included in this patch. The right fix seems to be to allow these vector types for expansion in instruction selection.
Reviewers: rengolin, t.p.northover
Subscribers: RKSimon, t.p.northover, aemerson, llvm-commits, rengolin
Differential Revision: http://reviews.llvm.org/D14082
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251401 91177308-0d34-0410-b5e6-96231b3b80d8
This avoid mentioning the table name an extra time and allows the lookup to be done directly in the ifs by relying on the bool conversion of the pointer.
While there make use of ArrayRef and std::find_if.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251382 91177308-0d34-0410-b5e6-96231b3b80d8
Both VLDRS and VLDRD fault if the memory is not 4 byte aligned, which wasn't
really being checked before, leading to faults at runtime.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251352 91177308-0d34-0410-b5e6-96231b3b80d8
In PIC mode we were previously computing global variable addresses (or GOT
entry addresses) by adding the PC, the PC-relative GOT displacement and
the GOT-relative symbol/GOT entry displacement. Because the latter two
displacements are fixed, we ended up performing one more addition than
necessary.
This change causes us to compute addresses using a single PC-relative
displacement, resulting in a shorter code sequence. This reduces code size
by about 4% in a recent build of Chromium for Android.
As a result of this change we no longer need to compute the GOT base address
in the ARM backend, which allows us to remove the Global Base Reg pass and
SDAG lowering for the GOT.
We also now no longer use the GOT when addressing a symbol which is known
to be defined in the same linkage unit. Specifically, the symbol must have
either hidden visibility or a strong definition in the current module in
order to not use the the GOT.
This is a change from the previous behaviour where we would use the GOT to
address externally visible symbols defined in the same module. I think the
only cases where this could matter are cases involving symbol interposition,
but we don't really support that well anyway.
Differential Revision: http://reviews.llvm.org/D13650
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251322 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
When ARMFrameLowering::emitPopInst generates a "pop" instruction to restore the callee saved registers, it checks if the LR register is among them. If so, the function may decide to remove the basic block's terminator and replace it with a "pop" to the PC register instead of LR.
This leads to a problem when the block's terminator is preceded by a "llvm.debugtrap" call. The MI iterator points to the trap in such a case, which is also a terminator. If the function decides to restore LR to PC, it erroneously removes the trap.
Reviewers: asl, rengolin
Subscribers: aemerson, jfb, rengolin, dschuff, llvm-commits
Differential Revision: http://reviews.llvm.org/D13672
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@251123 91177308-0d34-0410-b5e6-96231b3b80d8
These were the cause of a verifier error when building 7zip with
-verify-machineinstrs. Running 'make check' with the verifier
triggered the same error on the test here so i've updated the test
to run the verifier on one of its runs instead of adding a new one.
While looking at this code, there was a stale comment that these
instructions were only used for disassembly. This probably used to
be the case, but they are now used in the 'ARM load / store optimization pass' too.
This reapplies r242300 which was reverted in r242428 due to bot failures.
Ultimately those failures were spurious and completely unrelated to this commit. I reverted this
at the time because it was thought to be at fault.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@250969 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
TargetLoweringBase::Expand is defined as "Try to expand this to other ops,
otherwise use a libcall." For ISD::UDIV and ISD::SDIV, the choice between
the two possibilities was defined in a rather convoluted way:
- if DIVREM is legal, expand to DIVREM
- if DIVREM has a custom lowering, expand to DIVREM
- if DIVREM libcall is defined and a remainder from the same division is
computed elsewhere, expand to a DIVREM libcall
- else, expand to a DIV libcall
This had the undesirable effect that if both DIV and DIVREM are implemented
as libcalls, then ISD::UDIV and ISD::SDIV are expanded to the heavier DIVREM
libcall, even when the remainder isn't used.
The new code adds a new LegalizeAction, TargetLoweringBase::LibCall, so that
backends can directly control whether they prefer an expansion or a conversion
to a libcall. This makes the generic lowering code even more generic,
allowing its reuse in a wider range of target-specific configurations.
The useful effect is that ARM backend will now generate a call
to __aeabi_{i,u}div rather than __aeabi_{i,u}divmod in cases where
it doesn't need the remainder. There's no functional change outside
the ARM backend.
Reviewers: t.p.northover, rengolin
Subscribers: t.p.northover, llvm-commits, aemerson
Differential Revision: http://reviews.llvm.org/D13862
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@250826 91177308-0d34-0410-b5e6-96231b3b80d8
The mapping of these two intrinsics in ARMInstrInfo.td had a small
omission which lead to their operands not being validated/transformed
before being lowered into usat and ssat instructions. This can cause
incorrect instructions to be emitted.
I've also added tests for the remaining two saturating arithmatic
intrinsics @llvm.arm.qadd and @llvm.arm.qsub as they are missing
codegen tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@250697 91177308-0d34-0410-b5e6-96231b3b80d8