Currently, we only codegen the VRINT[APMXZR] and VCVT[BT] instructions
when targeting ARMv8, but they are actually present on any target with
FP-ARMv8. Note that FP-ARMv8 is called FPv5 when is is part of an
M-profile core, but they have the same instructions so we model them
both as FPARMv8 in the ARM backend.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218763 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
The goal is to eventually remove all the code related to getInsertFencesForAtomic
in SelectionDAGBuilder as it is wrong (designed for ARM, not really portable, works
mostly by accident because the backends are overly conservative), and repeats the
same logic that goes in emitLeading/TrailingFence.
In this patch, I make AtomicExpandPass insert the fences as it knows better
where to put them. Because this requires getting the fences and not just
passing an IRBuilder around, I had to change the return type of
emitLeading/TrailingFence.
This code only triggers on ARM for now. Because it is earlier in the pipeline
than SelectionDAGBuilder, it triggers and lowers atomic accesses to atomic so
SelectionDAGBuilder does not add barriers anymore on ARM.
If this patch is accepted I plan to implement emitLeading/TrailingFence for all
backends that setInsertFencesForAtomic(true), which will allow both making them
less conservative and simplifying SelectionDAGBuilder once they are all using
this interface.
This should not cause any functionnal change so the existing tests are used
and not modified.
Test Plan: make check-all, benefits from existing tests of atomics on ARM
Reviewers: jfb, t.p.northover
Subscribers: aemerson, llvm-commits
Differential Revision: http://reviews.llvm.org/D5179
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218329 91177308-0d34-0410-b5e6-96231b3b80d8
The fix is slightly different then x86 (see r216117) because the number of values
attached to a return can vary even for a single returned value (e.g., f64 yields
two returned values).
<rdar://problem/18352998>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218076 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This patch was originally in D5304 (I could not find a way to reopen that revision).
It was accepted, commited and broke the build bots because the overloading of
the constructor of ArrayRef for braced initializer lists is not supported by all
toolchains. I then reverted it, and propose this fixed version that uses a plain
C array instead in makeDMB (that array is then converted implicitly to an
ArrayRef, but that is not behind an ifdef). Could someone confirm me whether
initialization lists for plain C arrays are supported by every toolchain used
to build llvm ? Otherwise I can just initialize the array in the old way:
args[0] = ...; .. ; args[5] = ...;
Below is the description of the original patch:
```
I had only tested this code for ARMv7 and ARMv8. This patch adds several
fallback paths if the processor does not support dmb ish:
- dmb sy if a cortex-M with support for dmb
- mcr p15, #0, r0, c7, c10, #5 for ARMv6 (special instruction equivalent to a DMB)
These fallback paths were chosen based on the code for fence seq_cst.
Thanks to luqmana for having noticed this bug.
```
Test Plan: Added more cases to atomic-load-store.ll + make check-all
Reviewers: jfb, t.p.northover, luqmana
Subscribers: llvm-commits, aemerson
Differential Revision: http://reviews.llvm.org/D5386
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@218066 91177308-0d34-0410-b5e6-96231b3b80d8
It is breaking the build on the buildbots but works fine on my machine, I revert
while trying to understand what happens (it appears to depend on the compiler used
to build, I probably used a C++11 feature that is not perfectly supported by some
of the buildbots).
This reverts commit feb3176c4d006f99af8b40373abd56215a90e7cc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217973 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
I had only tested this code for ARMv7 and ARMv8. This patch adds several
fallback paths if the processor does not support dmb ish:
- dmb sy if a cortex-M with support for dmb
- mcr p15, #0, r0, c7, c10, #5 for ARMv6 (special instruction equivalent to a DMB)
These fallback paths were chosen based on the code for fence seq_cst.
Thanks to luqmana for having noticed this bug.
Test Plan: Added more cases to atomic-load-store.ll + make check-all
Reviewers: jfb, t.p.northover, luqmana
Subscribers: aemerson, llvm-commits
Differential Revision: http://reviews.llvm.org/D5304
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217965 91177308-0d34-0410-b5e6-96231b3b80d8
This required a new hook called hasLoadLinkedStoreConditional to know whether
to expand atomics to LL/SC (ARM, AArch64, in a future patch Power) or to
CmpXchg (X86).
Apart from that, the new code in AtomicExpandPass is mostly moved from
X86AtomicExpandPass. The main result of this patch is to get rid of that
pass, which had lots of code duplicated with AtomicExpandPass.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217928 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Split shouldExpandAtomicInIR() into different versions for Stores/Loads/RMWs/CmpXchgs.
Makes runOnFunction cleaner (no more redundant checking/casting), and will help moving
the X86 backend to this pass.
This requires a way of easily detecting which instructions are atomic.
I followed the pattern of mayReadFromMemory, mayWriteOrReadMemory, etc.. in making
isAtomic() a method of Instruction implemented by a switch on the opcodes.
Test Plan: make check
Reviewers: jfb
Subscribers: mcrosier, llvm-commits
Differential Revision: http://reviews.llvm.org/D5035
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217080 91177308-0d34-0410-b5e6-96231b3b80d8
Fixes two latent bugs:
- There was no fence inserted before expanded seq_cst load (unsound on Power)
- There was only a fence release before seq_cst stores (again unsound, in particular on Power)
It is not even clear if this is correct on ARM swift processors (where release fences are
DMB ishst instead of DMB ish). This behaviour is currently preserved on ARM Swift
as it is not clear whether it is incorrect. I would love to get documentation stating
whether it is correct or not.
These two bugs were not triggered because Power is not (yet) using this pass, and these
behaviours happen to be (mostly?) working on ARM
(although they completely butchered the semantics of the llvm IR).
See:
http://lists.cs.uiuc.edu/pipermail/llvmdev/2014-August/075821.html
for an example of the problems that can be caused by the second of these bugs.
I couldn't see a way of fixing these in a completely target-independent way without
adding lots of unnecessary fences on ARM, hence the target-dependent parts of this
patch.
This patch implements the new target-dependent parts only for ARM (the default
of not doing anything is enough for AArch64), other architectures will use this
infrastructure in later patches.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@217076 91177308-0d34-0410-b5e6-96231b3b80d8
Approved by Jim Grosbach, Lang Hames, Rafael Espindola.
This reinstates commits r215111, 215115, 215116, 215117, 215136.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216982 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts commit r215862 due to nightly failures. Will work on getting a
reduced test case, but I wanted to get our bots green in the meantime.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216325 91177308-0d34-0410-b5e6-96231b3b80d8
There's no need to do this if the user doesn't call va_start. In the
future, we're going to have thunks that forward these register
parameters with musttail calls, and they won't need these spills for
handling va_start.
Most of the test suite changes are adding va_start calls to existing
tests to keep things working.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216294 91177308-0d34-0410-b5e6-96231b3b80d8
The FPv4-SP floating-point unit is generally referred to as
single-precision only, but it does have double-precision registers and
load, store and GPR<->DPR move instructions which operate on them.
This patch enables the use of these registers, the main advantage of
which is that we now comply with the AAPCS-VFP calling convention.
This partially reverts r209650, which added some AAPCS-VFP support,
but did not handle return values or alignment of double arguments in
registers.
This patch also adds tests for Thumb2 code generation for
floating-point instructions and intrinsics, which previously only
existed for ARM.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216172 91177308-0d34-0410-b5e6-96231b3b80d8
LLVM generates illegal `rbit r0, #352` instruction for rbit intrinsic.
According to ARM ARM, rbit only takes register as argument, not immediate.
The correct instruction should be rbit <Rd>, <Rm>.
The bug was originally introduced in r211057.
Differential Revision: http://reviews.llvm.org/D4980
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@216064 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Make use of isAtLeastRelease/Acquire in the ARM/AArch64 backends
These helper functions are introduced in D4844.
Depends D4844
Test Plan: make check-all passes
Reviewers: jfb
Subscribers: aemerson, llvm-commits, mcrosier, reames
Differential Revision: http://reviews.llvm.org/D4937
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215902 91177308-0d34-0410-b5e6-96231b3b80d8
Externally-defined functions with weak linkage should not be
tail-called on ARM or AArch64, as the AAELF spec requires normal calls
to undefined weak functions to be replaced with a NOP or jump to the
next instruction. The behaviour of branch instructions in this
situation (as used for tail calls) is implementation-defined, so we
cannot rely on the linker replacing the tail call with a return.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215890 91177308-0d34-0410-b5e6-96231b3b80d8
Simply indicate the functions that are part of the runtime library that we do
not setup libcalls for. This is merely for ease of identification. NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215863 91177308-0d34-0410-b5e6-96231b3b80d8
The set of functions defined in the RTABI was separated for no real reason.
This brings us closer to proper utilisation of the functions defined by the
RTABI. It also sets the ground for correctly emitting function calls to AEABI
functions on all AEABI conforming platforms.
The previously existing lie on the behaviour of __ldivmod and __uldivmod is
propagated as it is beyond the scope of the change.
The changes to the test are due to the fact that we now use the divmod functions
which return both the quotient and remainder and thus we no longer need to
invoke two functions on Linux (making it closer to EABI's behaviour).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215862 91177308-0d34-0410-b5e6-96231b3b80d8
Type::dump() doesn't print a newline, which makes for a poor
experience in a debugger. This looks like it was an ommission
considering Value::dump() two lines above, so I've changed Type to add
a newline as well.
Of the two in-tree callers, one added a newline anyway, and I've
updated the other one to use Type::print instead.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215421 91177308-0d34-0410-b5e6-96231b3b80d8
By default, LLVM uses the "C" calling convention for all runtime
library functions. The half-precision FP conversion functions use the
soft-float calling convention, and are needed for some targets which
use the hard-float convention by default, so must have their calling
convention explicitly set.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215348 91177308-0d34-0410-b5e6-96231b3b80d8
be deleted. This will be reapplied as soon as possible and before
the 3.6 branch date at any rate.
Approved by Jim Grosbach, Lang Hames, Rafael Espindola.
This reverts commits r215111, 215115, 215116, 215117, 215136.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215154 91177308-0d34-0410-b5e6-96231b3b80d8
I am sure we will be finding bits and pieces of dead code for years to
come, but this is a good start.
Thanks to Lang Hames for making MCJIT a good replacement!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@215111 91177308-0d34-0410-b5e6-96231b3b80d8
to get the subtarget and that's accessible from the MachineFunction
now. This helps clear the way for smaller changes where we getting
a subtarget will require passing in a MachineFunction/Function as
well.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214988 91177308-0d34-0410-b5e6-96231b3b80d8
Particularly on MachO, we were generating "blx _dest" instructions on M-class
CPUs, which don't actually exist. They happen to get fixed up by the linker
into valid "bl _dest" instructions (which is why such a massive issue has
remained largely undetected), but we shouldn't rely on that.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214959 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts r214893, re-applying r214881 with the test case relaxed a bit to
satiate the build bots.
POP on armv4t cannot be used to change thumb state (unilke later non-m-class
architectures), therefore we need a different return sequence that uses 'bx'
instead:
POP {r3}
ADD sp, #offset
BX r3
This patch also fixes an issue where the return value in r3 would get clobbered
for functions that return 128 bits of data. In that case, we generate this
sequence instead:
MOV ip, r3
POP {r3}
ADD sp, #offset
MOV lr, r3
MOV r3, ip
BX lr
http://reviews.llvm.org/D4748
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214928 91177308-0d34-0410-b5e6-96231b3b80d8
POP on armv4t cannot be used to change thumb state (unilke later non-m-class
architectures), therefore we need a different return sequence that uses 'bx'
instead:
POP {r3}
ADD sp, #offset
BX r3
This patch also fixes an issue where the return value in r3 would get clobbered
for functions that return 128 bits of data. In that case, we generate this
sequence instead:
MOV ip, r3
POP {r3}
ADD sp, #offset
MOV lr, r3
MOV r3, ip
BX lr
http://reviews.llvm.org/D4748
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214881 91177308-0d34-0410-b5e6-96231b3b80d8
Currently when DAGCombine converts loads feeding a switch into a switch of
addresses feeding a load the new load inherits the isInvariant flag of the left
side. This is incorrect since invariant loads can be reordered in cases where it
is illegal to reoarder normal loads.
This patch adds an isInvariant parameter to getExtLoad() and updates all call
sites to pass in the data if they have it or false if they don't. It also
changes the DAGCombine to use that data to make the right decision when
creating the new load.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214449 91177308-0d34-0410-b5e6-96231b3b80d8
Rename to allowsMisalignedMemoryAccess.
On R600, 8 and 16 byte accesses are mostly OK with 4-byte alignment,
and don't need to be split into multiple accesses. Vector loads with
an alignment of the element type are not uncommon in OpenCL code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@214055 91177308-0d34-0410-b5e6-96231b3b80d8
address of the stack guard was being spilled to the stack.
Previously the address of the stack guard would get spilled to the stack if it
was impossible to keep it in a register. This patch introduces a new target
independent node and pseudo instruction which gets expanded post-RA to a
sequence of instructions that load the stack guard value. Register allocator
can now just remat the value when it can't keep it in a register.
<rdar://problem/12475629>
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213967 91177308-0d34-0410-b5e6-96231b3b80d8
In order to enable the preservation of noalias function parameter information
after inlining, and the representation of block-level __restrict__ pointer
information (etc.), additional kinds of aliasing metadata will be introduced.
This metadata needs to be carried around in AliasAnalysis::Location objects
(and MMOs at the SDAG level), and so we need to generalize the current scheme
(which is hard-coded to just one TBAA MDNode*).
This commit introduces only the necessary refactoring to allow for the
introduction of other aliasing metadata types, but does not actually introduce
any (that will come in a follow-up commit). What it does introduce is a new
AAMDNodes structure to hold all of the aliasing metadata nodes associated with
a particular memory-accessing instruction, and uses that structure instead of
the raw MDNode* in AliasAnalysis::Location, etc.
No functionality change intended.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213859 91177308-0d34-0410-b5e6-96231b3b80d8
When performing a dynamic stack adjustment without optimisations, we would mark
SP as def and R4 as kill. This occurred as part of the expansion of a
WIN__CHKSTK SDNode which indicated the proper handling of SP and R4. The result
would be that we would double define SP as part of an operation, which is
obviously incorrect.
Furthermore, the VTList for the chain had an incorrect parameter type of i32
instead of Other.
Correct these to permit proper lowering of __builtin_alloca at -O0.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@213442 91177308-0d34-0410-b5e6-96231b3b80d8