Because we've canonicalised on using LD1/ST1, every time we do a bitcast
between vector types we must do an equivalent lane reversal.
Consider a simple memory load followed by a bitconvert then a store.
v0 = load v2i32
v1 = BITCAST v2i32 v0 to v4i16
store v4i16 v2
In big endian mode every memory access has an implicit byte swap. LDR and
STR do a 64-bit byte swap, whereas LD1/ST1 do a byte swap per lane - that
is, they treat the vector as a sequence of elements to be byte-swapped.
The two pairs of instructions are fundamentally incompatible. We've decided
to use LD1/ST1 only to simplify compiler implementation.
LD1/ST1 perform the equivalent of a sequence of LDR/STR + REV. This makes
the original code sequence: v0 = load v2i32
v1 = REV v2i32 (implicit)
v2 = BITCAST v2i32 v1 to v4i16
v3 = REV v4i16 v2 (implicit)
store v4i16 v3
But this is now broken - the value stored is different to the value loaded
due to lane reordering. To fix this, on every BITCAST we must perform two
other REVs:
v0 = load v2i32
v1 = REV v2i32 (implicit)
v2 = REV v2i32
v3 = BITCAST v2i32 v2 to v4i16
v4 = REV v4i16
v5 = REV v4i16 v4 (implicit)
store v4i16 v5
This means an extra two instructions, but actually in most cases the two REV
instructions can be combined into one. For example:
(REV64_2s (REV64_4h X)) === (REV32_4h X)
There is also no 128-bit REV instruction. This must be synthesized with an
EXT instruction.
Most bitconverts require some sort of conversion. The only exceptions are:
a) Identity conversions - vNfX <-> vNiX
b) Single-lane-to-scalar - v1fX <-> fX or v1iX <-> iX
Even though there are hundreds of changed lines, I have a fairly high confidence
that they are somewhat correct. The changes to add two REV instructions per
bitcast were pretty mechanical, and once I'd done that I threw the resulting
.td at a script I wrote which combined the two REVs together (and added
an EXT instruction, for f128) based on an instruction description I gave it.
This was much less prone to error than doing it all manually, plus my brain
would not just have melted but would have vapourised.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208194 91177308-0d34-0410-b5e6-96231b3b80d8
This completes the port of r204814 (cpirker "AArch64_BE function argument
passing for ARM ABI") from AArch64 to ARM64, and fixes a bunch of issues
found during later development along the way. The biggest of these was
that the alignment fixup logic wasn't replicated into all the places it
should have been.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208192 91177308-0d34-0410-b5e6-96231b3b80d8
This is a third patch of patch series that improves MergeFunctions
performance time from O(N*N) to O(N*log(N)).
This patch description:
Being comparing functions we need to compare values we meet at left and
right sides.
Its easy to sort things out for external values. It just should be
the same value at left and right.
But for local values (those were introduced inside function body)
we have to ensure they were introduced at exactly the same place,
and plays the same role.
In short, patch introduces values serial numbering and comparison routine.
The last one compares two values by their serial numbers.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208189 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
The overall idea is to chop the Predicates list into subsets that are
usually overridden independently. This allows subclasses to partially
override the predicates of their superclasses without having to re-add all
the existing predicates.
This patch starts the process by moving HasStdEnc into a new
EncodingPredicates list and almost everything else into
AdditionalPredicates.
It has revealed a couple likely bugs where 'let Predicates' has removed
the HasStdEnc predicate.
No functional change (confirmed by diffing tablegen-erated files).
Depends on D3549, D3506
Reviewers: vmedic
Differential Revision: http://reviews.llvm.org/D3550
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208184 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
It concatenates two or more lists. In addition to the !strconcat semantics
the lists must have the same element type.
My overall aim is to make it easy to append to Instruction.Predicates
rather than override it. This can be done by concatenating lists passed as
arguments, or by concatenating lists passed in additional fields.
Reviewers: dsanders
Reviewed By: dsanders
Subscribers: hfinkel, llvm-commits
Differential Revision: http://reviews.llvm.org/D3506
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208183 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This will make it easier to prove that a more complicated change in the
following commit is non-functional.
No functional change.
Depends on D3506
Reviewers: vmedic
Reviewed By: vmedic
Differential Revision: http://reviews.llvm.org/D3549
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208179 91177308-0d34-0410-b5e6-96231b3b80d8
O(N*log(N)). The idea is to introduce total ordering among functions set.
It allows to build binary tree and perform function look-up procedure in O(log(N)) time.
This patch description:
Introduced total ordering among constants implemented in cmpConstants method.
Method performs lexicographical comparison between constants represented as
hypothetical numbers of next format:
<bitcastability-trait><raw-bit-contents>
Please, read cmpConstants declaration comments for more details.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208173 91177308-0d34-0410-b5e6-96231b3b80d8
This field is used for a list of variables to ensure they are not lost
during optimization (they're only included when optimizations are
enabled).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208159 91177308-0d34-0410-b5e6-96231b3b80d8
interface methods isCOFF().
The '-coff' command line option has been removed. It was not used in any
test cases.
The patch reviewed by Michael Spencer.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208157 91177308-0d34-0410-b5e6-96231b3b80d8
Mark up additional instructions which are part of the function prologue as
MachineFrameSetup. These instructions are part of the function prologue,
emitted by the PEI pass to setup the stack for use in the activating frame.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208153 91177308-0d34-0410-b5e6-96231b3b80d8
The ARM::BLX instruction is an ARM mode instruction. The Windows on ARM target
is limited to Thumb instructions. Correctly use the thumb mode tBLXr
instruction. This would manifest as an errant write into the object file as the
instruction is 4-bytes in length rather than 2. The result would be a corrupted
object file that would eventually result in an executable that would crash at
runtime.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208152 91177308-0d34-0410-b5e6-96231b3b80d8
If the source files referenced by a gcno file are missing, gcov
outputs a coverage file where every line is simply /*EOF*/. This also
occurs for lines in the coverage that are past the end of a file that
is found.
This change mimics gcov.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208149 91177308-0d34-0410-b5e6-96231b3b80d8
In gcov, there's a -n/--no-output option, which disables the writing
of any .gcov files, so that it emits only the summary info on stdout.
This implements the same behaviour in llvm-cov.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208148 91177308-0d34-0410-b5e6-96231b3b80d8
This is similar to the getAlignment patch, but is done just for
completeness. It looks like we never call getSection on an alias. All the
tests still pass if the if is replaced with an assert.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208139 91177308-0d34-0410-b5e6-96231b3b80d8
remove it from the list of unspilled registers. Otherwise the following
attempt to keep the stack aligned by picking an extra GPR register to
spill will not work as it picks up r11.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208129 91177308-0d34-0410-b5e6-96231b3b80d8
This removes arguments passed everywhere and allows the use of
standard iteration over lists.
Should be no functional change.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208127 91177308-0d34-0410-b5e6-96231b3b80d8
Split from the musttail inliner change. This will be covered by an opt
test when the inliner change lands.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208126 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
When I initially introduced -pass-remarks, I thought it would be a
neat idea to make it additive. So, if one used it as:
$ llc -pass-remarks=inliner --pass-remarks=loop.*
the compiler would build the regular expression '(inliner)|(loop.*)'.
The more I think about it, the more I regret it. This is not how
other flags work. The standard semantics are right-to-left overrides.
This is how clang interprets -Rpass. And I think the two should be
compatible in this respect.
Reviewers: qcolombet
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D3614
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208122 91177308-0d34-0410-b5e6-96231b3b80d8
Before this patch, the backend always emitted a store+load sequence to
bitconvert from f64 to i64 the input operand of a ISD::BITCAST dag node that
performed a bitconvert from type MVT::f64 to type MVT::v2i32. The resulting
i64 node was then used to build a v2i32 vector.
With this patch, the backend now produces a cheaper SCALAR_TO_VECTOR from
MVT::f64 to MVT::v2f64. That SCALAR_TO_VECTOR is then followed by a "free"
bitcast to type MVT::v4i32. The elements of the resulting
v4i32 are then extracted to build a v2i32 vector (which is illegal and
therefore promoted to MVT::v2i64).
This is in general cheaper than emitting a stack store+load sequence
to bitconvert the operand from type f64 to type i64.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208107 91177308-0d34-0410-b5e6-96231b3b80d8
This patch implements the infrastructure to use named register constructs in
programs that need access to specific registers (bare metal, kernels, etc).
So far, only the stack pointer is supported as a technology preview, but as it
is, the intrinsic can already support all non-allocatable registers from any
architecture.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208104 91177308-0d34-0410-b5e6-96231b3b80d8
An alias has the address of what it points to, so it also has the same
alignment.
This allows a few optimizations to see past aliases for free.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208103 91177308-0d34-0410-b5e6-96231b3b80d8
fall back to the normal path without a cpu. While doing this fix
llc to just exit when we don't have a module to process instead of
asserting.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208102 91177308-0d34-0410-b5e6-96231b3b80d8
The fact that GlobalAlias::setAlignment exists at all is a side effect of
how the classes are organized, it should never be used.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208094 91177308-0d34-0410-b5e6-96231b3b80d8
Obviously we can't expect the two backends to produce identical diagnostics,
since what's possible depends quite a bit on how the .td files are structured.
I think the ARM64 diagnostics are basically of the same quality in all the
changed cases, so I've split the CHECK lines.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@208084 91177308-0d34-0410-b5e6-96231b3b80d8