This patch uses the DiagnosticPredicate for SVE predicate patterns
to improve their diagnostics, now giving a 'invalid operand' diagnostic
if the type is not an immediate or one of the expected pattern
labels.
Reviewers: samparker, SjoerdMeijer, javed.absar, fhahn
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D48220
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@334983 91177308-0d34-0410-b5e6-96231b3b80d8
The variants added by this patch are:
- SQINC signed increment, e.g. sqinc x0, w0, all, mul #4
- SQDEC signed decrement, e.g. sqdec x0, w0, all, mul #4
- UQINC unsigned increment, e.g. uqinc w0, all, mul #4
- UQDEC unsigned decrement, e.g. uqdec w0, all, mul #4
This patch includes asmparser changes to parse a GPR64 as a GPR32 in
order to satisfy the constraint check:
x0 == GPR64(w0)
in:
sqinc x0, w0, all, mul #4
^___^ (must match)
Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D47716
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@334980 91177308-0d34-0410-b5e6-96231b3b80d8
This patch adds instructions for comparing elements from two vectors, e.g.
cmpgt p0.s, p0/z, z0.s, z1.s
and also adds support for comparing to a 64-bit wide element vector, e.g.
cmpgt p0.s, p0/z, z0.s, z1.d
The patch also contains aliases for certain comparisons, e.g.:
cmple p0.s, p0/z, z0.s, z1.s => cmpge p0.s, p0/z, z1.s, z0.s
cmplo p0.s, p0/z, z0.s, z1.s => cmphi p0.s, p0/z, z1.s, z0.s
cmpls p0.s, p0/z, z0.s, z1.s => cmphs p0.s, p0/z, z1.s, z0.s
cmplt p0.s, p0/z, z0.s, z1.s => cmpgt p0.s, p0/z, z1.s, z0.s
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@334931 91177308-0d34-0410-b5e6-96231b3b80d8
Support for SVE's predicated select instructions to select elements
from either vector, both in a data-vector and a predicate-vector
variant.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@334905 91177308-0d34-0410-b5e6-96231b3b80d8
Predicated splat/copy of SIMD/FP register or general purpose
register to SVE vector, along with MOV-aliases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@334842 91177308-0d34-0410-b5e6-96231b3b80d8
Increment/decrement scalar register by (scaled) element count given by
predicate pattern, e.g. 'incw x0, all, mul #4'.
Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar
Reviewed By: SjoerdMeijer
Differential Revision: https://reviews.llvm.org/D47713
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@334838 91177308-0d34-0410-b5e6-96231b3b80d8
Some instructions require of a limited set of FP immediates as operands,
for example '#0.5 or #1.0' for SVE's FADD instruction.
This patch adds support for parsing and printing such FP immediates as
exact values (e.g. #0.499999 is not accepted for #0.5).
Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar
Reviewed By: SjoerdMeijer
Differential Revision: https://reviews.llvm.org/D47711
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@334826 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
For targets I'm not familiar with, I've automatically made the "default to 1 for each resource" behaviour explicit in the td files.
For more obvious cases, I've ventured a fix.
Some notes:
- Exynos is especially fishy.
- AArch64SchedThunderX2T99.td had some truncated entries. If I understand correctly, the person who wrote that interpreted the ResourceCycle as a range. I made the decision to use the upper/lower bound for consistency with the 'Latency' value. I'm sure there is a better choice.
- The change to X86ScheduleBtVer2.td is an NFC, it just makes values more explicit.
Also see PR37310.
Reviewers: RKSimon, craig.topper, javed.absar
Subscribers: kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D46356
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@334586 91177308-0d34-0410-b5e6-96231b3b80d8
Register x20 is a callee-saved register which may be used for other
purposes in certain contexts, for example to hold special variables
within the kernel. This change adds support for reserving this register
both to frontend and backend to make this register usable for these
purposes.
Differential Revision: https://reviews.llvm.org/D46552
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@334531 91177308-0d34-0410-b5e6-96231b3b80d8
On targets like Arm some relaxations may only be performed when certain
architectural features are available. As functions can be compiled with
differing levels of architectural support we must make a judgement on
whether we can relax based on the MCSubtargetInfo for the function. This
change passes through the MCSubtargetInfo for the function to
fixupNeedsRelaxation so that the decision on whether to relax can be made
per function. In this patch, only the ARM backend makes use of this
information. We must also pass the MCSubtargetInfo to applyFixup because
some fixups skip error checking on the assumption that relaxation has
occurred, to prevent code-generation errors applyFixup must see the same
MCSubtargetInfo as fixupNeedsRelaxation.
Differential Revision: https://reviews.llvm.org/D44928
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@334078 91177308-0d34-0410-b5e6-96231b3b80d8
This is setting up to fix bug 37573 cleanly.
This moves data structures that are technically both used in some way by the
target and the general-purpose outlining algorithm into MachineOutliner.h. In
particular, the `Candidate` class is of importance.
Before, the outliner passed the locations of `Candidates` to the target, which
would then make some decisions about the prospective outlined function. This
change allows us to just pass `Candidates` along to the target. This will allow
the target to discard `Candidates` that would be considered unsafe before cost
calculation. Thus, we will be able to remove the unsafe candidates described in
the bug without resorting to torching the entire prospective function.
Also, as a side-effect, it makes the outliner a bit cleaner.
https://bugs.llvm.org/show_bug.cgi?id=37573
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@333952 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
The new rules are straightforward. The main rules to keep in mind
are:
1. NAME is an implicit template argument of class and multiclass,
and will be substituted by the name of the instantiating def/defm.
2. The name of a def/defm in a multiclass must contain a reference
to NAME. If such a reference is not present, it is automatically
prepended.
And for some additional subtleties, consider these:
3. defm with no name generates a unique name but has no special
behavior otherwise.
4. def with no name generates an anonymous record, whose name is
unique but undefined. In particular, the name won't contain a
reference to NAME.
Keeping rules 1&2 in mind should allow a predictable behavior of
name resolution that is simple to follow.
The old "rules" were rather surprising: sometimes (but not always),
NAME would correspond to the name of the toplevel defm. They were
also plain bonkers when you pushed them to their limits, as the old
version of the TableGen test case shows.
Having NAME correspond to the name of the toplevel defm introduces
"spooky action at a distance" and breaks composability:
refactoring the upper layers of a hierarchy of nested multiclass
instantiations can cause unexpected breakage by changing the value
of NAME at a lower level of the hierarchy. The new rules don't
suffer from this problem.
Some existing .td files have to be adjusted because they ended up
depending on the details of the old implementation.
Change-Id: I694095231565b30f563e6fd0417b41ee01a12589
Reviewers: tra, simon_tatham, craig.topper, MartinO, arsenm, javed.absar
Subscribers: wdng, llvm-commits
Differential Revision: https://reviews.llvm.org/D47430
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@333900 91177308-0d34-0410-b5e6-96231b3b80d8
For immediates used in DUP instructions that have the range
-128 to 127, or a multiple of 256 in the range -32768 to 32512,
one could argue that when the result element size is 16bits (.h),
the value can be considered both signed and unsigned.
Reviewers: rengolin, fhahn, SjoerdMeijer, samparker, javed.absar
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D47619
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@333873 91177308-0d34-0410-b5e6-96231b3b80d8
Print the first indexed element as a FP register, for example:
mov z0.d, z1.d[0]
Is now printed as:
mov z0.d, d1
Next to printing, this patch also adds aliases to parse 'mov z0.d, d1'.
Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D47571
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@333872 91177308-0d34-0410-b5e6-96231b3b80d8
Unpredicated copy of indexed SVE element to SVE vector,
along with MOV-aliases.
For example:
dup z0.h, z1.h[0]
duplicates the first 16-bit element from z1 to all elements in
the result vector z0.
Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar
Reviewed By: SjoerdMeijer
Differential Revision: https://reviews.llvm.org/D47570
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@333871 91177308-0d34-0410-b5e6-96231b3b80d8
Predicated copy of floating-point immediate value to SVE vector,
along with MOV-aliases.
Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar
Reviewed By: javed.absar
Differential Revision: https://reviews.llvm.org/D47518
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@333869 91177308-0d34-0410-b5e6-96231b3b80d8
Predicated copy of possibly shifted immediate value into SVE
vector, along with MOV-aliases.
Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D47517
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@333868 91177308-0d34-0410-b5e6-96231b3b80d8
Unpredicated copy of floating-point immediate value into SVE vector,
along with MOV-aliases.
Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D47482
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@333744 91177308-0d34-0410-b5e6-96231b3b80d8
The immediate on an ADRP MCInst needs to be multiplied by 0x1000 to obtain the
actual PC-offset that will be calculated.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@333525 91177308-0d34-0410-b5e6-96231b3b80d8
Floating point immediate combining a negative sign and
a hexadecimal number, e.g. #-0x0 caused the compiler to crash.
Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar
Reviewed By: javed.absar
Differential Revision: https://reviews.llvm.org/D47483
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@333524 91177308-0d34-0410-b5e6-96231b3b80d8
This patch addresses the following variants:
- bitmask immediate, e.g. 'and z0.d, z0.d, #0x6'.
- unpredicated data vectors, e.g. 'and z0.d, z1.d, z2.d'.
- predicated data vectors, e.g. 'and z0.d, p0/m, z0.d, z1.d'.
And also several aliases, such as:
- ORN, alias of ORR.
- EON, alias of EOR.
- BIC, alias of AND (immediate variant)
- MOV, alias of ORR (if unpredicated and source register operands are the same)
Reviewers: rengolin, huntergr, fhahn, samparker, SjoerdMeijer, javed.absar
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D47363
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@333414 91177308-0d34-0410-b5e6-96231b3b80d8
This patch adds addsub_imm8_opt_lsl_(i8|i16|i32|i64) operands
that are unsigned values in the range 0 to 255. For element widths of
16 bits or higher it may also be a signed multiple of 256 in the
range 0 to 65280.
Note: This also does some refactoring to reuse convenience function
getShiftedVal<shift>(), and now allows AArch64 scalar 'ADD #-4096' to be
accepted to be mapped to SUB #4096.
Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D47310
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@333408 91177308-0d34-0410-b5e6-96231b3b80d8
Unpredicated copy of optionally-shifted immediate to SVE vector,
along with MOV-aliases.
This patch contains parsing and printing support for
cpy_imm8_opt_lsl_(i8|i16|i32|i64). This operand allows a signed value in
the range -128 to +127. For element widths of 16 bits or higher it may
also be a signed multiple of 256 in the range -32768 to +32512.
For element-width of 8 bits a range of -128 to 255 is accepted, since a copy
of a byte can be considered either signed/unsigned.
Note: This patch renames tryParseAddSubImm() -> tryParseImmWithOptionalShift()
and moves the behaviour of trying to shift a plain immediate by an allowed
shift-value to its addImmWithOptionalShiftOperands() method, so that the
parsing itself is generic and allows immediates from multiple shifted operands.
This is done because an immediate can be divisible by both shifted operands.
Reviewers: rengolin, fhahn, samparker, SjoerdMeijer, javed.absar
Reviewed By: fhahn
Differential Revision: https://reviews.llvm.org/D47309
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@333263 91177308-0d34-0410-b5e6-96231b3b80d8
The existing code has three different ways to try to lower a 64-bit
immediate to the sequence ORR+MOVK. The result is messy: it misses
some possible sequences, and the order of the checks means we sometimes
emit two MOVKs when we only need one.
Instead, just use a simple loop to try all possible two-instruction
ORR+MOVK sequences.
Differential Revision: https://reviews.llvm.org/D47176
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@333218 91177308-0d34-0410-b5e6-96231b3b80d8
The AArch64 asm parser currently has custom parsing logic for .hword, .word,
and .xword. Rather than use this custom logic, we can just use
addAliasForDirective to enable the reuse of AsmParser::parseDirectiveValue.
Differential Revision: https://reviews.llvm.org/D47000
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@333077 91177308-0d34-0410-b5e6-96231b3b80d8
When we're outlining a sequence that ends in a call, we can save up to
three instructions in the outlined function by turning the call into
a tail-call. I refer to this as thunk outlining because the resulting
outlined function looks like a thunk; suggestions welcome for a better
name.
In addition to making the outlined function shorter, thunk outlining
allows outlining calls which would otherwise be illegal to outline:
we don't need to save/restore LR, so we don't need to prove anything
about the stack access patterns of the callee.
To make this work effectively, I also added
MachineOutlinerInstrType::LegalTerminator to the generic MachineOutliner
code; this allows treating an arbitrary instruction as a terminator in
the suffix tree.
Differential Revision: https://reviews.llvm.org/D47173
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@333015 91177308-0d34-0410-b5e6-96231b3b80d8