For stack frames requiring realignment, three pointers may be needed:
- ebp to address incoming arguments
- esi (could be any callee-saved register) to address locals
- esp to address outgoing arguments
We would use esi unconditionally without verifying that it did not
conflict with inline assembly.
This change doesn't do the verification, it simply emits a fatal error
on functions that use stack realignment, dynamic SP adjustments, and
inline assembly.
Because stack realignment is common on Windows, we also no longer assume
that MS inline assembly clobbers esp. Instead, we analyze the inline
instructions for implicit definitions and check if esp is there. If so,
we require the use of a base pointer and consider it in the condition
above.
Mostly fixes PR16830, but we could try harder to find a non-conflicting
base pointer.
Reviewers: sunfish
Differential Revision: http://llvm-reviews.chandlerc.com/D1317
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196876 91177308-0d34-0410-b5e6-96231b3b80d8
Patch by Jiangning Liu.
With some test case changes:
- intrinsic test added to the existing /test/CodeGen/AArch64/neon-aba-abd.ll.
- New test cases to cover movi 1D scenario without using the intrinsic in
test/CodeGen/AArch64/neon-mov.ll.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196806 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
The MSA ld.[bhwd] and st.[bhwd] instructions scale the immediate by the
element size before use as an offset. The offset must therefore be a
multiple of the element size to be valid in these instructions. However,
an unaligned base address is valid in MSA.
This commit causes the compiler to emit valid code when the calculated
offset is not a multiple of the element size by accounting for the offset
using addiu and using a zero offset in the load/store.
Depends on D2338
Reviewers: matheusalmeida
Reviewed By: matheusalmeida
Differential Revision: http://llvm-reviews.chandlerc.com/D2339
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196777 91177308-0d34-0410-b5e6-96231b3b80d8
As we can't make a complete solution now it has been decided to enable .set directive to handle long jump expressions. This will cause parser to report errors when parsing integer based register assignments, for example:
.set r3, will be reported as error. Still, the need for expressions is higher priority as the integer based register assignments are Mips specific and can be avoided using register names.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196773 91177308-0d34-0410-b5e6-96231b3b80d8
When trying to eliminate an "sub sp, sp, #N" instruction by folding
it into an existing push/pop using dummy registers, we need to account
for the fact that this might affect precisely how "fp" gets set in the
prologue.
We were attempting this, but assuming that *whenever* we performed a
fold it would make a difference. This is false, for example, in:
push {r4, r7, lr}
add fp, sp, #4
vpush {d8}
sub sp, sp, #8
we can fold the "sub" into the "vpush", forming "vpush {d7, d8}".
However, in that case the "add fp" instruction mustn't change, which
we were getting wrong before.
Should fix PR18160.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196725 91177308-0d34-0410-b5e6-96231b3b80d8
They were out of place since the introduction of arbitrary precision integer
types.
This also synchronizes the documentation to Types.h, so it refers to first class
types and single value types.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196661 91177308-0d34-0410-b5e6-96231b3b80d8
- krait processor currently modeled with the same features as A9.
- Krait processor additionally has VFP4 (fused multiply add/sub)
and hardware division features enabled.
- krait has currently the same Schedule model as A9
- krait cpu flag is not recognized by the GNU assembler yet,
it is replaced with march=armv7-a to avoid a lower march
from being used.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196619 91177308-0d34-0410-b5e6-96231b3b80d8
The current peephole optimizing for compare inst assumes an instr that
uses CPSR has an MO for ARM Cond code.However, for VSEL instructions
(vseqeq, vselgt, vselgt, vselvs), there is no such operand nor do
they support the modification of Cond Code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196588 91177308-0d34-0410-b5e6-96231b3b80d8
Since z has no setcc instruction as such, the choice of setBooleanContents
is a bit arbitrary. Currently it's set to ZeroOrOneBooleanContent,
so we produced a branch-free form when selecting between 0 and 1,
but not when selecting between 0 and -1. This patch handles the latter
case too.
At some point I'd like to measure whether it's better to use conditional
moves for constant selects on z196, but that's future work.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196578 91177308-0d34-0410-b5e6-96231b3b80d8
in case the operands are constants and its difference is |1|.
It should be possible in those cases to rematerialize the result using
MIPS's slt and similar instructions.
The small update to some of the tests in cmov.ll, sel1c.ll and sel2c.ll was needed
otherwise the optimization implemented in this patch would have been triggered
(difference between the operands was 1) and that would have changed the semantic
of the tests.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196498 91177308-0d34-0410-b5e6-96231b3b80d8
The structure of the code was slightly modified so that the next patch is easier to read/review.
No functional changes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196496 91177308-0d34-0410-b5e6-96231b3b80d8
not being correctly encoded/decoded.
In more detail, immediate fields of LD/ST instructions should be
divided/multiplied by the size of the data format before encoding and
after decoding, respectively.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196494 91177308-0d34-0410-b5e6-96231b3b80d8
We were trying to fold the stack adjustment into the wrong instruction in the
situation where the entire basic-block was epilogue code. Really, it can only
ever be valid to do the folding precisely where the "add sp, ..." would be
placed so there's no need for a separate iterator to track that.
Should fix PR18136.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196493 91177308-0d34-0410-b5e6-96231b3b80d8
getSymbolWithGlobalValueBase use is to create a name of a new symbol based
on the name of an existing GV. Assert that and then remove the last call
to pass true to isImplicitlyPrivate.
This gives the mangler API a 1:1 mapping from GV to names, which is what we
need to drop the mangler dependency on the target (and use an extended
datalayout instead).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196472 91177308-0d34-0410-b5e6-96231b3b80d8
This patch tries to avoid unrelated changes other than fixing a few
hyphen-related ambiguities and contractions in nearby lines.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196471 91177308-0d34-0410-b5e6-96231b3b80d8
given
declare void @llvm.memset.p0i8.i32(i8* nocapture, i8, i32, i32, i1)
declare void @foo()
define void @bar() {
call void @foo()
call void @llvm.memset.p0i8.i32(i8* null, i8 0, i32 188, i32 1, i1 false)
ret void
}
We used to produce
L_foo$stub:
.indirect_symbol _foo
.ascii "\364\364\364\364\364"
_memset$stub:
.indirect_symbol _memset
.ascii "\364\364\364\364\364"
We not produce a private stub for memset too.
Stubs are not needed with recent linkers, but we still produce them for darwin8.
Thanks to David Fang for confirming that gcc used to do this too.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@196468 91177308-0d34-0410-b5e6-96231b3b80d8