When a block has no terminator instructions, getFirstTerminator() returns
end(), which can't be used in dominance checks. Check dominance for phi
operands separately.
Also, remove some bits from WebAssemblyRegStackify.cpp that were causing
trouble on the same testcase; they were left behind from an earlier
experiment.
Differential Revision: http://reviews.llvm.org/D15210
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254662 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
These ADJCALLSTACK markers don't generate code, but they keep dynamic
alloca code that calls chkstk out of the prologue.
This slightly pessimizes inalloca calls by preventing some register copy
coalescing, but I can live with that.
Reviewers: qcolombet
Subscribers: hans, llvm-commits
Differential Revision: http://reviews.llvm.org/D15200
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254645 91177308-0d34-0410-b5e6-96231b3b80d8
This works mostly fine but breaks some stage 1 builders when compiling
compiler-rt on i386. Revert for further investigation as I can't see an
obvious cause/fix.
This reverts commit r254577.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254586 91177308-0d34-0410-b5e6-96231b3b80d8
The new algorithm remembers the uses encountered while walking backwards
until a matching def is found. Contrary to the previous version this:
- Works without LiveIntervals being available
- Allows to increase the precision to subregisters/lanemasks
(not used for now)
The changes in the AMDGPU tests are necessary because the R600 scheduler
is not stable with respect to the order of nodes in the ready queues.
Differential Revision: http://reviews.llvm.org/D9068
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254577 91177308-0d34-0410-b5e6-96231b3b80d8
AggressiveAntiDepBreaker was renaming registers specified by the user
for inline assembly. While this will work for compiler-specified
registers, it won't work for user-specified registers, and at the time
this runs, I don't currently see a way to distinguish them.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254532 91177308-0d34-0410-b5e6-96231b3b80d8
The ARM ARM is clear that 128-bit loads are only guaranteed to have been atomic
if there has been a corresponding successful stxp. It's less clear for AArch32, so
I'm leaving that alone for now.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254524 91177308-0d34-0410-b5e6-96231b3b80d8
We mustn't introduce a shift of exactly 64-bits for any inputs, since that's an
UNDEF value (and worse, it's not what you want with the natural Arch64
implementation).
The generated code is pretty horrific, but I couldn't come up with an obviously
better alternative (if the amount is constant EXTR could help). Turns out
128-bit shifts are just nasty.
rdar://22491037
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254475 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This had been broken for a very long time, but nobody noticed until
D14357 enabled shrink-wrapping by default.
Reviewers: jroelofs, qcolombet
Subscribers: tyomitch, llvm-commits, rengolin
Differential Revision: http://reviews.llvm.org/D14986
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254444 91177308-0d34-0410-b5e6-96231b3b80d8
The @llvm.get.dynamic.area.offset.* intrinsic family is used to get the offset
from native stack pointer to the address of the most recent dynamic alloca on
the caller's stack. These intrinsics are intendend for use in combination with
@llvm.stacksave and @llvm.restore to get a pointer to the most recent dynamic
alloca. This is useful, for example, for AddressSanitizer's stack unpoisoning
routines.
Patch by Max Ostapenko.
Differential Revision: http://reviews.llvm.org/D14983
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254404 91177308-0d34-0410-b5e6-96231b3b80d8
(This is the second attempt to submit this patch. The first caused two assertion
failures and was reverted. See https://llvm.org/bugs/show_bug.cgi?id=25687)
The patch in http://reviews.llvm.org/D13745 is broken into four parts:
1. New interfaces without functional changes (http://reviews.llvm.org/D13908).
2. Use new interfaces in SelectionDAG, while in other passes treat probabilities
as weights (http://reviews.llvm.org/D14361).
3. Use new interfaces in all other passes.
4. Remove old interfaces.
This patch is 3+4 above. In this patch, MBB won't provide weight-based
interfaces any more, which are totally replaced by probability-based ones.
The interface addSuccessor() is redesigned so that the default probability is
unknown. We allow unknown probabilities but don't allow using it together
with known probabilities in successor list. That is to say, we either have a
list of successors with all known probabilities, or all unknown
probabilities. In the latter case, we assume each successor has 1/N
probability where N is the number of successors. An assertion checks if the
user is attempting to add a successor with the disallowed mixed use as stated
above. This can help us catch many misuses.
All uses of weight-based interfaces are now updated to use probability-based
ones.
Differential revision: http://reviews.llvm.org/D14973
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254377 91177308-0d34-0410-b5e6-96231b3b80d8
and the follow-up r254356: "Fix a bug in MachineBlockPlacement that may cause assertion failure during BranchProbability construction."
Asserts were firing in Chromium builds. See PR25687.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254366 91177308-0d34-0410-b5e6-96231b3b80d8
The patch in http://reviews.llvm.org/D13745 is broken into four parts:
1. New interfaces without functional changes (http://reviews.llvm.org/D13908).
2. Use new interfaces in SelectionDAG, while in other passes treat probabilities
as weights (http://reviews.llvm.org/D14361).
3. Use new interfaces in all other passes.
4. Remove old interfaces.
This patch is 3+4 above. In this patch, MBB won't provide weight-based
interfaces any more, which are totally replaced by probability-based ones.
The interface addSuccessor() is redesigned so that the default probability is
unknown. We allow unknown probabilities but don't allow using it together
with known probabilities in successor list. That is to say, we either have a
list of successors with all known probabilities, or all unknown
probabilities. In the latter case, we assume each successor has 1/N
probability where N is the number of successors. An assertion checks if the
user is attempting to add a successor with the disallowed mixed use as stated
above. This can help us catch many misuses.
All uses of weight-based interfaces are now updated to use probability-based
ones.
Differential revision: http://reviews.llvm.org/D14973
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254348 91177308-0d34-0410-b5e6-96231b3b80d8
We currently output FMA instructions on targets which support both FMA4 + FMA (i.e. later Bulldozer CPUS bdver2/bdver3/bdver4).
This patch flips this so FMA4 is preferred; this is for several reasons:
1 - FMA4 is non-destructive reducing the need for mov instructions.
2 - Its more straighforward to commute and fold inputs (although the recent work on FMA has reduced this difference).
3 - All supported targets have FMA4 performance equal or better to FMA - Piledriver (bdver2) in particular has half the throughput when executing FMA instructions.
Its looks like no future AMD processor lines will support FMA4 after the Bulldozer series so we're not causing problems for later CPUs.
Differential Revision: http://reviews.llvm.org/D14997
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254339 91177308-0d34-0410-b5e6-96231b3b80d8
If we know we have stack objects, we reserve the registers
that the private buffer resource and wave offset are passed
and use them directly.
If not, reserve the last 5 SGPRs just in case we need to spill.
After register allocation, try to pick the next available registers
instead of the last SGPRs, and then insert copies from the inputs
to the reserved registers in the progloue.
This also only selectively enables all of the input registers
which are really required instead of always enabling them.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254331 91177308-0d34-0410-b5e6-96231b3b80d8
It does not work because of emergency stack slots.
This pass was supposed to eliminate dummy registers for the
spill instructions, but the register scavenger can introduce
more during PrologEpilogInserter, so some would end up
left behind if they were needed.
The potential for spilling the scratch resource descriptor
and offset register makes doing something like this
overly complicated. Reserve registers to use for the resource
descriptor and use them directly in eliminateFrameIndex.
Also removes creating another scratch resource descriptor
when directly selecting scratch MUBUF instructions.
The choice of which registers are reserved is temporary.
For now it attempts to pick the next available registers
after the user and system SGPRs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254329 91177308-0d34-0410-b5e6-96231b3b80d8
The MachineVerifier wants to check that the register operands of an
instruction belong to the instruction's register class. RIP-relative
control flow instructions violated this by referencing RIP. While this
was fixed for SysV, it was never fixed for Win64.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254315 91177308-0d34-0410-b5e6-96231b3b80d8
Re-enable shrink wrapping for PPC64 Little Endian.
One minor modification to PPCFrameLowering::findScratchRegister was necessary to handle fall-thru blocks (blocks with no terminator) correctly.
Tested with all LLVM test, clang tests, and the self-hosting build, with no problems found.
PHabricator: http://reviews.llvm.org/D14778
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254314 91177308-0d34-0410-b5e6-96231b3b80d8
We could already recognise shuffle(FSUB, FADD) -> ADDSUB, this allow us to recognise shuffle(FADD, FSUB) -> ADDSUB by commuting the shuffle mask prior to matching.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254259 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts commit r254201 and r254202, as it broke test-suite,
self-hosting and sanitizer tests on ARM buildbots.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254234 91177308-0d34-0410-b5e6-96231b3b80d8
Added FMADD/FMSUB/FNMADD/FNMSUB tests for all types
Added load folding tests for 512-bit vectors
NOTE: Many of the AVX512 FMA instructions don't yet commute/fold correctly
As discussed on D14909
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254232 91177308-0d34-0410-b5e6-96231b3b80d8
This patch implements dynamic realignment of stack objects for targets
with a non-realigned stack pointer. Behaviour in FunctionLoweringInfo
is changed so that for a target that has StackRealignable set to
false, over-aligned static allocas are considered to be variable-sized
objects and are handled with DYNAMIC_STACKALLOC nodes.
It would be good to group aligned allocas into a single big alloca as
an optimization, but this is yet todo.
SystemZ benefits from this, due to its stack frame layout.
New tests SystemZ/alloca-03.ll for aligned allocas, and
SystemZ/alloca-04.ll for "no-realign-stack" attribute on functions.
Review and help from Ulrich Weigand and Hal Finkel.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254227 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Since this build attribute corresponds to a whole module, and
different functions in a module may differ in the optimizations
enabled for them, this attribute is emitted after all functions,
and only in the case that the optimization goals for all
functions match.
Reviewers: logan, hans
Subscribers: aemerson, rengolin, llvm-commits
Differential Revision: http://reviews.llvm.org/D14934
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@254201 91177308-0d34-0410-b5e6-96231b3b80d8