DMB instructions can be expensive, so it's best to avoid them if possible. In
atomicrmw operations there will always be an attempted store, so a release
barrier is always needed, but in the cmpxchg case we can delay the DMB until we
know we'll definitely try to perform a store (and so need release semantics).
In the strong cmpxchg case this isn't quite free: we must duplicate the LDREX
instructions to skip the barrier on subsequent iterations. The basic outline
becomes:
    ldrex rOld, [rAddr]
    cmp rOld, rDesired
    bne Ldone
    dmb
Lloop:
    strex rRes, rNew, [rAddr]
    cbz rRes, Ldone
    ldrex rOld, [rAddr]
    cmp rOld, rDesired
    beq Lloop
Ldone:
So we skip this barrier-delaying version for strong operations in "minsize" functions.
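For reference, a minimal C++ sketch (not from this patch) of the source-level
operation that gets expanded into the sequence above; the strong/weak
distinction is what drives the LDREX duplication and the minsize exception:

  #include <atomic>

  // compare_exchange_strong must retry on spurious STREX failures, which is
  // why the expansion above duplicates the LDREX/CMP sequence inside the
  // loop; compare_exchange_weak may simply report failure and avoids that.
  bool publish(std::atomic<int> &Slot, int Expected, int Desired) {
    return Slot.compare_exchange_strong(Expected, Desired,
                                        std::memory_order_acq_rel,
                                        std::memory_order_acquire);
  }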
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@261568 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Convergent instrs shouldn't be made control-dependent on other values,
but this is basically the whole point of tail duplication. So just bail
if we see a convergent instruction.
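A minimal sketch of the kind of legality check this adds (names and placement
are assumptions, not the actual TailDuplication code):

  #include "llvm/CodeGen/MachineBasicBlock.h"
  #include "llvm/CodeGen/MachineInstr.h"

  // Refuse to tail-duplicate a block containing a convergent instruction,
  // since duplicating the block would give that instruction new control
  // dependences.
  static bool hasConvergentInstr(const llvm::MachineBasicBlock &MBB) {
    for (const llvm::MachineInstr &MI : MBB)
      if (MI.isConvergent())
        return true;
    return false;
  }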
Reviewers: iteratee
Subscribers: jholewinski, jhen, hfinkel, tra, jingyue, llvm-commits
Differential Revision: http://reviews.llvm.org/D17320
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@261540 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts commit r261510, effectively reapplying r261509. The
original commit missed a caller in AArch64ConditionalCompares.
Original commit message:
Pass non-null arguments by reference in MachineTraceMetrics::Trace,
simplifying future work to remove implicit iterator => pointer
conversions.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@261511 91177308-0d34-0410-b5e6-96231b3b80d8
Pass non-null arguments by reference in MachineTraceMetrics::Trace,
simplifying future work to remove implicit iterator => pointer
conversions.
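An illustrative toy example of the pattern (not the actual
MachineTraceMetrics::Trace signatures): the non-null precondition moves into
the type, so call sites dereference explicitly instead of leaning on an
implicit iterator-to-pointer conversion.

  #include <cassert>

  struct Node { unsigned Depth; };

  // Before: pointer parameter with an implicit "must not be null" contract.
  unsigned depthOfPtr(const Node *N) {
    assert(N && "caller must pass a non-null node");
    return N->Depth;
  }

  // After: the precondition lives in the type; for an iterator I, callers
  // now write depthOf(*I) rather than relying on an implicit conversion.
  unsigned depthOf(const Node &N) { return N.Depth; }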
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@261509 91177308-0d34-0410-b5e6-96231b3b80d8
Delete MachineInstr::getIterator(), since the term "iterator" is
overloaded when talking about MachineInstr.
- Downcast to ilist_node in iplist::getNextNode() and getPrevNode() so
that ilist_node::getIterator() is still available.
- Add it back as MachineInstr::getInstrIterator(). This matches the
naming in MachineBasicBlock.
- Add MachineInstr::getBundleIterator(). This is explicitly called
"bundle" (not matching MachineBasicBlock) to disintinguish it clearly
from ilist_node::getIterator().
- Update all calls. Some of these I switched to `auto` to remove
boiler-plate, since the new name is clear about the type.
There was one call I updated that looked fishy, but it wasn't clear what
the right answer was. This was in X86FrameLowering::inlineStackProbe(),
added in r252578 in lib/Target/X86/X86FrameLowering.cpp. I opted to
leave the behaviour unchanged, but I'll reply to the original commit on
the list in a moment.
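A hedged sketch of the per-instruction iterator described above (the accessor
name is taken from this message; headers and exact return types are
assumptions):

  #include "llvm/CodeGen/MachineBasicBlock.h"
  #include "llvm/CodeGen/MachineInstr.h"

  // Walk the remaining individual instructions of MI's block, starting at MI
  // itself; unlike a bundle iterator, this steps inside bundles.
  static unsigned countRemainingInstrs(llvm::MachineInstr &MI) {
    llvm::MachineBasicBlock &MBB = *MI.getParent();
    unsigned N = 0;
    for (auto I = MI.getInstrIterator(), E = MBB.instr_end(); I != E; ++I)
      ++N;
    return N;
  }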
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@261504 91177308-0d34-0410-b5e6-96231b3b80d8
I missed == and != when I removed implicit conversions between iterators
and pointers in r252380 since they were defined outside ilist_iterator.
Since they depend on getNodePtrUnchecked(), they indirectly rely on UB.
This commit removes all uses of these operators. (I'll delete the
operators themselves in a separate commit so that it can be easily
reverted if necessary.)
There should be NFC here.
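For illustration, the shape of the call-site change (a toy example, not one of
the actual updated call sites): instead of comparing an iterator directly
against a pointer, dereference the iterator and compare pointers explicitly.

  #include "llvm/IR/BasicBlock.h"
  #include "llvm/IR/Instruction.h"

  // Is Inst the first instruction of BB? Written without an iterator ==
  // pointer comparison.
  static bool isFirstInstruction(const llvm::BasicBlock &BB,
                                 const llvm::Instruction *Inst) {
    auto It = BB.begin();
    return It != BB.end() && &*It == Inst;
  }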
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@261498 91177308-0d34-0410-b5e6-96231b3b80d8
Stop using `getNodePtrUnchecked()` when building IR. Eventually a
dereference will be required to get at the downcast node, since the
iterator will only store an `ilist_node_base` of some sort.
This should have no functionality change for now, but is a path towards
removing some more UB from ilist.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@261495 91177308-0d34-0410-b5e6-96231b3b80d8
`ilist_iterator<NodeTy>::getNodePtrUnchecked()` is documented as being
for internal use only, but CodeGenPrepare was using it anyway. This
code relies on pulling out the `Value*` pointer even after the lifetime
of the iterator is over. But having this pointer available in
ilist_iterator depends on UB in the first place.
Instead, safely pull out the `Value*` when the iterator is alive and
stop using the internal-only API.
There should be no functionality change here.
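A toy sketch of the safe idiom described here (not the actual CodeGenPrepare
code): take the address while the iterator is still alive and keep only the
plain pointer afterwards.

  #include "llvm/IR/BasicBlock.h"
  #include "llvm/IR/Instruction.h"

  // Capture the Instruction* while the iterator is valid; only the pointer
  // outlives the iterator, so nothing depends on iterator internals later.
  static llvm::Instruction *firstInstructionOrNull(llvm::BasicBlock &BB) {
    llvm::BasicBlock::iterator It = BB.begin();
    return It != BB.end() ? &*It : nullptr;
  }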
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@261493 91177308-0d34-0410-b5e6-96231b3b80d8
COFF doesn't have sections with mergeable contents. Instead, each
constant pool entry ends up in a COMDAT section. The linker, when
choosing between COMDAT sections, doesn't choose the max alignment of
the two sections. You just get whatever alignment was on the section.
If one object file needs a higher alignment for a constant than another
object file does, we get into trouble if the linker keeps the section with
the lower alignment.
Instead, let's promote the alignment of the constant pool entry to make
sure we don't use an under-aligned constant with an instruction that
assumed otherwise.
This fixes PR26680.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@261462 91177308-0d34-0410-b5e6-96231b3b80d8
TailDuplicate can run either on SSA code or on non-SSA code, as indicated to
it by MRI->isSSA() ("PreRegAlloc" here). TailDuplicate does extra work to
preserve SSA invariants when it duplicates code. This patch makes it skip
some of this extra work in the case where the code is not in SSA form.
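Roughly, the gating looks like this (a sketch under assumed names, not the
patch itself):

  #include "llvm/CodeGen/MachineFunction.h"
  #include "llvm/CodeGen/MachineRegisterInfo.h"

  // Decide once whether the SSA bookkeeping (virtual-register rewriting,
  // PHI updates) is needed; when the code is no longer in SSA form, the
  // pass can skip that work.
  static bool needsSSAUpdate(const llvm::MachineFunction &MF) {
    return MF.getRegInfo().isSSA(); // "PreRegAlloc" in the pass's terms
  }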
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@261450 91177308-0d34-0410-b5e6-96231b3b80d8
This avoids unnecessarily passing them around when calling helper
functions. It may also be slightly faster to call clear() on the
datastructures instead of freshly initializing them for each block.
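A generic illustration of that container-reuse idiom (member names are made
up, not the pass's actual state):

  #include "llvm/ADT/SmallVector.h"
  #include "llvm/CodeGen/MachineBasicBlock.h"
  #include "llvm/CodeGen/MachineInstr.h"

  struct PerBlockScratch {
    // Held as members so helpers can use them without extra parameters and
    // so their storage is reused from block to block.
    llvm::SmallVector<llvm::MachineBasicBlock *, 8> WorkList;
    llvm::SmallVector<llvm::MachineInstr *, 16> Copies;

    void resetForBlock() {
      // clear() keeps the allocated capacity, unlike constructing fresh
      // containers for every block.
      WorkList.clear();
      Copies.clear();
    }
  };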
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@261407 91177308-0d34-0410-b5e6-96231b3b80d8
Now that we don't always add an element to AllocatedStackSlots if we
don't find a pre-existing unallocated stack slot, bumping
StatepointMaxSlotsRequired to `NumSlots + 1` is not correct. Instead,
bump the statistic near the push_back, to
Builder.FuncInfo.StatepointStackSlots.size().
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@261348 91177308-0d34-0410-b5e6-96231b3b80d8
The check on MFI->getObjectSize() has to be on the FrameIndex, not on
the index of the FrameIndex in AllocatedStackSlots. Weirdly, the tests
I added in rL261336 didn't catch this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@261347 91177308-0d34-0410-b5e6-96231b3b80d8
NFCI. The key motivation here is that I'd like to use
SmallBitVector::all() in a later change. Also, using a bit vector here
seemed better in general.
The only interesting change here is that in the failure case of
allocateStackSlot, we no longer (the equivalent of) push_back(true) to
AllocatedStackSlots. As far as I can tell, this is fine, since we'd
never re-use those slots in the same StatepointLoweringState instance.
Technically there was no need to change the operator[]-style accesses to
set() and test(), but I thought it'd be nice to make it obvious that
we're using something other than a std::vector-like container.
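For reference, the SmallBitVector operations being leaned on here (a toy
wrapper, not the statepoint lowering code):

  #include "llvm/ADT/SmallBitVector.h"

  struct SlotTracker {
    // One bit per statepoint stack slot; a set bit means "already allocated".
    llvm::SmallBitVector AllocatedStackSlots;

    explicit SlotTracker(unsigned NumSlots)
        : AllocatedStackSlots(NumSlots, false) {}

    void markAllocated(unsigned Idx) { AllocatedStackSlots.set(Idx); }
    bool isAllocated(unsigned Idx) const { return AllocatedStackSlots.test(Idx); }
    bool allAllocated() const { return AllocatedStackSlots.all(); }
  };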
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@261337 91177308-0d34-0410-b5e6-96231b3b80d8
allocateStackSlot did not consider the size of the value to be spilled
before deciding to re-use a spill slot. This was okay originally (since
we'd only ever spill pointers), but it stopped being okay when
we changed our scheme to directly spill vectors of pointers.
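The essence of the fix, as a hedged sketch (names approximate): when scanning
for a reusable slot, also require that the existing slot's size matches the
value being spilled.

  #include "llvm/CodeGen/MachineFrameInfo.h"
  #include <cstdint>

  // Reuse a previously created spill slot only if it is both free and
  // exactly the right size for the value being spilled.
  static bool canReuseSlot(const llvm::MachineFrameInfo &MFI, int FrameIndex,
                           bool SlotIsFree, uint64_t SpillSize) {
    return SlotIsFree && MFI.getObjectSize(FrameIndex) == (int64_t)SpillSize;
  }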
While this change fixes the bug pointed out, it has two performance
caveats:
- It matches spill slot and spillee size exactly, while in theory we
can spill, e.g., an 8 byte pointer into a 16 byte slot. This is
slightly complicated to fix since in the stackmaps section, we report
the size of the spill slot as the size of the "indirect value"; and
if they're no longer equivalent, we'll have to keep track of the
(indirect) value size separately from the stack slot size.
- It will "spuriously run out" of reusable slots, since we now have an
second check in the search loop in addition to the availablity
check (e.g. you had two free scalar slots, and you first ask for a
vector slot followed by a scalar slot). I'll fix this in a later
commit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@261336 91177308-0d34-0410-b5e6-96231b3b80d8
This removes the unusual loop structure in allocateStackSlot in favor of
something more straightforward. I've also removed the cautionary
comment in the function, which I suspect is historical cruft now, and
confuses more than it enlightens.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@261335 91177308-0d34-0410-b5e6-96231b3b80d8
Certain optimization passes (like globaldce) can prune function
declarations that SjLjEHPrepare assumed would exist by the time its
runOnFunction is called.
This fixes PR26669.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@261303 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Without this, this command
$ llvm-run llc -stop-after machine-cp -o - <( echo '' )
outputs an error, because we close stdout twice -- once when closing the
file opened for "-o", and again when closing outs().
Also clarify in the outs() definition that you can't ever call it if you
want to open your own raw_fd_ostream on stdout.
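A hedged sketch of the hazard (simplified, not the actual llc code): a stream
opened on fd 1 with shouldClose=true, combined with outs() also owning stdout
(as described above), means fd 1 gets closed twice at shutdown.

  #include "llvm/Support/raw_ostream.h"

  void demoDoubleClose() {
    // "-o -" used to be opened roughly like this: a stream on fd 1 that
    // closes it when destroyed...
    llvm::raw_fd_ostream Out(/*fd=*/1, /*shouldClose=*/true);
    Out << "pass output\n";
    // ...but outs() also wraps fd 1 and (per the description above) closes
    // it too, so stdout is closed twice and the second close reports an
    // error.
    llvm::outs() << "done\n";
  }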
Reviewers: jroelofs, tstellarAMD
Subscribers: jholewinski, qcolombet, dsanders, llvm-commits
Differential Revision: http://reviews.llvm.org/D17422
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@261286 91177308-0d34-0410-b5e6-96231b3b80d8
Today, we do not allow cmpxchg operations with pointer arguments. We require the frontend to insert ptrtoint casts and do the cmpxchg in integers. While correct, this is problematic from a couple of perspectives:
1) It makes the IR harder to analyse (for instance, it makes capture tracking overly conservative)
2) It pushes work onto the frontend authors for no real gain
This patch implements the simplest form of IR support. As we did with floating point loads and stores, we teach AtomicExpand to convert back to the old representation. This prevents us needing to change all backends in a single lock step change. Over time, we can migrate each backend to natively selecting the pointer type. In the meantime, we get the advantages of a cleaner IR representation without waiting for the backend changes.
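For context, a small C++ example (mine, not from the patch) of the frontend code this is about: an atomic compare-and-swap on a pointer value, which previously had to be lowered to an integer cmpxchg wrapped in ptrtoint/inttoptr casts.

  #include <atomic>

  struct Node { Node *Next; };

  // Lock-free push: the compare-exchange operates directly on a pointer.
  // With this change the IR can carry a pointer-typed cmpxchg instead of the
  // frontend emitting casts around an integer one.
  void push(std::atomic<Node *> &Head, Node *N) {
    Node *Old = Head.load(std::memory_order_relaxed);
    do {
      N->Next = Old;
    } while (!Head.compare_exchange_weak(Old, N, std::memory_order_release,
                                         std::memory_order_relaxed));
  }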
Differential Revision: http://reviews.llvm.org/D17413
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@261281 91177308-0d34-0410-b5e6-96231b3b80d8
covmap needs to be created as non-allocatable, but not with
SHT_NOTE. The latter was needed to work around a problem
in the BFD linker with section GC, which is no longer needed. (A more
proper long-term fix requires changing the FE driver to force
referencing the section using a linker script.)
Differential Revision: http://reviews.llvm.org/D17309
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@261228 91177308-0d34-0410-b5e6-96231b3b80d8
The commit breaks stage2 compilation on PowerPC. Reverting for now while
this is analyzed. I also have to revert the LiveIntervalTest for now as
that depends on this commit.
Revert "LiveIntervalAnalysis: Remove LiveVariables requirement"
This reverts commit r260806.
Revert "Remove an unnecessary std::move to fix -Wpessimizing-move warning."
This reverts commit r260931.
Revert "Fix typo in LiveIntervalTest"
This reverts commit r260907.
Revert "Add unittest for LiveIntervalAnalysis::handleMove()"
This reverts commit r260905.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@261189 91177308-0d34-0410-b5e6-96231b3b80d8
This patch detects vector reductions before instruction selection. Vector
reductions are vectorized reduction operations, and for such operations we have
freedom to reorganize the elements of the result as long as the reduction over them
stays unchanged. This will enable some reduction pattern recognition during
instruction combine such as SAD/dot-product on X86. A flag is added to
SDNodeFlags to mark those vector reduction nodes to be checked during instruction
combine.
To detect those vector reductions, we search def-use chains starting from the
given instruction, and check if all uses fall into two categories:
1. Reduction with another vector.
2. Reduction on all elements.
Category 2 is detected by recognizing the pattern that the loop vectorizer
generates to reduce all elements in the vector outside of the loop, which
consists of several ShuffleVector instructions and one ExtractElement instruction.
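As a concrete illustration (my example, not from the patch), a source-level
reduction of the SAD flavor mentioned above; once the loop is vectorized, its
out-of-loop horizontal sum is exactly the ShuffleVector/ExtractElement tail
being recognized.

  #include <cstdlib>

  // Sum of absolute differences over two byte arrays. The vectorized loop
  // accumulates into a vector, and the final horizontal reduction outside
  // the loop is the pattern matched as category 2, enabling SAD/dot-product
  // recognition on X86.
  int sumAbsDiff(const unsigned char *A, const unsigned char *B, int N) {
    int Sum = 0;
    for (int i = 0; i < N; ++i)
      Sum += std::abs(A[i] - B[i]);
    return Sum;
  }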
Differential revision: http://reviews.llvm.org/D15250
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@261070 91177308-0d34-0410-b5e6-96231b3b80d8
This apparently comes up when the register allocator decides that a
variable will become undef along a certain path.
Also improve the error message we emit when we can't map from LLVM
register number to CV register number.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@261016 91177308-0d34-0410-b5e6-96231b3b80d8
Eventually we should find a way to describe constant variables, but it
is not obvious how to do this at the moment.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@261010 91177308-0d34-0410-b5e6-96231b3b80d8
Original message:
Get rid of the ifdefs in TargetLowering.
Introduce a new API used only by GlobalISel: CallLowering.
This API will contain target hooks dedicated to call lowering.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260998 91177308-0d34-0410-b5e6-96231b3b80d8
Introduce a new API used only by GlobalISel: CallLowering.
This API will contain target hooks dedicated to call lowering.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@260922 91177308-0d34-0410-b5e6-96231b3b80d8