Masked-expand-load node represents load operation that loads a variable amount of elements from memory according to amount of "true" bits in the mask and expands the loaded elements according to their position in the mask vector.
Right now, the node is used in intrinsics for VEXPAND* instructions.
The work is done towards implementation of masked.expandload and masked.compressstore intrinsics.
Differential Revision: https://reviews.llvm.org/D25322
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283694 91177308-0d34-0410-b5e6-96231b3b80d8
The core of the change is supposed to be NFC, however it also fixes
what I believe was an undefined behavior when calling:
va_start(ValueArgs, Desc);
with Desc being a StringRef.
Differential Revision: https://reviews.llvm.org/D25342
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283671 91177308-0d34-0410-b5e6-96231b3b80d8
The tail duplication pass uses an assumed layout when making duplication
decisions. This is fine, but passes up duplication opportunities that
may arise when blocks are outlined. Because we want the updated CFG to
affect subsequent placement decisions, this change must occur during
placement.
In order to achieve this goal, TailDuplicationPass is split into a
utility class, TailDuplicator, and the pass itself. The pass delegates
nearly everything to the TailDuplicator object, except for looping over
the blocks in a function. This allows the same code to be used for tail
duplication in both places.
This change, in concert with outlining optional branches, allows
triangle shaped code to perform much better, esepecially when the
taken/untaken branches are correlated, as it creates a second spine when
the tests are small enough.
Issue from previous rollback fixed, and a new test was added for that
case as well. Issue was worklist/scheduling/taildup issue in layout.
Issue from 2nd rollback fixed, with 2 additional tests. Issue was
tail merging/loop info/tail-duplication causing issue with loops that share
a header block.
Differential revision: https://reviews.llvm.org/D18226
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283619 91177308-0d34-0410-b5e6-96231b3b80d8
The code used llvm basic block predecessors to decided where to insert phi
nodes. Instruction selection can and will liberally insert new machine basic
block predecessors. There is not a guaranteed one-to-one mapping from pred.
llvm basic blocks and machine basic blocks.
Therefore the current approach does not work as it assumes we can mark
predecessor machine basic block as needing a copy, and needs to know the set of
all predecessor machine basic blocks to decide when to insert phis.
Instead of computing the swifterror vregs as we select instructions, propagate
them at the end of instruction selection when the MBB CFG is complete.
When an instruction needs a swifterror vreg and we don't know the value yet,
generate a new vreg and remember this "upward exposed" use, and reconcile this
at the end of instruction selection.
This will only happen if the target supports promoting swifterror parameters to
registers and the swifterror attribute is used.
rdar://28300923
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283617 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: This makes a change to the state used to maintain visited information for depth first iterator. We know assume a method "completed(...)" which is called after all children of a node have been visited. In all existing cases, this method does nothing so this patch has no functional changes. It will however allow a client to distinguish back from cross edges in a DFS tree.
Reviewers: nadav, mehdi_amini, dberlin
Subscribers: MatzeB, mzolotukhin, twoh, freik, llvm-commits
Differential Revision: https://reviews.llvm.org/D25191
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283391 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts commit 062ace9764.
Issue with loop info and block removal revealed by polly.
I have a fix for this issue already in another patch, I'll re-roll this
together with that fix, and a test case.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283292 91177308-0d34-0410-b5e6-96231b3b80d8
The tail duplication pass uses an assumed layout when making duplication
decisions. This is fine, but passes up duplication opportunities that
may arise when blocks are outlined. Because we want the updated CFG to
affect subsequent placement decisions, this change must occur during
placement.
In order to achieve this goal, TailDuplicationPass is split into a
utility class, TailDuplicator, and the pass itself. The pass delegates
nearly everything to the TailDuplicator object, except for looping over
the blocks in a function. This allows the same code to be used for tail
duplication in both places.
This change, in concert with outlining optional branches, allows
triangle shaped code to perform much better, esepecially when the
taken/untaken branches are correlated, as it creates a second spine when
the tests are small enough.
Issue from previous rollback fixed, and a new test was added for that
case as well.
Differential revision: https://reviews.llvm.org/D18226
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283274 91177308-0d34-0410-b5e6-96231b3b80d8
The motivation for the change is that we can't have pseudo-global settings for
codegen living in TargetOptions because that doesn't work with LTO.
Ideally, these reciprocal attributes will be moved to the instruction-level via
FMF, metadata, or something else. But making them function attributes is at least
an improvement over the current state.
The ingredients of this patch are:
Remove the reciprocal estimate command-line debug option.
Add TargetRecip to TargetLowering.
Remove TargetRecip from TargetOptions.
Clean up the TargetRecip implementation to work with this new scheme.
Set the default reciprocal settings in TargetLoweringBase (everything is off).
Update the PowerPC defaults, users, and tests.
Update the x86 defaults, users, and tests.
Note that if this patch needs to be reverted, the related clang patch checked in
at r283251 should be reverted too.
Differential Revision: https://reviews.llvm.org/D24816
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283252 91177308-0d34-0410-b5e6-96231b3b80d8
The tail duplication pass uses an assumed layout when making duplication
decisions. This is fine, but passes up duplication opportunities that
may arise when blocks are outlined. Because we want the updated CFG to
affect subsequent placement decisions, this change must occur during
placement.
In order to achieve this goal, TailDuplicationPass is split into a
utility class, TailDuplicator, and the pass itself. The pass delegates
nearly everything to the TailDuplicator object, except for looping over
the blocks in a function. This allows the same code to be used for tail
duplication in both places.
This change, in concert with outlining optional branches, allows
triangle shaped code to perform much better, esepecially when the
taken/untaken branches are correlated, as it creates a second spine when
the tests are small enough.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@283164 91177308-0d34-0410-b5e6-96231b3b80d8
This is a step toward statically allocate InstructionMapping. Like the
previous few commits, the goal is to move toward a TableGen'ed like
structure with no dynamic allocation at all.
This should already improve compile time by getting rid of a bunch of
memmove of SmallVectors.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282643 91177308-0d34-0410-b5e6-96231b3b80d8
This adds a comment explaining why we will duplicate PartialMapping to
represent the breakdown for complex mappings (mappings with more than
one partial mapping), instead of using an array of pointer.
NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282326 91177308-0d34-0410-b5e6-96231b3b80d8
This is a step toward statically allocate ValueMapping. Like the
previous few commits, the goal is to move toward a TableGen'ed like
structure with no dynamic allocation at all.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282324 91177308-0d34-0410-b5e6-96231b3b80d8
Previously we were using the address of the unique instance of a partial
mapping in the related map to access this instance. However, when the
map grows, the whole set of instances may be moved elsewhere and the
previous addresses are not valid anymore.
Instead, keep the address of the unique heap allocated instance of a
partial mapping.
Note: I did not see any actual bugs for that problem as the number of
partial mappings dynamically allocated is small (<= 4).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282323 91177308-0d34-0410-b5e6-96231b3b80d8
In the verify method of the ValueMapping class we used to check that the
mapping exactly matches the bits of the input value. This is problematic
for statically allocated mappings because we would need a different
mapping for each different size of the value that maps on one
instruction. For instance, with such scheme, we would need a different
mapping for a value of size 1, 5, 23 whereas they all end up on a 32-bit
wide instruction.
Therefore, change the verifier to check that the meaningful bits are
covered by the mapping instead of matching them.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282214 91177308-0d34-0410-b5e6-96231b3b80d8
This is another step toward TableGen'ed like structures. The BreakDown of
the mapping of the value will be statically computed by TableGen, thus
we only have to point to the right entry in the table instead of
dynamically allocate the mapping for each instruction.
We still support the dynamic allocation through a factory of
PartialMapping to ease the bring-up of the targets while the TableGen
backend is not available.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282213 91177308-0d34-0410-b5e6-96231b3b80d8
Currently all nodes get added to the NextSU list when they are released,
so any candidate must be in that list, making the heuristic ineffective.
Remove it for now, we can add it back later in a working fashion if
necessary.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282200 91177308-0d34-0410-b5e6-96231b3b80d8
According to MSDN (see the PR), functions which don't touch any callee-saved
registers (including %rsp) don't need any unwind info.
This patch makes LLVM not emit unwind info for such functions, to save
binary size.
Differential Revision: https://reviews.llvm.org/D24748
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282185 91177308-0d34-0410-b5e6-96231b3b80d8
This commit is basically the first step toward what will
RegisterBankInfo look when it gets TableGen'ed.
It introduces a XXXGenRegisterBankInfo.def file that is what TableGen
will issue at some point. Moreover, the RegBanks field in
RegisterBankInfo changed to reflect the static (compile time) aspect of
the information.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282131 91177308-0d34-0410-b5e6-96231b3b80d8
We still don't really have an equivalent of "AssertXExt" in DAG, so we don't
exploit the guarantees on the receiving side yet, but this should produce
conservatively correct code on iOS ABIs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282069 91177308-0d34-0410-b5e6-96231b3b80d8
The only implementation that exists immediately looks it up anyway, and the
information is needed to handle various parameter attributes (stored on the
function itself).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@282068 91177308-0d34-0410-b5e6-96231b3b80d8
This should match the existing behaviour for passing complicated struct and
array types, in particular HFAs come through like that from Clang.
For C & C++ we still need to somehow support all the weird ABI flags, or at
least those that are present in the IR (signext, byval, ...), and stack-based
parameter passing.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281977 91177308-0d34-0410-b5e6-96231b3b80d8
Use SmallVector instead of dynamically allocated arrays for the mapping of the
operands in the InstructionMapping. That way we avoid heap allocation for most
of the cases. Ultimately, we should not have to rely on such tricky, the
instances of InstructionMapping would be TableGen'ed.
This improves the compilation time of the RegBankSelect pass.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281955 91177308-0d34-0410-b5e6-96231b3b80d8
The OperandsMapper class is used heavy in RegBankSelect and each
instantiation triggered a heap allocation for the array of operands.
Instead, use a SmallVector with a big enough size such that most of the
cases do not have to use dynamically allocated memory.
This improves the compile time of the RegBankSelect pass.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281916 91177308-0d34-0410-b5e6-96231b3b80d8
When a phi node is finally lowered to a machine instruction it is
important that the lowered "load" instruction is placed before the
associated DEBUG_VALUE entry describing the value loaded.
Renamed the existing SkipPHIsAndLabels to SkipPHIsLabelsAndDebug to
more fully describe that it also skips debug entries. Then used the
"new" function SkipPHIsAndLabels when the debug information should not
be skipped when placing the lowered "load" instructions so that it is
placed before the debug entries.
Differential Revision: https://reviews.llvm.org/D23760
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281727 91177308-0d34-0410-b5e6-96231b3b80d8
It was only really there as a sentinel when instructions had to have precisely
one type. Now that registers are typed, each register really has to have a type
that is sized.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281599 91177308-0d34-0410-b5e6-96231b3b80d8
Otherwise everything that needs to work out what size they are has to keep a
DataLayout handy, which is a bit silly and very annoying.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@281597 91177308-0d34-0410-b5e6-96231b3b80d8