The custom root mechanism didn't actually do anything. ShadowStackGC, the only one which used it, just removed the gcroots before they reached the normal lowering in SelectionDAG. As a result, the state flag had no value.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@346632 91177308-0d34-0410-b5e6-96231b3b80d8
The GCStrategy provides three configuration options were are largely redundant.
1) Support for conditionally lowering gcread and gcwrite to loads and stores. This is redundant since any GC which wished to use these abstractions would lower them out of existance before the built in lowering anyways. As such, there's no need to have the lowering being conditional.
2) Conditional initialization for allocas marked via gcroot. Semantically, roots have to be initialized before first potential use. Arguably, the frontend really should have responsibility for that, but the old API allowed the frontend to ignore this detail. Only one builtin GC used the non-initializing mode. Since no one to my knowledge actually uses the ErlangGC strategy, I decide the slight pessimization was worth the simplicity. If that turns out to be problematic, we can always improve the insertion algorithm to detect more existing initializing stores.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@346621 91177308-0d34-0410-b5e6-96231b3b80d8
This is a long-awaited follow-up suggested in D33578. Since then, we've picked up even more
opportunities for vector narrowing from changes like D53784, so there are a lot of test diffs.
Apart from 2-3 strange cases, these are all wins.
I've structured this to be no-functional-change-intended for any target except for x86
because I couldn't tell if AArch64, ARM, and AMDGPU would improve or not. All of those
targets have existing regression tests (4, 4, 10 files respectively) that would be
affected. Also, Hexagon overrides the shouldReduceLoadWidth() hook, but doesn't show
any regression test diffs. The trade-off is deciding if an extra vector load is better
than a single wide load + extract_subvector.
For x86, this is almost always better (on paper at least) because we often can fold
loads into subsequent ops and not increase the official instruction count. There's also
some unknown -- but potentially large -- benefit from using narrower vector ops if wide
ops are implemented with multiple uops and/or frequency throttling is avoided.
Differential Revision: https://reviews.llvm.org/D54073
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@346595 91177308-0d34-0410-b5e6-96231b3b80d8
It's possible for vector op legalization to generate a shuffle. If that happens we should give a chance for DAG combine to combine that with a build_vector input.
I also fixed a bug in combineShuffleOfScalars that was considering the number of uses on a undef input to a shuffle. We don't care how many times undef is used.
Differential Revision: https://reviews.llvm.org/D54283
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@346530 91177308-0d34-0410-b5e6-96231b3b80d8
Previous version used type erasure through a `void* (*)()` pointer,
which triggered gcc warning and implied a lot of reinterpret_cast.
This version should make it harder to hit ourselves in the foot.
Differential revision: https://reviews.llvm.org/D54203
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@346522 91177308-0d34-0410-b5e6-96231b3b80d8
Currently in llvm, CalleeSavedInfo can only assign a callee saved register to
stack frame index to be spilled in the prologue. We would like to enable
spilling gprs to vector registers. This patch adds the capability to spill to
other registers aside from just the stack. It also adds the changes for power9
to spill gprs to volatile vector registers when they are available.
This happens only for leaf functions when using the option
-ppc-enable-pe-vector-spills.
Differential Revision: https://reviews.llvm.org/D39386
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@346512 91177308-0d34-0410-b5e6-96231b3b80d8
The DAGCombiner tries to SimplifySelectCC as follows:
select_cc(x, y, 16, 0, cc) -> shl(zext(set_cc(x, y, cc)), 4)
It can't cope with the situation of reordered operands:
select_cc(x, y, 0, 16, cc)
In that case we just need to swap the operands and invert the Condition Code:
select_cc(x, y, 16, 0, ~cc)
Differential Revision: https://reviews.llvm.org/D53236
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@346484 91177308-0d34-0410-b5e6-96231b3b80d8
FindBetterNeighborChains simulateanously improves the chain
dependencies of a chain of related stores avoiding the generation of
extra token factors. For chains longer than the GatherAllAliasDepths,
stores further down in the chain will necessarily fail, a potentially
significant waste and preventing otherwise trivial parallelization.
This patch directly parallelize the chains of stores before improving
each store. This generally improves DAG-level parallelism.
Reviewers: courbet, spatel, RKSimon, bogner, efriedma, craig.topper, rnk
Subscribers: sdardis, javed.absar, hiraditya, jrtc27, atanasyan, llvm-commits
Differential Revision: https://reviews.llvm.org/D53552
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@346432 91177308-0d34-0410-b5e6-96231b3b80d8
Turns out knowing more than just the base address might be useful -
specifically a future change to respect a DICompileUnit flag for the use
of base address specifiers in DWARF < 5.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@346380 91177308-0d34-0410-b5e6-96231b3b80d8
If a block doesn't have any ranges of adjacent legal instructions, then it
can't have outlining candidates. There's no point in mapping legal isntructions
in situations like this.
I noticed this reduces the size of the suffix tree in sqlite3 for AArch64 at
-Oz by about 3%.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@346379 91177308-0d34-0410-b5e6-96231b3b80d8
I noticed that there are lots of basic blocks that don't have enough legal
instructions in them to warrant outlining. We can skip mapping these entirely.
In sqlite3, compiled for AArch64 at -Oz, this results in a 10% reduction of
the total nodes in the suffix tree. These nodes can never be part of a
repeated substring, and so they don't impact the result at all.
Before this, there were 62128 nodes in the tree for sqlite3. After this, there
are 56457 nodes.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@346373 91177308-0d34-0410-b5e6-96231b3b80d8
This is only used for calculating ConcatLen. This isn't necessary,
since it's easily derived from the traversal setting suffix indices.
Remove that. Rename CurrIdx to CurrNodeLen to better describe what's
going on.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@346349 91177308-0d34-0410-b5e6-96231b3b80d8
This takes the traversal methods introduced in r346269 and adapts them
into an iterator. This allows the outliner to iterate over repeated substrings
within the suffix tree directly without having to initially find all of the
substrings and then iterate over them after you've found them.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@346345 91177308-0d34-0410-b5e6-96231b3b80d8
NFC-ish. This doesn't change the behaviour of the outliner, but does make sure
that you won't end up with say
OUTLINED_FUNCTION_2:
...
ret
OUTLINED_FUNCTION_248:
...
ret
as the only outlined functions in your module. Those should really be
OUTLINED_FUNCTION_0:
...
ret
OUTLINED_FUNCTION_1:
...
ret
If we produce outlined functions, they probably should have sequential numbers
attached to them. This makes it a bit easier+stable to write outliner tests.
The point of this is to move towards a bit more stability in outlined function
names. By doing this, we at least don't rely on the traversal order of the
suffix tree. Instead, we rely on the order of the candidate list, which is
*far* more consistent. The candidate list is ordered by the end indices of
candidates, so we're more likely to get a stable ordering. This is still
susceptible to changes in the cost model though (like, if we suddenly find new
candidates, for example).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@346340 91177308-0d34-0410-b5e6-96231b3b80d8
This adds the llvm-side support for post-inlining evaluation of the
__builtin_constant_p GCC intrinsic.
Also fixed SCCPSolver::visitCallSite to not blow up when seeing a call
to a function where canConstantFoldTo returns true, and one of the
arguments is a struct.
Updated from patch initially by Janusz Sobczak.
Differential Revision: https://reviews.llvm.org/D4276
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@346322 91177308-0d34-0410-b5e6-96231b3b80d8
Set `LiveReg::PhysReg` to zero when freeing a register instead of
removing it from the entry from `LiveRegMap`. This way no iterators get
invalidated and we can avoid passing around and updating iterators all
over the place.
This does not change any allocator decisions. It is not completely NFC
because the arbitrary iteration order through `LiveRegMap` in
`spillAll()` changes so we may get a different order in those spill
sequences (the amount of spills does not change).
This is in preparation of https://reviews.llvm.org/D52010.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@346298 91177308-0d34-0410-b5e6-96231b3b80d8
The metric does not return the number of remaining (or inserted) copies
but the number of copies that were coalesced. Pick a more descriptive
name.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@346287 91177308-0d34-0410-b5e6-96231b3b80d8
After changing the way we find repeated substrings in r346269, this
field is no longer used by anything, so it can be removed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@346274 91177308-0d34-0410-b5e6-96231b3b80d8
Instead of iterating over the leaves to find repeated substrings, and walking
collecting leaf children when we don't necessarily need them, let's just
calculate what we need and iterate over that.
By doing this, we don't have to save every leaf. It's easier to read the code
too and understand what's going on.
The goal here, at the end of the day, is to set up to allow us to do something
like
for (RepeatedSubstring &RS : ST) {
... do stuff with RS ...
}
Which would let us perform the cost model stuff and the repeated substring
query at the same time.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@346269 91177308-0d34-0410-b5e6-96231b3b80d8
Change the type in a couple of lists and sets that only store physical
registers from unsigned to MCPhysRegs. The later is only 16bits and
saves us a bit of memory.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@346254 91177308-0d34-0410-b5e6-96231b3b80d8
MachineFunction can only be used in code using lib/CodeGen, hence we
can keep a more specific reference to LLVMTargetMachine rather than just
TargetMachine around.
Do the same for references in ScheduleDAG and RegUsageInfoCollector.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@346183 91177308-0d34-0410-b5e6-96231b3b80d8
MachineModuleInfo can only be used in code using lib/CodeGen, hence we
can keep a more specific reference to LLVMTargetMachine rather than just
TargetMachine around.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@346182 91177308-0d34-0410-b5e6-96231b3b80d8
Prior to initial work to add vector expansion support, remove assumptions that we're working on scalar types.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@346139 91177308-0d34-0410-b5e6-96231b3b80d8
The original code avoided creating a zero vector after type legalization, but if we're after type legalization the type we have is legal. The real hazard we need to avoid is creating a build vector after op legalization. tryFoldToZero takes care of checking for this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@346119 91177308-0d34-0410-b5e6-96231b3b80d8
These methods were just wrappers around getNode with additional asserts (identical and repeated 3 times). But getNode already has a switch that can be used to hold these asserts that allows them to be shared for all 3 opcodes. This also enables checking on the places that create these nodes without using the wrappers.
The rest of the patch is just changing all callers to use getNode directly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@346087 91177308-0d34-0410-b5e6-96231b3b80d8
Use MachineFrameInfo's OffsetAdjustment field to pass this information
from the target to CodeViewDebug.cpp. The X86 backend doesn't use it for
any other purpose.
This fixes PR38857 in the case where there is a non-aligned quantity of
CSRs and a non-aligned quantity of locals.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@346062 91177308-0d34-0410-b5e6-96231b3b80d8
We already have custom lowering for the AVX case in LegalizeVectorOps. So its better to keep the regular extend op around as long as possible.
I had to qualify one place in DAG combine that created illegal vector extending load operations. This change by itself had no effect on any tests which is why its included here.
I've made a few cleanups to the custom lowering. The sign extend code no longer creates an identity shuffle with undef elements. The zero extend code now emits a zero_extend_vector_inreg instead of an unpckl with a zero vector.
For the high half of the custom lowering of zero_extend/any_extend, we're now using an unpckh with a zero vector or undef. Previously we used used a pshufd to move the upper 64-bits to the lower 64-bits and then used a zero_extend_vector_inreg. I think the zero vector should require less execution resources and be smaller code size.
Differential Revision: https://reviews.llvm.org/D54024
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@346043 91177308-0d34-0410-b5e6-96231b3b80d8
As reported in PR38952, postra-machine-sink relies on DBG_VALUE insns being
adjacent to the def of the register that they reference. This is not always
true, leading to register copies being sunk but not the associated DBG_VALUEs,
which gives the debugger a bad variable location.
This patch collects DBG_VALUEs as we walk through a BB looking for copies to
sink, then passes them down to performSink. Compile-time impact should be
negligable.
Differential Revision: https://reviews.llvm.org/D53992
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345996 91177308-0d34-0410-b5e6-96231b3b80d8
reduceBuildVecConvertToConvertBuildVec vectorizes int2float in the DAGCombiner, which means that even if the LV/SLP has decided to keep scalar code using the cost models, this will override this.
While there are cases where vectorization is necessary in the DAG (mainly due to legalization artefacts), I don't think this is the case here, we should assume that the vectorizers know what they are doing.
Differential Revision: https://reviews.llvm.org/D53712
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345964 91177308-0d34-0410-b5e6-96231b3b80d8
- Make some TargetPassConfig methods that just check whether options have
been set static.
- Shuffle code in LLVMTargetMachine around so addPassesToGenerateCode
only deals with TargetPassConfig now (but not with MCContext or the
creation of MachineModuleInfo)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345918 91177308-0d34-0410-b5e6-96231b3b80d8
I'm having trouble creating a test case for the ISD::TRUNCATE part of this that shows any codegen differences. But I was able to test the setcc path which is what the test changes here cover.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345908 91177308-0d34-0410-b5e6-96231b3b80d8
Instruction mapping in the outliner uses "illegal numbers" to signify that
something can't ever be part of an outlining candidate. This means that the
number is unique and can't be part of any repeated substring.
Because each of these is unique, we can use a single unique number to represent
a range of things we can't outline.
The outliner tries to leverage this using a flag which is set in an MBB when
the previous instruction we tried to map was "illegal". This patch improves
that logic to work across MBBs. As a bonus, this also simplifies the mapping
logic somewhat.
This also updates the machine-outliner-remarks test, which was impacted by the
order of Candidates on an OutlinedFunction changing. This order isn't
guaranteed, so I added a FIXME to fix that in a follow-up. The order of
Candidates on an OutlinedFunction isn't important, so this still is NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@345906 91177308-0d34-0410-b5e6-96231b3b80d8