Scavenging slots were only reserved when pseudo-instruction expansion in
frame lowering created new virtual registers. It is possible to still
need a scavenging slot even if no virtual registers were created, in cases
where the stack is large enough to overflow instruction offsets.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277355 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Allocating an AFGR64 shadows two GPR32's instead of just one.
This fixes an LNT regression detected by our internal buildbots.
Reviewers: sdardis
Subscribers: dsanders, sdardis, llvm-commits
Differential Revision: https://reviews.llvm.org/D23012
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277348 91177308-0d34-0410-b5e6-96231b3b80d8
Using RAUW was wrong here; if we have a switch transform such as:
18 -> 6 then
6 -> 0
If we use RAUW, while performing the second transform the *transformed* 6
from the first will be also replaced, so we end up with:
18 -> 0
6 -> 0
Found by clang stage2 bootstrap; testcase added.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277332 91177308-0d34-0410-b5e6-96231b3b80d8
The branch relaxation pass is computing the wrong offsets because it assumes
TLSDESC_CALLSEQ eats up 4 bytes, when in fact it is lowered to an instruction
sequence taking up 16 bytes. This can become a problem in huge files with lots
of TLS accesses, as it may slowly move branch targets out of the range computed
by the branch relaxation pass.
Fixes PR24234 https://llvm.org/bugs/show_bug.cgi?id=24234
Differential Revision: https://reviews.llvm.org/D22870
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277331 91177308-0d34-0410-b5e6-96231b3b80d8
It looks like the two independent parts of the rotate operation (a lshr and shl) are being reordered on some bots. Add CHECK-DAGs to account for this.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277329 91177308-0d34-0410-b5e6-96231b3b80d8
If a switch is sparse and all the cases (once sorted) are in arithmetic progression, we can extract the common factor out of the switch and create a dense switch. For example:
switch (i) {
case 5: ...
case 9: ...
case 13: ...
case 17: ...
}
can become:
if ( (i - 5) % 4 ) goto default;
switch ((i - 5) / 4) {
case 0: ...
case 1: ...
case 2: ...
case 3: ...
}
or even better:
switch ( ROTR(i - 5, 2) {
case 0: ...
case 1: ...
case 2: ...
case 3: ...
}
The division and remainder operations could be costly so we only do this if the factor is a power of two, and emit a right-rotate instead of a divide/remainder sequence. Dense switches can be lowered significantly better than sparse switches and can even be transformed into lookup tables.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277325 91177308-0d34-0410-b5e6-96231b3b80d8
Initialize all AArch64-specific passes in the TargetMachine so they can be run
by llc. This can lead to conflicts in opt with some command line options that
share the same name as the pass, so I took this opportunity to do some cleanups:
* rename all relevant command line options from "aarch64-blah" to
"aarch64-enable-blah" and update the tests accordingly
* run clang-format on their declarations
* move all these declarations to a common place (the TargetMachine) as opposed
to having them scattered around (AArch64BranchRelaxation and
AArch64AddressTypePromotion were the only offenders)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277322 91177308-0d34-0410-b5e6-96231b3b80d8
Had to update a stack folding test to clobber the other 16 registers since this now made them get used instead of spilling.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277321 91177308-0d34-0410-b5e6-96231b3b80d8
No bots have yelled yet, but this test references an x86 intrinsic.
Also, it invokes llc on x86 IR.
Fixup to r277315.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277316 91177308-0d34-0410-b5e6-96231b3b80d8
When extracting a set of blocks make sure to inherit all of the target
dependent attributes to make sure that the function will be valid for
lowering. One example is the "target-features" attribute for x86, if the
extracted region has functionality that relies on a specific feature it
will fail to be lowered.
This also allows for extracted functions to be valid for inlining, at
least back into the parent function, as the target attributes are tested
when inlining for compatibility.
Patch by River Riddle!
Differential Revision: https://reviews.llvm.org/D22713
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277315 91177308-0d34-0410-b5e6-96231b3b80d8
The 2i32-2i64 legalization means that we can use the slightly quicker double bits + fptrunc approach for the same results
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277271 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
When performing cmp for EQ/NE and the operand is sign extended, we can
avoid the truncaton if the bits to be tested are no less than origianl
bits.
Reviewers: eli.friedman
Subscribers: eli.friedman, aemerson, nemanjai, t.p.northover, llvm-commits
Differential Revision: https://reviews.llvm.org/D22933
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277252 91177308-0d34-0410-b5e6-96231b3b80d8
These come in two variants for now: G_INTRINSIC and G_INTRINSIC_W_SIDE_EFFECTS.
We may decide to split the latter up with finer-grained restrictions later, if
necessary.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277224 91177308-0d34-0410-b5e6-96231b3b80d8
in r277177 and added back this test which was deleted in r277196 while
I tracked down these problems.
Changed from constructing Twine's to std::string's as Twine's don't work
across statements. Also removed a few unneeded Twine() constructions.
Fix the write_escaped() calls to not pass the unintended second argument
fixing the warning on the ld-x86_64-win7 bot.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277223 91177308-0d34-0410-b5e6-96231b3b80d8
Up until now, we only had code to match PSADBW patterns that look like what
comes out of the loop vectorizer - a partial reduction inside the loop body
that gets fed into a horizontal operation in a different basic block.
This adds support for straight-line patterns, like those generated by the
SLP vectorizer.
Differential Revision: https://reviews.llvm.org/D22889
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277219 91177308-0d34-0410-b5e6-96231b3b80d8
Support for lowering to VBROADCASTF128 etc. in D22460 was not correctly ensuring that the only users of the 128-bit vector load were the insertions of the vector into the lower/upper subvectors.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277214 91177308-0d34-0410-b5e6-96231b3b80d8
This will be used during GlobalISel, where we need a more robust and readable
way to write tests than a simple immediate ID.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277209 91177308-0d34-0410-b5e6-96231b3b80d8
LoopUnroll is a loop pass, so the analysis of OptimizationRemarkEmitter
is added to the common function analysis passes that loop passes
depend on.
The BFI and indirectly BPI used in this pass is computed lazily so no
overhead should be observed unless -pass-remarks-with-hotness is used.
This is how the patch affects the O3 pipeline:
Dominator Tree Construction
Natural Loop Information
Canonicalize natural loops
Loop-Closed SSA Form Pass
Basic Alias Analysis (stateless AA impl)
Function Alias Analysis Results
Scalar Evolution Analysis
+ Lazy Branch Probability Analysis
+ Lazy Block Frequency Analysis
+ Optimization Remark Emitter
Loop Pass Manager
Rotate Loops
Loop Invariant Code Motion
Unswitch loops
Simplify the CFG
Dominator Tree Construction
Basic Alias Analysis (stateless AA impl)
Function Alias Analysis Results
Combine redundant instructions
Natural Loop Information
Canonicalize natural loops
Loop-Closed SSA Form Pass
Scalar Evolution Analysis
+ Lazy Branch Probability Analysis
+ Lazy Block Frequency Analysis
+ Optimization Remark Emitter
Loop Pass Manager
Induction Variable Simplification
Recognize loop idioms
Delete dead loops
Unroll loops
...
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277203 91177308-0d34-0410-b5e6-96231b3b80d8