When the Constant Hoisting pass moves expensive constants into a
common block, it would assign a debug location equal to the last use
of that constant. While this is certainly intuitive, it places the
constant in an out-of-order location, according to the debug location
information. This produces out-of-order stepping when debugging
programs affected by this pass.
This patch creates in-order stepping behavior by merging the debug
locations for hoisted constants, and the new insertion point.
Patch by Matthew Voss!
Differential Revision: https://reviews.llvm.org/D38088
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317827 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
The analysis of the store sequence goes in straight order - from the
first store to the last. Bu the best opportunity for vectorization will
happen if we're going to use reverse order - from last store to the
first. It may be best because usually users have some initialization
part + further processing and this first initialization may confuse
SLP vectorizer.
Reviewers: RKSimon, hfinkel, mkuper, spatel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D39606
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317821 91177308-0d34-0410-b5e6-96231b3b80d8
The toxic stew of created values named 'tmp' and tests that already have
values named 'tmp' and CHECK lines looking for values named 'tmp' causes
bad things to happen in our test line auto-generation scripts because it
wants to use 'TMP' as a prefix for unnamed values. Use less 'tmp' to
avoid that.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317818 91177308-0d34-0410-b5e6-96231b3b80d8
We must patch all existing incoming values of Phi node,
otherwise it is possible that we can see poison
where program does not expect to see it.
This is the similar what GVN does.
The added test test/Transforms/GVN/PRE/pre-jt-add.ll shows an
example of wrong optimization done by jump threading due to
GVN PRE did not patch existing incoming value.
Reviewers: mkazantsev, wmi, dberlin, davide
Reviewed By: dberlin
Subscribers: efriedma, llvm-commits
Differential Revision: https://reviews.llvm.org/D39637
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317768 91177308-0d34-0410-b5e6-96231b3b80d8
This patch implements Chandler's idea [0] for supporting languages that
require support for infinite loops with side effects, such as Rust, providing
part of a solution to bug 965 [1].
Specifically, it adds an `llvm.sideeffect()` intrinsic, which has no actual
effect, but which appears to optimization passes to have obscure side effects,
such that they don't optimize away loops containing it. It also teaches
several optimization passes to ignore this intrinsic, so that it doesn't
significantly impact optimization in most cases.
As discussed on llvm-dev [2], this patch is the first of two major parts.
The second part, to change LLVM's semantics to have defined behavior
on infinite loops by default, with a function attribute for opting into
potential-undefined-behavior, will be implemented and posted for review in
a separate patch.
[0] http://lists.llvm.org/pipermail/llvm-dev/2015-July/088103.html
[1] https://bugs.llvm.org/show_bug.cgi?id=965
[2] http://lists.llvm.org/pipermail/llvm-dev/2017-October/118632.html
Differential Revision: https://reviews.llvm.org/D38336
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317729 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
In ThinLTO compilation, we exit populateModulePassManager early and
were not adding PM extension passes meant to run at the end of the
pipeline. This includes sanitizer passes. Add these passes before
the early exit.
A test will be added to projects/compiler-rt.
Reviewers: pcc
Subscribers: mehdi_amini, inglorion, llvm-commits
Differential Revision: https://reviews.llvm.org/D39565
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317714 91177308-0d34-0410-b5e6-96231b3b80d8
Patch tries to improve vectorization of the following code:
void add1(int * __restrict dst, const int * __restrict src) {
*dst++ = *src++;
*dst++ = *src++ + 1;
*dst++ = *src++ + 2;
*dst++ = *src++ + 3;
}
Allows to vectorize even if the very first operation is not a binary add, but just a load.
Fixed PR34619 and other issues related to previous commit.
Reviewers: spatel, mzolotukhin, mkuper, hfinkel, RKSimon, filcab, ABataev
Reviewed By: ABataev, RKSimon
Subscribers: llvm-commits, RKSimon
Differential Revision: https://reviews.llvm.org/D28907
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317618 91177308-0d34-0410-b5e6-96231b3b80d8
The hexagon test should be fixed now.
Original commit message:
This pulls shifts through a select+binop with a constant where the select conditionally executes the binop. We already do this for just the binop, but not with the select.
This can allow us to get the select closer to other selects to enable removing one.
Differential Revision: https://reviews.llvm.org/D39222
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317600 91177308-0d34-0410-b5e6-96231b3b80d8
We can't safely split arithmetic into multiple fragments because we
can't express carry-over between fragments.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317534 91177308-0d34-0410-b5e6-96231b3b80d8
This broke the CodeGen/Hexagon/loop-idiom/pmpy-mod.ll test on a bunch of buildbots.
> This pulls shifts through a select+binop with a constant where the select conditionally executes the binop. We already do this for just the binop, but not with the select.
>
> This can allow us to get the select closer to other selects to enable removing one.
>
> Differential Revision: https://reviews.llvm.org/D39222
>
> git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317510 91177308-0d34-0410-b5e6-96231b3b80d8
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317518 91177308-0d34-0410-b5e6-96231b3b80d8
This pulls shifts through a select+binop with a constant where the select conditionally executes the binop. We already do this for just the binop, but not with the select.
This can allow us to get the select closer to other selects to enable removing one.
Differential Revision: https://reviews.llvm.org/D39222
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317510 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: When computing the SUM for indirect call promotion, if the callsite is already promoted in the profile, it will be promoted before ICP. In the current implementation, ICP only sees remaining counts in SUM. This may cause extra indirect call targets being promoted. This patch updates the SUM to include the counts already promoted earlier. This way we do not end up promoting too many indirect call targets.
Reviewers: tejohnson
Reviewed By: tejohnson
Subscribers: llvm-commits, sanjoy
Differential Revision: https://reviews.llvm.org/D38763
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317502 91177308-0d34-0410-b5e6-96231b3b80d8
As discussed on llvm-dev:
http://lists.llvm.org/pipermail/llvm-dev/2016-November/107104.html
and again more recently:
http://lists.llvm.org/pipermail/llvm-dev/2017-October/118118.html
...this is a step in cleaning up our fast-math-flags implementation in IR to better match
the capabilities of both clang's user-visible flags and the backend's flags for SDNode.
As proposed in the above threads, we're replacing the 'UnsafeAlgebra' bit (which had the
'umbrella' meaning that all flags are set) with a new bit that only applies to algebraic
reassociation - 'AllowReassoc'.
We're also adding a bit to allow approximations for library functions called 'ApproxFunc'
(this was initially proposed as 'libm' or similar).
...and we're out of bits. 7 bits ought to be enough for anyone, right? :) FWIW, I did
look at getting this out of SubclassOptionalData via SubclassData (spacious 16-bits),
but that's apparently already used for other purposes. Also, I don't think we can just
add a field to FPMathOperator because Operator is not intended to be instantiated.
We'll defer movement of FMF to another day.
We keep the 'fast' keyword. I thought about removing that, but seeing IR like this:
%f.fast = fadd reassoc nnan ninf nsz arcp contract afn float %op1, %op2
...made me think we want to keep the shortcut synonym.
Finally, this change is binary incompatible with existing IR as seen in the
compatibility tests. This statement:
"Newer releases can ignore features from older releases, but they cannot miscompile
them. For example, if nsw is ever replaced with something else, dropping it would be
a valid way to upgrade the IR."
( http://llvm.org/docs/DeveloperPolicy.html#ir-backwards-compatibility )
...provides the flexibility we want to make this change without requiring a new IR
version. Ie, we're not loosening the FP strictness of existing IR. At worst, we will
fail to optimize some previously 'fast' code because it's no longer recognized as
'fast'. This should get fixed as we audit/squash all of the uses of 'isFast()'.
Note: an inter-dependent clang commit to use the new API name should closely follow
commit.
Differential Revision: https://reviews.llvm.org/D39304
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317488 91177308-0d34-0410-b5e6-96231b3b80d8
Now that we have a way to mark GlobalValues as local we can use the symbol
resolutions that the linker plugin provides as part of lto/thinlto link
step to refine the compilers view on what symbols will end up being local.
Originally commited as r317374, but reverted in r317395 to update some missed
tests.
Differential Revision: https://reviews.llvm.org/D35702
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317408 91177308-0d34-0410-b5e6-96231b3b80d8
This preserves the debug info for the cast operation in the original location.
rdar://problem/33460652
Reapplied r317340 with the test moved into an ARM-specific directory.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317375 91177308-0d34-0410-b5e6-96231b3b80d8
Now that we have a way to mark GlobalValues as local we can use the symbol
resolutions that the linker plugin provides as part of lto/thinlto link
step to refine the compilers view on what symbols will end up being local.
Differential Revision: https://reviews.llvm.org/D35702
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317374 91177308-0d34-0410-b5e6-96231b3b80d8
Merging conditional stores tries to check to see if the code is if convertible after the store is moved. But the store hasn't been moved yet so its being counted against the threshold.
The patch adds 1 to the threshold comparison to make sure we don't count the store. I've adjusted a test to use a lower threshold to ensure we still do that conversion with the lower threshold.
Differential Revision: https://reviews.llvm.org/D39570
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317368 91177308-0d34-0410-b5e6-96231b3b80d8
This recommit r317351 after fixing a buildbot failure.
Original commit message:
Summary:
This change add a pass which tries to split a call-site to pass
more constrained arguments if its argument is predicated in the control flow
so that we can expose better context to the later passes (e.g, inliner, jump
threading, or IPA-CP based function cloning, etc.).
As of now we support two cases :
1) If a call site is dominated by an OR condition and if any of its arguments
are predicated on this OR condition, try to split the condition with more
constrained arguments. For example, in the code below, we try to split the
call site since we can predicate the argument (ptr) based on the OR condition.
Split from :
if (!ptr || c)
callee(ptr);
to :
if (!ptr)
callee(null ptr) // set the known constant value
else if (c)
callee(nonnull ptr) // set non-null attribute in the argument
2) We can also split a call-site based on constant incoming values of a PHI
For example,
from :
BB0:
%c = icmp eq i32 %i1, %i2
br i1 %c, label %BB2, label %BB1
BB1:
br label %BB2
BB2:
%p = phi i32 [ 0, %BB0 ], [ 1, %BB1 ]
call void @bar(i32 %p)
to
BB0:
%c = icmp eq i32 %i1, %i2
br i1 %c, label %BB2-split0, label %BB1
BB1:
br label %BB2-split1
BB2-split0:
call void @bar(i32 0)
br label %BB2
BB2-split1:
call void @bar(i32 1)
br label %BB2
BB2:
%p = phi i32 [ 0, %BB2-split0 ], [ 1, %BB2-split1 ]
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317362 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This change add a pass which tries to split a call-site to pass
more constrained arguments if its argument is predicated in the control flow
so that we can expose better context to the later passes (e.g, inliner, jump
threading, or IPA-CP based function cloning, etc.).
As of now we support two cases :
1) If a call site is dominated by an OR condition and if any of its arguments
are predicated on this OR condition, try to split the condition with more
constrained arguments. For example, in the code below, we try to split the
call site since we can predicate the argument (ptr) based on the OR condition.
Split from :
if (!ptr || c)
callee(ptr);
to :
if (!ptr)
callee(null ptr) // set the known constant value
else if (c)
callee(nonnull ptr) // set non-null attribute in the argument
2) We can also split a call-site based on constant incoming values of a PHI
For example,
from :
BB0:
%c = icmp eq i32 %i1, %i2
br i1 %c, label %BB2, label %BB1
BB1:
br label %BB2
BB2:
%p = phi i32 [ 0, %BB0 ], [ 1, %BB1 ]
call void @bar(i32 %p)
to
BB0:
%c = icmp eq i32 %i1, %i2
br i1 %c, label %BB2-split0, label %BB1
BB1:
br label %BB2-split1
BB2-split0:
call void @bar(i32 0)
br label %BB2
BB2-split1:
call void @bar(i32 1)
br label %BB2
BB2:
%p = phi i32 [ 0, %BB2-split0 ], [ 1, %BB2-split1 ]
Reviewers: davidxl, huntergr, chandlerc, mcrosier, eraman, davide
Reviewed By: davidxl
Subscribers: sdesmalen, ashutosh.nema, fhahn, mssimpso, aemerson, mgorny, mehdi_amini, kristof.beyls, llvm-commits
Differential Revision: https://reviews.llvm.org/D39137
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317351 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
The current LICM allows sinking an instruction only when it is exposed to exit
blocks through a trivially replacable PHI of which all incoming values are the
same instruction. This change enhance LICM to sink a sinkable instruction
through non-trivially replacable PHIs by spliting predecessors of loop
exits.
Reviewers: hfinkel, majnemer, davidxl, bmakam, mcrosier, danielcdh, efriedma, jtony
Reviewed By: efriedma
Subscribers: nemanjai, dberlin, llvm-commits
Differential Revision: https://reviews.llvm.org/D37163
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317335 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Refactored the code to separate out common functions that are being
reused.
This is to reduce the changes for changes coming up wrt loop
predication with reverse loops.
This refactoring is what we have in our downstream code.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317324 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Also added a reserve() method to MapVector since we want to use that from
ADCE.
DenseMap does not provide deterministic iteration order so with that
we will handle the members of BlockInfo in random order, eventually
leading to random order of the blocks in the predecessor lists.
Without this change, I get the same predecessor order in about 90% of the
time when I compile a certain reproducer and in 10% I get a different one.
No idea how to make a proper test case for this.
Reviewers: kuhar, david2050
Reviewed By: kuhar
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D39593
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317323 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
InlineFunction can fail, for example when trying to inline vararg
fuctions. In those cases, we do not want to bump partial inlining
counters or set AnyInlined to true, because this could leave an unused
function hanging around.
Reviewers: davidxl, davide, gyiu
Reviewed By: davide
Subscribers: llvm-commits, eraman
Differential Revision: https://reviews.llvm.org/D39581
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317314 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Currently the block frequency analysis is an approximation for irreducible
loops.
The new irreducible loop metadata is used to annotate the irreducible loop
headers with their header weights based on the PGO profile (currently this is
approximated to be evenly weighted) and to help improve the accuracy of the
block frequency analysis for irreducible loops.
This patch is a basic support for this.
Reviewers: davidxl
Reviewed By: davidxl
Subscribers: mehdi_amini, llvm-commits, eraman
Differential Revision: https://reviews.llvm.org/D39028
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317278 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This patch allows us to predicate range checks that have a type narrower than
the latch check type. We leverage SCEV analysis to identify a truncate for the
latchLimit and latchStart.
There is also safety checks in place which requires the start and limit to be
known at compile time. We require this to make sure that the SCEV truncate expr
for the IV corresponding to the latch does not cause us to lose information
about the IV range.
Added tests show the loop predication over range checks that are of various
types and are narrower than the latch type.
This enhancement has been in our downstream tree for a while.
Reviewers: apilipenko, sanjoy, mkazantsev
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D39500
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317269 91177308-0d34-0410-b5e6-96231b3b80d8
The original change was reverted in rL317217 because of the failure in
the RS4GC testcase. I couldn't reproduce the failure on my local machine
(macbook) but could reproduce it on a linux box.
The failure was around removing the uses of invariant.start. The fix
here is to just RAUW undef (which was the first implementation in D39388).
This is perfectly valid IR as discussed in the review.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317225 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Invariant.start on memory locations has the property that the memory
location is unchanging. However, this is not true in the face of
rewriting statepoints for GC.
Teach RS4GC about removing invariant.start so that optimizations after
RS4GC does not incorrect sink a load from the memory location past a
statepoint.
Added test showcasing the issue.
Reviewers: reames, apilipenko, dneilson
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D39388
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317215 91177308-0d34-0410-b5e6-96231b3b80d8
undefined reference to `llvm::TargetPassConfig::ID' on
clang-ppc64le-linux-multistage
This reverts commit eea333c33fa73ad225ef28607795984829f65688.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317213 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This is mostly a noop (most of the test diffs are renamed blocks).
There are a few temporary register renames (eax<->ecx) and a few blocks are
shuffled around.
See the discussion in PR33325 for more details.
Reviewers: spatel
Subscribers: mgorny
Differential Revision: https://reviews.llvm.org/D39456
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317211 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
SpeculativelyExecuteBB can flatten the CFG by doing
speculative execution followed by a select instruction.
When the speculatively executed BB contained dbg intrinsics
the result could be a little bit weird, since those dbg
intrinsics were inserted before the select in the flattened
CFG. So when single stepping in the debugger, printing the
value of the variable referenced in the dbg intrinsic, it
could happen that it looked like the variable had values
that never actually were assigned to the variable.
This patch simply discards all dbg intrinsics that were found
in the speculatively executed BB.
Reviewers: aprantl, chandlerc, craig.topper
Reviewed By: aprantl
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D39494
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317198 91177308-0d34-0410-b5e6-96231b3b80d8
Fixes part of PR35113.
This reapplies r317105 with an additional check for isa<Instruction>
as found by the bots.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317120 91177308-0d34-0410-b5e6-96231b3b80d8