Using RAUW was wrong here; if we have a switch transform such as:
18 -> 6 then
6 -> 0
If we use RAUW, while performing the second transform the *transformed* 6
from the first will be also replaced, so we end up with:
18 -> 0
6 -> 0
Found by clang stage2 bootstrap; testcase added.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277332 91177308-0d34-0410-b5e6-96231b3b80d8
If a switch is sparse and all the cases (once sorted) are in arithmetic progression, we can extract the common factor out of the switch and create a dense switch. For example:
switch (i) {
case 5: ...
case 9: ...
case 13: ...
case 17: ...
}
can become:
if ( (i - 5) % 4 ) goto default;
switch ((i - 5) / 4) {
case 0: ...
case 1: ...
case 2: ...
case 3: ...
}
or even better:
switch ( ROTR(i - 5, 2) {
case 0: ...
case 1: ...
case 2: ...
case 3: ...
}
The division and remainder operations could be costly so we only do this if the factor is a power of two, and emit a right-rotate instead of a divide/remainder sequence. Dense switches can be lowered significantly better than sparse switches and can even be transformed into lookup tables.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277325 91177308-0d34-0410-b5e6-96231b3b80d8
When extracting a set of blocks make sure to inherit all of the target
dependent attributes to make sure that the function will be valid for
lowering. One example is the "target-features" attribute for x86, if the
extracted region has functionality that relies on a specific feature it
will fail to be lowered.
This also allows for extracted functions to be valid for inlining, at
least back into the parent function, as the target attributes are tested
when inlining for compatibility.
Patch by River Riddle!
Differential Revision: https://reviews.llvm.org/D22713
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277315 91177308-0d34-0410-b5e6-96231b3b80d8
LoopUnroll is a loop pass, so the analysis of OptimizationRemarkEmitter
is added to the common function analysis passes that loop passes
depend on.
The BFI and indirectly BPI used in this pass is computed lazily so no
overhead should be observed unless -pass-remarks-with-hotness is used.
This is how the patch affects the O3 pipeline:
Dominator Tree Construction
Natural Loop Information
Canonicalize natural loops
Loop-Closed SSA Form Pass
Basic Alias Analysis (stateless AA impl)
Function Alias Analysis Results
Scalar Evolution Analysis
+ Lazy Branch Probability Analysis
+ Lazy Block Frequency Analysis
+ Optimization Remark Emitter
Loop Pass Manager
Rotate Loops
Loop Invariant Code Motion
Unswitch loops
Simplify the CFG
Dominator Tree Construction
Basic Alias Analysis (stateless AA impl)
Function Alias Analysis Results
Combine redundant instructions
Natural Loop Information
Canonicalize natural loops
Loop-Closed SSA Form Pass
Scalar Evolution Analysis
+ Lazy Branch Probability Analysis
+ Lazy Block Frequency Analysis
+ Optimization Remark Emitter
Loop Pass Manager
Induction Variable Simplification
Recognize loop idioms
Delete dead loops
Unroll loops
...
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277203 91177308-0d34-0410-b5e6-96231b3b80d8
Patch by Sunita Marathe
Third try, now following fixes to MSan to handle mempcy in such a way that this commit won't break the MSan buildbots. (Thanks, Evegenii!)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277189 91177308-0d34-0410-b5e6-96231b3b80d8
Some instructions may have their uses replaced with a symbolic constant.
However, the instruction may still have side effects which percludes it
from being removed from the function. EarlyCSE treated such an
instruction as if it were removed, resulting in PR28763.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277114 91177308-0d34-0410-b5e6-96231b3b80d8
A ConstantVector can have ConstantExpr operands and vice versa.
However, the folder had no ability to fold ConstantVectors which, in
some cases, was an optimization barrier.
Instead, rephrase the folder in terms of Constants instead of
ConstantExprs and teach callers how to deal with failure.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277099 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
copypasta doc of ImportedFunctionsInliningStatistics class
\brief Calculate and dump ThinLTO specific inliner stats.
The main statistics are:
(1) Number of inlined imported functions,
(2) Number of imported functions inlined into importing module (indirect),
(3) Number of non imported functions inlined into importing module
(indirect).
The difference between first and the second is that first stat counts
all performed inlines on imported functions, but the second one only the
functions that have been eventually inlined to a function in the importing
module (by a chain of inlines). Because llvm uses bottom-up inliner, it is
possible to e.g. import function `A`, `B` and then inline `B` to `A`,
and after this `A` might be too big to be inlined into some other function
that calls it. It calculates this statistic by building graph, where
the nodes are functions, and edges are performed inlines and then by marking
the edges starting from not imported function.
If `Verbose` is set to true, then it also dumps statistics
per each inlined function, sorted by the greatest inlines count like
- number of performed inlines
- number of performed inlines to importing module
Reviewers: eraman, tejohnson, mehdi_amini
Subscribers: mehdi_amini, llvm-commits
Differential Revision: https://reviews.llvm.org/D22491
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277089 91177308-0d34-0410-b5e6-96231b3b80d8
Sanitizers set nobuiltin attribute on certain library functions to
avoid a situation where such function is neither instrumented nor
intercepted.
At the moment the list of interesting functions is hardcoded. This
change replaces it with logic based on
TargetLibraryInfo::hasOptimizedCodegen and the presense of readnone
function attribute (sanitizers are generally interested in memory
behavior of library functions).
This is expected to be a no-op change: the new logic matches exactly
the same set of functions.
r276771 (currently reverted) added mempcpy() to the list, breaking
MSan tests. With this change, r276771 can be safely re-landed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277086 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Asan stack-use-after-scope check should poison alloca even if there is
no access between start and end.
This is possible for code like this:
for (int i = 0; i < 3; i++) {
int x;
p = &x;
}
"Loop Invariant Code Motion" will move "p = &x;" out of the loop, making
start/end range empty.
PR27453
Reviewers: eugenis
Differential Revision: https://reviews.llvm.org/D22842
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277072 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Asan stack-use-after-scope check should poison alloca even if there is
no access between start and end.
This is possible for code like this:
for (int i = 0; i < 3; i++) {
int x;
p = &x;
}
"Loop Invariant Code Motion" will move "p = &x;" out of the loop, making
start/end range empty.
PR27453
Reviewers: eugenis
Differential Revision: https://reviews.llvm.org/D22842
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277068 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
TargetBaseAlign is no longer required since LSV checks if target allows misaligned accesses.
A constant defining a base alignment is still needed for stack accesses where alignment can be adjusted.
Reviewers: llvm-commits, jlebar
Subscribers: mzolotukhin, arsenm
Differential Revision: https://reviews.llvm.org/D22936
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277038 91177308-0d34-0410-b5e6-96231b3b80d8
This adds boilerplate code for all coroutine passes,
the passes are no-ops for now.
Also, a small test has been added to verify that passes execute in
the expected order or not at all if coroutine support is disabled.
Patch by Gor Nishanov!
Differential Revision: https://reviews.llvm.org/D22847
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@277033 91177308-0d34-0410-b5e6-96231b3b80d8
The EP_CGSCCOptimizerLate extension point allows adding CallGraphSCC
passes at the end of the main CallGraphSCC passes and before any
function simplification passes run by CGPassManager.
Patch by Gor Nishanov!
Differential Revision: https://reviews.llvm.org/D22897
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276953 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
LCSSAWrapperPass currently doesn't override verifyAnalysis method, so pass
manager doesn't verify LCSSA. This patch adds the method so that we start
verifying LCSSA between loop passes.
Reviewers: chandlerc, sanjoy, hfinkel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D22888
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276941 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Given the crash in D22878, this patch converts the load/store vectorizer
to use explicit Instruction*s wherever possible. This is an overall
simplification and should be an improvement in safety, as we have fewer
naked cast<>s, and now where we use Value*, we really mean something
different from Instruction*.
This patch also gets rid of some cast<>s around Value*s returned by
Builder. Given that Builder constant-folds everything, we can't assume
much about what we get out of it.
One downside of this patch is that we have to copy our chain before
calling propagateMetadata. But I don't think this is a big deal, as our
chains are very small (usually 2 or 4 elems).
Reviewers: asbirlea
Subscribers: mzolotukhin, llvm-commits, arsenm
Differential Revision: https://reviews.llvm.org/D22887
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276938 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
When we ask the builder to create a bitcast on a constant, we get back a
constant, not an instruction.
Reviewers: asbirlea
Subscribers: jholewinski, mzolotukhin, llvm-commits, arsenm
Differential Revision: https://reviews.llvm.org/D22878
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276922 91177308-0d34-0410-b5e6-96231b3b80d8
When loading or storing in a field of a struct like "a.b.c", GVN is able to
detect the equivalent expressions, and GVN-hoist would fail in the code
generation. This is because the GEPs are not hoisted as scalar operations to
avoid moving the GEPs too far from their ld/st instruction when the ld/st is not
movable. So we end up having to generate code for the GEP of a ld/st when we
move the ld/st. In the case of a GEP referring to another GEP as in "a.b.c" we
need to code generate all the GEPs necessary to make all the operands available
at the new location for the ld/st. With this patch we recursively walk through
the GEP operands checking whether all operands are available, and in the case of
a GEP operand, it recursively makes all its operands available. Code generation
happens from the inner GEPs out until reaching the GEP that appears as an
operand of the ld/st.
Differential Revision: https://reviews.llvm.org/D22599
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276841 91177308-0d34-0410-b5e6-96231b3b80d8
Instead of DFS numbering basic blocks we now DFS number instructions that avoids
the costly operation of which instruction comes first in a basic block.
Patch mostly written by Daniel Berlin.
Differential Revision: https://reviews.llvm.org/D22777
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276714 91177308-0d34-0410-b5e6-96231b3b80d8
This patch adds an option to specify the maximum depth in a BB at which to
consider hoisting instructions. Hoisting instructions from a deeper level is
not profitable as it increases register pressure and compilation time.
Differential Revision: https://reviews.llvm.org/D22772
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276713 91177308-0d34-0410-b5e6-96231b3b80d8
Pre-instrumentation inline (pre-inliner) greatly improves the IR
instrumentation code performance, among other benefits. One issue of the
pre-inliner is it can introduce CFG-mismatch for COMDAT functions. This
is due to the fact that the same COMDAT function may have different early
inline decisions across different modules -- that means different copies
of COMDAT functions will have different CFG checksum.
In this patch, we propose a partially renaming the COMDAT group and its
member function/variable so we have different profile counter for each
version. We will post-fix the COMDAT function and the group name with its
FunctionHash.
Differential Revision: http://reviews.llvm.org/D22600
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276673 91177308-0d34-0410-b5e6-96231b3b80d8
There didn't appear to be a good reason to use iplist in this case, a regular
list of unique_ptr works just as well.
Change made in preparation to a new PM port (since iplist is not moveable).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@276668 91177308-0d34-0410-b5e6-96231b3b80d8