Summary:
If the load we are looking at in SimplifyPartiallyRedundantLoad is a phi-load,
try to phi translate it into the incoming values in the predecessors before
we search for available loads.
This needs https://reviews.llvm.org/D30524
Reviewers: davide, sanjoy, efriedma, dberlin, rengolin
Reviewed By: dberlin
Subscribers: junbuml, llvm-commits
Differential Revision: https://reviews.llvm.org/D30543
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298217 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
iterateOnFunction creates a ReversePostOrderTraversal object which does a post order traversal in its constructor and stores the results in an internal vector. Iteration over it just reads from the internal vector in reverse order.
The GVN code seems to be unaware of this and iterates over ReversePostOrderTraversal object and makes a copy of the vector into a local vector. (I think at one point in time we used a DFS here instead which would have required the local vector).
The net effect of this is that we have two vectors containing the basic block list. As I didn't want to expose the implementation detail of ReversePostOrderTraversal's constructor to GVN, I've changed the code to do an explicit post order traversal storing into the local vector and then reverse iterate over that.
I've also removed the reserve(256) since the ReversePostOrderTraversal wasn't doing that. I can add it back if we think it's important, though it seemed weird that it wasn't based on the size of the function.
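As a rough illustration of the approach described above, here is a minimal sketch on a toy CFG. ToyCFG, postOrder and forEachBlockInRPO are hypothetical names for this example only; this is not the GVN code and does not use LLVM's traversal classes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy CFG: Succs[B] lists the successor block indices of block B.
// This stands in for a function's basic blocks; it is not LLVM's API.
using ToyCFG = std::vector<std::vector<std::size_t>>;

// Append the blocks reachable from B to Out in post order
// (a block is emitted after all of its not-yet-visited successors).
static void postOrder(const ToyCFG &Succs, std::size_t B,
                      std::vector<bool> &Visited,
                      std::vector<std::size_t> &Out) {
  Visited[B] = true;
  for (std::size_t S : Succs[B])
    if (!Visited[S])
      postOrder(Succs, S, Visited, Out);
  Out.push_back(B);
}

// Visit every reachable block in reverse post order while keeping only
// one vector of blocks alive.
template <typename Fn>
void forEachBlockInRPO(const ToyCFG &Succs, std::size_t Entry, Fn Visit) {
  assert(Entry < Succs.size() && "entry block out of range");
  std::vector<bool> Visited(Succs.size(), false);
  std::vector<std::size_t> Blocks;
  postOrder(Succs, Entry, Visited, Blocks);
  // Reverse-iterate the post order instead of copying it a second time.
  for (auto It = Blocks.rbegin(); It != Blocks.rend(); ++It)
    Visit(*It);
}
```

On a diamond-shaped CFG the entry block is visited first and the join block last, which is the property the iteration relies on.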
Reviewers: davide, anemet, dberlin
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D31084
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298191 91177308-0d34-0410-b5e6-96231b3b80d8
When InstCombine calls into SimplifyLibCalls and it creates putChar calls, we don't infer the attributes. And since SimplifyLibCalls doesn't use InstCombine's IRBuilder, the calls don't end up in the worklist on this iteration of InstCombine. So they get picked up on the next iteration, where they cause an IR change. This of course causes InstCombine to run another iteration.
So this patch just gets the attributes right the first time. We already did this for puts and some other libcalls.
Differential Revision: https://reviews.llvm.org/D31094
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298171 91177308-0d34-0410-b5e6-96231b3b80d8
Use a combination of !associated, comdat, @llvm.compiler.used and
custom sections to allow dead stripping of globals and their asan
metadata. Sometimes.
Currently this works on LLD, which supports SHF_LINK_ORDER with
sh_link pointing to the associated section.
This also works on BFD, which seems to treat comdats as
all-or-nothing with respect to linker GC. There is a weird quirk
where the "first" global in each link is never GC-ed because of the
section symbols.
At this moment it does not work on Gold (as in the globals are never
stripped).
Differential Revision: https://reviews.llvm.org/D30121
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298158 91177308-0d34-0410-b5e6-96231b3b80d8
Loop unswitching can be extremely harmful for a SIMT target. If the
hoisted condition is not uniform, a SIMT machine will execute both
clones of the loop sequentially. Therefore LoopUnswitch checks if the
condition is non-divergent.
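A toy model of why a non-uniform condition hurts, assuming a wavefront is represented simply as one condition value per lane. clonesExecuted is a hypothetical helper for illustration, not part of LoopUnswitch or DivergenceAnalysis.

```cpp
#include <cassert>
#include <vector>

// Each element is the value of the hoisted condition in one lane of a
// SIMT wavefront. Returns how many loop clones the wavefront has to run
// after unswitching.
int clonesExecuted(const std::vector<bool> &LaneCond) {
  assert(!LaneCond.empty() && "a wavefront has at least one lane");
  bool AnyTrue = false, AnyFalse = false;
  for (bool C : LaneCond) {
    AnyTrue = AnyTrue || C;
    AnyFalse = AnyFalse || !C;
  }
  // Uniform condition: all lanes take the same clone.
  // Divergent condition: both clones run, one after the other.
  return (AnyTrue && AnyFalse) ? 2 : 1;
}
```

With a uniform condition the unswitched loop costs one clone per wavefront; with a divergent one it costs both, so the transform only adds code size and work.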
Since DivergenceAnalysis adds an expensive PostDominatorTree analysis
that is not needed for non-SIMT targets, a new option is added to avoid
unneeded analysis initialization. The method getAnalysisUsage is called when
TargetTransformInfo is not yet available, so we cannot use it here.
For that reason a new field DivergentTarget is added to PassManagerBuilder
to control the behavior, and a target sets this field.
Differential Revision: https://reviews.llvm.org/D30796
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298104 91177308-0d34-0410-b5e6-96231b3b80d8
We were not handling getelementptr instructions of vector type before.
Since getelementptr instructions for vector types follow the same rule as
getelementptr instructions for non-vector types, we can just handle them
in the same way.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298028 91177308-0d34-0410-b5e6-96231b3b80d8
Users often call getArgumentList().size(), which is a linear way to get
the number of function arguments. arg_size(), on the other hand, is
constant time.
In general, the fact that arguments are stored in an iplist is an
implementation detail, so I've removed it from the Function interface
and moved all other users to the argument container APIs (arg_begin(),
arg_end(), args(), arg_size()).
Reviewed By: chandlerc
Differential Revision: https://reviews.llvm.org/D31052
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298010 91177308-0d34-0410-b5e6-96231b3b80d8
[Reapplies r297971 and punting on finding a better API for findDbgValues()]
This patch improves debug info quality in InstCombine by looking at
values that are about to be deleted, checking whether there are any
dbg.value intrinsics referring to them, and potentially encoding the
semantics of the deleted instruction into the dbg.value's
DIExpression.
In the example in the testcase (which was extracted from XNU) there is a sequence of
%4 = load %struct.entry*, %struct.entry** %next2, align 8, !dbg !41
%5 = bitcast %struct.entry* %4 to i8*, !dbg !42
%add.ptr4 = getelementptr inbounds i8, i8* %5, i64 -8, !dbg !43
%6 = bitcast i8* %add.ptr4 to %struct.entry*, !dbg !44
call void @llvm.dbg.value(metadata %struct.entry* %6, i64 0, metadata !20, metadata !21), !dbg !34
When these instructions are eliminated by instcombine one after
another, we can still salvage the otherwise dead debug info:
- Bitcasts have no effect, so have the dbg.value point to operand(0)
- Loads can be expressed via a DW_OP_deref
- Constant gep instructions can be replaced by DWARF expression arithmetic
The API introduced by this patch is not specific to instcombine and
can be useful in other places, too.
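The three rules above can be sketched on a toy model. ToyInst, ToyDbgValue and salvage are hypothetical names, and the expression is modeled as a list of strings rather than a real DIExpression; the "offset N" entry is a placeholder for whatever DWARF arithmetic encodes the constant offset.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-ins for the instructions in the example; not LLVM classes.
enum class Kind { Load, Bitcast, ConstGEP };

struct ToyInst {
  Kind K;
  std::string Operand0; // the value this instruction is computed from
  long Offset;          // byte offset, only meaningful for ConstGEP
};

// Toy stand-in for a dbg.value: the value it refers to plus a list of
// expression operations applied to that value, in order.
struct ToyDbgValue {
  std::string Value;
  std::vector<std::string> Expr;
};

// Retarget DV from the result of I (about to be deleted) to I's operand,
// prepending whatever is needed to recompute I's result from that operand.
void salvage(ToyDbgValue &DV, const ToyInst &I) {
  assert(!I.Operand0.empty() && "instruction needs an operand to salvage");
  switch (I.K) {
  case Kind::Bitcast: // no effect on the bits: just point at operand(0)
    break;
  case Kind::Load: // the result is the memory the operand points to
    DV.Expr.insert(DV.Expr.begin(), "DW_OP_deref");
    break;
  case Kind::ConstGEP: // the result is the operand plus a constant offset
    DV.Expr.insert(DV.Expr.begin(), "offset " + std::to_string(I.Offset));
    break;
  }
  DV.Value = I.Operand0;
}
```

Applying salvage for %6, %add.ptr4, %5 and %4 in turn leaves the toy dbg.value pointing at %next2 with a dereference followed by the -8 offset. New operations are prepended because they must run before the operations that were already attached to the old value.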
rdar://problem/30725338
Differential Revision: https://reviews.llvm.org/D30919
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297994 91177308-0d34-0410-b5e6-96231b3b80d8
This patch improves debug info quality in InstCombine by looking at
values that are about to be deleted, checking whether there are any
dbg.value intrinsics referring to them, and potentially encoding the
semantics of the deleted instruction into the dbg.value's
DIExpression.
In the example in the testcase (which was extracted from XNU) there is a sequence of
%4 = load %struct.entry*, %struct.entry** %next2, align 8, !dbg !41
%5 = bitcast %struct.entry* %4 to i8*, !dbg !42
%add.ptr4 = getelementptr inbounds i8, i8* %5, i64 -8, !dbg !43
%6 = bitcast i8* %add.ptr4 to %struct.entry*, !dbg !44
call void @llvm.dbg.value(metadata %struct.entry* %6, i64 0, metadata !20, metadata !21), !dbg !34
When these instructions are eliminated by instcombine one after
another, we can still salvage the otherwise dead debug info:
- Bitcasts have no effect, so have the dbg.value point to operand(0)
- Loads can be expressed via a DW_OP_deref
- Constant gep instructions can be replaced by DWARF expression arithmetic
The API introduced by this patch is not specific to instcombine and
can be useful in other places, too.
rdar://problem/30725338
Differential Revision: https://reviews.llvm.org/D30919
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297971 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
The call to canEvaluateZExtd in InstCombiner::visitZExt may
return with BitsToClear == SrcTy->getScalarSizeInBits(), but
there is an assert that BitsToClear should be smaller than
SrcTy->getScalarSizeInBits().
I have a test case that triggers the assert, but it only happens
for my downstream target. I've not been able to trigger it for
any upstream target.
The assert triggered for a piece of code such as this
%shr1 = lshr i16 undef, 15
...
%shr2 = lshr i16 %shr1, 1
%conv = zext i16 %shr2 to i32
Normally the lshr instructions are constant folded before we
visit the zext (that is why it is so hard to reproduce).
The original pattern, before instcombine, is of course a lot more
complicated in my test case. The shift count in the second lshr
is for example determined by the outcome of a PHI instruction.
It seems like other rewrites by instcombine lead up to
the pattern above. And then the zext is pulled from the
worklist, and visited (hitting the assert), before we detect
that the lshr instructions can be constant folded.
Anyway, since the canEvaluateZExtd may return with BitsToClear
equal to SrcTy->getScalarSizeInBits(), and since the rewrite
that converts the expression type to avoid a zero extend works
also for the case where SrcBitsKept ends up being zero, then
it should be OK to relax the assert to
assert(BitsToClear <= SrcTy->getScalarSizeInBits() &&
"Unreasonable BitsToClear");
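To illustrate why SrcBitsKept == 0 is harmless, here is a plain-integer model of the i16 to i32 case. lowBitsSet and zextViaWideEval are hypothetical helpers written for this example; they only mimic the masking step and are not the InstCombine code.

```cpp
#include <cassert>
#include <cstdint>

// Mask with the low NumBits bits set, for NumBits in [0, 32].
// A plain-integer stand-in for an "N low bits set" helper.
uint32_t lowBitsSet(unsigned NumBits) {
  assert(NumBits <= 32 && "mask wider than the destination type");
  return NumBits == 32 ? ~uint32_t(0) : (uint32_t(1) << NumBits) - 1;
}

// The expression has been evaluated in the wide type, so its upper bits
// may hold garbage. Keep only the source bits that are still meaningful.
uint32_t zextViaWideEval(uint32_t WideValue, unsigned SrcBits,
                         unsigned BitsToClear) {
  assert(BitsToClear <= SrcBits && "Unreasonable BitsToClear");
  unsigned SrcBitsKept = SrcBits - BitsToClear; // may legitimately be 0
  return WideValue & lowBitsSet(SrcBitsKept);
}
```

Shifting a 16-bit value right by 15 and then by 1 clears all 16 bits, so BitsToClear is 16, the mask is 0, and the result is 0, which matches what the narrow computation followed by a zext would produce.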
Reviewers: hfinkel
Reviewed By: hfinkel
Subscribers: hfinkel, llvm-commits
Differential Revision: https://reviews.llvm.org/D30993
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297952 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
In commit r289548 ([ADCE] Add code to remove dead branches) a redundant loop
nest was accidentally introduced, which implements exactly the same
functionality as has already been available right after. This redundancy has
been found when inspecting the ADCE code in the context of our recent
discussions on post-dominator modeling. This redundant code was also eliminated
by r296535 (which sparked the discussion), but only as part of a larger semantic
change of the post-dominance modeling. As this redundancy in [ADCE] is really
just an oversight completely independent of the post-dominance changes under
discussion, we remove this redundancy independently.
Reviewers: dberlin, david2050
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D31023
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297929 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
NSIs can be double-counted by different operations in
SelectInstVisitor. Sink the update to VM_counting mode only.
Also reset the value for each counting operation.
Reviewers: davidxl
Reviewed By: davidxl
Subscribers: xur, llvm-commits
Differential Revision: https://reviews.llvm.org/D30999
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297892 91177308-0d34-0410-b5e6-96231b3b80d8
This isn't safe on all targets, and since we don't have a way
to know it's safe, avoid doing it for now.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297788 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
In SamplePGO, if the profile is collected from a non-LTO binary and used to drive ThinLTO, the indirect call promotion may fail because ThinLTO adjusts local function names to avoid conflicts. There are two places where the mismatch can happen:
1. thin-link prepends SourceFileName to front of FuncName to build the GUID (GlobalValue::getGlobalIdentifier). Unlike instrumentation FDO, SamplePGO does not use the PGOFuncName scheme and therefore the indirect call target profile data contains a hash of the OriginalName.
2. backend compiler promotes some local functions to global and appends .llvm.{$ModuleHash} to the end of the FuncName to derive PromotedFunctionName
This patch makes a best-effort attempt to find the GUID from the original local function name (in the profile), and uses that in ICP promotion and in the SamplePGO matching that happens in the backend after importing/inlining:
1. in thin-link, it builds the map from OriginalName to GUID so that when thin-link reads in indirect call target profile (represented by OriginalName), it knows which GUID to import.
2. in backend compiler, if sample profile reader cannot find a profile match for PromotedFunctionName, it will try to find if there is a match for OriginalFunctionName.
3. in backend compiler, we build symbol table entry for OriginalFunctionName and pointer to the same symbol of PromotedFunctionName, so that ICP can find the correct target to promote.
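The fallback in step 2 can be sketched roughly as follows, assuming the profile is a simple map from function name to sample count. originalName and findSamples are hypothetical helpers for illustration and are not the sample profile reader's API.

```cpp
#include <cassert>
#include <map>
#include <string>

// Strip a trailing ".llvm.<hash>" suffix, if present, to recover the
// original local function name.
std::string originalName(const std::string &Promoted) {
  const std::string Marker = ".llvm.";
  std::string::size_type Pos = Promoted.rfind(Marker);
  if (Pos == std::string::npos || Pos == 0)
    return Promoted;
  return Promoted.substr(0, Pos);
}

// Look up samples by the promoted name first, then fall back to the
// original name. Returns nullptr when neither is in the profile.
const unsigned long *
findSamples(const std::map<std::string, unsigned long> &Profile,
            const std::string &Name) {
  assert(!Name.empty() && "function name must not be empty");
  auto It = Profile.find(Name);
  if (It == Profile.end())
    It = Profile.find(originalName(Name));
  return It == Profile.end() ? nullptr : &It->second;
}
```

A profile collected from a non-LTO binary only knows the name without the suffix, so a lookup for a promoted name succeeds only through the fallback.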
Reviewers: mehdi_amini, tejohnson
Reviewed By: tejohnson
Subscribers: llvm-commits, Prazek
Differential Revision: https://reviews.llvm.org/D30754
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297757 91177308-0d34-0410-b5e6-96231b3b80d8
This patch refactors the PHIsToFix loop as follows:
- The loop itself now resides in its own method.
- The new method iterates on scalar-loop's header; the PHIsToFix map formerly
propagated as an output parameter and filled during phi widening is removed.
- The code handling reductions is moved into its own method, similar to the
existing fixFirstOrderRecurrence().
Differential Revision: https://reviews.llvm.org/D30755
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297740 91177308-0d34-0410-b5e6-96231b3b80d8
Refactoring Cost Model's selectVectorizationFactor() so that it handles only the
selection of the best VF from a pre-computed range of candidate VF's, extracting
early-exit criteria and the computation of a MaxVF upper-bound to other methods,
all driven by a newly introduced LoopVectorizationPlanner.
Differential Revision: https://reviews.llvm.org/D30653
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297737 91177308-0d34-0410-b5e6-96231b3b80d8
getIntrinsicInstrCost() used to only compute scalarization cost based on types.
This patch improves this so that the actual arguments are checked when they are
available, in order to handle only unique non-constant operands.
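A minimal sketch of the idea of charging only for unique non-constant operands, using a toy operand type and a caller-supplied per-operand cost. ToyOperand and the function below are hypothetical and only mirror the name of getOperandsScalarizationOverhead(); the real cost model is target dependent.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Toy operand: a name plus a flag saying whether it is a constant.
struct ToyOperand {
  std::string Name;
  bool IsConstant;
};

// Sum the cost of the operands that would have to be extracted from a
// vector when the call is scalarized. Constants need no extraction, and
// a value that appears several times is charged only once.
unsigned operandsScalarizationOverhead(const std::vector<ToyOperand> &Args,
                                       unsigned CostPerOperand) {
  assert(CostPerOperand > 0 && "expected a positive unit cost");
  std::set<std::string> Unique;
  unsigned Cost = 0;
  for (const ToyOperand &A : Args)
    if (!A.IsConstant && Unique.insert(A.Name).second)
      Cost += CostPerOperand;
  return Cost;
}
```

For a call whose arguments are %x, %x and a constant, a purely type-based estimate would charge three operands, whereas this version charges one.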
Test updates:
Analysis/CostModel/X86/arith-fp.ll
Transforms/LoopVectorize/AArch64/interleaved_cost.ll
Transforms/LoopVectorize/ARM/interleaved_cost.ll
The improvement in getOperandsScalarizationOverhead() to differentiate on
constants made it necessary to update the interleaved_cost.ll tests even
though they do not relate to intrinsics.
Review: Hal Finkel
https://reviews.llvm.org/D29540
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297705 91177308-0d34-0410-b5e6-96231b3b80d8
The typical use is a library vote function which
compares to 0. Fold the user condition into the intrinsic.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297650 91177308-0d34-0410-b5e6-96231b3b80d8
This commit is a follow-up on r297580. It fixes the FIXME added temporarily
by that commit to keep the removal of Unroller's specialized version of
scalarizeInstruction() an NFC. See https://reviews.llvm.org/D30715 for details.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297610 91177308-0d34-0410-b5e6-96231b3b80d8
Unroller's specialized scalarizeInstruction() is mostly duplicating Vectorizer's
variant. OTOH Vectorizer's scalarizeInstruction() already supports the special
case of VF==1 except for avoiding mask-bit extraction in that case. This patch
removes Unroller's specialized version in favor of a unified method.
The only functional difference between the two variants seems to be setting
memcheck metadata for loads and stores only in Vectorizer's variant, which is a
bug in Unroller. To keep this patch an NFC the unified method doesn't set
memcheck metadata for VF==1.
Differential Revision: https://reviews.llvm.org/D30715
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297580 91177308-0d34-0410-b5e6-96231b3b80d8