insertelement (insertelement X, Y, IdxC1), ScalarC, IdxC2 -->
insertelement (insertelement X, ScalarC, IdxC2), Y, IdxC1
As noted in the code comment and seen in the test changes, the motivation is that by pulling
constant insertion up, we may be able to constant fold some insertelement instructions.
Differential Revision: https://reviews.llvm.org/D31196
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298520 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: Subtracts can have constants on the left side, but we don't shrink them based on demanded bits. This patch fixes that to match the right hand side.
Reviewers: davide, majnemer, spatel, sanjoy, hfinkel
Reviewed By: spatel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D31119
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298478 91177308-0d34-0410-b5e6-96231b3b80d8
Currently the default C calling convention functions are treated
the same as compute kernels. Make this explicit so the default
calling convention can be changed to a non-kernel.
Converted with perl -pi -e 's/define void/define amdgpu_kernel void/'
on the relevant test directories (and undoing in one place that actually
wanted a non-kernel).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298444 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
In case we are loading on a phi-load in SimplifyPartiallyRedundantLoad.
Try to phi translate it into incoming values in the predecessors before
we search for available loads.
This needs https://reviews.llvm.org/D30524
Reviewers: davide, sanjoy, efriedma, dberlin, rengolin
Reviewed By: dberlin
Subscribers: junbuml, llvm-commits
Differential Revision: https://reviews.llvm.org/D30543
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298217 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
The reverse of an artbitrary bitpattern is also an arbitrary
bitpattern.
Reviewers: trentxintong, arsenm, majnemer
Reviewed By: majnemer
Subscribers: majnemer, wdng, llvm-commits
Differential Revision: https://reviews.llvm.org/D31118
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298201 91177308-0d34-0410-b5e6-96231b3b80d8
Loop unswitching can be extremely harmful for a SIMT target. In case
if hoisted condition is not uniform a SIMT machine will execute both
clones of a loop sequentially. Therefor LoopUnswitch checks if the
condition is non-divergent.
Since DivergenceAnalysis adds an expensive PostDominatorTree analysis
not needed for non-SIMT targets a new option is added to avoid unneded
analysis initialization. The method getAnalysisUsage is called when
TargetTransformInfo is not yet available and we cannot use it here.
For that reason a new field DivergentTarget is added to PassManagerBuilder
to control the behavior and set this field from a target.
Differential Revision: https://reviews.llvm.org/D30796
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298104 91177308-0d34-0410-b5e6-96231b3b80d8
We were not handling getelemenptr instructions of vector type before.
Since getelemenptr instructions for vector types follow the same rule as
getelementptr instructions for non-vector types, we can just handle them
in the same way.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298028 91177308-0d34-0410-b5e6-96231b3b80d8
[Reapplies r297971 and punting on finding a better API for findDbgValues()]
This patch improves debug info quality in InstCombine by looking at
values that are about to be deleted, checking whether there are any
dbg.value instrinsics referring to them, and potentially encoding the
semantics of the deleted instruction into the dbg.value's
DIExpression.
In the example in the testcase (which was extracted from XNU) there is a sequence of
%4 = load %struct.entry*, %struct.entry** %next2, align 8, !dbg !41
%5 = bitcast %struct.entry* %4 to i8*, !dbg !42
%add.ptr4 = getelementptr inbounds i8, i8* %5, i64 -8, !dbg !43
%6 = bitcast i8* %add.ptr4 to %struct.entry*, !dbg !44
call void @llvm.dbg.value(metadata %struct.entry* %6, i64 0, metadata !20, metadata !21), !dbg 34
When these instructions are eliminated by instcombine one after
another, we can still salvage the otherwise dead debug info:
- Bitcasts have no effect, so have the dbg.value point to operand(0)
- Loads can be expressed via a DW_OP_deref
- Constant gep instructions can be replaced by DWARF expression arithmetic
The API introduced by this patch is not specific to instcombine and
can be useful in other places, too.
rdar://problem/30725338
Differential Revision: https://reviews.llvm.org/D30919
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297994 91177308-0d34-0410-b5e6-96231b3b80d8
This patch improves debug info quality in InstCombine by looking at
values that are about to be deleted, checking whether there are any
dbg.value instrinsics referring to them, and potentially encoding the
semantics of the deleted instruction into the dbg.value's
DIExpression.
In the example in the testcase (which was extracted from XNU) there is a sequence of
%4 = load %struct.entry*, %struct.entry** %next2, align 8, !dbg !41
%5 = bitcast %struct.entry* %4 to i8*, !dbg !42
%add.ptr4 = getelementptr inbounds i8, i8* %5, i64 -8, !dbg !43
%6 = bitcast i8* %add.ptr4 to %struct.entry*, !dbg !44
call void @llvm.dbg.value(metadata %struct.entry* %6, i64 0, metadata !20, metadata !21), !dbg 34
When these instructions are eliminated by instcombine one after
another, we can still salvage the otherwise dead debug info:
- Bitcasts have no effect, so have the dbg.value point to operand(0)
- Loads can be expressed via a DW_OP_deref
- Constant gep instructions can be replaced by DWARF expression arithmetic
The API introduced by this patch is not specific to instcombine and
can be useful in other places, too.
rdar://problem/30725338
Differential Revision: https://reviews.llvm.org/D30919
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297971 91177308-0d34-0410-b5e6-96231b3b80d8
in r297374.
I've extracted a small version of this from the C++ metaprogram Richard
came up with to exercise these kinds of issues and written comments to
describe both how to reproduce a fresh version of the test case and what
likely failure modes are.
The test case is still a bit brittle as it depends on the particular
inline cost modeling and SCC visitation order, but it definitely would
have caught the bug right away when developing things so it seems
a really valuable test case to have.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297935 91177308-0d34-0410-b5e6-96231b3b80d8
This isn't safe on all targets, and since we don't have a way
to know it's safe, avoid doing it for now.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297788 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
In SamplePGO, if the profile is collected from non-LTO binary, and used to drive ThinLTO, the indirect call promotion may fail because ThinLTO adjusts local function names to avoid conflicts. There are two places of where the mismatch can happen:
1. thin-link prepends SourceFileName to front of FuncName to build the GUID (GlobalValue::getGlobalIdentifier). Unlike instrumentation FDO, SamplePGO does not use the PGOFuncName scheme and therefore the indirect call target profile data contains a hash of the OriginalName.
2. backend compiler promotes some local functions to global and appends .llvm.{$ModuleHash} to the end of the FuncName to derive PromotedFunctionName
This patch tries at the best effort to find the GUID from the original local function name (in profile), and use that in ICP promotion, and in SamplePGO matching that happens in the backend after importing/inlining:
1. in thin-link, it builds the map from OriginalName to GUID so that when thin-link reads in indirect call target profile (represented by OriginalName), it knows which GUID to import.
2. in backend compiler, if sample profile reader cannot find a profile match for PromotedFunctionName, it will try to find if there is a match for OriginalFunctionName.
3. in backend compiler, we build symbol table entry for OriginalFunctionName and pointer to the same symbol of PromotedFunctionName, so that ICP can find the correct target to promote.
Reviewers: mehdi_amini, tejohnson
Reviewed By: tejohnson
Subscribers: llvm-commits, Prazek
Differential Revision: https://reviews.llvm.org/D30754
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297757 91177308-0d34-0410-b5e6-96231b3b80d8
If it is possible for the RHS of a shift operation to be greater than or equal
to the bit-width, then the result might be undef, and we can't report any known
bits.
In some cases, this was allowing a transformation in instcombine which widened
an undef value from i1 to i32, increasing the range of values that a function
could return.
Differential revision: https://reviews.llvm.org/D30781
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297724 91177308-0d34-0410-b5e6-96231b3b80d8
getIntrinsicInstrCost() used to only compute scalarization cost based on types.
This patch improves this so that the actual arguments are checked when they are
available, in order to handle only unique non-constant operands.
Tests updates:
Analysis/CostModel/X86/arith-fp.ll
Transforms/LoopVectorize/AArch64/interleaved_cost.ll
Transforms/LoopVectorize/ARM/interleaved_cost.ll
The improvement in getOperandsScalarizationOverhead() to differentiate on
constants made it necessary to update the interleaved_cost.ll tests even
though they do not relate to intrinsics.
Review: Hal Finkel
https://reviews.llvm.org/D29540
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297705 91177308-0d34-0410-b5e6-96231b3b80d8
The typical use is a library vote function which
compares to 0. Fold the user condition into the intrinsic.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297650 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This change solves the same problem as D30726, except that this only
throws out the bathwater.
AST was not correctly tracking and deleting UnknownInstructions via
handles. The existing code only tracks "pointers" in its
`ASTCallbackVH`, so an UnknownInstruction (that isn't also def'ing a
pointer used by another memory instruction) never gets a
`ASTCallbackVH`.
There are two other ways to solve this problem:
- Use the `PointerRec` scheme for both known and unknown instructions.
- Use a `CallbackVH` that erases the offending Instruction from the
UnknownInstruction list.
Both of the above changes seemed to be significantly (and unnecessarily
IMO) more complex than this.
Reviewers: chandlerc, dberlin, hfinkel, reames
Subscribers: mcrosier, llvm-commits
Differential Revision: https://reviews.llvm.org/D30849
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297539 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts r293386, r294027, r294029 and r296411.
Turns out the SLP tree isn't actually a "tree" and we don't handle
accessing the same packet of loads in several different orders well,
causing miscompiles.
Revert until we can fix this properly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297493 91177308-0d34-0410-b5e6-96231b3b80d8
The insertion point may be later than the next instruction,
so it is necessary to set it when replacing the call.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@297439 91177308-0d34-0410-b5e6-96231b3b80d8