Fixes leaving intermediate flat addressing computations
where a GEP instruction's source is a constant expression.
Still leaves behind a trivial addrspacecast + gep pair that
instcombine is able to handle, which ideally could be folded
here directly.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@301044 91177308-0d34-0410-b5e6-96231b3b80d8
places based on it.
Existing constant hoisting pass will merge a group of contants in a small range
and hoist the const materialization code to the common dominator of their uses.
However, if the uses are all in cold pathes, existing implementation may hoist
the materialization code from cold pathes to a hot place. This may hurt performance.
The patch introduces BFI to the pass and selects the best insertion places based
on it.
The change is controlled by an option consthoist-with-block-frequency which is
off by default for now.
Differential Revision: https://reviews.llvm.org/D28962
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300989 91177308-0d34-0410-b5e6-96231b3b80d8
CodeExtractor looks up the dominator node corresponding to return blocks
when splitting them. If one of these blocks is unreachable, there's no
node in the Dom and CodeExtractor crashes because it doesn't check
for domtree node validity.
In theory, we could add just a check for skipping null DTNodes in
`splitReturnBlock` but the fix I propose here is slightly different. To the
best of my knowledge, unreachable blocks are irrelevant for the algorithm,
therefore we can just skip them when building the candidate set in the
constructor.
Differential Revision: https://reviews.llvm.org/D32335
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300946 91177308-0d34-0410-b5e6-96231b3b80d8
The demanded mask and the constant should always be the same width for all callers today.
Also stop copying the demanded mask as its passed in. We should avoid allocating memory unless we are going to do something. The final AND to create the new constant will take care of it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300927 91177308-0d34-0410-b5e6-96231b3b80d8
getSignBit is a static function that creates an APInt with only the sign bit set. getSignMask seems like a better name to convey its functionality. In fact several places use it and then store in an APInt named SignMask.
Differential Revision: https://reviews.llvm.org/D32108
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300856 91177308-0d34-0410-b5e6-96231b3b80d8
This question comes up in many places in SimplifyDemandedBits. This makes it easy to ask without allocating additional temporary APInts.
The BitVector class provides a similar functionality through its (IMHO badly named) test(const BitVector&) method. Though its output polarity is reversed.
I've provided one example use case in this patch. I plan to do more as a follow up.
Differential Revision: https://reviews.llvm.org/D32258
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300851 91177308-0d34-0410-b5e6-96231b3b80d8
Currently we don't explicitly process ConstantDataSequential, ConstantAggregateZero, or ConstantVector, or Undef before applying the Depth limit. Instead they occur after the depth check in the non-instruction path.
For the constant types that we do handle, the code is replicated from computeKnownBits.
This patch fixes the missing constant handling and the reduces the amount of code by just using computeKnownBits directly for any type of Constant.
Differential Revision: https://reviews.llvm.org/D32123
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300849 91177308-0d34-0410-b5e6-96231b3b80d8
This change is correct because the verifier requires that at most one
argument be marked 'sret'.
NFC, removes a use of AttributeList slot APIs.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300784 91177308-0d34-0410-b5e6-96231b3b80d8
This is preparation for a clang change to improve the [[nodiscard]] warning to not be ignored on methods that return a class marked [[nodiscard]] that are defined in the class itself. See D32207.
We should consider adding wrapper methods to APInt that return the overflow flag directly and discard the APInt result. This would eliminate the void casts and the need to create a bool before the call to pass to the out param.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300758 91177308-0d34-0410-b5e6-96231b3b80d8
The most common case for a branch condition is
a single use compare. Directly invert the branch
predicate rather than adding a lot of xor i1 true
which the DAG will have to fold later.
This produces nicer to read structurizer output.
This produces some random changes in codegen
due to the DAG swapping branch conditions itself,
and then does a poor job of dealing with those
inverts.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300732 91177308-0d34-0410-b5e6-96231b3b80d8
This should simplify the call sites, which typically want to tweak one
attribute at a time. It should also avoid creating ephemeral
AttributeLists that live forever.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300718 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: In case all predecessor go to a single successor of current BB. We want to fold (not thread).
Reviewers: efriedma, sanjoy
Reviewed By: sanjoy
Subscribers: dberlin, majnemer, llvm-commits
Differential Revision: https://reviews.llvm.org/D30869
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300657 91177308-0d34-0410-b5e6-96231b3b80d8
In tryToVectorizeList, under a very limited circumstance (when entered
from tryToVectorizePair), the values may be reordered (swapped) and the
SLP tree is built with the new order. This extends that to the case when
starting from phis in vectorizeChainsInBlock when there are exactly two
phis. The textual order of phi nodes shouldn't really matter. Without
this change, the loop body in the accompnaying test case is fully vectorized
when we swap the orde of the phis but not with this order. While this
doesn't solve the phi-ordering problem in a general way (for more than 2
phis), this is simple fix that piggybacks on an existing mechanism and
is useful in cases like multiplying two complex numbers.
Differential revision: https://reviews.llvm.org/D32065
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300574 91177308-0d34-0410-b5e6-96231b3b80d8
This patch uses lshrInPlace to replace code where the object that lshr is called on is being overwritten with the result.
This adds an lshrInPlace(const APInt &) version as well.
Differential Revision: https://reviews.llvm.org/D32155
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300566 91177308-0d34-0410-b5e6-96231b3b80d8
This patch is part of D28975's breakdown.
Add caching for block masks similar to the cache already used for edge masks,
replacing generation per user with reusing the first generated value which
dominates all uses.
Differential Revision: https://reviews.llvm.org/D32054
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300557 91177308-0d34-0410-b5e6-96231b3b80d8
Before this patch, we always called method 'findCalleeFunctionSamples()' on
intrinsic calls. However, intrinsic calls like llvm.dbg.value() are not viable
candidates for obvious reasons.
No functional change intended.
Differential Revision: https://reviews.llvm.org/D32008
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300541 91177308-0d34-0410-b5e6-96231b3b80d8
The DWARF specification knows 3 kinds of non-empty simple location
descriptions:
1. Register location descriptions
- describe a variable in a register
- consist of only a DW_OP_reg
2. Memory location descriptions
- describe the address of a variable
3. Implicit location descriptions
- describe the value of a variable
- end with DW_OP_stack_value & friends
The existing DwarfExpression code is pretty much ignorant of these
restrictions. This used to not matter because we only emitted very
short expressions that we happened to get right by accident. This
patch makes DwarfExpression aware of the rules defined by the DWARF
standard and now chooses the right kind of location description for
each expression being emitted.
This would have been an NFC commit (for the existing testsuite) if not
for the way that clang describes captured block variables. Based on
how the previous code in LLVM emitted locations, DW_OP_deref
operations that should have come at the end of the expression are put
at its beginning. Fixing this means changing the semantics of
DIExpression, so this patch bumps the version number of DIExpression
and implements a bitcode upgrade.
There are two major changes in this patch:
I had to fix the semantics of dbg.declare for describing function
arguments. After this patch a dbg.declare always takes the *address*
of a variable as the first argument, even if the argument is not an
alloca.
When lowering a DBG_VALUE, the decision of whether to emit a register
location description or a memory location description depends on the
MachineLocation — register machine locations may get promoted to
memory locations based on their DIExpression. (Future) optimization
passes that want to salvage implicit debug location for variables may
do so by appending a DW_OP_stack_value. For example:
DBG_VALUE, [RBP-8] --> DW_OP_fbreg -8
DBG_VALUE, RAX --> DW_OP_reg0 +0
DBG_VALUE, RAX, DIExpression(DW_OP_deref) --> DW_OP_reg0 +0
All testcases that were modified were regenerated from clang. I also
added source-based testcases for each of these to the debuginfo-tests
repository over the last week to make sure that no synchronized bugs
slip in. The debuginfo-tests compile from source and run the debugger.
https://bugs.llvm.org/show_bug.cgi?id=32382
<rdar://problem/31205000>
Differential Revision: https://reviews.llvm.org/D31439
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@300522 91177308-0d34-0410-b5e6-96231b3b80d8