The sibling folds for 'and' with casts were added with https://reviews.llvm.org/rL273200.
This is a preliminary step for adding the 'or' variants for the folds added with https://reviews.llvm.org/rL301260.
The reason for the strange form with constant LHS in the 1st test is because there's another missing fold in that
case for the inverted predicate. That should be fixed when we add the ConstantRange functionality for 'or-of-icmps'
that already exists for 'and-of-icmps'.
I'm hoping to share more code for the and/or cases, so we won't have these differences. This will allow us to remove
code from InstCombine. It's also possible that we can remove some code here in InstSimplify. I think we have some
duplicated folds because patterns are not matched in a general way.
Differential Revision: https://reviews.llvm.org/D32876
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302189 91177308-0d34-0410-b5e6-96231b3b80d8
This happened on the PPC32/SVR4 path and was discovered when building
FreeBSD on PPC32. It was a typo-class error in the frame lowering code.
This fixes PR26519.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302183 91177308-0d34-0410-b5e6-96231b3b80d8
This avoids problems on code like this:
char buf[16];
__asm {
movups xmm0, [buf]
mov [buf], eax
}
The frontend size in this case (1) is wrong, and the register makes the
instruction matching unambiguous. There are also enough bytes available
that we shouldn't complain to the user that they are potentially using
an incorrectly sized instruction to access the variable.
Supersedes D32636 and D26586 and fixes PR28266
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302179 91177308-0d34-0410-b5e6-96231b3b80d8
Putting these next to each other should make it easier to see
what's missing from each side. Patch to plug one of those holes
should be posted soon.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302178 91177308-0d34-0410-b5e6-96231b3b80d8
Currently multiply is implemented in operator*=. Operator* makes a copy and uses operator*= to modify the copy.
Operator*= itself allocates a temporary buffer to hold the multiply result as it computes it. Then copies it to the buffer in *this.
Operator*= attempts to bound the size of the result based on the number of active bits in its inputs. It also has a couple special cases to handle 0 inputs without any memory allocations or multiply operations. The best case is that it calculates a single word regardless of input bit width. The worst case is that it calculates the a 2x input width result and drop the upper bits.
Since operator* uses operator*= it incurs two allocations, one for a copy of *this and one for the temporary allocation. Neither of these allocations are kept after the method operation is done.
The main usage in the backend appears to be ConstantRange::multiply which uses operator* rather than operator*=.
This patch moves the multiply operation to operator* and implements operator*= using it. This avoids the copy in operator*. operator* now allocates a result buffer sized the same width as its inputs no matter what. This buffer will be used as the buffer for the returned APInt. Finally, we reuse tcMultiply to implement the multiply operation. This function is capable of not calculating additional upper words that will be discarded.
This change does lose the special optimizations for the inputs using less words than their size implies. But it also removed the getActiveBits calls from all multiplies. If we think those optimizations are important we could look at providing additional bounds to tcMultiply to limit the computations.
Differential Revision: https://reviews.llvm.org/D32830
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302171 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
As of this patch, 350 out of 3938 rules are currently imported.
Depends on D32229
Reviewers: qcolombet, kristof.beyls, rovka, t.p.northover, ab, aditya_nandakumar
Reviewed By: ab
Subscribers: dberris, llvm-commits, igorb
Differential Revision: https://reviews.llvm.org/D32275
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302154 91177308-0d34-0410-b5e6-96231b3b80d8
When a 128 bit COPY is lowered into two instructions, an impl-use operand of
the super-reg should be added to each new instruction in case one of the
sub-regs is undefined.
Review: Ulrich Weigand
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302146 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts commit r302108. This causes crash in clang bootstrap with LTO.
Contacted the auther in the original commit.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302140 91177308-0d34-0410-b5e6-96231b3b80d8
Follow up rL290858 by removing the MIPS specific version of XRayTable
emission in favour of the basic version.
This resolves a buildbot failure where the ELF sections were malformed
causing the linker to reject the object files with xray related sections.
Reviewers: dberris, slthakur
Differential Revision: https://reviews.llvm.org/D32808
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302138 91177308-0d34-0410-b5e6-96231b3b80d8
This fixes a regression since SVN rev 273808 (which was supposed to
not change functionality).
The regression caused miscompilations (noted in the wild when targeting
AArch64) on platforms with 32 bit long.
Differential Revision: https://reviews.llvm.org/D32850
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302137 91177308-0d34-0410-b5e6-96231b3b80d8
In this patch, I introduce a new altmacro string delimiter.
This review is the second review in a series of four reviews.
(one for each altmacro feature: LOCAL, string delimiter, string '!' escape sign and absolute expression as a string '%' ).
In the alternate macro mode, you can delimit strings with matching angle brackets <..>
when using it as a part of calling macro arguments.
As described in the https://sourceware.org/binutils/docs-2.27/as/Altmacro.html
"<string>
You can delimit strings with matching angle brackets."
assumptions:
1. If an argument begins with '<' and ends with '>'. The argument is considered as a string.
2. Except adding new string mark '<..>', a regular macro behavior is expected.
3. The altmacro cannot affect the regular less/greater behavior.
4. If a comma is present inside an angle brackets it considered as a character and not as a separator.
Differential Revision: https://reviews.llvm.org/D32701
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302135 91177308-0d34-0410-b5e6-96231b3b80d8
Added the integer data processing intrinsics from ACLE v2.1 Chapter 9
but I have missed out the saturation_occurred intrinsics for now. For
the instructions that read and write the GE bits, a chain is included
and the only instruction that reads these flags (sel) is only
selectable via the implemented intrinsic.
Differential Revision: https://reviews.llvm.org/D32281
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302126 91177308-0d34-0410-b5e6-96231b3b80d8
According to psABI, PLT stub clobbers XMM8-XMM15.
In Regcall calling convention those registers are used for passing parameters.
Thus we need to prevent lazy binding in Regcall.
Differential Revision: https://reviews.llvm.org/D32430
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302124 91177308-0d34-0410-b5e6-96231b3b80d8
This makes it simpler for the runtime to consistently handle the entries
in the function sled index in both 32 and 64 bit platforms where the
XRay runtime works.
Follow-up on D32693.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302111 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This change adds a new section to the xray-instrumented binary that
stores an index into ranges of the instrumentation map, where sleds
associated with the same function can be accessed as an array. At
runtime, we can get access to this index by function ID offset allowing
for selective patching and unpatching by function ID.
Each entry in this new section (xray_fn_idx) will include two pointers
indicating the start and one past the end of the sleds associated with
the same function. These entries will be 16 bytes long on x86 and
aarch64. On arm, we align to 16 bytes anyway so the runtime has to take
that into consideration.
__{start,stop}_xray_fn_idx will be the symbols that the runtime will
look for when we implement the selective patching/unpatching by function
id APIs. Because XRay synthesizes the function id's in a monotonically
increasing manner at runtime now, implementations (and users) can use
this table to look up the sleds associated with a specific function.
This is useful in implementations that want to do things like:
- Implement coverage mode for functions by patching everything
pre-main, then as functions are encountered, the installed handler
can unpatch the function that's been encountered after recording
that it's been called.
- Do "learning mode", so that the implementation can figure out some
statistical information about function calls by function id for a
time being, and then determine which functions are worth
uninstrumenting at runtime.
- Do "selective instrumentation" where an implementation can
specifically instrument only certain function id's at runtime
(either based on some external data, or through some other
heuristics) instead of patching all the instrumented functions at
runtime.
Reviewers: dblaikie, echristo, chandlerc, javed.absar
Subscribers: pelikan, aemerson, kpw, llvm-commits, rengolin
Differential Revision: https://reviews.llvm.org/D32693
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302109 91177308-0d34-0410-b5e6-96231b3b80d8
When profiling a no-op incremental link of Chromium I found that the functions
computeImportForFunction and computeDeadSymbols were consuming roughly 10% of
the profile. The goal of this change is to improve the performance of those
functions by changing the map lookups that they were previously doing into
pointer dereferences.
This is achieved by changing the ValueInfo data structure to be a pointer to
an element of the global value map owned by ModuleSummaryIndex, and changing
reference lists in the GlobalValueSummary to hold ValueInfos instead of GUIDs.
This means that a ValueInfo will take a client directly to the summary list
for a given GUID.
Differential Revision: https://reviews.llvm.org/D32471
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302108 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This is an implementation of the loop detection logic that XRay needs to
determine whether a function might take time at runtime. Without this
heuristic, XRay will tend to not instrument short functions that have
loops that might have runtime dependent on inputs or external values.
While this implementation doesn't do any further analysis than just
figuring out whether there is a loop in the MachineFunction being
code-gen'ed, we're paving the way for being able to perform more
sophisticated analysis of the function in the future (for example to
determine whether the trip count for the loop might be constant, and
make a decision on that instead). This enables us to cover more
functions with the default heuristics, and potentially identify ones
that have variable runtime latency just by looking for the presence of
loops.
Reviewers: chandlerc, rnk, pelikan
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D32274
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302103 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
The existing implementation creates a symbolic SCEV expression every
time we analyze a phi node and then has to remove it, when the analysis
is finished. This is very expensive, and in most of the cases it's also
unnecessary. According to the data I collected, ~60-70% of analyzed phi
nodes (measured on SPEC) have the following form:
PN = phi(Start, OP(Self, Constant))
Handling such cases separately significantly speeds this up.
Reviewers: sanjoy, pete
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D32663
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302096 91177308-0d34-0410-b5e6-96231b3b80d8
We'll set it back to true in emitPrologue if it gets called. It doesn't
get called for naked functions.
Fixes PR32912
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302092 91177308-0d34-0410-b5e6-96231b3b80d8
I don't believe its possible to have non-zero values here since DataLayout became required. The APInt constructor inside of the KnownBits object will assert if this ever happens.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302089 91177308-0d34-0410-b5e6-96231b3b80d8
This is the DAG equivalent of https://reviews.llvm.org/D32255 ,
which will hopefully be committed again. The functionality
(preferring a 'not' op) is already here in the DAG, so this is
just intended to be a clean-up and performance improvement.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@302087 91177308-0d34-0410-b5e6-96231b3b80d8