mir-canon (MIRCanonicalizerPass) is a pass designed to reorder instructions and
rename operands so that two similar programs will diff more cleanly after being
run through mir-canon than they would otherwise. This project is still a work
in progress and there are ideas still being discussed for improving diff
quality.
M include/llvm/InitializePasses.h
M lib/CodeGen/CMakeLists.txt
M lib/CodeGen/CodeGen.cpp
A lib/CodeGen/MIRCanonicalizerPass.cpp
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317285 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Currently the block frequency analysis is an approximation for irreducible
loops.
The new irreducible loop metadata is used to annotate the irreducible loop
headers with their header weights based on the PGO profile (currently this is
approximated to be evenly weighted) and to help improve the accuracy of the
block frequency analysis for irreducible loops.
This patch is a basic support for this.
Reviewers: davidxl
Reviewed By: davidxl
Subscribers: mehdi_amini, llvm-commits, eraman
Differential Revision: https://reviews.llvm.org/D39028
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317278 91177308-0d34-0410-b5e6-96231b3b80d8
undefined reference to `llvm::TargetPassConfig::ID' on
clang-ppc64le-linux-multistage
This reverts commit eea333c33fa73ad225ef28607795984829f65688.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317213 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This is mostly a noop (most of the test diffs are renamed blocks).
There are a few temporary register renames (eax<->ecx) and a few blocks are
shuffled around.
See the discussion in PR33325 for more details.
Reviewers: spatel
Subscribers: mgorny
Differential Revision: https://reviews.llvm.org/D39456
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317211 91177308-0d34-0410-b5e6-96231b3b80d8
When splitting a large load to smaller legally-typed loads, the last load should be padded to reach the size of the previous one so a CONCAT_VECTORS node could reunite them again.
The code currently pads the last load to reach the size of the first load (instead of the previous).
Differential Revision: https://reviews.llvm.org/D38495
Change-Id: Ib60b55ed26ce901fabf68108daf52683fbd5013f
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317206 91177308-0d34-0410-b5e6-96231b3b80d8
As of today we only use .cfi_offset to specify the offset of a CSR, but
we never use .cfi_restore when the CSR is restored.
If we want to perform a more advanced type of shrink-wrapping, we need
to use .cfi_restore in order to switch the CFI state between blocks.
This patch only aims at adding support for the directive.
Differential Revision: https://reviews.llvm.org/D36114
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317199 91177308-0d34-0410-b5e6-96231b3b80d8
This patch aims to provide correct dwarf unwind information in function
epilogue for X86.
It consists of two parts. The first part inserts CFI instructions that set
appropriate cfa offset and cfa register in emitEpilogue() in
X86FrameLowering. This part is X86 specific.
The second part is platform independent and ensures that:
- CFI instructions do not affect code generation
- Unwind information remains correct when a function is modified by
different passes. This is done in a late pass by analyzing information
about cfa offset and cfa register in BBs and inserting additional CFI
directives where necessary.
Changed CFI instructions so that they:
- are duplicable
- are not counted as instructions when tail duplicating or tail merging
- can be compared as equal
Added CFIInstrInserter pass:
- analyzes each basic block to determine cfa offset and register valid at
its entry and exit
- verifies that outgoing cfa offset and register of predecessor blocks match
incoming values of their successors
- inserts additional CFI directives at basic block beginning to correct the
rule for calculating CFA
Having CFI instructions in function epilogue can cause incorrect CFA
calculation rule for some basic blocks. This can happen if, due to basic
block reordering, or the existence of multiple epilogue blocks, some of the
blocks have wrong cfa offset and register values set by the epilogue block
above them.
CFIInstrInserter is currently run only on X86, but can be used by any target
that implements support for adding CFI instructions in epilogue.
Patch by Violeta Vukobrat.
Differential Revision: https://reviews.llvm.org/D35844
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317100 91177308-0d34-0410-b5e6-96231b3b80d8
Change the map key from DIFile* to the absolute path string. Computing
the absolute path isn't expensive because we already have a map that
caches the full path keyed on DIFile*.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317041 91177308-0d34-0410-b5e6-96231b3b80d8
The address can be presented as a bitcast of baseReg.
In this case it is still trivial but OriginalValue != baseReg.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316980 91177308-0d34-0410-b5e6-96231b3b80d8
Issue found by llvm-isel-fuzzer on OSS fuzz, https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=3725
If anyone actually cares about > 64 bit arithmetic, there's a lot more to do in this area. There's a bunch of obviously wrong code in the same function. I don't have the time to fix all of them and am just using this to understand what the workflow for fixing fuzzer cases might look like.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316967 91177308-0d34-0410-b5e6-96231b3b80d8
We don't need to extend/truncate the Known structure before calling computeKnownBits - it will reset at the start of the function.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316962 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
For reference, see: http://lists.llvm.org/pipermail/llvm-dev/2017-August/116589.html
This patch fleshes out the instruction class hierarchy with respect to atomic and
non-atomic memory intrinsics. With this change, the relevant part of the class
hierarchy becomes:
IntrinsicInst
-> MemIntrinsicBase (methods-only class)
-> MemIntrinsic (non-atomic intrinsics)
-> MemSetInst
-> MemTransferInst
-> MemCpyInst
-> MemMoveInst
-> AtomicMemIntrinsic (atomic intrinsics)
-> AtomicMemSetInst
-> AtomicMemTransferInst
-> AtomicMemCpyInst
-> AtomicMemMoveInst
-> AnyMemIntrinsic (both atomicities)
-> AnyMemSetInst
-> AnyMemTransferInst
-> AnyMemCpyInst
-> AnyMemMoveInst
This involves some class renaming:
ElementUnorderedAtomicMemCpyInst -> AtomicMemCpyInst
ElementUnorderedAtomicMemMoveInst -> AtomicMemMoveInst
ElementUnorderedAtomicMemSetInst -> AtomicMemSetInst
A script for doing this renaming in downstream trees is included below.
An example of where the Any* classes should be used in LLVM is when reasoning
about the effects of an instruction (ex: aliasing).
---
Script for renaming AtomicMem* classes:
PREFIXES="[<,([:space:]]"
CLASSES="MemIntrinsic|MemTransferInst|MemSetInst|MemMoveInst|MemCpyInst"
SUFFIXES="[;)>,[:space:]]"
REGEX="(${PREFIXES})ElementUnorderedAtomic(${CLASSES})(${SUFFIXES})"
REGEX2="visitElementUnorderedAtomic(${CLASSES})"
FILES=$( grep -E "(${REGEX}|${REGEX2})" -r . | tr ':' ' ' | awk '{print $1}' | sort | uniq )
SED_SCRIPT="s~${REGEX}~\1Atomic\2\3~g"
SED_SCRIPT2="s~${REGEX2}~visitAtomic\1~g"
for f in $FILES; do
echo "Processing: $f"
sed -i ".bak" -E "${SED_SCRIPT};${SED_SCRIPT2};${EA_SED_SCRIPT};${EA_SED_SCRIPT2}" $f
done
Reviewers: sanjoy, deadalnix, apilipenko, anna, skatkov, mkazantsev
Reviewed By: sanjoy
Subscribers: hfinkel, jholewinski, arsenm, sdardis, nhaehnle, JDevlieghere, javed.absar, llvm-commits
Differential Revision: https://reviews.llvm.org/D38419
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316950 91177308-0d34-0410-b5e6-96231b3b80d8
- Targets that want to support memcmp expansions now return the list of
supported load sizes.
- Expansion codegen does not assume that all power-of-two load sizes
smaller than the max load size are valid. For examples, this is not the
case for x86(32bit)+sse2.
Fixes PR34887.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316905 91177308-0d34-0410-b5e6-96231b3b80d8
Introduce a isConstOrDemandedConstSplat helper function that can recognise a constant splat build vector for at least the demanded elts we care about.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316866 91177308-0d34-0410-b5e6-96231b3b80d8
For cases where we know the floating point representations match the bitcasted integer equivalent, allow bitcasting to these types.
This is especially useful for the X86 floating point compare results which return all/zero bits but as a floating point type.
Differential Revision: https://reviews.llvm.org/D39289
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316831 91177308-0d34-0410-b5e6-96231b3b80d8
In function DAGCombiner::visitSIGN_EXTEND_INREG, sext can be combined with extload even if sextload is not supported by target, then
if sext is the only user of extload, there is no big difference, no harm no benefit.
if extload has more than one user, the combined sextload may block extload from combining with other zext, causes extra zext instructions generated. As demonstrated by the attached test case.
This patch add the constraint that when sextload is not supported by target, sext can only be combined with extload if it is the only user of extload.
Differential Revision: https://reviews.llvm.org/D39108
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316802 91177308-0d34-0410-b5e6-96231b3b80d8
Not having the subclass data on an MemIntrinsicSDNodes means it was possible
to try to fold 2 nodes with the same operands but differing MMO flags. This
would trip an assertion when trying to refine the alignment between the 2
MachineMemOperands.
Differential Revision: https://reviews.llvm.org/D38898
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316737 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Currently we skip merging when extra moves may be added in the header of switch instead of the case block, if the case block is used as an incoming
block of a PHI. If all the incoming values of the PHIs are non-constants and the destination block is dominated by the switch block then extra moves are likely not added by ISel, so there is no need to skip merging in this case.
Reviewers: efriedma, junbuml, davidxl, hfinkel, qcolombet
Reviewed By: efriedma
Subscribers: dberlin, kuhar, mcrosier, llvm-commits
Differential Revision: https://reviews.llvm.org/D37343
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316711 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This seems to be the only place in llvm we directly call qsort. We can replace
this with a call to array_pod_sort. Also minor cleanup of the sorting function.
Reviewers: bkramer, Eugene.Zelenko, rafael
Reviewed By: bkramer
Subscribers: efriedma, llvm-commits
Differential Revision: https://reviews.llvm.org/D39214
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316671 91177308-0d34-0410-b5e6-96231b3b80d8
Use StringRef for CountingFunctionName, remove erroneous comment
copied from InstructionNamer, and drop some trailing whitespace.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316644 91177308-0d34-0410-b5e6-96231b3b80d8
Compute the actual decomposition only after deciding whether to expand
of not. Else, it's easy to make the compiler OOM with:
`memcpy(dst, src, 0xffffffffffffffff);`, which typically happens if
someone mistakenly passes a negative value. Add a test.
This reverts commit f8fc02fbd4ab33383c010d33675acf9763d0bd44.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316567 91177308-0d34-0410-b5e6-96231b3b80d8
Duplicated code found in three places put into a new static function:
/// Given a Count of resource usage and a Latency value, return true if a
/// SchedBoundary becomes resource limited.
static bool checkResourceLimit(unsigned LFactor, unsigned Count,
unsigned Latency) {
return (int)(Count - (Latency * LFactor)) > (int)LFactor;
}
Review: Florian Hahn, Matthias Braun
https://reviews.llvm.org/D39235
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316560 91177308-0d34-0410-b5e6-96231b3b80d8
This code added in r297930 assumed that it could create
a select with a condition type that is just an integer
bitcast of the selected type. For AMDGPU any vselect is
going to be scalarized (although the vector types are legal),
and all select conditions must be i1 (the same as getSetCCResultType).
This logic doesn't really make sense to me, but there's
never really been a consistent policy in what the select
condition mask type is supposed to be. Try to extend
the logic for skipping the transform for condition types
that aren't setccs. It doesn't seem quite right to me though,
but checking conditions that seem more sensible (like whether the
vselect is going to be expanded) doesn't work since this
seems to depend on that also.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316554 91177308-0d34-0410-b5e6-96231b3b80d8
Similar to how llvm::salvagDebugInfo hooks into InstCombine, this adds
a hook that can be invoked before an SDNode that is associated with an
SDDbgValue is erased to capture the effect of the deleted node in a
DIExpression.
The motivating example is an SDDebugValue attached to an ADD operation
that gets folded into a LOAD+OFFSET operation.
rdar://problem/32121503
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316525 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts commit r316417, which causes internal compiles to OOM.
I don't unfortunately have a self-contained test case but will follow up
with courbet.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316497 91177308-0d34-0410-b5e6-96231b3b80d8
This updates the MIRPrinter to include the regclass when printing
virtual register defs, which is already valid syntax for the
parser. That is, given 64 bit %0 and %1 in a "gpr" regbank,
%1(s64) = COPY %0(s64)
would now be written as
%1:gpr(s64) = COPY %0(s64)
While this change alone introduces a bit of redundancy with the
registers block, it allows us to update the tests to be more concise
and understandable and brings us closer to being able to remove the
registers block completely.
Note: We generally only print the class in defs, but there is one
exception. If there are uses without any defs whatsoever, we'll print
the class on all uses. I'm not completely convinced this comes up in
meaningful machine IR, but for now the MIRParser and MachineVerifier
both accept that kind of stuff, so we don't want to have a situation
where we can print something we can't parse.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316479 91177308-0d34-0410-b5e6-96231b3b80d8
Refactor ExpandMemcmp:
- Stop duplicating the logic for computation of the sequence of loads to
generate (thsi was done in three different places), this is now done
only once in MemCmpExpansion::MemCmpExpansion().
- Add a FIXME to expose a bug with the computation of the number of loads
when not all sizes are loadable. For example, on X86-32 + SSE, possible
loads are {16,4,2,1} bytes. The current code considers that all loads
starting at MaxLoadSize are possible. This is not an issue right now as
vector loads are not enabled, so I'm not fixing the issue here to keep
the change as small as possible. I'm going to address this in a
subsequent revision, where I enable vector loads.
See https://bugs.llvm.org/show_bug.cgi?id=34887
Differential Revision: https://reviews.llvm.org/D38498
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316417 91177308-0d34-0410-b5e6-96231b3b80d8
Infrastructure designed for padding code with nop instructions in key places such that preformance improvement will be achieved.
The infrastructure is implemented such that the padding is done in the Assembler after the layout is done and all IPs and alignments are known.
This patch by itself in a NFC. Future patches will make use of this infrastructure to implement required policies for code padding.
Reviewers:
aaboud
zvi
craig.topper
gadi.haber
Differential revision: https://reviews.llvm.org/D34393
Change-Id: I92110d0c0a757080a8405636914a93ef6f8ad00e
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316413 91177308-0d34-0410-b5e6-96231b3b80d8