Consider this code:
BB:
%i = phi i32 [ 0, %if.then ], [ %c, %if.else ]
%add = add nsw i32 %i, %b
...
In this common case the add can be moved to the %if.else basic block, because
adding zero is an identity operation. If we go through the %if.then branch it's
always a win, because the add is not executed; if not, the number of instructions
stays the same.
This pattern also applies to other instructions such as sub, shl, shr, ashr
(identity constant 0) and mul, sdiv, div (identity constant 1).
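A minimal sketch of the identity check described above, not the code from the patch. The Op enum and all function names are hypothetical, "shr" and "div" are assumed to mean logical shift right and unsigned division, and the rule that non-commutative operations only qualify when the phi feeds the right-hand operand is an assumption made for this illustration.

```cpp
// Hypothetical opcode enum for illustration only; not LLVM's own opcode type.
enum class Op { Add, Sub, Shl, LShr, AShr, Mul, SDiv, UDiv };

// True if `x op c == x` when the constant c is the right-hand operand.
bool isRightIdentity(Op op, int c) {
  switch (op) {
  case Op::Add: case Op::Sub: case Op::Shl: case Op::LShr: case Op::AShr:
    return c == 0;
  case Op::Mul: case Op::SDiv: case Op::UDiv:
    return c == 1;
  }
  return false;
}

// Only add and mul give the same result with their operands swapped.
bool isCommutative(Op op) { return op == Op::Add || op == Op::Mul; }

// True if the instruction using the phi is a no-op on the edge that
// supplies incomingConstant, so it only has to run on the other edge.
// phiIsRhs says whether the phi feeds the right-hand operand.
bool instIsNoOpOnEdge(Op op, int incomingConstant, bool phiIsRhs) {
  if (!phiIsRhs && !isCommutative(op))
    return false; // assumption: e.g. `0 - b` is not `b`
  return isRightIdentity(op, incomingConstant);
}
```

For the example above, instIsNoOpOnEdge(Op::Add, 0, false) holds, so the add would only be needed in %if.else.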
Patch by Jakub Kuderski!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244887 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts commit r244879, as it broke the test-suite on
SingleSource/Regression/C/2004-03-15-IndirectGoto in AArch64.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244885 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts commit r244880, as it broke the test-suite on
SingleSource/Regression/C/2004-03-15-IndirectGoto in AArch64.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244884 91177308-0d34-0410-b5e6-96231b3b80d8
in LoopIdiomRecognize. This is what started me staring at this code. Now
migrating it with the new AA stuff will be trivial.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244880 91177308-0d34-0410-b5e6-96231b3b80d8
simplified form to remove redundant checks and simplify the code for
popcount recognition. We don't actually need to handle all of these
cases.
I've left a FIXME for one in particular until I finish inspecting to
make sure we don't actually *rely* on the predicate in any way.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244879 91177308-0d34-0410-b5e6-96231b3b80d8
Most SSE/AVX (non-constant) vector shift instructions only use the lower 64 bits of the 128-bit shift amount vector operand. This patch calls SimplifyDemandedVectorElts to optimize for this.
I had to refactor some of my recent InstCombiner work on the vector shifts to avoid quite a bit of duplicate code. As a result, SimplifyX86immshift now (re)decodes the type of shift.
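To illustrate why the upper half of the shift amount is not demanded, here is a rough software model of a PSRLD-style logical right shift on four 32-bit lanes. The type and function names are hypothetical; this is a sketch of the instruction's behavior, not InstCombine code.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using V4i32 = std::array<uint32_t, 4>;
// 128-bit shift amount operand: [0] is the low 64 bits, [1] the high 64 bits.
using Count128 = std::array<uint64_t, 2>;

// Model of a PSRLD-style shift: every lane is shifted by the same amount,
// taken from the low 64 bits only. Amounts above 31 produce zero lanes.
V4i32 psrldModel(const V4i32 &v, const Count128 &count) {
  V4i32 out{};
  const uint64_t amt = count[0]; // count[1] is never read
  for (std::size_t i = 0; i < v.size(); ++i)
    out[i] = amt > 31 ? 0u : v[i] >> amt;
  return out;
}
```

Because count[1] never influences the result, whatever computes the upper elements of the shift amount vector can be simplified away.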
Differential Revision: http://reviews.llvm.org/D11938
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244872 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: This patch moves the check of OptimizeForSize before traversing over all basic blocks in current loop. If OptimizeForSize is set to true, no non-trivial unswitch is ever allowed. Therefore, the early exit will help reduce compilation time. This patch should be NFC.
Reviewers: reames, weimingz, broune
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D11997
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244868 91177308-0d34-0410-b5e6-96231b3b80d8
code into methods on LoopIdiomRecognize.
This simplifies the code somewhat and also makes it much easier to move
the analyses around. Ultimately, the separate class wasn't providing
significant value over methods -- it contained the precondition basic
block and the current loop. The current loop is already available and
the precondition block wasn't needed everywhere and is easy to pass
around.
In several cases I just moved things to be static functions because they
already accepted most of their inputs as arguments.
This doesn't fix the way we manage analyses yet, that will be the next
patch, but it already makes the code over 50 lines shorter.
No functionality changed.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244851 91177308-0d34-0410-b5e6-96231b3b80d8
complexity.
There is only one function that was called from multiple locations, and
that was 'getBranch' which has a reasonable one-line spelling already:
dyn_cast<BranchInst>(BB->getTerminator()). We could make this shorter, but
it doesn't seem to add much value. Instead, we should avoid calling it
so many times on the same basic blocks, but that will be in a subsequent
patch.
The other functions are only called in one location, so inline them
there, and take advantage of this to use direct early exit and reduce
indentation. This makes it much more clear what is being tested for, and
in fact makes it clear now to me that there are simpler ways to do this
work. However, this patch just does the mechanical inlining. I'll clean
up the functionality of the code to leverage loop simplified form more
effectively in a follow-up.
Despite lots of early line breaks due to early-exit, this is still
shorter than it was before.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244841 91177308-0d34-0410-b5e6-96231b3b80d8
a significant code cleanup here.
The handling of analyses in this pass is overly complex and can be
simplified significantly, but the right way to do that is to simplify
all of the code not just the analyses, and that'll require pretty
extensive edits that would be noisy with formatting changes mixed into
them.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244828 91177308-0d34-0410-b5e6-96231b3b80d8
To be clear: this is an *optimization* not a correctness change.
CodeGenPrep likes to duplicate icmps feeding branch instructions to take advantage of x86's ability to fuse many comparison/branch patterns into a single micro-op and to reduce the need for materializing i1s into general registers. PlaceSafepoints likes to place safepoint polls right at the end of basic blocks (immediately before terminators) when inserting entry and backedge safepoints. These two heuristics interact in a somewhat unfortunate way where the branch terminating the original block will be controlled by a condition driven by unrelocated pointers. This forces the register allocator to keep both the relocated and unrelocated values of the pointers feeding the icmp alive over the safepoint poll.
One simple fix would have been to just adjust PlaceSafepoints to move one back in the basic block, but you can reach similar cases as a result of LICM or other hoisting passes. As a result, doing a post insertion fixup seems to be more robust.
I considered doing this in CodeGenPrep itself, but having to update the live sets of already rewritten safepoints gets complicated fast. In particular, you can't just use def/use information because by moving the icmp, we're potentially extending the live range of its inputs.
Instead, this patch teaches RewriteStatepointsForGC to make the required adjustments before making the relocations explicit in the IR. This change really highlights the fact that RSForGC is a CodeGenPrep-like pass which is performing target specific lowering. In the long run, we may even want to combine the two though this would require a lot more smarts to be integrated into RSForGC first. We currently rely on being able to run a set of cleanup passes post rewriting because the IR RSForGC generates is pretty damn ugly.
Differential Revision: http://reviews.llvm.org/D11819
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244821 91177308-0d34-0410-b5e6-96231b3b80d8
When rewriting the IR such that base pointers are available for every live pointer, we potentially need to duplicate instructions to propagate the base. The original code had only handled PHI and Select under the belief those were the only instructions which would need to be duplicated. When I added support for vector instructions, I'd added a collection of hacks for ExtractElement which caught most of the common cases. Of course, I then found the one test case my hacks couldn't cover. :)
This change removes all of the early hacks for extract element. By defining extractelement as a BDV (rather than trying to look through it), we can extend the rewriting algorithm to duplicate the extract as needed. Note that a couple of peephole optimizations were left in for the moment, because while we now handle extractelement as a first class citizen, we're not yet handling insertelement. That change will follow in the near future.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244808 91177308-0d34-0410-b5e6-96231b3b80d8
I forgot to add these in r244780 and r244778. Sorry about that.
Also order the static dependencies in a lexicographical order.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244787 91177308-0d34-0410-b5e6-96231b3b80d8
AliasAnalysis.
Same as the other commits, the TLI access from an alias analysis is
going away and isn't very clean -- it is better to explicitly mark the
dependencies.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244785 91177308-0d34-0410-b5e6-96231b3b80d8
just depend on it directly.
This was particularly frustrating because there was a really wide
mixture of using a member variable and re-extracting it from the AA that
happened to be around. I think the result is much more clear.
I've also deleted all of the pointless null checks and used references
across the APIs where I could to make it explicit that this cannot be
null in a useful fashion.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244780 91177308-0d34-0410-b5e6-96231b3b80d8
r243382 changed the behavior to always require a set of memchecks to be
passed to LoopVer. This change restores the prior behavior as an
alternative to the new behavior. This allows the checks to be
implicitly taken from the LAA object.
Patch by Ashutosh Nema!
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244763 91177308-0d34-0410-b5e6-96231b3b80d8
As discussed in D11886, this patch moves the SSE/AVX vector blend folding to instcombiner from PerformINTRINSIC_WO_CHAINCombine (which allows us to remove this completely).
InstCombiner already had partial support for this, I just had to add support for zero (ConstantAggregateZero) masks and also the case where both selection inputs were the same (allowing us to ignore the mask).
I also moved all the relevant combine tests into InstCombine/blend_x86.ll
Differential Revision: http://reviews.llvm.org/D11934
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244723 91177308-0d34-0410-b5e6-96231b3b80d8
`InstCombiner::OptimizeOverflowCheck` was asserting an
invariant (operands to binary operations are ordered by decreasing
complexity) that wasn't really an invariant. Fix this by instead having
`InstCombiner::OptimizeOverflowCheck` establish the invariant if it does
not hold.
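The change from asserting to establishing the ordering can be sketched as follows. The Operand type and its complexity rank are hypothetical stand-ins, and the sketch assumes the operation is commutative so that swapping operands preserves its result; it is not the InstCombiner code.

```cpp
#include <utility>

// Hypothetical operand with a complexity rank (higher means more complex).
struct Operand {
  int complexity;
  int id;
};

// Instead of asserting lhs.complexity >= rhs.complexity, make it true.
// Only valid when the operation using these operands is commutative.
void orderByDecreasingComplexity(Operand &lhs, Operand &rhs) {
  if (lhs.complexity < rhs.complexity)
    std::swap(lhs, rhs);
}
```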
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244676 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: This patch adds a check for dead blocks and skips them in processSwitchInst(). This will help reduce compilation time.
Reviewers: reames, hans
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D11953
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244656 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: LowerSwitch crashed with the attached test case after deleting the default block. This happened because the current implementation of deleting dead blocks is wrong. After the default block is deleted, it contains no instructions or terminator, and it should not be traversed anymore. However, since the iterator is advanced before processSwitchInst() is executed, the block advanced to could be deleted inside processSwitchInst(). The deleted block would then be visited next and crash dyn_cast<SwitchInst>(Cur->getTerminator()) because Cur->getTerminator() returns nullptr. This patch fixes the problem by recording dead default blocks in a list and deleting them after all processSwitchInst() calls are done. It is still possible to visit dead default blocks and waste time processing them, but that is a compile-time issue, and I plan to have another patch to add support for skipping dead blocks.
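The deferred-deletion idea can be sketched outside of LLVM as follows. Block, processSwitch and lowerAll are hypothetical stand-ins for the real data structures and for processSwitchInst(); the point is only that nothing is destroyed while the traversal is still running.

```cpp
#include <algorithm>
#include <memory>
#include <unordered_set>
#include <vector>

// Hypothetical block: deadDefault is the default block that becomes dead
// once this block's switch has been lowered, if there is one.
struct Block {
  int id = 0;
  Block *deadDefault = nullptr;
};

// Stand-in for processSwitchInst(): records the dead default block
// instead of deleting it on the spot.
void processSwitch(Block &b, std::unordered_set<Block *> &deleteList) {
  if (b.deadDefault)
    deleteList.insert(b.deadDefault);
}

// Processes every block, then deletes the recorded blocks in one sweep.
// Returns the ids of the surviving blocks.
std::vector<int> lowerAll(std::vector<std::unique_ptr<Block>> &blocks) {
  std::unordered_set<Block *> deleteList;
  for (auto &b : blocks)
    processSwitch(*b, deleteList); // no block is destroyed in this loop
  blocks.erase(std::remove_if(blocks.begin(), blocks.end(),
                              [&](const std::unique_ptr<Block> &b) {
                                return deleteList.count(b.get()) != 0;
                              }),
               blocks.end());
  std::vector<int> ids;
  for (auto &b : blocks)
    ids.push_back(b->id);
  return ids;
}
```

As the summary notes, a block that is already on the delete list is still visited by the loop; skipping such blocks is a separate improvement.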
Reviewers: kariddi, resistor, hans, reames
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D11852
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244642 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
For LTO we need to enable this pass in the LTO pipeline,
as it is skipped during the "-flto -c" compile step (when PrepareForLTO is
set).
Reviewers: rnk
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D11919
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244622 91177308-0d34-0410-b5e6-96231b3b80d8
The select pattern recognition in ValueTracking (as used by InstCombine
and SelectionDAGBuilder) only knew about integer patterns. This teaches
it about minimum and maximum operations.
matchSelectPattern() has been extended to return a struct containing the
existing Flavor and a new enum defining the pattern's behavior when
given one NaN operand.
C minnum() is defined to return the non-NaN operand in this case, but
the idiomatic C "a < b ? a : b" would return the NaN operand.
ARM and AArch64 at least have different instructions for these different cases.
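The difference between the two forms can be seen with a small standalone example. selectMin is a hypothetical helper name; std::fmin is the C++ standard library counterpart of C fmin(), used here on the assumption that it has the NaN behavior the message attributes to minnum(). With the select form, which operand is returned for a NaN input depends on operand order.

```cpp
#include <cmath>
#include <limits>

// The idiomatic select form of minimum. Every comparison involving NaN is
// false, so the second operand is returned whenever either input is NaN.
double selectMin(double a, double b) { return a < b ? a : b; }

// Library minimum: returns the non-NaN operand when exactly one is NaN.
double libraryMin(double a, double b) { return std::fmin(a, b); }
```

With nan = std::numeric_limits<double>::quiet_NaN(), libraryMin(1.0, nan) yields 1.0 while selectMin(1.0, nan) yields NaN. This is why the pattern matcher has to report the NaN behavior along with the flavor.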
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244580 91177308-0d34-0410-b5e6-96231b3b80d8
This adds somewhat basic preparation functionality including:
- Formation of funclets via coloring basic blocks.
- Cloning of polychromatic blocks to ensure that funclets have unique
program counters.
- Demotion of values used between different funclets.
- Some amount of cleanup once we have removed predecessors from basic
blocks.
- Verification that we are left with a CFG that makes some amount of
sense.
N.B. Arguments and numbering still need to be done.
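The coloring step can be sketched as a worklist walk over the CFG. This is a simplified illustration under stated assumptions, not the pass itself: blocks are plain integers, the set of funclet entry blocks is given as an input, and colorBlocks is a hypothetical name. A block that ends up with more than one color is polychromatic and would need to be cloned.

```cpp
#include <map>
#include <set>
#include <utility>
#include <vector>

using BlockId = int;
using Colors = std::map<BlockId, std::set<BlockId>>;

// Assigns to each reachable block the set of funclets (named by their
// entry block) it can be reached from. The function entry counts as a
// funclet entry of its own.
Colors colorBlocks(const std::map<BlockId, std::vector<BlockId>> &succs,
                   const std::set<BlockId> &funcletEntries, BlockId entry) {
  Colors colors;
  std::vector<std::pair<BlockId, BlockId>> worklist;
  worklist.push_back(std::make_pair(entry, entry));
  while (!worklist.empty()) {
    BlockId block = worklist.back().first;
    BlockId color = worklist.back().second;
    worklist.pop_back();
    if (funcletEntries.count(block))
      color = block; // entering a new funclet starts a new color
    if (!colors[block].insert(color).second)
      continue; // already visited with this color
    auto it = succs.find(block);
    if (it == succs.end())
      continue;
    for (BlockId s : it->second)
      worklist.push_back(std::make_pair(s, color));
  }
  return colors;
}
```

For a CFG with edges 0->1, 0->2, 1->3 and 2->3, where block 2 is a funclet entry, block 3 receives the colors {0, 2} and is therefore polychromatic.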
Differential Revision: http://reviews.llvm.org/D11750
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244558 91177308-0d34-0410-b5e6-96231b3b80d8
This patch and a related clang patch solve the problem of having to explicitly enable analysis when specifying a loop hint pragma to get the diagnostics. Passing AlwaysPrint as the pass name (see below) causes the front-end to print the diagnostic if the user has specified '-Rpass-analysis' without an '=<target-pass>'. Users of loop hints can pass that compiler option without having to specify the pass and they will get diagnostics for only those loops with loop hints.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244555 91177308-0d34-0410-b5e6-96231b3b80d8
This patch moves checking the threshold of runtime pointer checks to the vectorization requirements (late diagnostics) and emits a diagnostic that informs the user the loop would be vectorized if not for exceeding the pointer-check threshold. Clang will also append the options that can be used to allow vectorization.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244523 91177308-0d34-0410-b5e6-96231b3b80d8
This patch moves the verification of fast-math to just before vectorization is done. This way we can tell clang to append the command line options that would allow floating-point commutativity. Specifically, those are enabling fast-math or specifying a loop hint.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244489 91177308-0d34-0410-b5e6-96231b3b80d8
Sometimes interleaving is not beneficial, as determined by the cost-model and sometimes it is disabled by a loop hint (by the user). This patch modifies the diagnostic messages to make it clear why interleaving wasn't done.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244485 91177308-0d34-0410-b5e6-96231b3b80d8
This change adds the unroll metadata "llvm.loop.unroll.enable" which directs
the optimizer to unroll a loop fully if the trip count is known at compile time, and
unroll partially if the trip count is not known at compile time. This differs from
"llvm.loop.unroll.full" which explicitly does not unroll a loop if the trip count is not
known at compile time.
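The difference between the two hints can be summarized as a small decision function. This is a sketch of the rule as stated above only: the enum and function names are hypothetical, and any size thresholds or other checks an unroller might apply are left out.

```cpp
// Hypothetical names for the two metadata hints and the possible outcomes.
enum class UnrollHint { Enable, Full }; // llvm.loop.unroll.enable / .full
enum class UnrollAction { FullUnroll, PartialUnroll, NoUnroll };

// tripCountKnown: whether the trip count is known at compile time.
UnrollAction decideUnroll(UnrollHint hint, bool tripCountKnown) {
  if (tripCountKnown)
    return UnrollAction::FullUnroll; // both hints unroll fully here
  // Unknown trip count: "enable" falls back to partial unrolling,
  // while "full" explicitly does not unroll.
  return hint == UnrollHint::Enable ? UnrollAction::PartialUnroll
                                    : UnrollAction::NoUnroll;
}
```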
The "llvm.loop.unroll.enable" is intended to be added for loops annotated with
"#pragma unroll".
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244466 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This adds a hook to TTI which enables us to selectively turn on by default
interleaved access vectorization for targets on which we have performed
the required benchmarking.
Reviewers: rengolin
Subscribers: rengolin, llvm-commits
Differential Revision: http://reviews.llvm.org/D11901
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244449 91177308-0d34-0410-b5e6-96231b3b80d8
The scalarizer can cache incorrect entries when walking up a chain of
insertelement instructions. This occurs when it encounters more than one
instruction that it is not actively searching for, as it unconditionally caches
every element it finds. The fix is to only cache the first element that it
isn't searching for so we don't overwrite correct entries.
Reviewers: hfinkel
Differential Revision: http://reviews.llvm.org/D11559
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@244448 91177308-0d34-0410-b5e6-96231b3b80d8