Summary:
LSV wants to know the maximum size that can be loaded to a vector register.
On X86, this always matches the maximum register width. Implement this
accordingly and add a test to make sure that LSV can vectorize up to the
maximum permissible width on X86.
Reviewers: delena, arsenm
Reviewed By: arsenm
Subscribers: wdng, llvm-commits
Differential Revision: https://reviews.llvm.org/D31504
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299589 91177308-0d34-0410-b5e6-96231b3b80d8
This test case depends on the loop being vectorized without forcing the
vectorization factor. If the profitability ever changes in the future (due to
cost model improvements), the test may no longer work as intended. Instead of
checking the resulting IR, we should just check the instruction costs. The
costs will be computed regardless if vectorization is profitable.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299545 91177308-0d34-0410-b5e6-96231b3b80d8
This is a latent bug that's been hanging around for a while. For a loop-invariant
pointer, expandBounds would return the range {Ptr, Ptr}, but this was interpreted
as a half-open range, not a closed range. So we ended up planting incorrect
bounds checks. Even worse, they were tautological, so we ended up incorrectly
executing the optimized loop.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299526 91177308-0d34-0410-b5e6-96231b3b80d8
Fix a bug in ARC contract pass where an iterator that pointed to a
deleted instruction was dereferenced.
It appears that tryToContractReleaseIntoStoreStrong was incorrectly
assuming that a call to objc_retain would not immediately follow a call
to objc_release.
rdar://problem/25276306
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299507 91177308-0d34-0410-b5e6-96231b3b80d8
This patch optimizes two memory intrinsic operations: memset and memcpy based
on the profiled size of the operation. The high level transformation is like:
mem_op(..., size)
==>
switch (size) {
case s1:
mem_op(..., s1);
goto merge_bb;
case s2:
mem_op(..., s2);
goto merge_bb;
...
default:
mem_op(..., size);
goto merge_bb;
}
merge_bb:
Differential Revision: http://reviews.llvm.org/D28966
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299446 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Add a hook for simplification of shufflevector's with the following rules:
- Constant folding - NFC, as it was already being done by the default handler.
- If only one of the operands is constant, constant fold the shuffle if the
mask does not select elements from the variable operand - to show the hook is firing and affecting the test-cases.
Reviewers: RKSimon, craig.topper, spatel, sanjoy, nlopes, majnemer
Reviewed By: spatel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D31525
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299393 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Depends on D30928.
This adds support for coercion of stores and memory instructions that do not require insertion to process.
Another few tests down.
I added the relevant tests from rle.ll
Reviewers: davide
Subscribers: llvm-commits, Prazek
Differential Revision: https://reviews.llvm.org/D30929
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299330 91177308-0d34-0410-b5e6-96231b3b80d8
Disable bypassing if one of the operands looks like a hash value. Slow
division often occurs in hashtable implementations and fast division is
never taken there because a hash value is extremely unlikely to have
enough upper bits set to zero.
A value is considered to be hash-like if it is produced by
1) XOR operation
2) Multiplication by a constant wider than the shorter type
3) PHI node with all incoming values being hash-like
Differential Revision: https://reviews.llvm.org/D28200
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299329 91177308-0d34-0410-b5e6-96231b3b80d8
A common way to implement nearbyint is by fiddling with the floating
point environment and calling rint. This is used at least by the BSD
libm and musl. As such, canonicalizing the latter to the former will
create infinite loops for libm and generally pessimize performance, at
least when the generic C versions are used.
This change preserves the rint in the libcall translation and also
handles the domain truncation logic, so that rint with float argument
will be reduced to rintf etc.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299247 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: Currently the VP metadata was dropped when InstCombine converts a call to direct call. This patch converts the VP metadata to branch_weights so that its hotness is recorded.
Reviewers: eraman, davidxl
Reviewed By: davidxl
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D31344
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299228 91177308-0d34-0410-b5e6-96231b3b80d8
Adding some test-cases demonstrating cases that need to be improved.
To be followed by patches that improve these cases.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299189 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Triggered by commit r298620: "[LV] Vectorize GEPs".
If we encounter a vector GEP with scalar arguments, we splat the scalar
into a vector of appropriate size before we scatter the argument.
Reviewers: arsenm, mehdi_amini, bkramer
Reviewed By: arsenm
Subscribers: bjope, mssimpso, wdng, llvm-commits
Differential Revision: https://reviews.llvm.org/D31416
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299186 91177308-0d34-0410-b5e6-96231b3b80d8
Some of the GEP combines (e.g., descaling) can't handle vector GEPs. We have an
existing check that attempts to bail out if given a vector GEP. However, the
check only tests the GEP's pointer operand. A GEP results in a vector of
pointers if at least one of its operands is vector-typed (e.g., its pointer
operand could be a scalar, but its index could be a vector). We should just
check the type of the GEP itself. This should fix PR32414.
Reference: https://bugs.llvm.org/show_bug.cgi?id=32414
Differential Revision: https://reviews.llvm.org/D31470
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@299017 91177308-0d34-0410-b5e6-96231b3b80d8
The vectorizer tries to replace truncations of induction variables with new
induction variables having the smaller type. After r295063, this optimization
was applied to all integer induction variables, including non-primary ones.
When optimizing the truncation of a non-primary induction variable, we still
need to transform the new induction so that it has the correct start value.
This should fix PR32419.
Reference: https://bugs.llvm.org/show_bug.cgi?id=32419
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298882 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
We are incorrectly folding selects into phi nodes when the incoming value of a phi
node is a constant vector. This optimization is done in `FoldOpIntoPhi` when the
select condition is a phi node with constant incoming values.
Without the fix, we are miscompiling (i.e. incorrectly folding the
select into the phi node) when the vector contains non-zero
elements.
This patch fixes the miscompile and we will correctly fold based on the
select vector operand (see added test cases).
Reviewers: majnemer, sanjoy, spatel
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D31189
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298845 91177308-0d34-0410-b5e6-96231b3b80d8
The first variant contains all current transformations except
transforming switches into lookup tables. The second variant
contains all current transformations.
The switch-to-lookup-table conversion results in code that is more
difficult to analyze and optimize by other passes. Most importantly,
it can inhibit Dead Code Elimination. As such it is often beneficial to
only apply this transformation very late. A common example is inlining,
which can often result in range restrictions for the switch expression.
Changes in execution time according to LNT:
SingleSource/Benchmarks/Misc/fp-convert +3.03%
MultiSource/Benchmarks/ASC_Sequoia/CrystalMk/CrystalMk -11.20%
MultiSource/Benchmarks/Olden/perimeter/perimeter -10.43%
and a couple of smaller changes. For perimeter it also results 2.6%
a smaller binary.
Differential Revision: https://reviews.llvm.org/D30333
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298799 91177308-0d34-0410-b5e6-96231b3b80d8
This moves it to the iterator facade utilities giving it full random
access semantics, etc. It can also now be used with standard algorithms
like std::all_of and std::any_of and range adaptors like llvm::reverse.
Also make the semantics of iterating match what every other iterator
uses and forbid decrementing past the begin iterator. This was used as
a hacky way to work around iterator invalidation. However, every
instance trying to do this failed to actually avoid touching invalid
iterators despite the clear documentation that the removed and all
subsequent iterators become invalid including the end iterator. So I've
added a return of the next iterator to removeCase and rewritten the
loops that were doing this to correctly follow the iterator pattern of
either incremneting or removing and assigning fresh values to the
iterator and the end.
In one case we were trying to go backwards to make this cleaner but it
doesn't actually work. I've made that code match the code we use
everywhere else to remove cases as we iterate. This changes the order of
cases in one test output and I moved that test to CHECK-DAG so it
wouldn't care -- the order isn't semantically meaningful anyways.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298791 91177308-0d34-0410-b5e6-96231b3b80d8
Reason: breaks linking Chromium with LLD + ThinLTO (a pass crashes)
LLVM bug: https://bugs.llvm.org//show_bug.cgi?id=32413
Original change description:
[LV] Vectorize GEPs
This patch adds support for vectorizing GEPs. Previously, we only generated
vector GEPs on-demand when creating gather or scatter operations. All GEPs from
the original loop were scalarized by default, and if a pointer was to be stored
to memory, we would have to build up the pointer vector with insertelement
instructions.
With this patch, we will vectorize all GEPs that haven't already been marked
for scalarization.
The patch refines collectLoopScalars to more exactly identify the scalar GEPs.
The function now more closely resembles collectLoopUniforms. And the patch
moves vector GEP creation out of vectorizeMemoryInstruction and into the main
vectorization loop. The vector GEPs needed for gather and scatter operations
will have already been generated before vectoring the memory accesses.
Original Differential Revision: https://reviews.llvm.org/D30710
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298735 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: In DeadArgumentElimination, the call instructions will be replaced. We also need to set the prof weights so that function inlining can find the correct profile.
Reviewers: eraman
Reviewed By: eraman
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D31143
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298660 91177308-0d34-0410-b5e6-96231b3b80d8
Library functions can have specific semantics that affect the behavior of
certain passes. DSE, for instance, gives special treatment to malloc-ed pointers
but not to pointers returned from an equivalently typed (but differently named)
function.
MetaRenamer ought not to alter program semantics, so library functions must
remain untouched.
Reviewers: mehdi_amini, majnemer, chandlerc, davide
Reviewed By: davide
Subscribers: davide, llvm-commits
Differential Revision: https://reviews.llvm.org/D31304
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298659 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: The current prefix based function layout algorithm only looks at function's entry count, which is not sufficient. A function should be grouped together if its entry count or any call edge count is hot.
Reviewers: davidxl, eraman
Reviewed By: eraman
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D31225
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@298656 91177308-0d34-0410-b5e6-96231b3b80d8