Have ScalarEvolution::getRange re-consider cases like "{C?A:B,+,C?P:Q}"
by factoring out "C" and computing RangeOf{A,+,P} union RangeOf({B,+,Q})
instead.
The latter can be easier to compute precisely in cases like
"{C?0:N,+,C?1:-1}" N is the backedge taken count of the loop; since in
such cases the latter form simplifies to [0,N+1) union [0,N+1).
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262438 91177308-0d34-0410-b5e6-96231b3b80d8
As noted in the code comment, I don't think we can do the same transform that we do for
*scalar* integers comparisons to *vector* integers comparisons because it might pessimize
the general case.
Exhibit A for an incomplete integer comparison ISA remains x86 SSE/AVX: it only has EQ and GT
for integer vectors.
But we should now recognize all the variants of this construct and produce the optimal code
for the cases shown in:
https://llvm.org/bugs/show_bug.cgi?id=26701
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262424 91177308-0d34-0410-b5e6-96231b3b80d8
Summary: SampleProfile pass needs to be performed after InstructionCombiningPass, which helps eliminate un-inlinable function calls.
Reviewers: davidxl, dnovillo
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D17742
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262419 91177308-0d34-0410-b5e6-96231b3b80d8
On AMDGPU where operations i64 operations are often bitcasted to v2i32
and back, this pattern shows up regularly where it breaks some
expected combines on i64, such as load width reducing.
This fixes some test failures in a future commit when i64 loads
are changed to promote.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262397 91177308-0d34-0410-b5e6-96231b3b80d8
This isn't quite NFC because some of the SDLocs may change which could
cause scheduling differences. But no regression tests are affected and
there is no functional change intended.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262391 91177308-0d34-0410-b5e6-96231b3b80d8
Revert r262248 in an attempt to fix the clang-native-aarch64-full
bot and to investigate a performance regression in
SingleSource/Benchmarks/CoyoteBench/huffbench
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262388 91177308-0d34-0410-b5e6-96231b3b80d8
This reverts commit r262316.
It seems that my change breaks an out-of-tree chromium buildbot, so
I'm reverting this in order to investigate the situation further.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262387 91177308-0d34-0410-b5e6-96231b3b80d8
TableGen checks at compiletime that for scheduling models with
"CompleteModel = 1" one of the following holds:
- Is marked with the hasNoSchedulingInfo flag
- The instruction is a subclass of Sched
- There are InstRW definitions in the scheduling model
Typical steps necessary to complete a model:
- Ensure all pseudo instructions that are expanded before machine
scheduling (usually everything handled with EmitYYY() functions in
XXXTargetLowering).
- If a CPU does not support some instructions mark the corresponding
resource unsupported: "WriteRes<WriteXXX, []> { let Unsupported = 1; }".
- Add missing scheduling information.
Differential Revision: http://reviews.llvm.org/D17747
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262384 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Tablegen was unable to determine that param loads/stores were actually
reading or writing from memory. I think this isn't a problem in
practice for param stores, because those occur in a block right before
we make our call. But param loads don't have to at the very beginning
of a function, so should be annotated as mayLoad so we don't incorrectly
optimize them.
Reviewers: jholewinski
Subscribers: jholewinski, llvm-commits
Differential Revision: http://reviews.llvm.org/D17471
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262381 91177308-0d34-0410-b5e6-96231b3b80d8
Most portions of InstCombine properly propagate fast math flags, but
apparently the vector scalarization section was overlooked.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262376 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
Calls sometimes need to be convergent. This is already handled at the
LLVM IR level, but it also needs to be handled at the MI level.
Ideally we'd propagate convergence from instructions, down through the
selection DAG, and into MIs. But this is Hard, and would affect
optimizations in the SDNs -- right now only SDNs with two operands have
any flags at all.
Instead, here's a much simpler hack: Add new opcodes for NVPTX for
convergent calls, and generate these when lowering convergent LLVM
calls.
Reviewers: jholewinski
Subscribers: jholewinski, chandlerc, joker.eph, jhen, tra, llvm-commits
Differential Revision: http://reviews.llvm.org/D17423
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262373 91177308-0d34-0410-b5e6-96231b3b80d8
The _chkstk function is called by the compiler to probe the stack in an
order consistent with Windows' expectations. However, it is possible to
elide the call to _chkstk and manually adjust the stack pointer if we
can prove that the allocation is fixed size and smaller than the probe
size.
This shrinks chrome.dll, chrome_child.dll and chrome.exe by a
cummulative ~133 KB.
Differential Revision: http://reviews.llvm.org/D17679
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262370 91177308-0d34-0410-b5e6-96231b3b80d8
Code in visitEHPadPredecessors assume a little too much about the
validity of a cleanupret with an invalid cleanuppad operand.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262364 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This adds the beginning of an update API to preserve MemorySSA. In particular,
this patch adds a way to remove memory SSA accesses when instructions are
deleted.
It also adds relevant unit testing infrastructure for MemorySSA's API.
(There is an actual user of this API, i will make that diff dependent on this one. In practice, a ton of opt passes remove memory instructions, so it's hopefully an obviously useful API :P)
Reviewers: hfinkel, reames, george.burgess.iv
Subscribers: llvm-commits
Differential Revision: http://reviews.llvm.org/D17157
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262362 91177308-0d34-0410-b5e6-96231b3b80d8
This adds support to convert ProfileSummary object to Metadata and create a
ProfileSummary object from metadata. This would allow attaching profile summary
information to Module allowing optimization passes to use it.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262360 91177308-0d34-0410-b5e6-96231b3b80d8
This reduces the number of bitcast nodes and generally cleans up the
DAG when bitcasting between integers and vectors everywhere.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262358 91177308-0d34-0410-b5e6-96231b3b80d8
Summary:
This patch impleemnts DS_PERMUTE/DS_BPERMUTE instruction definitions and intrinsics,
which are new since VI.
Reviewers: tstellarAMD, arsenm
Subscribers: llvm-commits, arsenm
Differential Revision: http://reviews.llvm.org/D17614
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262356 91177308-0d34-0410-b5e6-96231b3b80d8
In the code below on 32-bit targets, x would previously get forwarded to g()
without sign-extension to 32 bits as required by the parameter attribute.
void g(signed short);
void f(unsigned short x) {
g(x);
}
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262352 91177308-0d34-0410-b5e6-96231b3b80d8
This patch fixes calculating correct value for builtin_object_size function
when pointer is used only in builtin_object_size function call and never
after that.
Patch by Strahinja Petrovic.
Differential Revision: http://reviews.llvm.org/D17337
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@262337 91177308-0d34-0410-b5e6-96231b3b80d8