Commit Graph

1480 Commits

Author SHA1 Message Date
Max Kazantsev
99dd56e4a9 [SCEV] Fix predicate usage in computeExitLimitFromICmp
In this method, we invoke `SimplifyICmpOperands` which takes the `Cond` predicate
by reference and may change it along with `LHS` and `RHS` SCEVs. But then we invoke
`computeShiftCompareExitLimit` with Values from which the SCEVs have been derived,
these Values have not been modified while `Cond` could be.

One of possible outcomes of this is that we may falsely prove that an infinite loop ends
within some finite number of iterations.

In this patch, we save the original `Cond` and pass it along with original operands.
This logic may be removed in future once `computeShiftCompareExitLimit` works
with SCEVs instead of value operands.

Reviewed By: sanjoy
Differential Revision: https://reviews.llvm.org/D40953


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320142 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-08 12:19:45 +00:00
Craig Topper
3c8bef9341 [X86] Promote fp_to_sint v16f32->v16i16/v16i8 to avoid scalarization.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319266 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-29 00:32:09 +00:00
Max Kazantsev
f66b47b328 [SCEV] Strengthen variance condition in calculateLoopDisposition
Given loops `L1` and `L2` with AddRecs `AR1` and `AR2` varying in them respectively.
When identifying loop disposition of `AR2` w.r.t. `L1`, we only say that it is varying if
`L1` contains `L2`. But there is also a possible situation where `L1` and `L2` are
consecutive sibling loops within the parent loop. In this case, `AR2` is also varying
w.r.t. `L1`, but we don't correctly identify it.

It can lead, for exaple, to attempt of incorrect folding. Consider:
  AR1 = {a,+,b}<L1>
  AR2 = {c,+,d}<L2>
  EXAR2 = sext(AR1)
  MUL = mul AR1, EXAR2
If we incorrectly assume that `EXAR2` is invariant w.r.t. `L1`, we can end up trying to
construct something like: `{a * {c,+,d}<L2>,+,b * {c,+,d}<L2>}<L1>`, which is incorrect
because `AR2` is not available on entrance of `L1`.

Both situations "`L1` contains `L2`" and "`L1` preceeds sibling loop `L2`" can be handled
with one check: "header of `L1` dominates header of `L2`". This patch replaces the old
insufficient check with this one.

Differential Revision: https://reviews.llvm.org/D39453


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@318819 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-22 06:21:39 +00:00
Hiroshi Yamauchi
09128909d6 Add heuristics for irreducible loop metadata under PGO
Summary:
Add the following heuristics for irreducible loop metadata:

- When an irreducible loop header is missing the loop header weight metadata,
  give it the minimum weight seen among other headers.
- Annotate indirectbr targets with the loop header weight metadata (as they are
  likely to become irreducible loop headers after indirectbr tail duplication.)

These greatly improve the accuracy of the block frequency info of the Python
interpreter loop (eg. from ~3-16x off down to ~40-55% off) and the Python
performance (eg. unpack_sequence from ~50% slower to ~8% faster than GCC) due to
better register allocation under PGO.

Reviewers: davidxl

Reviewed By: davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D39980

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@318693 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-20 21:03:38 +00:00
Mohammed Agabaria
1693ef3860 [LV][X86] Support of AVX2 Gathers code generation and update the LV with this
This patch depends on: https://reviews.llvm.org/D35348

Support of pattern selection of masked gathers of AVX2 (X86\AVX2 code gen)
Update LoopVectorize to generate gathers for AVX2 processors.

Reviewers: delena, zvi, RKSimon, craig.topper, aaboud, igorb

Reviewed By: delena, RKSimon

Differential Revision: https://reviews.llvm.org/D35772


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@318641 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-20 08:18:12 +00:00
Yaxun Liu
35ad126c78 Let llvm.invariant.group.barrier accepts pointer to any address space
llvm.invariant.group.barrier may accept pointers to arbitrary address space.

This patch let it accept pointers to i8 in any address space and returns
pointer to i8 in the same address space.

Differential Revision: https://reviews.llvm.org/D39973


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@318413 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-16 16:32:16 +00:00
Mohammed Agabaria
89ec3b09eb [TTI][X86] update costs of interleaved load\store of i64\double
This patch contains more accurate cost of interelaved load\store of stride 2 for the types int64\double on AVX2.

Reviewers: delena, RKSimon, craig.topper, dorit

Reviewed By: dorit

Differential Revision: https://reviews.llvm.org/D40008



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@318385 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-16 09:38:32 +00:00
Mikael Holmen
c14b5c3b1e [Lint] Don't warn about passing alloca'd value to tail call if using byval
Summary:
This fixes PR35241.

When using byval, the data is effectively copied as part of the call
anyway, so the pointer returned by the alloca will not be leaked to the
callee and thus there is no reason to issue a warning.

Reviewers: rnk

Reviewed By: rnk

Subscribers: Ka-Ka, llvm-commits

Differential Revision: https://reviews.llvm.org/D40009

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@318279 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-15 07:46:48 +00:00
Hiroshi Yamauchi
7f62a83c8d Simplify irreducible loop metadata test code.
Summary:
Shorten the irreducible loop metadata test code by removing insignificant
instructions.

Reviewers: davidxl

Reviewed By: davidxl

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D40043

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@318182 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-14 19:48:59 +00:00
Jatin Bhateja
5afa924e80 [SCEV] Handling for ICmp occuring in the evolution chain.
Summary:
 If a compare instruction is same or inverse of the compare in the
 branch of the loop latch, then return a constant evolution node.
 This shall facilitate computations of loop exit counts in cases
 where compare appears in the evolution chain of induction variables.

 Will fix PR 34538

Reviewers: sanjoy, hfinkel, junryoungju

Reviewed By: sanjoy, junryoungju

Subscribers: javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D38494

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@318050 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-13 16:43:24 +00:00
Mohammed Agabaria
84066df4d9 [LV][X86] update the cost of interleaving mem. access of floats
Recommit:
This patch contains update of the costs of interleaved loads of v8f32 of stride 3 and 8.
fixed the location of the lit test it works with make check-all.

Differential Revision: https://reviews.llvm.org/D39403



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317471 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-06 10:56:20 +00:00
Mohammed Agabaria
08eb02abdb [REVERT][LV][X86] update the cost of interleaving mem. access of floats
reverted my changes will be committed later after fixing the failure
This patch contains update of the costs of interleaved loads of v8f32 of stride 3 and 8.

Differential Revision: https://reviews.llvm.org/D39403



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317433 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-05 09:36:54 +00:00
Mohammed Agabaria
1511d2e7fe [LV][X86] update the cost of interleaving mem. access of floats
This patch contains update of the costs of interleaved loads of v8f32 of stride 3 and 8.

Differential Revision: https://reviews.llvm.org/D39403



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317432 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-05 09:06:23 +00:00
Hiroshi Yamauchi
dd33e177dd Irreducible loop metadata for more accurate block frequency under PGO.
Summary:
Currently the block frequency analysis is an approximation for irreducible
loops.

The new irreducible loop metadata is used to annotate the irreducible loop
headers with their header weights based on the PGO profile (currently this is
approximated to be evenly weighted) and to help improve the accuracy of the
block frequency analysis for irreducible loops.

This patch is a basic support for this.

Reviewers: davidxl

Reviewed By: davidxl

Subscribers: mehdi_amini, llvm-commits, eraman

Differential Revision: https://reviews.llvm.org/D39028

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317278 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-02 22:26:51 +00:00
Yichao Yu
52f6f2ce7b Allow inaccessiblememonly and inaccessiblemem_or_argmemonly to be overwriten on call site with operand bundle
Summary:
Similar to argmemonly, readonly and readnone.

Fix PR35128

Reviewers: andrew.w.kaylor, chandlerc, hfinkel

Reviewed By: hfinkel

Subscribers: hfinkel, llvm-commits

Differential Revision: https://reviews.llvm.org/D39434

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317201 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-02 12:18:33 +00:00
Geoff Berry
4bf7c62ec4 [BranchProbabilityInfo] Handle irreducible loops.
Summary:
Compute the strongly connected components of the CFG and fall back to
use these for blocks that are in loops that are not detected by
LoopInfo when computing loop back-edge and exit branch probabilities.

Reviewers: dexonsmith, davidxl

Subscribers: mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D39385

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@317094 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-01 15:16:50 +00:00
Sanjoy Das
b500839596 [SCEV] Fix an assertion failure in the max backedge taken count
Max backedge taken count is always expected to be a constant; and this is
usually true by construction -- it is a SCEV expression with constant inputs.
However, if the max backedge expression ends up being computed to be a udiv with
a constant zero denominator[0], SCEV does not fold the result to a constant
since there is no constant it can fold it to (SCEV has no representation for
"infinity" or "undef").

However, in computeMaxBECountForLT we already know the denominator is positive,
and thus at least 1; and we can use this fact to avoid dividing by zero.

[0]: We can end up with a constant zero denominator if the signed range of the
stride is more precise than the unsigned range.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316615 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-25 21:41:00 +00:00
Bjorn Pettersson
e1bccd3b6e [ConstantFolding] Avoid assert when folding ptrtoint of vectorized GEP
Summary:
Got asserts in llvm::CastInst::getCastOpcode saying:
`DestBits == SrcBits && "Illegal cast to vector (wrong type or size)"' failed.

Problem seemed to be that llvm::ConstantFoldCastInstruction did
not handle ptrtoint cast of a getelementptr returning a vector
correctly. I assume such situations are quite rare, since the
GEP needs to be considered as a constant value (base pointer
being null).
The solution used here is to simply avoid the constant fold
of ptrtoint when the value is a vector. It is not supported,
and by bailing out we do not fail on assertions later on.

Reviewers: craig.topper, majnemer, davide, filcab, efriedma

Reviewed By: efriedma

Subscribers: efriedma, filcab, llvm-commits

Differential Revision: https://reviews.llvm.org/D38546

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316430 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-24 12:08:11 +00:00
Sanjoy Das
b5cb868aaa Revert "[ScalarEvolution] Handling for ICmp occuring in the evolution chain."
This reverts commit r316054.  There was some confusion over the review process:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20171016/495884.html

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316129 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-18 22:00:57 +00:00
Michael Zuckerman
a631f93f8a [AVX512][AVX2]Cost calculation for interleave load/store patterns {v8i8,v16i8,v32i8,v64i8}
This patch adds accurate instructions cost.
The formula presents two cases(stride 3 and stride 4) and calculates the cost according to the VF and stride.

Reviewers:
1. delena
2. Farhana
3. zvi
4. dorit
5. Ayal

Differential Revision: https://reviews.llvm.org/D38762

Change-Id: If4cfbd4ac0e63694e8144cb78c7fa34850647ff7

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316072 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-18 11:41:55 +00:00
Jatin Bhateja
b8354dd5d8 [ScalarEvolution] Handling for ICmp occuring in the evolution chain.
Summary:
 If a compare instruction is same or inverse of the compare in the
 branch of the loop latch, then return a constant evolution node.
 Currently scope of evaluation is limited to SCEV computation for
 PHI nodes.

 This shall facilitate computations of loop exit counts in cases
 where compare appears in the evolution chain of induction variables.

 Will fix PR 34538
Reviewers: sanjoy, hfinkel, junryoungju

Reviewed By: junryoungju

Subscribers: javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D38494

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@316054 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-18 01:36:16 +00:00
Anna Thomas
7f1eb15289 [SCEV] Teach SCEV to find maxBECount when loop endbound is variant
Summary:
This patch teaches SCEV to calculate the maxBECount when the end bound
of the loop can vary. Note that we cannot calculate the exactBECount.

This will only be done when both conditions are satisfied:
1. the loop termination condition is strictly LT.
2. the IV is proven to not overflow.

This provides more information to users of SCEV and can be used to
improve identification of finite loops.

Reviewers: sanjoy, mkazantsev, silviu.baranga, atrick

Reviewed by: mkazantsev

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38825

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315683 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-13 14:30:43 +00:00
Daniel Jasper
e9712285e9 Revert r314923: "Recommit : Use the basic cost if a GEP is not used as addressing mode"
Significantly reduces performancei (~30%) of gipfeli
(https://github.com/google/gipfeli)

I have not yet managed to reproduce this regression with the open-source
version of the benchmark on github, but will work with others to get a
reproducer to you later today.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315680 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-13 14:04:21 +00:00
Sanjay Patel
dc813ccd3f [ValueTracking] return zero when there's conflict in known bits of a shift (PR34838)
Poison allows us to return a better result than undef.


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@315595 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-12 17:31:46 +00:00
Justin Lebar
d1dce4497d Convert an APInt to int64_t properly in TTI::getGEPCost().
Summary:
If the pointer width is 32 bits and the calculated GEP offset is
negative, we call APInt::getLimitedValue(), which does a
*zero*-extension of the offset.  That's wrong -- we should do an sext.

Fixes a bug introduced in rL314362 and found by Evgeny Astigeevich.

Reviewers: efriedma

Subscribers: sanjoy, javed.absar, llvm-commits, eastig

Differential Revision: https://reviews.llvm.org/D38557

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314935 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-04 20:47:33 +00:00
Guozhi Wei
7afcace1cd [TargetTransformInfo] Check if function pointer is valid before calling isLoweredToCall
Function isLoweredToCall can only accept non-null function pointer, but a function pointer can be null for indirect function call. So check it before calling isLoweredToCall from getInstructionLatency.

Differential Revision: https://reviews.llvm.org/D38204



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314927 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-04 20:14:08 +00:00
Jun Bum Lim
e3f6227d56 Recommit : Use the basic cost if a GEP is not used as addressing mode
Recommitting r314517 with the fix for handling ConstantExpr.

Original commit message:
  Currently, getGEPCost() returns TCC_FREE whenever a GEP is a legal addressing
  mode in the target. However, since it doesn't check its actual users, it will
  return FREE even in cases where the GEP cannot be folded away as a part of
  actual addressing mode. For example, if an user of the GEP is a call
  instruction taking the GEP as a parameter, then the GEP may not be folded in
  isel.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314923 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-04 18:33:52 +00:00
Mikael Holmen
42c9fd02c2 [Lint] Avoid failed assertion by fetching the proper pointer type
Summary:
When checking if a constant expression is a noop cast we fetched the
IntPtrType by doing DL->getIntPtrType(V->getType())). However, there can
be cases where V doesn't return a pointer, and then getIntPtrType()
triggers an assertion.

Now we pass DataLayout to isNoopCast so the method itself can determine
what the IntPtrType is.

Reviewers: arsenm

Reviewed By: arsenm

Subscribers: wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D37894

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314763 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-03 06:03:49 +00:00
Craig Topper
4ca8fdba4d [X86] Add AVX512 check lines to the cost model truncate test.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314758 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-03 03:47:34 +00:00
Davide Italiano
0e47622a91 [PassManager] Retire cl::opt that have been set for a while. NFCI.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314740 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-02 23:39:20 +00:00
Adrian Prantl
733fe2f231 Move the stripping of invalid debug info from the Verifier to AutoUpgrade.
This came out of a recent discussion on llvm-dev
(https://reviews.llvm.org/D38042). Currently the Verifier will strip
the debug info metadata from a module if it finds the dbeug info to be
malformed. This feature is very valuable since it allows us to improve
the Verifier by making it stricter without breaking bcompatibility,
but arguable the Verifier pass should not be modifying the IR. This
patch moves the stripping of broken debug info into AutoUpgrade
(UpgradeDebugInfo to be precise), which is a much better location for
this since the stripping of malformed (i.e., produced by older, buggy
versions of Clang) is a (harsh) form of AutoUpgrade.

This change is mostly NFC in nature, the one big difference is the
behavior when LLVM module passes are introducing malformed debug
info. Prior to this patch, a NoAsserts build would have printed a
warning and stripped the debug info, after this patch the Verifier
will report a fatal error. I believe this behavior is actually more
desirable anyway.

Differential Revision: https://reviews.llvm.org/D38184

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314699 91177308-0d34-0410-b5e6-96231b3b80d8
2017-10-02 18:31:29 +00:00
Daniel Jasper
aaae73db73 Revert r314435: "[JumpThreading] Preserve DT and LVI across the pass"
Causes a segfault on a builtbot (and in our internal bootstrapping of
Clang). See Eli's response on the commit thread.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314589 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-30 11:57:19 +00:00
Alex Shlyapnikov
682384e698 Revert "Use the basic cost if a GEP is not used as addressing mode"
This reverts commit r314517.

This commit crashes sanitizer bots, for example:
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux/builds/4167

Stack snippet:
...
/mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/include/llvm/Support/Casting.h:255:0
llvm::TargetTransformInfoImplCRTPBase<llvm::X86TTIImpl>::getGEPCost(llvm::GEPOperator const*, llvm::ArrayRef<llvm::Value const*>)
/mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/include/llvm/Analysis/TargetTransformInfoImpl.h:742:0
llvm::TargetTransformInfoImplCRTPBase<llvm::X86TTIImpl>::getUserCost(llvm::User const*, llvm::ArrayRef<llvm::Value const*>)
/mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/include/llvm/Analysis/TargetTransformInfoImpl.h:782:0
/mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/lib/Analysis/TargetTransformInfo.cpp:116:0
/mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/include/llvm/ADT/SmallVector.h:116:0
/mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/include/llvm/ADT/SmallVector.h:343:0
/mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/include/llvm/ADT/SmallVector.h:864:0
/mnt/b/sanitizer-buildbot1/sanitizer-x86_64-linux/build/llvm/include/llvm/Analysis/TargetTransformInfo.h:285:0
...

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314560 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-29 22:04:45 +00:00
Jun Bum Lim
fd8d5dac84 Use the basic cost if a GEP is not used as addressing mode
Summary:
Currently, getGEPCost() returns TCC_FREE whenever a GEP is a legal addressing mode in the target.
However, since it doesn't check its actual users, it will return FREE even in cases
where the GEP cannot be folded away as a part of actual addressing mode.
For example, if an user of the GEP is a call instruction taking the GEP as a parameter,
then the GEP may not be folded in isel.

Reviewers: hfinkel, efriedma, mcrosier, jingyue, haicheng

Reviewed By: hfinkel

Subscribers: javed.absar, llvm-commits

Differential Revision: https://reviews.llvm.org/D38085

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314517 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-29 14:50:16 +00:00
Evandro Menezes
a10cceeff4 [JumpThreading] Preserve DT and LVI across the pass
JumpThreading now preserves dominance and lazy value information across the
entire pass.  The pass manager is also informed of this preservation with
the goal of DT and LVI being recalculated fewer times overall during
compilation.

This change prepares JumpThreading for enhanced opportunities; particularly
those across loop boundaries.

Patch by: Brian Rzycki <b.rzycki@samsung.com>,
          Sebastian Pop <s.pop@samsung.com>

Differential revision: https://reviews.llvm.org/D37528

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314435 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-28 17:24:40 +00:00
Justin Lebar
2036b38fe1 Check for overflows when calculating the offset in GetGEPCost.
Summary:
This avoids C++ UB if the GEP is weird and the calculation overflows
int64_t, and it's also observable in the cost model's results.

Such GEPs are almost surely not valid pointers, but LLVM nonetheless
generates them sometimes.

Reviewers: sanjoy

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D38337

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314362 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-27 23:16:56 +00:00
Guozhi Wei
55d3e2aced [TargetTransformInfo] Handle intrinsic call in getInstructionLatency()
Usually an intrinsic is a simple target instruction, it should have a small latency. A real function call has much larger latency. So handle the intrinsic call in function getInstructionLatency().

Differential Revision: https://reviews.llvm.org/D38104



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@314003 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-22 18:25:53 +00:00
Guozhi Wei
8ef7e4db00 [TargetTransformInfo] Static alloca has 0 cost
Static alloca usually doesn't generate any machine instructions, so it has 0 cost.

Differential Revision: https://reviews.llvm.org/D37879



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313410 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-15 22:28:12 +00:00
Guozhi Wei
5cf5798b90 [TargetTransformInfo] Detect 0 latency instructions
For instructions that unlikely generate machine instructions, they should also have 0 latency.

Differential Revision: https://reviews.llvm.org/D37833



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313288 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-14 19:20:02 +00:00
Silviu Baranga
1ece28eb77 [LAA] Allow more run-time alias checks by coercing pointer expressions to AddRecExprs
Summary:
LAA can only emit run-time alias checks for pointers with affine AddRec
SCEV expressions. However, non-AddRecExprs can be now be converted to
affine AddRecExprs using SCEV predicates.

This change tries to add the minimal set of SCEV predicates in order
to enable run-time alias checking.

Reviewers: anemet, mzolotukhin, mkuper, sanjoy, hfinkel

Reviewed By: hfinkel

Subscribers: mssimpso, Ayal, dorit, roman.shirokiy, mzolotukhin, llvm-commits

Differential Revision: https://reviews.llvm.org/D17080

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@313012 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-12 07:48:22 +00:00
Guozhi Wei
19969b8b8f [TargetTransformInfo] Add a new public interface getInstructionCost
Current TargetTransformInfo can support throughput cost model and code size model, but sometimes we also need instruction latency cost model in different optimizations. Hal suggested we need a single public interface to query the different cost of an instruction. So I proposed following interface:

  enum TargetCostKind {
    TCK_RecipThroughput, ///< Reciprocal throughput.
    TCK_Latency,         ///< The latency of instruction.
    TCK_CodeSize         ///< Instruction code size.
  };

  int getInstructionCost(const Instruction *I, enum TargetCostKind kind) const;

All clients should mainly use this function to query the cost of an instruction, parameter <kind> specifies the desired cost model.

This patch also provides a simple default implementation of getInstructionLatency.

The default getInstructionLatency provides latency numbers for only small number of instruction classes, those latency numbers are only reasonable for modern OOO processors. It can be extended in following ways:

   Add more detail into this function.
   Add getXXXLatency function and call it from here.
   Implement target specific getInstructionLatency function.

Differential Revision: https://reviews.llvm.org/D37170



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312832 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-08 22:29:17 +00:00
Zvi Rackover
8a2fcfe5be X86: Improve AVX512 fptoui lowering
Summary:
Add patterns for
  fptoui <16 x float> to <16 x i8>
  fptoui <16 x float> to <16 x i16>

Reviewers: igorb, delena, craig.topper

Reviewed By: craig.topper

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D37505

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312704 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-07 07:40:34 +00:00
Alexandre Isoard
3b88873b05 [SCEV] Add URem support to SCEV
In LLVM IR the following code:

    %r = urem <ty> %t, %b

is equivalent to

    %q = udiv <ty> %t, %b
    %s = mul <ty> nuw %q, %b
    %r = sub <ty> nuw %t, %q ; (t / b) * b + (t % b) = t

As UDiv, Mul and Sub are already supported by SCEV, URem can be implemented
with minimal effort using that relation:

    %r --> (-%b * (%t /u %b)) + %t

We implement two special cases:

  - if %b is 1, the result is always 0
  - if %b is a power-of-two, we produce a zext/trunc based expression instead

That is, the following code:

    %r = urem i32 %t, 65536

Produces:

    %r --> (zext i16 (trunc i32 %a to i16) to i32)

Note that while this helps get a tighter bound on the range analysis and the
known-bits analysis, this exposes some normalization shortcoming of SCEVs:

    %div = udim i32 %a, 65536
    %mul = mul i32 %div, 65536
    %rem = urem i32 %a, 65536
    %add = add i32 %mul, %rem

Will usually not be reduced.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312329 91177308-0d34-0410-b5e6-96231b3b80d8
2017-09-01 14:59:59 +00:00
Matt Arsenault
c3f95e0648 AMDGPU: Don't assert in TTI with fp32 denorms enabled
Also refine for f16 and rcp cases.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@312213 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-31 05:47:00 +00:00
Simon Pilgrim
b148872e50 [CostModel][X86][XOP] Improve costs for XOP shuffles
VPPERM/VPERMIL2PD/VPERMIL2PS all provide more effective 2-input shuffles than regular AVX instructions

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@311005 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-16 13:50:20 +00:00
Jakub Kuderski
c0f00a9516 [Dominators] Include infinite loops in PostDominatorTree
Summary:
This patch teaches PostDominatorTree about infinite loops. It is built on top of D29705 by @dberlin which includes a very detailed motivation for this change.

What's new is that the patch also teaches the incremental updater how to deal with reverse-unreachable regions and how to properly maintain and verify tree roots. Before that, the incremental algorithm sometimes ended up preserving reverse-unreachable regions after updates that wouldn't appear in the tree if it was constructed from scratch on the same CFG.

This patch makes the following assumptions:
- A sequence of updates should produce the same tree as a recalculating it.
- Any sequence of the same updates should lead to the same tree.
- Siblings and roots are unordered.

The last two properties are essential to efficiently perform batch updates in the future.
When it comes to the first one, we can decide later that the consistency between freshly built tree and an updated one doesn't matter match, as there are many correct ways to pick roots in infinite loops, and to relax this assumption. That should enable us to recalculate postdominators less frequently.

This patch is pretty conservative when it comes to incremental updates on reverse-unreachable regions and ends up recalculating the whole tree in many cases. It should be possible to improve the performance in many cases, if we decide that it's important enough.
That being said, my experiments showed that reverse-unreachable are very rare in the IR emitted by clang when bootstrapping  clang. Here are the statistics I collected by analyzing IR between passes and after each removePredecessor call:

```
# functions:  52283
# samples:  337609
# reverse unreachable BBs:  216022
# BBs:  247840796
Percent reverse-unreachable:  0.08716159869015269 %
Max(PercRevUnreachable) in a function:  87.58620689655172 %
# > 25 % samples:  471 ( 0.1395104988314885 % samples )
... in 145 ( 0.27733680163724345 % functions )
```

Most of the reverse-unreachable regions come from invalid IR where it wouldn't be possible to construct a PostDomTree anyway.

I would like to commit this patch in the next week in order to be able to complete the work that depends on it before the end of my internship, so please don't wait long to voice your concerns :).

Reviewers: dberlin, sanjoy, grosser, brzycki, davide, chandlerc, hfinkel

Reviewed By: dberlin

Subscribers: nhaehnle, javed.absar, kparzysz, uabelho, jlebar, hiraditya, llvm-commits, dberlin, david2050

Differential Revision: https://reviews.llvm.org/D35851

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@310940 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-15 18:14:57 +00:00
Hal Finkel
7d99ae532b [ValueTracking] Don't delete assumes of side-effectful instructions
ValueTracking has to strike a balance when attempting to propagate information
backwards from assumes, because if the information is trivially propagated
backwards, it can appear to LLVM that the assumption is known to be true, and
therefore can be removed.

This is sound (because an assumption has no semantic effect except for causing
UB), but prevents the assume from allowing further optimizations.

The isEphemeralValueOf check exists to try and prevent this issue by not
removing the source of an assumption. This tries to make it a little bit more
general to handle the case of side-effectful instructions, such as in

  %0 = call i1 @get_val()
  %1 = xor i1 %0, true
  call void @llvm.assume(i1 %1)

Patch by Ariel Ben-Yehuda, thanks!

Differential Revision: https://reviews.llvm.org/D36590

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@310859 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-14 17:11:43 +00:00
Chandler Carruth
821fe0674f [ValueTracking] Revert r310583 which enabled functionality that still is
causing compile time issues.

Moreover, the patch *deleted* the flag in addition to changing the
default, and links to a code review that doesn't even discuss the flag
and just has an update to a Clang test case.

I've followed up on the commit thread to ask for numbers on compile time
at this point, leaving the flag in place until things stabilize, and
pointing at specific code that seems to exhibit excessive compile time
with this patch.

Original commit message for r310583:
"""
[ValueTracking] Enabling ValueTracking patch by default (recommit). Part 2.

The original patch was an improvement to IR ValueTracking on
non-negative integers. It has been checked in to trunk (D18777,
r284022). But was disabled by default due to performance regressions.
Perf impact has improved. The patch would be enabled by default.
""""

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@310816 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-14 07:03:24 +00:00
Simon Pilgrim
6f0ee4dc78 [CostModel][X86] Add SSE2 two-src shuffle costs
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@310654 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-10 19:32:35 +00:00
Simon Pilgrim
9a17eb199c [CostModel][X86] Add avx1 two-src shuffle costs
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@310650 91177308-0d34-0410-b5e6-96231b3b80d8
2017-08-10 19:02:51 +00:00