Commit Graph

11090 Commits

Author SHA1 Message Date
Xinliang David Li
45f64ad571 Revert r320104: infinite loop profiling bug fix
Causes unexpected memory issue with New PM this time.
The new PM invalidates BPI but not BFI, leaving the
reference to BPI from BFI invalid.

Abandon this patch.  There is a more general solution
which also handles runtime infinite loop (but not statically).



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320180 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-08 19:38:07 +00:00
Alexey Bataev
b415523f30 [InstCombine] PR35354: Convert store(bitcast, load bitcast (select (Cond, &V1, &V2)) --> store (, load (select(Cond, load &V1, load &V2)))
Summary:
If we have the code like this:
```
float a, b;
a = std::max(a ,b);
```
it is converted into something like this:
```
%call = call dereferenceable(4) float* @_ZSt3maxIfERKT_S2_S2_(float* nonnull dereferenceable(4) %a.addr, float* nonnull dereferenceable(4) %b.addr)
%1 = bitcast float* %call to i32*
%2 = load i32, i32* %1, align 4
%3 = bitcast float* %a.addr to i32*
store i32 %2, i32* %3, align 4
```
After inlinning this code is converted to the next:
```
%1 = load float, float* %a.addr
%2 = load float, float* %b.addr
%cmp.i = fcmp fast olt float %1, %2
%__b.__a.i = select i1 %cmp.i, float* %a.addr, float* %b.addr
%3 = bitcast float* %__b.__a.i to i32*
%4 = load i32, i32* %3, align 4
%5 = bitcast float* %arrayidx to i32*
store i32 %4, i32* %5, align 4

```
This pattern is not recognized as minmax pattern.
Patch solves this problem by converting sequence
```
store (bitcast, (load bitcast (select ((cmp V1, V2), &V1, &V2))))
```
to a sequence
```
store (,load (select((cmp V1, V2), &V1, &V2)))
```
After this the code is recognized as minmax pattern.

Reviewers: RKSimon, spatel

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D40304

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320157 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-08 15:32:10 +00:00
Xinliang David Li
134f1a833f [PGO] detect infinite loop and form MST properly
Differential Revision: http://reviews.llvm.org/D40873


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320104 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-07 22:23:28 +00:00
Sanjay Patel
1f609ed2b6 [InstCombine] add tests for abs using bit hackery; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320068 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-07 18:13:33 +00:00
Igor Laevsky
e18338969d [InstCombine] Don't crash on out of bounds index in the insertelement
Differential Revision: https://reviews.llvm.org/D40390



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320049 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-07 15:00:52 +00:00
Igor Laevsky
ab26b59c06 [InstSimplify] Add tests for the rL319894
Differential Revision: https://reviews.llvm.org/D40650



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@320014 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-07 08:52:24 +00:00
Adam Nemet
d62c5bdaf9 [LV] Interleaved access vectorization: fix computing new alias info
As a new access is generated spanning across multiple fields, we need to
propagate alias info from all the fields to form the most generic alias info.

rdar://35602528

Differential Revision: https://reviews.llvm.org/D40617

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319979 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-06 22:42:24 +00:00
Sanjay Patel
2b68e112a3 [InstCombine] canonicalize constant-minus-boolean to select-of-constants
This restores the half of:
https://reviews.llvm.org/rL75531
that was reverted at:
https://reviews.llvm.org/rL159230

For the x86 case mentioned there, we now produce:
leal 1(%rdi), %eax
subl %esi, %eax

We have target hooks to invert this in DAGCombiner (and x86 is enabled) with:
https://reviews.llvm.org/rL296977
https://reviews.llvm.org/rL311731

AArch64 and possibly other targets would probably benefit from enabling those hooks too. 
See PR30327:
https://bugs.llvm.org/show_bug.cgi?id=30327#c2

Differential Revision: https://reviews.llvm.org/D40612


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319964 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-06 21:22:57 +00:00
Zvi Rackover
138b4f2021 InstructionSimplify: 'extractelement' with an undef index is undef
Summary:
An undef extract index can be arbitrarily chosen to be an
out-of-range index value, which would result in the instruction being undef.

This change closes a gap identified while working on lowering vector permute intrinsics
with variable index vectors to pure LLVM IR.

Reviewers: arsenm, spatel, majnemer

Reviewed By: arsenm, spatel

Subscribers: fhahn, nhaehnle, wdng, llvm-commits

Differential Revision: https://reviews.llvm.org/D40231

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319910 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-06 17:51:46 +00:00
Igor Laevsky
3b06fccc77 [InstSimplify] Fold insertelement into undef if index is out of bounds
Differential Revision: https://reviews.llvm.org/D40650



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319894 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-06 14:04:45 +00:00
Hans Wennborg
d5bd8eeafe Revert r319482 and r319483 "[memcpyopt] Teach memcpyopt to optimize across basic blocks"
This caused PR35519.

> [memcpyopt] Teach memcpyopt to optimize across basic blocks
>
> This teaches memcpyopt to make a non-local memdep query when a local query
> indicates that the dependency is non-local. This notably allows it to
> eliminate many more llvm.memcpy calls in common Rust code, often by 20-30%.
>
> Fixes PR28958.
>
> Differential Revision: https://reviews.llvm.org/D38374
>

> [memcpyopt] Commit file missed in r319482.
>
> This change was meant to be included with r319482 but was accidentally
> omitted.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319873 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-06 01:47:55 +00:00
Xinliang David Li
4f7feb41e5 Revert r319794: [PGO] detect infinite loop and form MST properly: memory leak problem
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319841 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-05 21:54:01 +00:00
Joel Galenson
2bfaf4d2f2 [CVP] Remove some {s|u}sub.with.overflow checks.
This uses ConstantRange::makeGuaranteedNoWrapRegion's newly-added handling for subtraction to allow CVP to remove some subtraction overflow checks.

Differential Revision: https://reviews.llvm.org/D40039

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319807 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-05 18:14:24 +00:00
Xinliang David Li
9a320d9284 [PGO] detect infinite loop and form MST properly
Differential Revision: http://reviews.llvm.org/D40702


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319794 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-05 17:19:41 +00:00
Alexey Bataev
4996301956 [InstCombine] Additional test for PR35354, NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319783 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-05 16:15:55 +00:00
Mikael Holmen
dd0ef7d32c Bail out of a SimplifyCFG switch table opt at undef values.
Summary:
A true or false result is expected from a comparison, but it seems the possibility of undef was overlooked, which could lead to a failed assert. This is fixed by this patch by bailing out if we encounter undef.

The bug is old and the assert has been there since the end of 2014, so it seems this is unusual enough to forego optimization.

Patch by JesperAntonsson.

Reviewers: spatel, eeckstein, hans

Reviewed By: hans

Subscribers: uabelho, llvm-commits

Differential Revision: https://reviews.llvm.org/D40639

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319768 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-05 14:14:00 +00:00
Igor Laevsky
b8aa60d240 [InstCombine] Don't crash on out of bounds shifts
Differential Revision: https://reviews.llvm.org/D40649



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319761 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-05 12:18:15 +00:00
Haicheng Wu
15bf0436ab [ConstantFold] Support vector index when factoring out GEP index into preceding dimensions
Follow-up of r316824. This patch supports the vector type for both current and
previous index when factoring out the current one into the previous one.

Differential Revision: https://reviews.llvm.org/D39556

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319683 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-04 19:56:33 +00:00
Sanjoy Das
3745f0675a [BypassSlowDivision] Improve our handling of divisions by constants
(This reapplies r314253.  r314253 was reverted on r314482 because of a
correctness regression on P100, but that regression was identified to be
something else.)

Summary:
Don't bail out on constant divisors for divisions that can be narrowed without
introducing control flow .  This gives us a 32 bit multiply instead of an
emulated 64 bit multiply in the generated PTX assembly.

Reviewers: jlebar

Subscribers: jholewinski, mcrosier, llvm-commits

Differential Revision: https://reviews.llvm.org/D38265

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319677 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-04 19:21:58 +00:00
Anna Thomas
7c3eddc7d6 [Loop Predication] Teach LP about reverse loops
Summary:
Currently, we only support predication for forward loops with step
of 1.  This patch enables loop predication for reverse or
countdownLoops, which satisfy the following conditions:
   1. The step of the IV is -1.
   2. The loop has a singe latch as B(X) = X <pred>
latchLimit with pred as s> or u>
   3. The IV of the guard is the decrement
IV of the latch condition (Guard is: G(X) = X-1 u< guardLimit).

This patch was downstream for a while and is the last series of patches
that's from our LP implementation downstream.

Reviewers: apilipenko, mkazantsev, sanjoy

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D40353

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319659 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-04 15:11:48 +00:00
Matt Morehouse
79f1f0032c Revert "[X86] Improvement in CodeGen instruction selection for LEAs."
This reverts r319543, due to ASan bot breakage.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319591 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-01 22:20:26 +00:00
Philip Reames
0fed3ad4cb [IndVars] Fix a bug introduced in r317012
Turns out we can have comparisons which are indirect users of the induction variable that we can make invariant.  In this case, there is no loop invariant value contributing and we'd fail an assert.

The test case was found by a java fuzzer and reduced.  It's a real cornercase.  You have to have a static loop which we've already proven only executes once, but haven't broken the backedge on, and an inner phi whose result can be constant folded by SCEV using exit count reasoning but not proven by isKnownPredicate.  To my knowledge, only the fuzzer has hit this case.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319583 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-01 20:57:19 +00:00
Adam Nemet
c4f204db88 [opt-remarks] If hotness threshold is set, ignore remarks without hotness
These are blocks that haven't not been executed during training.  For large
projects this could make a significant difference.  For the project, I was
looking at, I got an order of magnitude decrease in the size of the total YAML
files with this and r319235.

Differential Revision: https://reviews.llvm.org/D40678

Re-commit after fixing the failing testcase in rL319576, rL319577 and
rL319578.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319581 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-01 20:41:38 +00:00
Adam Nemet
53cc245050 Revert "[opt-remarks] If hotness threshold is set, ignore remarks without hotness"
This reverts commit r319556.

Something is not working with this when used with sample-based profiling.
Investigating...

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319562 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-01 18:12:29 +00:00
Adam Nemet
7cb910a453 [opt-remarks] If hotness threshold is set, ignore remarks without hotness
These are blocks that haven't not been executed during training.  For large
projects this could make a significant difference.  For the project, I was
looking at, I got an order of magnitude decrease in the size of the total YAML
files with this and r319235.

Differential Revision: https://reviews.llvm.org/D40678

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319556 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-01 17:02:04 +00:00
Hans Wennborg
faed772a25 Revert r319531 "[SLPVectorizer] Failure to beneficially vectorize 'copyable' elements in integer binary ops."
It causes builds to fail with "Instruction does not dominate all uses" (PR35497).

> Patch tries to improve vectorization of the following code:
>
> void add1(int * __restrict dst, const int * __restrict src) {
>   *dst++ = *src++;
>   *dst++ = *src++ + 1;
>   *dst++ = *src++ + 2;
>   *dst++ = *src++ + 3;
> }
> Allows to vectorize even if the very first operation is not a binary add, but just a load.
>
> Fixed issues related to previous commit.
>
> Reviewers: spatel, mzolotukhin, mkuper, hfinkel, RKSimon, filcab, ABataev
>
> Reviewed By: ABataev, RKSimon
>
> Subscribers: llvm-commits, RKSimon
>
> Differential Revision: https://reviews.llvm.org/D28907

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319550 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-01 16:17:24 +00:00
Jatin Bhateja
f5040aa4c6 [X86] Improvement in CodeGen instruction selection for LEAs.
Summary:
1/  Operand folding during complex pattern matching for LEAs has been extended, such that it promotes Scale to
     accommodate similar operand appearing in the DAG  e.g.
                 T1 = A + B
                 T2 = T1 + 10
                 T3 = T2 + A
    For above DAG rooted at T3, X86AddressMode will now look like
                Base = B , Index = A , Scale = 2 , Disp = 10

2/  During OptimizeLEAPass down the pipeline factorization is now performed over LEAs so that if there is an opportunity
     then complex LEAs (having 3 operands) could be factored out  e.g.
                 leal 1(%rax,%rcx,1), %rdx
                 leal 1(%rax,%rcx,2), %rcx
     will be factored as following
                 leal 1(%rax,%rcx,1), %rdx
                 leal (%rdx,%rcx)   , %edx

3/ Aggressive operand folding for AM based selection for LEAs is sensitive to loops, thus avoiding creation of any complex LEAs within a loop.

4/ Simplify LEA converts (lea (BASE,1,INDEX,0)  --> add (BASE, INDEX) which offers better through put.

PR32755 will be taken care of by this pathc.

Previous patch revisions : r313343 , r314886

Reviewers: lsaba, RKSimon, craig.topper, qcolombet, jmolloy, jbhateja

Reviewed By: lsaba, RKSimon, jbhateja

Subscribers: jmolloy, spatel, igorb, llvm-commits

Differential Revision: https://reviews.llvm.org/D35014

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319543 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-01 14:07:38 +00:00
Mikael Holmen
4ece91062b Revert r319537: Bail out of a SimplifyCFG switch table opt at undef values.
Broke build bots so reverting.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319539 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-01 13:11:39 +00:00
Florian Hahn
03c5509e66 [InstSimplify] More fcmp cases when comparing against negative constants.
Summary:
For known positive non-zero value X:
    fcmp uge X, -C => true
    fcmp ugt X, -C => true
    fcmp une X, -C => true
    fcmp oeq X, -C => false
    fcmp ole X, -C => false
    fcmp olt X, -C => false


Patch by Paul Walker.

Reviewers: majnemer, t.p.northover, spatel, RKSimon

Reviewed By: spatel

Subscribers: fhahn, llvm-commits

Differential Revision: https://reviews.llvm.org/D40012

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319538 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-01 12:34:16 +00:00
Mikael Holmen
8c4503a350 Bail out of a SimplifyCFG switch table opt at undef values.
Summary:
A true or false result is expected from a comparison, but it seems the possibility of undef was overlooked, which could lead to a failed assert. This is fixed by this patch by bailing out if we encounter undef.

The bug is old and the assert has been there since the end of 2014, so it seems this is unusual enough to forego optimization.

Patch by: JesperAntonsson

Reviewers: spatel, eeckstein, hans

Reviewed By: hans

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D40639

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319537 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-01 12:30:49 +00:00
Dinar Temirbulatov
8caaeced90 [SLPVectorizer] Failure to beneficially vectorize 'copyable' elements in integer binary ops.
Patch tries to improve vectorization of the following code:
    
            void add1(int * __restrict dst, const int * __restrict src) {
              *dst++ = *src++;
              *dst++ = *src++ + 1;
              *dst++ = *src++ + 2;
              *dst++ = *src++ + 3;
            }
            Allows to vectorize even if the very first operation is not a binary add, but just a load.
    
            Fixed issues related to previous commit.
    
            Reviewers: spatel, mzolotukhin, mkuper, hfinkel, RKSimon, filcab, ABataev
    
            Reviewed By: ABataev, RKSimon
    
            Subscribers: llvm-commits, RKSimon
    
            Differential Revision: https://reviews.llvm.org/D28907


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319531 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-01 11:10:47 +00:00
Hiroshi Inoue
e983865629 Recommit rL319407: [SROA] enable splitting for non-whole-alloca loads and stores
Recommiting once reverted patch rL319407 after adding a check for bit vector size to avoid failures in some build bots.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319522 91177308-0d34-0410-b5e6-96231b3b80d8
2017-12-01 06:05:05 +00:00
Peter Collingbourne
31b9aa7755 ThinLTOBitcodeWriter: Try harder to discard unused references to the merged module.
If the thin module has no references to an internal global in the
merged module, we need to make sure to preserve that property if the
global is a member of a comdat group, as otherwise promotion can end
up adding global symbols to the comdat, which is not allowed.

This situation can arise if the external global in the thin module
has dead constant users, which would cause use_empty() to return
false and would cause us to try to promote it. To prevent this from
happening, discard the dead constant users before asking whether a
global is empty.

Differential Revision: https://reviews.llvm.org/D40593

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319494 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-30 23:05:52 +00:00
Dan Gohman
7a991ca87c [memcpyopt] Teach memcpyopt to optimize across basic blocks
This teaches memcpyopt to make a non-local memdep query when a local query
indicates that the dependency is non-local. This notably allows it to
eliminate many more llvm.memcpy calls in common Rust code, often by 20-30%.

Fixes PR28958.

Differential Revision: https://reviews.llvm.org/D38374


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319482 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-30 22:10:53 +00:00
Xinliang David Li
64d8eced83 [PGO] Skip counter promotion for infinite loops
Differential Revision: http://reviews.llvm.org/D40662


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319462 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-30 19:16:25 +00:00
Alexey Bataev
06cb6a98c1 [InstCombine] Additional test for PR35354, NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319436 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-30 14:33:58 +00:00
Hiroshi Inoue
50f1f9435a Revert rL319407: [SROA] enable splitting for non-whole-alloca loads and stores
This reverts commit rL319407 due to failures in some buildbot.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319410 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-30 08:29:51 +00:00
Hiroshi Inoue
39be023c86 [SROA] enable splitting for non-whole-alloca loads and stores
Currently, SROA splits loads and stores only when they are accessing the whole alloca.
This patch relaxes this limitation to allow splitting a load/store if all other loads and stores to the alloca are disjoint to or fully included in the current load/store. If there is no other load or store that crosses the boundary of the current load/store, the current splitting implementation works as is.
The whole-alloca loads and stores meet this new condition and so they are still splittable.

Here is a simplified motivating example.

struct record {
    long long a;
    int b;
    int c;
};

int func(struct record r) {
    for (int i = 0; i < r.c; i++)
        r.b++;
    return r.b;
}

When updating r.b (or r.c as well), LLVM generates redundant instructions on some platforms (such as x86_64, ppc64); here, r.b and r.c are packed into one 64-bit GPR when the struct is passed as a method argument.

With this patch, the above example is compiled into only few instructions without loop.
Without the patch, unnecessary loop-carried dependency is introduced by SROA and the loop cannot be eliminated by the later optimizers.

Differential Revision: https://reviews.llvm.org/D32998



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319407 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-30 07:44:46 +00:00
Graham Yiu
9d522dc415 - Removed unused lamba (IsReturnBlock) causing build bots to fail for r319398
- Added lit testcases that were supposed to be part of r319398

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319399 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-30 03:36:57 +00:00
Sanjay Patel
0221a1478f [InstCombine] add tests for select-of-constants; NFC
These are variants of a test that was originally added in:
https://reviews.llvm.org/rL75531
...but removed with:
https://reviews.llvm.org/rL159230



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319327 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-29 17:21:39 +00:00
Serguei Katkov
c2659f12f7 [CGP] Fix common type handling in optimizeMemoryInst
If common type is different we should bail out due to we will not be
able to create a select or Phi of these values.

Basically it is done in ExtAddrMode::compare however it does not work
if we handle the null first and then two values of different types.
so add a check in initializeMap as well. The check in ExtAddrMode::compare
is used as earlier bail out.

Reviewers: reames, john.brawn
Reviewed By: john.brawn
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D40479


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319292 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-29 05:51:26 +00:00
Adam Nemet
999c1417c7 Remove this test
After r319235, we no longer generate this remark.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319242 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-28 22:39:38 +00:00
Adam Nemet
1f9846951b Demote this opt remark to DEBUG.
From a random opt-stat output:

Top 10 remarks:
  tailcallelim/tailcall          53%
  inline/AlwaysInline            13%
  gvn/LoadClobbered              13%
  inline/Inlined                  8%
  inline/TooCostly                2%
  inline/NoDefinition             2%
  licm/LoadWithLoopInvariantAddressInvalidated  2%
  licm/Hoisted                    1%
  asm-printer/InstructionCount    1%
  prologepilog/StackSize          1%

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319235 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-28 22:11:00 +00:00
Alexey Bataev
4ff622660e [SLP] Additional test for PR35354, NFC.
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319224 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-28 20:48:24 +00:00
Sanjay Patel
67b235da2e [InstCombine] auto-generate complete test checks; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319205 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-28 19:13:23 +00:00
Sanjay Patel
68e8f78488 [InstCombine] auto-generate complete test checks; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319203 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-28 19:07:28 +00:00
Hans Wennborg
dd74d55407 EntryExitInstrumenter: set DebugLocs on the inserted call instructions (PR35412)
Apparently the verifier requires that inlineable calls in a function
with debug info have debug locations.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319199 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-28 18:44:26 +00:00
Sanjay Patel
240ac15972 [InstCombine] add tests from D39421 to show current transforms; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319182 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-28 16:40:30 +00:00
Chandler Carruth
6efdd0fb59 Add a new pass to speculate around PHI nodes with constant (integer) operands when profitable.
The core idea is to (re-)introduce some redundancies where their cost is
hidden by the cost of materializing immediates for constant operands of
PHI nodes. When the cost of the redundancies is covered by this,
avoiding materializing the immediate has numerous benefits:
1) Less register pressure
2) Potential for further folding / combining
3) Potential for more efficient instructions due to immediate operand

As a motivating example, consider the remarkably different cost on x86
of a SHL instruction with an immediate operand versus a register
operand.

This pattern turns up surprisingly frequently, but is somewhat rarely
obvious as a significant performance problem.

The pass is entirely target independent, but it does rely on the target
cost model in TTI to decide when to speculate things around the PHI
node. I've included x86-focused tests, but any target that sets up its
immediate cost model should benefit from this pass.

There is probably more that can be done in this space, but the pass
as-is is enough to get some important performance on our internal
benchmarks, and should be generally performance neutral, but help with
more extensive benchmarking is always welcome.

One awkward part is that this pass has to be scheduled after
*everything* that can eliminate these kinds of redundancies. This
includes SimplifyCFG, GVN, etc. I'm open to suggestions about better
places to put this. We could in theory make it part of the codegen pass
pipeline, but there doesn't really seem to be a good reason for that --
it isn't "lowering" in any sense and only relies on pretty standard cost
model based TTI queries, so it seems to fit well with the "optimization"
pipeline model. Still, further thoughts on the pipeline position are
welcome.

I've also only implemented this in the new pass manager. If folks are
very interested, I can try to add it to the old PM as well, but I didn't
really see much point (my use case is already switched over to the new
PM).

I've tested this pretty heavily without issue. A wide range of
benchmarks internally show no change outside the noise, and I don't see
any significant changes in SPEC either. However, the size class
computation in tcmalloc is substantially improved by this, which turns
into a 2% to 4% win on the hottest path through tcmalloc for us, so
there are definitely important cases where this is going to make
a substantial difference.

Differential revision: https://reviews.llvm.org/D37467

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319164 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-28 11:32:31 +00:00
Florian Hahn
04289f6357 [TailRecursionElimination] Skip debug intrinsics.
Summary:
I think we do not need to analyze debug intrinsics here, as they should
not impact codegen. This has 2 benefits: 1) slightly less work to do and
2) avoiding generating optimization remarks for converting calls to
debug intrinsics to tail calls, which are not really helpful for users.

Based on work by Sander de Smalen.

Reviewers: davide, trentxintong, aprantl

Reviewed By: aprantl

Subscribers: llvm-commits, JDevlieghere

Tags: #debug-info

Differential Revision: https://reviews.llvm.org/D40440

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@319158 91177308-0d34-0410-b5e6-96231b3b80d8
2017-11-28 09:32:25 +00:00