429 Commits

Author SHA1 Message Date
Philip Reames
9efd72ad02 [IndVars] Eliminate loop exits with equivalent exit counts
We can end up with two loop exits whose exit counts are equivalent, but whose textual representation is different and non-obvious. For the sub-case where we have a series of exits which dominate one another (common), eliminate any exits which would iterate *after* a previous exit on the exiting iteration.

As noted in the TODO being removed, I'd always thought this was a good idea, but I've now seen this in a real workload as well.

Interestingly, in review, Nikita pointed out there's let another oppurtunity to leverage SCEV's reasoning.  If we kept track of the min of dominanting exits so far, we could discharge exits with EC >= MDE.  This is less powerful than the existing transform (since later exits aren't considered), but potentially more powerful for any case where SCEV can prove a >= b, but neither a == b or a > b.  I don't have an example to illustrate that oppurtunity, but won't be suprised if we find one and return to handle that case as well.  

Differential Revision: https://reviews.llvm.org/D69009



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375379 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-20 23:38:02 +00:00
Philip Reames
91e38641d2 Remove a stale comment, noted in post commit review for rL375038
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375040 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-16 20:27:10 +00:00
Philip Reames
25ddbc8218 [IndVars] Fix a miscompile in off-by-default loop predication implementation
The problem is that we can have two loop exits, 'a' and 'b', where 'a' and 'b' would exit at the same iteration, 'a' precedes 'b' along some path, and 'b' is predicated while 'a' is not. In this case (see the previously submitted test case), we causing the loop to exit through 'b' whereas it should have exited through 'a'.

This only applies to loop exits where the exit counts are not provably inequal, but that isn't as much of a restriction as it appears. If we could order the exit counts, we'd have already removed one of the two exits. In theory, we might be able to prove inequality w/o ordering, but I didn't really explore that piece. Instead, I went for the obvious restriction and ensured we didn't predicate exits following non-predicateable exits.

Credit goes to Evgeny Brevnov for figuring out the problematic case. Fuzzing probably also found it (failures seen), but due to some silly infrastructure problems I hadn't gotten to the results before Evgeny hand reduced it from a benchmark (he manually enabled the transform). Once this is fixed, I'll try to filter through the fuzzer failures to see if there's anything additional lurking.

Differential Revision https://reviews.llvm.org/D68956



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@375038 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-16 19:58:26 +00:00
Philip Reames
a8b14d4c58 [Tests] Add a test demonstrating a miscompile in the off-by-default loop-pred transform
Credit goes to Evgeny Brevnov for figuring out the problematic case.

Fuzzing probably also found it (lots of failures), but due to some silly infrastructure problems I hadn't gotten to the results before Evgeny hand reduced it from a benchmark.  



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374812 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-14 19:49:40 +00:00
Philip Reames
f7875e639f [Tests] Add a few more tests for idioms with FP induction variables
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@374807 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-14 19:10:39 +00:00
Philip Reames
dbfbe60dac [IndVars] An implementation of loop predication without a need for speculation
This patch implements a variation of a well known techniques for JIT compilers - we have an implementation in tree as LoopPredication - but with an interesting twist. This version does not assume the ability to execute a path which wasn't taken in the original program (such as a guard or widenable.condition intrinsic). The benefit is that this works for arbitrary IR from any frontend (including C/C++/Fortran). The tradeoff is that it's restricted to read only loops without implicit exits.

This builds on SCEV, and can thus eliminate the loop varying portion of the any early exit where all exits are understandable by SCEV. A key advantage is that fixing deficiency exposed in SCEV - already found one while writing test cases - will also benefit all of full redundancy elimination (and most other loop transforms).

I haven't seen anything in the literature which quite matches this. Given that, I'm not entirely sure that keeping the name "loop predication" is helpful. Anyone have suggestions for a better name? This is analogous to partial redundancy elimination - since we remove the condition flowing around the backedge - and has some parallels to our existing transforms which try to make conditions invariant in loops.

Factoring wise, I chose to put this in IndVarSimplify since it's a generally applicable to all workloads. I could split this off into it's own pass, but we'd then probably want to add that new pass every place we use IndVars.  One solid argument for splitting it off into it's own pass is that this transform is "too good". It breaks a huge number of existing IndVars test cases as they tend to be simple read only loops.  At the moment, I've opted it off by default, but if we add this to IndVars and enable, we'll have to update around 20 test files to add side effects or disable this transform.

Near term plan is to fuzz this extensively while off by default, reflect and discuss on the factoring issue mentioned just above, and then enable by default.  I also need to give some though to supporting widenable conditions in this framing.

Differential Revision: https://reviews.llvm.org/D67408



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@373351 91177308-0d34-0410-b5e6-96231b3b80d8
2019-10-01 17:03:44 +00:00
Alexey Lapshin
70343be031 [Debuginfo] dbg.value points to undef value after Induction Variable Simplification.
Induction Variable Simplification pass does not update dbg.value intrinsic.

Before:

%add = add nuw nsw i32 %ArgIndex.06, 1
call void @llvm.dbg.value(metadata i32 %add, metadata !17, metadata !DIExpression())

After:

%indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
call void @llvm.dbg.value(metadata i64 undef, metadata !17, metadata !DIExpression())

There should be:

%indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
call void @llvm.dbg.value(metadata i64 %indvars.iv.next, metadata !17, metadata !DIExpression())

Differential Revision: https://reviews.llvm.org/D67770

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372703 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-24 08:47:03 +00:00
Roman Lebedev
d1f4a55c8c [SimplifyCFG] FoldTwoEntryPHINode(): consider *total* speculation cost, not per-BB cost
Summary:
Previously, if the threshold was 2, we were willing to speculatively
execute 2 cheap instructions in both basic blocks (thus we were willing
to speculatively execute cost = 4), but weren't willing to speculate
when one BB had 3 instructions and other one had no instructions,
even thought that would have total cost of 3.

This looks inconsistent to me.
I don't think `cmov`-like instructions will start executing
until both of it's inputs are available: https://godbolt.org/z/zgHePf
So i don't see why the existing behavior is the correct one.

Also, let's add it's own `cl::opt` for this threshold,
with default=4, so it is not stricter than the previous threshold:
will allow to fold when there are 2 BB's each with cost=2.
And since the logic has changed, it will also allow to fold when
one BB has cost=3 and other cost=1, or there is only one BB with cost=4.

This is an alternative solution to D65148:
This fix is mainly motivated by `signbit-like-value-extension.ll` test.
That pattern comes up in JPEG decoding, see e.g.
`Figure F.12 – Extending the sign bit of a decoded value in V`
of `ITU T.81` (JPEG specification).
That branch is not predictable, and it is within the innermost loop,
so the fact that that pattern ends up being stuck with a branch
instead of `select` (i.e. `CMOV` for x86) is unlikely to be beneficial.

This has great results on the final assembly (vanilla test-suite + RawSpeed): (metric pass - D67240)
| metric                                 |     old |     new | delta |      % |
| x86-mi-counting.NumMachineFunctions    |   37720 |   37721 |     1 |  0.00% |
| x86-mi-counting.NumMachineBasicBlocks  |  773545 |  771181 | -2364 | -0.31% |
| x86-mi-counting.NumMachineInstructions | 7488843 | 7486442 | -2401 | -0.03% |
| x86-mi-counting.NumUncondBR            |  135770 |  135543 |  -227 | -0.17% |
| x86-mi-counting.NumCondBR              |  423753 |  422187 | -1566 | -0.37% |
| x86-mi-counting.NumCMOV                |   24815 |   25731 |   916 |  3.69% |
| x86-mi-counting.NumVecBlend            |      17 |      17 |     0 |  0.00% |

We significantly decrease basic block count, notably decrease instruction count,
significantly decrease branch count and very significantly increase `cmov` count.

Performance-wise, unsurprisingly, this has great effect on
target RawSpeed benchmark. I'm seeing 5 **major** improvements:
```
Benchmark                                                                                             Time             CPU      Time Old      Time New       CPU Old       CPU New
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Samsung/NX3000/_3184416.SRW/threads:8/process_time/real_time_pvalue                                 0.0000          0.0000      U Test, Repetitions: 49 vs 49
Samsung/NX3000/_3184416.SRW/threads:8/process_time/real_time_mean                                  -0.3064         -0.3064      226.9913      157.4452      226.9800      157.4384
Samsung/NX3000/_3184416.SRW/threads:8/process_time/real_time_median                                -0.3057         -0.3057      226.8407      157.4926      226.8282      157.4828
Samsung/NX3000/_3184416.SRW/threads:8/process_time/real_time_stddev                                -0.4985         -0.4954        0.3051        0.1530        0.3040        0.1534
Kodak/DCS760C/86L57188.DCR/threads:8/process_time/real_time_pvalue                                  0.0000          0.0000      U Test, Repetitions: 49 vs 49
Kodak/DCS760C/86L57188.DCR/threads:8/process_time/real_time_mean                                   -0.1747         -0.1747       80.4787       66.4227       80.4771       66.4146
Kodak/DCS760C/86L57188.DCR/threads:8/process_time/real_time_median                                 -0.1742         -0.1743       80.4686       66.4542       80.4690       66.4436
Kodak/DCS760C/86L57188.DCR/threads:8/process_time/real_time_stddev                                 +0.6089         +0.5797        0.0670        0.1078        0.0673        0.1062
Sony/DSLR-A230/DSC08026.ARW/threads:8/process_time/real_time_pvalue                                 0.0000          0.0000      U Test, Repetitions: 49 vs 49
Sony/DSLR-A230/DSC08026.ARW/threads:8/process_time/real_time_mean                                  -0.1598         -0.1598      171.6996      144.2575      171.6915      144.2538
Sony/DSLR-A230/DSC08026.ARW/threads:8/process_time/real_time_median                                -0.1598         -0.1597      171.7109      144.2755      171.7018      144.2766
Sony/DSLR-A230/DSC08026.ARW/threads:8/process_time/real_time_stddev                                +0.4024         +0.3850        0.0847        0.1187        0.0848        0.1175
Canon/EOS 77D/IMG_4049.CR2/threads:8/process_time/real_time_pvalue                                  0.0000          0.0000      U Test, Repetitions: 49 vs 49
Canon/EOS 77D/IMG_4049.CR2/threads:8/process_time/real_time_mean                                   -0.0550         -0.0551      280.3046      264.8800      280.3017      264.8559
Canon/EOS 77D/IMG_4049.CR2/threads:8/process_time/real_time_median                                 -0.0554         -0.0554      280.2628      264.7360      280.2574      264.7297
Canon/EOS 77D/IMG_4049.CR2/threads:8/process_time/real_time_stddev                                 +0.7005         +0.7041        0.2779        0.4725        0.2775        0.4729
Canon/EOS 5DS/2K4A9929.CR2/threads:8/process_time/real_time_pvalue                                  0.0000          0.0000      U Test, Repetitions: 49 vs 49
Canon/EOS 5DS/2K4A9929.CR2/threads:8/process_time/real_time_mean                                   -0.0354         -0.0355      316.7396      305.5208      316.7342      305.4890
Canon/EOS 5DS/2K4A9929.CR2/threads:8/process_time/real_time_median                                 -0.0354         -0.0356      316.6969      305.4798      316.6917      305.4324
Canon/EOS 5DS/2K4A9929.CR2/threads:8/process_time/real_time_stddev                                 +0.0493         +0.0330        0.3562        0.3737        0.3563        0.3681
```

That being said, it's always best-effort, so there will likely
be cases where this worsens things.

Reviewers: efriedma, craig.topper, dmgreen, jmolloy, fhahn, Carrot, hfinkel, chandlerc

Reviewed By: jmolloy

Subscribers: xbolva00, hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D67318

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@372009 91177308-0d34-0410-b5e6-96231b3b80d8
2019-09-16 16:18:24 +00:00
Philip Reames
919a01374a [IndVars] Fix a bug noticed by inspection
We were computing the loop exit value, but not ensuring the addrec belonged to the loop whose exit value we were computing.  I couldn't actually trip this; the test case shows the basic setup which *might* trip this, but none of the variations I've tried actually do.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@369730 91177308-0d34-0410-b5e6-96231b3b80d8
2019-08-23 04:03:23 +00:00
Philip Reames
4a1169edbd [RLEV] Rewrite loop exit values for multiple exit loops w/o overall loop exit count
We already supported rewriting loop exit values for multiple exit loops, but if any of the loop exits were not computable, we gave up on all loop exit values. This patch generalizes the existing code to handle individual computable loop exits where possible.

As discussed in the review, this is a starting point for figuring out a better API.  The code is a bit ugly, but getting it in lets us test as we go.  

Differential Revision: https://reviews.llvm.org/D65544



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@368898 91177308-0d34-0410-b5e6-96231b3b80d8
2019-08-14 18:27:57 +00:00
Philip Reames
d5a3ed4880 [IndVars, RLEV] Support rewriting exit values in loops without known exits (prep work)
This is a prepatory patch for future work on support exit value rewriting in loops with a mixture of computable and non-computable exit counts.  The intention is to be "mostly NFC" - i.e. not enable any interesting new transforms - but in practice, there are some small output changes.

The test differences are caused by cases wherewhere getSCEVAtScope can simplify a single entry phi without needing any knowledge of the loop.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@367485 91177308-0d34-0410-b5e6-96231b3b80d8
2019-07-31 21:15:21 +00:00
Philip Reames
b1937e022d [IndVars] Fix a subtle bug in optimizeLoopExits
The original code failed to account for the fact that one exit can have a pointer exit count without all of them having pointer exit counts.  This could cause two separate bugs:
1) We might exit the loop early, and leave optimizations undone.  This is what triggered the assertion failure in the reported test case.
2) We might optimize one exit, then exit without indicating a change.  This could result in an analysis invalidaton bug if no other transform is done by the rest of indvars.

Note that the pointer exit counts are a really fragile concept.  They show up only when we have a pointer IV w/o a datalayout to provide their size.  It's really questionable to me whether the complexity implied is worth it.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366829 91177308-0d34-0410-b5e6-96231b3b80d8
2019-07-23 17:45:11 +00:00
Roman Lebedev
cdff95b29b [IndVarSimplify][NFC] Autogenerate check lines in loop_evaluate_1.ll
Being affected by upcoming patch.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@366746 91177308-0d34-0410-b5e6-96231b3b80d8
2019-07-22 22:08:27 +00:00
Philip Reames
d406311e80 [IndVars] Use exit count reasoning to discharge obviously untaken exits
Continue in the spirit of D63618, and use exit count reasoning to prove away loop exits which can not be taken since the backedge taken count of the loop as a whole is provably less than the minimal BE count required to take this particular loop exit.

As demonstrated in the newly added tests, this triggers in a number of cases where IndVars was previously unable to discharge obviously redundant exit tests. And some not so obvious ones.

Differential Revision: https://reviews.llvm.org/D63733



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365920 91177308-0d34-0410-b5e6-96231b3b80d8
2019-07-12 17:05:35 +00:00
Nikita Popov
48617a9069 [LFTR] Regenerate test checks; NFC
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365262 91177308-0d34-0410-b5e6-96231b3b80d8
2019-07-06 08:54:15 +00:00
Philip Reames
da1c0f6889 [LFTR] Use SCEVExpander for the pointer limit case instead of manual IR gen
As noted in the test change, this is not trivially NFC, but all of the changes in output are cases where the SCEVExpander form is more canonical/optimal than the hand generation.  



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365075 91177308-0d34-0410-b5e6-96231b3b80d8
2019-07-03 20:03:46 +00:00
Philip Reames
d07577708c [LFTR] Hoist extend expressions outside of loops w/o waiting for LICM
The motivation for this is two fold:
1) Make the output (and thus tests)  a bit more readable to a human trying to understand the result of the transform
2) Reduce spurious diffs in a potential future change to restructure all of this logic to use SCEVExpander (which hoists by default)



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@365066 91177308-0d34-0410-b5e6-96231b3b80d8
2019-07-03 18:18:36 +00:00
Nikita Popov
d6d7595593 [LFTR] Fix post-inc pointer IV with truncated exit count (PR41998)
Fixes https://bugs.llvm.org/show_bug.cgi?id=41998. Usually when we
have a truncated exit count we'll truncate the IV when comparing
against the limit, in which case exit count overflow in post-inc
form doesn't matter. However, for pointer IVs we don't do that, so
we have to be careful about incrementing the IV in the wide type.

I'm fixing this by removing the IVCount variable (which was
ExitCount or ExitCount+1) and replacing it with a UsePostInc flag,
and then moving the actual limit adjustment to the individual cases
(which are: pointer IV where we add to the wide type, integer IV
where we add to the narrow type, and constant integer IV where we
add to the wide type).

Differential Revision: https://reviews.llvm.org/D63686

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364709 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-29 09:24:12 +00:00
Philip Reames
72415d3093 [Tests] Add cases where we're failing to discharge provably loop exits (tests for D63733)
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364220 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-24 19:26:17 +00:00
Philip Reames
d77254d26e [Tests] Autogen and improve test readability
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364156 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-23 17:13:53 +00:00
Philip Reames
91371a5153 [IndVars] Remove dead instructions after folding trivial loop exit
In rL364135, I taught IndVars to fold exiting branches in loops with a zero backedge taken count (i.e. loops that only run one iteration).  This extends that to eliminate the dead comparison left around.  



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364155 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-23 17:06:57 +00:00
Philip Reames
32acc1f62e Exploit a zero LoopExit count to eliminate loop exits
This turned out to be surprisingly effective. I was originally doing this just for completeness sake, but it seems like there are a lot of cases where SCEV's exit count reasoning is stronger than it's isKnownPredicate reasoning.

Once this is in, I'm thinking about trying to build on the same infrastructure to eliminate provably untaken checks. There may be something generally interesting here.

Differential Revision: https://reviews.llvm.org/D63618



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364135 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-22 17:54:25 +00:00
Nikita Popov
761589542d [LFTR] Add tests for PR41998; NFC
The limit for the pointer case is incorrect.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@364128 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-22 09:57:59 +00:00
Philip Reames
79b609c7e5 [Tests] Add a tricky LFTR case for documentation purposes
Thought of this case while working on something else.  We appear to get it right in all of the variations I tried, but that's by accident.  So, add a test which would catch the potential bug.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363953 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-20 17:16:53 +00:00
Philip Reames
c48c6af149 LFTR for multiple exit loops
Teach IndVarSimply's LinearFunctionTestReplace transform to handle multiple exit loops. LFTR does two key things 1) it rewrites (all) exit tests in terms of a common IV potentially eliminating one in the process and 2) it moves any offset/indexing/f(i) style logic out of the loop.

This turns out to actually be pretty easy to implement. SCEV already has all the information we need to know what the backedge taken count is for each individual exit. (We use that when computing the BE taken count for the loop as a whole.) We basically just need to iterate through the exiting blocks and apply the existing logic with the exit specific BE taken count. (The previously landed NFC makes this super obvious.)

I chose to go ahead and apply this to all loop exits instead of only latch exits as originally proposed. After reviewing other passes, the only case I could find where LFTR form was harmful was LoopPredication. I've fixed the latch case, and guards aren't LFTRed anyways. We'll have some more work to do on the way towards widenable_conditions, but that's easily deferred.

I do want to note that I added one bit after the review.  When running tests, I saw a new failure (no idea why didn't see previously) which pointed out LFTR can rewrite a constant condition back to a loop varying one.  This was theoretically possible with a single exit, but the zero case covered it in practice.  With multiple exits, we saw this happening in practice for the eliminate-comparison.ll test case because we'd compute a ExitCount for one of the exits which was guaranteed to never actually be reached.  Since LFTR ran after simplifyAndExtend, we'd immediately turn around and undo the simplication work we'd just done.  The solution seemed obvious, so I didn't bother with another round of review.

Differential Revision: https://reviews.llvm.org/D62625



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363883 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-19 21:58:25 +00:00
Philip Reames
cf5225b141 [Tests] Autogen a test so that future changes are understandable
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363882 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-19 21:39:07 +00:00
Philip Reames
10c34d1b8d Teach getSCEVAtScope how to handle loop phis w/invariant operands in loops w/taken backedges
This patch really contains two pieces:
    Teach SCEV how to fold a phi in the header of a loop to the value on the backedge when a) the backedge is known to execute at least once, and b) the value is safe to use globally within the scope dominated by the original phi.
    Teach IndVarSimplify's rewriteLoopExitValues to allow loop invariant expressions which already exist (and thus don't need new computation inserted) even in loops where we can't optimize away other uses.

Differential Revision: https://reviews.llvm.org/D63224



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363619 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-17 21:06:17 +00:00
Philip Reames
d773cb7e1c Fix a bug w/inbounds invalidation in LFTR (recommit)
Recommit r363289 with a bug fix for crash identified in pr42279.  Issue was that a loop exit test does not have to be an icmp, leading to a null dereference crash when new logic was exercised for that case.  Test case previously committed in r363601.

Original commit comment follows:

This contains fixes for two cases where we might invalidate inbounds and leave it stale in the IR (a miscompile). Case 1 is when switching to an IV with no dynamically live uses, and case 2 is when doing pre-to-post conversion on the same pointer type IV.

The basic scheme used is to prove that using the given IV (pre or post increment forms) would have to already trigger UB on the path to the test we're modifying. As such, our potential UB triggering use does not change the semantics of the original program.

As was pointed out in the review thread by Nikita, this is defending against a separate issue from the hasConcreteDef case. This is about poison, that's about undef. Unfortunately, the two are different, see Nikita's comment for a fuller explanation, he explains it well.

(Note: I'm going to address Nikita's last style comment in a separate commit just to minimize chance of subtle bugs being introduced due to typos.)

Differential Revision: https://reviews.llvm.org/D62939



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363613 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-17 20:32:22 +00:00
Philip Reames
185d55ed9f Reduced test case for pr42279 in advance of the relevant re-commit + fix
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363601 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-17 19:27:45 +00:00
Nikita Popov
bf4a9f7325 [SimplifyIndVar] Simplify non-overflowing saturating add/sub
If we can detect that saturating math that depends on an IV cannot
overflow, replace it with simple math. This is similar to the CVP
optimization from D62703, just based on a different underlying
analysis (SCEV vs LVI) that catches different cases.

Differential Revision: https://reviews.llvm.org/D62792

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363489 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-15 08:48:52 +00:00
Florian Hahn
6c7043a228 Revert Fix a bug w/inbounds invalidation in LFTR
Reverting because it breaks a green dragon build:
    http://green.lab.llvm.org/green/job/clang-stage2-Rthinlto/18208

This reverts r363289 (git commit eb88badff96dacef8fce3f003dec34c2ef6900bf)

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363427 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-14 17:23:09 +00:00
Sam Parker
d021f415d1 [SCEV] Pass NoWrapFlags when expanding an AddExpr
InsertBinop now accepts NoWrapFlags, so pass them through when
expanding a simple add expression.

This is the first re-commit of the functional changes from rL362687,
which was previously reverted.

Differential Revision: https://reviews.llvm.org/D61934


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363364 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-14 09:19:41 +00:00
Philip Reames
b8ff4408f1 Fix a bug w/inbounds invalidation in LFTR
This contains fixes for two cases where we might invalidate inbounds and leave it stale in the IR (a miscompile). Case 1 is when switching to an IV with no dynamically live uses, and case 2 is when doing pre-to-post conversion on the same pointer type IV.

The basic scheme used is to prove that using the given IV (pre or post increment forms) would have to already trigger UB on the path to the test we're modifying.  As such, our potential UB triggering use does not change the semantics of the original program.

As was pointed out in the review thread by Nikita, this is defending against a separate issue from the hasConcreteDef case. This is about poison, that's about undef. Unfortunately, the two are different, see Nikita's comment for a fuller explanation, he explains it well.

(Note: I'm going to address Nikita's last style comment in a separate commit just to minimize chance of subtle bugs being introduced due to typos.)

Differential Revision: https://reviews.llvm.org/D62939



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363289 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-13 18:23:13 +00:00
Philip Reames
aac3847fc2 [Tests] Highlight impact of multiple exit LFTR (D62625) as requested by reviewer
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363217 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-12 23:39:49 +00:00
Philip Reames
58c9643afb [Tests] Autogen RLEV test and add tests for a future enhancement
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363193 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-12 19:23:10 +00:00
Philip Reames
5ea6536b43 [Tests] Add tests to highlight sibling loop optimization order issue for exit rewriting
The issue addressed in r363180 is more broadly relevant.  For the moment, we don't actually get any of these cases because we a) restrict SCEV formation due to SCEExpander needing to preserve LCSSA, and b) don't iterate between loops.




git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363192 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-12 19:04:51 +00:00
Philip Reames
4298a2dd20 [SCEV] Teach computeSCEVAtScope benefit from one-input Phi. PR39673
SCEV does not propagate arguments through one-input Phis so as to make it easy for the SCEV expander (and related code) to preserve LCSSA.  It's not entirely clear this restriction is neccessary, but for the moment it exists.   For this reason, we don't analyze single-entry phi inputs.  However it is possible that when an this input leaves the loop through LCSSA Phi, it is a provable constant.  Missing that results in an order of optimization issue in loop exit value rewriting where we miss some oppurtunities based on order in which we visit sibling loops.

This patch teaches computeSCEVAtScope about this case. We can generalize it later, but so far we can only replace LCSSA Phis with their constant loop-exiting values.  We should probably also add similiar logic directly in the SCEV construction path itself.

Patch by: mkazantsev (with revised commit message by me)
Differential Revision: https://reviews.llvm.org/D58113



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363180 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-12 17:21:47 +00:00
Philip Reames
28bea3dbfe Generalize icmp matching in IndVars' eliminateTrunc
We were only matching RHS being a loop invariant value, not the inverse. Since there's nothing which appears to canonicalize loop invariant values to RHS, this means we missed cases.

Differential Revision: https://reviews.llvm.org/D63112



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363108 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-11 22:43:25 +00:00
Philip Reames
85a2f58ba7 [Tests] Adjust LFTR dead-iv tests to bypass undef cases
As pointed out by Nikita in review, undef and poison need to be handled separately.  Since we're no longer expecting any test improvements - just fixes for miscompiles - update the tests to bypass the existing undef check.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@363002 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-10 23:17:10 +00:00
Philip Reames
150fff8ce4 [Tests] Split an LFTR dead-iv case
There are two interesting sub-cases here.  1) Switching IVs is legal, but only in pre-increment form.  and 2) Switching IVs is legal, and so is post-increment form.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362993 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-10 22:33:20 +00:00
Philip Reames
656142808a [Tests] Add tests for D62939 (miscompiles around dead pointer IVs)
Flesh out a collection of tests for switching to a dead IV within LFTR, both for the current miscompile, and for some cases which we should be able to handle via simple reasoning.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362976 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-10 19:45:59 +00:00
Philip Reames
4d44152a0f [LFTR] Use recomputed BE count
This was discussed as part of D62880.  The basic thought is that computing BE taken count after widening should produce (on average) an equally good backedge taken count as the one before widening.  Since there's only one test in the suite which is impacted by this change, and it's essentially equivelent codegen, that seems to be a reasonable assertion.  This change was separated from r362971 so that if this turns out to be problematic, the triggering piece is obvious and easily revertable.

For the nestedIV example from elim-extend.ll, we end up with the following BE counts:
BEFORE: (-2 + (-1 * %innercount) + %limit)
AFTER: (-1 + (sext i32 (-1 + %limit) to i64) + (-1 * (sext i32 %innercount to i64))<nsw>)

Note that before is an i32 type, and the after is an i64.  Truncating the i64 produces the i32. 



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362975 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-10 19:18:53 +00:00
Benjamin Kramer
600f7b5a8d Revert "[SCEV] Use wrap flags in InsertBinop"
This reverts commit r362687. Miscompiles llvm-profdata during selfhost.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362699 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-06 12:35:46 +00:00
Sam Parker
6892c31cf9 [SCEV] Use wrap flags in InsertBinop
If the given SCEVExpr has no (un)signed flags attached to it, transfer
these to the resulting instruction or use them to find an existing
instruction.

Differential Revision: https://reviews.llvm.org/D61934


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362687 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-06 08:56:26 +00:00
Philip Reames
999ea6229b [Tests] Add poison inference tests for indvars showing both existing transforms, and some room for improvement
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362628 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-05 18:00:59 +00:00
Philip Reames
2b5c011fba [Tests] Autogen a test so future changes are visible
Oddly, I had to change a value name from "tmp0" to "bc0" to get the autogened test to pass.  I'm putting this down to an oddity of update_test_checks or FileCheck, but don't understand it.



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362532 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-04 17:29:55 +00:00
Philip Reames
453fdb5df4 [Tests] Update a test to consistently use new pass manager and FileCheck the result
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362518 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-04 16:19:34 +00:00
Philip Reames
782c705555 [Tests] Autogen tests so that diffs for a future change are understandable
git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362516 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-04 16:15:19 +00:00
Philip Reames
b58d51e15c [Tests] Add LFTR tests for multiple exit loops (try 2)
(Recommit after fixing a keymash in the run line.  Sorry for breakage.)

This is preparation for D62625 <https://reviews.llvm.org/D62625>



git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362426 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-03 17:41:12 +00:00
Dmitri Gribenko
4e8306274d Revert "[Tests] Add LFTR tests for multiple exit loops"
This reverts commit r362417.  There's a syntax error in the RUN line.

git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@362418 91177308-0d34-0410-b5e6-96231b3b80d8
2019-06-03 16:58:11 +00:00