102 Commits

Author SHA1 Message Date
Sanjoy Das
2136a5bc49 Revert "Relax constraints for reduction vectorization"
This reverts commit r355868.  Breaks hexagon.

llvm-svn: 355873
2019-03-11 22:37:31 +00:00
Sanjoy Das
93f8cc186a Relax constraints for reduction vectorization
Summary:
Gating vectorizing reductions on *all* fastmath flags seems unnecessary;
`reassoc` should be sufficient.

Reviewers: tvvikram, mkuper, kristof.beyls, sdesmalen, Ayal

Reviewed By: sdesmalen

Subscribers: dcaballe, huntergr, jmolloy, mcrosier, jlebar, bixia, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57728

llvm-svn: 355868
2019-03-11 21:36:41 +00:00
Chijun Sima
f131d6110e [DTU] Deprecate insertEdge*/deleteEdge*
Summary: This patch converts all existing `insertEdge*/deleteEdge*` to `applyUpdates` and marks `insertEdge*/deleteEdge*` as deprecated.

Reviewers: kuhar, brzycki

Reviewed By: kuhar, brzycki

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D58443

llvm-svn: 354652
2019-02-22 05:41:43 +00:00
Alina Sbirlea
97468e9282 [MemorySSA & LoopPassManager] Update MemorySSA in formDedicatedExitBlocks.
MemorySSA is now updated when forming dedicated exit blocks.
Resolves PR40037.

llvm-svn: 354623
2019-02-21 21:13:34 +00:00
Craig Topper
784929d045 Implementation of asm-goto support in LLVM
This patch accompanies the RFC posted here:
http://lists.llvm.org/pipermail/llvm-dev/2018-October/127239.html

This patch adds a new CallBr IR instruction to support asm-goto
inline assembly like gcc as used by the linux kernel. This
instruction is both a call instruction and a terminator
instruction with multiple successors. Only inline assembly
usage is supported today.

This also adds a new INLINEASM_BR opcode to SelectionDAG and
MachineIR to represent an INLINEASM block that is also
considered a terminator instruction.

There will likely be more bug fixes and optimizations to follow
this, but we felt it had reached a point where we would like to
switch to an incremental development model.

Patch by Craig Topper, Alexander Ivchenko, Mikhail Dvoretckii

Differential Revision: https://reviews.llvm.org/D53765

llvm-svn: 353563
2019-02-08 20:48:56 +00:00
Richard Trieu
5f436fc57a Move DomTreeUpdater from IR to Analysis
DomTreeUpdater depends on headers from Analysis, but is in IR.  This is a
layering violation since Analysis depends on IR.  Relocate this code from IR
to Analysis to fix the layering violation.

llvm-svn: 353265
2019-02-06 02:52:52 +00:00
Michael Kruse
70560a0a2c [WarnMissedTransforms] Do not warn about already vectorized loops.
LoopVectorize adds llvm.loop.isvectorized, but leaves
llvm.loop.vectorize.enable. Do not consider such a loop for user-forced
vectorization since vectorization already happened -- by prioritizing
llvm.loop.isvectorized except for TM_SuppressedByUser.

Fixes http://llvm.org/PR40546

Differential Revision: https://reviews.llvm.org/D57542

llvm-svn: 353082
2019-02-04 19:55:59 +00:00
Alina Sbirlea
f9027e554a Check bool attribute value in getOptionalBoolLoopAttribute.
Summary:
Check the bool value of the attribute in getOptionalBoolLoopAttribute
not just its existance.
Eliminates the warning noise generated when vectorization is explicitly disabled.

Reviewers: Meinersbur, hfinkel, dmgreen

Subscribers: jlebar, sanjoy, llvm-commits

Differential Revision: https://reviews.llvm.org/D57260

llvm-svn: 352555
2019-01-29 22:33:20 +00:00
Chandler Carruth
2946cd7010 Update the file headers across all of the LLVM projects in the monorepo
to reflect the new license.

We understand that people may be surprised that we're moving the header
entirely to discuss the new license. We checked this carefully with the
Foundation's lawyer and we believe this is the correct approach.

Essentially, all code in the project is now made available by the LLVM
project under our new license, so you will see that the license headers
include that license only. Some of our contributors have contributed
code under our old license, and accordingly, we have retained a copy of
our old license notice in the top-level files in each project and
repository.

llvm-svn: 351636
2019-01-19 08:50:56 +00:00
Max Kazantsev
a78dc4d6c8 [NFC] Move some functions to LoopUtils
llvm-svn: 351179
2019-01-15 09:51:34 +00:00
Michael Kruse
978ba61536 Introduce llvm.loop.parallel_accesses and llvm.access.group metadata.
The current llvm.mem.parallel_loop_access metadata has a problem in that
it uses LoopIDs. LoopID unfortunately is not loop identifier. It is
neither unique (there's even a regression test assigning the some LoopID
to multiple loops; can otherwise happen if passes such as LoopVersioning
make copies of entire loops) nor persistent (every time a property is
removed/added from a LoopID's MDNode, it will also receive a new LoopID;
this happens e.g. when calling Loop::setLoopAlreadyUnrolled()).
Since most loop transformation passes change the loop attributes (even
if it just to mark that a loop should not be processed again as
llvm.loop.isvectorized does, for the versioned and unversioned loop),
the parallel access information is lost for any subsequent pass.

This patch unlinks LoopIDs and parallel accesses.
llvm.mem.parallel_loop_access metadata on instruction is replaced by
llvm.access.group metadata. llvm.access.group points to a distinct
MDNode with no operands (avoiding the problem to ever need to add/remove
operands), called "access group". Alternatively, it can point to a list
of access groups. The LoopID then has an attribute
llvm.loop.parallel_accesses with all the access groups that are parallel
(no dependencies carries by this loop).

This intentionally avoid any kind of "ID". Loops that are clones/have
their attributes modifies retain the llvm.loop.parallel_accesses
attribute. Access instructions that a cloned point to the same access
group. It is not necessary for each access to have it's own "ID" MDNode,
but those memory access instructions with the same behavior can be
grouped together.

The behavior of llvm.mem.parallel_loop_access is not changed by this
patch, but should be considered deprecated.

Differential Revision: https://reviews.llvm.org/D52116

llvm-svn: 349725
2018-12-20 04:58:07 +00:00
Davide Italiano
9737096bb1 [LoopUtils] Use i32 instead of void.
The actual type of the first argument of the @dbg intrinsic
doesn't really matter as we're setting it to `undef`, but the
bitcode reader is picky about `void` types.

llvm-svn: 349069
2018-12-13 18:37:23 +00:00
Davide Italiano
8ee59ca653 [LoopUtils] Prefer a set over a map. NFCI.
llvm-svn: 348999
2018-12-13 01:11:52 +00:00
Davide Italiano
744c3c327f [LoopDeletion] Update debug values after loop deletion.
When loops are deleted, we don't keep track of variables modified inside
the loops, so the DI will contain the wrong value for these.

e.g.

int b() {

int i;
for (i = 0; i < 2; i++)
  ;
patatino();
return a;
-> 6 patatino();

7     return a;
8   }
9   int main() { b(); }
(lldb) frame var i
(int) i = 0

We mark instead these values as unavailable inserting a
@llvm.dbg.value(undef to make sure we don't end up printing an incorrect
value in the debugger. We could consider doing something fancier,
for, e.g. constants, in the future.

PR39868.
rdar://problem/46418795)

Differential Revision: https://reviews.llvm.org/D55299

llvm-svn: 348988
2018-12-12 23:32:35 +00:00
Michael Kruse
7244852557 [Unroll/UnrollAndJam/Vectorizer/Distribute] Add followup loop attributes.
When multiple loop transformation are defined in a loop's metadata, their order of execution is defined by the order of their respective passes in the pass pipeline. For instance, e.g.

    #pragma clang loop unroll_and_jam(enable)
    #pragma clang loop distribute(enable)

is the same as

    #pragma clang loop distribute(enable)
    #pragma clang loop unroll_and_jam(enable)

and will try to loop-distribute before Unroll-And-Jam because the LoopDistribute pass is scheduled after UnrollAndJam pass. UnrollAndJamPass only supports one inner loop, i.e. it will necessarily fail after loop distribution. It is not possible to specify another execution order. Also,t the order of passes in the pipeline is subject to change between versions of LLVM, optimization options and which pass manager is used.

This patch adds 'followup' attributes to various loop transformation passes. These attributes define which attributes the resulting loop of a transformation should have. For instance,

    !0 = !{!0, !1, !2}
    !1 = !{!"llvm.loop.unroll_and_jam.enable"}
    !2 = !{!"llvm.loop.unroll_and_jam.followup_inner", !3}
    !3 = !{!"llvm.loop.distribute.enable"}

defines a loop ID (!0) to be unrolled-and-jammed (!1) and then the attribute !3 to be added to the jammed inner loop, which contains the instruction to distribute the inner loop.

Currently, in both pass managers, pass execution is in a fixed order and UnrollAndJamPass will not execute again after LoopDistribute. We hope to fix this in the future by allowing pass managers to run passes until a fixpoint is reached, use Polly to perform these transformations, or add a loop transformation pass which takes the order issue into account.

For mandatory/forced transformations (e.g. by having been declared by #pragma omp simd), the user must be notified when a transformation could not be performed. It is not possible that the responsible pass emits such a warning because the transformation might be 'hidden' in a followup attribute when it is executed, or it is not present in the pipeline at all. For this reason, this patche introduces a WarnMissedTransformations pass, to warn about orphaned transformations.

Since this changes the user-visible diagnostic message when a transformation is applied, two test cases in the clang repository need to be updated.

To ensure that no other transformation is executed before the intended one, the attribute `llvm.loop.disable_nonforced` can be added which should disable transformation heuristics before the intended transformation is applied. E.g. it would be surprising if a loop is distributed before a #pragma unroll_and_jam is applied.

With more supported code transformations (loop fusion, interchange, stripmining, offloading, etc.), transformations can be used as building blocks for more complex transformations (e.g. stripmining+stripmining+interchange -> tiling).

Reviewed By: hfinkel, dmgreen

Differential Revision: https://reviews.llvm.org/D49281
Differential Revision: https://reviews.llvm.org/D55288

llvm-svn: 348944
2018-12-12 17:32:52 +00:00
Reid Kleckner
41390b47de Revert r346810 "Preserve loop metadata when splitting exit blocks"
It broke the Windows self-host:
http://lab.llvm.org:8011/builders/clang-x64-windows-msvc/builds/1457

llvm-svn: 346823
2018-11-14 01:47:32 +00:00
Craig Topper
3c87c2a3c5 Preserve loop metadata when splitting exit blocks
LoopUtils.cpp contains a utility that splits an loop exit block, so that the new block contains only edges coming from the loop. In the case of nested loops, the exit path for the inner loop might also be the back-edge of the outer loop. The new block which is inserted on this path, is now a latch for the outer loop, and it needs to hold the loop metadata for the outer loop. (The test case gives a more concrete view of the situation.)

Patch by Chang Lin (clin1)

Differential Revision: https://reviews.llvm.org/D53876

llvm-svn: 346810
2018-11-13 23:06:49 +00:00
Vikram TV
7e98d69847 Break LoopUtils into an Analysis file.
Summary:
The InductionDescriptor and RecurrenceDescriptor classes basically analyze the IR to identify the respective IVs. So, it is better to have them in the "Analysis" directory instead of the "Transforms" directory.

The rationale for this is to make the Induction and Recurrence descriptor classes available for analysis passes. Currently including them in an analysis pass produces link error (http://lists.llvm.org/pipermail/llvm-dev/2018-July/124456.html).

Induction and Recurrence descriptors are moved from Transforms/Utils/LoopUtils.h|cpp to Analysis/IVDescriptors.h|cpp.

Reviewers: dmgreen, llvm-commits, hfinkel

Reviewed By: dmgreen

Subscribers: mgorny

Differential Revision: https://reviews.llvm.org/D51153

llvm-svn: 342016
2018-09-12 01:59:43 +00:00
Vikram TV
09be521d4d Move a transformation routine from LoopUtils to LoopVectorize.
Summary:
Move InductionDescriptor::transform() routine from LoopUtils to its only uses in LoopVectorize.cpp.
Specifically, the function is renamed as InnerLoopVectorizer::emitTransformedIndex().

This is a child to D51153.

Reviewers: dmgreen, llvm-commits

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D51837

llvm-svn: 341776
2018-09-10 06:16:44 +00:00
Vikram TV
6594dc377d Move createMinMaxOp() out of RecurrenceDescriptor.
Reviewers: dmgreen, llvm-commits

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D51838

llvm-svn: 341773
2018-09-10 05:05:08 +00:00
Alina Sbirlea
ab6f84f763 Update MemorySSA in BasicBlockUtils.
Summary:
Extend BasicBlocksUtils to update MemorySSA.

Subscribers: sanjoy, arsenm, nhaehnle, jlebar, Prazek, llvm-commits

Differential Revision: https://reviews.llvm.org/D45300

llvm-svn: 340365
2018-08-21 23:32:03 +00:00
David Green
6cb6478739 [UnJ] Rename hasInvariantIterationCount to hasIterationCountInvariantInParent NFC
This hopefully describes the API of the function more precisely.

llvm-svn: 339762
2018-08-15 10:59:41 +00:00
David Green
395b80cd3c [UnJ] Create a hasInvariantIterationCount function. NFC
Pulled out a separate function for some code that calculates
if an inner loop iteration count is invariant to it's outer
loop.

Differential Revision: https://reviews.llvm.org/D50063

llvm-svn: 339500
2018-08-11 06:57:28 +00:00
Chijun Sima
21a8b605a1 [Dominators] Convert existing passes and utils to use the DomTreeUpdater class
Summary:
This patch is the second in a series of patches related to the [[ http://lists.llvm.org/pipermail/llvm-dev/2018-June/123883.html | RFC - A new dominator tree updater for LLVM ]].

It converts passes (e.g. adce/jump-threading) and various functions which currently accept DDT in local.cpp and BasicBlockUtils.cpp to use the new DomTreeUpdater class.
These converted functions in utils can accept DomTreeUpdater with either UpdateStrategy and can deal with both DT and PDT held by the DomTreeUpdater.

Reviewers: brzycki, kuhar, dmgreen, grosser, davide

Reviewed By: brzycki

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D48967

llvm-svn: 338814
2018-08-03 05:08:17 +00:00
Nicola Zaghen
d34e60ca85 Rename DEBUG macro to LLVM_DEBUG.
The DEBUG() macro is very generic so it might clash with other projects.
The renaming was done as follows:
- git grep -l 'DEBUG' | xargs sed -i 's/\bDEBUG\s\?(/LLVM_DEBUG(/g'
- git diff -U0 master | ../clang/tools/clang-format/clang-format-diff.py -i -p1 -style LLVM
- Manual change to APInt
- Manually chage DOCS as regex doesn't match it.

In the transition period the DEBUG() macro is still present and aliased
to the LLVM_DEBUG() one.

Differential Revision: https://reviews.llvm.org/D43624

llvm-svn: 332240
2018-05-14 12:53:11 +00:00
Adrian Prantl
5f8f34e459 Remove \brief commands from doxygen comments.
We've been running doxygen with the autobrief option for a couple of
years now. This makes the \brief markers into our comments
redundant. Since they are a visual distraction and we don't want to
encourage more \brief markers in new code either, this patch removes
them all.

Patch produced by

  for i in $(git grep -l '\\brief'); do perl -pi -e 's/\\brief //g' $i & done

Differential Revision: https://reviews.llvm.org/D46290

llvm-svn: 331272
2018-05-01 15:54:18 +00:00
Simon Pilgrim
23c2182c2b Support generic expansion of ordered vector reduction (PR36732)
Without the fast math flags, the llvm.experimental.vector.reduce.fadd/fmul intrinsic expansions must be expanded in order.

This patch scalarizes the reduction, applying the accumulator at the start of the sequence: ((((Acc + Scl[0]) + Scl[1]) + Scl[2]) + ) ... + Scl[NumElts-1]

Differential Revision: https://reviews.llvm.org/D45366

llvm-svn: 329585
2018-04-09 15:44:20 +00:00
Simon Pilgrim
a74f4ae404 Strip trailing whitespace. NFCI.
llvm-svn: 329421
2018-04-06 17:01:54 +00:00
Philip Reames
23aed5ef6f [MustExecute] Move isGuaranteedToExecute and related rourtines to Analysis
Next step is to actually merge the implementations and get both implementations tested through the new printer.

llvm-svn: 328055
2018-03-20 22:45:23 +00:00
Philip Reames
8a106272e8 [LICM/mustexec] Extend first iteration must execute logic to fcmps
This builds on the work from https://reviews.llvm.org/D44287. It turned out supporting fcmp was much easier than I realized, so let's do that now.

As an aside, our -O3 handling of a floating point IVs leaves a lot to be desired. We do convert the float IV to an integer IV, but do so late enough that many other optimizations are missed (e.g. we don't vectorize).

Differential Revision: https://reviews.llvm.org/D44542

llvm-svn: 327722
2018-03-16 16:33:49 +00:00
Philip Reames
a21d5f1e18 [LICM] Ignore exits provably not taken on first iteration when computing must execute
It is common to have conditional exits within a loop which are known not to be taken on some iterations, but not necessarily all. This patches extends our reasoning around guaranteed to execute (used when establishing whether it's safe to dereference a location from the preheader) to handle the case where an exit is known not to be taken on the first iteration and the instruction of interest *is* known to be taken on the first iteration.

This case comes up in two major ways:
* If we have a range check which we've been unable to eliminate, we frequently know that it doesn't fail on the first iteration.
* Pass ordering. We may have a check which will be eliminated through some sequence of other passes, but depending on the exact pass sequence we might never actually do so or we might miss other optimizations from passes run before the check is finally eliminated.

The initial version (here) is implemented via InstSimplify. At the moment, it catches a few cases, but misses a lot too. I added test cases for missing cases in InstSimplify which I'll follow up on separately. Longer term, we should probably wire SCEV through to here to get much smarter loop aware simplification of the first iteration predicate.

Differential Revision: https://reviews.llvm.org/D44287

llvm-svn: 327664
2018-03-15 21:04:28 +00:00
Philip Reames
fbffd126b8 [NFC] Factor out a helper function for checking if a block has a potential early implicit exit.
llvm-svn: 327065
2018-03-08 21:25:30 +00:00
David Green
0d5f9651f2 Move llvm::computeLoopSafetyInfo from LICM.cpp to LoopUtils.cpp. NFC
Move computeLoopSafetyInfo, defined in Transforms/Utils/LoopUtils.h,
into the corresponding LoopUtils.cpp, as opposed to LICM where it resides
at the moment. This will allow other functions from Transforms/Utils
to reference it.

llvm-svn: 325151
2018-02-14 18:34:53 +00:00
Chad Rosier
a097bc69df [LV] Use Demanded Bits and ValueTracking for reduction type-shrinking
The type-shrinking logic in reduction detection, although narrow in scope, is
also rather ad-hoc, which has led to bugs (e.g., PR35734). This patch modifies
the approach to rely on the demanded bits and value tracking analyses, if
available. We currently perform type-shrinking separately for reductions and
other instructions in the loop. Long-term, we should probably think about
computing minimal bit widths in a more complete way for the loops we want to
vectorize.

PR35734
Differential Revision: https://reviews.llvm.org/D42309

llvm-svn: 324195
2018-02-04 15:42:24 +00:00
Hiroshi Inoue
d24ddcd6c4 [NFC] fix trivial typos in comments
"the the" -> "the"

llvm-svn: 322934
2018-01-19 10:55:29 +00:00
Serguei Katkov
a757d65cec [LoopDeletion] Handle users in unreachable block
This is a fix for PR35884.

When we want to delete dead loop we must clean uses in unreachable blocks
otherwise we'll get an assert during deletion of instructions from the loop.

Reviewers: anna, davide
Reviewed By: anna
Subscribers: llvm-commits, lebedev.ri
Differential Revision: https://reviews.llvm.org/D41943

llvm-svn: 322357
2018-01-12 07:24:43 +00:00
Benjamin Kramer
c7fc81e659 Use phi ranges to simplify code. No functionality change intended.
llvm-svn: 321585
2017-12-30 15:27:33 +00:00
Benjamin Kramer
802e6255b2 Make helpers static. No functionality change.
llvm-svn: 321425
2017-12-24 12:46:22 +00:00
Dorit Nuzman
4750c785b3 [LV] Support efficient vectorization of an induction with redundant casts
D30041 extended SCEVPredicateRewriter to improve handling of Phi nodes whose
update chain involves casts; PSCEV can now build an AddRecurrence for some
forms of such phi nodes, under the proper runtime overflow test. This means
that we can identify such phi nodes as an induction, and the loop-vectorizer
can now vectorize such inductions, however inefficiently. The vectorizer
doesn't know that it can ignore the casts, and so it vectorizes them.

This patch records the casts in the InductionDescriptor, so that they could
be marked to be ignored for cost calculation (we use VecValuesToIgnore for
that) and ignored for vectorization/widening/scalarization (i.e. treated as
TriviallyDead).

In addition to marking all these casts to be ignored, we also need to make
sure that each cast is mapped to the right vector value in the vector loop body
(be it a widened, vectorized, or scalarized induction). So whenever an
induction phi is mapped to a vector value (during vectorization/widening/
scalarization), we also map the respective cast instruction (if exists) to that
vector value. (If the phi-update sequence of an induction involves more than one
cast, then the above mapping to vector value is relevant only for the last cast
of the sequence as we allow only the "last cast" to be used outside the
induction update chain itself).

This is the last step in addressing PR30654.

llvm-svn: 320672
2017-12-14 07:56:31 +00:00
Sanjay Patel
3e069f5724 [LoopUtils] simplify createTargetReduction(); NFCI
llvm-svn: 319946
2017-12-06 19:37:00 +00:00
Sanjay Patel
1ea7b6f7a1 [LoopUtils] fix variable name to match FMF vocabulary; NFC
llvm-svn: 319928
2017-12-06 19:11:23 +00:00
Sanjay Patel
629c411538 [IR] redefine 'UnsafeAlgebra' / 'reassoc' fast-math-flags and add 'trans' fast-math-flag
As discussed on llvm-dev:
http://lists.llvm.org/pipermail/llvm-dev/2016-November/107104.html
and again more recently:
http://lists.llvm.org/pipermail/llvm-dev/2017-October/118118.html

...this is a step in cleaning up our fast-math-flags implementation in IR to better match
the capabilities of both clang's user-visible flags and the backend's flags for SDNode.

As proposed in the above threads, we're replacing the 'UnsafeAlgebra' bit (which had the 
'umbrella' meaning that all flags are set) with a new bit that only applies to algebraic 
reassociation - 'AllowReassoc'.

We're also adding a bit to allow approximations for library functions called 'ApproxFunc' 
(this was initially proposed as 'libm' or similar).

...and we're out of bits. 7 bits ought to be enough for anyone, right? :) FWIW, I did 
look at getting this out of SubclassOptionalData via SubclassData (spacious 16-bits), 
but that's apparently already used for other purposes. Also, I don't think we can just 
add a field to FPMathOperator because Operator is not intended to be instantiated. 
We'll defer movement of FMF to another day.

We keep the 'fast' keyword. I thought about removing that, but seeing IR like this:
%f.fast = fadd reassoc nnan ninf nsz arcp contract afn float %op1, %op2
...made me think we want to keep the shortcut synonym.

Finally, this change is binary incompatible with existing IR as seen in the 
compatibility tests. This statement:
"Newer releases can ignore features from older releases, but they cannot miscompile 
them. For example, if nsw is ever replaced with something else, dropping it would be 
a valid way to upgrade the IR." 
( http://llvm.org/docs/DeveloperPolicy.html#ir-backwards-compatibility )
...provides the flexibility we want to make this change without requiring a new IR 
version. Ie, we're not loosening the FP strictness of existing IR. At worst, we will 
fail to optimize some previously 'fast' code because it's no longer recognized as 
'fast'. This should get fixed as we audit/squash all of the uses of 'isFast()'.

Note: an inter-dependent clang commit to use the new API name should closely follow 
commit.

Differential Revision: https://reviews.llvm.org/D39304

llvm-svn: 317488
2017-11-06 16:27:15 +00:00
Hans Wennborg
899809d531 Fix a -Wparentheses warning. NFC.
llvm-svn: 314936
2017-10-04 21:14:07 +00:00
Marcello Maggioni
df3e71e037 [LoopDeletion] Move deleteDeadLoop to to LoopUtils. NFC
llvm-svn: 314934
2017-10-04 20:42:46 +00:00
Alina Sbirlea
7ed5856a32 Refactor collectChildrenInLoop to LoopUtils [NFC]
Summary: Move to LoopUtils method that collects all children of a node inside a loop.

Reviewers: majnemer, sanjoy

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D37870

llvm-svn: 313322
2017-09-15 00:04:16 +00:00
Ayal Zaks
25e2800e20 [LV] Minor savings to Sink casts to unravel first order recurrence
Two minor savings: avoid copying the SinkAfter map and avoid moving a cast if it
is not needed.

Differential Revision: https://reviews.llvm.org/D36408

llvm-svn: 310910
2017-08-15 08:32:59 +00:00
Dinar Temirbulatov
a61f4b8957 [LoopUtils] Add an extra parameter OpValue to propagateIRFlags function,
If OpValue is non-null, we only consider operations similar to OpValue
when intersecting.

Differential Revision: https://reviews.llvm.org/D35292

llvm-svn: 308428
2017-07-19 10:02:07 +00:00
Ayal Zaks
2ff59d4350 [LV] Sink casts to unravel first order recurrence
Check if a single cast is preventing handling a first-order-recurrence Phi,
because the scheduling constraints it imposes on the first-order-recurrence
shuffle are infeasible; but they can be made feasible by moving the cast
downwards. Record such casts and move them when vectorizing the loop.

Differential Revision: https://reviews.llvm.org/D33058

llvm-svn: 306884
2017-06-30 21:05:06 +00:00
Chandler Carruth
4a000883c7 [LoopSimplify] Re-instate r306081 with a bug fix w.r.t. indirectbr.
This was reverted in r306252, but I already had the bug fixed and was
just trying to form a test case.

The original commit factored the logic for forming dedicated exits
inside of LoopSimplify into a helper that could be used elsewhere and
with an approach that required fewer intermediate data structures. See
that commit for full details including the change to the statistic, etc.

The code looked fine to me and my reviewers, but in fact didn't handle
indirectbr correctly -- it left the 'InLoopPredecessors' vector dirty.

If you have code that looks *just* right, you can end up leaking these
predecessors into a subsequent rewrite, and crash deep down when trying
to update PHI nodes for predecessors that don't exist.

I've added an assert that makes the bug much more obvious, and then
changed the code to reliably clear the vector so we don't get this bug
again in some other form as the code changes.

I've also added a test case that *does* manage to catch this while also
giving some nice positive coverage in the face of indirectbr.

The real code that found this came out of what I think is CPython's
interpreter loop, but any code with really "creative" interpreter loops
mixing indirectbr and other exit paths could manage to tickle the bug.
I was hard to reduce the original test case because in addition to
having a particular pattern of IR, the whole thing depends on the order
of the predecessors which is in turn depends on use list order. The test
case added here was designed so that in multiple different predecessor
orderings it should always end up going down the same path and tripping
the same bug. I hope. At least, it tripped it for me without
manipulating the use list order which is better than anything bugpoint
could do...

llvm-svn: 306257
2017-06-25 22:45:31 +00:00
Daniel Jasper
4c6cd4ccb7 Revert "[LoopSimplify] Factor the logic to form dedicated exits into a utility."
This leads to a segfault. Chandler already has a test case and should be
able to recommit with a fix soon.

llvm-svn: 306252
2017-06-25 17:58:25 +00:00