[Unroll/UnrollAndJam/Vectorizer/Distribute] Add followup loop attributes.

When multiple loop transformation are defined in a loop's metadata, their order of execution is defined by the order of their respective passes in the pass pipeline. For instance, e.g.

    #pragma clang loop unroll_and_jam(enable)
    #pragma clang loop distribute(enable)

is the same as

    #pragma clang loop distribute(enable)
    #pragma clang loop unroll_and_jam(enable)

and will try to loop-distribute before Unroll-And-Jam because the LoopDistribute pass is scheduled after UnrollAndJam pass. UnrollAndJamPass only supports one inner loop, i.e. it will necessarily fail after loop distribution. It is not possible to specify another execution order. Also,t the order of passes in the pipeline is subject to change between versions of LLVM, optimization options and which pass manager is used.

This patch adds 'followup' attributes to various loop transformation passes. These attributes define which attributes the resulting loop of a transformation should have. For instance,

    !0 = !{!0, !1, !2}
    !1 = !{!"llvm.loop.unroll_and_jam.enable"}
    !2 = !{!"llvm.loop.unroll_and_jam.followup_inner", !3}
    !3 = !{!"llvm.loop.distribute.enable"}

defines a loop ID (!0) to be unrolled-and-jammed (!1) and then the attribute !3 to be added to the jammed inner loop, which contains the instruction to distribute the inner loop.

Currently, in both pass managers, pass execution is in a fixed order and UnrollAndJamPass will not execute again after LoopDistribute. We hope to fix this in the future by allowing pass managers to run passes until a fixpoint is reached, use Polly to perform these transformations, or add a loop transformation pass which takes the order issue into account.

For mandatory/forced transformations (e.g. by having been declared by #pragma omp simd), the user must be notified when a transformation could not be performed. It is not possible that the responsible pass emits such a warning because the transformation might be 'hidden' in a followup attribute when it is executed, or it is not present in the pipeline at all. For this reason, this patche introduces a WarnMissedTransformations pass, to warn about orphaned transformations.

Since this changes the user-visible diagnostic message when a transformation is applied, two test cases in the clang repository need to be updated.

To ensure that no other transformation is executed before the intended one, the attribute `llvm.loop.disable_nonforced` can be added which should disable transformation heuristics before the intended transformation is applied. E.g. it would be surprising if a loop is distributed before a #pragma unroll_and_jam is applied.

With more supported code transformations (loop fusion, interchange, stripmining, offloading, etc.), transformations can be used as building blocks for more complex transformations (e.g. stripmining+stripmining+interchange -> tiling).

Reviewed By: hfinkel, dmgreen

Differential Revision: https://reviews.llvm.org/D49281
Differential Revision: https://reviews.llvm.org/D55288


git-svn-id: https://llvm.org/svn/llvm-project/llvm/trunk@348944 91177308-0d34-0410-b5e6-96231b3b80d8
This commit is contained in:
Michael Kruse 2018-12-12 17:32:52 +00:00
parent e004ab80e8
commit 9a395de086
56 changed files with 2470 additions and 119 deletions

View File

@ -5076,6 +5076,8 @@ optimizations related to compare and branch instructions. The metadata
is treated as a boolean value; if it exists, it signals that the branch
or switch that it is attached to is completely unpredictable.
.. _llvm.loop:
'``llvm.loop``'
^^^^^^^^^^^^^^^
@ -5109,6 +5111,26 @@ suggests an unroll factor to the loop unroller:
!0 = !{!0, !1}
!1 = !{!"llvm.loop.unroll.count", i32 4}
'``llvm.loop.disable_nonforced``'
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
This metadata disables all optional loop transformations unless
explicitly instructed using other transformation metdata such as
``llvm.loop.unroll.enable''. That is, no heuristic will try to determine
whether a transformation is profitable. The purpose is to avoid that the
loop is transformed to a different loop before an explicitly requested
(forced) transformation is applied. For instance, loop fusion can make
other transformations impossible. Mandatory loop canonicalizations such
as loop rotation are still applied.
It is recommended to use this metadata in addition to any llvm.loop.*
transformation directive. Also, any loop should have at most one
directive applied to it (and a sequence of transformations built using
followup-attributes). Otherwise, which transformation will be applied
depends on implementation details such as the pass pipeline order.
See :ref:`transformation-metadata` for details.
'``llvm.loop.vectorize``' and '``llvm.loop.interleave``'
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@ -5167,6 +5189,29 @@ vectorization of the loop. If ``llvm.loop.vectorize.width`` is set to
0 or if the loop does not have this metadata the width will be
determined automatically.
'``llvm.loop.vectorize.followup_vectorized``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
This metadata defines which loop attributes the vectorized loop will
have. See :ref:`transformation-metadata` for details.
'``llvm.loop.vectorize.followup_epilogue``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
This metadata defines which loop attributes the epilogue will have. The
epilogue is not vectorized and is executed when either the vectorized
loop is not known to preserve semantics (because e.g., it processes two
arrays that are found to alias by a runtime check) or for the last
iterations that do not fill a complete set of vector lanes. See
:ref:`Transformation Metadata <transformation-metadata>` for details.
'``llvm.loop.vectorize.followup_all``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Attributes in the metadata will be added to both the vectorized and
epilogue loop.
See :ref:`Transformation Metadata <transformation-metadata>` for details.
'``llvm.loop.unroll``'
^^^^^^^^^^^^^^^^^^^^^^
@ -5235,6 +5280,19 @@ For example:
!0 = !{!"llvm.loop.unroll.full"}
'``llvm.loop.unroll.followup``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
This metadata defines which loop attributes the unrolled loop will have.
See :ref:`Transformation Metadata <transformation-metadata>` for details.
'``llvm.loop.unroll.followup_remainder``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
This metadata defines which loop attributes the remainder loop after
partial/runtime unrolling will have. See
:ref:`Transformation Metadata <transformation-metadata>` for details.
'``llvm.loop.unroll_and_jam``'
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@ -5288,6 +5346,43 @@ string ``llvm.loop.unroll_and_jam.enable``. For example:
!0 = !{!"llvm.loop.unroll_and_jam.enable"}
'``llvm.loop.unroll_and_jam.followup_outer``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
This metadata defines which loop attributes the outer unrolled loop will
have. See :ref:`Transformation Metadata <transformation-metadata>` for
details.
'``llvm.loop.unroll_and_jam.followup_inner``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
This metadata defines which loop attributes the inner jammed loop will
have. See :ref:`Transformation Metadata <transformation-metadata>` for
details.
'``llvm.loop.unroll_and_jam.followup_remainder_outer``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
This metadata defines which attributes the epilogue of the outer loop
will have. This loop is usually unrolled, meaning there is no such
loop. This attribute will be ignored in this case. See
:ref:`Transformation Metadata <transformation-metadata>` for details.
'``llvm.loop.unroll_and_jam.followup_remainder_inner``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
This metadata defines which attributes the inner loop of the epilogue
will have. The outer epilogue will usually be unrolled, meaning there
can be multiple inner remainder loops. See
:ref:`Transformation Metadata <transformation-metadata>` for details.
'``llvm.loop.unroll_and_jam.followup_all``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Attributes specified in the metadata is added to all
``llvm.loop.unroll_and_jam.*`` loops. See
:ref:`Transformation Metadata <transformation-metadata>` for details.
'``llvm.loop.licm_versioning.disable``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
@ -5320,6 +5415,34 @@ enabled. A value of 0 disables distribution:
This metadata should be used in conjunction with ``llvm.loop`` loop
identification metadata.
'``llvm.loop.distribute.followup_coincident``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
This metadata defines which attributes extracted loops with no cyclic
dependencies will have (i.e. can be vectorized). See
:ref:`Transformation Metadata <transformation-metadata>` for details.
'``llvm.loop.distribute.followup_sequential``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
This metadata defines which attributes the isolated loops with unsafe
memory dependencies will have. See
:ref:`Transformation Metadata <transformation-metadata>` for details.
'``llvm.loop.distribute.followup_fallback``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
If loop versioning is necessary, this metadata defined the attributes
the non-distributed fallback version will have. See
:ref:`Transformation Metadata <transformation-metadata>` for details.
'``llvm.loop.distribute.followup_all``' Metadata
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Thes attributes in this metdata is added to all followup loops of the
loop distribution pass. See
:ref:`Transformation Metadata <transformation-metadata>` for details.
'``llvm.mem``'
^^^^^^^^^^^^^^^

View File

@ -1224,3 +1224,8 @@ Displays the post dominator tree using the GraphViz tool.
Displays the post dominator tree using the GraphViz tool, but omitting function
bodies.
``-transform-warning``: Report missed forced transformations
------------------------------------------------------------
Emits warnings about not yet applied forced transformations (e.g. from
``#pragma omp simd``).

441
docs/TransformMetadata.rst Normal file
View File

@ -0,0 +1,441 @@
.. _transformation-metadata:
============================
Code Transformation Metadata
============================
.. contents::
:local:
Overview
========
LLVM transformation passes can be controlled by attaching metadata to
the code to transform. By default, transformation passes use heuristics
to determine whether or not to perform transformations, and when doing
so, other details of how the transformations are applied (e.g., which
vectorization factor to select).
Unless the optimizer is otherwise directed, transformations are applied
conservatively. This conservatism generally allows the optimizer to
avoid unprofitable transformations, but in practice, this results in the
optimizer not applying transformations that would be highly profitable.
Frontends can give additional hints to LLVM passes on which
transformations they should apply. This can be additional knowledge that
cannot be derived from the emitted IR, or directives passed from the
user/programmer. OpenMP pragmas are an example of the latter.
If any such metadata is dropped from the program, the code's semantics
must not change.
Metadata on Loops
=================
Attributes can be attached to loops as described in :ref:`llvm.loop`.
Attributes can describe properties of the loop, disable transformations,
force specific transformations and set transformation options.
Because metadata nodes are immutable (with the exception of
``MDNode::replaceOperandWith`` which is dangerous to use on uniqued
metadata), in order to add or remove a loop attributes, a new ``MDNode``
must be created and assigned as the new ``llvm.loop`` metadata. Any
connection between the old ``MDNode`` and the loop is lost. The
``llvm.loop`` node is also used as LoopID (``Loop::getLoopID()``), i.e.
the loop effectively gets a new identifier. For instance,
``llvm.mem.parallel_loop_access`` references the LoopID. Therefore, if
the parallel access property is to be preserved after adding/removing
loop attributes, any ``llvm.mem.parallel_loop_access`` reference must be
updated to the new LoopID.
Transformation Metadata Structure
=================================
Some attributes describe code transformations (unrolling, vectorizing,
loop distribution, etc.). They can either be a hint to the optimizer
that a transformation might be beneficial, instruction to use a specific
option, , or convey a specific request from the user (such as
``#pragma clang loop`` or ``#pragma omp simd``).
If a transformation is forced but cannot be carried-out for any reason,
an optimization-missed warning must be emitted. Semantic information
such as a transformation being safe (e.g.
``llvm.mem.parallel_loop_access``) can be unused by the optimizer
without generating a warning.
Unless explicitly disabled, any optimization pass may heuristically
determine whether a transformation is beneficial and apply it. If
metadata for another transformation was specified, applying a different
transformation before it might be inadvertent due to being applied on a
different loop or the loop not existing anymore. To avoid having to
explicitly disable an unknown number of passes, the attribute
``llvm.loop.disable_nonforced`` disables all optional, high-level,
restructuring transformations.
The following example avoids the loop being altered before being
vectorized, for instance being unrolled.
.. code-block:: llvm
br i1 %exitcond, label %for.exit, label %for.header, !llvm.loop !0
...
!0 = distinct !{!0, !1, !2}
!1 = !{!"llvm.loop.vectorize.enable", i1 true}
!2 = !{!"llvm.loop.disable_nonforced"}
After a transformation is applied, follow-up attributes are set on the
transformed and/or new loop(s). This allows additional attributes
including followup-transformations to be specified. Specifying multiple
transformations in the same metadata node is possible for compatibility
reasons, but their execution order is undefined. For instance, when
``llvm.loop.vectorize.enable`` and ``llvm.loop.unroll.enable`` are
specified at the same time, unrolling may occur either before or after
vectorization.
As an example, the following instructs a loop to be vectorized and only
then unrolled.
.. code-block:: llvm
!0 = distinct !{!0, !1, !2, !3}
!1 = !{!"llvm.loop.vectorize.enable", i1 true}
!2 = !{!"llvm.loop.disable_nonforced"}
!3 = !{!"llvm.loop.vectorize.followup_vectorized", !{"llvm.loop.unroll.enable"}}
If, and only if, no followup is specified, the pass may add attributes itself.
For instance, the vectorizer adds a ``llvm.loop.isvectorized`` attribute and
all attributes from the original loop excluding its loop vectorizer
attributes. To avoid this, an empty followup attribute can be used, e.g.
.. code-block:: llvm
!3 = !{!"llvm.loop.vectorize.followup_vectorized"}
The followup attributes of a transformation that cannot be applied will
never be added to a loop and are therefore effectively ignored. This means
that any followup-transformation in such attributes requires that its
prior transformations are applied before the followup-transformation.
The user should receive a warning about the first transformation in the
transformation chain that could not be applied if it a forced
transformation. All following transformations are skipped.
Pass-Specific Transformation Metadata
=====================================
Transformation options are specific to each transformation. In the
following, we present the model for each LLVM loop optimization pass and
the metadata to influence them.
Loop Vectorization and Interleaving
-----------------------------------
Loop vectorization and interleaving is interpreted as a single
transformation. It is interpreted as forced if
``!{"llvm.loop.vectorize.enable", i1 true}`` is set.
Assuming the pre-vectorization loop is
.. code-block:: c
for (int i = 0; i < n; i+=1) // original loop
Stmt(i);
then the code after vectorization will be approximately (assuming an
SIMD width of 4):
.. code-block:: c
int i = 0;
if (rtc) {
for (; i + 3 < n; i+=4) // vectorized/interleaved loop
Stmt(i:i+3);
}
for (; i < n; i+=1) // epilogue loop
Stmt(i);
where ``rtc`` is a generated runtime check.
``llvm.loop.vectorize.followup_vectorized`` will set the attributes for
the vectorized loop. If not specified, ``llvm.loop.isvectorized`` is
combined with the original loop's attributes to avoid it being
vectorized multiple times.
``llvm.loop.vectorize.followup_epilogue`` will set the attributes for
the remainder loop. If not specified, it will have the original loop's
attributes combined with ``llvm.loop.isvectorized`` and
``llvm.loop.unroll.runtime.disable`` (unless the original loop already
has unroll metadata).
The attributes specified by ``llvm.loop.vectorize.followup_all`` are
added to both loops.
When using a follow-up attribute, it replaces any automatically deduced
attributes for the generated loop in question. Therefore it is
recommended to add ``llvm.loop.isvectorized`` to
``llvm.loop.vectorize.followup_all`` which avoids that the loop
vectorizer tries to optimize the loops again.
Loop Unrolling
--------------
Unrolling is interpreted as forced any ``!{!"llvm.loop.unroll.enable"}``
metadata or option (``llvm.loop.unroll.count``, ``llvm.loop.unroll.full``)
is present. Unrolling can be full unrolling, partial unrolling of a loop
with constant trip count or runtime unrolling of a loop with a trip
count unknown at compile-time.
If the loop has been unrolled fully, there is no followup-loop. For
partial/runtime unrolling, the original loop of
.. code-block:: c
for (int i = 0; i < n; i+=1) // original loop
Stmt(i);
is transformed into (using an unroll factor of 4):
.. code-block:: c
int i = 0;
for (; i + 3 < n; i+=4) // unrolled loop
Stmt(i);
Stmt(i+1);
Stmt(i+2);
Stmt(i+3);
}
for (; i < n; i+=1) // remainder loop
Stmt(i);
``llvm.loop.unroll.followup_unrolled`` will set the loop attributes of
the unrolled loop. If not specified, the attributes of the original loop
without the ``llvm.loop.unroll.*`` attributes are copied and
``llvm.loop.unroll.disable`` added to it.
``llvm.loop.unroll.followup_remainder`` defines the attributes of the
remainder loop. If not specified the remainder loop will have no
attributes. The remainder loop might not be present due to being fully
unrolled in which case this attribute has no effect.
Attributes defined in ``llvm.loop.unroll.followup_all`` are added to the
unrolled and remainder loops.
To avoid that the partially unrolled loop is unrolled again, it is
recommended to add ``llvm.loop.unroll.disable`` to
``llvm.loop.unroll.followup_all``. If no follow-up attribute specified
for a generated loop, it is added automatically.
Unroll-And-Jam
--------------
Unroll-and-jam uses the following transformation model (here with an
unroll factor if 2). Currently, it does not support a fallback version
when the transformation is unsafe.
.. code-block:: c
for (int i = 0; i < n; i+=1) { // original outer loop
Fore(i);
for (int j = 0; j < m; j+=1) // original inner loop
SubLoop(i, j);
Aft(i);
}
.. code-block:: c
int i = 0;
for (; i + 1 < n; i+=2) { // unrolled outer loop
Fore(i);
Fore(i+1);
for (int j = 0; j < m; j+=1) { // unrolled inner loop
SubLoop(i, j);
SubLoop(i+1, j);
}
Aft(i);
Aft(i+1);
}
for (; i < n; i+=1) { // remainder outer loop
Fore(i);
for (int j = 0; j < m; j+=1) // remainder inner loop
SubLoop(i, j);
Aft(i);
}
``llvm.loop.unroll_and_jam.followup_outer`` will set the loop attributes
of the unrolled outer loop. If not specified, the attributes of the
original outer loop without the ``llvm.loop.unroll.*`` attributes are
copied and ``llvm.loop.unroll.disable`` added to it.
``llvm.loop.unroll_and_jam.followup_inner`` will set the loop attributes
of the unrolled inner loop. If not specified, the attributes of the
original inner loop are used unchanged.
``llvm.loop.unroll_and_jam.followup_remainder_outer`` sets the loop
attributes of the outer remainder loop. If not specified it will not
have any attributes. The remainder loop might not be present due to
being fully unrolled.
``llvm.loop.unroll_and_jam.followup_remainder_inner`` sets the loop
attributes of the inner remainder loop. If not specified it will have
the attributes of the original inner loop. It the outer remainder loop
is unrolled, the inner remainder loop might be present multiple times.
Attributes defined in ``llvm.loop.unroll_and_jam.followup_all`` are
added to all of the aforementioned output loops.
To avoid that the unrolled loop is unrolled again, it is
recommended to add ``llvm.loop.unroll.disable`` to
``llvm.loop.unroll_and_jam.followup_all``. It suppresses unroll-and-jam
as well as an additional inner loop unrolling. If no follow-up
attribute specified for a generated loop, it is added automatically.
Loop Distribution
-----------------
The LoopDistribution pass tries to separate vectorizable parts of a loop
from the non-vectorizable part (which otherwise would make the entire
loop non-vectorizable). Conceptually, it transforms a loop such as
.. code-block:: c
for (int i = 1; i < n; i+=1) { // original loop
A[i] = i;
B[i] = 2 + B[i];
C[i] = 3 + C[i - 1];
}
into the following code:
.. code-block:: c
if (rtc) {
for (int i = 1; i < n; i+=1) // coincident loop
A[i] = i;
for (int i = 1; i < n; i+=1) // coincident loop
B[i] = 2 + B[i];
for (int i = 1; i < n; i+=1) // sequential loop
C[i] = 3 + C[i - 1];
} else {
for (int i = 1; i < n; i+=1) { // fallback loop
A[i] = i;
B[i] = 2 + B[i];
C[i] = 3 + C[i - 1];
}
}
where ``rtc`` is a generated runtime check.
``llvm.loop.distribute.followup_coincident`` sets the loop attributes of
all loops without loop-carried dependencies (i.e. vectorizable loops).
There might be more than one such loops. If not defined, the loops will
inherit the original loop's attributes.
``llvm.loop.distribute.followup_sequential`` sets the loop attributes of the
loop with potentially unsafe dependencies. There should be at most one
such loop. If not defined, the loop will inherit the original loop's
attributes.
``llvm.loop.distribute.followup_fallback`` defines the loop attributes
for the fallback loop, which is a copy of the original loop for when
loop versioning is required. If undefined, the fallback loop inherits
all attributes from the original loop.
Attributes defined in ``llvm.loop.distribute.followup_all`` are added to
all of the aforementioned output loops.
It is recommended to add ``llvm.loop.disable_nonforced`` to
``llvm.loop.distribute.followup_fallback``. This avoids that the
fallback version (which is likely never executed) is further optimzed
which would increase the code size.
Versioning LICM
---------------
The pass hoists code out of loops that are only loop-invariant when
dynamic conditions apply. For instance, it transforms the loop
.. code-block:: c
for (int i = 0; i < n; i+=1) // original loop
A[i] = B[0];
into:
.. code-block:: c
if (rtc) {
auto b = B[0];
for (int i = 0; i < n; i+=1) // versioned loop
A[i] = b;
} else {
for (int i = 0; i < n; i+=1) // unversioned loop
A[i] = B[0];
}
The runtime condition (``rtc``) checks that the array ``A`` and the
element `B[0]` do not alias.
Currently, this transformation does not support followup-attributes.
Loop Interchange
----------------
Currently, the ``LoopInterchange`` pass does not use any metadata.
Ambiguous Transformation Order
==============================
If there multiple transformations defined, the order in which they are
executed depends on the order in LLVM's pass pipeline, which is subject
to change. The default optimization pipeline (anything higher than
``-O0``) has the following order.
When using the legacy pass manager:
- LoopInterchange (if enabled)
- SimpleLoopUnroll/LoopFullUnroll (only performs full unrolling)
- VersioningLICM (if enabled)
- LoopDistribute
- LoopVectorizer
- LoopUnrollAndJam (if enabled)
- LoopUnroll (partial and runtime unrolling)
When using the legacy pass manager with LTO:
- LoopInterchange (if enabled)
- SimpleLoopUnroll/LoopFullUnroll (only performs full unrolling)
- LoopVectorizer
- LoopUnroll (partial and runtime unrolling)
When using the new pass manager:
- SimpleLoopUnroll/LoopFullUnroll (only performs full unrolling)
- LoopDistribute
- LoopVectorizer
- LoopUnrollAndJam (if enabled)
- LoopUnroll (partial and runtime unrolling)
Leftover Transformations
========================
Forced transformations that have not been applied after the last
transformation pass should be reported to the user. The transformation
passes themselves cannot be responsible for this reporting because they
might not be in the pipeline, there might be multiple passes able to
apply a transformation (e.g. ``LoopInterchange`` and Polly) or a
transformation attribute may be 'hidden' inside another passes' followup
attribute.
The pass ``-transform-warning`` (``WarnMissedTransformationsPass``)
emits such warnings. It should be placed after the last transformation
pass.
The current pass pipeline has a fixed order in which transformations
passes are executed. A transformation can be in the followup of a pass
that is executed later and thus leftover. For instance, a loop nest
cannot be distributed and then interchanged with the current pass
pipeline. The loop distribution will execute, but there is no loop
interchange pass following such that any loop interchange metadata will
be ignored. The ``-transform-warning`` should emit a warning in this
case.
Future versions of LLVM may fix this by executing transformations using
a dynamic ordering.

View File

@ -292,6 +292,7 @@ For API clients and LLVM developers.
Statepoints
MergeFunctions
TypeMetadata
TransformMetadata
FaultMaps
MIRLangRef
Coroutines

View File

@ -400,6 +400,7 @@ void initializeUnreachableMachineBlockElimPass(PassRegistry&);
void initializeVerifierLegacyPassPass(PassRegistry&);
void initializeVirtRegMapPass(PassRegistry&);
void initializeVirtRegRewriterPass(PassRegistry&);
void initializeWarnMissedTransformationsLegacyPass(PassRegistry &);
void initializeWasmEHPreparePass(PassRegistry&);
void initializeWholeProgramDevirtPass(PassRegistry&);
void initializeWinEHPreparePass(PassRegistry&);

View File

@ -220,6 +220,7 @@ namespace {
(void) llvm::createFloat2IntPass();
(void) llvm::createEliminateAvailableExternallyPass();
(void) llvm::createScalarizeMaskedMemIntrinPass();
(void) llvm::createWarnMissedTransformationsPass();
(void)new llvm::IntervalPartition();
(void)new llvm::ScalarEvolutionWrapperPass();

View File

@ -484,6 +484,13 @@ FunctionPass *createLibCallsShrinkWrapPass();
// primarily to help other loop passes.
//
Pass *createLoopSimplifyCFGPass();
//===----------------------------------------------------------------------===//
//
// WarnMissedTransformations - This pass emits warnings for leftover forced
// transformations.
//
Pass *createWarnMissedTransformationsPass();
} // End llvm namespace
#endif

View File

@ -0,0 +1,38 @@
//===- WarnMissedTransforms.h -----------------------------------*- C++ -*-===//
//
// The LLVM Compiler Infrastructure
//
// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.
//
//===----------------------------------------------------------------------===//
//
// Emit warnings if forced code transformations have not been performed.
//
//===----------------------------------------------------------------------===//
#ifndef LLVM_TRANSFORMS_SCALAR_WARNMISSEDTRANSFORMS_H
#define LLVM_TRANSFORMS_SCALAR_WARNMISSEDTRANSFORMS_H
#include "llvm/IR/PassManager.h"
namespace llvm {
class Function;
class Loop;
class LPMUpdater;
// New pass manager boilerplate.
class WarnMissedTransformationsPass
: public PassInfoMixin<WarnMissedTransformationsPass> {
public:
explicit WarnMissedTransformationsPass() {}
PreservedAnalyses run(Function &F, FunctionAnalysisManager &AM);
};
// Legacy pass manager boilerplate.
Pass *createWarnMissedTransformationsPass();
void initializeWarnMissedTransformationsLegacyPass(PassRegistry &);
} // end namespace llvm
#endif // LLVM_TRANSFORMS_SCALAR_WARNMISSEDTRANSFORMS_H

View File

@ -171,6 +171,77 @@ SmallVector<Instruction *, 8> findDefsUsedOutsideOfLoop(Loop *L);
Optional<const MDOperand *> findStringMetadataForLoop(Loop *TheLoop,
StringRef Name);
/// Find named metadata for a loop with an integer value.
llvm::Optional<int> getOptionalIntLoopAttribute(Loop *TheLoop, StringRef Name);
/// Create a new loop identifier for a loop created from a loop transformation.
///
/// @param OrigLoopID The loop ID of the loop before the transformation.
/// @param FollowupAttrs List of attribute names that contain attributes to be
/// added to the new loop ID.
/// @param InheritAttrsExceptPrefix Selects which attributes should be inherited
/// from the original loop. The following values
/// are considered:
/// nullptr : Inherit all attributes from @p OrigLoopID.
/// "" : Do not inherit any attribute from @p OrigLoopID; only use
/// those specified by a followup attribute.
/// "<prefix>": Inherit all attributes except those which start with
/// <prefix>; commonly used to remove metadata for the
/// applied transformation.
/// @param AlwaysNew If true, do not try to reuse OrigLoopID and never return
/// None.
///
/// @return The loop ID for the after-transformation loop. The following values
/// can be returned:
/// None : No followup attribute was found; it is up to the
/// transformation to choose attributes that make sense.
/// @p OrigLoopID: The original identifier can be reused.
/// nullptr : The new loop has no attributes.
/// MDNode* : A new unique loop identifier.
Optional<MDNode *>
makeFollowupLoopID(MDNode *OrigLoopID, ArrayRef<StringRef> FollowupAttrs,
const char *InheritOptionsAttrsPrefix = "",
bool AlwaysNew = false);
/// Look for the loop attribute that disables all transformation heuristic.
bool hasDisableAllTransformsHint(const Loop *L);
/// The mode sets how eager a transformation should be applied.
enum TransformationMode {
/// The pass can use heuristics to determine whether a transformation should
/// be applied.
TM_Unspecified,
/// The transformation should be applied without considering a cost model.
TM_Enable,
/// The transformation should not be applied.
TM_Disable,
/// Force is a flag and should not be used alone.
TM_Force = 0x04,
/// The transformation was directed by the user, e.g. by a #pragma in
/// the source code. If the transformation could not be applied, a
/// warning should be emitted.
TM_ForcedByUser = TM_Enable | TM_Force,
/// The transformation must not be applied. For instance, `#pragma clang loop
/// unroll(disable)` explicitly forbids any unrolling to take place. Unlike
/// general loop metadata, it must not be dropped. Most passes should not
/// behave differently under TM_Disable and TM_SuppressedByUser.
TM_SuppressedByUser = TM_Disable | TM_Force
};
/// @{
/// Get the mode for LLVM's supported loop transformations.
TransformationMode hasUnrollTransformation(Loop *L);
TransformationMode hasUnrollAndJamTransformation(Loop *L);
TransformationMode hasVectorizeTransformation(Loop *L);
TransformationMode hasDistributeTransformation(Loop *L);
TransformationMode hasLICMVersioningTransformation(Loop *L);
/// @}
/// Set input string into loop metadata by keeping other values intact.
void addStringMetadataToLoop(Loop *TheLoop, const char *MDString,
unsigned V = 0);

View File

@ -35,6 +35,15 @@ class ScalarEvolution;
using NewLoopsMap = SmallDenseMap<const Loop *, Loop *, 4>;
/// @{
/// Metadata attribute names
const char *const LLVMLoopUnrollFollowupAll = "llvm.loop.unroll.followup_all";
const char *const LLVMLoopUnrollFollowupUnrolled =
"llvm.loop.unroll.followup_unrolled";
const char *const LLVMLoopUnrollFollowupRemainder =
"llvm.loop.unroll.followup_remainder";
/// @}
const Loop* addClonedBlockToLoopInfo(BasicBlock *OriginalBB,
BasicBlock *ClonedBB, LoopInfo *LI,
NewLoopsMap &NewLoops);
@ -61,15 +70,16 @@ LoopUnrollResult UnrollLoop(Loop *L, unsigned Count, unsigned TripCount,
unsigned PeelCount, bool UnrollRemainder,
LoopInfo *LI, ScalarEvolution *SE,
DominatorTree *DT, AssumptionCache *AC,
OptimizationRemarkEmitter *ORE, bool PreserveLCSSA);
OptimizationRemarkEmitter *ORE, bool PreserveLCSSA,
Loop **RemainderLoop = nullptr);
bool UnrollRuntimeLoopRemainder(Loop *L, unsigned Count,
bool AllowExpensiveTripCount,
bool UseEpilogRemainder, bool UnrollRemainder,
LoopInfo *LI,
ScalarEvolution *SE, DominatorTree *DT,
AssumptionCache *AC,
bool PreserveLCSSA);
LoopInfo *LI, ScalarEvolution *SE,
DominatorTree *DT, AssumptionCache *AC,
bool PreserveLCSSA,
Loop **ResultLoop = nullptr);
void computePeelCount(Loop *L, unsigned LoopSize,
TargetTransformInfo::UnrollingPreferences &UP,
@ -84,7 +94,8 @@ LoopUnrollResult UnrollAndJamLoop(Loop *L, unsigned Count, unsigned TripCount,
unsigned TripMultiple, bool UnrollRemainder,
LoopInfo *LI, ScalarEvolution *SE,
DominatorTree *DT, AssumptionCache *AC,
OptimizationRemarkEmitter *ORE);
OptimizationRemarkEmitter *ORE,
Loop **EpilogueLoop = nullptr);
bool isSafeToUnrollAndJam(Loop *L, ScalarEvolution &SE, DominatorTree &DT,
DependenceInfo &DI);

View File

@ -113,7 +113,11 @@ public:
unsigned getWidth() const { return Width.Value; }
unsigned getInterleave() const { return Interleave.Value; }
unsigned getIsVectorized() const { return IsVectorized.Value; }
enum ForceKind getForce() const { return (ForceKind)Force.Value; }
enum ForceKind getForce() const {
if (Force.Value == FK_Undefined && hasDisableAllTransformsHint(TheLoop))
return FK_Disabled;
return (ForceKind)Force.Value;
}
/// If hints are provided that force vectorization, use the AlwaysPrint
/// pass name to force the frontend to print the diagnostic.

View File

@ -237,23 +237,19 @@ MDNode *Loop::getLoopID() const {
}
void Loop::setLoopID(MDNode *LoopID) const {
assert(LoopID && "Loop ID should not be null");
assert(LoopID->getNumOperands() > 0 && "Loop ID needs at least one operand");
assert(LoopID->getOperand(0) == LoopID && "Loop ID should refer to itself");
assert((!LoopID || LoopID->getNumOperands() > 0) &&
"Loop ID needs at least one operand");
assert((!LoopID || LoopID->getOperand(0) == LoopID) &&
"Loop ID should refer to itself");
if (BasicBlock *Latch = getLoopLatch()) {
Latch->getTerminator()->setMetadata(LLVMContext::MD_loop, LoopID);
return;
}
assert(!getLoopLatch() &&
"The loop should have no single latch at this point");
BasicBlock *H = getHeader();
for (BasicBlock *BB : this->blocks()) {
Instruction *TI = BB->getTerminator();
for (BasicBlock *Successor : successors(TI)) {
if (Successor == H)
if (Successor == H) {
TI->setMetadata(LLVMContext::MD_loop, LoopID);
break;
}
}
}
}

View File

@ -148,6 +148,7 @@
#include "llvm/Transforms/Scalar/SpeculateAroundPHIs.h"
#include "llvm/Transforms/Scalar/SpeculativeExecution.h"
#include "llvm/Transforms/Scalar/TailRecursionElimination.h"
#include "llvm/Transforms/Scalar/WarnMissedTransforms.h"
#include "llvm/Transforms/Utils/AddDiscriminators.h"
#include "llvm/Transforms/Utils/BreakCriticalEdges.h"
#include "llvm/Transforms/Utils/EntryExitInstrumenter.h"
@ -835,6 +836,7 @@ PassBuilder::buildModuleOptimizationPipeline(OptimizationLevel Level,
createFunctionToLoopPassAdaptor(LoopUnrollAndJamPass(Level)));
}
OptimizePM.addPass(LoopUnrollPass(LoopUnrollOptions(Level)));
OptimizePM.addPass(WarnMissedTransformationsPass());
OptimizePM.addPass(InstCombinePass());
OptimizePM.addPass(RequireAnalysisPass<OptimizationRemarkEmitterAnalysis, Function>());
OptimizePM.addPass(createFunctionToLoopPassAdaptor(LICMPass(), DebugLogging));

View File

@ -230,6 +230,7 @@ FUNCTION_PASS("verify<memoryssa>", MemorySSAVerifierPass())
FUNCTION_PASS("verify<regions>", RegionInfoVerifierPass())
FUNCTION_PASS("view-cfg", CFGViewerPass())
FUNCTION_PASS("view-cfg-only", CFGOnlyViewerPass())
FUNCTION_PASS("transform-warning", WarnMissedTransformationsPass())
#undef FUNCTION_PASS
#ifndef LOOP_ANALYSIS

View File

@ -702,6 +702,8 @@ void PassManagerBuilder::populateModulePassManager(
MPM.add(createLICMPass());
}
MPM.add(createWarnMissedTransformationsPass());
// After vectorization and unrolling, assume intrinsics may tell us more
// about pointer alignments.
MPM.add(createAlignmentFromAssumptionsPass());
@ -877,6 +879,8 @@ void PassManagerBuilder::addLTOOptimizationPasses(legacy::PassManagerBase &PM) {
if (!DisableUnrollLoops)
PM.add(createLoopUnrollPass(OptLevel));
PM.add(createWarnMissedTransformationsPass());
// Now that we've optimized loops (in particular loop induction variables),
// we may have exposed more scalar opportunities. Run parts of the scalar
// optimizer again at this point.

View File

@ -69,6 +69,7 @@ add_llvm_library(LLVMScalarOpts
StraightLineStrengthReduce.cpp
StructurizeCFG.cpp
TailRecursionElimination.cpp
WarnMissedTransforms.cpp
ADDITIONAL_HEADER_DIRS
${LLVM_MAIN_INCLUDE_DIR}/llvm/Transforms

View File

@ -78,6 +78,18 @@ using namespace llvm;
#define LDIST_NAME "loop-distribute"
#define DEBUG_TYPE LDIST_NAME
/// @{
/// Metadata attribute names
static const char *const LLVMLoopDistributeFollowupAll =
"llvm.loop.distribute.followup_all";
static const char *const LLVMLoopDistributeFollowupCoincident =
"llvm.loop.distribute.followup_coincident";
static const char *const LLVMLoopDistributeFollowupSequential =
"llvm.loop.distribute.followup_sequential";
static const char *const LLVMLoopDistributeFollowupFallback =
"llvm.loop.distribute.followup_fallback";
/// @}
static cl::opt<bool>
LDistVerify("loop-distribute-verify", cl::Hidden,
cl::desc("Turn on DominatorTree and LoopInfo verification "
@ -186,7 +198,7 @@ public:
/// Returns the loop where this partition ends up after distribution.
/// If this partition is mapped to the original loop then use the block from
/// the loop.
const Loop *getDistributedLoop() const {
Loop *getDistributedLoop() const {
return ClonedLoop ? ClonedLoop : OrigLoop;
}
@ -443,6 +455,9 @@ public:
assert(&*OrigPH->begin() == OrigPH->getTerminator() &&
"preheader not empty");
// Preserve the original loop ID for use after the transformation.
MDNode *OrigLoopID = L->getLoopID();
// Create a loop for each partition except the last. Clone the original
// loop before PH along with adding a preheader for the cloned loop. Then
// update PH to point to the newly added preheader.
@ -457,9 +472,13 @@ public:
Part->getVMap()[ExitBlock] = TopPH;
Part->remapInstructions();
setNewLoopID(OrigLoopID, Part);
}
Pred->getTerminator()->replaceUsesOfWith(OrigPH, TopPH);
// Also set a new loop ID for the last loop.
setNewLoopID(OrigLoopID, &PartitionContainer.back());
// Now go in forward order and update the immediate dominator for the
// preheaders with the exiting block of the previous loop. Dominance
// within the loop is updated in cloneLoopWithPreheader.
@ -575,6 +594,19 @@ private:
}
}
}
/// Assign new LoopIDs for the partition's cloned loop.
void setNewLoopID(MDNode *OrigLoopID, InstPartition *Part) {
Optional<MDNode *> PartitionID = makeFollowupLoopID(
OrigLoopID,
{LLVMLoopDistributeFollowupAll,
Part->hasDepCycle() ? LLVMLoopDistributeFollowupSequential
: LLVMLoopDistributeFollowupCoincident});
if (PartitionID.hasValue()) {
Loop *NewLoop = Part->getDistributedLoop();
NewLoop->setLoopID(PartitionID.getValue());
}
}
};
/// For each memory instruction, this class maintains difference of the
@ -743,6 +775,9 @@ public:
return fail("TooManySCEVRuntimeChecks",
"too many SCEV run-time checks needed.\n");
if (!IsForced.getValueOr(false) && hasDisableAllTransformsHint(L))
return fail("HeuristicDisabled", "distribution heuristic disabled");
LLVM_DEBUG(dbgs() << "\nDistributing loop: " << *L << "\n");
// We're done forming the partitions set up the reverse mapping from
// instructions to partitions.
@ -762,6 +797,8 @@ public:
RtPtrChecking);
if (!Pred.isAlwaysTrue() || !Checks.empty()) {
MDNode *OrigLoopID = L->getLoopID();
LLVM_DEBUG(dbgs() << "\nPointers:\n");
LLVM_DEBUG(LAI->getRuntimePointerChecking()->printChecks(dbgs(), Checks));
LoopVersioning LVer(*LAI, L, LI, DT, SE, false);
@ -769,6 +806,17 @@ public:
LVer.setSCEVChecks(LAI->getPSE().getUnionPredicate());
LVer.versionLoop(DefsUsedOutside);
LVer.annotateLoopWithNoAlias();
// The unversioned loop will not be changed, so we inherit all attributes
// from the original loop, but remove the loop distribution metadata to
// avoid to distribute it again.
MDNode *UnversionedLoopID =
makeFollowupLoopID(OrigLoopID,
{LLVMLoopDistributeFollowupAll,
LLVMLoopDistributeFollowupFallback},
"llvm.loop.distribute.", true)
.getValue();
LVer.getNonVersionedLoop()->setLoopID(UnversionedLoopID);
}
// Create identical copies of the original loop for each partition and hook

View File

@ -56,6 +56,20 @@ using namespace llvm;
#define DEBUG_TYPE "loop-unroll-and-jam"
/// @{
/// Metadata attribute names
static const char *const LLVMLoopUnrollAndJamFollowupAll =
"llvm.loop.unroll_and_jam.followup_all";
static const char *const LLVMLoopUnrollAndJamFollowupInner =
"llvm.loop.unroll_and_jam.followup_inner";
static const char *const LLVMLoopUnrollAndJamFollowupOuter =
"llvm.loop.unroll_and_jam.followup_outer";
static const char *const LLVMLoopUnrollAndJamFollowupRemainderInner =
"llvm.loop.unroll_and_jam.followup_remainder_inner";
static const char *const LLVMLoopUnrollAndJamFollowupRemainderOuter =
"llvm.loop.unroll_and_jam.followup_remainder_outer";
/// @}
static cl::opt<bool>
AllowUnrollAndJam("allow-unroll-and-jam", cl::Hidden,
cl::desc("Allows loops to be unroll-and-jammed."));
@ -112,11 +126,6 @@ static bool HasUnrollAndJamEnablePragma(const Loop *L) {
return GetUnrollMetadataForLoop(L, "llvm.loop.unroll_and_jam.enable");
}
// Returns true if the loop has an unroll_and_jam(disable) pragma.
static bool HasUnrollAndJamDisablePragma(const Loop *L) {
return GetUnrollMetadataForLoop(L, "llvm.loop.unroll_and_jam.disable");
}
// If loop has an unroll_and_jam_count pragma return the (necessarily
// positive) value from the pragma. Otherwise return 0.
static unsigned UnrollAndJamCountPragmaValue(const Loop *L) {
@ -299,13 +308,16 @@ tryToUnrollAndJamLoop(Loop *L, DominatorTree &DT, LoopInfo *LI,
<< L->getHeader()->getParent()->getName() << "] Loop %"
<< L->getHeader()->getName() << "\n");
TransformationMode EnableMode = hasUnrollAndJamTransformation(L);
if (EnableMode & TM_Disable)
return LoopUnrollResult::Unmodified;
// A loop with any unroll pragma (enabling/disabling/count/etc) is left for
// the unroller, so long as it does not explicitly have unroll_and_jam
// metadata. This means #pragma nounroll will disable unroll and jam as well
// as unrolling
if (HasUnrollAndJamDisablePragma(L) ||
(HasAnyUnrollPragma(L, "llvm.loop.unroll.") &&
!HasAnyUnrollPragma(L, "llvm.loop.unroll_and_jam."))) {
if (HasAnyUnrollPragma(L, "llvm.loop.unroll.") &&
!HasAnyUnrollPragma(L, "llvm.loop.unroll_and_jam.")) {
LLVM_DEBUG(dbgs() << " Disabled due to pragma.\n");
return LoopUnrollResult::Unmodified;
}
@ -344,6 +356,19 @@ tryToUnrollAndJamLoop(Loop *L, DominatorTree &DT, LoopInfo *LI,
return LoopUnrollResult::Unmodified;
}
// Save original loop IDs for after the transformation.
MDNode *OrigOuterLoopID = L->getLoopID();
MDNode *OrigSubLoopID = SubLoop->getLoopID();
// To assign the loop id of the epilogue, assign it before unrolling it so it
// is applied to every inner loop of the epilogue. We later apply the loop ID
// for the jammed inner loop.
Optional<MDNode *> NewInnerEpilogueLoopID = makeFollowupLoopID(
OrigOuterLoopID, {LLVMLoopUnrollAndJamFollowupAll,
LLVMLoopUnrollAndJamFollowupRemainderInner});
if (NewInnerEpilogueLoopID.hasValue())
SubLoop->setLoopID(NewInnerEpilogueLoopID.getValue());
// Find trip count and trip multiple
unsigned OuterTripCount = SE.getSmallConstantTripCount(L, Latch);
unsigned OuterTripMultiple = SE.getSmallConstantTripMultiple(L, Latch);
@ -359,9 +384,39 @@ tryToUnrollAndJamLoop(Loop *L, DominatorTree &DT, LoopInfo *LI,
if (OuterTripCount && UP.Count > OuterTripCount)
UP.Count = OuterTripCount;
LoopUnrollResult UnrollResult =
UnrollAndJamLoop(L, UP.Count, OuterTripCount, OuterTripMultiple,
UP.UnrollRemainder, LI, &SE, &DT, &AC, &ORE);
Loop *EpilogueOuterLoop = nullptr;
LoopUnrollResult UnrollResult = UnrollAndJamLoop(
L, UP.Count, OuterTripCount, OuterTripMultiple, UP.UnrollRemainder, LI,
&SE, &DT, &AC, &ORE, &EpilogueOuterLoop);
// Assign new loop attributes.
if (EpilogueOuterLoop) {
Optional<MDNode *> NewOuterEpilogueLoopID = makeFollowupLoopID(
OrigOuterLoopID, {LLVMLoopUnrollAndJamFollowupAll,
LLVMLoopUnrollAndJamFollowupRemainderOuter});
if (NewOuterEpilogueLoopID.hasValue())
EpilogueOuterLoop->setLoopID(NewOuterEpilogueLoopID.getValue());
}
Optional<MDNode *> NewInnerLoopID =
makeFollowupLoopID(OrigOuterLoopID, {LLVMLoopUnrollAndJamFollowupAll,
LLVMLoopUnrollAndJamFollowupInner});
if (NewInnerLoopID.hasValue())
SubLoop->setLoopID(NewInnerLoopID.getValue());
else
SubLoop->setLoopID(OrigSubLoopID);
if (UnrollResult == LoopUnrollResult::PartiallyUnrolled) {
Optional<MDNode *> NewOuterLoopID = makeFollowupLoopID(
OrigOuterLoopID,
{LLVMLoopUnrollAndJamFollowupAll, LLVMLoopUnrollAndJamFollowupOuter});
if (NewOuterLoopID.hasValue()) {
L->setLoopID(NewOuterLoopID.getValue());
// Do not setLoopAlreadyUnrolled if a followup was given.
return UnrollResult;
}
}
// If loop has an unroll count pragma or unrolled by explicitly set count
// mark loop as unrolled to prevent unrolling beyond that requested.

View File

@ -661,11 +661,6 @@ static bool HasUnrollEnablePragma(const Loop *L) {
return GetUnrollMetadataForLoop(L, "llvm.loop.unroll.enable");
}
// Returns true if the loop has an unroll(disable) pragma.
static bool HasUnrollDisablePragma(const Loop *L) {
return GetUnrollMetadataForLoop(L, "llvm.loop.unroll.disable");
}
// Returns true if the loop has an runtime unroll(disable) pragma.
static bool HasRuntimeUnrollDisablePragma(const Loop *L) {
return GetUnrollMetadataForLoop(L, "llvm.loop.unroll.runtime.disable");
@ -713,12 +708,19 @@ static uint64_t getUnrolledLoopSize(
// Returns true if unroll count was set explicitly.
// Calculates unroll count and writes it to UP.Count.
// Unless IgnoreUser is true, will also use metadata and command-line options
// that are specific to to the LoopUnroll pass (which, for instance, are
// irrelevant for the LoopUnrollAndJam pass).
// FIXME: This function is used by LoopUnroll and LoopUnrollAndJam, but consumes
// many LoopUnroll-specific options. The shared functionality should be
// refactored into it own function.
bool llvm::computeUnrollCount(
Loop *L, const TargetTransformInfo &TTI, DominatorTree &DT, LoopInfo *LI,
ScalarEvolution &SE, const SmallPtrSetImpl<const Value *> &EphValues,
OptimizationRemarkEmitter *ORE, unsigned &TripCount, unsigned MaxTripCount,
unsigned &TripMultiple, unsigned LoopSize,
TargetTransformInfo::UnrollingPreferences &UP, bool &UseUpperBound) {
// Check for explicit Count.
// 1st priority is unroll count set by "unroll-count" option.
bool UserUnrollCount = UnrollCount.getNumOccurrences() > 0;
@ -969,7 +971,7 @@ static LoopUnrollResult tryToUnrollLoop(
LLVM_DEBUG(dbgs() << "Loop Unroll: F["
<< L->getHeader()->getParent()->getName() << "] Loop %"
<< L->getHeader()->getName() << "\n");
if (HasUnrollDisablePragma(L))
if (hasUnrollTransformation(L) & TM_Disable)
return LoopUnrollResult::Unmodified;
if (!L->isLoopSimplifyForm()) {
LLVM_DEBUG(
@ -1066,14 +1068,39 @@ static LoopUnrollResult tryToUnrollLoop(
if (TripCount && UP.Count > TripCount)
UP.Count = TripCount;
// Save loop properties before it is transformed.
MDNode *OrigLoopID = L->getLoopID();
// Unroll the loop.
Loop *RemainderLoop = nullptr;
LoopUnrollResult UnrollResult = UnrollLoop(
L, UP.Count, TripCount, UP.Force, UP.Runtime, UP.AllowExpensiveTripCount,
UseUpperBound, MaxOrZero, TripMultiple, UP.PeelCount, UP.UnrollRemainder,
LI, &SE, &DT, &AC, &ORE, PreserveLCSSA);
LI, &SE, &DT, &AC, &ORE, PreserveLCSSA, &RemainderLoop);
if (UnrollResult == LoopUnrollResult::Unmodified)
return LoopUnrollResult::Unmodified;
if (RemainderLoop) {
Optional<MDNode *> RemainderLoopID =
makeFollowupLoopID(OrigLoopID, {LLVMLoopUnrollFollowupAll,
LLVMLoopUnrollFollowupRemainder});
if (RemainderLoopID.hasValue())
RemainderLoop->setLoopID(RemainderLoopID.getValue());
}
if (UnrollResult != LoopUnrollResult::FullyUnrolled) {
Optional<MDNode *> NewLoopID =
makeFollowupLoopID(OrigLoopID, {LLVMLoopUnrollFollowupAll,
LLVMLoopUnrollFollowupUnrolled});
if (NewLoopID.hasValue()) {
L->setLoopID(NewLoopID.getValue());
// Do not setLoopAlreadyUnrolled if loop attributes have been specified
// explicitly.
return UnrollResult;
}
}
// If loop has an unroll count pragma or unrolled by explicitly set count
// mark loop as unrolled to prevent unrolling beyond that requested.
// If the loop was peeled, we already "used up" the profile information

View File

@ -594,6 +594,11 @@ bool LoopVersioningLICM::runOnLoop(Loop *L, LPPassManager &LPM) {
if (skipLoop(L))
return false;
// Do not do the transformation if disabled by metadata.
if (hasLICMVersioningTransformation(L) & TM_Disable)
return false;
// Get Analysis information.
AA = &getAnalysis<AAResultsWrapperPass>().getAAResults();
SE = &getAnalysis<ScalarEvolutionWrapperPass>().getSE();

View File

@ -75,6 +75,7 @@ void llvm::initializeScalarOpts(PassRegistry &Registry) {
initializeLoopUnrollPass(Registry);
initializeLoopUnrollAndJamPass(Registry);
initializeLoopUnswitchPass(Registry);
initializeWarnMissedTransformationsLegacyPass(Registry);
initializeLoopVersioningLICMPass(Registry);
initializeLoopIdiomRecognizeLegacyPassPass(Registry);
initializeLowerAtomicLegacyPassPass(Registry);

View File

@ -0,0 +1,144 @@
//===- LoopTransformWarning.cpp - ----------------------------------------===//
//
// The LLVM Compiler Infrastructure
//
// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.
//
//===----------------------------------------------------------------------===//
//
// Emit warnings if forced code transformations have not been performed.
//
//===----------------------------------------------------------------------===//
#include "llvm/Transforms/Scalar/WarnMissedTransforms.h"
#include "llvm/Analysis/OptimizationRemarkEmitter.h"
#include "llvm/Transforms/Utils/LoopUtils.h"
using namespace llvm;
#define DEBUG_TYPE "transform-warning"
/// Emit warnings for forced (i.e. user-defined) loop transformations which have
/// still not been performed.
static void warnAboutLeftoverTransformations(Loop *L,
OptimizationRemarkEmitter *ORE) {
if (hasUnrollTransformation(L) == TM_ForcedByUser) {
LLVM_DEBUG(dbgs() << "Leftover unroll transformation\n");
ORE->emit(
DiagnosticInfoOptimizationFailure(DEBUG_TYPE,
"FailedRequestedUnrolling",
L->getStartLoc(), L->getHeader())
<< "loop not unrolled: the optimizer was unable to perform the "
"requested transformation; the transformation might be disabled or "
"specified as part of an unsupported transformation ordering");
}
if (hasUnrollAndJamTransformation(L) == TM_ForcedByUser) {
LLVM_DEBUG(dbgs() << "Leftover unroll-and-jam transformation\n");
ORE->emit(
DiagnosticInfoOptimizationFailure(DEBUG_TYPE,
"FailedRequestedUnrollAndJamming",
L->getStartLoc(), L->getHeader())
<< "loop not unroll-and-jammed: the optimizer was unable to perform "
"the requested transformation; the transformation might be disabled "
"or specified as part of an unsupported transformation ordering");
}
if (hasVectorizeTransformation(L) == TM_ForcedByUser) {
LLVM_DEBUG(dbgs() << "Leftover vectorization transformation\n");
Optional<int> VectorizeWidth =
getOptionalIntLoopAttribute(L, "llvm.loop.vectorize.width");
Optional<int> InterleaveCount =
getOptionalIntLoopAttribute(L, "llvm.loop.interleave.count");
if (VectorizeWidth.getValueOr(0) != 1)
ORE->emit(
DiagnosticInfoOptimizationFailure(DEBUG_TYPE,
"FailedRequestedVectorization",
L->getStartLoc(), L->getHeader())
<< "loop not vectorized: the optimizer was unable to perform the "
"requested transformation; the transformation might be disabled "
"or specified as part of an unsupported transformation ordering");
else if (InterleaveCount.getValueOr(0) != 1)
ORE->emit(
DiagnosticInfoOptimizationFailure(DEBUG_TYPE,
"FailedRequestedInterleaving",
L->getStartLoc(), L->getHeader())
<< "loop not interleaved: the optimizer was unable to perform the "
"requested transformation; the transformation might be disabled "
"or specified as part of an unsupported transformation ordering");
}
if (hasDistributeTransformation(L) == TM_ForcedByUser) {
LLVM_DEBUG(dbgs() << "Leftover distribute transformation\n");
ORE->emit(
DiagnosticInfoOptimizationFailure(DEBUG_TYPE,
"FailedRequestedDistribution",
L->getStartLoc(), L->getHeader())
<< "loop not distributed: the optimizer was unable to perform the "
"requested transformation; the transformation might be disabled or "
"specified as part of an unsupported transformation ordering");
}
}
static void warnAboutLeftoverTransformations(Function *F, LoopInfo *LI,
OptimizationRemarkEmitter *ORE) {
for (auto *L : LI->getLoopsInPreorder())
warnAboutLeftoverTransformations(L, ORE);
}
// New pass manager boilerplate
PreservedAnalyses
WarnMissedTransformationsPass::run(Function &F, FunctionAnalysisManager &AM) {
auto &ORE = AM.getResult<OptimizationRemarkEmitterAnalysis>(F);
auto &LI = AM.getResult<LoopAnalysis>(F);
warnAboutLeftoverTransformations(&F, &LI, &ORE);
return PreservedAnalyses::all();
}
// Legacy pass manager boilerplate
namespace {
class WarnMissedTransformationsLegacy : public FunctionPass {
public:
static char ID;
explicit WarnMissedTransformationsLegacy() : FunctionPass(ID) {
initializeWarnMissedTransformationsLegacyPass(
*PassRegistry::getPassRegistry());
}
bool runOnFunction(Function &F) override {
if (skipFunction(F))
return false;
auto &ORE = getAnalysis<OptimizationRemarkEmitterWrapperPass>().getORE();
auto &LI = getAnalysis<LoopInfoWrapperPass>().getLoopInfo();
warnAboutLeftoverTransformations(&F, &LI, &ORE);
return false;
}
void getAnalysisUsage(AnalysisUsage &AU) const override {
AU.addRequired<OptimizationRemarkEmitterWrapperPass>();
AU.addRequired<LoopInfoWrapperPass>();
AU.setPreservesAll();
}
};
} // end anonymous namespace
char WarnMissedTransformationsLegacy::ID = 0;
INITIALIZE_PASS_BEGIN(WarnMissedTransformationsLegacy, "transform-warning",
"Warn about non-applied transformations", false, false)
INITIALIZE_PASS_DEPENDENCY(LoopInfoWrapperPass)
INITIALIZE_PASS_DEPENDENCY(OptimizationRemarkEmitterWrapperPass)
INITIALIZE_PASS_END(WarnMissedTransformationsLegacy, "transform-warning",
"Warn about non-applied transformations", false, false)
Pass *llvm::createWarnMissedTransformationsPass() {
return new WarnMissedTransformationsLegacy();
}

View File

@ -329,12 +329,15 @@ void llvm::simplifyLoopAfterUnroll(Loop *L, bool SimplifyIVs, LoopInfo *LI,
///
/// This utility preserves LoopInfo. It will also preserve ScalarEvolution and
/// DominatorTree if they are non-null.
///
/// If RemainderLoop is non-null, it will receive the remainder loop (if
/// required and not fully unrolled).
LoopUnrollResult llvm::UnrollLoop(
Loop *L, unsigned Count, unsigned TripCount, bool Force, bool AllowRuntime,
bool AllowExpensiveTripCount, bool PreserveCondBr, bool PreserveOnlyFirst,
unsigned TripMultiple, unsigned PeelCount, bool UnrollRemainder,
LoopInfo *LI, ScalarEvolution *SE, DominatorTree *DT, AssumptionCache *AC,
OptimizationRemarkEmitter *ORE, bool PreserveLCSSA) {
OptimizationRemarkEmitter *ORE, bool PreserveLCSSA, Loop **RemainderLoop) {
BasicBlock *Preheader = L->getLoopPreheader();
if (!Preheader) {
@ -468,7 +471,7 @@ LoopUnrollResult llvm::UnrollLoop(
if (RuntimeTripCount && TripMultiple % Count != 0 &&
!UnrollRuntimeLoopRemainder(L, Count, AllowExpensiveTripCount,
EpilogProfitability, UnrollRemainder, LI, SE,
DT, AC, PreserveLCSSA)) {
DT, AC, PreserveLCSSA, RemainderLoop)) {
if (Force)
RuntimeTripCount = false;
else {

View File

@ -167,12 +167,14 @@ static void moveHeaderPhiOperandsToForeBlocks(BasicBlock *Header,
isSafeToUnrollAndJam should be used prior to calling this to make sure the
unrolling will be valid. Checking profitablility is also advisable.
If EpilogueLoop is non-null, it receives the epilogue loop (if it was
necessary to create one and not fully unrolled).
*/
LoopUnrollResult
llvm::UnrollAndJamLoop(Loop *L, unsigned Count, unsigned TripCount,
unsigned TripMultiple, bool UnrollRemainder,
LoopInfo *LI, ScalarEvolution *SE, DominatorTree *DT,
AssumptionCache *AC, OptimizationRemarkEmitter *ORE) {
LoopUnrollResult llvm::UnrollAndJamLoop(
Loop *L, unsigned Count, unsigned TripCount, unsigned TripMultiple,
bool UnrollRemainder, LoopInfo *LI, ScalarEvolution *SE, DominatorTree *DT,
AssumptionCache *AC, OptimizationRemarkEmitter *ORE, Loop **EpilogueLoop) {
// When we enter here we should have already checked that it is safe
BasicBlock *Header = L->getHeader();
@ -196,7 +198,8 @@ llvm::UnrollAndJamLoop(Loop *L, unsigned Count, unsigned TripCount,
if (TripMultiple == 1 || TripMultiple % Count != 0) {
if (!UnrollRuntimeLoopRemainder(L, Count, /*AllowExpensiveTripCount*/ false,
/*UseEpilogRemainder*/ true,
UnrollRemainder, LI, SE, DT, AC, true)) {
UnrollRemainder, LI, SE, DT, AC, true,
EpilogueLoop)) {
LLVM_DEBUG(dbgs() << "Won't unroll-and-jam; remainder loop could not be "
"generated when assuming runtime trip count\n");
return LoopUnrollResult::Unmodified;

View File

@ -380,6 +380,7 @@ CloneLoopBlocks(Loop *L, Value *NewIter, const bool CreateRemainderLoop,
}
if (CreateRemainderLoop) {
Loop *NewLoop = NewLoops[L];
MDNode *LoopID = NewLoop->getLoopID();
assert(NewLoop && "L should have been cloned");
// Only add loop metadata if the loop is not going to be completely
@ -387,6 +388,16 @@ CloneLoopBlocks(Loop *L, Value *NewIter, const bool CreateRemainderLoop,
if (UnrollRemainder)
return NewLoop;
Optional<MDNode *> NewLoopID = makeFollowupLoopID(
LoopID, {LLVMLoopUnrollFollowupAll, LLVMLoopUnrollFollowupRemainder});
if (NewLoopID.hasValue()) {
NewLoop->setLoopID(NewLoopID.getValue());
// Do not setLoopAlreadyUnrolled if loop attributes have been defined
// explicitly.
return NewLoop;
}
// Add unroll disable metadata to disable future unrolling for this loop.
NewLoop->setLoopAlreadyUnrolled();
return NewLoop;
@ -525,10 +536,10 @@ static bool canProfitablyUnrollMultiExitLoop(
bool llvm::UnrollRuntimeLoopRemainder(Loop *L, unsigned Count,
bool AllowExpensiveTripCount,
bool UseEpilogRemainder,
bool UnrollRemainder,
LoopInfo *LI, ScalarEvolution *SE,
DominatorTree *DT, AssumptionCache *AC,
bool PreserveLCSSA) {
bool UnrollRemainder, LoopInfo *LI,
ScalarEvolution *SE, DominatorTree *DT,
AssumptionCache *AC, bool PreserveLCSSA,
Loop **ResultLoop) {
LLVM_DEBUG(dbgs() << "Trying runtime unrolling on Loop: \n");
LLVM_DEBUG(L->dump());
LLVM_DEBUG(UseEpilogRemainder ? dbgs() << "Using epilog remainder.\n"
@ -911,16 +922,20 @@ bool llvm::UnrollRuntimeLoopRemainder(Loop *L, unsigned Count,
formDedicatedExitBlocks(remainderLoop, DT, LI, PreserveLCSSA);
}
auto UnrollResult = LoopUnrollResult::Unmodified;
if (remainderLoop && UnrollRemainder) {
LLVM_DEBUG(dbgs() << "Unrolling remainder loop\n");
UnrollLoop(remainderLoop, /*Count*/ Count - 1, /*TripCount*/ Count - 1,
/*Force*/ false, /*AllowRuntime*/ false,
/*AllowExpensiveTripCount*/ false, /*PreserveCondBr*/ true,
/*PreserveOnlyFirst*/ false, /*TripMultiple*/ 1,
/*PeelCount*/ 0, /*UnrollRemainder*/ false, LI, SE, DT, AC,
/*ORE*/ nullptr, PreserveLCSSA);
UnrollResult =
UnrollLoop(remainderLoop, /*Count*/ Count - 1, /*TripCount*/ Count - 1,
/*Force*/ false, /*AllowRuntime*/ false,
/*AllowExpensiveTripCount*/ false, /*PreserveCondBr*/ true,
/*PreserveOnlyFirst*/ false, /*TripMultiple*/ 1,
/*PeelCount*/ 0, /*UnrollRemainder*/ false, LI, SE, DT, AC,
/*ORE*/ nullptr, PreserveLCSSA);
}
if (ResultLoop && UnrollResult != LoopUnrollResult::FullyUnrolled)
*ResultLoop = remainderLoop;
NumRuntimeUnrolled++;
return true;
}

View File

@ -42,6 +42,8 @@ using namespace llvm::PatternMatch;
#define DEBUG_TYPE "loop-utils"
static const char *LLVMLoopDisableNonforced = "llvm.loop.disable_nonforced";
bool llvm::formDedicatedExitBlocks(Loop *L, DominatorTree *DT, LoopInfo *LI,
bool PreserveLCSSA) {
bool Changed = false;
@ -183,14 +185,8 @@ void llvm::initializeLoopPassPass(PassRegistry &Registry) {
INITIALIZE_PASS_DEPENDENCY(ScalarEvolutionWrapperPass)
}
/// Find string metadata for loop
///
/// If it has a value (e.g. {"llvm.distribute", 1} return the value as an
/// operand or null otherwise. If the string metadata is not found return
/// Optional's not-a-value.
Optional<const MDOperand *> llvm::findStringMetadataForLoop(Loop *TheLoop,
StringRef Name) {
MDNode *LoopID = TheLoop->getLoopID();
static Optional<MDNode *> findOptionMDForLoopID(MDNode *LoopID,
StringRef Name) {
// Return none if LoopID is false.
if (!LoopID)
return None;
@ -209,18 +205,253 @@ Optional<const MDOperand *> llvm::findStringMetadataForLoop(Loop *TheLoop,
continue;
// Return true if MDString holds expected MetaData.
if (Name.equals(S->getString()))
switch (MD->getNumOperands()) {
case 1:
return nullptr;
case 2:
return &MD->getOperand(1);
default:
llvm_unreachable("loop metadata has 0 or 1 operand");
}
return MD;
}
return None;
}
static Optional<MDNode *> findOptionMDForLoop(const Loop *TheLoop,
StringRef Name) {
return findOptionMDForLoopID(TheLoop->getLoopID(), Name);
}
/// Find string metadata for loop
///
/// If it has a value (e.g. {"llvm.distribute", 1} return the value as an
/// operand or null otherwise. If the string metadata is not found return
/// Optional's not-a-value.
Optional<const MDOperand *> llvm::findStringMetadataForLoop(Loop *TheLoop,
StringRef Name) {
auto MD = findOptionMDForLoop(TheLoop, Name).getValueOr(nullptr);
if (!MD)
return None;
switch (MD->getNumOperands()) {
case 1:
return nullptr;
case 2:
return &MD->getOperand(1);
default:
llvm_unreachable("loop metadata has 0 or 1 operand");
}
}
static Optional<bool> getOptionalBoolLoopAttribute(const Loop *TheLoop,
StringRef Name) {
Optional<MDNode *> MD = findOptionMDForLoop(TheLoop, Name);
if (!MD.hasValue())
return None;
MDNode *OptionNode = MD.getValue();
if (OptionNode == nullptr)
return None;
switch (OptionNode->getNumOperands()) {
case 1:
// When the value is absent it is interpreted as 'attribute set'.
return true;
case 2:
return mdconst::extract_or_null<ConstantInt>(
OptionNode->getOperand(1).get());
}
llvm_unreachable("unexpected number of options");
}
static bool getBooleanLoopAttribute(const Loop *TheLoop, StringRef Name) {
return getOptionalBoolLoopAttribute(TheLoop, Name).getValueOr(false);
}
llvm::Optional<int> llvm::getOptionalIntLoopAttribute(Loop *TheLoop,
StringRef Name) {
const MDOperand *AttrMD =
findStringMetadataForLoop(TheLoop, Name).getValueOr(nullptr);
if (!AttrMD)
return None;
ConstantInt *IntMD = mdconst::extract_or_null<ConstantInt>(AttrMD->get());
if (!IntMD)
return None;
return IntMD->getSExtValue();
}
Optional<MDNode *> llvm::makeFollowupLoopID(
MDNode *OrigLoopID, ArrayRef<StringRef> FollowupOptions,
const char *InheritOptionsExceptPrefix, bool AlwaysNew) {
if (!OrigLoopID) {
if (AlwaysNew)
return nullptr;
return None;
}
assert(OrigLoopID->getOperand(0) == OrigLoopID);
bool InheritAllAttrs = !InheritOptionsExceptPrefix;
bool InheritSomeAttrs =
InheritOptionsExceptPrefix && InheritOptionsExceptPrefix[0] != '\0';
SmallVector<Metadata *, 8> MDs;
MDs.push_back(nullptr);
bool Changed = false;
if (InheritAllAttrs || InheritSomeAttrs) {
for (const MDOperand &Existing : drop_begin(OrigLoopID->operands(), 1)) {
MDNode *Op = cast<MDNode>(Existing.get());
auto InheritThisAttribute = [InheritSomeAttrs,
InheritOptionsExceptPrefix](MDNode *Op) {
if (!InheritSomeAttrs)
return false;
// Skip malformatted attribute metadata nodes.
if (Op->getNumOperands() == 0)
return true;
Metadata *NameMD = Op->getOperand(0).get();
if (!isa<MDString>(NameMD))
return true;
StringRef AttrName = cast<MDString>(NameMD)->getString();
// Do not inherit excluded attributes.
return !AttrName.startswith(InheritOptionsExceptPrefix);
};
if (InheritThisAttribute(Op))
MDs.push_back(Op);
else
Changed = true;
}
} else {
// Modified if we dropped at least one attribute.
Changed = OrigLoopID->getNumOperands() > 1;
}
bool HasAnyFollowup = false;
for (StringRef OptionName : FollowupOptions) {
MDNode *FollowupNode =
findOptionMDForLoopID(OrigLoopID, OptionName).getValueOr(nullptr);
if (!FollowupNode)
continue;
HasAnyFollowup = true;
for (const MDOperand &Option : drop_begin(FollowupNode->operands(), 1)) {
MDs.push_back(Option.get());
Changed = true;
}
}
// Attributes of the followup loop not specified explicity, so signal to the
// transformation pass to add suitable attributes.
if (!AlwaysNew && !HasAnyFollowup)
return None;
// If no attributes were added or remove, the previous loop Id can be reused.
if (!AlwaysNew && !Changed)
return OrigLoopID;
// No attributes is equivalent to having no !llvm.loop metadata at all.
if (MDs.size() == 1)
return nullptr;
// Build the new loop ID.
MDTuple *FollowupLoopID = MDNode::get(OrigLoopID->getContext(), MDs);
FollowupLoopID->replaceOperandWith(0, FollowupLoopID);
return FollowupLoopID;
}
bool llvm::hasDisableAllTransformsHint(const Loop *L) {
return getBooleanLoopAttribute(L, LLVMLoopDisableNonforced);
}
TransformationMode llvm::hasUnrollTransformation(Loop *L) {
if (getBooleanLoopAttribute(L, "llvm.loop.unroll.disable"))
return TM_SuppressedByUser;
Optional<int> Count =
getOptionalIntLoopAttribute(L, "llvm.loop.unroll.count");
if (Count.hasValue())
return Count.getValue() == 1 ? TM_SuppressedByUser : TM_ForcedByUser;
if (getBooleanLoopAttribute(L, "llvm.loop.unroll.enable"))
return TM_ForcedByUser;
if (getBooleanLoopAttribute(L, "llvm.loop.unroll.full"))
return TM_ForcedByUser;
if (hasDisableAllTransformsHint(L))
return TM_Disable;
return TM_Unspecified;
}
TransformationMode llvm::hasUnrollAndJamTransformation(Loop *L) {
if (getBooleanLoopAttribute(L, "llvm.loop.unroll_and_jam.disable"))
return TM_SuppressedByUser;
Optional<int> Count =
getOptionalIntLoopAttribute(L, "llvm.loop.unroll_and_jam.count");
if (Count.hasValue())
return Count.getValue() == 1 ? TM_SuppressedByUser : TM_ForcedByUser;
if (getBooleanLoopAttribute(L, "llvm.loop.unroll_and_jam.enable"))
return TM_ForcedByUser;
if (hasDisableAllTransformsHint(L))
return TM_Disable;
return TM_Unspecified;
}
TransformationMode llvm::hasVectorizeTransformation(Loop *L) {
Optional<bool> Enable =
getOptionalBoolLoopAttribute(L, "llvm.loop.vectorize.enable");
if (Enable == false)
return TM_SuppressedByUser;
Optional<int> VectorizeWidth =
getOptionalIntLoopAttribute(L, "llvm.loop.vectorize.width");
Optional<int> InterleaveCount =
getOptionalIntLoopAttribute(L, "llvm.loop.interleave.count");
if (Enable == true) {
// 'Forcing' vector width and interleave count to one effectively disables
// this tranformation.
if (VectorizeWidth == 1 && InterleaveCount == 1)
return TM_SuppressedByUser;
return TM_ForcedByUser;
}
if (getBooleanLoopAttribute(L, "llvm.loop.isvectorized"))
return TM_Disable;
if (VectorizeWidth == 1 && InterleaveCount == 1)
return TM_Disable;
if (VectorizeWidth > 1 || InterleaveCount > 1)
return TM_Enable;
if (hasDisableAllTransformsHint(L))
return TM_Disable;
return TM_Unspecified;
}
TransformationMode llvm::hasDistributeTransformation(Loop *L) {
if (getBooleanLoopAttribute(L, "llvm.loop.distribute.enable"))
return TM_ForcedByUser;
if (hasDisableAllTransformsHint(L))
return TM_Disable;
return TM_Unspecified;
}
TransformationMode llvm::hasLICMVersioningTransformation(Loop *L) {
if (getBooleanLoopAttribute(L, "llvm.loop.licm_versioning.disable"))
return TM_SuppressedByUser;
if (hasDisableAllTransformsHint(L))
return TM_Disable;
return TM_Unspecified;
}
/// Does a BFS from a given node to all of its children inside a given loop.
/// The returned vector of nodes includes the starting point.
SmallVector<DomTreeNode *, 16>

View File

@ -152,6 +152,16 @@ using namespace llvm;
#define LV_NAME "loop-vectorize"
#define DEBUG_TYPE LV_NAME
/// @{
/// Metadata attribute names
static const char *const LLVMLoopVectorizeFollowupAll =
"llvm.loop.vectorize.followup_all";
static const char *const LLVMLoopVectorizeFollowupVectorized =
"llvm.loop.vectorize.followup_vectorized";
static const char *const LLVMLoopVectorizeFollowupEpilogue =
"llvm.loop.vectorize.followup_epilogue";
/// @}
STATISTIC(LoopsVectorized, "Number of loops vectorized");
STATISTIC(LoopsAnalyzed, "Number of loops analyzed for vectorization");
@ -796,27 +806,6 @@ void InnerLoopVectorizer::addMetadata(ArrayRef<Value *> To,
}
}
static void emitMissedWarning(Function *F, Loop *L,
const LoopVectorizeHints &LH,
OptimizationRemarkEmitter *ORE) {
LH.emitRemarkWithHints();
if (LH.getForce() == LoopVectorizeHints::FK_Enabled) {
if (LH.getWidth() != 1)
ORE->emit(DiagnosticInfoOptimizationFailure(
DEBUG_TYPE, "FailedRequestedVectorization",
L->getStartLoc(), L->getHeader())
<< "loop not vectorized: "
<< "failed explicitly specified loop vectorization");
else if (LH.getInterleave() != 1)
ORE->emit(DiagnosticInfoOptimizationFailure(
DEBUG_TYPE, "FailedRequestedInterleaving", L->getStartLoc(),
L->getHeader())
<< "loop not interleaved: "
<< "failed explicitly specified loop interleaving");
}
}
namespace llvm {
/// LoopVectorizationCostModel - estimates the expected speedups due to
@ -1377,7 +1366,7 @@ static bool isExplicitVecOuterLoop(Loop *OuterLp,
if (!Hints.getWidth()) {
LLVM_DEBUG(dbgs() << "LV: Not vectorizing: No user vector width.\n");
emitMissedWarning(Fn, OuterLp, Hints, ORE);
Hints.emitRemarkWithHints();
return false;
}
@ -1385,7 +1374,7 @@ static bool isExplicitVecOuterLoop(Loop *OuterLp,
// TODO: Interleave support is future work.
LLVM_DEBUG(dbgs() << "LV: Not vectorizing: Interleave is not supported for "
"outer loops.\n");
emitMissedWarning(Fn, OuterLp, Hints, ORE);
Hints.emitRemarkWithHints();
return false;
}
@ -2739,6 +2728,7 @@ BasicBlock *InnerLoopVectorizer::createVectorizedLoopSkeleton() {
BasicBlock *OldBasicBlock = OrigLoop->getHeader();
BasicBlock *VectorPH = OrigLoop->getLoopPreheader();
BasicBlock *ExitBlock = OrigLoop->getExitBlock();
MDNode *OrigLoopID = OrigLoop->getLoopID();
assert(VectorPH && "Invalid loop structure");
assert(ExitBlock && "Must have an exit block");
@ -2882,6 +2872,17 @@ BasicBlock *InnerLoopVectorizer::createVectorizedLoopSkeleton() {
LoopVectorBody = VecBody;
LoopScalarBody = OldBasicBlock;
Optional<MDNode *> VectorizedLoopID =
makeFollowupLoopID(OrigLoopID, {LLVMLoopVectorizeFollowupAll,
LLVMLoopVectorizeFollowupVectorized});
if (VectorizedLoopID.hasValue()) {
Lp->setLoopID(VectorizedLoopID.getValue());
// Do not setAlreadyVectorized if loop attributes have been defined
// explicitly.
return LoopVectorPreHeader;
}
// Keep all loop hints from the original loop on the vector loop (we'll
// replace the vectorizer-specific hints below).
if (MDNode *LID = OrigLoop->getLoopID())
@ -7177,7 +7178,7 @@ bool LoopVectorizePass::processLoop(Loop *L) {
&Requirements, &Hints, DB, AC);
if (!LVL.canVectorize(EnableVPlanNativePath)) {
LLVM_DEBUG(dbgs() << "LV: Not vectorizing: Cannot prove legality.\n");
emitMissedWarning(F, L, Hints, ORE);
Hints.emitRemarkWithHints();
return false;
}
@ -7250,7 +7251,7 @@ bool LoopVectorizePass::processLoop(Loop *L) {
ORE->emit(createLVMissedAnalysis(Hints.vectorizeAnalysisPassName(),
"NoImplicitFloat", L)
<< "loop not vectorized due to NoImplicitFloat attribute");
emitMissedWarning(F, L, Hints, ORE);
Hints.emitRemarkWithHints();
return false;
}
@ -7265,7 +7266,7 @@ bool LoopVectorizePass::processLoop(Loop *L) {
ORE->emit(
createLVMissedAnalysis(Hints.vectorizeAnalysisPassName(), "UnsafeFP", L)
<< "loop not vectorized due to unsafe FP support.");
emitMissedWarning(F, L, Hints, ORE);
Hints.emitRemarkWithHints();
return false;
}
@ -7307,7 +7308,7 @@ bool LoopVectorizePass::processLoop(Loop *L) {
if (Requirements.doesNotMeet(F, L, Hints)) {
LLVM_DEBUG(dbgs() << "LV: Not vectorizing: loop did not meet vectorization "
"requirements.\n");
emitMissedWarning(F, L, Hints, ORE);
Hints.emitRemarkWithHints();
return false;
}
@ -7384,6 +7385,8 @@ bool LoopVectorizePass::processLoop(Loop *L) {
LVP.setBestPlan(VF.Width, IC);
using namespace ore;
bool DisableRuntimeUnroll = false;
MDNode *OrigLoopID = L->getLoopID();
if (!VectorizeLoop) {
assert(IC > 1 && "interleave count should not be 1 or 0");
@ -7410,7 +7413,7 @@ bool LoopVectorizePass::processLoop(Loop *L) {
// no runtime checks about strides and memory. A scalar loop that is
// rarely used is not worth unrolling.
if (!LB.areSafetyChecksAdded())
AddRuntimeUnrollDisableMetaData(L);
DisableRuntimeUnroll = true;
// Report the vectorization decision.
ORE->emit([&]() {
@ -7422,8 +7425,18 @@ bool LoopVectorizePass::processLoop(Loop *L) {
});
}
// Mark the loop as already vectorized to avoid vectorizing again.
Hints.setAlreadyVectorized();
Optional<MDNode *> RemainderLoopID =
makeFollowupLoopID(OrigLoopID, {LLVMLoopVectorizeFollowupAll,
LLVMLoopVectorizeFollowupEpilogue});
if (RemainderLoopID.hasValue()) {
L->setLoopID(RemainderLoopID.getValue());
} else {
if (DisableRuntimeUnroll)
AddRuntimeUnrollDisableMetaData(L);
// Mark the loop as already vectorized to avoid vectorizing again.
Hints.setAlreadyVectorized();
}
LLVM_DEBUG(verifyFunction(*L->getHeader()->getParent()));
return true;

View File

@ -246,6 +246,7 @@
; CHECK-O-NEXT: Running pass: InstCombinePass
; CHECK-O-NEXT: Running pass: LoopUnrollPass
; CHECK-O-NEXT: Running analysis: OuterAnalysisManagerProxy
; CHECK-O-NEXT: Running pass: WarnMissedTransformationsPass
; CHECK-O-NEXT: Running pass: InstCombinePass
; CHECK-O-NEXT: Running pass: RequireAnalysisPass<{{.*}}OptimizationRemarkEmitterAnalysis
; CHECK-O-NEXT: Running pass: FunctionToLoopPassAdaptor<{{.*}}LICMPass

View File

@ -224,6 +224,7 @@
; CHECK-POSTLINK-O-NEXT: Running pass: InstCombinePass
; CHECK-POSTLINK-O-NEXT: Running pass: LoopUnrollPass
; CHECK-POSTLINK-O-NEXT: Running analysis: OuterAnalysisManagerProxy
; CHECK-POSTLINK-O-NEXT: Running pass: WarnMissedTransformationsPass
; CHECK-POSTLINK-O-NEXT: Running pass: InstCombinePass
; CHECK-POSTLINK-O-NEXT: Running pass: RequireAnalysisPass<{{.*}}OptimizationRemarkEmitterAnalysis
; CHECK-POSTLINK-O-NEXT: Running pass: FunctionToLoopPassAdaptor<{{.*}}LICMPass

View File

@ -250,6 +250,10 @@
; CHECK-NEXT: Scalar Evolution Analysis
; CHECK-NEXT: Loop Pass Manager
; CHECK-NEXT: Loop Invariant Code Motion
; CHECK-NEXT: Lazy Branch Probability Analysis
; CHECK-NEXT: Lazy Block Frequency Analysis
; CHECK-NEXT: Optimization Remark Emitter
; CHECK-NEXT: Warn about non-applied transformations
; CHECK-NEXT: Alignment from assumptions
; CHECK-NEXT: Strip Unused Function Prototypes
; CHECK-NEXT: Dead Global Elimination

View File

@ -255,6 +255,10 @@
; CHECK-NEXT: Scalar Evolution Analysis
; CHECK-NEXT: Loop Pass Manager
; CHECK-NEXT: Loop Invariant Code Motion
; CHECK-NEXT: Lazy Branch Probability Analysis
; CHECK-NEXT: Lazy Block Frequency Analysis
; CHECK-NEXT: Optimization Remark Emitter
; CHECK-NEXT: Warn about non-applied transformations
; CHECK-NEXT: Alignment from assumptions
; CHECK-NEXT: Strip Unused Function Prototypes
; CHECK-NEXT: Dead Global Elimination

View File

@ -237,6 +237,10 @@
; CHECK-NEXT: Scalar Evolution Analysis
; CHECK-NEXT: Loop Pass Manager
; CHECK-NEXT: Loop Invariant Code Motion
; CHECK-NEXT: Lazy Branch Probability Analysis
; CHECK-NEXT: Lazy Block Frequency Analysis
; CHECK-NEXT: Optimization Remark Emitter
; CHECK-NEXT: Warn about non-applied transformations
; CHECK-NEXT: Alignment from assumptions
; CHECK-NEXT: Strip Unused Function Prototypes
; CHECK-NEXT: Dead Global Elimination

View File

@ -236,6 +236,10 @@
; CHECK-NEXT: Scalar Evolution Analysis
; CHECK-NEXT: Loop Pass Manager
; CHECK-NEXT: Loop Invariant Code Motion
; CHECK-NEXT: Lazy Branch Probability Analysis
; CHECK-NEXT: Lazy Block Frequency Analysis
; CHECK-NEXT: Optimization Remark Emitter
; CHECK-NEXT: Warn about non-applied transformations
; CHECK-NEXT: Alignment from assumptions
; CHECK-NEXT: Strip Unused Function Prototypes
; CHECK-NEXT: Dead Global Elimination

View File

@ -0,0 +1,50 @@
; RUN: opt -loop-distribute -enable-loop-distribute=1 -S < %s | FileCheck %s
;
; Check that the disable_nonforced is honored by loop distribution.
;
target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"
; CHECK-LABEL: @disable_nonforced(
; CHECK-NOT: for.body.ldist1:
define void @disable_nonforced(i32* noalias %a,
i32* noalias %b,
i32* noalias %c,
i32* noalias %d,
i32* noalias %e) {
entry:
br label %for.body
for.body:
%ind = phi i64 [ 0, %entry ], [ %add, %for.body ]
%arrayidxA = getelementptr inbounds i32, i32* %a, i64 %ind
%loadA = load i32, i32* %arrayidxA, align 4
%arrayidxB = getelementptr inbounds i32, i32* %b, i64 %ind
%loadB = load i32, i32* %arrayidxB, align 4
%mulA = mul i32 %loadB, %loadA
%add = add nuw nsw i64 %ind, 1
%arrayidxA_plus_4 = getelementptr inbounds i32, i32* %a, i64 %add
store i32 %mulA, i32* %arrayidxA_plus_4, align 4
%arrayidxD = getelementptr inbounds i32, i32* %d, i64 %ind
%loadD = load i32, i32* %arrayidxD, align 4
%arrayidxE = getelementptr inbounds i32, i32* %e, i64 %ind
%loadE = load i32, i32* %arrayidxE, align 4
%mulC = mul i32 %loadD, %loadE
%arrayidxC = getelementptr inbounds i32, i32* %c, i64 %ind
store i32 %mulC, i32* %arrayidxC, align 4
%exitcond = icmp eq i64 %add, 20
br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0
for.end:
ret void
}
!0 = distinct !{!0, !{!"llvm.loop.disable_nonforced"}}

View File

@ -0,0 +1,51 @@
; RUN: opt -loop-distribute -S < %s | FileCheck %s
;
; Check that llvm.loop.distribute.enable overrides
; llvm.loop.disable_nonforced.
;
target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"
; CHECK-LABEL: @disable_nonforced(
; CHECK: for.body.ldist1:
define void @disable_nonforced(i32* noalias %a,
i32* noalias %b,
i32* noalias %c,
i32* noalias %d,
i32* noalias %e) {
entry:
br label %for.body
for.body:
%ind = phi i64 [ 0, %entry ], [ %add, %for.body ]
%arrayidxA = getelementptr inbounds i32, i32* %a, i64 %ind
%loadA = load i32, i32* %arrayidxA, align 4
%arrayidxB = getelementptr inbounds i32, i32* %b, i64 %ind
%loadB = load i32, i32* %arrayidxB, align 4
%mulA = mul i32 %loadB, %loadA
%add = add nuw nsw i64 %ind, 1
%arrayidxA_plus_4 = getelementptr inbounds i32, i32* %a, i64 %add
store i32 %mulA, i32* %arrayidxA_plus_4, align 4
%arrayidxD = getelementptr inbounds i32, i32* %d, i64 %ind
%loadD = load i32, i32* %arrayidxD, align 4
%arrayidxE = getelementptr inbounds i32, i32* %e, i64 %ind
%loadE = load i32, i32* %arrayidxE, align 4
%mulC = mul i32 %loadD, %loadE
%arrayidxC = getelementptr inbounds i32, i32* %c, i64 %ind
store i32 %mulC, i32* %arrayidxC, align 4
%exitcond = icmp eq i64 %add, 20
br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0
for.end:
ret void
}
!0 = distinct !{!0, !{!"llvm.loop.disable_nonforced"}, !{!"llvm.loop.distribute.enable", i32 1}}

View File

@ -0,0 +1,66 @@
; RUN: opt -basicaa -loop-distribute -S < %s | FileCheck %s
;
; Check that followup loop-attributes are applied to the loops after
; loop distribution.
;
target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"
define void @f(i32* %a, i32* %b, i32* %c, i32* %d, i32* %e) {
entry:
br label %for.body
for.body:
%ind = phi i64 [ 0, %entry ], [ %add, %for.body ]
%arrayidxA = getelementptr inbounds i32, i32* %a, i64 %ind
%loadA = load i32, i32* %arrayidxA, align 4
%arrayidxB = getelementptr inbounds i32, i32* %b, i64 %ind
%loadB = load i32, i32* %arrayidxB, align 4
%mulA = mul i32 %loadB, %loadA
%add = add nuw nsw i64 %ind, 1
%arrayidxA_plus_4 = getelementptr inbounds i32, i32* %a, i64 %add
store i32 %mulA, i32* %arrayidxA_plus_4, align 4
%arrayidxD = getelementptr inbounds i32, i32* %d, i64 %ind
%loadD = load i32, i32* %arrayidxD, align 4
%arrayidxE = getelementptr inbounds i32, i32* %e, i64 %ind
%loadE = load i32, i32* %arrayidxE, align 4
%mulC = mul i32 %loadD, %loadE
%arrayidxC = getelementptr inbounds i32, i32* %c, i64 %ind
store i32 %mulC, i32* %arrayidxC, align 4
%exitcond = icmp eq i64 %add, 20
br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0
for.end:
ret void
}
!0 = distinct !{!0, !1, !2, !3, !4, !5}
!1 = !{!"llvm.loop.distribute.enable", i1 true}
!2 = !{!"llvm.loop.distribute.followup_all", !{!"FollowupAll"}}
!3 = !{!"llvm.loop.distribute.followup_coincident", !{!"FollowupCoincident", i1 false}}
!4 = !{!"llvm.loop.distribute.followup_sequential", !{!"FollowupSequential", i32 8}}
!5 = !{!"llvm.loop.distribute.followup_fallback", !{!"FollowupFallback"}}
; CHECK-LABEL: for.body.lver.orig:
; CHECK: br i1 %exitcond.lver.orig, label %for.end, label %for.body.lver.orig, !llvm.loop ![[LOOP_ORIG:[0-9]+]]
; CHECK-LABEL: for.body.ldist1:
; CHECK: br i1 %exitcond.ldist1, label %for.body.ph, label %for.body.ldist1, !llvm.loop ![[LOOP_SEQUENTIAL:[0-9]+]]
; CHECK-LABEL: for.body:
; CHECK: br i1 %exitcond, label %for.end, label %for.body, !llvm.loop ![[LOOP_COINCIDENT:[0-9]+]]
; CHECK: ![[LOOP_ORIG]] = distinct !{![[LOOP_ORIG]], ![[FOLLOWUP_ALL:[0-9]+]], ![[FOLLOUP_FALLBACK:[0-9]+]]}
; CHECK: ![[FOLLOWUP_ALL]] = !{!"FollowupAll"}
; CHECK: ![[FOLLOUP_FALLBACK]] = !{!"FollowupFallback"}
; CHECK: ![[LOOP_SEQUENTIAL]] = distinct !{![[LOOP_SEQUENTIAL]], ![[FOLLOWUP_ALL]], ![[FOLLOWUP_SEQUENTIAL:[0-9]+]]}
; CHECK: ![[FOLLOWUP_SEQUENTIAL]] = !{!"FollowupSequential", i32 8}
; CHECK: ![[LOOP_COINCIDENT]] = distinct !{![[LOOP_COINCIDENT]], ![[FOLLOWUP_ALL]], ![[FOLLOWUP_COINCIDENT:[0-9]+]]}
; CHECK: ![[FOLLOWUP_COINCIDENT]] = !{!"FollowupCoincident", i1 false}

View File

@ -0,0 +1,99 @@
; Legacy pass manager
; RUN: opt < %s -transform-warning -disable-output -pass-remarks-missed=transform-warning -pass-remarks-analysis=transform-warning 2>&1 | FileCheck %s
; RUN: opt < %s -transform-warning -disable-output -pass-remarks-output=%t.yaml
; RUN: cat %t.yaml | FileCheck -check-prefix=YAML %s
; New pass manager
; RUN: opt < %s -passes=transform-warning -disable-output -pass-remarks-missed=transform-warning -pass-remarks-analysis=transform-warning 2>&1 | FileCheck %s
; RUN: opt < %s -passes=transform-warning -disable-output -pass-remarks-output=%t.yaml
; RUN: cat %t.yaml | FileCheck -check-prefix=YAML %s
; CHECK: warning: source.cpp:19:5: loop not distributed: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering
; YAML: --- !Failure
; YAML-NEXT: Pass: transform-warning
; YAML-NEXT: Name: FailedRequestedDistribution
; YAML-NEXT: DebugLoc: { File: source.cpp, Line: 19, Column: 5 }
; YAML-NEXT: Function: _Z17test_array_boundsPiS_i
; YAML-NEXT: Args:
; YAML-NEXT: - String: 'loop not distributed: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering'
; YAML-NEXT: ...
target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"
define void @_Z17test_array_boundsPiS_i(i32* nocapture %A, i32* nocapture readonly %B, i32 %Length) !dbg !8 {
entry:
%cmp9 = icmp sgt i32 %Length, 0, !dbg !32
br i1 %cmp9, label %for.body.preheader, label %for.end, !dbg !32
for.body.preheader:
br label %for.body, !dbg !35
for.body:
%indvars.iv = phi i64 [ %indvars.iv.next, %for.body ], [ 0, %for.body.preheader ]
%arrayidx = getelementptr inbounds i32, i32* %B, i64 %indvars.iv, !dbg !35
%0 = load i32, i32* %arrayidx, align 4, !dbg !35, !tbaa !18
%idxprom1 = sext i32 %0 to i64, !dbg !35
%arrayidx2 = getelementptr inbounds i32, i32* %A, i64 %idxprom1, !dbg !35
%1 = load i32, i32* %arrayidx2, align 4, !dbg !35, !tbaa !18
%arrayidx4 = getelementptr inbounds i32, i32* %A, i64 %indvars.iv, !dbg !35
store i32 %1, i32* %arrayidx4, align 4, !dbg !35, !tbaa !18
%indvars.iv.next = add nuw nsw i64 %indvars.iv, 1, !dbg !32
%lftr.wideiv = trunc i64 %indvars.iv.next to i32, !dbg !32
%exitcond = icmp eq i32 %lftr.wideiv, %Length, !dbg !32
br i1 %exitcond, label %for.end.loopexit, label %for.body, !dbg !32, !llvm.loop !50
for.end.loopexit:
br label %for.end
for.end:
ret void, !dbg !36
}
!llvm.dbg.cu = !{!0}
!llvm.module.flags = !{!9, !10}
!llvm.ident = !{!11}
!0 = distinct !DICompileUnit(language: DW_LANG_C_plus_plus, producer: "clang version 3.5.0", isOptimized: true, runtimeVersion: 6, emissionKind: LineTablesOnly, file: !1, enums: !2, retainedTypes: !2, globals: !2, imports: !2)
!1 = !DIFile(filename: "source.cpp", directory: ".")
!2 = !{}
!4 = distinct !DISubprogram(name: "test", line: 1, isLocal: false, isDefinition: true, virtualIndex: 6, flags: DIFlagPrototyped, isOptimized: true, unit: !0, scopeLine: 1, file: !1, scope: !5, type: !6, retainedNodes: !2)
!5 = !DIFile(filename: "source.cpp", directory: ".")
!6 = !DISubroutineType(types: !2)
!7 = distinct !DISubprogram(name: "test_disabled", line: 10, isLocal: false, isDefinition: true, virtualIndex: 6, flags: DIFlagPrototyped, isOptimized: true, unit: !0, scopeLine: 10, file: !1, scope: !5, type: !6, retainedNodes: !2)
!8 = distinct !DISubprogram(name: "test_array_bounds", line: 16, isLocal: false, isDefinition: true, virtualIndex: 6, flags: DIFlagPrototyped, isOptimized: true, unit: !0, scopeLine: 16, file: !1, scope: !5, type: !6, retainedNodes: !2)
!9 = !{i32 2, !"Dwarf Version", i32 2}
!10 = !{i32 2, !"Debug Info Version", i32 3}
!11 = !{!"clang version 3.5.0"}
!12 = !DILocation(line: 3, column: 8, scope: !13)
!13 = distinct !DILexicalBlock(line: 3, column: 3, file: !1, scope: !4)
!16 = !DILocation(line: 4, column: 5, scope: !17)
!17 = distinct !DILexicalBlock(line: 3, column: 36, file: !1, scope: !13)
!18 = !{!19, !19, i64 0}
!19 = !{!"int", !20, i64 0}
!20 = !{!"omnipotent char", !21, i64 0}
!21 = !{!"Simple C/C++ TBAA"}
!22 = !DILocation(line: 5, column: 9, scope: !23)
!23 = distinct !DILexicalBlock(line: 5, column: 9, file: !1, scope: !17)
!24 = !DILocation(line: 8, column: 1, scope: !4)
!25 = !DILocation(line: 12, column: 8, scope: !26)
!26 = distinct !DILexicalBlock(line: 12, column: 3, file: !1, scope: !7)
!30 = !DILocation(line: 13, column: 5, scope: !26)
!31 = !DILocation(line: 14, column: 1, scope: !7)
!32 = !DILocation(line: 18, column: 8, scope: !33)
!33 = distinct !DILexicalBlock(line: 18, column: 3, file: !1, scope: !8)
!35 = !DILocation(line: 19, column: 5, scope: !33)
!36 = !DILocation(line: 20, column: 1, scope: !8)
!37 = distinct !DILexicalBlock(line: 24, column: 3, file: !1, scope: !46)
!38 = !DILocation(line: 27, column: 3, scope: !37)
!39 = !DILocation(line: 31, column: 3, scope: !37)
!40 = !DILocation(line: 28, column: 9, scope: !37)
!41 = !DILocation(line: 29, column: 11, scope: !37)
!42 = !DILocation(line: 29, column: 7, scope: !37)
!43 = !DILocation(line: 27, column: 32, scope: !37)
!44 = !DILocation(line: 27, column: 30, scope: !37)
!45 = !DILocation(line: 27, column: 21, scope: !37)
!46 = distinct !DISubprogram(name: "test_multiple_failures", line: 26, isLocal: false, isDefinition: true, virtualIndex: 6, flags: DIFlagPrototyped, isOptimized: true, unit: !0, scopeLine: 26, file: !1, scope: !5, type: !6, retainedNodes: !2)
!50 = !{!50, !{!"llvm.loop.distribute.enable"}}

View File

@ -0,0 +1,99 @@
; Legacy pass manager
; RUN: opt < %s -transform-warning -disable-output -pass-remarks-missed=transform-warning -pass-remarks-analysis=transform-warning 2>&1 | FileCheck %s
; RUN: opt < %s -transform-warning -disable-output -pass-remarks-output=%t.yaml
; RUN: cat %t.yaml | FileCheck -check-prefix=YAML %s
; New pass manager
; RUN: opt < %s -passes=transform-warning -disable-output -pass-remarks-missed=transform-warning -pass-remarks-analysis=transform-warning 2>&1 | FileCheck %s
; RUN: opt < %s -passes=transform-warning -disable-output -pass-remarks-output=%t.yaml
; RUN: cat %t.yaml | FileCheck -check-prefix=YAML %s
; CHECK: warning: source.cpp:19:5: loop not unroll-and-jammed: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering
; YAML: --- !Failure
; YAML-NEXT: Pass: transform-warning
; YAML-NEXT: Name: FailedRequestedUnrollAndJamming
; YAML-NEXT: DebugLoc: { File: source.cpp, Line: 19, Column: 5 }
; YAML-NEXT: Function: _Z17test_array_boundsPiS_i
; YAML-NEXT: Args:
; YAML-NEXT: - String: 'loop not unroll-and-jammed: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering'
; YAML-NEXT: ...
target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"
define void @_Z17test_array_boundsPiS_i(i32* nocapture %A, i32* nocapture readonly %B, i32 %Length) !dbg !8 {
entry:
%cmp9 = icmp sgt i32 %Length, 0, !dbg !32
br i1 %cmp9, label %for.body.preheader, label %for.end, !dbg !32
for.body.preheader:
br label %for.body, !dbg !35
for.body:
%indvars.iv = phi i64 [ %indvars.iv.next, %for.body ], [ 0, %for.body.preheader ]
%arrayidx = getelementptr inbounds i32, i32* %B, i64 %indvars.iv, !dbg !35
%0 = load i32, i32* %arrayidx, align 4, !dbg !35, !tbaa !18
%idxprom1 = sext i32 %0 to i64, !dbg !35
%arrayidx2 = getelementptr inbounds i32, i32* %A, i64 %idxprom1, !dbg !35
%1 = load i32, i32* %arrayidx2, align 4, !dbg !35, !tbaa !18
%arrayidx4 = getelementptr inbounds i32, i32* %A, i64 %indvars.iv, !dbg !35
store i32 %1, i32* %arrayidx4, align 4, !dbg !35, !tbaa !18
%indvars.iv.next = add nuw nsw i64 %indvars.iv, 1, !dbg !32
%lftr.wideiv = trunc i64 %indvars.iv.next to i32, !dbg !32
%exitcond = icmp eq i32 %lftr.wideiv, %Length, !dbg !32
br i1 %exitcond, label %for.end.loopexit, label %for.body, !dbg !32, !llvm.loop !50
for.end.loopexit:
br label %for.end
for.end:
ret void, !dbg !36
}
!llvm.dbg.cu = !{!0}
!llvm.module.flags = !{!9, !10}
!llvm.ident = !{!11}
!0 = distinct !DICompileUnit(language: DW_LANG_C_plus_plus, producer: "clang version 3.5.0", isOptimized: true, runtimeVersion: 6, emissionKind: LineTablesOnly, file: !1, enums: !2, retainedTypes: !2, globals: !2, imports: !2)
!1 = !DIFile(filename: "source.cpp", directory: ".")
!2 = !{}
!4 = distinct !DISubprogram(name: "test", line: 1, isLocal: false, isDefinition: true, virtualIndex: 6, flags: DIFlagPrototyped, isOptimized: true, unit: !0, scopeLine: 1, file: !1, scope: !5, type: !6, retainedNodes: !2)
!5 = !DIFile(filename: "source.cpp", directory: ".")
!6 = !DISubroutineType(types: !2)
!7 = distinct !DISubprogram(name: "test_disabled", line: 10, isLocal: false, isDefinition: true, virtualIndex: 6, flags: DIFlagPrototyped, isOptimized: true, unit: !0, scopeLine: 10, file: !1, scope: !5, type: !6, retainedNodes: !2)
!8 = distinct !DISubprogram(name: "test_array_bounds", line: 16, isLocal: false, isDefinition: true, virtualIndex: 6, flags: DIFlagPrototyped, isOptimized: true, unit: !0, scopeLine: 16, file: !1, scope: !5, type: !6, retainedNodes: !2)
!9 = !{i32 2, !"Dwarf Version", i32 2}
!10 = !{i32 2, !"Debug Info Version", i32 3}
!11 = !{!"clang version 3.5.0"}
!12 = !DILocation(line: 3, column: 8, scope: !13)
!13 = distinct !DILexicalBlock(line: 3, column: 3, file: !1, scope: !4)
!16 = !DILocation(line: 4, column: 5, scope: !17)
!17 = distinct !DILexicalBlock(line: 3, column: 36, file: !1, scope: !13)
!18 = !{!19, !19, i64 0}
!19 = !{!"int", !20, i64 0}
!20 = !{!"omnipotent char", !21, i64 0}
!21 = !{!"Simple C/C++ TBAA"}
!22 = !DILocation(line: 5, column: 9, scope: !23)
!23 = distinct !DILexicalBlock(line: 5, column: 9, file: !1, scope: !17)
!24 = !DILocation(line: 8, column: 1, scope: !4)
!25 = !DILocation(line: 12, column: 8, scope: !26)
!26 = distinct !DILexicalBlock(line: 12, column: 3, file: !1, scope: !7)
!30 = !DILocation(line: 13, column: 5, scope: !26)
!31 = !DILocation(line: 14, column: 1, scope: !7)
!32 = !DILocation(line: 18, column: 8, scope: !33)
!33 = distinct !DILexicalBlock(line: 18, column: 3, file: !1, scope: !8)
!35 = !DILocation(line: 19, column: 5, scope: !33)
!36 = !DILocation(line: 20, column: 1, scope: !8)
!37 = distinct !DILexicalBlock(line: 24, column: 3, file: !1, scope: !46)
!38 = !DILocation(line: 27, column: 3, scope: !37)
!39 = !DILocation(line: 31, column: 3, scope: !37)
!40 = !DILocation(line: 28, column: 9, scope: !37)
!41 = !DILocation(line: 29, column: 11, scope: !37)
!42 = !DILocation(line: 29, column: 7, scope: !37)
!43 = !DILocation(line: 27, column: 32, scope: !37)
!44 = !DILocation(line: 27, column: 30, scope: !37)
!45 = !DILocation(line: 27, column: 21, scope: !37)
!46 = distinct !DISubprogram(name: "test_multiple_failures", line: 26, isLocal: false, isDefinition: true, virtualIndex: 6, flags: DIFlagPrototyped, isOptimized: true, unit: !0, scopeLine: 26, file: !1, scope: !5, type: !6, retainedNodes: !2)
!50 = !{!50, !{!"llvm.loop.unroll_and_jam.enable"}}

View File

@ -0,0 +1,99 @@
; Legacy pass manager
; RUN: opt < %s -transform-warning -disable-output -pass-remarks-missed=transform-warning -pass-remarks-analysis=transform-warning 2>&1 | FileCheck %s
; RUN: opt < %s -transform-warning -disable-output -pass-remarks-output=%t.yaml
; RUN: cat %t.yaml | FileCheck -check-prefix=YAML %s
; New pass manager
; RUN: opt < %s -passes=transform-warning -disable-output -pass-remarks-missed=transform-warning -pass-remarks-analysis=transform-warning 2>&1 | FileCheck %s
; RUN: opt < %s -passes=transform-warning -disable-output -pass-remarks-output=%t.yaml
; RUN: cat %t.yaml | FileCheck -check-prefix=YAML %s
; CHECK: warning: source.cpp:19:5: loop not unrolled: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering
; YAML: --- !Failure
; YAML-NEXT: Pass: transform-warning
; YAML-NEXT: Name: FailedRequestedUnrolling
; YAML-NEXT: DebugLoc: { File: source.cpp, Line: 19, Column: 5 }
; YAML-NEXT: Function: _Z17test_array_boundsPiS_i
; YAML-NEXT: Args:
; YAML-NEXT: - String: 'loop not unrolled: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering'
; YAML-NEXT: ...
target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"
define void @_Z17test_array_boundsPiS_i(i32* nocapture %A, i32* nocapture readonly %B, i32 %Length) !dbg !8 {
entry:
%cmp9 = icmp sgt i32 %Length, 0, !dbg !32
br i1 %cmp9, label %for.body.preheader, label %for.end, !dbg !32
for.body.preheader:
br label %for.body, !dbg !35
for.body:
%indvars.iv = phi i64 [ %indvars.iv.next, %for.body ], [ 0, %for.body.preheader ]
%arrayidx = getelementptr inbounds i32, i32* %B, i64 %indvars.iv, !dbg !35
%0 = load i32, i32* %arrayidx, align 4, !dbg !35, !tbaa !18
%idxprom1 = sext i32 %0 to i64, !dbg !35
%arrayidx2 = getelementptr inbounds i32, i32* %A, i64 %idxprom1, !dbg !35
%1 = load i32, i32* %arrayidx2, align 4, !dbg !35, !tbaa !18
%arrayidx4 = getelementptr inbounds i32, i32* %A, i64 %indvars.iv, !dbg !35
store i32 %1, i32* %arrayidx4, align 4, !dbg !35, !tbaa !18
%indvars.iv.next = add nuw nsw i64 %indvars.iv, 1, !dbg !32
%lftr.wideiv = trunc i64 %indvars.iv.next to i32, !dbg !32
%exitcond = icmp eq i32 %lftr.wideiv, %Length, !dbg !32
br i1 %exitcond, label %for.end.loopexit, label %for.body, !dbg !32, !llvm.loop !50
for.end.loopexit:
br label %for.end
for.end:
ret void, !dbg !36
}
!llvm.dbg.cu = !{!0}
!llvm.module.flags = !{!9, !10}
!llvm.ident = !{!11}
!0 = distinct !DICompileUnit(language: DW_LANG_C_plus_plus, producer: "clang version 3.5.0", isOptimized: true, runtimeVersion: 6, emissionKind: LineTablesOnly, file: !1, enums: !2, retainedTypes: !2, globals: !2, imports: !2)
!1 = !DIFile(filename: "source.cpp", directory: ".")
!2 = !{}
!4 = distinct !DISubprogram(name: "test", line: 1, isLocal: false, isDefinition: true, virtualIndex: 6, flags: DIFlagPrototyped, isOptimized: true, unit: !0, scopeLine: 1, file: !1, scope: !5, type: !6, retainedNodes: !2)
!5 = !DIFile(filename: "source.cpp", directory: ".")
!6 = !DISubroutineType(types: !2)
!7 = distinct !DISubprogram(name: "test_disabled", line: 10, isLocal: false, isDefinition: true, virtualIndex: 6, flags: DIFlagPrototyped, isOptimized: true, unit: !0, scopeLine: 10, file: !1, scope: !5, type: !6, retainedNodes: !2)
!8 = distinct !DISubprogram(name: "test_array_bounds", line: 16, isLocal: false, isDefinition: true, virtualIndex: 6, flags: DIFlagPrototyped, isOptimized: true, unit: !0, scopeLine: 16, file: !1, scope: !5, type: !6, retainedNodes: !2)
!9 = !{i32 2, !"Dwarf Version", i32 2}
!10 = !{i32 2, !"Debug Info Version", i32 3}
!11 = !{!"clang version 3.5.0"}
!12 = !DILocation(line: 3, column: 8, scope: !13)
!13 = distinct !DILexicalBlock(line: 3, column: 3, file: !1, scope: !4)
!16 = !DILocation(line: 4, column: 5, scope: !17)
!17 = distinct !DILexicalBlock(line: 3, column: 36, file: !1, scope: !13)
!18 = !{!19, !19, i64 0}
!19 = !{!"int", !20, i64 0}
!20 = !{!"omnipotent char", !21, i64 0}
!21 = !{!"Simple C/C++ TBAA"}
!22 = !DILocation(line: 5, column: 9, scope: !23)
!23 = distinct !DILexicalBlock(line: 5, column: 9, file: !1, scope: !17)
!24 = !DILocation(line: 8, column: 1, scope: !4)
!25 = !DILocation(line: 12, column: 8, scope: !26)
!26 = distinct !DILexicalBlock(line: 12, column: 3, file: !1, scope: !7)
!30 = !DILocation(line: 13, column: 5, scope: !26)
!31 = !DILocation(line: 14, column: 1, scope: !7)
!32 = !DILocation(line: 18, column: 8, scope: !33)
!33 = distinct !DILexicalBlock(line: 18, column: 3, file: !1, scope: !8)
!35 = !DILocation(line: 19, column: 5, scope: !33)
!36 = !DILocation(line: 20, column: 1, scope: !8)
!37 = distinct !DILexicalBlock(line: 24, column: 3, file: !1, scope: !46)
!38 = !DILocation(line: 27, column: 3, scope: !37)
!39 = !DILocation(line: 31, column: 3, scope: !37)
!40 = !DILocation(line: 28, column: 9, scope: !37)
!41 = !DILocation(line: 29, column: 11, scope: !37)
!42 = !DILocation(line: 29, column: 7, scope: !37)
!43 = !DILocation(line: 27, column: 32, scope: !37)
!44 = !DILocation(line: 27, column: 30, scope: !37)
!45 = !DILocation(line: 27, column: 21, scope: !37)
!46 = distinct !DISubprogram(name: "test_multiple_failures", line: 26, isLocal: false, isDefinition: true, virtualIndex: 6, flags: DIFlagPrototyped, isOptimized: true, unit: !0, scopeLine: 26, file: !1, scope: !5, type: !6, retainedNodes: !2)
!50 = !{!50, !{!"llvm.loop.unroll.enable"}}

View File

@ -0,0 +1,113 @@
; Legacy pass manager
; RUN: opt < %s -transform-warning -disable-output -pass-remarks-missed=transform-warning -pass-remarks-analysis=transform-warning 2>&1 | FileCheck %s
; RUN: opt < %s -transform-warning -disable-output -pass-remarks-output=%t.yaml
; RUN: cat %t.yaml | FileCheck -check-prefix=YAML %s
; New pass manager
; RUN: opt < %s -passes=transform-warning -disable-output -pass-remarks-missed=transform-warning -pass-remarks-analysis=transform-warning 2>&1 | FileCheck %s
; RUN: opt < %s -passes=transform-warning -disable-output -pass-remarks-output=%t.yaml
; RUN: cat %t.yaml | FileCheck -check-prefix=YAML %s
; C/C++ code for tests
; void test(int *A, int Length) {
; #pragma clang loop vectorize(enable) interleave(enable)
; for (int i = 0; i < Length; i++) {
; A[i] = i;
; if (A[i] > Length)
; break;
; }
; }
; File, line, and column should match those specified in the metadata
; CHECK: warning: source.cpp:19:5: loop not vectorized: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering
; YAML: --- !Failure
; YAML-NEXT: Pass: transform-warning
; YAML-NEXT: Name: FailedRequestedVectorization
; YAML-NEXT: DebugLoc: { File: source.cpp, Line: 19, Column: 5 }
; YAML-NEXT: Function: _Z17test_array_boundsPiS_i
; YAML-NEXT: Args:
; YAML-NEXT: - String: 'loop not vectorized: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering'
; YAML-NEXT: ...
target datalayout = "e-m:o-i64:64-f80:128-n8:16:32:64-S128"
define void @_Z17test_array_boundsPiS_i(i32* nocapture %A, i32* nocapture readonly %B, i32 %Length) !dbg !8 {
entry:
%cmp9 = icmp sgt i32 %Length, 0, !dbg !32
br i1 %cmp9, label %for.body.preheader, label %for.end, !dbg !32, !llvm.loop !34
for.body.preheader:
br label %for.body, !dbg !35
for.body:
%indvars.iv = phi i64 [ %indvars.iv.next, %for.body ], [ 0, %for.body.preheader ]
%arrayidx = getelementptr inbounds i32, i32* %B, i64 %indvars.iv, !dbg !35
%0 = load i32, i32* %arrayidx, align 4, !dbg !35, !tbaa !18
%idxprom1 = sext i32 %0 to i64, !dbg !35
%arrayidx2 = getelementptr inbounds i32, i32* %A, i64 %idxprom1, !dbg !35
%1 = load i32, i32* %arrayidx2, align 4, !dbg !35, !tbaa !18
%arrayidx4 = getelementptr inbounds i32, i32* %A, i64 %indvars.iv, !dbg !35
store i32 %1, i32* %arrayidx4, align 4, !dbg !35, !tbaa !18
%indvars.iv.next = add nuw nsw i64 %indvars.iv, 1, !dbg !32
%lftr.wideiv = trunc i64 %indvars.iv.next to i32, !dbg !32
%exitcond = icmp eq i32 %lftr.wideiv, %Length, !dbg !32
br i1 %exitcond, label %for.end.loopexit, label %for.body, !dbg !32, !llvm.loop !34
for.end.loopexit:
br label %for.end
for.end:
ret void, !dbg !36
}
!llvm.dbg.cu = !{!0}
!llvm.module.flags = !{!9, !10}
!llvm.ident = !{!11}
!0 = distinct !DICompileUnit(language: DW_LANG_C_plus_plus, producer: "clang version 3.5.0", isOptimized: true, runtimeVersion: 6, emissionKind: LineTablesOnly, file: !1, enums: !2, retainedTypes: !2, globals: !2, imports: !2)
!1 = !DIFile(filename: "source.cpp", directory: ".")
!2 = !{}
!4 = distinct !DISubprogram(name: "test", line: 1, isLocal: false, isDefinition: true, virtualIndex: 6, flags: DIFlagPrototyped, isOptimized: true, unit: !0, scopeLine: 1, file: !1, scope: !5, type: !6, retainedNodes: !2)
!5 = !DIFile(filename: "source.cpp", directory: ".")
!6 = !DISubroutineType(types: !2)
!7 = distinct !DISubprogram(name: "test_disabled", line: 10, isLocal: false, isDefinition: true, virtualIndex: 6, flags: DIFlagPrototyped, isOptimized: true, unit: !0, scopeLine: 10, file: !1, scope: !5, type: !6, retainedNodes: !2)
!8 = distinct !DISubprogram(name: "test_array_bounds", line: 16, isLocal: false, isDefinition: true, virtualIndex: 6, flags: DIFlagPrototyped, isOptimized: true, unit: !0, scopeLine: 16, file: !1, scope: !5, type: !6, retainedNodes: !2)
!9 = !{i32 2, !"Dwarf Version", i32 2}
!10 = !{i32 2, !"Debug Info Version", i32 3}
!11 = !{!"clang version 3.5.0"}
!12 = !DILocation(line: 3, column: 8, scope: !13)
!13 = distinct !DILexicalBlock(line: 3, column: 3, file: !1, scope: !4)
!14 = !{!14, !15, !15}
!15 = !{!"llvm.loop.vectorize.enable", i1 true}
!16 = !DILocation(line: 4, column: 5, scope: !17)
!17 = distinct !DILexicalBlock(line: 3, column: 36, file: !1, scope: !13)
!18 = !{!19, !19, i64 0}
!19 = !{!"int", !20, i64 0}
!20 = !{!"omnipotent char", !21, i64 0}
!21 = !{!"Simple C/C++ TBAA"}
!22 = !DILocation(line: 5, column: 9, scope: !23)
!23 = distinct !DILexicalBlock(line: 5, column: 9, file: !1, scope: !17)
!24 = !DILocation(line: 8, column: 1, scope: !4)
!25 = !DILocation(line: 12, column: 8, scope: !26)
!26 = distinct !DILexicalBlock(line: 12, column: 3, file: !1, scope: !7)
!27 = !{!27, !28, !29}
!28 = !{!"llvm.loop.interleave.count", i32 1}
!29 = !{!"llvm.loop.vectorize.width", i32 1}
!30 = !DILocation(line: 13, column: 5, scope: !26)
!31 = !DILocation(line: 14, column: 1, scope: !7)
!32 = !DILocation(line: 18, column: 8, scope: !33)
!33 = distinct !DILexicalBlock(line: 18, column: 3, file: !1, scope: !8)
!34 = !{!34, !15}
!35 = !DILocation(line: 19, column: 5, scope: !33)
!36 = !DILocation(line: 20, column: 1, scope: !8)
!37 = distinct !DILexicalBlock(line: 24, column: 3, file: !1, scope: !46)
!38 = !DILocation(line: 27, column: 3, scope: !37)
!39 = !DILocation(line: 31, column: 3, scope: !37)
!40 = !DILocation(line: 28, column: 9, scope: !37)
!41 = !DILocation(line: 29, column: 11, scope: !37)
!42 = !DILocation(line: 29, column: 7, scope: !37)
!43 = !DILocation(line: 27, column: 32, scope: !37)
!44 = !DILocation(line: 27, column: 30, scope: !37)
!45 = !DILocation(line: 27, column: 21, scope: !37)
!46 = distinct !DISubprogram(name: "test_multiple_failures", line: 26, isLocal: false, isDefinition: true, virtualIndex: 6, flags: DIFlagPrototyped, isOptimized: true, unit: !0, scopeLine: 26, file: !1, scope: !5, type: !6, retainedNodes: !2)

View File

@ -0,0 +1,29 @@
; RUN: opt -loop-unroll -unroll-count=2 -S < %s | FileCheck %s
;
; Check that the disable_nonforced loop property is honored by
; loop unroll.
;
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
; CHECK-LABEL: @disable_nonforced(
; CHECK: load
; CHECK-NOT: load
define void @disable_nonforced(i32* nocapture %a) {
entry:
br label %for.body
for.body:
%indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next, %for.body ]
%arrayidx = getelementptr inbounds i32, i32* %a, i64 %indvars.iv
%0 = load i32, i32* %arrayidx, align 4
%inc = add nsw i32 %0, 1
store i32 %inc, i32* %arrayidx, align 4
%indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
%exitcond = icmp eq i64 %indvars.iv.next, 64
br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0
for.end:
ret void
}
!0 = !{!0, !{!"llvm.loop.disable_nonforced"}}

View File

@ -0,0 +1,30 @@
; RUN: opt -loop-unroll -unroll-count=2 -S < %s | FileCheck %s
;
; Check whether the llvm.loop.unroll.count loop property overrides
; llvm.loop.disable_nonforced.
;
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
; CHECK-LABEL: @disable_nonforced_count(
; CHECK: store
; CHECK: store
; CHECK-NOT: store
define void @disable_nonforced_count(i32* nocapture %a) {
entry:
br label %for.body
for.body:
%indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next, %for.body ]
%arrayidx = getelementptr inbounds i32, i32* %a, i64 %indvars.iv
%0 = load i32, i32* %arrayidx, align 4
%inc = add nsw i32 %0, 1
store i32 %inc, i32* %arrayidx, align 4
%indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
%exitcond = icmp eq i64 %indvars.iv.next, 64
br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0
for.end:
ret void
}
!0 = !{!0, !{!"llvm.loop.disable_nonforced"}, !{!"llvm.loop.unroll.count", i32 2}}

View File

@ -0,0 +1,30 @@
; RUN: opt -loop-unroll -unroll-count=2 -S < %s | FileCheck %s
;
; Check that the llvm.loop.unroll.enable loop property overrides
; llvm.loop.disable_nonforced.
;
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
; CHECK-LABEL: @disable_nonforced_enable(
; CHECK: store
; CHECK: store
; CHECK-NOT: store
define void @disable_nonforced_enable(i32* nocapture %a) {
entry:
br label %for.body
for.body:
%indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next, %for.body ]
%arrayidx = getelementptr inbounds i32, i32* %a, i64 %indvars.iv
%0 = load i32, i32* %arrayidx, align 4
%inc = add nsw i32 %0, 1
store i32 %inc, i32* %arrayidx, align 4
%indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
%exitcond = icmp eq i64 %indvars.iv.next, 64
br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0
for.end:
ret void
}
!0 = !{!0, !{!"llvm.loop.disable_nonforced"}, !{!"llvm.loop.unroll.enable"}}

View File

@ -0,0 +1,32 @@
; RUN: opt -loop-unroll -S < %s | FileCheck %s
;
; Check that the llvm.loop.unroll.full loop property overrides
; llvm.loop.disable_nonforced.
;
target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
; CHECK-LABEL: @disable_nonforced_full(
; CHECK: store
; CHECK: store
; CHECK: store
; CHECK: store
; CHECK-NOT: store
define void @disable_nonforced_full(i32* nocapture %a) {
entry:
br label %for.body
for.body:
%indvars.iv = phi i64 [ 0, %entry ], [ %indvars.iv.next, %for.body ]
%arrayidx = getelementptr inbounds i32, i32* %a, i64 %indvars.iv
%0 = load i32, i32* %arrayidx, align 4
%inc = add nsw i32 %0, 1
store i32 %inc, i32* %arrayidx, align 4
%indvars.iv.next = add nuw nsw i64 %indvars.iv, 1
%exitcond = icmp eq i64 %indvars.iv.next, 4
br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0
for.end:
ret void
}
!0 = !{!0, !{!"llvm.loop.disable_nonforced"}, !{!"llvm.loop.unroll.full"}}

View File

@ -0,0 +1,63 @@
; RUN: opt < %s -S -loop-unroll -unroll-count=2 | FileCheck %s -check-prefixes=COUNT,COMMON
; RUN: opt < %s -S -loop-unroll -unroll-runtime=true -unroll-runtime-epilog=true | FileCheck %s -check-prefixes=EPILOG,COMMON
; RUN: opt < %s -S -loop-unroll -unroll-runtime=true -unroll-runtime-epilog=false | FileCheck %s -check-prefixes=PROLOG,COMMON
;
; Check that followup-attributes are applied after LoopUnroll.
;
target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64-S128"
define i32 @test(i32* nocapture %a, i32 %n) nounwind uwtable readonly {
entry:
%cmp1 = icmp eq i32 %n, 0
br i1 %cmp1, label %for.end, label %for.body
for.body: ; preds = %for.body, %entry
%indvars.iv = phi i64 [ %indvars.iv.next, %for.body ], [ 0, %entry ]
%sum.02 = phi i32 [ %add, %for.body ], [ 0, %entry ]
%arrayidx = getelementptr inbounds i32, i32* %a, i64 %indvars.iv
%0 = load i32, i32* %arrayidx, align 4
%add = add nsw i32 %0, %sum.02
%indvars.iv.next = add i64 %indvars.iv, 1
%lftr.wideiv = trunc i64 %indvars.iv.next to i32
%exitcond = icmp eq i32 %lftr.wideiv, %n
br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !4
for.end: ; preds = %for.body, %entry
%sum.0.lcssa = phi i32 [ 0, %entry ], [ %add, %for.body ]
ret i32 %sum.0.lcssa
}
!1 = !{!"llvm.loop.unroll.followup_all", !{!"FollowupAll"}}
!2 = !{!"llvm.loop.unroll.followup_unrolled", !{!"FollowupUnrolled"}}
!3 = !{!"llvm.loop.unroll.followup_remainder", !{!"FollowupRemainder"}}
!4 = distinct !{!4, !1, !2, !3}
; COMMON-LABEL: @test(
; COUNT: br i1 %exitcond.1, label %for.end.loopexit, label %for.body, !llvm.loop ![[LOOP:[0-9]+]]
; COUNT: ![[FOLLOWUP_ALL:[0-9]+]] = !{!"FollowupAll"}
; COUNT: ![[FOLLOWUP_UNROLLED:[0-9]+]] = !{!"FollowupUnrolled"}
; COUNT: ![[LOOP]] = distinct !{![[LOOP]], ![[FOLLOWUP_ALL]], ![[FOLLOWUP_UNROLLED]]}
; EPILOG: br i1 %niter.ncmp.7, label %for.end.loopexit.unr-lcssa.loopexit, label %for.body, !llvm.loop ![[LOOP_0:[0-9]+]]
; EPILOG: br i1 %epil.iter.cmp, label %for.body.epil, label %for.end.loopexit.epilog-lcssa, !llvm.loop ![[LOOP_2:[0-9]+]]
; EPILOG: ![[LOOP_0]] = distinct !{![[LOOP_0]], ![[FOLLOWUP_ALL:[0-9]+]], ![[FOLLOWUP_UNROLLED:[0-9]+]]}
; EPILOG: ![[FOLLOWUP_ALL]] = !{!"FollowupAll"}
; EPILOG: ![[FOLLOWUP_UNROLLED]] = !{!"FollowupUnrolled"}
; EPILOG: ![[LOOP_2]] = distinct !{![[LOOP_2]], ![[FOLLOWUP_ALL]], ![[FOLLOWUP_REMAINDER:[0-9]+]]}
; EPILOG: ![[FOLLOWUP_REMAINDER]] = !{!"FollowupRemainder"}
; PROLOG: br i1 %prol.iter.cmp, label %for.body.prol, label %for.body.prol.loopexit.unr-lcssa, !llvm.loop ![[LOOP_0:[0-9]+]]
; PROLOG: br i1 %exitcond.7, label %for.end.loopexit.unr-lcssa, label %for.body, !llvm.loop ![[LOOP_2:[0-9]+]]
; PROLOG: ![[LOOP_0]] = distinct !{![[LOOP_0]], ![[FOLLOWUP_ALL:[0-9]+]], ![[FOLLOWUP_REMAINDER:[0-9]+]]}
; PROLOG: ![[FOLLOWUP_ALL]] = !{!"FollowupAll"}
; PROLOG: ![[FOLLOWUP_REMAINDER]] = !{!"FollowupRemainder"}
; PROLOG: ![[LOOP_2]] = distinct !{![[LOOP_2]], ![[FOLLOWUP_ALL]], ![[FOLLOWUP_UNROLLED:[0-9]+]]}
; PROLOG: ![[FOLLOWUP_UNROLLED]] = !{!"FollowupUnrolled"}

View File

@ -0,0 +1,50 @@
; RUN: opt -loop-unroll-and-jam -allow-unroll-and-jam -unroll-and-jam-count=2 -S < %s | FileCheck %s
;
; Check that the disable_nonforced loop property is honored by
; loop unroll-and-jam.
;
target datalayout = "e-m:e-p:32:32-i64:64-v128:64:128-a:0:32-n32-S64"
; CHECK-LABEL: disable_nonforced
; CHECK: load
; CHECK-NOT: load
define void @disable_nonforced(i32 %I, i32 %J, i32* noalias nocapture %A, i32* noalias nocapture readonly %B) {
entry:
%cmp = icmp ne i32 %J, 0
%cmp122 = icmp ne i32 %I, 0
%or.cond = and i1 %cmp, %cmp122
br i1 %or.cond, label %for.outer.preheader, label %for.end
for.outer.preheader:
br label %for.outer
for.outer:
%i.us = phi i32 [ %add8.us, %for.latch ], [ 0, %for.outer.preheader ]
br label %for.inner
for.inner:
%j.us = phi i32 [ 0, %for.outer ], [ %inc.us, %for.inner ]
%sum1.us = phi i32 [ 0, %for.outer ], [ %add.us, %for.inner ]
%arrayidx.us = getelementptr inbounds i32, i32* %B, i32 %j.us
%0 = load i32, i32* %arrayidx.us, align 4
%add.us = add i32 %0, %sum1.us
%inc.us = add nuw i32 %j.us, 1
%exitcond = icmp eq i32 %inc.us, %J
br i1 %exitcond, label %for.latch, label %for.inner
for.latch:
%add.us.lcssa = phi i32 [ %add.us, %for.inner ]
%arrayidx6.us = getelementptr inbounds i32, i32* %A, i32 %i.us
store i32 %add.us.lcssa, i32* %arrayidx6.us, align 4
%add8.us = add nuw i32 %i.us, 1
%exitcond25 = icmp eq i32 %add8.us, %I
br i1 %exitcond25, label %for.end.loopexit, label %for.outer, !llvm.loop !0
for.end.loopexit:
br label %for.end
for.end:
ret void
}
!0 = distinct !{!0, !{!"llvm.loop.disable_nonforced"}}

View File

@ -0,0 +1,52 @@
; RUN: opt -loop-unroll-and-jam -allow-unroll-and-jam -S < %s | FileCheck %s
;
; Verify that the llvm.loop.unroll_and_jam.count loop property overrides
; llvm.loop.disable_nonforced.
;
target datalayout = "e-m:e-p:32:32-i64:64-v128:64:128-a:0:32-n32-S64"
; CHECK-LABEL: @disable_nonforced_enable(
; CHECK: load
; CHECK: load
; CHECK-NOT: load
; CHECK: br i1
define void @disable_nonforced_enable(i32 %I, i32 %J, i32* noalias nocapture %A, i32* noalias nocapture readonly %B) {
entry:
%cmp = icmp ne i32 %J, 0
%cmp122 = icmp ne i32 %I, 0
%or.cond = and i1 %cmp, %cmp122
br i1 %or.cond, label %for.outer.preheader, label %for.end
for.outer.preheader:
br label %for.outer
for.outer:
%i.us = phi i32 [ %add8.us, %for.latch ], [ 0, %for.outer.preheader ]
br label %for.inner
for.inner:
%j.us = phi i32 [ 0, %for.outer ], [ %inc.us, %for.inner ]
%sum1.us = phi i32 [ 0, %for.outer ], [ %add.us, %for.inner ]
%arrayidx.us = getelementptr inbounds i32, i32* %B, i32 %j.us
%0 = load i32, i32* %arrayidx.us, align 4
%add.us = add i32 %0, %sum1.us
%inc.us = add nuw i32 %j.us, 1
%exitcond = icmp eq i32 %inc.us, %J
br i1 %exitcond, label %for.latch, label %for.inner
for.latch:
%add.us.lcssa = phi i32 [ %add.us, %for.inner ]
%arrayidx6.us = getelementptr inbounds i32, i32* %A, i32 %i.us
store i32 %add.us.lcssa, i32* %arrayidx6.us, align 4
%add8.us = add nuw i32 %i.us, 1
%exitcond25 = icmp eq i32 %add8.us, %I
br i1 %exitcond25, label %for.end.loopexit, label %for.outer, !llvm.loop !0
for.end.loopexit:
br label %for.end
for.end:
ret void
}
!0 = distinct !{!0, !{!"llvm.loop.disable_nonforced"}, !{!"llvm.loop.unroll_and_jam.count", i32 2}}

View File

@ -0,0 +1,52 @@
; RUN: opt -loop-unroll-and-jam -allow-unroll-and-jam -unroll-and-jam-count=2 -S < %s | FileCheck %s
;
; Verify that the llvm.loop.unroll_and_jam.enable loop property
; overrides llvm.loop.disable_nonforced.
;
target datalayout = "e-m:e-p:32:32-i64:64-v128:64:128-a:0:32-n32-S64"
; CHECK-LABEL: disable_nonforced_enable
; CHECK: load
; CHECK: load
; CHECK-NOT: load
; CHECK: br i1
define void @disable_nonforced_enable(i32 %I, i32 %J, i32* noalias nocapture %A, i32* noalias nocapture readonly %B) {
entry:
%cmp = icmp ne i32 %J, 0
%cmp122 = icmp ne i32 %I, 0
%or.cond = and i1 %cmp, %cmp122
br i1 %or.cond, label %for.outer.preheader, label %for.end
for.outer.preheader:
br label %for.outer
for.outer:
%i.us = phi i32 [ %add8.us, %for.latch ], [ 0, %for.outer.preheader ]
br label %for.inner
for.inner:
%j.us = phi i32 [ 0, %for.outer ], [ %inc.us, %for.inner ]
%sum1.us = phi i32 [ 0, %for.outer ], [ %add.us, %for.inner ]
%arrayidx.us = getelementptr inbounds i32, i32* %B, i32 %j.us
%0 = load i32, i32* %arrayidx.us, align 4
%add.us = add i32 %0, %sum1.us
%inc.us = add nuw i32 %j.us, 1
%exitcond = icmp eq i32 %inc.us, %J
br i1 %exitcond, label %for.latch, label %for.inner
for.latch:
%add.us.lcssa = phi i32 [ %add.us, %for.inner ]
%arrayidx6.us = getelementptr inbounds i32, i32* %A, i32 %i.us
store i32 %add.us.lcssa, i32* %arrayidx6.us, align 4
%add8.us = add nuw i32 %i.us, 1
%exitcond25 = icmp eq i32 %add8.us, %I
br i1 %exitcond25, label %for.end.loopexit, label %for.outer, !llvm.loop !0
for.end.loopexit:
br label %for.end
for.end:
ret void
}
!0 = distinct !{!0, !{!"llvm.loop.disable_nonforced"}, !{!"llvm.loop.unroll_and_jam.enable"}}

View File

@ -0,0 +1,66 @@
; RUN: opt -basicaa -tbaa -loop-unroll-and-jam -allow-unroll-and-jam -unroll-and-jam-count=4 -unroll-remainder < %s -S | FileCheck %s
;
; Check that followup attributes are set in the new loops.
;
target datalayout = "e-m:e-p:32:32-i64:64-v128:64:128-a:0:32-n32-S64"
define void @followup(i32 %I, i32 %J, i32* noalias nocapture %A, i32* noalias nocapture readonly %B) {
entry:
%cmp = icmp ne i32 %J, 0
%cmp122 = icmp ne i32 %I, 0
%or.cond = and i1 %cmp, %cmp122
br i1 %or.cond, label %for.outer.preheader, label %for.end
for.outer.preheader:
br label %for.outer
for.outer:
%i.us = phi i32 [ %add8.us, %for.latch ], [ 0, %for.outer.preheader ]
br label %for.inner
for.inner:
%j.us = phi i32 [ 0, %for.outer ], [ %inc.us, %for.inner ]
%sum1.us = phi i32 [ 0, %for.outer ], [ %add.us, %for.inner ]
%arrayidx.us = getelementptr inbounds i32, i32* %B, i32 %j.us
%0 = load i32, i32* %arrayidx.us, align 4
%add.us = add i32 %0, %sum1.us
%inc.us = add nuw i32 %j.us, 1
%exitcond = icmp eq i32 %inc.us, %J
br i1 %exitcond, label %for.latch, label %for.inner
for.latch:
%add.us.lcssa = phi i32 [ %add.us, %for.inner ]
%arrayidx6.us = getelementptr inbounds i32, i32* %A, i32 %i.us
store i32 %add.us.lcssa, i32* %arrayidx6.us, align 4
%add8.us = add nuw i32 %i.us, 1
%exitcond25 = icmp eq i32 %add8.us, %I
br i1 %exitcond25, label %for.end.loopexit, label %for.outer, !llvm.loop !0
for.end.loopexit:
br label %for.end
for.end:
ret void
}
!0 = !{!0, !1, !2, !3, !4, !6}
!1 = !{!"llvm.loop.unroll_and_jam.enable"}
!2 = !{!"llvm.loop.unroll_and_jam.followup_outer", !{!"FollowupOuter"}}
!3 = !{!"llvm.loop.unroll_and_jam.followup_inner", !{!"FollowupInner"}}
!4 = !{!"llvm.loop.unroll_and_jam.followup_all", !{!"FollowupAll"}}
!6 = !{!"llvm.loop.unroll_and_jam.followup_remainder_inner", !{!"FollowupRemainderInner"}}
; CHECK: br i1 %exitcond.3, label %for.latch, label %for.inner, !llvm.loop ![[LOOP_INNER:[0-9]+]]
; CHECK: br i1 %niter.ncmp.3, label %for.end.loopexit.unr-lcssa.loopexit, label %for.outer, !llvm.loop ![[LOOP_OUTER:[0-9]+]]
; CHECK: br i1 %exitcond.epil, label %for.latch.epil, label %for.inner.epil, !llvm.loop ![[LOOP_REMAINDER_INNER:[0-9]+]]
; CHECK: br i1 %exitcond.epil.1, label %for.latch.epil.1, label %for.inner.epil.1, !llvm.loop ![[LOOP_REMAINDER_INNER]]
; CHECK: br i1 %exitcond.epil.2, label %for.latch.epil.2, label %for.inner.epil.2, !llvm.loop ![[LOOP_REMAINDER_INNER]]
; CHECK: ![[LOOP_INNER]] = distinct !{![[LOOP_INNER]], ![[FOLLOWUP_ALL:[0-9]+]], ![[FOLLOWUP_INNER:[0-9]+]]}
; CHECK: ![[FOLLOWUP_ALL]] = !{!"FollowupAll"}
; CHECK: ![[FOLLOWUP_INNER]] = !{!"FollowupInner"}
; CHECK: ![[LOOP_OUTER]] = distinct !{![[LOOP_OUTER]], ![[FOLLOWUP_ALL]], ![[FOLLOWUP_OUTER:[0-9]+]]}
; CHECK: ![[FOLLOWUP_OUTER]] = !{!"FollowupOuter"}
; CHECK: ![[LOOP_REMAINDER_INNER]] = distinct !{![[LOOP_REMAINDER_INNER]], ![[FOLLOWUP_ALL]], ![[FOLLOWUP_REMAINDER_INNER:[0-9]+]]}
; CHECK: ![[FOLLOWUP_REMAINDER_INNER]] = !{!"FollowupRemainderInner"}

View File

@ -316,4 +316,4 @@ for.end:
!8 = distinct !{!"llvm.loop.unroll.disable"}
!9 = distinct !{!9, !10}
!10 = distinct !{!"llvm.loop.unroll.enable"}
!11 = distinct !{!11, !8, !6}
!11 = distinct !{!11, !8, !6}

View File

@ -1,9 +1,9 @@
; RUN: opt < %s -loop-vectorize -S -pass-remarks-missed='loop-vectorize' -pass-remarks-analysis='loop-vectorize' 2>&1 | FileCheck %s
; RUN: opt < %s -loop-vectorize -o /dev/null -pass-remarks-output=%t.yaml
; RUN: opt < %s -loop-vectorize -transform-warning -S -pass-remarks-missed='loop-vectorize' -pass-remarks-analysis='loop-vectorize' 2>&1 | FileCheck %s
; RUN: opt < %s -loop-vectorize -transform-warning -o /dev/null -pass-remarks-output=%t.yaml
; RUN: cat %t.yaml | FileCheck -check-prefix=YAML %s
; RUN: opt < %s -passes=loop-vectorize -S -pass-remarks-missed='loop-vectorize' -pass-remarks-analysis='loop-vectorize' 2>&1 | FileCheck %s
; RUN: opt < %s -passes=loop-vectorize -o /dev/null -pass-remarks-output=%t.yaml
; RUN: opt < %s -passes=loop-vectorize,transform-warning -S -pass-remarks-missed='loop-vectorize' -pass-remarks-analysis='loop-vectorize' 2>&1 | FileCheck %s
; RUN: opt < %s -passes=loop-vectorize,transform-warning -o /dev/null -pass-remarks-output=%t.yaml
; RUN: cat %t.yaml | FileCheck -check-prefix=YAML %s
; C/C++ code for tests
@ -33,7 +33,7 @@
; }
; CHECK: remark: source.cpp:19:5: loop not vectorized: cannot identify array bounds
; CHECK: remark: source.cpp:19:5: loop not vectorized
; CHECK: warning: source.cpp:19:5: loop not vectorized: failed explicitly specified loop vectorization
; CHECK: warning: source.cpp:19:5: loop not vectorized: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering
; int foo();
; void test_multiple_failures(int *A) {
@ -94,13 +94,12 @@
; YAML-NEXT: - String: ')'
; YAML-NEXT: ...
; YAML-NEXT: --- !Failure
; YAML-NEXT: Pass: loop-vectorize
; YAML-NEXT: Pass: transform-warning
; YAML-NEXT: Name: FailedRequestedVectorization
; YAML-NEXT: DebugLoc: { File: source.cpp, Line: 19, Column: 5 }
; YAML-NEXT: Function: _Z17test_array_boundsPiS_i
; YAML-NEXT: Args:
; YAML-NEXT: - String: 'loop not vectorized: '
; YAML-NEXT: - String: failed explicitly specified loop vectorization
; YAML-NEXT: - String: 'loop not vectorized: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering'
; YAML-NEXT: ...
; YAML-NEXT: --- !Analysis
; YAML-NEXT: Pass: loop-vectorize

View File

@ -0,0 +1,29 @@
; RUN: opt -loop-vectorize -force-vector-interleave=1 -force-vector-width=2 -S < %s | FileCheck %s
;
; Check that the disable_nonforced loop property is honored by the
; loop vectorizer.
;
target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64-S128"
; CHECK-LABEL: @disable_nonforced(
; CHECK-NOT: x i32>
define void @disable_nonforced(i32* nocapture %a, i32 %n) {
entry:
%cmp4 = icmp sgt i32 %n, 0
br i1 %cmp4, label %for.body, label %for.end
for.body:
%indvars.iv = phi i64 [ %indvars.iv.next, %for.body ], [ 0, %entry ]
%arrayidx = getelementptr inbounds i32, i32* %a, i64 %indvars.iv
%0 = trunc i64 %indvars.iv to i32
store i32 %0, i32* %arrayidx, align 4
%indvars.iv.next = add i64 %indvars.iv, 1
%lftr.wideiv = trunc i64 %indvars.iv.next to i32
%exitcond = icmp eq i32 %lftr.wideiv, %n
br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0
for.end:
ret void
}
!0 = !{!0, !{!"llvm.loop.disable_nonforced"}}

View File

@ -0,0 +1,29 @@
; RUN: opt -loop-vectorize -force-vector-interleave=1 -force-vector-width=2 -S < %s | FileCheck %s
;
; Check whether the llvm.loop.vectorize.enable loop property overrides
; llvm.loop.disable_nonforced.
;
target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64-S128"
; CHECK-LABEL: @disable_nonforced_enable(
; CHECK: store <2 x i32>
define void @disable_nonforced_enable(i32* nocapture %a, i32 %n) {
entry:
%cmp4 = icmp sgt i32 %n, 0
br i1 %cmp4, label %for.body, label %for.end
for.body:
%indvars.iv = phi i64 [ %indvars.iv.next, %for.body ], [ 0, %entry ]
%arrayidx = getelementptr inbounds i32, i32* %a, i64 %indvars.iv
%0 = trunc i64 %indvars.iv to i32
store i32 %0, i32* %arrayidx, align 4
%indvars.iv.next = add i64 %indvars.iv, 1
%lftr.wideiv = trunc i64 %indvars.iv.next to i32
%exitcond = icmp eq i32 %lftr.wideiv, %n
br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0
for.end:
ret void
}
!0 = !{!0, !{!"llvm.loop.disable_nonforced"}, !{!"llvm.loop.vectorize.enable", i32 1}}

View File

@ -0,0 +1,43 @@
; RUN: opt -loop-vectorize -force-vector-width=4 -force-vector-interleave=1 -S < %s | FileCheck %s
;
; Check that the followup loop attributes are applied.
;
target datalayout = "e-p:64:64:64-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:64:64-f32:32:32-f64:64:64-v64:64:64-v128:128:128-a0:0:64-s0:64:64-f80:128:128-n8:16:32:64-S128"
define void @followup(i32* nocapture %a, i32 %n) {
entry:
%cmp4 = icmp sgt i32 %n, 0
br i1 %cmp4, label %for.body, label %for.end
for.body:
%indvars.iv = phi i64 [ %indvars.iv.next, %for.body ], [ 0, %entry ]
%arrayidx = getelementptr inbounds i32, i32* %a, i64 %indvars.iv
%0 = trunc i64 %indvars.iv to i32
store i32 %0, i32* %arrayidx, align 4
%indvars.iv.next = add i64 %indvars.iv, 1
%lftr.wideiv = trunc i64 %indvars.iv.next to i32
%exitcond = icmp eq i32 %lftr.wideiv, %n
br i1 %exitcond, label %for.end, label %for.body, !llvm.loop !0
for.end:
ret void
}
!0 = distinct !{!0, !3, !4, !5}
!3 = !{!"llvm.loop.vectorize.followup_vectorized", !{!"FollowupVectorized"}}
!4 = !{!"llvm.loop.vectorize.followup_epilogue", !{!"FollowupEpilogue"}}
!5 = !{!"llvm.loop.vectorize.followup_all", !{!"FollowupAll"}}
; CHECK-LABEL @followup(
; CHECK-LABEL: vector.body:
; CHECK: br i1 %13, label %middle.block, label %vector.body, !llvm.loop ![[LOOP_VECTOR:[0-9]+]]
; CHECK-LABEL: for.body:
; CHECK: br i1 %exitcond, label %for.end.loopexit, label %for.body, !llvm.loop ![[LOOP_EPILOGUE:[0-9]+]]
; CHECK: ![[LOOP_VECTOR]] = distinct !{![[LOOP_VECTOR]], ![[FOLLOWUP_ALL:[0-9]+]], ![[FOLLOWUP_VECTORIZED:[0-9]+]]}
; CHECK: ![[FOLLOWUP_ALL]] = !{!"FollowupAll"}
; CHECK: ![[FOLLOWUP_VECTORIZED:[0-9]+]] = !{!"FollowupVectorized"}
; CHECK: ![[LOOP_EPILOGUE]] = distinct !{![[LOOP_EPILOGUE]], ![[FOLLOWUP_ALL]], ![[FOLLOWUP_EPILOGUE:[0-9]+]]}
; CHECK: ![[FOLLOWUP_EPILOGUE]] = !{!"FollowupEpilogue"}

View File

@ -1,8 +1,8 @@
; RUN: opt < %s -loop-vectorize -S 2>&1 | FileCheck %s
; RUN: opt < %s -loop-vectorize -transform-warning -S 2>&1 | FileCheck %s
; Verify warning is generated when vectorization/ interleaving is explicitly specified and fails to occur.
; CHECK: warning: no_array_bounds.cpp:5:5: loop not vectorized: failed explicitly specified loop vectorization
; CHECK: warning: no_array_bounds.cpp:10:5: loop not interleaved: failed explicitly specified loop interleaving
; CHECK: warning: no_array_bounds.cpp:5:5: loop not vectorized: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering
; CHECK: warning: no_array_bounds.cpp:10:5: loop not interleaved: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering
; #pragma clang loop vectorize(enable)
; for (int i = 0; i < number; i++) {

View File

@ -1,16 +1,16 @@
; RUN: opt < %s -loop-vectorize -force-vector-width=4 -S 2>&1 | FileCheck %s
; RUN: opt < %s -loop-vectorize -force-vector-width=1 -S 2>&1 | FileCheck %s -check-prefix=NOANALYSIS
; RUN: opt < %s -loop-vectorize -force-vector-width=4 -pass-remarks-missed='loop-vectorize' -S 2>&1 | FileCheck %s -check-prefix=MOREINFO
; RUN: opt < %s -loop-vectorize -force-vector-width=4 -transform-warning -S 2>&1 | FileCheck %s
; RUN: opt < %s -loop-vectorize -force-vector-width=1 -transform-warning -S 2>&1 | FileCheck %s -check-prefix=NOANALYSIS
; RUN: opt < %s -loop-vectorize -force-vector-width=4 -transform-warning -pass-remarks-missed='loop-vectorize' -S 2>&1 | FileCheck %s -check-prefix=MOREINFO
; CHECK: remark: source.cpp:4:5: loop not vectorized: loop contains a switch statement
; CHECK: warning: source.cpp:4:5: loop not vectorized: failed explicitly specified loop vectorization
; CHECK: warning: source.cpp:4:5: loop not vectorized: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering
; NOANALYSIS-NOT: remark: {{.*}}
; NOANALYSIS: warning: source.cpp:4:5: loop not interleaved: failed explicitly specified loop interleaving
; NOANALYSIS: warning: source.cpp:4:5: loop not vectorized: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering
; MOREINFO: remark: source.cpp:4:5: loop not vectorized: loop contains a switch statement
; MOREINFO: remark: source.cpp:4:5: loop not vectorized (Force=true, Vector Width=4)
; MOREINFO: warning: source.cpp:4:5: loop not vectorized: failed explicitly specified loop vectorization
; MOREINFO: warning: source.cpp:4:5: loop not vectorized: the optimizer was unable to perform the requested transformation; the transformation might be disabled or specified as part of an unsupported transformation ordering
; CHECK: _Z11test_switchPii
; CHECK-NOT: x i32>