1922 Commits

Author SHA1 Message Date
Florian Hahn
deec9e7674
[VPlan] Move VPTransformState::get() to VPlan.cpp (NFC).
The last dependency of code defined in LoopVectorize.cpp has been
removed a while ago. Move VPTransformState::get() to VPlan.cpp where
other members are also defined.
2023-08-03 21:49:58 +01:00
Mel Chen
425e9e81a0 [LV] Rename the Select[I|F]Cmp reduction pattern to [I|F]AnyOf. (NFC)
Regarding this NFC change, please refer to the discussion in this thread: https://reviews.llvm.org/D150851#4467261

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D155786
2023-08-03 00:37:19 -07:00
Mel Chen
97cccdd9f3 [LV][NFC] Remove the redundant braces. 2023-08-02 20:45:04 -07:00
Florian Hahn
8ea274b46b
[VPlan] Fix in-loop reduction chains using VPlan def-use chains (NFCI)
Update adjustRecipesForReductions to directly use the VPlan def-use
chains for in-loop reductions to collect the reduction operations that
need adjusting.

This allows the removal of
 * ReductionChainMap
 * recording of recipes for instructions in the reduction chain
 * late uses of getVPValue
 * the need for removeVPValueFor.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D155845
2023-08-02 17:04:29 +01:00
Florian Hahn
d1d0e135a1
[LV] Move packScalarIntoVectorValue to VPTransformState (NFC).
This moves packScalarIntoVectorValue from ILV to the more appropriate
VPTransformState.
2023-08-02 12:36:48 +01:00
Bjorn Pettersson
408cc94445 [LV][LSV][SLP] Drop some typed pointer bitcasts
Differential Revision: https://reviews.llvm.org/D156736
2023-08-02 12:08:37 +02:00
Florian Hahn
707359ecf5
Recommit "[LV] Re-use existing broadcast value for live-ins."
This reverts commit 245ec675a4e41f7ec24dfc998720bffdc46a6c53.

Recommits eea9258648ce with a fix to only erase the instruction from the
first part if it is defined outside the loop. This fixes a
use-after-free error reported.
2023-08-01 15:54:02 +01:00
Florian Hahn
822c749aec
[LV] Shrink operands before creating new instr to force eval order.
Shrink operands before creating the new instruction to make sure the
same evaluation order is used on all platforms. This fixes buildbot
failures due to different argument evaluation order on different
systems.
2023-07-30 17:16:37 +01:00
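The buildbot failures fixed by 822c749aec above stem from a general C++ rule: the order in which function arguments are evaluated is unspecified. The following is a minimal standalone sketch of that rule, not the LLVM code; `nextId`, `makePair`, `unspecifiedOrder` and `forcedOrder` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <utility>

// Hypothetical stand-in for a helper with a side effect (for example, one
// that creates a new instruction each time it is called).
static int Counter = 0;
int nextId() { return ++Counter; }

std::pair<int, int> makePair(int First, int Second) { return {First, Second}; }

// Order-dependent: the two nextId() calls may run in either order, so the
// result can be {1, 2} with one compiler and {2, 1} with another.
std::pair<int, int> unspecifiedOrder() {
  Counter = 0;
  return makePair(nextId(), nextId());
}

// Deterministic: evaluate the operands into named locals first, then use them.
std::pair<int, int> forcedOrder() {
  Counter = 0;
  int First = nextId();
  int Second = nextId();
  assert(First == 1 && Second == 2 && "locals are initialized in source order");
  return makePair(First, Second);
}
```

Hoisting each operand into its own statement is the same idea as shrinking the operands before creating the new instruction: the side effects happen in source order on every platform.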
Martin Storsjö
245ec675a4 Revert "[LV] Re-use existing broadcast value for live-ins."
This reverts commit eea9258648ce73507f6f85c395de978af659d498.

That commit triggered crashes in the following testcase:

$ cat reduced.c
typedef struct {
  int a[8]
} b;
typedef struct {
  b *c;
  short d
} e;
void f() {
  int g;
  char *h;
  e *i = f;
  short j = i->d;
  int a = i->c->a[0];
  for (;;)
    for (; g < a; g++) {
      *h = j * i->d >> 8;
      h++;
    }
}
$ clang -target aarch64-linux-gnu -w -c -O2 reduced.c
2023-07-25 10:35:41 +03:00
Florian Hahn
eea9258648
[LV] Re-use existing broadcast value for live-ins.
When requesting a vector value for a live-in, we can re-use the
broadcast of the live-in of part 0 for parts > 0.
2023-07-24 11:50:47 +01:00
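The idea behind eea9258648 above can be sketched outside of LLVM as a small cache: a live-in is loop-invariant, so the broadcast created for part 0 is also valid for every later part. This is only an illustrative model under that assumption; `BroadcastCache` and its members are hypothetical and do not mirror the VPTransformState API.

```cpp
#include <cassert>
#include <map>
#include <vector>

using Vec = std::vector<int>;

// Hypothetical, simplified stand-in for the per-part vector values of a live-in.
struct BroadcastCache {
  std::map<int, Vec> Part0;   // live-in scalar -> broadcast created for part 0
  unsigned NumBroadcasts = 0; // how many broadcasts were actually materialized

  // Return the vector value of LiveIn for unroll part Part. The splat does
  // not depend on the part, so the part-0 value is reused for parts > 0.
  const Vec &get(int LiveIn, unsigned Part, unsigned VF) {
    assert(VF > 0 && "vectorization factor must be positive");
    (void)Part; // unused: every part sees the same broadcast
    auto It = Part0.find(LiveIn);
    if (It == Part0.end()) {
      ++NumBroadcasts;
      It = Part0.emplace(LiveIn, Vec(VF, LiveIn)).first;
    }
    return It->second;
  }
};
```

Requesting the same live-in for parts 0 and 1 returns the same object and materializes a single broadcast in this model.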
Florian Hahn
25d34215bb
[LV] Replace use of getMaxSafeDepDist with isSafeForAnyVector (NFC)
Replace the use of getMaxSafeDepDistBytes with the more direct
isSafeForAnyVector. This removes the need to define getMaxSafeDepDistBytes.
2023-07-21 22:05:50 +02:00
Florian Hahn
68746a8cea
[LV] Move all VPlan transforms after initial VPlan construction.
Reorder VPlan transforms slightly so they are all grouped together,
after disabling Value -> VPValue lookup. In terms of codegen impact,
this should be NFC modulo a small number of instruction reorderings.

Preparation to split up tryToBuildVPlanWithVPRecipes in a follow-up.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D154640
2023-07-18 10:53:30 +01:00
Nikita Popov
94abecca6b [IVDescriptors] Remove typed pointer support (NFC)
This also removes the element type from the descriptor, as it is
always i8. The meaning of the step is now the same between
integers and pointers.
2023-07-12 15:48:29 +02:00
Florian Hahn
9259f41e62
[VPlan] Clear reduction flags directly as VPlanTransform.
After D150027, all relevant recipes should model their IR flags
directly. Instead of removing the flags after codegen as part of
fixReductions, drop poison generating flags directly from the recipes.

Depends on D150027.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D150028
2023-07-09 21:11:51 +01:00
Florian Hahn
14ec3f4b06
[LV] Skip VFs > # iterations remaining for epilogue vectorization.
If a candidate VF for epilogue vectorization is greater than the number of
remaining iterations, the epilogue loop would be dead. Skip such factors.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D154264
2023-07-07 21:43:51 +01:00
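The filter described in 14ec3f4b06 above can be sketched as follows. This is a simplified model that assumes a known constant trip count and assumes the remaining iterations are the trip count modulo the main loop's VF times its interleave count; `viableEpilogueVFs` and its parameters are hypothetical names, not the LoopVectorize interface.

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: keep only the epilogue VFs that can execute at least
// once. MainVF and IC are the main loop's vectorization factor and
// interleave count; TripCount is assumed to be a compile-time constant.
std::vector<unsigned> viableEpilogueVFs(const std::vector<unsigned> &Candidates,
                                        unsigned TripCount, unsigned MainVF,
                                        unsigned IC) {
  assert(MainVF > 0 && IC > 0 && "main loop must have a valid VF and IC");
  // Iterations left over once the main vector loop has finished.
  unsigned Remaining = TripCount % (MainVF * IC);
  std::vector<unsigned> Viable;
  for (unsigned VF : Candidates)
    if (VF <= Remaining) // a VF greater than Remaining yields a dead epilogue
      Viable.push_back(VF);
  return Viable;
}
```

With a trip count of 22, a main VF of 8 and an interleave count of 2, six iterations remain, so candidate VFs 2 and 4 survive and 8 is skipped.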
Florian Hahn
aee851fd0e
Revert "[LV] Skip VFs < iterations remaining for epilogue vectorization."
This reverts commit 7cc0be01a0068946ea3613dc2cb45c81b0f45860.

The title of the commit is incorrect, revert to fix the commit message.
2023-07-07 21:41:24 +01:00
Florian Hahn
7cc0be01a0
[LV] Skip VFs < iterations remaining for epilogue vectorization.
If a candidate VF for epilogue vectorization is less than the number of
remaining iterations, the epilogue loop would be dead. Skip such factors.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D154264
2023-07-07 20:33:42 +01:00
Florian Hahn
a0fcf84a8c
[LV] Consider if scalar epilogue is required in getMaximizedVFForTarget.
When a scalar epilogue is required, at least one iteration of the scalar loop
has to execute. Adjust ConstTripCount accordingly to avoid picking a max VF
that results in a dead vector loop.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D154261
2023-07-06 13:35:35 +01:00
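A rough sketch of the adjustment described in a0fcf84a8c above: when a scalar epilogue is required, one iteration is reserved for the scalar loop before the maximum VF is clamped to the trip count. The function `clampMaxVF` and its parameters are hypothetical and only model the idea; the real logic lives in getMaximizedVFForTarget and considers more inputs.

```cpp
#include <cassert>

// Hypothetical helper: clamp a power-of-two MaxVF so that the vector loop can
// execute at least once for a constant trip count.
unsigned clampMaxVF(unsigned MaxVF, unsigned ConstTripCount,
                    bool RequiresScalarEpilogue) {
  assert(MaxVF != 0 && (MaxVF & (MaxVF - 1)) == 0 &&
         "MaxVF is expected to be a power of two");
  // Iterations available to the vector loop. At least one iteration has to
  // run in the scalar loop when a scalar epilogue is required.
  unsigned Available = ConstTripCount;
  if (RequiresScalarEpilogue && Available > 0)
    --Available;
  // Halve the VF until it fits into the available iterations.
  while (MaxVF > 1 && MaxVF > Available)
    MaxVF /= 2;
  return MaxVF;
}
```

For a trip count of 8 and a maximum VF of 8, the sketch keeps VF 8 without a scalar epilogue but drops to VF 4 when one is required, because only 7 iterations are left for the vector loop.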
Florian Hahn
1746ac42ca
[LV] Forget SCEVs for exit phis after vectorization.
After vectorization, the exit blocks of the original loop will have additional
predecessors. Invalidate SCEVs for the exit phis in case SE looked through
single-entry phis.

Fixes https://github.com/llvm/llvm-project/issues/63368
Fixes https://github.com/llvm/llvm-project/issues/63669
2023-07-04 21:28:03 +01:00
Florian Hahn
39385c521d
[LV] Move getBroadcastInstrs to VPTransformState::get (NFCI).
getBroadcastInstrs is only used in VPTransformState::get. Move it closer
to use to reduce unnecessary interaction with ILV object.
2023-07-04 11:24:11 +01:00
Florian Hahn
b4efc0f070
[LV] Break up condition in selectEpilogueVectorizationFactor loop (NFCI)
Restructure the loop as suggested in D154264 to increase readability and
make it easier to extend.
2023-07-03 22:39:40 +01:00
Florian Hahn
55e7f1f786
[LV] Pass bool to requiresScalarEpilogue (NFC).
requiresScalarEpilogue only checks if the selected VF is vectorizing
(and not scalar). Update it to just take a boolean, to make it clearer
what information is used and to allow callers without a VF (used in a
follow-up patch).
2023-06-30 22:08:27 +01:00
Igor Kirillov
17bde328d6 [LV] Add mask support for vectorizing interleaved groups
This patch extends LoopVectorize to handle the vectorization of interleaved
memory accesses with scalable vectors when a mask is required and/or predicated
tail folding is enabled.

Differential Revision: https://reviews.llvm.org/D152258
2023-06-29 17:50:56 +00:00
Florian Hahn
ea6ca9cb2b
[LV] Fix crash when stride isn't a constant.
In some cases, the stride may not be a constant. Just skip those cases
for now. This should only happen in cases where LV only interleaves; if
the loop is vectorized, the stride needs to be versioned to a constant.
2023-06-14 16:53:34 +01:00
Florian Hahn
d209084720
[VPlan] Replace versioned stride with constant during VPlan opts.
After constructing the initial VPlan, replace VPValues for versioned
strides with their constant counterparts.

Differential Revision: https://reviews.llvm.org/D147783
2023-06-13 08:26:55 +01:00
Graham Hunter
95bfb1902d [LV][AArch64] Allow (limited) interleaving for scalable vectors
This patch uses the (de)interleaving intrinsics introduced in
D141924 to handle vectorization of interleaving groups with a
factor of 2 for scalable vectors.

Reviewed By: fhahn, reames

Differential Revision: https://reviews.llvm.org/D145163
2023-06-09 11:42:10 +01:00
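For readers unfamiliar with interleave groups of factor 2, the data movement involved in 95bfb1902d above can be modeled on plain vectors. This sketch only illustrates the element shuffling (for example, splitting an array of {real, imaginary} pairs into two vectors); `deinterleave2` and `interleave2` here are hypothetical standalone helpers, not the LLVM intrinsics, and fixed-size `std::vector` stands in for scalable vectors.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

using Vec = std::vector<int>;

// Split a wide vector into its even-indexed and odd-indexed elements.
std::pair<Vec, Vec> deinterleave2(const Vec &Wide) {
  assert(Wide.size() % 2 == 0 && "expected an even number of elements");
  Vec Even, Odd;
  for (std::size_t I = 0; I < Wide.size(); I += 2) {
    Even.push_back(Wide[I]);
    Odd.push_back(Wide[I + 1]);
  }
  return {Even, Odd};
}

// Inverse operation: merge two equally sized vectors element by element.
Vec interleave2(const Vec &Even, const Vec &Odd) {
  assert(Even.size() == Odd.size() && "both members must have the same length");
  Vec Wide;
  for (std::size_t I = 0; I < Even.size(); ++I) {
    Wide.push_back(Even[I]);
    Wide.push_back(Odd[I]);
  }
  return Wide;
}
```

A load group maps to one wide load followed by a deinterleave, and a store group maps to an interleave followed by one wide store; the two helpers are inverses of each other.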
Nikita Popov
143ed21b26 Revert "[LCSSA] Remove unused ScalarEvolution argument (NFC)"
This reverts commit 5362a0d859d8e96b3f7c0437b7866e17a818a4f7.

In preparation for reverting a dependent revision.
2023-06-05 16:45:38 +02:00
Florian Hahn
e19297471a
[LV] Check if value was already not uniform for previous VF.
If the value was already known to not be uniform for the previous
(smaller VF), it cannot be uniform for the larger VF.

This slightly reduces compile-time, now that uniformity checks are becoming
a bit more expensive due to using SCEV rewriting (D148841).

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D151658
2023-06-04 20:31:01 +01:00
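The shortcut in e19297471a above relies on non-uniformity being monotonic in the VF: once a value is not uniform for some VF, it cannot be uniform for a larger one. A minimal sketch of such a cache, assuming the expensive check is supplied as a callback; `UniformityCache` and its members are hypothetical names, not the LoopVectorizationLegality interface.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <utility>

// Hypothetical cache keyed by a value id. It records the smallest VF for
// which a value is already known to be non-uniform.
class UniformityCache {
  std::map<int, unsigned> KnownNonUniformAt;
  std::function<bool(int, unsigned)> ExpensiveCheck;

public:
  unsigned NumExpensiveChecks = 0;

  explicit UniformityCache(std::function<bool(int, unsigned)> Check)
      : ExpensiveCheck(std::move(Check)) {}

  bool isUniform(int ValueId, unsigned VF) {
    assert(VF > 0 && "vectorization factor must be positive");
    // Non-uniform for a smaller or equal VF implies non-uniform for this VF.
    auto It = KnownNonUniformAt.find(ValueId);
    if (It != KnownNonUniformAt.end() && It->second <= VF)
      return false;
    ++NumExpensiveChecks;
    bool Uniform = ExpensiveCheck(ValueId, VF);
    if (!Uniform)
      KnownNonUniformAt[ValueId] = VF; // VF is smaller than any recorded one
    return Uniform;
  }
};
```

When VFs are queried in increasing order, the expensive check runs at most once past the first failing VF for each value.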
Florian Hahn
e48b1e87a3
[LV] Split off invariance check from isUniform (NFCI).
After 572cfa3fde5433, isUniform now checks VF based uniformity instead of
just invariance as before.

As follow-up cleanup suggested in D148841, separate the invariance check
out and update callers that currently check only for invariance.

This also moves the implementation of isUniform from LoopAccessAnalysis
to LoopVectorizationLegality, as LoopAccessAnalysis doesn't use the more
general isUniform.
2023-06-01 19:09:11 +01:00
Florian Hahn
572cfa3fde
[LV] Use SCEV for uniformity analysis across VF
This patch uses SCEV to check if a value is uniform across a given VF.

The basic idea is to construct SCEVs where the AddRecs of the loop are
adjusted to reflect the version in the vectorized loop (Step multiplied
by VF). We construct a SCEV for the value of vector lane 0
(offset 0) and compare it to the expressions for lanes 1 to the last vector
lane (VF - 1). If they are equal, consider the expression uniform.

While re-writing expressions, we also need to catch expressions for which
we cannot determine uniformity (e.g. SCEVUnknown).

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D148841
2023-05-31 16:01:00 +01:00
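The lane comparison described in 572cfa3fde above is symbolic: it rewrites SCEV expressions and compares them for equality. As a rough numeric stand-in (not a proof, and not the LLVM implementation), the same question can be asked by evaluating an expression for each lane over a bounded number of vector iterations. `isUniformAcrossVF` and all of its parameters are hypothetical.

```cpp
#include <cassert>
#include <functional>

// Numeric stand-in for the symbolic check. Expr maps the scalar induction
// value to the value being tested. All names here are hypothetical.
bool isUniformAcrossVF(const std::function<long long(long long)> &Expr,
                       long long VF, long long Start, long long Step,
                       long long VectorIters) {
  assert(VF > 0 && VectorIters >= 0 && "expected a valid VF and iteration count");
  for (long long K = 0; K < VectorIters; ++K) {
    // Lane 0 follows the original recurrence with its step multiplied by VF.
    long long Lane0IV = Start + K * VF * Step;
    long long Lane0 = Expr(Lane0IV);
    // Lanes 1..VF-1 are lane 0 plus a per-lane offset.
    for (long long Lane = 1; Lane < VF; ++Lane)
      if (Expr(Lane0IV + Lane * Step) != Lane0)
        return false;
  }
  return true;
}
```

For an expression such as `IV / 4` with a start of 0 and a step of 1, all four lanes agree for VF 4, but VF 8 spans two different quotients and is therefore not uniform. Unlike the SCEV-based check, this sampling only covers the iterations it evaluates.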
Florian Hahn
8098f2577e
[LV] Use Legal::isUniform to detect uniform pointers.
Update collectLoopUniforms to identify uniform pointers using
Legal::isUniform. This is more powerful and brings pointer
classification here in sync with setCostBasedWideningDecision
which uses isUniformMemOp. The existing mismatch in reasoning
can cause crashes due to D134460, which is fixed by this patch.

Fixes https://github.com/llvm/llvm-project/issues/60831.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D150991
2023-05-30 16:42:55 +01:00
Florian Hahn
b750862107
[LV] Use early exit for stores storing the ptr operand. (NFC)
Cleanup suggested in D150991.
2023-05-30 12:14:12 +01:00
Alexander Timofeev
bad4de1ae7 Don't disable loop unroll for vectorized loops on AMDGPU target
We've got a performance regression after https://reviews.llvm.org/D115261.
Despite the loop being vectorized, unrolling is still required.

Reviewed By: rampitec

Differential Revision: https://reviews.llvm.org/D149281
2023-05-25 22:54:41 +02:00
Craig Topper
6006d43e2d LLVM_FALLTHROUGH => [[fallthrough]]. NFC
Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D150996
2023-05-24 12:40:10 -07:00
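For context on 6006d43e2d above, `[[fallthrough]]` is the standard C++17 attribute that marks an intentional fall-through between switch cases. A minimal sketch of its use follows; `trailingWork` is a hypothetical function and is not taken from the LLVM sources.

```cpp
#include <cassert>

// Hypothetical example: handle 0 to 3 leftover elements, with the higher
// cases deliberately falling through into the lower ones.
unsigned trailingWork(unsigned Remainder) {
  assert(Remainder < 4 && "expected fewer than four leftover elements");
  unsigned Work = 0;
  switch (Remainder) {
  case 3:
    Work += 1;
    [[fallthrough]]; // intentional: also process elements 2 and 1
  case 2:
    Work += 1;
    [[fallthrough]];
  case 1:
    Work += 1;
    break;
  default:
    break;
  }
  return Work;
}
```

The attribute has no runtime effect; it documents intent and silences implicit fall-through warnings without relying on a project-specific macro.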
Florian Hahn
55903151a2
[VPlan] Use isUniformAfterVec in VPReplicateRecipe::execute.
I was unable to find a case where this actually changes generated code,
but it enables the bug fix in D144434. It also brings codegen in line
with the handling of stores to uniform addresses in the cost model
(D134460).

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D144491
2023-05-19 18:15:21 +01:00
Florian Hahn
701f7230cd
[VPlan] Use VPRecipeWithIRFlags for VPReplicateRecipe, retire poison map
Update VPReplicateRecipe to use VPRecipeWithIRFlags for IR flag
handling. Retire separate MayGeneratePoisonRecipes map.

Depends on D149082.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D150027
2023-05-15 11:49:20 +01:00
Florian Hahn
f40a7901d1
[LV] Move selecting vectorization factor logic to LVP (NFC).
Split off from D143938. This moves the planning logic to select the
vectorization factor to LoopVectorizationPlanner as a step towards only
computing costs for individual VFs in LoopVectorizationCostModel and do
planning in LVP.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D150197
2023-05-13 12:28:14 +01:00
Florian Hahn
7472f1da96
[VPlan] Change LoopVectorizationPlanner::TTI to be const reference (NFC) 2023-05-13 12:27:57 +01:00
Florian Hahn
0418d0242b
[LV] Move getVScaleForTuning out of LoopVectorizationCostModel (NFC).
Split off refactoring from D150197 to reduce diff.
2023-05-13 10:17:13 +01:00
Philip Reames
592199c8fe [LV] Use interface routines instead of internal variables
This makes a (possible) change to the internal representation easier in the future, and makes the code easier to read now.
2023-05-12 16:27:12 -07:00
Florian Hahn
bf279a0f8e
[VPlan] Remove dangling comment and newlines (NFC).
Apply missed cleanups.
2023-05-11 22:06:56 +01:00
Florian Hahn
3d4eed0133
[LV] Reuse SCEV expansion results for epilogue vectorization.
When generating code for the epilogue vector loop, we need to re-use the
expansion results for induction steps generated for the main vector
loop, as the pre-header of the epilogue vector loop may not dominate the
vector preheader of the epilogue.

This fixes a reported crash. Note that this is a workaround which should
be removed soon once induction resume value creation is handled in VPlan
directly.
2023-05-11 22:00:07 +01:00
Philip Reames
7fbfcc653f [LV/LAA] Use PSE to identify stride multiplies which simplify [mostly nfc]
LV/LAA will speculate that (some) strided access patterns have unit stride, and insert runtime checks if required.

LV cost models a multiply by such a stride as free. We did this by keeping around the StrideSet structure, just to check if one of the operands was one of the strides we speculated.

We can instead just ask PredicatedScalarEvolution if either of the operands is one (after predicates are applied).  We get mostly the same result - PSE can prove it in more cases in theory - and simpler code.
2023-05-11 11:16:04 -07:00
Florian Hahn
236a0e82df
[LV] Use VPValue to get expanded value for SCEV step expressions.
Update skeleton creation logic to use SCEV expansion results from
expanding the pre-header. This avoids another set of SCEV expansions
that may happen after the CFG has been modified.

Fixes .

Depends on D147964.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D147965
2023-05-11 16:49:19 +01:00
Florian Hahn
127b00b25c
[VPlan] Record IR flags on VPWidenRecipe directly (NFC).
This patch introduces a VPRecipeWithIRFlags class to record various IR
flags for a recipe. This allows de-coupling of IR flags from the
underlying instructions. The main benefit is that it allows dropping of
IR flags from recipes directly, without the need to go through
State::MayGeneratePoisonRecipes. The plan is to remove
MayGeneratePoisonRecipes once all relevant recipes are transitioned.

It also allows dropping IR flags during VPlan-to-VPlan transforms, which
will be used in a follow-up patch to implement truncateToMinimalBitwidths
as VPlan-to-VPlan transform.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D149079
2023-05-08 17:28:50 +01:00
Florian Hahn
823d35fd3b
[VPlan] Use RecipeBuilder to look up member when fixing IG (NFC).
Recipes for interleave group members are recorded directly in the
RecipeBuilder. Use it directly instead of going indirectly through
VPlan's Value->VPValue mapping.
2023-05-07 18:02:27 +01:00
Florian Hahn
e3afe0b89d
[VPlan] Add VPWidenCastRecipe, split off from VPWidenRecipe (NFCI).
To generate cast instructions, the result type is needed. To allow
creating widened casts without underlying instruction, introduce a new
VPWidenCastRecipe that also holds the result type.

This functionality will be used in a follow-up patch to
implement truncateToMinimalBitwidths as VPlan-to-VPlan transform.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D149081
2023-05-05 13:20:16 +01:00
Florian Hahn
b85a402dd8
[VPlan] Introduce new entry block to VPlan for early SCEV expansion.
This patch adds a new preheader block to the VPlan to place SCEV
expansions like the trip count. This preheader block is disconnected
at the moment, as the bypass blocks of the skeleton are not yet modeled
in VPlan.

The preheader block is executed before skeleton creation, so the SCEV
expansion results can be used during skeleton creation. At the moment,
the trip count expression and induction steps are expanded in the new
preheader. The remainder of SCEV expansions will be moved gradually in
the future.

D147965 will update skeleton creation to use the steps expanded in the
pre-header to fix .

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D147964
2023-05-04 14:00:13 +01:00
Florian Hahn
79692750d2
[LV] Use VPValue for SCEV expansion in fixupIVUsers.
The step is already expanded in the VPlan. Use this expansion instead.
This is a step towards modeling fixing up IV users in VPlan.

It also fixes a crash caused by SCEV-expanding the Step expression in
fixupIVUsers, where the IR is in an incomplete state.

Reviewed By: Ayal

Differential Revision: https://reviews.llvm.org/D147963
2023-05-04 09:25:59 +01:00
Nikita Popov
5362a0d859 [LCSSA] Remove unused ScalarEvolution argument (NFC)
After D149435, LCSSA formation no longer needs access to
ScalarEvolution, so remove the argument from the utilities.
2023-05-02 12:17:05 +02:00