Polly currently needs to be slowly refactor to use the C++ wrapper objects to handle the reference counters automatically.
I took the function of astScheduleDimIsParallel and refactored it so that it uses the C++ wrapper function as much as possible.
There are some problems with the IsParallel since it expects the C objects, so the C++ wrapper functions must be .release() and .get() first before they are able to be used with IsParallel.
When checking the ReductionDependencies Parallelism with the Build's Schedule, I opted to keep the union map as a C object rather than a C++ object. Eventually, changes will need to be made to IsParallel to refactor it to the C++ wrappers. When this is done, this function will also need to be slightly refactored to not use the C object.
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D98455
This reverts commit 329aeb5db4,
and relands commit 61f006ac65.
This is a continuation of D89456.
As it was suggested there, now that SCEV models `PtrToInt`,
we can try to improve SCEV's pointer handling.
In particular, i believe, i will need this in the future
to further fix `SCEVAddExpr`operation type handling.
This removes special handling of `ConstantPointerNull`
from `ScalarEvolution::createSCEV()`, and add constant folding
into `ScalarEvolution::getPtrToIntExpr()`.
This way, `null` constants stay as such in SCEV's,
but gracefully become zero integers when asked.
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D98147
This removes some (but not all) uses of type-less CreateGEP()
and CreateInBoundsGEP() APIs, which are incompatible with opaque
pointers.
There are a still a number of tricky uses left, as well as many
more variation APIs for CreateGEP.
This is a continuation of D89456.
As it was suggested there, now that SCEV models `PtrToInt`,
we can try to improve SCEV's pointer handling.
In particular, i believe, i will need this in the future
to further fix `SCEVAddExpr`operation type handling.
This removes special handling of `ConstantPointerNull`
from `ScalarEvolution::createSCEV()`, and add constant folding
into `ScalarEvolution::getPtrToIntExpr()`.
This way, `null` constants stay as such in SCEV's,
but gracefully become zero integers when asked.
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D98147
Emit llvm.loop.parallel_accesses metadata instead of
llvm.mem.parallel_loop_access. The latter is deprecated because it
assumes that LoopIDs are persistent, which they are not.
We also emit parallel access metadata for all surrounding parallel
loops, not just the innermost parallel.
Polly use algorithms from the Integer Set Library (isl), which is a library written in C and which is incompatible with the rest of the LLVM as it is written in C++.
Changes made:
* Refabricating IsOutermostParallel() to take C++ bindings instead of reference-counting in C isl lib.
* Addition of manage_copy() to be used as reference for C objects instead of IsOutermostParallel()
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D97751
Currently, the IslAst library is a C library that would be incompatible with the rest of the LLVM because LLVM is written in C++.
I took one function, IsInnermostParallel(), and refactored it so that it would take the C++ wrapper object instead of using reference counters with the C ISL library. As well, all the references that use IsInnermostParallel() will use manage_copy() since they are still expecting the C object.
Reviewed By: Meinersbur
Differential Revision: https://reviews.llvm.org/D97425
Allow users to use a non-system version of perl, python and awk, which is useful
in certain package managers.
Reviewed By: JDevlieghere, MaskRay
Differential Revision: https://reviews.llvm.org/D95119
Regenerate the C++ wrapper header from the current isl version's
headers.
The most notable change is that some dimension sizes are represented by
an isl_size (instead of unsigned), which is a signed int. Additionally,
some function may return -1 in case of an error which already had been
fixed in the past. The C++ may no return -1 instead of UINT_MAX which
caused the problems.
Some types in Polly had been changed from unsigned to isl_size
(that were not already auto) and some loops/comparision had to be
changed to avoid unsigned/signed comparison warnings.
DetectionContext objects are stored as values in a DenseMap. When the
DenseMap reaches its maximum load factor, it is resized and all its
objects moved to a new memory allocation. Unfortunately Scop object have
a reference to its DetectionContext. When the DenseMap resizes, all the
DetectionContexts reference now point to invalid memory, even if caused
by an unrelated DetectionContext.
Even worse, NewPM's ScopPassManager called isMaxRegionInScop with the
Verify=true parameter before each pass. This caused the old
DetectionContext to be removed an a new on created and re-verified.
Of course, the Scop object was already created pointing to the old
DetectionContext. Because the new DetectionContext would
usually be stored at the same position in the DenseMap, the reference
would usually reference the new DetectionContext of the same Region.
Usually.
If not, the old position still points to memory in the DenseMap
allocation (unless also a resizing occurs) such that tools like Valgrind
and AddressSanitizer would not be able to diagnose this.
Instead of storing the DetectionContext inside the DenseMap, use a
std::unique_ptr to a DetectionContext allocation, i.e. it will not move
around anymore. This also allows use to remove the very strange
DetectionContext(const DetectionContext &&)
copy/move(?) constructor. DetectionContext objects now are neither
copied nor moved.
As a result, every re-verification of a DetectionContext will use a new
allocation. Therefore, once a Scop object has been created using a
DetectionContext, it must not be re-verified (the Scop data structure
requires its underlying Region to not change before code generation
anyway). The NewPM may call isMaxRegionInScop only with
Validate=false parameter.
The description of the -polly switch stated that it was only enabled
with -O3. This was a lie, the optimization level was ignored. Only at
-O0 Polly was not added to the pass pipeline because the pass builder,
but only because the extension points were not triggered.
In the NewPM, the VectorizerStart extensions point is actually trigger
even with -O0 which leads to the following crash:
Assertion `Level != OptimizationLevel::O0 && "Must request optimizations!"' failed.
We sanitize the optimization levels using the following rules for both
pass mangers:
1. Only enable Polly if optimizing at all (-O1, -O2 or -O3).
2. Do not enable Polly when optimizing for size.
3. Ignore the optimization level for diagnostic passes (printer, viewer
or JScop-exporter).
4. If only diagnostic passes enabled, skip the code-generation.
5. Fix the description of the -polly command line option.
These are implementation details of the IslScheduleOptimizer pass
implementation and not use anywhere else. Hence, we can move them to the
cpp file and into an anonymous namespace.
Only getPartialTilePrefixes is, aside from the pass itself, used
externally (by the ScheduleOptimizerTest) and moved into the polly
namespace.
This reverts commit b7d870eae7 and the
subsequent fix "[Polly] Fix build after AssumptionCache change (D96168)"
(commit e6810cab09).
It caused indeterminism in the output, such that e.g. the
polly-x86_64-linux buildbot failed accasionally.
Move SimplifiyVisitor from Simplify.h to Simplify.cpp. It is not
relevant for applying the pass in either the NewPM or the legacyPM.
Rename it to SimplifyImpl to account for that.
This is possible due its state not being necessary to be preserved
between runs and thefore SimplifyImpl not needed to be held in the
pass object. Instead, SimplifyImpl is only instatiated for the
current Scop. In the NewPM as a function-local variable, and in the
legacy PM inside a llvm::Optional object because the state must be
preserved between the printScop (invoked by opt -analyze) and the most
recent runOnScop calls.
"using namespace" pollutes the namespace of every file that includes
such a header and universally considered a bad thing. Even the variant
namespace polly {
using namespace llvm;
}
(previously used by LoopGenerators.h) imports more symbols than the file
is in control of. The header may include a fixed set of files from LLVM,
but the header itself may by be included together with other headers
from LLVM. For instance, LLVM's MemorySSA.h and Polly's ScopInfo.h both
declare a class 'MemoryAccess' which may conflict.
Instead of prefixing everything in Polly's header files, this patch adds
'using' statements to import only the symbols that are actually
referenced in Polly. This approach is also used by MLIR to import
commonly used symbols into the mlir namespace.
This patch also puts the symbols declared in IslNodeBuilder.h into the
Polly namespace to also be able to use the imported symbols.
Even though it has some oddities, both pipelines should be as similar as
possible. Also use report_fatal_error instead of assertions to ensure a
proper failure in release builds for unsupported options.
This finalizes the patch serious to make Polly run in the default
configuration when using the NewPM by default.
In particular, print the ast with -debug-only=polly-ast, print a
per-scop header with print<polly-ast> and force-add the analysis with
-polly-code-generation=ast.
The pass-instrumentation pass is implicitly execute by the NewPM
whenever a new analysis runs. Not registering it will cause the crash
whenever a scop pass requests an analysis.
For instance this is the case for the IstAstAnalysis requesting the
DependenceAnalsis result.
In addition to that regression tests should not test the intire pass
pipeline (unless they are testing the pipeline itself), the Polly-ACC
currently does not support the new pass manager. If enabled by default,
such tests will therefore fail.
Use the -polly-gpu-runtime and -polly-gpu-arch options also as default
values for the PPCGCodeGeneration pass. This requires to move the option
to be moved from the pipeline-building Register passes to the
PPCGCodeGeneration implementation.
Fixes the spir-typesize.ll buildbot fail.
ZoneAlgorithms's computePHI relies on being provided with consistent a
schedule to compute the statement prodecessors of a statement containing
PHINodes. Otherwise unexpected results such as PHI nodes with multiple
predecessors can occur which would result in problems in the
algorithms expecting consistent data.
In the added test case, statement instances are scrubbed from the
SCoP their execution would result in undefined behavior (Due to a nsw
overflow). As already being undefined behavior in LLVM-IR, neither
AssumedContext nor InvalidContext are updated, giving computePHI no
means to avoid these cases.
Intoduce a new SCoP property, the DefinedBehaviorContext, that among
the runtime-checked conditions, also tracks the assumptions not needing
a runtime check, in particular those affecting the assumed control flow.
This replaces the manual combination of the 3 other contexts that was
already done in computePHI and setNewAccessRelation. Currently, the only
additional assumption is that loop induction variables will nsw flag for
not wrap, but potentially more can be added. Use in
hasFeasibleRuntimeContext, isl::ast_build and gisting are other
potential uses.
To limit computational complexity, the DefinedBehaviorContext is not
availabe if it grows too large (atm hardcoded to 8 disjuncts).
Possible other fixes include bailing out in computePHI when
inconsistencies are detected, choose an arbitrary value for inconsistent
cases (since it is undefined behavior anyways), or make the code
receiving the result from ComputePHI handle inconsistent data. All of
them reduce the quality of implementation having to bail out more often
and disabling the ability to assert on actually wrong results.
This fixes llvm.org/PR48783.
This fixes llvm.org/PR48554
Some test cases had to be updated because the hash function for
union_maps have been changed which affects the output order.
to Pass.h.
In some compiler passes like SampleProfileLoaderPass, we want to know which
LTO/ThinLTO phase the pass is in. Currently the phase is represented in enum
class PassBuilder::ThinLTOPhase, so it is only available in PassBuilder and
it also cannot represent phase in full LTO. The patch extends it to include
full LTO phases and move it from PassBuilder.h to Pass.h, then it is much
easier for PassBuilder to communiate with each pass about current LTO phase.
Differential Revision: https://reviews.llvm.org/D94613
MemoryAccess::setNewAccessRelation() in assert-builds checks whether the
access relation for a READ has a memory location for every instance of
the domain. Otherwise, we would not have value to load from. That check
already considered that instances outside the Scop's context do not
matter since they are never executed (or would be undefined behavior).
In this patch also take instances of the InvalidContext into account,
as these can also be assumed to never occur. InvalidContext was
introduced to avoid the computational complexity of subtracting
restrictions from the AssumedContext. However, this additional check in
setNewAccessRelation is only done in assert-builds.
The assertion case with an InvalidContext may occur with DeLICM on a
conditionally infinite loops, as it is the case in the following code:
for (int i = 0; i < n; i+=b)
vreg = ...;
*Dest = vreg;
The loop is infinite when b=0, and [b] -> { : b = 0 } is part of the
InvalidContext. When DeLICM tries to map the memory for %vreg to *Dest,
there is no store instance that uses the value of vreg when b = 0, hence
no location to map it to. However, the case is irrelevant since Polly's
runtime condition check ensures that this is never case.
Fixes llvm.org/PR48445
ScalarEvolution::getSCEV cannot be used during codegen. ScalarEvolution
assumes a stable IR and control flow which is under construction during
Polly's CodeGen. In particular, it uses DominatorTree for compute the
backedge taken count. However the DominatorTree is not updated during
codegen.
In this case, SCEV was used to determine the base pointer of an array
access. Replace it by our own function. Polly generates only GEP and
BitCasts for array acceses, i.e. it is sufficient to handle these to to
find the base pointer.
Fixes llvm.org/PR48422
There's a small number of users of this function, they are all updated.
This updates the C API adding a new method LLVMGetTypeByName2 that takes a context and a name.
Differential Revision: https://reviews.llvm.org/D78793
Currently, we have some confusion in the codebase regarding the
meaning of LocationSize::unknown(): Some parts (including most of
BasicAA) assume that LocationSize::unknown() only allows accesses
after the base pointer. Some parts (various callers of AA) assume
that LocationSize::unknown() allows accesses both before and after
the base pointer (but within the underlying object).
This patch splits up LocationSize::unknown() into
LocationSize::afterPointer() and LocationSize::beforeOrAfterPointer()
to make this completely unambiguous. I tried my best to determine
which one is appropriate for all the existing uses.
The test changes in cs-cs.ll in particular illustrate a previously
clearly incorrect AA result: We were effectively assuming that
argmemonly functions were only allowed to access their arguments
after the passed pointer, but not before it. I'm pretty sure that
this was not intentional, and it's certainly not specified by
LangRef that way.
Differential Revision: https://reviews.llvm.org/D91649
Declarations in headers should not be in the anonymous
namespace. Compilers also warn about the use of
<anon namespace>::SimplifyVisitor as a public field in
polly::SimplifyPass and polly::SimplifyPrinterPass.
Operand tree forwarding can cause the change of an access kind; in
particular change from a scalar kind to an array kind if the scalar
dependency is not necessary. Such an access cannot and doesn't need to
be forwarded anymore.
Fixes llvm.org/PR48034
Print to dbgs() any taken action.
Also, read-only scalars do not require any action unless
-polly-analyze-read-only-scalars=true is used. Better refect this by
using ForwardingAction::triviallyForwardable and thus not bumping the
statistics.
ScopBuilder distributes independent instructions between statements.
Only modeled (e.g. not synthesizable) instructions are represented.
To compute independence, non-modeled instructions were used in some
parts of determining instruction independence, which could lead to the
re-introduction of non-model instructions.
In particular, required invariant loads could be added to instruction
list, which then led to redundant MemoryAccesses for such a load.
This fixes llvm.org/PR48059.