The `opt -analyze` option only works with the legacy pass manager and might be removed in the future, as explained in llvm.org/PR53733. This patch introduced -polly-print-* passes that print what the pass would print with the `-analyze` option and replaces all uses of `-analyze` in the regression tests.
There are two exceptions: `CodeGen\single_loop_param_less_equal.ll` and `CodeGen\loop_with_condition_nested.ll` use `-analyze on the `-loops` pass which is not part of Polly.
Reviewed By: aeubanks
Differential Revision: https://reviews.llvm.org/D120782
Implement the vectorLoopUnroll interface for MultiDimReduceOp and add a
pattern to do the unrolling following the same interface other vector
unroll patterns.
Differential Revision: https://reviews.llvm.org/D121263
According to LangRef, an access scope must have zero operands and
be distinct. The access group may either be a single access scope
or a list of access scopes.
LoopInfo may assert if this is not the case.
VSX introduced some permute instructions that are direct
replacements for Altivec ones except they can target all
the VSX registers. We have added code generation for most
of these but somehow missed the low/hi word merges (XXMRG[LH]W).
This caused some additional spills on some large
computationally intensive code.
This patch simply adds the missed patterns.
Previously, the test checked for a "undefined symbol" error
(instead of the "could not open std*.lib" which would happen without
the flag).
Instead, use /entry: so that the link succeeds.
No behavior change, but maybe makes the test a bit easier to understand.
Differential Revision: https://reviews.llvm.org/D121553
Implement exp2f function that is correctly rounded for all rounding modes.
Reviewed By: sivachandra, zimmermann6
Differential Revision: https://reviews.llvm.org/D121463
WG14 adopted N2775 (http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2775.pdf)
at our Feb 2022 meeting. This paper adds a literal suffix for
bit-precise types that automatically sizes the bit-precise type to be
the smallest possible legal _BitInt type that can represent the literal
value. The suffix chosen is wb (for a signed bit-precise type) which
can be combined with the u suffix (for an unsigned bit-precise type).
The preprocessor continues to operate as-if all integer types were
intmax_t/uintmax_t, including bit-precise integer types. It is a
constraint violation if the bit-precise literal is too large to fit
within that type in the context of the preprocessor (when still using
a pp-number preprocessing token), but it is not a constraint violation
in other circumstances. This allows you to make bit-precise integer
literals that are wider than what the preprocessor currently supports
in order to initialize variables, etc.
Due to various implementation constraints, despite the programmer
choosing a 'processor' cpu_dispatch/cpu_specific needs to use the
'feature' list of a processor to identify it. This results in the
identified processor in source-code not being propogated to the
optimizer, and thus, not able to be tuned for.
This patch changes to use the actual cpu as written for tune-cpu so that
opt can make decisions based on the cpu-as-spelled, which should better
match the behavior expected by the programmer.
Note that the 'valid' list of processors for x86 is in
llvm/include/llvm/Support/X86TargetParser.def. At the moment, this list
contains only Intel processors, but other vendors may wish to add their
own entries as 'alias'es (or with different feature lists!).
If this is not done, there is two potential performance issues with the
patch, but I believe them to be worth it in light of the improvements to
behavior and performance.
1- In the event that the user spelled "ProcessorB", but we only have the
features available to test for "ProcessorA" (where A is B minus
features),
AND there is an optimization opportunity for "B" that negatively affects
"A", the optimizer will likely choose to do so.
2- In the event that the user spelled VendorI's processor, and the
feature
list allows it to run on VendorA's processor of similar features, AND
there
is an optimization opportunity for VendorIs that negatively affects
"A"s,
the optimizer will likely choose to do so. This can be fixed by adding
an
alias to X86TargetParser.def.
Differential Revision: https://reviews.llvm.org/D121410
This change teaches the Sema logic for `__builtin_memcpy_inline` to implicitly convert arrays passed as arguments to pointers, similarly to regular `memcpy`.
This code will no longer cause a compiler crash:
```
void f(char *p) {
char s[1] = {0};
__builtin_memcpy_inline(p, s, 1);
}
```
rdar://88147527
Differential Revision: https://reviews.llvm.org/D121475
The insertion point for the builder used during VPlan code generation is
set during code generation. Setting the insert point here is dead code
and can be removed.
in TokenAnnotator::parseBrace. Left is misleading, because we have a
loop and Left does not move.
Also return early.
Differential Revision: https://reviews.llvm.org/D121558
in TokenAnnotator::parseParens(). Left is misleading since we have a
loop and Left is not adjusted.
Differential Revision: https://reviews.llvm.org/D121557
Context: I needed this for https://github.com/google/iree/pull/8474 .
I found that TSan instrumentation expects vector sizes to be <= 16,
and in my project (IREE) we have tests with higher vector sizes.
That left some test functions uninstrumented, resulting in crashes as
instrumented code called into them.
Differential Revision: https://reviews.llvm.org/D121182
As noticed in D119654, by adding the masked intrinsics results together we can end up with the selects being canonicalized away from the intrinsic - this isn't what we want to test here so replace with a insertvalue chain into a aggregate instead to retain all the results.
The revision removes the linalg.fill operation and renames the OpDSL generated linalg.fill_tensor operation to replace it. After the change, all named structured operations are defined via OpDSL and there are no handwritten operations left.
A side-effect of the change is that the pretty printed form changes from:
```
%1 = linalg.fill(%cst, %0) : f32, tensor<?x?xf32> -> tensor<?x?xf32>
```
changes to
```
%1 = linalg.fill ins(%cst : f32) outs(%0 : tensor<?x?xf32>) -> tensor<?x?xf32>
```
Additionally, the builder signature now takes input and output value ranges as it is the case for all other OpDSL operations:
```
rewriter.create<linalg::FillOp>(loc, val, output)
```
changes to
```
rewriter.create<linalg::FillOp>(loc, ValueRange{val}, ValueRange{output})
```
All other changes remain minimal. In particular, the canonicalization patterns are the same and the `value()`, `output()`, and `result()` methods are now implemented by the FillOpInterface.
Depends On D120726
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D120728
Use the TODO macro in `flang/Lower/Todo.h` with the converter location.
This patch is part of the upstreaming effort from fir-dev branch.
Reviewed By: jeanPerier
Differential Revision: https://reviews.llvm.org/D121582
Support new intrinsics for following instrauctions.
- VLDZ, VPCNT, VBRV
- LCR, SCR, TSCR, FIDCR
- FENCE
Also clean the intrinsics implementation of a following instruction.
- SVOB
Reviewed By: simoll
Differential Revision: https://reviews.llvm.org/D121509
Add a todo for assumed shape dummy argument with VALUE attribute
since this is not implemented yet.
This patch is part of the upstreaming effort from fir-dev branch.
Reviewed By: jeanPerier
Differential Revision: https://reviews.llvm.org/D121581
This adds support for v256.32|64 scatter|gather isel. vp.gather|scatter
and regular gather|scatter intrinsics are both lowered to the internal
VVP layer. Splitting these ops on v512.32 is the subject of future
patches.
Reviewed By: kaz7
Differential Revision: https://reviews.llvm.org/D121288
Implement the GET_COMMAND intrinsic.
Add 2 new parameters (sourceFile and line) so we can create a terminator
for RUNTIME_CHECKs.
Differential Revision: https://reviews.llvm.org/D118777
The vectoriser sometimes generates predicated vector loops using
the llvm.get.active.lane.mask intrinsic so it's important that we
are able to calculate a valid cost for the call instruction. When
SVE is enabled we are able to use a single whilelo instruction
for some vector types - in such cases I've marked the cost as 1.
For all other cases I've set the cost according to how the intrinsic
will be expanded.
Tests added here:
Analysis/CostModel/AArch64/sve-intrinsics.ll
Analysis/CostModel/ARM/active_lane_mask.ll
Analysis/CostModel/RISCV/active_lane_mask.ll
Differential Revision: https://reviews.llvm.org/D121109
Kazushi Marukawa (kaz7) of NEC Solution Innovators will take over my
role as a code owner for the Vector Engine target. Erich Focht (efocht)
of NEC will assume the administrator role for the clang-ve-ninja
buildbot.
Reviewed By: kaz7
Differential Revision: https://reviews.llvm.org/D121453
Introduce an explicit `replaceOp` call to enable the tracking of the producer LinalgOp.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D121369
This support "DEPENDS" and "EXTRA_INCLUDES", allowing in particular
to inject include paths to a tablegen targets without forcing to go
through the global INCLUDE_DIRECTORIES property.
Differential Revision: https://reviews.llvm.org/D121568
When using COMPILER_RT_USE_BUILTINS_LIBRARY=ON and clang-cl there
where several places where it didn't work as expected.
First -print-libgcc-file-name has to be prefixed with /clang:
Then the regex that matched the builtins library was wrong because
the builtins library is called clang_rt.builtins_<arch>.lib
and the regex only matched libclang_rt.builtins_arch.a
With this commit you can use a runtime build on Windows with this
option enabled.
Reviewed By: phosek
Differential Revision: https://reviews.llvm.org/D120698
This allows you to set a custom path to the ThinLTO cache so that
it can be shared when building in several different build directories.
Reviewed By: phosek
Differential Revision: https://reviews.llvm.org/D121215
Previously the comments for configuration structs as a whole like
`BraceWrappingFlags` did not go into the doc.
Reviewed By: curdeius
Differential Revision: https://reviews.llvm.org/D120361
For when we want to change a configuration option from an enum into a
struct. The need arose when working on D119599.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D120363
This is present since the beginning, but does not seem needed by any
in-tree target right now. This seems like the kind of thing to populate
by the caller if needed.
Differential Revision: https://reviews.llvm.org/D121565