When folding a binary operation between two array constructors, it
is necessary to check if each value contained in the left operand
has the same rank and shape as the one on the right.
Otherwise, lowering would end up with an operation between values
of different ranks/shapes, which could result in a crash.
For instance, the following code was crashing the compiler:
integer :: x(4), y(2, 2), z(4)
z = (/x/) + (/y/)
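The conformance check described above can be sketched as follows. This is a minimal Python illustration, not the flang implementation; `shapes_conform` and the shape-tuple representation are assumptions made for the example.

```python
def shapes_conform(left_shapes, right_shapes):
    """Return True if every pair of corresponding values has the same shape.

    Each argument is a list of shape tuples, one per value in an array
    constructor (a hypothetical representation chosen for this sketch).
    Equal tuples imply equal rank and equal extents.
    """
    if len(left_shapes) != len(right_shapes):
        return False
    return all(l == r for l, r in zip(left_shapes, right_shapes))


# (/x/) + (/y/) with x(4) and y(2, 2): the values differ in rank, so the
# constructors must not be folded value by value.
can_fold_xy = shapes_conform([(4,)], [(2, 2)])
# (/x/) + (/z/) with x(4) and z(4): same rank and shape, folding is safe.
can_fold_xz = shapes_conform([(4,)], [(4,)])
```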
Fixes #60229
Reviewed By: klausler, jeanPerier
Differential Revision: https://reviews.llvm.org/D147181
When dropping leading unit dims of vector.insert's operands and creating
a new vector.insert, its new position rank should be computed explicitly
in two steps: first based on the numbers of leading unit dims dropped
from the vector.insert's destination, then based on the numbers of
leading unit dims dropped from its source.
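The two-step computation can be sketched in Python as below. This is an illustration under stated assumptions, not the MLIR pattern itself: shapes are plain tuples, positions are lists of constant indices, and the helper names are hypothetical. It relies on the position rank being the destination rank minus the source rank.

```python
def count_leading_unit_dims(shape):
    """Count how many leading dimensions of `shape` are equal to 1."""
    count = 0
    for dim in shape:
        if dim != 1:
            break
        count += 1
    return count


def drop_lead_unit_dims_insert(src_shape, dst_shape, position):
    """Return (new_src_shape, new_dst_shape, new_position)."""
    src_drop = count_leading_unit_dims(src_shape)
    dst_drop = count_leading_unit_dims(dst_shape)
    new_src = src_shape[src_drop:]
    new_dst = dst_shape[dst_drop:]

    # Step 1: indices into dropped leading destination dims disappear.
    new_pos = list(position[min(dst_drop, len(position)):])

    # Step 2: dims dropped from the source must now be indexed explicitly.
    # They are unit dims, so the only valid index is 0.
    new_pos_rank = len(new_dst) - len(new_src)
    new_pos += [0] * (new_pos_rank - len(new_pos))
    return new_src, new_dst, new_pos
```

For example, inserting a `1x4` source into a `2x1x4` destination at position `[1]` becomes an insert of a `4` source at position `[1, 0]`.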
Reviewed By: pifon2a
Differential Revision: https://reviews.llvm.org/D147280
The revision moves the data layout parsing into a separate file
and extends it to support pointer data layout specifications.
Additionally, it also produces more precise warnings and error
messages.
Reviewed By: Dinistro, definelicht
Differential Revision: https://reviews.llvm.org/D147170
Now that we canonicalize to min/max intrinsics, we no longer need
to guard against this here.
In fact, it seems like the issue from PR46271 was the final push
for introducing the intrinsics in the first place...
Add a special case to matrix lowering for dot products. Normal matrix lowering is optimized for either row-major or column-major layout, which results in many `shufflevector` instructions being generated for one vector. We work around this in our special case. We can also use vector-reduce adds instead of sequential adds to sum the result of the element-wise multiplication, which takes advantage of SIMD instructions.
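As a rough illustration of the difference in shape between the two strategies, here is a small Python sketch. It only models the structure of the computation (a chain of dependent adds versus one multiply followed by one reduction); the function names are hypothetical and nothing here reflects the generated IR.

```python
def dot_sequential(a, b):
    """Sum the products one at a time: a chain of dependent adds."""
    total = 0
    for x, y in zip(a, b):
        total += x * y
    return total


def dot_mul_then_reduce(a, b):
    """One element-wise multiply followed by a single add reduction."""
    products = [x * y for x, y in zip(a, b)]
    # sum() stands in for a single vector add reduction in this sketch.
    return sum(products)
```

For integer inputs both functions return the same value; only the structure of the computation differs.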
Reviewed By: fhahn, thegameg
Differential Revision: https://reviews.llvm.org/D131125
See https://discourse.llvm.org/t/rfc-enable-assignment-tracking/69399
This sets the `-Xclang -fexperimental-assignment-tracking` flag to the value
`enabled` which means it will be enabled so long as none of the following are
true: it's an LTO build, LLDB debugger tuning has been specified, or it's an O0
build (no work is done in any case if -g is not specified or -gmlt is used).
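The conditions above can be summarized as a small predicate. This is a Python sketch of the decision as described in this message, not the driver code; the function name, parameter names, and string values are assumptions made for illustration.

```python
def assignment_tracking_active(is_lto, debugger_tuning, opt_level, debug_info):
    """Model of when the `enabled` setting results in work being done.

    debug_info is one of "none", "line-tables-only" (-gmlt) or "full" (-g);
    these labels are hypothetical names chosen for this sketch.
    """
    # No work is done without -g, or when -gmlt is used.
    if debug_info in ("none", "line-tables-only"):
        return False
    # Disabled for LTO builds, LLDB debugger tuning, and O0 builds.
    if is_lto or debugger_tuning == "lldb" or opt_level == 0:
        return False
    return True
```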
Reviewed By: jmorse
Differential Revision: https://reviews.llvm.org/D146987
If the to-be-split dbg.assign has a `DIArgList` and a new `Value` has been
requested then use a kill-location for the new dbg.assign. We can't simply
replace the value component (a `DIArgList`) with the new `Value` as that would
leave the `DIExpression` in an invalid state (`DW_OP_LLVM_arg` operands with no
arglist).
Reviewed By: jmorse
Differential Revision: https://reviews.llvm.org/D147312
Update dot-product-int.ll tests to use mostly i32 instead of i64;
there's no mul.2d instruction, so vector versions of v2i64 cannot be
lowered efficiently.
We can't just use VisitCallExpr() here, as that doesn't handle CallExpr
subclasses such as CXXMemberCallExpr.
Differential Revision: https://reviews.llvm.org/D141772
Tested this with the new AArch32 backend on armv7l and it works without issues in GDB. The size of the load-address field is only 32-bit here, but we implicitly account for it by writing an ELFT::uint which is:
https://github.com/llvm/llvm-project/blob/release/16.x/llvm/include/llvm/Object/ELFTypes.h#L57
So, instead of adding a newly supported machine type, let's just drop this restriction altogether.
Scope of changes:
1) Add attribute to OpenMP MLIR dialect which stores target cpu and
target features
2) Store target information in MLIR module
Differential Revision: https://reviews.llvm.org/D146612
Reviewed By: kiranchandramohan
Co-authored-by: Kiran Chandramohan <kiran.chandramohan@arm.com>
For 1:N type conversion, there is a 1:N relationship between the
original operands and the converted operands. The same is true for the
results. The previous design passed an instance of a "mapping" class
into each pattern that helped with handling this 1:N correspondence.
However, this was still rather manual and, in particular, it required
the use of magic constants for the indices of the different operands.
This commit uses the GenericAdaptor class that is generated
for each op class in order to simplify this relationship further. The
GenericAdaptor allows wrapping a list of arbitrary types for each
operand (via templating); for 1:N type conversion, this allows the
operand accessors of the adaptor class to return a ValueRange that
corresponds to the N values in the converted types. Patterns can thus
use the named accessors instead of magic constants, which eliminates a
common class of errors.
This commit further simplifies the API that patterns need to implement
by making the operand and result type mappings part of the adaptor.
Since many patterns only need one of the two (or even neither), this
reduces the number of unnecessary arguments in many cases.
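The difference between magic constants and named accessors can be illustrated with a short Python sketch. The `SelectAdaptor` class and its `select` op are hypothetical and only model the idea; they are not the generated GenericAdaptor.

```python
class SelectAdaptor:
    """Named accessors over the converted operands of a hypothetical 'select' op."""

    def __init__(self, converted_operands):
        # converted_operands holds one group of N converted values per
        # original operand, in operand order.
        condition, true_value, false_value = converted_operands
        self._condition = tuple(condition)
        self._true_value = tuple(true_value)
        self._false_value = tuple(false_value)

    @property
    def condition(self):
        return self._condition

    @property
    def true_value(self):
        return self._true_value

    @property
    def false_value(self):
        return self._false_value


converted = [["c"], ["t_lo", "t_hi"], ["f_lo", "f_hi"]]
adaptor = SelectAdaptor(converted)
true_parts = adaptor.true_value    # named accessor
same_parts = tuple(converted[1])   # equivalent access via a magic constant
```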
Reviewed By: springerm
Differential Revision: https://reviews.llvm.org/D147225
This reverts commit 64b45db34a.
Reason: the patterns are wrong which can result in a miscompilation.
However, fixing the pattern is not trivial due to how i8 values
are handled, and due to the additional type-checking performed by
D147127: trunc/smax/smin are all defined as int ops in the DAG
despite them working on vectors too.
As this is not a much-needed pattern, I prefer reverting for now
until I can find time to properly rewrite the pattern.
SROA may convert a wide integer load into a narrow pointer load,
make sure we don't crash. It would not be legal to transfer the
metadata in this case.
This patch is extracted from D96035. It adds StringPool class.
StringPool allows storing strings in parallel. It also allows
string data to be associated with a specific string.
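To illustrate the idea of a pool that can be filled from several threads and keeps data per string, here is a minimal Python sketch. It is not the class added by this patch: `SketchStringPool` is a hypothetical name, and the single lock is an assumption made for simplicity; the actual class may use a different synchronization strategy.

```python
import threading


class SketchStringPool:
    """Thread-safe pool that keeps one entry per distinct string."""

    def __init__(self):
        self._lock = threading.Lock()
        self._entries = {}

    def insert(self, string, data=None):
        """Return the unique entry for `string`, creating it if needed."""
        with self._lock:
            entry = self._entries.get(string)
            if entry is None:
                # The entry carries the string plus its associated data.
                entry = {"string": string, "data": data}
                self._entries[string] = entry
            return entry

    def __len__(self):
        with self._lock:
            return len(self._entries)
```

Inserting the same string from different threads yields the same entry object, so the associated data is shared.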
Reviewed By: JDevlieghere
Differential Revision: https://reviews.llvm.org/D140841
Multiple errors have been reported on
https://reviews.llvm.org/rG498aa534f472d28db893aa9a8627d0b46e17f312
Reverting until the correctness issues can be resolved.
We are also seeing a lot of performance differences from the patch. Some are
looking good, but some are looking pretty bad.
Sometimes it's useful to be able to debug code even without actual debug info, e.g. for setting breakpoints on function names.
This patch adds a new API option to make it possible in Orc.
The existing API and behavior remain unchanged: non-debug objects are not passed to executors.
When the initial DebugObjectManagerPlugin landed, it was not clear whether we would have more patching requirements for debug sections. Also, there were no other use-cases for debug object flags.
Adding options to the plugin gives us a use-case and we can re-use the field for it. This commit only refactors the infrastructure in preparation for two more patches to come.
Originally, the DebugObjectManagerPlugin recorded all sections and filtered some of them for load-address patching.
Then we spotted problems with duplicate section names and started additional filtering upfront (see b26f45e5a4).
This seems the better approach. Let's go for it and stop filtering in two locations.
Compiler-generated section names can clash. Examples are group sections or profile counter sections.
We don't need to abort debug registration for the entire LinkGraph in such a case.
Instead, let's skip the relevant sections and add a note to the debug log.
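The skip-and-log behavior can be sketched in Python as follows. This is an illustration only: `record_sections`, the list-based debug log, and the message text are hypothetical, and keeping the first occurrence of a name is an assumption of this sketch.

```python
def record_sections(section_names, debug_log):
    """Record each section once; skip later sections with a clashing name."""
    recorded = {}
    for index, name in enumerate(section_names):
        if name in recorded:
            # Do not abort the whole registration; note the skip and move on.
            debug_log.append("skipping section with duplicate name: " + name)
            continue
        recorded[name] = index
    return recorded
```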
Generalize used to fail on ops that have a null region builder.
This is incorrect; the test should be whether the op has a region or not.
In the future we may want to support 0-region ops with a region builder.
Differential Revision: https://reviews.llvm.org/D147166
This patch adds support for `-mattr` and `-march` in mlir-cpu-runner.
With this change, one should be able to consistently use mlir-cpu-runner
for MLIR's integration tests (instead of e.g. resorting to lli when some
additional flags are needed). This is demonstrated in
concatenate_dim_1.mlir.
In order to support the new flags, this patch makes sure that
MLIR's ExecutionEngine/JITRunner (that mlir-cpu-runner is built on top of):
* takes into account the new command line flags when creating
TargetMachine,
* avoids recreating TargetMachine if one is already available,
* creates LLVM's DataLayout based on the previously configured
TargetMachine.
This is necessary in order to make sure that the command line
configuration is propagated correctly to the backend code generator.
A few additional updates are made in order to facilitate this change,
including support for debug dumps from JITRunner.
Differential Revision: https://reviews.llvm.org/D146917
The current context-less lowering of NULL is producing invalid code
(can lead to reading outside of allocated memory): it is casting
a simple pointer to a descriptor address.
Later, reads are made to this descriptor. It used to be "OK" when
fir.load of fir.box were no-ops, but this was incorrect, and the
fir.load codegen is now doing a copy and reads the whole descriptor
data, not only the base address.
Building on the previous patch that allowed fir.box<None> allocation, this
patch fixes the issue by allocating an actual fir.box<None>.
Note: this is still an overkill way to lower foo(null()). HLFIR
lowering always contextualizes NULL() lowering, leading to much simpler
code:
```
%absent = fir.absent fir.box<T>
fir.call @foo(%absent)
```
Differential Revision: https://reviews.llvm.org/D147239
Currently, it is OK to alloca, store, and rebox to
fir.box<!fir.array<?xnone>> and fir.class<none>, but not to a simple
fir.box<none>.
This restriction is a legacy from a time where it was thought TYPE(*)
descriptor size would not be statically known, but the way polymorphism
was implemented actually allows knowing its size: a scalar descriptor
with an addendum (in case it is a derived type).
Note that this assumes fir.box<none> are always scalars. There are currently
a few casts from ranked descriptors to !fir.box<None> around runtime calls.
These are simple casts before runtime call, so there are no load/stores
to the resulting fir.box<None> and it is OK.
When assumed rank is supported, some legacy usage of fir.box<none> as the "any"
descriptor in the runtime interface will be replaced to avoid any issues there.
This change will be required to fix undefined behavior with NULL() that
requires allocation of a fir.box<None>.
Differential Revision: https://reviews.llvm.org/D147237
ASSOCIATED intrinsic TARGET handling is weird for OPTIONAL, because as
opposed to other intrinsic arguments, OPTIONAL allocatables and pointers
may be absent when passed to it, and a disassociated pointer TARGET is not
the same as when TARGET is not provided. Hence, it needs custom
handling in lowering.
The handling was done late (in genIntrinsicCall, without the semantic
context), and assumed it would be possible to retrieve the optionality
aspects, but this is brittle, and hard to share with HLFIR.
Move it into CustomIntrinsicCall, which is intended to deal with these
corner cases.
Also avoid using fir.box<None> as the related fir.if result, and use
the correct fir.box/fir.class type for the target: using a fir.box<None>
here is risky since fir.box<None> are now meant for scalar TYPE(*), and
the TARGET may be ranked.
Move the introduction of the fir.box<None> to around the runtime call (when
assumed rank is supported, these will become !fir.box<!fir.array<..xNone>>).
Differential Revision: https://reviews.llvm.org/D147224
Add the more precise error message introduced in
https://reviews.llvm.org/D142337 to the standard
error produced for unhandled constants. This way
we keep testing both error cases.
Reviewed By: Dinistro
Differential Revision: https://reviews.llvm.org/D147205
'Params' is a member of the ByteCodeEmitter. We only added the
parameters the first time we saw the function, so subsequent visits
didn't work if they had (and used) parameters.
Just do the work every time we see a function.
Differential Revision: https://reviews.llvm.org/D141681