There have been a few API pieces remaining to allow for a smooth transition for
downstream users, but these have been up for a few months now. After this only
the C API will have reference to "Identifier", but those will be reworked in a followup.
The main updates are:
* Identifier -> StringAttr
* StringAttr::get requires the context as the first parameter
- i.e. `Identifier::get("...", ctx)` -> `StringAttr::get(ctx, "...")`
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D116626
If a fusedloc is created with a single location then no fusedloc
was previously created and single location returned instead. In the case
where there is a metadata associated with the location this results in
discarding the metadata. Instead only canonicalize where there is no
loss of information.
Differential Revision: https://reviews.llvm.org/D115605
This diff adds support to printing a Value when it is null. We encounter this situation when debugging the PDL bytcode execution (where a null Value is perfectly valid). Currently, the AsmPrinter crashes (with an assert in a cast) when it encounters such Value.
We follow the same format used in other printed entities (e.g., null attribute).
Reviewed By: mehdi_amini, bondhugula
Differential Revision: https://reviews.llvm.org/D116084
This reduce an unnecessary amount of copy of non-trivial objects, like
APFloat.
Reviewed By: rriddle, jpienaar
Differential Revision: https://reviews.llvm.org/D116505
Querying threads directly from the thread pool fails if there is no thread pool or if multithreading is not enabled. Returns 1 by default.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D116259
This method is more suitable as an opinterface: it seems intrinsic to
individual instances of the operation instead of the dialect.
Also remove the restriction on the interface being applicable to the entry block only.
Differential Revision: https://reviews.llvm.org/D116018
When a dialect is loaded with `getOrLoadDialect`, its constructor may recurse and call `getOrLoadDialect` on a dependent dialect, which may result in an insertion in the dialect map, invalidating the reference to the (previously null) dialect pointer.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D115846
With VectorType supporting scalable dimensions, we don't need many of
the operations currently present in ArmSVE, like mask generation and
basic arithmetic instructions. Therefore, this patch also gets
rid of those.
Having built-in scalable vector support also simplifies the lowering of
scalable vector dialects down to LLVMIR.
Scalable dimensions are indicated with the scalable dimensions
between square brackets:
vector<[4]xf32>
Is a scalable vector of 4 single precission floating point elements.
More generally, a VectorType can have a set of fixed-length dimensions
followed by a set of scalable dimensions:
vector<2x[4x4]xf32>
Is a vector with 2 scalable 4x4 vectors of single precission floating
point elements.
The scale of the scalable dimensions can be obtained with the Vector
operation:
%vs = vector.vscale
This change is being discussed in the discourse RFC:
https://llvm.discourse.group/t/rfc-add-built-in-support-for-scalable-vector-types/4484
Differential Revision: https://reviews.llvm.org/D111819
This is the second part of https://reviews.llvm.org/D114993 after slicing
into 2 independent commits.
This is needed at the moment to get good codegen from 2d vector.transfer
ops that aim to compile to SIMD load/store instructions but that can
only do so if the whole 2d transfer shape is handled in one piece, in
particular taking advantage of the memref being contiguous rowmajor.
For instance, if the target architecture has 128bit SIMD then we would
expect that contiguous row-major transfers of <4x4xi8> map to one SIMD
load/store instruction each.
The current generic lowering of multi-dimensional vector.transfer ops
can't achieve that because it peels dimensions one by one, so a transfer
of <4x4xi8> becomes 4 transfers of <4xi8>.
The new patterns here are only enabled for now by
-test-vector-transfer-flatten-patterns.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D114993
* Constraints/Rewrites registered before a pattern was added were dropped
* Constraints/Rewrites may be registered multiple times (if different pattern sets depend on them)
* ModuleOp no longer has a terminator, so we shouldn't be removing the terminator from it
Differential Revision: https://reviews.llvm.org/D114816
Custom ops that have no parser or printer should fall back to the dialect's parser and/or printer hooks. This avoids the need to define parsers and printers that simply dispatch to the dialect hook.
Reviewed By: mehdi_amini, rriddle
Differential Revision: https://reviews.llvm.org/D115481
The new form of printing attribute in the declarative assembly is eliding the `#dialect.mnemonic` prefix to only keep the `<....>` part.
Differential Revision: https://reviews.llvm.org/D113873
Affine maps and integer sets previously relied on a single lock for creating unique instances. In a multi-threaded setting, this lock becomes a contention point. This commit updates AffineMap and IntegerSet to use StorageUniquer instead. StorageUniquer internally uses sharded locks and thread-local caches to reduce contention. It is already used for affine expressions, types and attributes. On my local machine, this gives me a 5X speedup for an application that manipulates a lot of affine maps and integer sets.
This commit also removes the integer set uniquer threshold. The threshold was used to avoid adding integer sets with a lot of constraints to the hash_map containing unique instances, but the constraints and the integer set were still allocated in the same allocator and never freed, thus not saving any space expect for the hash-map entry.
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D114942
This revision reintroduces tensor.insert_slice verification which seems
to have vanished over time: a verifier was initially introduced in cf9503c1b752062d9abfb2c7922a50574d9c5de4
but for some reason the invalid.mlir was not properly updated; as time passed the verifier was not called anymore and later the code was deleted.
As a consequence, a non-negligible portion of tests has run astray using invalid
tensor.insert_slice semantics and needed to be fixed.
Also, extract isRankReducedType from TensorOps for better reuse
Originally, this facility was used by both tensor and memref forms but
it got copied around as dialects were split.
Differential Revision: https://reviews.llvm.org/D114715
We check whether the maximum index of dimensional identifier present
in the result expressions is less than dimCount (number of dimensional
identifiers) argument passed in the AffineMap::get() and the maximum index
of symbolic identifier present in the result expressions is less than
symbolCount (number of symbolic identifiers) argument passed in AffineMap::get().
Reviewed By: nicolasvasilache, bondhugula
Differential Revision: https://reviews.llvm.org/D114238
This is commit 2 of 4 for the multi-root matching in PDL, discussed in https://llvm.discourse.group/t/rfc-multi-root-pdl-patterns-for-kernel-matching/4148 (topic flagged for review).
This commit implements the features needed for the execution of the new operations pdl_interp.get_accepting_ops, pdl_interp.choose_op:
1. The implementation of the generation and execution of the two ops.
2. The addition of Stack of bytecode positions within the ByteCodeExecutor. This is needed because in pdl_interp.choose_op, we iterate over the values returned by pdl_interp.get_accepting_ops until we reach finalize. When we reach finalize, we need to return back to the position marked in the stack.
3. The functionality to extend the lifetime of values that cross the nondeterministic choice. The existing bytecode generator allocates the values to memory positions by representing the liveness of values as a collection of disjoint intervals over the matcher positions. This is akin to register allocation, and substantially reduces the footprint of the bytecode executor. However, because with iterative operation pdl_interp.choose_op, execution "returns" back, so any values whose original liveness cross the nondeterminstic choice must have their lifetime executed until finalize.
Testing: pdl-bytecode.mlir test
Reviewed By: rriddle, Mogball
Differential Revision: https://reviews.llvm.org/D108547
Add rule based matching for detecting and transforming "expr - q * (expr floordiv q)"
to "expr mod q", where q is a symbolic exxpression, in simplifyAdd function.
Reviewed By: bondhugula, dcaballe
Differential Revision: https://reviews.llvm.org/D112985
NamedAttribute is currently represented as an std::pair, but this
creates an extremely clunky .first/.second API. This commit
converts it to a class, with better accessors (getName/getValue)
and also opens the door for more convenient API in the future.
Differential Revision: https://reviews.llvm.org/D113956
The current implementation is quite clunky; OperationName stores either an Identifier
or an AbstractOperation that corresponds to an operation. This has several problems:
* OperationNames created before and after an operation are registered are different
* Accessing the identifier name/dialect/etc. from an OperationName are overly branchy
- they need to dyn_cast a PointerUnion to check the state
This commit refactors this such that we create a single information struct for every
operation name, even operations that aren't registered yet. When an OperationName is
created for an unregistered operation, we only populate the name field. When the
operation is registered, we populate the remaining fields. With this we now have two
new classes: OperationName and RegisteredOperationName. These both point to the
same underlying operation information struct, but only RegisteredOperationName can
assume that the operation is actually registered. This leads to a much cleaner API, and
we can also move some AbstractOperation functionality directly to OperationName.
Differential Revision: https://reviews.llvm.org/D114049
There seems to be a consensus that we should allow 0D vectors:
https://llvm.discourse.group/t/should-we-have-0-d-vectors/3097
This commit is only the first step: it changes the verifier and the parser to
allow vectors like `vector<f32>` (but does not allow explicit 0 dimensions,
i.e., `vector<0xf32>` is not allowed).
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D114086
For the semi affine expressions, whenever rhs of a floordiv, ceildiv, mod
or product expression is a symbolic expression, we introduce a local variable
representing the result, and store the floordiv/ceildiv, mod or product
affine expression in LocalExprs. In this way the expression is flattened,
and trivial addition and subtraction related simplifications are performed.
Also rule based matching for detecting and transforming "expr - q * (expr floordiv q)"
to "expr mod q", where q is a symbolic exxpression, in simplifyAdd function.
Differential Revision: https://reviews.llvm.org/D112808
mapped_iterator is a useful abstraction for applying a
map function over an existing iterator, but our current
usage ends up allocating storage/making indirect calls
even with the map function is a known function, which
is horribly inefficient. This commit refactors the usage
of mapped_iterator to avoid this, and allows for directly
referencing the map function when dereferencing.
Fixes PR52319
Differential Revision: https://reviews.llvm.org/D113511
Identifier and StringAttr essentially serve the same purpose, i.e. to hold a string value. Keeping these seemingly identical pieces of functionality separate has caused problems in certain situations:
* Identifier has nice accessors that StringAttr doesn't
* Identifier can't be used as an Attribute, meaning strings are often duplicated between Identifier/StringAttr (e.g. in PDL)
The only thing that Identifier has that StringAttr doesn't is support for caching a dialect that is referenced by the string (e.g. dialect.foo). This functionality is added to StringAttr, as this is useful for StringAttr in generally the same ways it was useful for Identifier.
Differential Revision: https://reviews.llvm.org/D113536
There are several aspects of the API that either aren't easy to use, or are
deceptively easy to do the wrong thing. The main change of this commit
is to remove all of the `getValue<T>`/`getFlatValue<T>` from ElementsAttr
and instead provide operator[] methods on the ranges returned by
`getValues<T>`. This provides a much more convenient API for the value
ranges. It also removes the easy-to-be-inefficient nature of
getValue/getFlatValue, which under the hood would construct a new range for
the type `T`. Constructing a range is not necessarily cheap in all cases, and
could lead to very poor performance if used within a loop; i.e. if you were to
naively write something like:
```
DenseElementsAttr attr = ...;
for (int i = 0; i < size; ++i) {
// We are internally rebuilding the APFloat value range on each iteration!!
APFloat it = attr.getFlatValue<APFloat>(i);
}
```
Differential Revision: https://reviews.llvm.org/D113229
- String binary search does 1 less string comparison
- Identifier linear scan on large attribute list is switched to string binary search
Reviewed By: rriddle
Differential Revision: https://reviews.llvm.org/D112970
The main benefits of this change are faster access to operands
(no need to compute the offset, as it is now right after the
operation), simpler code(no need to manage a lot of the "is the
operand storage trailing" logic we had to before). The major
downside to this though, is that operand holding operations now
grow in size by 1 word (as no matter how we do this change, there
will need to be some additional book keeping).
Differential Revision: https://reviews.llvm.org/D111695
Inserting a symbol into a SymbolTable may lead to the name of the symbol being
changed in order to ensure uniqueness of symbol names in the table. Return this
new name to spare the caller the need to extract it from the symbol operation.
Depends On D112700
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D112886
Provide support for removing an operation from the block that contains it and
moving it back to detached state. This allows for the operation to be moved to
a different block, a common IR manipulation for, e.g., module merging.
Also fix a potential one-past-end iterator dereference in Operation::moveAfter
discovered in the process.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D112700
This patch extends the SubElementAttr interface to allow replacing a contained sub attribute. The attribute that should be replaced is identified by an index which denotes the n-th element returned by the accompanying walkImmediateSubElements method.
Using this addition the patch implements replacing SymbolRefAttrs contained within any dialect attributes.
Differential Revision: https://reviews.llvm.org/D111357
This patch fixes:
mlir/lib/IR/BuiltinAttributes.cpp:876:39: error: unused function
'isComplexOfIntType' [-Werror,-Wunused-function]
in a release build.
The current implementation invokes materializations
whenever an input operand does not have a mapping for the
desired type, i.e. it requires materialization at the earliest possible
point. This conflicts with goal of dialect conversion (and also the
current documentation) which states that a materialization is only
required if the materialization is supposed to persist after the
conversion process has finished.
This revision refactors this such that whenever a target
materialization "might" be necessary, we insert an
unrealized_conversion_cast to act as a temporary materialization.
This allows for deferring the invocation of the user
materialization hooks until the end of the conversion process,
where we actually have a better sense if it's actually
necessary. This has several benefits:
* In some cases a target materialization hook is no longer
necessary
When performing a full conversion, there are some situations
where a temporary materialization is necessary. Moving forward,
these users won't need to provide any target materializations,
as the temporary materializations do not require the user to
provide materialization hooks.
* getRemappedValue can now handle values that haven't been
converted yet
Before this commit, it wasn't well supported to get the remapped
value of a value that hadn't been converted yet (making it
difficult/impossible to convert multiple operations in many
situations). This commit updates getRemappedValue to properly
handle this case by inserting temporary materializations when
necessary.
Another code-health related benefit is that with this change we
can move a majority of the complexity related to materializations
to the end of the conversion process, instead of handling adhoc
while conversion is happening.
Differential Revision: https://reviews.llvm.org/D111620
Fix AffineExpr `getLargestKnownDivisor` for ceil/floor div cases.
In these cases, nothing can be inferred on the divisor of the
result.
Add test case for `mod` as well.
Differential Revision: https://reviews.llvm.org/D112523
The change is based on the proposal from the following discussion:
https://llvm.discourse.group/t/rfc-memreftype-affine-maps-list-vs-single-item/3968
* Introduce `MemRefLayoutAttr` interface to get `AffineMap` from an `Attribute`
(`AffineMapAttr` implements this interface).
* Store layout as a single generic `MemRefLayoutAttr`.
This change removes the affine map composition feature and related API.
Actually, while the `MemRefType` itself supported it, almost none of the upstream
can work with more than 1 affine map in `MemRefType`.
The introduced `MemRefLayoutAttr` allows to re-implement this feature
in a more stable way - via separate attribute class.
Also the interface allows to use different layout representations rather than affine maps.
For example, the described "stride + offset" form, which is currently supported in ASM parser only,
can now be expressed as separate attribute.
Reviewed By: ftynse, bondhugula
Differential Revision: https://reviews.llvm.org/D111553