18534 Commits

Author SHA1 Message Date
Guray Ozen
763109e346
[mlir][gpu] Use known_block_size to set maxntid for NVVM target (#77301)
Setting the thread block size with `maxntid` on the kernel has significant
performance benefits, because it lets the downstream PTX compiler do better
register allocation.

MLIR's `gpu.launch` and `gpu.launch_func` already have an attribute
(`known_block_size`) that keeps the thread block size when it is known.
This PR simply uses this attribute to set `maxntid`.
2024-01-08 14:49:19 +01:00
Javed Absar
0ba868db70
[MLIR][Bufferizer][NFC] Simplify some codes. (#77254)
NFC. clean up.
2024-01-08 09:37:57 +00:00
Adrian Kuegel
2642240de9 [mlir] Add explicit call to flush
The ClangTidy performance check suggested using '\n' instead of std::endl,
but it seems the flushing behavior was intended here (tests started failing).
2024-01-08 08:04:13 +00:00
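The difference behind the commit above can be sketched in standalone C++: `std::endl` writes a newline and then flushes the stream, while `'\n'` only writes the character, so an explicit `std::flush` is needed to keep the old behavior. `CountingBuf`, `LineEnd`, and `flushesFor` are hypothetical names made up for this illustration; none of them come from MLIR.

```c++
#include <cassert>
#include <ostream>
#include <sstream>

// Hypothetical stream buffer that counts how often the stream is flushed.
class CountingBuf : public std::stringbuf {
public:
  int flushes = 0;

protected:
  // Called by std::ostream::flush() through pubsync().
  int sync() override {
    ++flushes;
    return std::stringbuf::sync();
  }
};

enum class LineEnd { Newline, Endl, NewlineThenFlush };

// Writes a single line using the given line ending and returns how many
// flushes the underlying buffer observed.
int flushesFor(LineEnd mode) {
  CountingBuf buf;
  std::ostream os(&buf);
  os << "line";
  switch (mode) {
  case LineEnd::Newline:
    os << '\n'; // newline only, no flush
    break;
  case LineEnd::Endl:
    os << std::endl; // newline followed by a flush
    break;
  case LineEnd::NewlineThenFlush:
    os << '\n' << std::flush; // explicit flush, as in the fix above
    break;
  }
  assert(os.good() && "writing to an in-memory buffer should succeed");
  return buf.flushes;
}
```

In this sketch only the `Newline` mode leaves the buffer unflushed, which is the behavior change that a mechanical `std::endl` to `'\n'` replacement introduces.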
Adrian Kuegel
6343b4e482 [mlir] Apply ClangTidy performance finding
- Use '\n' instead of std::endl;

https://clang.llvm.org/extra/clang-tidy/checks/performance/avoid-endl.html
2024-01-08 07:47:14 +00:00
Tobias Gysi
7e54ae24d8
[mlir][llvm] Do not inline variadic functions (#77241)
This revision updates the llvm dialect inliner to explicitly disallow
the inlining of variadic functions. Previously, inlining already failed
if the number of function arguments did not match the number of call
arguments. After the change, the inliner checks that the function is not
variadic and that it does not contain a va_start intrinsic.
2024-01-08 08:30:10 +01:00
Christian Ulmann
bae1fdea71
[MLIR][LLVM] Add distinct identifier to the DISubprogram attribute (#77093)
This commit adds an optional distinct attribute parameter to the
DISubprogramAttr. This enables modeling of distinct subprograms, as
required for LLVM IR. This change is required to avoid accidental
uniquing of subprograms on functions that would lead to invalid LLVM IR
post export.
2024-01-08 08:25:30 +01:00
Christian Ulmann
b3037ae1fc
[MLIR][LLVM] Add distinct identifier to DICompileUnit attribute (#77070)
This commit adds a distinct attribute parameter to the DICompileUnit to
enable the modeling of distinctness. LLVM requires DICompileUnits to be
distinct and there are cases where one gets two equivalent compilation
units but LLVM still requires them to be differentiated. We observed such
cases for combinations of LTO and inline functions.

This patch also changes the DIScopeForLLVMFuncOp pass to a module pass,
to ensure that only one distinct DICompileUnit is created, instead of
one for each function.
2024-01-08 07:42:33 +01:00
Matthias Springer
752df2bc0b
[mlir][IR] DominanceInfo: Add function to query dominator of a range of block (#77098)
Also improve the implementation of `findCommonDominator` (skip duplicate
blocks) and extract it from `BufferPlacementTransformationBase` (so that
`BufferPlacementTransformationBase` can be retired eventually).
2024-01-07 14:01:11 +01:00
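As a rough illustration of querying the common dominator of a range of blocks while skipping duplicates, here is a toy sketch on a plain dominator tree. It does not use MLIR's `DominanceInfo`; `ToyDomTree`, its `idom` array, and the method names are assumptions made for this example only.

```c++
#include <cassert>
#include <unordered_set>
#include <vector>

// Toy dominator tree: idom[b] is the immediate dominator of block b, and the
// entry block is its own immediate dominator. All names are hypothetical.
struct ToyDomTree {
  std::vector<int> idom;

  // Number of edges between `block` and the entry block.
  int depth(int block) const {
    int d = 0;
    while (idom[block] != block) {
      block = idom[block];
      ++d;
    }
    return d;
  }

  // Nearest common dominator of two blocks.
  int common(int a, int b) const {
    int da = depth(a), db = depth(b);
    for (; da > db; --da)
      a = idom[a];
    for (; db > da; --db)
      b = idom[b];
    while (a != b) {
      a = idom[a];
      b = idom[b];
    }
    return a;
  }

  // Nearest common dominator of a non-empty range; duplicate blocks are
  // skipped so that each distinct block is processed once.
  int commonOfRange(const std::vector<int> &blocks) const {
    assert(!blocks.empty() && "expected at least one block");
    std::unordered_set<int> seen;
    int result = blocks.front();
    seen.insert(result);
    for (int block : blocks) {
      if (!seen.insert(block).second)
        continue; // already handled
      result = common(result, block);
    }
    return result;
  }
};
```

For a tree with `idom = {0, 0, 0, 1, 1, 3}`, the range `{5, 4, 5}` resolves to block 1, and the repeated block 5 is only processed once.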
Matthias Springer
dd450f08cf
[mlir][Interfaces][NFC] Move region loop detection to RegionBranchOpInterface (#77090)
`BufferPlacementTransformationBase::isLoop` checks if there is a loop in
the region branching graph of an operation. This algorithm is similar to
`isRegionReachable` in the `RegionBranchOpInterface`. To avoid duplicate
code, `isRegionReachable` is generalized, so that it can be used to
detect region loops. A helper function
`RegionBranchOpInterface::hasLoop` is added.

This change also turns a recursive implementation into an iterative one,
which is the preferred implementation strategy in LLVM.

Also move the `isLoop` to `BufferOptimizations.cpp`, so that we can
gradually retire `BufferPlacementTransformationBase`. (This is so that
proper error handling can be added to `BufferViewFlowAnalysis`.)
2024-01-07 13:49:29 +01:00
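The idea of detecting region loops through an iterative reachability query can be sketched on a toy graph. This is not the `RegionBranchOpInterface` code: regions are modeled as integer indices, `RegionGraph` is a hypothetical adjacency list, and the two functions only borrow the names mentioned in the commit message.

```c++
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical region branching graph: successors[r] lists the regions that
// region r may branch to.
using RegionGraph = std::vector<std::vector<int>>;

// Returns true if `target` can be reached from `begin` by following one or
// more edges. Uses an explicit worklist instead of recursion.
bool isRegionReachable(const RegionGraph &successors, int begin, int target) {
  assert(begin >= 0 && static_cast<std::size_t>(begin) < successors.size());
  std::vector<bool> visited(successors.size(), false);
  std::vector<int> worklist(successors[begin].begin(),
                            successors[begin].end());
  while (!worklist.empty()) {
    int region = worklist.back();
    worklist.pop_back();
    if (region == target)
      return true;
    if (visited[region])
      continue;
    visited[region] = true;
    for (int next : successors[region])
      worklist.push_back(next);
  }
  return false;
}

// A loop exists if some region can reach itself.
bool hasLoop(const RegionGraph &successors) {
  for (int region = 0; region < static_cast<int>(successors.size()); ++region)
    if (isRegionReachable(successors, region, region))
      return true;
  return false;
}
```

Generalizing the reachability query to accept the same region as both start and target is what lets one helper serve both purposes in this sketch.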
Bharathi Ramana Joshi
3eb9fd8ac8
[MLIR][Presburger] Implement IntegerRelation::mergeAndAlignSymbols (#76736) 2024-01-07 17:06:52 +05:30
Abhinav271828
2835be82db
[MLIR][Presburger] Fix ParamPoint to be column-wise instead of row-wise (#77232)
The ParamPoint datatype has each column representing an affine function.
The code for generating functions is modified to reflect this.
2024-01-07 16:27:10 +05:30
Abhinav271828
4c8dbb6813
[MLIR][Presburger] Definitions for basic functions related to cones (#76650)
We add some basic type aliases and function definitions relating to
cones for Barvinok's algorithm.
These include functions to get the dual of a cone and find its index.
2024-01-07 10:30:22 +00:00
Alex Beloi
c63febb102
[mlir][spirv] Use assemblyFormat to define atomic op assembly (#76323)
see #73359

Declarative assemblyFormat ODS is more concise and requires less
boilerplate than filling out CPP interfaces.

Changes:
* Updates the Ops defined in `SPIRVAtomicOps.td` to use assemblyFormat.
* Removes print/parse from `AtomcOps.cpp`, which are now generated by
assemblyFormat
* Adds `Trait` to verify that a pointer operand `foo`'s pointee type
matches operand `bar`'s type
* Updates the error messages expected in tests for the new Trait
* Updates tests to the updated format (largely using <operand> in place of
"operand")
2024-01-06 19:55:55 -08:00
Maksim Levental
83be8a7400
[mlir][python] add MemRefTypeAttr attr builder (#76371) 2024-01-06 16:42:14 -06:00
Kohei Yamaguchi
747d8fb01c
[mlir][spirv] Support alias/restrict function argument decorations (#76353)
Closes #76106

---------

Co-authored-by: Lei Zhang <antiagainst@gmail.com>
2024-01-06 11:51:23 -08:00
Abhinav271828
bd0dc357af
[MLIR][Presburger] Shift GeneratingFunction.h to includes (#77114)
We shift the GeneratingFunction.h header file to the include/ directory
and wrap it in a `detail` namespace.
2024-01-06 17:08:25 +05:30
Guray Ozen
5b33cff397
[mlir][gpu] Add Support for Cluster of Thread Blocks in gpu.launch (#76924) 2024-01-06 11:17:01 +01:00
Dimple Prajapati
5e54319b7b
[mlir][spirv] Support spec constants as GlobalVar initializer (#75660)
Changes include:

- SPIR-V serialization and deserialization need to handle the case where a
GlobalVariableOp initializer is defined using a SPIR-V SpecConstant or
SpecConstantComposite op. Currently, even though SpecConstant is allowed,
only the GlobalVariable map is searched for the initializer symbol
reference. This change fixes that and extends the support to
SpecConstantComposite as an initializer.
- Adds tests to make sure GlobalVariable can be initialized using
specialized constants.

---------

Co-authored-by: Lei Zhang <antiagainst@gmail.com>
2024-01-05 16:27:30 -08:00
Boian Petkantchin
fc18b13492
[mlir][mesh] In sharding attr use FlatSymbolRefAttr instead of SymbolRefAttr (#76886)
Analogous to func.call, use FlatSymbolRefAttr to reference the
corresponding mesh.
2024-01-05 07:14:07 -08:00
Arseniy Obolenskiy
59569eb756
[mlir] Fix support for loop normalization with integer indices (#76566)
Choose the correct type for the updated loop bounds after scf loop
normalization; do not force the chosen type to be IndexType.
2024-01-05 17:49:21 +03:00
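To illustrate what keeping the integer type through normalization means, here is a toy sketch on plain integers. The actual change works on scf loops and SSA values; `LoopBounds`, `normalize`, and `denormalize` are hypothetical names, and the template parameter stands in for the loop's index or integer type.

```c++
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical loop description: for (iv = lower; iv < upper; iv += step).
template <typename IntT>
struct LoopBounds {
  IntT lower, upper, step;
};

// Rewrites the loop to run from 0 to the trip count with step 1. The result
// keeps the integer type of the input instead of forcing one fixed type.
template <typename IntT>
LoopBounds<IntT> normalize(LoopBounds<IntT> loop) {
  static_assert(std::is_integral<IntT>::value, "expected an integer type");
  assert(loop.step > 0 && "expected a positive step");
  IntT range = loop.upper > loop.lower ? loop.upper - loop.lower : 0;
  IntT tripCount = (range + loop.step - 1) / loop.step; // ceiling division
  return {0, tripCount, 1};
}

// Maps a normalized induction value back to the original iteration space.
template <typename IntT>
IntT denormalize(LoopBounds<IntT> original, IntT normalizedIv) {
  return original.lower + normalizedIv * original.step;
}
```

A 32-bit loop from 2 to 11 with step 3 becomes a 32-bit loop from 0 to 3 with step 1, and the normalized value 2 maps back to the original value 8.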
Guray Ozen
06f1e10908
[mlir][nvvm] Add clock and clock64 special registers (#77088)
This PR adds `clock` and `clock64` special registers to NVVM dialect.
2024-01-05 14:41:44 +01:00
Guray Ozen
ace69e6b94
[mlir][gpu] Improve gpu-lower-to-nvvm-pipeline Documentation (#77062)
This PR improves the documentation for the `gpu-lower-to-nvvm-pipeline`
(as it was a remaining item for #75775).

- Changes pipeline `gpu-lower-to-nvvm` -> `gpu-lower-to-nvvm-pipeline`
- Adds a section to the GPU Dialect page on the website. It clarifies the
pipeline's functionality in lowering primary dialects to NVVM targets.
2024-01-05 12:51:25 +01:00
drazi
44b3cf46e9
add prop-dict support for custom directive for mlir-tblgen (#77061)
According to
https://mlir.llvm.org/docs/DefiningDialects/Operations/#custom-directives,
custom directives support attr-dict:

> attr-dict Directive: NamedAttrList &

But they do not support prop-dict, which was introduced into MLIR recently.
It is useful to have tblgen support prop-dict like attr-dict. This PR
enables tblgen to support prop-dict.

```bash
error: only variables and types may be used as parameters to a custom directive
   ... custom<Print>(prop-dict)
```

Co-authored-by: Fung Xie <ftse@nvidia.com>
2024-01-05 12:37:24 +01:00
Dmitriy Smirnov
2952fb3495
[TOSA] Usage of 32bit integer for 'index to float' in rfft2d (#75098)
Lowering of rfft2d to linalg now uses an index-to-i32 cast if the output
float type is 32-bit, and an index-to-i64 cast otherwise.
2024-01-05 09:51:23 +00:00
Guray Ozen
4319e1916d
[mlir][nvgpu] Introduce Multicast Capability to nvgpu.tma.async.load (#76935)
This PR improves the functionality of the `nvgpu.tma.async.load` Op by
adding support for multicast. While we already had this capability in
the lower-level `nvvm.cp.async.bulk.tensor.shared.cluster.global` NVVM
Op, this PR lowers mask information to the NVVM operation.
2024-01-05 10:48:55 +01:00
Matthias Springer
b662c9aa0e
[mlir][bufferization][NFC] Buffer deallocation: Add comment to handleInterface (#76956)
This is a follow-up for #68648.
2024-01-05 09:30:52 +01:00
Matthias Springer
bb6d5c2200
[mlir][Transforms] GreedyPatternRewriteDriver: Do not CSE constants during iterations (#75897)
The `GreedyPatternRewriteDriver` tries to iteratively fold ops and apply
rewrite patterns to ops. It has special handling for constants: they are
CSE'd and sometimes moved to parent regions to allow for additional
CSE'ing. This happens in `OperationFolder`.

To allow for efficient CSE'ing, `OperationFolder` maintains an internal
lookup data structure to find the existing constant ops with the same
value for each `IsolatedFromAbove` region:
```c++
/// A mapping between an insertion region and the constants that have been
/// created within it.
DenseMap<Region *, ConstantMap> foldScopes;
```

Rewrite patterns are allowed to modify operations. In particular, they
may move operations (including constants) from one region to another
one. Such an IR rewrite can make the above lookup data structure
inconsistent.

We encountered such a bug in a downstream project. This bug materialized
in the form of an op that uses the result of a constant op from a
different `IsolatedFromAbove` region (that is not accessible).

This commit changes the behavior of the `GreedyPatternRewriteDriver`
such that `OperationFolder` is used to CSE constants at the beginning of
each iteration (as the worklist is populated), but no longer during an
iteration. `OperationFolder` is no longer used after populating the
worklist, so we do not have to care about inconsistent state in the
`OperationFolder` due to IR rewrites. The `GreedyPatternRewriteDriver`
now performs the op folding by itself instead of calling
`OperationFolder::tryToFold`.

This change changes the order of constant ops in test cases, but not the
region in which they appear. All broken test cases were fixed by turning
`CHECK` into `CHECK-DAG`.

Alternatives considered: The state of `OperationFolder` could be
partially invalidated with every `notifyOperationModified` notification.
That is more fragile than the solution in this commit because incorrect
rewriter API usage can lead to missing notifications and hard-to-debug
`IsolatedFromAbove` violations. (It did not fix the above-mentioned bug in
a downstream project, which could be due to incorrect rewriter API usage
or due to another conceptual problem that I missed.) Moreover, ops are
frequently getting modified during a greedy pattern rewrite, so we would
likely keep invalidating large parts of the state of `OperationFolder`
over and over.

Migration guide: Turn `CHECK` into `CHECK-DAG` in test cases. Constant
ops are no longer folded during a greedy pattern rewrite. If you rely on
folding (and rematerialization) of constant ops during a greedy pattern
rewrite, turn the folder into a pattern.
2024-01-05 09:22:18 +01:00
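The inconsistency described in the commit above can be modeled with a toy version of a per-region constant lookup. This is only a sketch of the failure mode, not the `OperationFolder` implementation: `ToyRegion`, `ToyConstant`, `ToyFoldScopes`, and the helper functions are hypothetical.

```c++
#include <cassert>
#include <map>

// Hypothetical stand-ins for a region and a constant op that lives in it.
struct ToyRegion {
  int id;
};
struct ToyConstant {
  int value;
  ToyRegion *parent;
};

// Per-region lookup from a constant value to an existing constant op, similar
// in spirit to the `foldScopes` map quoted in the commit message.
using ToyFoldScopes = std::map<ToyRegion *, std::map<int, ToyConstant *>>;

// Returns the cached constant with `value` for `region`, or nullptr.
ToyConstant *lookup(const ToyFoldScopes &scopes, ToyRegion *region, int value) {
  auto scopeIt = scopes.find(region);
  if (scopeIt == scopes.end())
    return nullptr;
  auto it = scopeIt->second.find(value);
  return it == scopeIt->second.end() ? nullptr : it->second;
}

// Models a rewrite pattern that moves a constant to another region without
// the cache being updated.
void moveWithoutNotifying(ToyConstant &constant, ToyRegion &destination) {
  constant.parent = &destination;
}

// Returns true if every cached constant still lives in the region it is
// filed under.
bool isConsistent(const ToyFoldScopes &scopes) {
  for (const auto &scope : scopes)
    for (const auto &entry : scope.second) {
      assert(entry.second && "cache entries are expected to be non-null");
      if (entry.second->parent != scope.first)
        return false;
    }
  return true;
}
```

After `moveWithoutNotifying`, a lookup for the original region still returns the moved constant, which mirrors how an op could end up using a constant from a different, inaccessible region.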
Uday Bondhugula
c1eef483b2
[MLIR] Support interrupting AffineExpr walks (#74792)
Support WalkResult for AffineExpr walk and support interrupting walks
along the lines of Operation::walk. This allows a walk to be interrupted
when a condition is met. Also, switch from std::function to
llvm::function_ref for the walk function.
2024-01-05 06:35:22 +05:30
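An interruptible walk of this kind can be sketched on a toy expression tree. The types below are not MLIR's `AffineExpr` or `WalkResult`; `ToyExpr`, `ToyWalkResult`, `node`, and `contains` are hypothetical, and a template parameter stands in for `llvm::function_ref`, which is not part of the standard library.

```c++
#include <cassert>
#include <memory>
#include <utility>

enum class ToyWalkResult { Advance, Interrupt };

// Hypothetical binary expression node.
struct ToyExpr {
  int value = 0;
  std::unique_ptr<ToyExpr> lhs, rhs;
};

// Convenience builder for the toy tree.
std::unique_ptr<ToyExpr> node(int value, std::unique_ptr<ToyExpr> lhs = nullptr,
                              std::unique_ptr<ToyExpr> rhs = nullptr) {
  auto expr = std::make_unique<ToyExpr>();
  expr->value = value;
  expr->lhs = std::move(lhs);
  expr->rhs = std::move(rhs);
  return expr;
}

// Post-order walk that stops as soon as the callback returns Interrupt.
template <typename Fn>
ToyWalkResult walk(const ToyExpr &expr, Fn &&callback) {
  if (expr.lhs && walk(*expr.lhs, callback) == ToyWalkResult::Interrupt)
    return ToyWalkResult::Interrupt;
  if (expr.rhs && walk(*expr.rhs, callback) == ToyWalkResult::Interrupt)
    return ToyWalkResult::Interrupt;
  return callback(expr);
}

// Returns true if some node holds `needle`; `visited` reports how many nodes
// the callback saw before the walk finished or was interrupted.
bool contains(const ToyExpr &root, int needle, int &visited) {
  visited = 0;
  ToyWalkResult result = walk(root, [&](const ToyExpr &expr) {
    ++visited;
    return expr.value == needle ? ToyWalkResult::Interrupt
                                : ToyWalkResult::Advance;
  });
  assert(visited > 0 && "a post-order walk always visits at least one node");
  return result == ToyWalkResult::Interrupt;
}
```

With the tree `node(1, node(2, node(4), node(5)), node(3))`, searching for 5 stops after two callbacks, whereas a search for a missing value visits all five nodes.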
Valentin Clement (バレンタイン クレメン)
e456689fb3
[mlir][flang][openacc] Support device_type on loop construct (#76892)
This adds support for the `device_type` clause representation in the
OpenACC MLIR dialect on the acc.loop operation and adjusts flang to lower
correctly to the new representation.

Each "value" that can be impacted by a `device_type` clause is now
associated with an array attribute that carries this information. This
includes:
- `worker` clause information
- `gang` clause information
- `vector` clause information
- `collapse` clause information
- `tile` clause information

The representation of the `gang` clause information has been updated and
all values are now carried in a single operand segment. This segment is
then subdivided by `device_type`. Each value in a segment is also
associated with a `GangArgType` so it can be differentiated
(num/dim/static). This simplifies the handling of gang values and limits
the number of new attributes needed.

When the clause can be associated with the operation without any value
(`gang`, `vector`, `worker`), it is represented by a dedicated attribute
with device_type information.

Extra getter functions are provided to make it easier to retrieve a
value based on a device_type.
2024-01-04 16:33:33 -08:00
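One way to picture a single operand segment that is subdivided by `device_type` is the toy layout below. It is a sketch under assumptions, not the acc.loop representation itself: the struct, its fields, the enum members, and the `get` accessor are all hypothetical.

```c++
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical enums; the members are chosen for illustration only.
enum class ToyDeviceType { None, Nvidia, Radeon };
enum class ToyGangArgType { Num, Dim, Static };

struct ToyGangOperands {
  std::vector<int> values;                // all gang values, stored flat
  std::vector<ToyGangArgType> argTypes;   // one entry per value
  std::vector<std::size_t> segmentSizes;  // number of values per device type
  std::vector<ToyDeviceType> deviceTypes; // one entry per segment

  // Returns the gang value of the given kind for `device`, or `fallback` if
  // that device type has no such value.
  int get(ToyDeviceType device, ToyGangArgType kind, int fallback) const {
    assert(values.size() == argTypes.size());
    assert(segmentSizes.size() == deviceTypes.size());
    std::size_t begin = 0;
    for (std::size_t s = 0; s < segmentSizes.size(); ++s) {
      std::size_t end = begin + segmentSizes[s];
      if (deviceTypes[s] == device)
        for (std::size_t i = begin; i < end; ++i)
          if (argTypes[i] == kind)
            return values[i];
      begin = end;
    }
    return fallback;
  }
};
```

Here the values `{8, 2, 16}` with segment sizes `{2, 1}` mean that the first two values belong to the first device type and the last value to the second, so one flat list serves every device type.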
Valentin Clement (バレンタイン クレメン)
71ec30132b
[mlir][openacc] Add device_type support for data operation (#76126)
Following #75864, this patch adds device_type support to the data
operation on the async and wait operands and attributes.
2024-01-04 16:33:20 -08:00
Aart Bik
4241e84707
[mlir][sparse] minor comment edits in sparsifier pipeline (#77000) 2024-01-04 14:09:31 -08:00
Maksim Levental
a0c19bd455
[mlir][RegionBranchOpInterface] explicitly check for existence of block terminator (#76831) 2024-01-04 14:43:52 -06:00
Oleksandr "Alex" Zinenko
71c17424b5
[mlir][TD] update more tests to use the "main" interpreter pass (#76963)
Update several tests under mlir/test/Dialect/Transform to use the "main"
transform interpreter pass with named entry points rather than the test
interpreter pass.

This helped discover a logic error in the expensive checks mechanism
that was exiting too early.
2024-01-04 21:33:51 +01:00
Valentin Clement
85939e5e24
[mlir][openacc][NFC] Rename custom parser from WaitOperands to DeviceTypeOperandsWithSegment 2024-01-04 10:28:37 -08:00
Andrzej Warzyński
db9a16eaed
[mlir][nfc] Update comments in the Linalg vectoriser (#76797) 2024-01-04 17:24:22 +00:00
Jakub Kuderski
9215741726
[mlir] Make fold result type check more verbose (#76867)
Print the op and its types when the fold type check fails. This is to
speed up debugging as it should be trivial to map the offending op to its
folder based on the op name.
2024-01-04 11:08:36 -05:00
Oleksandr "Alex" Zinenko
b336ab42dc
[mlir] add a way to query non-property attributes (#76959)
This helps support generic manipulation of operations that don't (yet)
use properties to store inherent attributes.

Use this mechanism in type inference and operation equivalence.

Note that only minimal unit tests are introduced as all the upstream
dialects seem to have been updated to use properties and the
non-property behavior is essentially deprecated and untested.
2024-01-04 16:40:13 +01:00
Krzysztof Drewniak
2aff7f3919
[mlir][LLVM] Add !invariant.load metadata support to llvm.load (#76754)
Add support for !invariant.load metadata (by way of a unit attribute) to
the MLIR representation of llvm.load.
2024-01-04 09:33:09 -06:00
Simon Camphausen
96c23ebd3b
[mlir][EmitC] Use declarative assembly format for opaque types and attributes (#76066)
The parser and printer of string attributes were changed to handle
escape sequences. Therefore, we no longer require a custom parser and
printer. Verification is moved from the parser to the verifier
accordingly.
2024-01-04 15:43:33 +01:00
Andrzej Warzyński
ca5d34ec71
[mlir][TD] Fix the order of return handles (#76929)
Replace (in tests and docs):

    %forall, %tiled = transform.structured.tile_using_forall

with (updated order of return handles):

    %tiled, %forall = transform.structured.tile_using_forall

Similar change is applied to (in the TD tutorial):

    transform.structured.fuse_into_containing_op

This update makes sure that the tests/documentation are consistent with
the Op specifications. Follow-up for #67320 which updated the order of
the return handles for `tile_using_forall`.
2024-01-04 12:54:16 +00:00
Alex Zinenko
5ed11e767c [mlir] don't use magic numbers in IRNumbering.cpp
Bytecode versions have named constants that should be used instead of
magic numbers.
2024-01-04 09:49:34 +00:00
Alex Zinenko
985bb3a20a [mlir] fix bytecode writer after c1eab57673
The change in c1eab57 fixed the
behavior of `getDiscardableAttrDictionary` for ops that are not using
properties to only return discardable attributes. Bytecode writer was
relying on the wrong behavior and would assume all attributes are
discardable, without appropriate testing. Fix that and add a test.
2024-01-04 09:49:34 +00:00
Mitch Phillips
0c23163184 Revert "[mlir] Add res() method to linalg::ContractionOpInterface (#76539)"
This reverts commit 53edf12e52.

Reason: Broke the sanitizer buildbots with a memory leak. More
information available on
53edf12e52
2024-01-04 10:37:32 +01:00
drblallo
2bd6642533
[mlir][dataflow]Fix dense backward dataflow intraprocedural hook (#76865)
The dataflow analysis framework within MLIR allows customizing the
transfer function when a `call-like` operation is encountered.

The check to see if the analysis was executed in intraprocedural mode
was executed after the check to see if the callee had the
CallableOpInterface, and thus intraprocedural analyses would behave as
interprocedural ones when performing indirect calls.

This commit fixes the issue by performing the check for
intraprocedurality first.

Dense forward analyses were already behaving correctly.
https://github.com/llvm/llvm-project/blob/main/mlir/lib/Analysis/DataFlow/DenseAnalysis.cpp#L63

Co-authored-by: massimo <mo.fioravanti@gmail.com>
2024-01-04 10:28:12 +01:00
Sergei Lebedev
3737712dae
Slightly improved ir.pyi type annotations (#76728)
* Replaced `Any` with static types where appropriate
* Removed undocumented `__str__` and `__repr__` -- these are always
defined via `object`
2024-01-04 09:49:57 +01:00
Andrzej Warzyński
f8c034140b
[mlir][docs] Update TD tutorial - Ch0 (#76858)
Updates `generic` to `linalg.generic` (for consistency and to avoid
ambiguity) and a few other fixes.
2024-01-04 09:48:44 +01:00
Jacques Pienaar
6ae7f66ff5 [mlir] Add config for PDL (#69927)
Make it so that PDL in pattern rewrites can be optionally disabled.

PDL is still enabled by default and is not optional in Bazel. So this should
be a NOP for most folks, while enabling others to disable it.

This only works with tests disabled. With tests enabled this still
compiles but tests fail as there is no lit config to disable tests that
depend on PDL rewrites yet.
2024-01-03 20:37:20 -08:00
Jerry Wu
53edf12e52
[mlir] Add res() method to linalg::ContractionOpInterface (#76539)
In addition to `lhs()` and `rhs()` to return left and right operands,
add `res()` to return the result value.
2024-01-03 22:34:19 -05:00
Boian Petkantchin
7a4c49756d
[mlir][mesh] Use one type for mesh axis (#76830)
Make all ops and attributes use the types MeshAxis and MeshAxesAttr
instead of int16_t, int32_t, DenseI16ArrayAttr and DenseI32ArrayAttr.
2024-01-03 15:47:11 -08:00
Andrzej Warzyński
39298b09ec
[mlir][docs] Capitalize "Transform" in "transform dialect" (#76840)
A mix of "Transform dialect" and "transform dialect" is used ATM. This
patch capitalizes the outstanding instances of "transform".
2024-01-03 21:33:11 +00:00