Commit Graph

17397 Commits

Author SHA1 Message Date
Georgios Pinitas
363c617aac
[mlir][tosa] Align shift attribute of TOSA_MulOp with the spec (#67816)
According to specification the `shift` attribute of the Mul operator in
TOSA is of signless i8 type instead of i32.

Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
2023-10-02 15:02:16 -07:00
Maksim Levental
d7e49736e6
[mlir][CAPI, python bindings] Expose Operation::setSuccessor (#67922)
This is useful for emitting (using the python bindings) `cf.br` to
blocks that are declared lexically post block creation.
2023-10-02 15:37:25 -05:00
Andrzej Warzynski
811b05c4ef [mlir][ArmSME] Remove dependency on a non-existing target (nfc)
Sending this one without a review - as `MLIRArmSMEIncGen` is not defined
anywhere, dependency on that target is clearly bogus.
2023-10-02 20:34:48 +00:00
Yinying Li
2cb99df609
[mlir][sparse] Fix typos (#67859) 2023-10-02 11:07:38 -04:00
Yinying Li
d2e8517912
[mlir][sparse] Update Enum name for CompressedWithHigh (#67845)
Change CompressedWithHigh to LooseCompressed.
2023-10-02 11:06:40 -04:00
Oleksandr "Alex" Zinenko
aab795a8dc
[mlir] run buffer deallocation in transform tutorial (#67978)
Buffer deallocation pipeline previously was incorrect when applied to
functions. It has since been fixed. Make sure it is exercised in the
tutorial to avoid leaking allocations.
2023-10-02 16:08:11 +02:00
Matthias Springer
c95fcd343d
[mlir][bufferization] Remove resolveUsesInRepetitiveRegions (#67927)
The bufferization analysis has been improved over the last months and
this workaround is no longer needed.
2023-10-02 16:04:27 +02:00
Matthias Springer
43198b0aa2
[mlir][bufferization] Better analysis around allocs and block arguments (#67923)
Values that are the result of buffer allocation ops are guaranteed to
*not* be the same allocation as block arguments of containing blocks.
This fact can be used to allow for more aggressive simplification of
`bufferization.dealloc` ops.
2023-10-02 11:01:12 +02:00
Kai Sasaki
9782993584
[mlir][affine] Check the input vector sizes to be greater than 0 (#65293)
During vectorization of an affine loop, a vector size of 0 causes a
crash when building an invalid AffineForOp. We can catch this case up
front, before it reaches the assertion.

See: https://github.com/llvm/llvm-project/issues/64262
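
The check described above can be pictured as a small validation step that runs before any loop is rewritten. The following is a Python model of the idea only, not the MLIR implementation; `validate_vector_sizes` is a hypothetical helper name.

```python
from typing import Sequence


def validate_vector_sizes(vector_sizes: Sequence[int]) -> None:
    """Reject non-positive vector sizes before vectorization starts.

    Hypothetical helper: it models the idea of the check, not the MLIR API.
    """
    for position, size in enumerate(vector_sizes):
        if size <= 0:
            raise ValueError(
                f"vector size at position {position} must be greater than 0, got {size}"
            )


validate_vector_sizes([4, 8])  # accepted: every size is positive
try:
    validate_vector_sizes([4, 0])  # rejected before any loop would be built
except ValueError as err:
    print(err)
```

Failing early with a diagnostic keeps the invalid size from ever reaching the code that builds the loop.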
2023-10-02 09:52:00 +09:00
Mehdi Amini
2375d84f06 Fix MLIR test for UBSAN: define enum values that are set in cl::opt (NFC) 2023-10-01 16:08:51 -07:00
Zhenyan Zhu
9580468302 [mlir][affine] Enforce each result type to match Reduction ops in affine.parallel verifier
This patch updates AffineParallelOp::verify() to check that each result type
matches its corresponding reduction op (e.g., the result type must be a
`FloatType` if the reduction attribute is `addf`).

affine.parallel will crash on --lower-affine if a result type does not match
its reduction attribute.

```
      %128 = affine.parallel (%arg2, %arg3) = (0, 0) to (8, 7) reduce ("maxf") -> (memref<8x7xf32>) {
        %alloc_33 = memref.alloc() : memref<8x7xf32>
        affine.yield %alloc_33 : memref<8x7xf32>
      }
```
This will crash and report a type conversion issue when we run `mlir-opt --lower-affine`:

```
Assertion failed: (isa<To>(Val) && "cast<Ty>() argument of incompatible type!"), function cast, file Casting.h, line 572.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0.	Program arguments: mlir-opt --lower-affine temp.mlir
 #0 0x0000000102a18f18 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/workspacebin/mlir-opt+0x1002f8f18)
 #1 0x0000000102a171b4 llvm::sys::RunSignalHandlers() (/workspacebin/mlir-opt+0x1002f71b4)
 #2 0x0000000102a195c4 SignalHandler(int) (/workspacebin/mlir-opt+0x1002f95c4)
 #3 0x00000001be7894c4 (/usr/lib/system/libsystem_platform.dylib+0x1803414c4)
 #4 0x00000001be771ee0 (/usr/lib/system/libsystem_pthread.dylib+0x180329ee0)
 #5 0x00000001be6ac340 (/usr/lib/system/libsystem_c.dylib+0x180264340)
 #6 0x00000001be6ab754 (/usr/lib/system/libsystem_c.dylib+0x180263754)
 #7 0x0000000106864790 mlir::arith::getIdentityValueAttr(mlir::arith::AtomicRMWKind, mlir::Type, mlir::OpBuilder&, mlir::Location) (.cold.4) (/workspacebin/mlir-opt+0x104144790)
 #8 0x0000000102ba66ac mlir::arith::getIdentityValueAttr(mlir::arith::AtomicRMWKind, mlir::Type, mlir::OpBuilder&, mlir::Location) (/workspacebin/mlir-opt+0x1004866ac)
 #9 0x0000000102ba6910 mlir::arith::getIdentityValue(mlir::arith::AtomicRMWKind, mlir::Type, mlir::OpBuilder&, mlir::Location) (/workspacebin/mlir-opt+0x100486910)
...
```

Fixes #64068

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D157985
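
As a rough illustration of the verifier rule in this commit, the sketch below pairs each reduction kind with the class of result type it needs. It is a Python model, not the C++ verifier: only the `addf` to float pairing comes from the message above, and the other table entries, the type-name parsing, and the function names are assumptions made for the example.

```python
# Assumed table: only "addf" -> float is stated in the commit message;
# the remaining entries are illustrative.
REQUIRED_CLASS = {
    "addf": "float",
    "mulf": "float",
    "maxf": "float",
    "addi": "integer",
    "muli": "integer",
}

FLOAT_TYPES = {"f16", "bf16", "f32", "f64"}


def classify(type_name: str) -> str:
    """Classify a textual type name as float, integer, or other."""
    if type_name in FLOAT_TYPES:
        return "float"
    if type_name.startswith("i") and type_name[1:].isdigit():
        return "integer"
    return "other"


def verify_reductions(reductions, result_types):
    """Return a list of error strings; an empty list means the op verifies."""
    if len(reductions) != len(result_types):
        return ["expected one result type per reduction"]
    errors = []
    for index, (kind, type_name) in enumerate(zip(reductions, result_types)):
        required = REQUIRED_CLASS.get(kind)
        if required is None:
            errors.append(f"result {index}: unknown reduction '{kind}'")
        elif classify(type_name) != required:
            errors.append(
                f"result {index}: '{kind}' needs a {required} type, got {type_name}"
            )
    return errors


# The memref result from the example above is rejected for "maxf".
print(verify_reductions(["maxf"], ["memref<8x7xf32>"]))
print(verify_reductions(["addf"], ["f32"]))
```

Rejecting the mismatch in the verifier turns the later lowering crash into an ordinary diagnostic.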
2023-10-01 14:24:17 -07:00
Matthias Springer
0ef990d57c
[mlir][bufferization] Improve verifier for bufferization.dealloc (#67912)
Check that the number of retained operands and updated conditions match.
2023-10-01 19:36:43 +02:00
Jakub Kuderski
94189e101c
[mlir][spirv] Implement missing validation rules for ptr variables (#67871)
Variables that point to physical storage buffer require aliasing
decorations. This is specified by the `SPV_KHR_physical_storage_buffer`
extension.

Also add an example of a variable with a decoration attribute.
2023-09-30 00:21:28 -04:00
Stella Laurenzo
8d203100e8 Revert "[mlir][memref] Fix offset update in emulating narrow type for strided memref (#67714)"
This reverts commit 35ec6ea644.

Breaks downstream narrow type execution tests.
2023-09-29 18:49:33 -07:00
Mogball
2b5134f1b7 [mlir] Fix bytecode reading of resource sections
This partially reverts #66380. The assertion that the underlying buffer
of an EncodingReader is aligned to any required alignments cannot be
upheld for resource sections. Resources know their own alignment and pad their buffers
accordingly, but the bytecode reader doesn't know that ahead of time.
Consequently, it cannot give the resource EncodingReader a base buffer
aligned to the maximum required alignment.

A simple example from the test fails without this:

```mlir
module @TestDialectResources attributes {
  bytecode.test = dense_resource<resource> : tensor<4xi32>
} {}
{-#
  dialect_resources: {
    builtin: {
      resource: "0x2000000001000000020000000300000004000000",
      resource_2: "0x2000000001000000020000000300000004000000"
    }
  }
#-}
```
2023-09-29 18:39:56 -07:00
JOE1994
204883623e [NFC] Replace uses of Type::getPointerTo
Replace some uses of `Type::getPointerTo` in two ways:
* Remove entirely if it's only used to support an unnecessary bitcast
  (remove the bitcast as well).
* Replace with `PointerType::get`/`PointerType::getUnqual`

NFC opaque pointer clean-up effort.
2023-09-29 21:38:53 -04:00
cxy
0c63122713 [MLIR] Add stage to side effect
[MLIR] Add stage and effectOnFullRegion to side effect

    This patch adds stage and effectOnFullRegion to side effects so that optimization
    passes can obtain more accurate information.
    Stage uses numbering to track the stage at which a side effect occurs.
    EffectOnFullRegion indicates whether the effect acts on every single value of the resource.

    RFC discussion: https://discourse.llvm.org/t/rfc-add-effect-index-in-memroy-effect/72235
    Differential Revision: https://reviews.llvm.org/D156087

Reviewed By: mehdi_amini, Mogball

2023-09-29 17:47:13 -07:00
Mogball
bebb9dfc9c [mlir][llvm] Add an indirect call builder for CallOp (NFC) 2023-09-29 17:23:09 -07:00
tyb0807
a3af099785
[mlir][NFC] Fix comment explaining ConverVectorLoad (#67864)
The new number of elements should be the original one divided by a scale
factor computed from old and new bit width.
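
The arithmetic behind that comment can be written out as follows. This is a Python sketch with a hypothetical function name; it assumes the new bit width is a multiple of the old one and that the element count divides evenly by the scale, which the message above does not state.

```python
def emulated_num_elements(num_elements: int, old_bits: int, new_bits: int) -> int:
    """Element count after re-expressing a vector with wider elements.

    scale = new_bits / old_bits, result = num_elements / scale.
    The divisibility checks are assumptions made for this sketch.
    """
    if old_bits <= 0 or new_bits % old_bits != 0:
        raise ValueError("new bit width must be a multiple of the old bit width")
    scale = new_bits // old_bits
    if num_elements % scale != 0:
        raise ValueError("element count must be divisible by the scale factor")
    return num_elements // scale


# Eight i4 elements occupy the same bits as four i8 elements.
print(emulated_num_elements(8, old_bits=4, new_bits=8))  # 4
```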
2023-09-30 00:52:05 +02:00
Krzysztof Drewniak
0463e00ac6
[mlir][ROCDL] Fix file leak in SerializeToHsaco and its newer form (#67711)
SerializeToHsaco, as currently implemented, leaks the file descriptor
of the .hsaco temporary file, which causes issues in long-running
parallel compilation setups.

See also https://github.com/ROCmSoftwarePlatform/rocMLIR/pull/1257
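
The general pattern behind this kind of leak, and its fix, can be shown in a few lines. The sketch is in Python and is not the C++ change made by this commit; `write_temp_blob` is a hypothetical helper. Creating a temporary file hands back an open descriptor, and that descriptor stays open until it is closed explicitly.

```python
import os
import tempfile


def write_temp_blob(data: bytes, suffix: str = ".hsaco") -> str:
    """Write data to a fresh temporary file and return its path.

    mkstemp returns an already-open descriptor. Wrapping it with
    os.fdopen inside a `with` block closes it on exit, so repeated
    calls do not accumulate open descriptors.
    """
    fd, path = tempfile.mkstemp(suffix=suffix)
    with os.fdopen(fd, "wb") as handle:
        handle.write(data)
    return path


path = write_temp_blob(b"example bytes")
with open(path, "rb") as handle:
    print(handle.read())
os.remove(path)  # the caller owns the file and removes it when done
```

Dropping the `with` block and keeping only the path would leave one descriptor open per call, which is the failure mode that hurts long-running parallel compilation.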
2023-09-29 17:24:40 -05:00
Abhishek Varma
070d2114b1
[mlir][Linalg] Fix SoftmaxOp's reify result shape calculation (#67790)
-- SoftmaxOp's `reifyResultShapes` function wrongly cast the op to a
`LinalgOp`.
-- This commit fixes SoftmaxOp's reify result shape calculation.

Signed-off-by: Abhishek Varma <abhishek@nod-labs.com>
2023-09-29 10:55:35 -07:00
Andrzej Warzynski
23b5f92c97 [mlir][SME] Re-order patterns alphabetically (nfc) 2023-09-29 16:54:47 +00:00
Andrzej Warzyński
35dd3a6475
[mlir][SME][nfc] Clarify the usage of insertion guard (#67668)
Added extra comment that should clarify the need for an insertion guard
when using `getLoopOverTileSlices`. Also removed some redundant calls to
`setInsertionPointAfter` - the insertion guard would overwrite that on
destruction anyway.
2023-09-29 17:33:57 +01:00
Andrzej Warzyński
94c04772bc
[mlir][vector] Prevent incorrect vector.transfer_{read|write} hoisting (#66930)
At the moment, `hoistRedundantVectorTransfers` would hoist the
`vector.transfer_read`/`vector.transfer_write` pair in this function:

```mlir
func.func @no_hoisting_write_to_memref(%rhs: i32, %arg1: vector<1xi32>) {
  %c0_i32 = arith.constant 0 : i32
  %c0 = arith.constant 0 : index
  %c1 = arith.constant 1 : index
  %c4 = arith.constant 4 : index
  %c20 = arith.constant 20 : index
  %alloca = memref.alloca() {alignment = 64 : i64} : memref<1x1x2xi32>
  %cast = memref.cast %alloca : memref<1x1x2xi32> to memref<1x1x2xi32>
  %collapsed_1 = memref.collapse_shape %alloca [[0, 1, 2]] : memref<1x1x2xi32> into memref<2xi32>
  scf.for %_ = %c0 to %c20 step %c4 {
    %collapsed_2 = memref.collapse_shape %alloca [[0, 1, 2]] : memref<1x1x2xi32> into memref<2xi32>
    %lhs = vector.transfer_read %collapsed_1[%c0], %c0_i32 {in_bounds = [true]} : memref<2xi32>, vector<1xi32>
    %acc = vector.transfer_read %collapsed_2[%c0], %c0_i32 {in_bounds = [true]} : memref<2xi32>, vector<1xi32>
    %op = vector.outerproduct %lhs, %rhs, %acc {kind = #vector.kind<add>} : vector<1xi32>, i32
    vector.transfer_write %op, %collapsed_1[%c0] {in_bounds = [true]} : vector<1xi32>, memref<2xi32>
  }
  return
}
```
as follows:
```mlir
  func.func @no_hoisting_write_to_memref(%arg0: i32, %arg1: vector<1xi32>) {
    %c0_i32 = arith.constant 0 : i32
    %c0 = arith.constant 0 : index
    %c4 = arith.constant 4 : index
    %c20 = arith.constant 20 : index
    %alloca = memref.alloca() {alignment = 64 : i64} : memref<1x1x2xi32>
    %collapse_shape = memref.collapse_shape %alloca [[0, 1, 2]] : memref<1x1x2xi32> into memref<2xi32>
    %collapse_shape_0 = memref.collapse_shape %alloca [[0, 1, 2]] : memref<1x1x2xi32> into memref<2xi32>
    %0 = vector.transfer_read %collapse_shape[%c0], %c0_i32 {in_bounds = [true]} : memref<2xi32>, vector<1xi32>
    %1 = vector.transfer_read %collapse_shape_0[%c0], %c0_i32 {in_bounds = [true]} : memref<2xi32>, vector<1xi32>
    %2 = scf.for %arg2 = %c0 to %c20 step %c4 iter_args(%arg3 = %0) -> (vector<1xi32>) {
      %3 = vector.outerproduct %arg3, %arg0, %1 {kind = #vector.kind<add>} : vector<1xi32>, i32
      scf.yield %3 : vector<1xi32>
    }
    vector.transfer_write %2, %collapse_shape[%c0] {in_bounds = [true]} : vector<1xi32>, memref<2xi32>
    return
  }
```

This is not safe. While one argument for `vector.outerproduct` (`%lhs`
from the original loop) is correctly being forwarded via `iter_args`,
the other one (`%acc` from the original loop) is not.

This patch disables hoisting in cases where the source of "candidate"
`vector.transfer_read` aliases with some other `memref`. A more generic
approach would be to make sure that all values are correctly forwarded
via `iter_args`, but that would require involving alias analysis.
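
The effect of the unsafe hoisting can be reproduced with a scalar model of the two loops. This is a Python sketch under stated assumptions: each one-element vector is modelled as a single integer, `vector.outerproduct` with a scalar right-hand side is modelled as `lhs * rhs + acc`, and the buffer is given an initial value of 1 (the IR leaves it uninitialized). The function names are hypothetical.

```python
def run_original(initial: int, rhs: int, trips: int) -> int:
    """Loop as written: both reads see the value stored by the previous trip."""
    mem = [initial, 0]
    for _ in range(trips):
        lhs = mem[0]  # read through %collapsed_1
        acc = mem[0]  # read through %collapsed_2, which aliases the same storage
        mem[0] = lhs * rhs + acc
    return mem[0]


def run_hoisted(initial: int, rhs: int, trips: int) -> int:
    """Loop after hoisting: only the first read is carried across trips."""
    mem = [initial, 0]
    carried = mem[0]  # forwarded via iter_args
    acc = mem[0]      # read once before the loop and never refreshed
    for _ in range(trips):
        carried = carried * rhs + acc
    mem[0] = carried
    return mem[0]


TRIPS = len(range(0, 20, 4))  # 5 trips, as in `%c0 to %c20 step %c4`
print(run_original(1, 2, TRIPS))  # 243
print(run_hoisted(1, 2, TRIPS))   # 63
```

The two results differ because the hoisted accumulator read keeps returning the value from before the loop, while the original loop observes every write.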

Based on https://github.com/openxla/iree/issues/14994.
2023-09-29 15:34:37 +01:00
Ingo Müller
61ba0b2815 [mlir][transform] Improve error message when file not found. 2023-09-29 14:07:57 +00:00
Benjamin Maxwell
b34f15df55
[mlir][ArmSME] Add arm_sme.move_tile_slice_to_vector op (#67652)
This adds a simple higher-level op for the tile slice to vector
intrinsics (and updates the existing vector.print lowering to use it).
This op will be used a few more times to implement vector.insert/extract
lowerings in later patches.
2023-09-29 10:33:09 +01:00
Matthias Springer
23b794f720
[mlir][Affine][NFC] Define AffineForOp operands in ODS (#67694)
Modernize affine dialect ops: Define LB, UB, step and inits as operands
in TableGen.
2023-09-29 10:47:28 +02:00
Christian Ulmann
e594c45ead
[MLIR][LLVM] Drop unsupported DISubranges while importing (#67712)
This revision ensures that unsupported DISubranges are properly skipped
instead of being transformed into invalid metadata.
2023-09-29 07:29:37 +02:00
Quinn Dawkins
78c49743c7
[MLIR][Vector] Allow non-default memory spaces in gather/scatter lowerings (#67500)
GPU targets can gather on non-default address spaces (e.g. global), so
this removes the check for the default memory space.
2023-09-28 19:20:32 -04:00
Aart Bik
7ac330a461
[mlir][sparse][gpu] protect BSR method with cuda 12.1 (#67728)
MLIR official build is not quite at 12.1 yet, so until then we protect
the Bsr method with a macro guard.
2023-09-28 12:58:01 -07:00
Valentin Clement (バレンタイン クレメン)
49f1232ea1
[flang][openacc] Support assumed shape arrays in private recipe (#67701)
This patch adds correct support for the assumed shape arrays in the
privatization recipes.
This follows the same IR generation as in #67610.
2023-09-28 12:40:51 -07:00
Kunwar Grover
35ec6ea644
[mlir][memref] Fix offset update in emulating narrow type for strided memref (#67714)
The offset when converting type in emulating narrow types did not
account for the offset in strided memrefs. This patch fixes this.
2023-09-29 01:08:43 +05:30
Aart Bik
3231a365c1
[mlir][sparse][gpu] add CSC to libgen GPU sparsification using cuSparse (#67713)
Adds CSC, and also adds BSR as a future format. Coming soon!
2023-09-28 11:47:22 -07:00
Peiming Liu
6ca47eb49d
[mlir][sparse] rename sparse_tensor.(un)pack to sparse_tensor.(dis)assemble (#67717)
Pack/Unpack are overridden in many other places, rename the operations
to avoid confusion.
2023-09-28 11:01:10 -07:00
Matthias Springer
e52899ea52
[mlir][SCF] Bufferize scf.index_switch (#67666)
Add the `BufferizableOpInterface` implementation of `scf.index_switch`.
2023-09-28 19:05:14 +02:00
Arjun P
f3c3e2f46e Matrix: remove self-include [NFC] 2023-09-28 16:53:50 +01:00
Akash Banerjee
1e0fe3bc3c [OpenMP] Add OutlineableOpenMPOpInterface trait to TargetOp
This patch adds the OutlineableOpenMPOpInterface to omp.target. This prevents other operations inside the target region such as WSLoop from hoisting new allocas outside the region.
2023-09-28 16:48:33 +01:00
Krzysztof Drewniak
2ebd633f14 [mlir][AMDGPU] Add packed 8-bit float conversion ops and lowering
Define operations that wrap the gfx940's new operations for converting
between f32 and registers containing packed sets of four 8-bit floats.

Define rocdl operations for the intrinsics and an AMDGPU dialect
wrapper around them (to account for the fact that MLIR distinguishes
the two float formats at the type level but that the LLVM IR does
not).

Define an ArithToAMDGPU pass, meant to run before conversion to LLVM,
that replaces relevant calls to arith.extf and arith.truncf with the
packed operations in the AMDGPU dialect. Note that the conversion
currently only handles scalars and vectors of rank <= 1, as we do not
have a usecase for multi-dimensional vector support right now.

Reviewed By: jsjodin

Differential Revision: https://reviews.llvm.org/D152457
2023-09-28 14:44:16 +00:00
Andrew Gozillon
064666fb66 [MLIR][OpenMP] Fix mistyped syntax in test omptarget-region-parallel-llvm.mlir test
Fix mistyped syntax in omptarget-region-parallel-llvm.mlir test added by b05d436
2023-09-28 09:36:35 -05:00
Andrzej Warzynski
1e70ab5f0d [mlir][transform] Update transform.loop.peel (reland #67482)
This patch updates `transform.loop.peel` so that this Op returns two
rather than one handle:
  * one for the peeled loop, and
  * one for the remainder loop.
Also, following this change this Op will fail if peeling fails. This is
consistent with other similar Ops that also fail if no transformation
takes place.
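
To make the two returned handles concrete, here is a Python sketch of what peeling does to an iteration range. It illustrates the idea only and is not the transform dialect implementation; `peel_loop` is a hypothetical name, and treating an evenly dividing step as a failure is an assumption that mirrors the "fail if peeling fails" behaviour described above.

```python
def peel_loop(lb: int, ub: int, step: int):
    """Split [lb, ub) into a main loop of full steps and a remainder loop.

    Returns (main, remainder) as two ranges. Raises ValueError when there
    is nothing to peel, modelling a transform that fails instead of
    silently returning the input.
    """
    if step <= 0:
        raise ValueError("step must be positive")
    span = ub - lb
    if span <= 0 or span % step == 0:
        raise ValueError("nothing to peel")
    split = ub - span % step
    return range(lb, split, step), range(split, ub, step)


main, remainder = peel_loop(0, 10, 4)
print(list(main))       # [0, 4]
print(list(remainder))  # [8]
```

Every iteration of the main loop covers a full step, and the partial last step is isolated in the remainder loop.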

Relands #67482 with an extra fix for transform_loop_ext.py
2023-09-28 14:35:46 +00:00
Goran Flegar
042468bff5 [mlir][SME] Fix unused variable warning 2023-09-28 15:24:27 +02:00
Andrzej Warzyński
0cb0df41d4
[mlir][SME] Add vector.splat -> SME conversion (#67659)
This conversion is identical to vector.broadcast when broadcasting a
scalar.
2023-09-28 13:46:24 +01:00
Cullen Rhodes
8c07d5ec6d
[mlir][vector] don't emit non-rank 1 masked load and store (#67656)
The following patterns

  - TransferReadToVectorLoadLowering
  - TransferWriteToVectorStoreLowering

attempt to generate invalid vector.maskedload and vector.maskedstore ops
for non rank-1 vector types. These ops operate on 1-D vectors. This
patch adds a check to prevent this.
2023-09-28 13:06:50 +01:00
Cullen Rhodes
9816edc9f3
[mlir][vector] add result type to vector.extract assembly format (#66499)
The vector.extract assembly format currently only contains the source
type, for example:

  %1 = vector.extract %0[1] : vector<3x7x8xf32>

it's not immediately obvious if this is the source or result type. This
patch improves the assembly format to make this clearer, so the above
becomes:

  %1 = vector.extract %0[1] : vector<7x8xf32> from vector<3x7x8xf32>
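
The relationship between the two types in the new format is mechanical: the result drops one leading dimension per index. The following Python sketch computes the result type string for static positions; the function name and the list-based shape representation are assumptions made for the example.

```python
def extract_result_type(source_shape, element_type, position):
    """Result type of extracting `position` from a vector of `source_shape`.

    Each index removes one leading dimension. Indexing every dimension
    yields the bare element type.
    """
    if len(position) > len(source_shape):
        raise ValueError("more indices than vector dimensions")
    for index, dim in zip(position, source_shape):
        if not 0 <= index < dim:
            raise ValueError(f"index {index} out of bounds for dimension {dim}")
    remaining = source_shape[len(position):]
    if not remaining:
        return element_type
    dims = "x".join(str(dim) for dim in remaining)
    return f"vector<{dims}x{element_type}>"


source = "vector<3x7x8xf32>"
result = extract_result_type([3, 7, 8], "f32", [1])
print(f"%1 = vector.extract %0[1] : {result} from {source}")
```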
2023-09-28 11:11:16 +01:00
Cullen Rhodes
8e64e9c365
[mlir][ArmSME] Add support for vector.transfer_read with transpose (#67527)
This patch adds support for lowering a vector.transfer_read with a
transpose permutation map to a vertical tile load, for example:

  vector.transfer_read ...  permutation_map: (d0, d1) -> (d1, d0)

is converted to:

  arm_sme.tile_load ... <vertical>

On SME the transpose can be done in-flight, rather than as a separate
operation as in the TransferReadPermutationLowering, which would do the
following:

  %0 = vector.transfer_read ...
  vector.transpose %0, [1, 0] ...

The lowering doesn't support masking yet and the transfer_read must be
in-bounds. It also intentionally doesn't handle simple loads as
transfer_write currently does, as the generic
TransferReadToVectorLoadLowering can lower these to simple vector.load
ops, which can already be lowered to ArmSME.

A subsequent patch will update the existing transfer_write lowering. This
is a separate patch because there is currently no lowering for
vector.transfer_read.
2023-09-28 10:54:20 +01:00
Martin Erhart
6a651c7f44 Revert "[mlir][bufferization] Don't clone on unknown ownership and verify function boundary ABI (#66626)"
This reverts commit aa9eb47da2.
It introduced a double free in a test case. Reverting to have some time
for fixing this and relanding later.
2023-09-28 09:14:46 +00:00
Martin Erhart
aa9eb47da2
[mlir][bufferization] Don't clone on unknown ownership and verify function boundary ABI (#66626)
Inserting clones requires a lot of assumptions to hold on the input IR, e.g., all writes to a buffer need to dominate all reads. This is not guaranteed by one-shot bufferization and isn't easy to verify, thus it could quickly lead to incorrect results that are hard to debug. This commit changes the mechanism of how an ownership indicator is materialized when there is not already a unique ownership present. Additionally, we don't create copies of returned memrefs anymore when we don't have ownership. Instead, we insert assert operations to make sure we have ownership at runtime, or otherwise report to the user that correctness could not be guaranteed.
2023-09-28 10:45:35 +02:00
Ivan R. Ivanov
26eb4285b5
[MLIR][LLVM] Add vararg support in LLVM::CallOp and InvokeOp (#67274)
In order to support indirect vararg calls, we need to have information about the
callee type - this patch adds a `callee_type` attribute that holds that.

The attribute is required for vararg calls. Otherwise it is optional, and when it
is absent the callee type is inferred from the operands and results of the operation.

The syntax for non-vararg calls remains the same, whereas for vararg calls, it
is changed to this:

```
llvm.call %p(%arg0, %arg0) vararg(!llvm.func<void (i32, ...)>) : !llvm.ptr, (i32, i32) -> ()
llvm.call @s(%arg0, %arg0) vararg(!llvm.func<void (i32, ...)>) : (i32, i32) -> ()
```
2023-09-28 08:26:45 +02:00
Yinying Li
256ac4619b
[mlir][sparse] Change tests to use new syntax for ELL and slice (#67569)
Examples:

1. `#ELL = #sparse_tensor.encoding<{ lvlTypes = [ "dense", "dense",
"compressed" ], dimToLvl = affine_map<(i,j)[c] -> (c*4*i, i, j)>
}>`
to
`#ELL = #sparse_tensor.encoding<{ map = [s0](d0, d1) -> (d0 * (s0 * 4) :
dense, d0 : dense, d1 : compressed)
}>`

2. `#CSR_SLICE = #sparse_tensor.encoding<{ lvlTypes = [ "dense",
"compressed" ], dimSlices = [ (1, 4, 1), (1, 4, 2) ]
}>`
to
`#CSR_SLICE = #sparse_tensor.encoding<{ map = (d0 :
#sparse_tensor<slice(1, 4, 1)>, d1 : #sparse_tensor<slice(1, 4, 2)>) ->
(d0 : dense, d1 : compressed)
}>`
2023-09-27 19:40:52 -04:00
Gil Rapaport
a5b4ada6fe Recommit "Add a structured if operation (#67234)"
This patch recommits 126f0374cb, reverted by
3ada774d0f, along with the missing dependence.
2023-09-28 01:52:30 +03:00