According to the specification, the `shift` attribute of the Mul operator in
TOSA is of signless i8 type instead of i32.
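A minimal sketch of the updated attribute (generic op form; operands, shapes,
and values are hypothetical):
```mlir
// Hypothetical example: the shift is now carried as a signless i8 attribute.
%0 = "tosa.mul"(%a, %b) {shift = 1 : i8}
    : (tensor<4xi32>, tensor<4xi32>) -> tensor<4xi32>
```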
Signed-off-by: Georgios Pinitas <georgios.pinitas@arm.com>
The buffer deallocation pipeline was previously incorrect when applied to
functions. It has since been fixed. Make sure it is exercised in the
tutorial to avoid leaking allocations.
Values that are the result of buffer allocation ops are guaranteed *not*
to be the same allocation as the block arguments of containing blocks.
This fact can be used to allow more aggressive simplification of
`bufferization.dealloc` ops.
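A sketch of the kind of IR this enables simplifying (dealloc syntax
approximate, function hypothetical):
```mlir
func.func @f(%arg0: memref<2xf32>, %cond: i1) {
  // %alloc is the result of an allocation op, so it can never be the same
  // allocation as the block argument %arg0; the retained-value handling of
  // the dealloc op below can therefore be simplified.
  %alloc = memref.alloc() : memref<2xf32>
  %ownership = bufferization.dealloc (%alloc : memref<2xf32>) if (%cond)
                 retain (%arg0 : memref<2xf32>)
  return
}
```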
During vectorization of an affine loop, a vector size of 0 causes a crash
when building an invalid AffineForOp. We can catch this case before it
propagates to the assertion.
See: https://github.com/llvm/llvm-project/issues/64262
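A hedged reproducer along these lines (pass and option names assumed from the
in-tree affine super-vectorizer):
```mlir
// Running e.g. `mlir-opt --affine-super-vectorize="virtual-vector-size=0"`
// over a loop like this used to assert instead of rejecting the zero size.
func.func @f(%m: memref<128xf32>) {
  %c = arith.constant 1.0 : f32
  affine.for %i = 0 to 128 {
    affine.store %c, %m[%i] : memref<128xf32>
  }
  return
}
```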
This patch updates AffineParallelOp::verify() to check that each result type
matches its corresponding reduction op (e.g., the result type must be a
`FloatType` if the reduction attribute is `addf`).
affine.parallel crashes on --lower-affine if a result type does not match its
reduction attribute.
```mlir
%128 = affine.parallel (%arg2, %arg3) = (0, 0) to (8, 7) reduce ("maxf") -> (memref<8x7xf32>) {
%alloc_33 = memref.alloc() : memref<8x7xf32>
affine.yield %alloc_33 : memref<8x7xf32>
}
```
This crashes and reports a type conversion issue when we run `mlir-opt --lower-affine`:
```
Assertion failed: (isa<To>(Val) && "cast<Ty>() argument of incompatible type!"), function cast, file Casting.h, line 572.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
Stack dump:
0. Program arguments: mlir-opt --lower-affine temp.mlir
#0 0x0000000102a18f18 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/workspacebin/mlir-opt+0x1002f8f18)
#1 0x0000000102a171b4 llvm::sys::RunSignalHandlers() (/workspacebin/mlir-opt+0x1002f71b4)
#2 0x0000000102a195c4 SignalHandler(int) (/workspacebin/mlir-opt+0x1002f95c4)
#3 0x00000001be7894c4 (/usr/lib/system/libsystem_platform.dylib+0x1803414c4)
#4 0x00000001be771ee0 (/usr/lib/system/libsystem_pthread.dylib+0x180329ee0)
#5 0x00000001be6ac340 (/usr/lib/system/libsystem_c.dylib+0x180264340)
#6 0x00000001be6ab754 (/usr/lib/system/libsystem_c.dylib+0x180263754)
#7 0x0000000106864790 mlir::arith::getIdentityValueAttr(mlir::arith::AtomicRMWKind, mlir::Type, mlir::OpBuilder&, mlir::Location) (.cold.4) (/workspacebin/mlir-opt+0x104144790)
#8 0x0000000102ba66ac mlir::arith::getIdentityValueAttr(mlir::arith::AtomicRMWKind, mlir::Type, mlir::OpBuilder&, mlir::Location) (/workspacebin/mlir-opt+0x1004866ac)
#9 0x0000000102ba6910 mlir::arith::getIdentityValue(mlir::arith::AtomicRMWKind, mlir::Type, mlir::OpBuilder&, mlir::Location) (/workspacebin/mlir-opt+0x100486910)
...
```
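For contrast, a reduction whose result type matches its reduction attribute
(illustrative sketch) passes the new check:
```mlir
// "maxf" reduces to a FloatType result, so this verifies.
%r = affine.parallel (%i) = (0) to (8) reduce ("maxf") -> (f32) {
  %v = arith.constant 1.0 : f32
  affine.yield %v : f32
}
```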
Fixes #64068
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D157985
Variables that point to a physical storage buffer require aliasing
decorations, as specified by the `SPV_KHR_physical_storage_buffer`
extension.
Also add an example of a variable with a decoration attribute.
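A hypothetical sketch of what such a variable might look like (decoration
attribute spelling assumed, not taken verbatim from the patch):
```mlir
// Hypothetical: a pointer into a physical storage buffer carrying an
// aliasing decoration.
spirv.func @f(%arg0: !spirv.ptr<f32, PhysicalStorageBuffer>
    { spirv.decoration = #spirv.decoration<Aliased> }) "None" {
  spirv.Return
}
```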
This partially reverts #66380, dropping the assertion that the underlying
buffer of an EncodingReader is aligned to any required alignments for resource
sections. Resources know their own alignment and pad their buffers
accordingly, but the bytecode reader doesn't know that ahead of time.
Consequently, it cannot give the resource EncodingReader a base buffer
aligned to the maximum required alignment.
A simple example from the test fails without this:
```mlir
module @TestDialectResources attributes {
  bytecode.test = dense_resource<resource> : tensor<4xi32>
} {}

{-#
  dialect_resources: {
    builtin: {
      resource: "0x2000000001000000020000000300000004000000",
      resource_2: "0x2000000001000000020000000300000004000000"
    }
  }
#-}
```
Replace some uses of `Type::getPointerTo` in two ways:
* Remove it entirely if it's only used to support an unnecessary bitcast
(removing the bitcast as well).
* Replace it with `PointerType::get`/`PointerType::getUnqual`.
Part of the NFC opaque pointer clean-up effort.
[MLIR] Add stage and effectOnFullRegion to side effects
This patch adds stage and effectOnFullRegion to side effects so that
optimization passes can obtain more accurate information.
Stage uses numbering to track the stage at which a side effect occurs.
EffectOnFullRegion indicates whether the effect acts on every single value of
the resource.
RFC discussion: https://discourse.llvm.org/t/rfc-add-effect-index-in-memroy-effect/72235
Reviewed By: mehdi_amini, Mogball
Differential Revision: https://reviews.llvm.org/D156087
SerializeToHsaco, as currently implemented, leaks the file descriptor
of the .hsaco temporary file, which causes issues in long-running
parallel compilation setups.
See also https://github.com/ROCmSoftwarePlatform/rocMLIR/pull/1257
-- SoftmaxOp's `reifyResultShapes` function was wrongly casting it to a
`LinalgOp`.
-- This commit thus adds a fix to SoftmaxOp's reified result shape
calculation.
Signed-off-by: Abhishek Varma <abhishek@nod-labs.com>
Added an extra comment that should clarify the need for an insertion guard
when using `getLoopOverTileSlices`. Also removed some redundant calls to
`setInsertionPointAfter`; the insertion guard would overwrite that on
destruction anyway.
This adds a simple higher-level op for the tile slice to vector
intrinsics (and updates the existing vector.print lowering to use it).
This op will be used a few more times to implement vector.insert/extract
lowerings in later patches.
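A usage sketch of such an op (op name and syntax assumed for illustration, not
verbatim from the patch):
```mlir
// Hypothetical: read tile slice %row of a 32-bit SME tile into a
// scalable vector.
%slice = arm_sme.move_tile_slice_to_vector %tile[%row]
    : vector<[4]xf32> from vector<[4]x[4]xf32>
```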
This patch adds the OutlineableOpenMPOpInterface to omp.target. This prevents other operations inside the target region, such as WSLoop, from hoisting new allocas outside the region.
Define operations that wrap the gfx940's new operations for converting
between f32 and registers containing packed sets of four 8-bit floats.
Define rocdl operations for the intrinsics and an AMDGPU dialect
wrapper around them (to account for the fact that MLIR distinguishes
the two float formats at the type level but that the LLVM IR does
not).
Define an ArithToAMDGPU pass, meant to run before conversion to LLVM,
that replaces relevant calls to arith.extf and arith.truncf with the
packed operations in the AMDGPU dialect. Note that the conversion
currently only handles scalars and vectors of rank <= 1, as we do not
have a use case for multi-dimensional vector support right now.
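For illustration, the kind of arith conversions the pass targets (types
illustrative; gfx940 works on the FNUZ 8-bit float variants):
```mlir
// Scalar f8 <-> f32 conversions as emitted by arith; the pass replaces
// these with the packed AMDGPU equivalents.
%e = arith.extf %x : f8E4M3FNUZ to f32
%t = arith.truncf %y : f32 to f8E4M3FNUZ
```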
Reviewed By: jsjodin
Differential Revision: https://reviews.llvm.org/D152457
This patch updates `transform.loop.peel` so that this Op returns two
handles rather than one:
* one for the peeled loop, and
* one for the remainder loop.
Also, following this change, this Op will fail if peeling fails. This is
consistent with other similar Ops that also fail if no transformation
takes place.
Relands #67482 with an extra fix for transform_loop_ext.py
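A usage sketch with the two results (handle names and types assumed for
illustration):
```mlir
// After this change, both the peeled loop and the remainder loop are
// returned as separate handles.
%peeled, %remainder = transform.loop.peel %loop
    : (!transform.op<"scf.for">)
    -> (!transform.op<"scf.for">, !transform.op<"scf.for">)
```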
The following patterns
- TransferReadToVectorLoadLowering
- TransferWriteToVectorStoreLowering
attempt to generate invalid vector.maskedload and vector.maskedstore ops
for non-rank-1 vector types; these ops only operate on 1-D vectors. This
patch adds a check to prevent this.
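For reference, these ops are inherently 1-D; a valid use looks like this
(sketch, values hypothetical):
```mlir
// vector.maskedload reads a 1-D vector; rank-2 vector types are invalid here.
%v = vector.maskedload %base[%i], %mask, %pass_thru
    : memref<?xf32>, vector<8xi1>, vector<8xf32> into vector<8xf32>
```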
The vector.extract assembly format currently only contains the source
type, for example:
%1 = vector.extract %0[1] : vector<3x7x8xf32>
It's not immediately obvious whether this is the source or the result type. This
patch improves the assembly format to make this clearer, so the above
becomes:
%1 = vector.extract %0[1] : vector<7x8xf32> from vector<3x7x8xf32>
This patch adds support for lowering a vector.transfer_read with a
transpose permutation map to a vertical tile load, for example:
vector.transfer_read ... permutation_map: (d0, d1) -> (d1, d0)
is converted to:
arm_sme.tile_load ... <vertical>
On SME the transpose can be done in-flight, rather than as a separate
operation as in the TransferReadPermutationLowering, which would do the
following:
%0 = vector.transfer_read ...
vector.transpose %0, [1, 0] ...
The lowering doesn't support masking yet and the transfer_read must be
in-bounds. It also intentionally doesn't handle simple loads as
transfer_write currently does, as the generic
TransferReadToVectorLoadLowering can lower these to simple vector.load
ops, which can already be lowered to ArmSME.
A subsequent patch will update the existing transfer_write lowering,
this is a separate patch as there is currently no lowering for
vector.transfer_read.
Inserting clones requires a lot of assumptions to hold on the input IR, e.g., all writes to a buffer need to dominate all reads. This is not guaranteed by one-shot bufferization and isn't easy to verify, thus it could quickly lead to incorrect results that are hard to debug.

This commit changes the mechanism of how an ownership indicator is materialized when there is not already a unique ownership present. Additionally, we don't create copies of returned memrefs anymore when we don't have ownership. Instead, we insert assert operations to make sure we have ownership at runtime, or otherwise report to the user that correctness could not be guaranteed.
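A sketch of the new behavior, assuming the runtime check is emitted as a
`cf.assert` (ops and message illustrative):
```mlir
// Instead of cloning the returned buffer when ownership is unknown,
// a runtime check is emitted.
cf.assert %has_ownership, "expected ownership of returned memref"
return %ret : memref<?xf32>
```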
In order to support indirect vararg calls, we need to have information about the
callee type; this patch adds a `callee_type` attribute that holds it.
The attribute is required for vararg calls; otherwise, it is optional and the
callee type is inferred from the operands and results of the operation if not
present.
The syntax for non-vararg calls remains the same, whereas for vararg calls, it
is changed to this:
```mlir
llvm.call %p(%arg0, %arg0) vararg(!llvm.func<void (i32, ...)>) : !llvm.ptr, (i32, i32) -> ()
llvm.call @s(%arg0, %arg0) vararg(!llvm.func<void (i32, ...)>) : (i32, i32) -> ()
```