The generated code for dependent dialects is awkwardly formatted, making
the code harder to read. This change reformats the whitespace to align
code in its context and avoid unnecessary empty lines.
Also included are some typo fixes.
Below are examples of the codegen for a dialect before and after the
change.
Before:
```
GPUDialect::GPUDialect(::mlir::MLIRContext *context)
: ::mlir::Dialect(getDialectNamespace(), context, ::mlir::TypeID::get<GPUDialect>()) {
getContext()->loadDialect<arith::ArithDialect>();
initialize();
}
```
After:
```
GPUDialect::GPUDialect(::mlir::MLIRContext *context)
: ::mlir::Dialect(getDialectNamespace(), context, ::mlir::TypeID::get<GPUDialect>()) {
getContext()->loadDialect<arith::ArithDialect>();
initialize();
}
```
Below are examples of the codegen for a pass before and after the
change.
Before:
```
/// Return the dialect that must be loaded in the context before this pass.
void getDependentDialects(::mlir::DialectRegistry &registry) const override {
registry.insert<func::FuncDialect>();
registry.insert<tensor::TensorDialect>();
registry.insert<tosa::TosaDialect>();
}
```
After:
```
/// Register the dialects that must be loaded in the context before this pass.
void getDependentDialects(::mlir::DialectRegistry &registry) const override {
registry.insert<func::FuncDialect>();
registry.insert<tensor::TensorDialect>();
registry.insert<tosa::TosaDialect>();
}
```
In addition to the existing `OpHandle`, which provides an abstraction to
emit transform ops targeting operations, this introduces a similar
concept for _values_ and _parameters_ in the form of `ValueHandle` and
`ParamHandle`.
New core transform abstractions:
- `constant_param`
- `OpHandle.get_result`
- `OpHandle.print`
- `ValueHandle.get_defining_op`
This commit changes the MLIR to LLVM IR export to also attach subprogram
debug attachments to function declarations.
This commit additionally fixes the two passes that produce subprograms to
not attach the "Definition" flag to function declarations. This
otherwise results in invalid LLVM IR.
`buildUnrealizedCast` used to generate invalid
`builtin.unrealized_conversion_cast` ops with zero results. This commit
fixes
`test/Conversion/OneToNTypeConversion/one-to-n-type-conversion.mlir`
when running with `MLIR_ENABLE_EXPENSIVE_PATTERN_API_CHECKS`.
```
* Pattern (anonymous namespace)::ConvertMakeTupleOp : 'test.make_tuple -> ()' {
Trying to match "(anonymous namespace)::ConvertMakeTupleOp"
[...]
"(anonymous namespace)::ConvertMakeTupleOp" result 1
} -> success : pattern applied successfully
// *** IR Dump After Pattern Application ***
mlir-asm-printer: Verifying operation: func.func
'builtin.unrealized_conversion_cast' op expected at least one result for cast operation
mlir-asm-printer: 'func.func' failed to verify and will be printed in generic form
"func.func"() <{function_type = (i1, i2) -> (i1, i2), sym_name = "pack_unpack"}> ({
^bb0(%arg0: i1, %arg1: i2):
%0 = "test.make_tuple"() : () -> tuple<>
"builtin.unrealized_conversion_cast"(%0) {"__one-to-n-type-conversion_cast-kind__" = "target"} : (tuple<>) -> ()
[...]
}) : () -> ()
within split at /usr/local/google/home/springerm/mlir_public/llvm-project/mlir/test/Conversion/OneToNTypeConversion/one-to-n-type-conversion.mlir:1 offset :20:8: error: 'builtin.unrealized_conversion_cast' op expected at least one result for cast operation
%0 = "test.make_tuple"() : () -> tuple<>
^
within split at /usr/local/google/home/springerm/mlir_public/llvm-project/mlir/test/Conversion/OneToNTypeConversion/one-to-n-type-conversion.mlir:1 offset :20:8: note: see current operation: "builtin.unrealized_conversion_cast"(%0) {"__one-to-n-type-conversion_cast-kind__" = "target"} : (tuple<>) -> ()
LLVM ERROR: IR failed to verify after pattern application
```
We implement two functions that are needed to compute the constant term
of a generating function (GF).
One finds a vector not orthogonal to all the non-null vectors in a given
set.
One computes the coefficient of any term in an arbitrary rational
function (quotient of two polynomials).
Before applying the peeling patterns, it can happen that the `ForOp`
gets a step of zero during folding. This leads to a division-by-zero
down the line.
This patch adds an additional check for a constant-zero step and a
test.
Fix https://github.com/llvm/llvm-project/issues/75758
The simplification of bufferization.clone generates a memref.cast op, but the
checks in the simplification do not verify whether the operand and result
types of the clone op are compatible, leading to errors. This patch
addresses this issue.
This adds very basic (and inelegant) support for something like spilling
and reloading tiles, if you use more SME tiles than physically exist.
This is purely implemented to prevent the compiler from aborting if a
function uses too many tiles (i.e. due to bad unrolling), but is
expected to perform very poorly.
Currently, this works in two stages:
During tile allocation, if we run out of tiles instead of giving up, we
switch to allocating 'in-memory' tile IDs. These are tile IDs that start
at 16 (which is higher than any real tile ID). A warning will also be
emitted for each (root) tile op assigned an in-memory tile ID:
```
warning: failed to allocate SME virtual tile to operation, all tile operations will go through memory, expect degraded performance
```
Everything after this works as normal until `-convert-arm-sme-to-llvm`.
Here the in-memory tile op:
```mlir
arm_sme.tile_op { tile_id = <IN MEMORY TILE> }
```
is lowered to:
```mlir
// At function entry:
%alloca = memref.alloca ... : memref<?x?xty>
// Around the op:
// Swap the contents of %alloca and tile 0.
scf.for %slice_idx {
%current_slice = "arm_sme.intr.read.horiz" ... <{tile_id = 0 : i32}>
"arm_sme.intr.ld1h.horiz"(%alloca, %slice_idx) <{tile_id = 0 : i32}>
vector.store %current_slice, %alloca[%slice_idx, %c0]
}
// Execute op using tile 0.
arm_sme.tile_op { tile_id = 0 }
// Swap the contents of %alloca and tile 0.
// This restores tile 0 to its original state.
scf.for %slice_idx {
%current_slice = "arm_sme.intr.read.horiz" ... <{tile_id = 0 : i32}>
"arm_sme.intr.ld1h.horiz"(%alloca, %slice_idx) <{tile_id = 0 : i32}>
vector.store %current_slice, %alloca[%slice_idx, %c0]
}
```
This is inserted during the lowering to LLVM, as spilling/reloading
registers is a very low-level concept that can't really be modeled
correctly at a high level in MLIR.
Note: This always does the worst-case full-tile swap. This could be
optimized to only spill/load the data the tile op will use, which could be
just a slice. It also makes no use of liveness, which could allow reusing
tiles. But this is not seen as important, as correct code should only use
the available number of tiles.
Introduce a new extension for simple print-debugging of transform
dialect scripts. The initial version of this extension consists of two
ops that print the payload objects associated with transform dialect
values. Similar ops were already available in the test extension and
several downstream projects, and were extensively used for testing.
Support distribution of `vector.transfer_read` ops when operands are
defined inside of the region of `warp_execute_on_lane_0` (except for the
buffer from which the op is reading).
Such IR was previously not supported. This commit changes the
implementation such that indices and the padding value are also
distributed.
This commit simplifies the implementation considerably: the original
implementation created a new `transfer_read` op and then checked if this
new op is valid. If not, the rewrite pattern failed. This was a bit
hacky. It was also a violation of the rewrite pattern API (detected by
`MLIR_ENABLE_EXPENSIVE_PATTERN_API_CHECKS`) because the IR was modified,
but the pattern returned "failure".
Rename interface functions as follows:
* `hasTensorSemantics` -> `hasPureTensorSemantics`
* `hasBufferSemantics` -> `hasPureBufferSemantics`
These two functions return "true" if the op has tensor/buffer operands
but not buffer/tensor operands.
Also drop the "ranked" part from the interface, i.e., do not distinguish
between ranked/unranked types.
The new function names describe the functions more accurately. They also
align their semantics with the notion of "tensor semantics" in the
bufferization framework. (An op is supposed to be bufferized if it has
tensor operands, and we don't care if it also has memref operands.)
This change is in preparation of #75273, which adds
`BufferizableOpInterface::hasTensorSemantics`. By renaming the functions
in the `DestinationStyleOpInterface`, we can avoid name clashes between
the two interfaces.
In the process, a couple of test transform dialect ops are added just
for testing. These operations are not intended to be used as fully
fleshed-out transformation ops, but are rather operations added for testing.
A separate operation is added to `LinalgTransformOps.td` to convert a
`TilingInterface` operation to loops using the
`generateScalarImplementation` method implemented by the
operation. Eventually this and other operations related to tiling
using the `TilingInterface` need to move to a better place (i.e. out
of the `Linalg` dialect).
The new flag, `--mlir-print-skip-regions`, sets the op printing option
that disables region printing. This results in the usual
`--mlir-print-ir-*` debug options printing only the names of the
executed passes and the signatures of the ops.
Example:
```mlir
// -----// IR Dump Before CSE (cse) //----- //
func.func @bar(%arg0: f32, %arg1: f32) -> f32 {...}
// -----// IR Dump Before Canonicalizer (canonicalize) //----- //
func.func @bar(%arg0: f32, %arg1: f32) -> f32 {...}
```
The main use case is to triage compilation issues (crashes, slowness)
on very deep pass pipelines and with very large IR files, where printing
the IR is otherwise prohibitively slow.
The IR representation for gang, vector and worker has grown with the
support for device_type. This patch simplifies the IR representation for
the gang, vector and worker information on the acc.loop operation.
When only the keyword is present without any values, the information
is printed in the same place as when there are values. The device_type
is omitted if there are no values and it is equal to None. Otherwise the
full information is displayed: first the keyword-only device_type
information, and then the values with their device_type.
Currently, the `memref.transpose` verifier forces the result type of the
Op to have an explicit `StridedLayoutAttr` via the method
`inferTransposeResultType`. This means that the example Op
given in the documentation is actually invalid because it uses an `AffineMap`
to specify the layout.
It also means that we can't "un-transpose" a transposed memref back to
the implicit layout form, because the verifier will always enforce the
explicit strided layout.
This patch makes the following changes:
1. The verifier checks whether the canonicalized strided layout of the
result Type is identical to the canonicalized inferred result type
layout. This way, it's only important that the two Types have the same
strided layout, not necessarily the same representation of it.
2. The folder is extended to support folding away the trivial case of
identity permutation and to fold one transposition into another by
composing the permutation maps.
The folder for `AffineApplyOp` will try creating a `PoisonAttr`
under certain circumstances. However, this will result in a crash if the
`UBDialect` isn't loaded.
This patch adds a dependency of `AffineDialect` on `UBDialect`.
```
llvm-project/mlir/lib/Conversion/TosaToLinalg/TosaToLinalg.cpp:2376:9: error: object backing the pointer will be destroyed at the end of the full-expression [-Werror,-Wdangling-gsl]
tensor::getMixedSizes(rewriter, loc, input_real);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
llvm-project/mlir/lib/Conversion/TosaToLinalg/TosaToLinalg.cpp:2366:10: error: unused variable 'imag_el_ty' [-Werror,-Wunused-variable]
auto imag_el_ty = cast<FloatType>(
^
2 errors generated.
```
Clarify what kind of IR modifications are allowed. Also improve the
documentation of the greedy rewrite driver entry points.
Addressing comments in #76219.
1. Fix a bug in verifyMemref to pass in `data` instead of `baseptr`,
which didn't verify data correctly.
2. Add `==` for f16 and bf16.
3. Add a comprehensive test of verifyMemref for all supported types.
Add a rewriter for DIExpressions and use it to run legalization patterns
before exporting to LLVM (because the LLVM dialect allows DIExpressions
that may not be valid in LLVM IR).
The rewriter driver works similarly to the existing MLIR rewriter
drivers, except it operates on lists of DIExpressionElemAttr (i.e.
DIExpressionAttr). Each rewrite pattern transforms a range of
DIExpressionElemAttr into a new list of DIExpressionElemAttr.
In addition, this PR sets up a place to add legalization patterns that
are broadly applicable internally to the LLVM dialect, and they will
always be applied prior to export. This PR adds one pattern for merging
fragment operators.
---------
Co-authored-by: Tobias Gysi <tobias.gysi@nextsilicon.com>
Rationale:
Since this mini-pipeline may be used in alternative pipelines (viz.
different from the default "sparsifier" pipeline) where unknown ops are
handled by alternative bufferization methods that are downstream of this
mini-pipeline, we allow unknown ops by default (failure to bufferize is
eventually apparent by failing to convert to LLVM IR).
This is part of enabling e2e testing for TORCH-MLIR tests using a
sparsifier backend.
We implement a function that computes the generating function
corresponding to a unimodular cone.
The generating function for a polytope is obtained by summing these
generating functions over all tangent cones.
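As a sketch of the underlying identity (a standard Brion-type decomposition; the notation below is ours, not taken from the patch):
```
% Sketch: the lattice-point generating function of a polytope P decomposes as
% the sum of the generating functions of its tangent cones C_v at the vertices v,
% where t^x denotes the monomial t_1^{x_1} ... t_d^{x_d}.
\sum_{x \in P \cap \mathbb{Z}^d} t^x
  = \sum_{v \in \operatorname{vert}(P)} \; \sum_{x \in C_v \cap \mathbb{Z}^d} t^x
```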
Store the last token parsed in the parser state so that the parsed range
can use its end rather than the start of the next token parsed.
This results in a tighter range (especially true in the case of
comments, see
```mlir
|%c4 = arith.constant 4 : index
// Foo
|
```
vs
```mlir
|%c4 = arith.constant 4 : index|
```
).
Discovered while working on a little textual post processing tool.
This patch updates the setmaxregister NVVM Op to use the
intrinsics instead of inline-ptx.
* The interface remains the same (as expected).
* Tests are added to verify the lowered intrinsics in
Target/LLVMIR/nvvmir.mlir.
Signed-off-by: Durgadoss R <durgadossr@nvidia.com>
Change Rescale shift attribute to be DenseI8ArrayAttr to match spec
(instead of DenseI32ArrayAttr)
This replaces https://reviews.llvm.org/D157439
Signed-off-by: Tai Ly <tai.ly@arm.com>
Pack and unpack ops can be simplified to reshape ops if outer_dims_perm is an
identity permutation. The revision adds an `isIdentityPermutation` method to
IndexingUtils.
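For illustration, here is a hypothetical pack that is a pure expansion of the innermost dimension (identity `outer_dims_perm`), together with the reshape it corresponds to; whether the simplification fires on a given case still depends on its other preconditions:
```mlir
// Only the innermost dimension is tiled and outer_dims_perm is the identity,
// so this pack just splits d1 = 256 into 8x32 without any transposition.
%packed = tensor.pack %src inner_dims_pos = [1] inner_tiles = [32]
    into %dest : tensor<128x256xf32> -> tensor<128x8x32xf32>

// The equivalent reshape of the same data.
%reshaped = tensor.expand_shape %src [[0], [1, 2]]
    : tensor<128x256xf32> into tensor<128x8x32xf32>
```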
Following on from #75842, we can demonstrate that loop peeling combined
with masked vectorisation and existing canonicalization for vector.mask
operations leads to the following loop structure:
```
// M dimension
scf.for 1:M
// N dimension (contains vector ops _without_ masking)
scf.for 1:UB
// K dimension
scf.for 1:K
vector.add
// N dimension (contains vector ops _with_ masking)
scf.for UB:N
// K dimension
scf.for 1:K
vector.mask { vector.add }
```
This is particularly beneficial for scalable vectors which normally
require masking. This example demonstrates how to avoid them.
* Rename mesh.process_index -> mesh.process_multi_index.
* Add mesh.process_linear_index op.
* Add lowering of mesh.process_multi_index into an expression using
mesh.process_linear_index, mesh.cluster_shape and
affine.delinearize_index.
This is useful to lower mesh ops and prepare them for further lowering
where the runtime may have only the linear index of a device/process.
For example in MPI we have a rank (linear index) in a communicator.
Add a check to validate that the schedule passed to the pipeliner
transformation is valid and won't cause the pipeliner to break SSA.
This checks that each operation in the loop is scheduled after its
operands.
This operation provides a convenient way to query the streaming vector
length regardless of the streaming mode. This is most useful for functions
that call/pass data to streaming functions, but are not streaming
themselves.
Example:
```mlir
%svl_w = arm_sme.streaming_vl <word>
```
Created based on discussion here:
https://github.com/llvm/llvm-project/pull/76086#discussion_r1434226352
If the -nogpulib option is passed by the user, then the OpenMP device
runtime is not used and we should not emit globals to configure
debugging at compile-time for the device runtime.
Link to -nogpulib flag implementation for Clang:
https://reviews.llvm.org/D125314
Add overflow flags support to the following ops:
* `arith.addi`
* `arith.subi`
* `arith.muli`
Example of new syntax:
```
%res = arith.addi %arg1, %arg2 overflow<nsw> : i64
```
Similar to existing LLVM dialect syntax
```
%res = llvm.add %arg1, %arg2 overflow<nsw> : i64
```
Tablegen canonicalization patterns are updated to always drop the flags;
proper support with tests will be added later.
The LLVMIR translation is also updated as part of this commit, as it is
currently written in a way that would otherwise crash when new attributes
are added to arith ops.
Discussion
https://discourse.llvm.org/t/rfc-integer-overflow-flags-support-in-arith-dialect/76025
---------
Co-authored-by: Yi Wu <yi.wu2@arm.com>
After PR#75548, the OpenACC documentation on the MLIR website has a few
issues. This change corrects them:
- Renames OpenACC.md to OpenACCDialect.md so that links remain
unchanged. In its current state, the links to
https://mlir.llvm.org/docs/Dialects/OpenACCDialect/ no longer work.
- Since the old OpenACCDialect.md (the one with operation definitions)
is being included in the new file, rename the old file to prevent name
ambiguity.
- A header is needed in the .md file, otherwise the index on the website is
not properly created.
- Add a new section before including the operations .md file because
otherwise the separation is not clear.
These tests were initially pushed together with
https://github.com/llvm/llvm-project/pull/75864 but they were triggering
some buildbot failures (sanitizers). They now make use of the
`OwningOpRef` so all the resources are correctly destroyed at the end of
each test.
They will be extended to include all the extra getter functions added
with device_type support.
Since vector loads and stores from scalar memrefs translate to
llvm.load/store, add the ability to tag said loads and stores as
nontemporal. This mirrors functionality available in memref.load/store.
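A minimal sketch of the intended use; the attribute spelling is assumed here to match memref.load/store's `nontemporal`:
```mlir
// Vector load/store on a scalar memref tagged as nontemporal (assumed
// spelling), so the generated llvm.load/store carries the nontemporal hint.
%v = vector.load %buf[%i] {nontemporal = true} : memref<1024xf32>, vector<4xf32>
vector.store %v, %buf[%i] {nontemporal = true} : memref<1024xf32>, vector<4xf32>
```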
This document captures the design philosophy of the acc dialect. It also
shares the rationale behind the design and implementation of various
operations - and ties that back to the dialect design goals.
Co-authored-by: Valentin Clement <clementval@gmail.com>
Co-authored-by: Slava Zakharin <szakharin@nvidia.com>
This patch is based on a previous PR https://reviews.llvm.org/D144657
that added alloca address space handling to MLIR's DataLayout and DLTI
interface. This patch aims to add identical features to import and
access the global and program memory space through MLIR's
DataLayout/DLTI system.
Introduce a new match combinator into the transform dialect. This
operation collects all operations that are yielded by a satisfactory
match into its results. This is a simpler version of `foreach_match`
that can be inserted directly into existing transform scripts.
This adds MLIR versions of the Arm streaming vector length intrinsics.
These allow reading the streaming vector length regardless of the
streaming mode.
As per the docs [1]:
```
In absence of an explicit layout, a memref is considered to have a
multi-dimensional identity affine map layout.
```
This patch makes sure that MemRefs with no strides (i.e. no explicit
layout) are treated as contiguous when checking whether a particular
vector is a contiguous slice of the given MemRef.
[1] https://mlir.llvm.org/docs/Dialects/Builtin/#layout
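For example (an illustrative case, not a test from this patch), the memref below has no explicit layout, i.e. the identity layout, so a vector covering its trailing dimensions is a contiguous slice:
```mlir
// Identity (implicit) layout: the trailing 2x4 elements are contiguous in
// memory, so this read is a contiguous slice of %mem.
%v = vector.transfer_read %mem[%c0, %c0, %c0], %pad {in_bounds = [true, true]}
    : memref<8x2x4xf32>, vector<2x4xf32>
```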
Follow-up for #76428.
This PR adds promised interface declarations for
`ConvertToLLVMPatternInterface` in all the dialects that support the
`ConvertToLLVM` dialect extension.
Promised interfaces allow a dialect to declare that it will have an
implementation of a particular interface, crashing the program if one
isn't provided when the interface is used.
This extension has been superseded by SPV_KHR_cooperative_matrix, which
is supported across major GPU vendors like Nvidia, AMD, and Intel.
Given that the KHR version has been supported for nearly half a year,
drop the NV-specific extension to reduce the maintenance burden and code
duplication.
Currently the `tileConsumerAndFuseProducerGreedilyUsingSCFFor` method
greedily fuses through all slices that are generated during the tile and
fuse flow. That is not the normal use case. Ideally the caller would
like to control which slices get fused and which don't. This patch
introduces a new field to the `SCFTileAndFuseOptions` to specify this
control.
The control function also allows the caller to specify if the replacement
for the fused producer needs to be yielded from within the tiled
computation. This allows replacing the fused producers in case they have
other uses. Without this, the original producers still survive, negating
the utility of the fusion.
The change here also means that the name of the function
`tileConsumerAndFuseProducerGreedily...` can be updated. Deferring that
to a later stage to reduce the churn of API changes.
Make GreedyPatternRewriteDriver handle failures of `materializeConstant`
gracefully. Previously it was not checking whether the returned op was
null and crashing. This PR handles it similarly to how OperationFolder
does it.
Setting thread block size with `maxntid` on the kernel has great
performance benefits. In this way, downstream PTX compiler can do better
register allocation.
MLIR's `gpu.launch` and `gpu.launch_func` already have an attribute
(`known_block_size`) that keeps the thread block size when it is known.
This PR simply uses this attribute to set `maxntid`.
This revision updates the LLVM dialect inliner to explicitly disallow
the inlining of variadic functions. Previously, inlining already
failed if the number of function arguments did not match the number of
call arguments. After the change, inlining checks that the function is not
variadic and that it does not contain a va_start intrinsic.
This commit adds an optional distinct attribute parameter to the
DISubprogramAttr. This enables modeling of distinct subprograms, as
required for LLVM IR. This change is required to avoid accidental
uniquing of subprograms on functions, which would lead to invalid LLVM IR
post export.
This commit adds a distinct attribute parameter to the DICompileUnit to
enable the modeling of distinctness. LLVM requires DICompileUnits to be
distinct, and there are cases where one gets two equivalent compilation
units but LLVM still requires them to be distinct. We observed such
cases for combinations of LTO and inline functions.
This patch also changes the DIScopeForLLVMFuncOp pass to a module pass,
to ensure that only one distinct DICompileUnit is created, instead of
one for each function.
Also improve the implementation of `findCommonDominator` (skip duplicate
blocks) and extract it from `BufferPlacementTransformationBase` (so that
`BufferPlacementTransformationBase` can be retired eventually).
`BufferPlacementTransformationBase::isLoop` checks if there is a loop in
the region branching graph of an operation. This algorithm is similar to
`isRegionReachable` in the `RegionBranchOpInterface`. To avoid duplicate
code, `isRegionReachable` is generalized, so that it can be used to
detect region loops. A helper function
`RegionBranchOpInterface::hasLoop` is added.
This change also turns a recursive implementation into an iterative one,
which is the preferred implementation strategy in LLVM.
Also move the `isLoop` to `BufferOptimizations.cpp`, so that we can
gradually retire `BufferPlacementTransformationBase`. (This is so that
proper error handling can be added to `BufferViewFlowAnalysis`.)
We add some basic type aliases and function definitions relating to
cones for Barvinok's algorithm.
These include functions to get the dual of a cone and find its index.
see #73359
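For context, the two notions, in one common sign convention (a sketch, not taken from the patch):
```
% The dual of a cone C \subseteq \mathbb{R}^d, and the index of a
% full-dimensional simplicial cone generated by u_1, ..., u_d (the number of
% lattice points in its fundamental parallelepiped).
C^{*} = \{\, y \in \mathbb{R}^d : \langle x, y \rangle \ge 0 \ \forall x \in C \,\}
\qquad
\operatorname{ind}(C) = \lvert \det(u_1, \dots, u_d) \rvert
```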
Declarative assemblyFormat ODS is more concise and requires less
boilerplate than filling out CPP interfaces.
Changes:
* Updates the Ops defined in `SPIRVAtomicOps.td` to use assemblyFormat.
* Removes print/parse from `AtomicOps.cpp`, which is now generated by
assemblyFormat.
* Adds a `Trait` to verify that a pointer operand `foo`'s pointee type
matches operand `bar`'s type.
  * Updates the error message expected in tests from the new Trait.
* Updates tests to the updated format (largely using <operand> in place of
"operand").
Changes include:
- spirv serialization and deserialization need handling for cases when a
GlobalVariableOp initializer is defined using a spirv SpecConstant or
SpecConstantComposite op. Currently, even though SpecConst is allowed as an
initializer, only the GlobalVariable map was looked up to find the
initializer symbol reference. This change fixes this and extends the
support to SpecConstantComposite as an initializer.
- Adds tests to make sure GlobalVariable can be initialized using
specialized constants.
---------
Co-authored-by: Lei Zhang <antiagainst@gmail.com>
This PR improves the documentation for the `gpu-lower-to-nvvm-pipeline`
(as it was the remaining item for #75775).
- Changes pipeline `gpu-lower-to-nvvm` -> `gpu-lower-to-nvvm-pipeline`
- Adds a section in the GPU Dialect page on the website. It clarifies the
pipeline's functionality in lowering primary dialects to NVVM targets.
According to
https://mlir.llvm.org/docs/DefiningDialects/Operations/#custom-directives,
custom directive supports attr-dict
> attr-dict Directive: NamedAttrList &
But it doesn't support prop-dict, which was introduced into MLIR recently.
It's useful to have tblgen support prop-dict like attr-dict. This PR
enables tblgen to support prop-dict.
```bash
error: only variables and types may be used as parameters to a custom directive
... custom<Print>(prop-dict)
```
Co-authored-by: Fung Xie <ftse@nvidia.com>
This PR improves the functionality of the `nvgpu.tma.async.load` Op by
adding support for multicast. While we already had this capability in
the lower-level `nvvm.cp.async.bulk.tensor.shared.cluster.global` NVVM
Op, this PR lowers mask information to the NVVM operation.
The `GreedyPatternRewriteDriver` tries to iteratively fold ops and apply
rewrite patterns to ops. It has special handling for constants: they are
CSE'd and sometimes moved to parent regions to allow for additional
CSE'ing. This happens in `OperationFolder`.
To allow for efficient CSE'ing, `OperationFolder` maintains an internal
lookup data structure to find the existing constant ops with the same
value for each `IsolatedFromAbove` region:
```c++
/// A mapping between an insertion region and the constants that have been
/// created within it.
DenseMap<Region *, ConstantMap> foldScopes;
```
Rewrite patterns are allowed to modify operations. In particular, they
may move operations (including constants) from one region to another
one. Such an IR rewrite can make the above lookup data structure
inconsistent.
We encountered such a bug in a downstream project. This bug materialized
in the form of an op that uses the result of a constant op from a
different `IsolatedFromAbove` region (that is not accessible).
This commit changes the behavior of the `GreedyPatternRewriteDriver`
such that `OperationFolder` is used to CSE constants at the beginning of
each iteration (as the worklist is populated), but no longer during an
iteration. `OperationFolder` is no longer used after populating the
worklist, so we do not have to care about inconsistent state in the
`OperationFolder` due to IR rewrites. The `GreedyPatternRewriteDriver`
now performs the op folding by itself instead of calling
`OperationFolder::tryToFold`.
This change changes the order of constant ops in test cases, but not the
region in which they appear. All broken test cases were fixed by turning
`CHECK` into `CHECK-DAG`.
Alternatives considered: The state of `OperationFolder` could be
partially invalidated with every `notifyOperationModified` notification.
That is more fragile than the solution in this commit because incorrect
rewriter API usage can lead to missing notifications and hard-to-debug
`IsolatedFromAbove` violations. (It did not fix the above-mentioned bug in
a downstream project, which could be due to incorrect rewriter API usage
or due to another conceptual problem that I missed.) Moreover, ops are
frequently getting modified during a greedy pattern rewrite, so we would
likely keep invalidating large parts of the state of `OperationFolder`
over and over.
Migration guide: Turn `CHECK` into `CHECK-DAG` in test cases. Constant
ops are no longer folded during a greedy pattern rewrite. If you rely on
folding (and rematerialization) of constant ops during a greedy pattern
rewrite, turn the folder into a pattern.
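For example, an affected test would typically change along these lines (hypothetical FileCheck lines, for illustration only):
```mlir
// Before: the constants had to appear in exactly this order.
// CHECK:     %[[C0:.*]] = arith.constant 0 : index
// CHECK:     %[[C1:.*]] = arith.constant 1 : index

// After: the constants may appear in either order.
// CHECK-DAG: %[[C0:.*]] = arith.constant 0 : index
// CHECK-DAG: %[[C1:.*]] = arith.constant 1 : index
```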
Support WalkResult for AffineExpr walk and support interrupting walks
along the lines of Operation::walk. This allows interrupted walks when a
condition is met. Also, switch from std::function to llvm::function_ref
for the walk function.
This is adding support for `device_type` clause representation in the
OpenACC MLIR dialect on the acc.loop operation and adjust flang to lower
correctly to the new representation.
Each "value" that can be impacted by a `device_type` clause is now
associated with an array attribute that carry this information. This
includes:
- `worker` clause information
- `gang` clause information
- `vector` clause information
- `collapse` clause information
- `tile` clause information
The representation of the `gang` clause information has been updated and
all values are now carried in a single operand segment. This segment is
then subdivided by `device_type`. Each value in a segment is also
associated with a `GangArgType` so it can be differentiated
(num/dim/static). This simplifies the handling of gang values and limits
the number of new attributes needed.
When the clause is associated with the operation without any value
(`gang`, `vector`, `worker`), it is represented by a dedicated attribute
with device_type information.
Extra getter functions are provided to make it easier to retrieve a
value based on a device_type.
Update several tests under mlir/test/Dialect/Transform to use the "main"
transform interpreter pass with named entry points rather than the test
interpreter pass.
This helped discover a logic error in the expensive checks mechanism
that was exiting too early.
Print the op and its types when the fold type check fails. This is to
speed up debugging, as it should be trivial to map the offending op to its
folder based on the op name.
This helps support generic manipulation of operations that don't (yet)
use properties to store inherent attributes.
Use this mechanism in type inference and operation equivalence.
Note that only minimal unit tests are introduced as all the upstream
dialects seem to have been updated to use properties and the
non-property behavior is essentially deprecated and untested.
The parser and printer of string attributes were changed to handle
escape sequences. Therefore, we no longer require a custom parser and
printer. Verification is moved from the parser to the verifier
accordingly.
Replace (in tests and docs):
%forall, %tiled = transform.structured.tile_using_forall
with (updated order of return handles):
%tiled, %forall = transform.structured.tile_using_forall
Similar change is applied to (in the TD tutorial):
transform.structured.fuse_into_containing_op
This update makes sure that the tests/documentation are consistent with
the Op specifications. Follow-up for #67320 which updated the order of
the return handles for `tile_using_forall`.
The change in c1eab57 fixed the
behavior of `getDiscardableAttrDictionary` for ops that are not using
properties to only return discardable attributes. Bytecode writer was
relying on the wrong behavior and would assume all attributes are
discardable, without appropriate testing. Fix that and add a test.
The dataflow analysis framework within MLIR allows customizing the
transfer function when a `call-like` operation is encountered.
The check to see if the analysis was executed in intraprocedural mode
was performed after the check to see if the callee had the
CallableOpInterface, and thus intraprocedural analyses would behave as
interprocedural ones when performing indirect calls.
This commit fixes the issue by performing the check for
intraprocedurality first.
Dense forward analyses were already behaving correctly.
https://github.com/llvm/llvm-project/blob/main/mlir/lib/Analysis/DataFlow/DenseAnalysis.cpp#L63
Co-authored-by: massimo <mo.fioravanti@gmail.com>
Make it so that PDL in pattern rewrites can be optionally disabled.
PDL is still enabled by default and not optional in Bazel, so this should
be a NOP for most folks, while enabling others to disable it.
This only works with tests disabled. With tests enabled this still
compiles, but tests fail as there is no lit config to disable tests that
depend on PDL rewrites yet.
Expand the copying of attributes on GPU kernel arguments during LLVM
lowering.
Support copying attributes from values that are already LLVM pointers.
Support copying attributes, like `noundef`, that aren't specific to (the
pointer parts of) arguments.
Make it so that PDL in pattern rewrites can be optionally disabled.
PDL is still enabled by default and not optional in Bazel, so this should
be a NOP for most folks, while enabling others to disable it.
This is piped through the mlir-tblgen invocation; that could be
changed/avoided by splitting up the passes file instead.
This only works with tests disabled. With tests enabled this still
compiles, but tests fail as there is no lit config to disable tests that
depend on PDL rewrites yet.
The change in c1eab57673 fixed the
behavior of `getDiscardableAttrDictionary` for ops that are not using
properties to only return discardable attributes. AsmPrinter was relying
on the wrong behavior when printing such ops in the generic form,
assuming all attributes are discardable.
This PR adds API `makeReproducer` and cl::opt flag
`--mlir-generate-reproducer=<filename>` in order to allow for mlir
reproducer dumps even when the pipeline doesn't crash.
This PR also decouples the code that handles generation of an MLIR
reproducer from the crash recovery portion. The purpose is to allow for
generating reproducers outside of the context of a compiler crash.
This will be useful for frameworks and runtimes that use MLIR where it
is needed to reproduce the pipeline behavior for reasons outside of
diagnosing crashes. An example is for diagnosing performance issues
using offline tools, where being able to dump the reproducer from a
runtime compiler would be helpful.
The change in c1eab57673 fixed the
behavior of `getDiscardableAttrDictionary` for ops that are not using
properties to only return discardable attributes. `InferTypeOpInterface`
was relying on the wrong behavior when constructing an adaptor and would
assume that all attributes were discardable, which is not the case.
When properties are not enabled in an operation, inherent attributes are
stored in the common dictionary with discardable attributes. However,
`getDiscardableAttrs` and `getDiscardableAttrDictionary` were returning
the entire dictionary, making the caller mistakenly believe that all
inherent attributes are discardable. Fix this by filtering out
attributes whose names are registered with the operation, i.e., inherent
attributes. This requires an API change so `getDiscardableAttrs` returns
a filter range.
Fixes https://github.com/llvm/llvm-project/issues/71326.
This is the second PR. The first PR at
https://github.com/llvm/llvm-project/pull/75519 was reverted because an
integration test failed. The failed integration test was simplified and
added to the core MLIR tests. Compared to the first PR, the current PR
uses a more reliable approach. In summary, the current PR determines the
mask indices by looking up the _mask_ buffer load indices from the
previous iteration, whereas `main` looks up the indices for the _data_
buffer. The mask and data indices can differ when using a
`permutation_map`.
The cause of the issue was that a new `LoadOp` was created which looked
something like:
```mlir
func.func main(%arg1 : index, %arg2 : index) {
%alloca_0 = memref.alloca() : memref<vector<1x32xi1>>
%1 = vector.type_cast %alloca_0 : memref<vector<1x32xi1>> to memref<1xvector<32xi1>>
%2 = memref.load %1[%arg1, %arg2] : memref<1xvector<32xi1>>
return
}
```
which crashed inside the `LoadOp::verify`. Note here that `%alloca_0` is
the mask as can be seen from the `i1` element type and note it is 0
dimensional. Next, `%1` has one dimension, but `memref.load` tries to
index it with two indices.
This issue occurred in the following code (a simplified version of the
bug report):
```mlir
#map1 = affine_map<(d0, d1, d2, d3) -> (d0, 0, 0, d3)>
func.func @main(%subview: memref<1x1x1x1xi32>, %mask: vector<1x1xi1>) -> vector<1x1x1x1xi32> {
%c0 = arith.constant 0 : index
%c0_i32 = arith.constant 0 : i32
%3 = vector.transfer_read %subview[%c0, %c0, %c0, %c0], %c0_i32, %mask {permutation_map = #map1}
: memref<1x1x1x1xi32>, vector<1x1x1x1xi32>
return %3 : vector<1x1x1x1xi32>
}
```
After this patch, it is lowered to the following by
`-convert-vector-to-scf`:
```mlir
func.func @main(%arg0: memref<1x1x1x1xi32>, %arg1: vector<1x1xi1>) -> vector<1x1x1x1xi32> {
%c0_i32 = arith.constant 0 : i32
%c0 = arith.constant 0 : index
%c1 = arith.constant 1 : index
%alloca = memref.alloca() : memref<vector<1x1x1x1xi32>>
%alloca_0 = memref.alloca() : memref<vector<1x1xi1>>
memref.store %arg1, %alloca_0[] : memref<vector<1x1xi1>>
%0 = vector.type_cast %alloca : memref<vector<1x1x1x1xi32>> to memref<1xvector<1x1x1xi32>>
%1 = vector.type_cast %alloca_0 : memref<vector<1x1xi1>> to memref<1xvector<1xi1>>
scf.for %arg2 = %c0 to %c1 step %c1 {
%3 = vector.type_cast %0 : memref<1xvector<1x1x1xi32>> to memref<1x1xvector<1x1xi32>>
scf.for %arg3 = %c0 to %c1 step %c1 {
%4 = vector.type_cast %3 : memref<1x1xvector<1x1xi32>> to memref<1x1x1xvector<1xi32>>
scf.for %arg4 = %c0 to %c1 step %c1 {
%5 = memref.load %1[%arg2] : memref<1xvector<1xi1>>
%6 = vector.transfer_read %arg0[%arg2, %c0, %c0, %c0], %c0_i32, %5 {in_bounds = [true]} : memref<1x1x1x1xi32>, vector<1xi32>
memref.store %6, %4[%arg2, %arg3, %arg4] : memref<1x1x1xvector<1xi32>>
}
}
}
%2 = memref.load %alloca[] : memref<vector<1x1x1x1xi32>>
return %2 : vector<1x1x1x1xi32>
}
```
What was causing the problems is that one dimension of the data buffer
`%alloca` (eltype `i32`) is unpacked (`vector.type_cast`) inside the
outermost loop (loop with index variable `%arg2`) and the nested loop
(loop with index variable `%arg3`), whereas the mask buffer `%alloca_0`
(eltype `i1`) is not unpacked in these loops.
Before this patch, the load indices would be determined by looking up
the load indices for the *data* buffer load op. However, as shown in the
specific example, when a permutation map is specified then the load
indices from the data buffer load op start to differ from the indices
for the mask op. To fix this, this patch ensures that the load indices
for the *mask* buffer are used instead.
---------
Co-authored-by: Mehdi Amini <joker.eph@gmail.com>
Add a new transform operation that creates a new parameter containing the number of payload objects (operations, values or attributes) associated with the argument. This is useful in matching and for debugging purposes. This replaces three ad-hoc operations previously provided by the test extension.