This revision adds support for empty tensor elimination to
`bufferization.materialize_in_destination` by implementing the
`SubsetInsertionOpInterface`.
Furthermore, the One-Shot Bufferize conflict detection is improved for
`bufferization.materialize_in_destination`.
…(trunci) expansion
This revision adds a rewrite for sequences of vector `bitcast(trunci)`
to use a more efficient sequence of vector operations comprising
`shuffle` and `bitwise` ops.
Such patterns appear naturally when writing quantization /
dequantization functionality with the vector dialect.
The rewrite performs a simple enumeration of each of the bits in the
result vector and determines its provenance in the pre-trunci vector.
The enumeration is used to generate the proper sequence of `shuffle`,
`andi`, `ori` followed by an optional final `trunci`/`extui`.
The rewrite currently only applies to 1-D non-scalable vectors and bails
out if the bitwidth of the final vector element type is not a multiple
of 8. This is a failsafe heuristic determined empirically: if the
resulting element type does not span a whole number of bytes, further
complexities arise that are not improved by this pattern; the heavy
lifting still needs to be done by LLVM.
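For illustration, a minimal sketch of the kind of input this rewrite
targets (the types are made up for the example, not taken from the
patch):
```
// Quantization-style narrowing to a sub-byte type, followed by a bitcast of
// the packed bits into byte-sized elements.
%t = arith.trunci %in : vector<8xi32> to vector<8xi4>
%b = vector.bitcast %t : vector<8xi4> to vector<4xi8>
```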
`operator[]` returns `OpOperand &` instead of `Value`.
* This allows users to get OpOperands by name instead of by "magic" operand number.
E.g., `extractSliceOp->getOpOperand(0)` can be written as
`extractSliceOp.getSourceMutable()[0]`.
* `OperandRange` provides a read-only API to operands: `operator[]`
returns `Value`. `MutableOperandRange` now provides a mutable API:
`operator[]` returns `OpOperand &`, which can be used to set operands.
Note: The TableGen code generator could be changed to return `OpOperand
&` (instead of `MutableOperandRange`) for non-variadic and non-optional
arguments in a subsequent change. Then the `[0]` part in the above
example would no longer be necessary.
* Always use the auto-generated `getInitArgs` function. Remove the
hand-written `getInitOperands` duplicate.
* Remove `hasIterOperands` and `getNumIterOperands`. Their names were
inconsistent with the TableGen definition, where the argument is called
`initArgs`. Use `getInitArgs().size()` instead.
* Fix verification around ops with no results.
Add capabilities for comparing opaque properties. This is useful when
dealing with arbitrary operations, which can be compared based on their
OperationName. Now their properties can also be compared without the
need to determine their actual type.
Before this change, the `ByteCode` test failed on CentOS 7 with
devtoolset-9, because strings happen to be only 8-byte aligned. In
general, though, strings have no alignment guarantee.
Increase resource alignment in test to 32 bytes.
Adjust test to sufficiently align buffer.
Add test to check error when buffer is insufficiently aligned.
This patch changes the MaxScale value for level EightK to 256. It also
updates the affected level check tests.
Change-Id: Id9cffd5eb9053bb688196cd5b3b55b3ddd2a359c
Signed-off-by: Tai Ly <tai.ly@arm.com>
The previous implementation crashed if run on a `builtin.module` using
an `op_name` filter (because the initial value of `parent` in the while
loop was a `nullptr`). This PR fixes the crash and adds a test.
One of the main users of this kind of coroutine is Swift. There, yield-once (`retcon.once`) coroutines are used to temporarily "expose" pointers to internal fields of various objects, creating borrow scopes.
However, in some cases it might also be useful to allow these coroutines to produce a normal result, but there is no convenient way to represent this (as compared to switched-resume coroutines, where C++ `co_return`
is transformed into a member / callback call on the promise object).
The extension is simple: we allow the continuation function to have a non-void result and accept optional extra arguments via a special `llvm.coro.end.result` intrinsic that essentially forwards them as normal results.
This patch makes sure that the following case is lowered correctly
("duplication"):
```
func.func @broadcast_scalable_duplication(%arg0: vector<[32]xf32>) -> vector<1x[32]xf32> {
%res = vector.broadcast %arg0 : vector<[32]xf32> to vector<1x[32]xf32>
return %res : vector<1x[32]xf32>
}
```
Change `SingleBlock::{insert,push_back}` to avoid inserting the argument
operation after the block's terminator. This allows removing
`SingleBlockImplicitTerminator`'s functions with the same name.
Define `Block::hasTerminator` checking whether the block has a
terminator operation.
Signed-off-by: Victor Perez <victor.perez@codeplay.com>
Since ownership-based buffer deallocation requires a few passes to be run in a somewhat fixed sequence, it makes sense to have a pipeline for convenience (and to reduce the number of transform ops needed to represent the default deallocation sequence).
This change refactors some of the utilities used to unroll larger vector
computations into smaller vector computations. In fact, the indexing
computations used here are rather generic and are useful in other dialects or
downstream projects. Therefore, a utility for iterating over all possible tile
offsets for a particular pair of static (shape, tiled shape) is introduced in
IndexingUtils and replaces the existing computations in the vector unrolling
transformations. This builds off of the refactoring of IndexingUtils introduced
in 203fad476b7e.
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D150000
This is plugged in as an alternative lowering path in the gpu to spirv
dialect conversion. Add custom op builders for coop matrix ops to make
the create functions nicer to work with and less error-prone. The latter
is accomplished by following the op syntax and also requiring stride to
be a constant op to avoid confusion around the order of arguments.
The remaining lowering patterns will be added in a future patch.
Only construction and type casting are implemented. The method to create
is explicitly named "unsafe" and the documentation calls out what the
caller is responsible for. There really isn't a better way to do this
and retain the power-user feature this represents.
This patch is part of a larger initiative aimed at fixing floating-point
`max` and `min` operations in MLIR:
https://discourse.llvm.org/t/rfc-fix-floating-point-max-and-min-operations-in-mlir/72671.
The commit addresses task 1.4 of the RFC by adding lowering to the
corresponding LLVM intrinsics.
Please **note**: this PR is part of a stack of patches and depends on
#66429.
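A rough sketch of what such a lowering looks like, assuming the ops in
question are `arith.minnumf`/`arith.maxnumf` and the targets are the
`llvm.intr.minnum`/`llvm.intr.maxnum` wrapper ops (an assumption based
on the RFC, not stated in this message):
```
// arith.minnumf %a, %b : f32  becomes roughly:
%r = llvm.intr.minnum(%a, %b) : (f32, f32) -> f32
```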
This patch adds an NVPTX compilation path that enables JIT compilation
on NVIDIA targets. The following modifications were performed:
1. Adding a format field to the GPU object attribute, allowing the
translation attribute to use the correct runtime function to load the
module. Likewise, a dictionary attribute was added to add any possible
extra options.
2. Adding the `createObject` method to `GPUTargetAttrInterface`; this
method returns a GPU object from a binary string.
3. Adding the function `mgpuModuleLoadJIT`, which is only available for
NVIDIA GPUs, as there is no equivalent for AMD.
4. Adding the CMake flag `MLIR_GPU_COMPILATION_TEST_FORMAT` to specify
the format to use during testing.
This patch refactors the `AtomicRMWOpConverter` class to use the
dedicated operations from the Arith dialect instead of the
`cmpf` + `select` pattern.
A test for the `minimumf` kind of `atomic_rmw` has also been added.
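For reference, a minimal example of the op being expanded (operand types
are illustrative):
```
// The converter expands this op; the combining logic inside the expansion is
// now expressed with arith.minimumf / arith.maximumf instead of an
// arith.cmpf + arith.select pair.
%old = memref.atomic_rmw minimumf %value, %buffer[%idx] : (f32, memref<16xf32>) -> f32
```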
This patch is part of a larger initiative aimed at fixing floating-point
`max` and `min` operations in MLIR:
https://discourse.llvm.org/t/rfc-fix-floating-point-max-and-min-operations-in-mlir/72671.
Here we introduce new operations for floating-point numbers: `minnum`
and `maxnum`.
These operations have different semantics than the `minimumf` and
`maximumf` ops: they follow the semantics of the eponymous LLVM
intrinsics, which differ in the handling of positive and negative zeros
and NaNs.
This patch addresses task 1.3 of the RFC.
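In the arith dialect the new ops are spelled `arith.minnumf` and
`arith.maxnumf`; a minimal usage example:
```
// Unlike arith.minimumf, a single NaN operand is "ignored": if exactly one
// operand is NaN, the other operand is returned (llvm.minnum semantics).
%m = arith.minnumf %a, %b : f32
```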
This will make it easier for callers to notice issues with, and fix up,
calls to createTargetMachine after a future change to the parameters of
TargetMachine.
This matches other nearby enums.
For downstream users, this should be a fairly straightforward
replacement,
e.g. s/CodeGenOpt::Aggressive/CodeGenOptLevel::Aggressive
or s/CGFT_/CodeGenFileType::
This change removes the requirement that the row stride be statically known when
converting `vector.transfer_read` and `vector.transfer_write` to distributed
SIMT operations in the `nvgpu` lowering path. It also adds a check to verify
that the last dimension of the source memref is statically known to have stride
1 since this is assumed in the conversion logic. No other change should be
required since the generated `vector.load` operations are never created across
dimensions other than the last. The routines for checking preconditions on
`vector.transfer_read/write` are moved under the nvgpu utilities.
The change is NFC with respect to the GPU dialect lowering path.
Reviewed By: ThomasRaoux
Differential Revision: https://reviews.llvm.org/D155753
This change adds a method to modify the ConversionTarget used during
`transform.apply_conversion_patterns` to the
`ConversionPatternDescriptorOpInterface`. This is needed when the TypeConverter
is used to dictate the dynamic legality of operations, as in "structural"
conversion patterns present in, for example, the SCF and func dialects.
As a first use case/test, this change also adds a
`transform.apply_patterns.scf.structural_conversions` operation to the SCF
dialect.
Reviewed By: springerm
Differential Revision: https://reviews.llvm.org/D158672
Rationale:
These libraries provided COO input and output at external boundaries;
this functionality has since been generalized to the much more powerful
pack and unpack operations of the sparse tensor dialect.
Add extra error checking (in debug mode) to detect cases where an
iterator on "direct" payload IR mappings is invalidated (due to elements
being removed). Such errors are hard to debug: they are often
non-deterministic; sometimes the program crashes, sometimes it produces
wrong results. Even when it crashes, the stack trace often points to
completely unrelated code locations.
Store a timestamp with each "direct" mapping. The timestamp is increased
whenever an operation is performed that invalidates an iterator on that
mapping. A debug iterator is added that checks the timestamp as payload
IR is enumerated.
The scf.forall.in_parallel terminator operation has a nested graph region with the NoTerminator trait. Such regions are not supported by the default implementations. Therefore, this commit adds a specialized implementation for
this operation which only covers the case where the nested region is empty.
This is sufficient because, after bufferization, ops like tensor.parallel_insert_slice have already been converted to memref operations residing directly in the scf.forall body, and the nested region of scf.forall.in_parallel ends up empty.
Fixes a crash when an op that is mapped to a handle over which a
`transform.foreach` iterates was erased (through the
`TrackingRewriter`). Erasing an op removes it from all mappings and
invalidates iterators. This is already taken care of when a transform op
iterates over payload ops in its own `apply` method, but not when
another transform op erases a tracked payload op.
Add a method to the BufferDeallocationOpInterface that allows operations to implement the interface and provide custom logic to compute the ownership indicators of the values they define. As a demonstrating example, this new method is implemented by the `arith.select` operation.
This new interface allows operations to implement custom handling of ownership values and insertion of dealloc operations, which is useful when an op cannot implement the interfaces supported by default by the buffer deallocation pass (e.g., because they are not quite compatible, because additional semantics would render the default implementations invalid, or because no suitable interface exists and it is not worth introducing one plus a default
implementation in buffer deallocation). Additionally, it can be used to handle a specific op more efficiently than the interface-based default
implementations.
memref.copy sometimes gets lowered to a function call, which is passed
the element size of the memref in bytes as an argument.
The element size passed to the copyMemRef() function call can be
miscalculated if the LLVM IR uses aligned access to the memory.
This is fixed by using llvm.getelementptr to calculate the element size
natively, as is already done in the other lowering path that lowers to
an intrinsic.
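The computation relies on the standard GEP-based sizeof idiom; a sketch
in the LLVM dialect (not the exact code from the patch):
```
// Element size in bytes = address of element 1 relative to a zero base pointer.
%zero = llvm.mlir.zero : !llvm.ptr
%gep  = llvm.getelementptr %zero[1] : (!llvm.ptr) -> !llvm.ptr, f32
%size = llvm.ptrtoint %gep : !llvm.ptr to i64
```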
Fix https://github.com/llvm/llvm-project/issues/64072
Reviewed By: ftynse
Differential Revision: https://reviews.llvm.org/D156126
The lowering pattern to LLVM for memref.transpose has a bug: instead of
transposing from (source) -> (dest), it actually transposes
(dest) -> (source). This patch fixes the bug and updates the test.
Fix https://github.com/llvm/llvm-project/issues/65145
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D159290
Add a new Buffer Deallocation pass with the intent to replace the old
one. For now it is added as a separate pass alongside the old one in
order to allow downstream users to migrate over gradually. This new pass
has the goal
of inserting fewer clone operations and supporting additional use-cases.
Please refer to the Buffer Deallocation section in the updated
Bufferization.md file for more information on how this new pass works.
Default atomic ordering information is processed in the OpenMP dialect
to LLVM IR lowering stage at every spot where an operation can be
affected by it. The remaining clauses are stored globally in the
OpenMPIRBuilderConfig object before starting that lowering stage, so
that the OMPIRBuilder can conditionally modify code generation
depending on them. At the end of the process, the omp.requires
attribute is itself lowered into a global constructor that passes these
clauses as flags to the OpenMP runtime.
Depends on D147217, D147218 and D158278.
Differential Revision: https://reviews.llvm.org/D147219
This patch updates the `OpenMPIRBuilderConfig` structure to hold all
available 'requires' clauses, and it replicates part of the code
generation for the 'requires' registration function from clang in the
`OMPIRBuilder`, to be used with flang.
Porting the rest of the features of the clang implementation to the
IRBuilder and sharing it between clang and flang remains for a future
patch, due to the complexity of the logic that selects the attributes of
the generated registration function.
Differential Revision: https://reviews.llvm.org/D147217
This commit generalizes empty tensor elimination to operate on subset
ops. No new test cases are added because all current subset ops were
already supported previously. From this perspective, this change is NFC.
A new interface method (and a helper method) are added to
`SubsetInsertionOpInterface` to build the subset of the destination
tensor.
This patch adds support for lowering vector.outerproduct to the ArmSME
MOPA intrinsic for the following types:
vector<[8]xf16>, vector<[8]xf16> -> vector<[8]x[8]xf16>
vector<[8]xbf16>, vector<[8]xbf16> -> vector<[8]x[8]xbf16>
vector<[4]xf32>, vector<[4]xf32> -> vector<[4]x[4]xf32>
vector<[2]xf64>, vector<[2]xf64> -> vector<[2]x[2]xf64>
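For example, the f32 case (a sketch; the accumulator operand and value
names are illustrative):
```
// Lowered to the SME MOPA intrinsic, accumulating into a [4]x[4] tile.
%tile = vector.outerproduct %lhs, %rhs, %acc : vector<[4]xf32>, vector<[4]xf32>
```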
The FP variants are lowered to FMOPA (non-widening) [1] and the BFloat
variant to BFMOPA (non-widening) [2].
Note that at the ISA level these variants are implemented by different
architecture features, listed below:
FMOPA (non-widening)
* half-precision - +sme2p1,+sme-f16f16
* single-precision - +sme
* double-precision - +sme-f64f64
BFMOPA (non-widening)
* half-precision - +sme2p1,+b16b16
There's currently no way to target different features when lowering to
ArmSME. Integration tests are added for F32 and F64. We use QEMU to run
the integration tests, but SME2 support isn't available yet (it's
targeted for 9.0), so integration tests for these variants are excluded.
Masking is currently unsupported.
Depends on #65450.
[1] https://developer.arm.com/documentation/ddi0602/2023-06/SME-Instructions/FMOPA--non-widening---Floating-point-outer-product-and-accumulate-
[2] https://developer.arm.com/documentation/ddi0602/2023-06/SME-Instructions/BFMOPA--non-widening---BFloat16-floating-point-outer-product-and-accumulate-
This patch is part of a larger initiative aimed at fixing floating-point `max` and `min` operations in MLIR: https://discourse.llvm.org/t/rfc-fix-floating-point-max-and-min-operations-in-mlir/72671.
Within LLVM, there are no masked reduction counterparts for vector reductions such as `fmaximum` and `fminimum`.
More information can be found here: https://github.com/llvm/llvm-project/issues/64940#issuecomment-1690694156.
To address this issue in MLIR, where we need to generate appropriate lowerings for these cases, we employ regular non-masked intrinsics.
However, we modify the input vector using the `arith.select` operation to effectively deactivate undesired elements using a "neutral mask value".
The neutral mask value is the smallest possible value for the `fmaximum` reduction and the largest possible value for the `fminimum` reduction.
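A sketch of the pattern described above, written at the vector-dialect
level (value names and types are illustrative):
```
// Masked-off lanes are replaced with the neutral element so they cannot
// affect the result of the regular, non-masked reduction.
%masked = arith.select %mask, %input, %neutral : vector<8xi1>, vector<8xf32>
%result = vector.reduction <maximumf>, %masked : vector<8xf32> into f32
```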
Depends on D158618
Reviewed By: dcaballe
Differential Revision: https://reviews.llvm.org/D158773
This is a cleanup in preparation for adding a second conversion path
using the KHR cooperative matrix extension.
Make the existing lowering explicit about emitting ops from the NV coop
matrix extension. Clean up surrounding code.
Currently, the VectorToLLVM patterns are built into a library along
with the corresponding pass, which also pulls in all the
platform-specific vector dialects (like AMXDialect) to apply all the
vector to LLVM conversions.
This causes dependency bloat when writing libraries: for example, the
GPU to LLVM passes, which use the vector to LLVM patterns, don't need
the X86Vector dialect to be present at all.
This commit partitions the library into VectorToLLVM and
VectorToLLVMPass, where the latter pulls in all the other vector
transformations.
Reviewed By: nicolasvasilache, mehdi_amini
Differential Revision: https://reviews.llvm.org/D158287