mirror of
https://github.com/capstone-engine/llvm-capstone.git
synced 2024-11-26 23:21:11 +00:00
12c241b365
The current implementation of `nvvm.wgmma.mma_async` Op deduces the data type of the output matrix from the data type of struct member, which can be non-intuitive, especially in cases where types like `2xf16` are packed into `i32`. This PR addresses this issue by improving the Op to include an explicit data type for the output matrix. The modified Op now includes an explicit data type for Matrix-D (<f16>), and looks as follows: ``` %result = llvm.mlir.undef : !llvm.struct<(struct<(i32, i32, ... nvvm.wgmma.mma_async %descA, %descB, %result, #nvvm.shape<m = 64, n = 32, k = 16>, D [<f16>, #nvvm.wgmma_scale_out<zero>], A [<f16>, #nvvm.wgmma_scale_in<neg>, <col>], B [<f16>, #nvvm.wgmma_scale_in<neg>, <col>] ``` |
||
---|---|---|
.. | ||
dialects | ||
integration | ||
ir | ||
lib | ||
CMakeLists.txt | ||
develoment_files.py | ||
execution_engine.py | ||
lit.local.cfg | ||
pass_manager.py | ||
python_test_ops.td |