This cleans up unnecessary code that changed zero-size allocations to
avoid the following error message:
'cuMemAlloc(&ptr, sizeBytes)' failed with 'CUDA_ERROR_INVALID_VALUE'
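For context, a minimal sketch of the kind of zero-size workaround being removed (the wrapper name and shape here are illustrative, not the actual runtime code):

  // Hypothetical sketch; the real CUDA runtime wrapper differs.
  #include "cuda.h"
  #include <cstdint>

  extern "C" void *gpuMemAllocSketch(uint64_t sizeBytes) {
    CUdeviceptr ptr = 0;
    // cuMemAlloc rejects a size of 0 with CUDA_ERROR_INVALID_VALUE, so the
    // old code bumped zero-size requests to one byte. This special case is
    // the kind of code being cleaned up here.
    if (sizeBytes == 0)
      sizeBytes = 1;
    (void)cuMemAlloc(&ptr, sizeBytes);
    return reinterpret_cast<void *>(ptr);
  }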
Reviewed By: yinying-lisa-li
Differential Revision: https://reviews.llvm.org/D159381
Incorporated two header files directly into another, since only parts
of them were used (and the separation made it hard to find the
definitions). Removed TODOs that are unlikely to be done.
Reviewed By: Peiming
Differential Revision: https://reviews.llvm.org/D159330
Consistent order of ops and related methods.
Also renamed SpGEMMGetSizeOp to SpMatGetSizeOp,
since this is a general utility for sparse matrices,
not specific to GEMM ops.
Reviewed By: Peiming
Differential Revision: https://reviews.llvm.org/D157922
Rationale:
Since we only support the default algorithm for SpGEMM, we can remove the
estimate op (for now at least). This revision also introduces the set csr
pointers op and fixes a few bugs in the existing lowering of the SpGEMM
breakdown. It paves the way for actual recognition of SpGEMM in the sparsifier.
Reviewed By: K-Wu
Differential Revision: https://reviews.llvm.org/D157645
Rationale:
This is the approach taken for all the others too (SpMV, SpMM, SDDMM),
so it is more consistent to follow the same path (until we have a need
for more algorithms). Also, in a follow-up revision, this will allow
us to remove some unused GEMM ops.
Reviewed By: K-Wu
Differential Revision: https://reviews.llvm.org/D157542
On Fedora, rocminfo is a Fedora package and rocm_agent_enumerator is
installed to /usr/bin. This causes the following error when building:
CMake Error at external/llvm-project/mlir/lib/ExecutionEngine/CMakeLists.txt:232 (message):
Could not run rocm_agent_enumerator and ROCM_TEST_CHIPSET is not defined
So use find_program() to look for rocm_agent_enumerator instead of assuming a
single location.
Signed-off-by: Tom Rix <trix@redhat.com>
Reviewed By: krzysz00
Differential Revision: https://reviews.llvm.org/D156826
This work addresses an issue with larger shared memory usage in the MLIR CUDA runtime. Currently, when dynamic shared memory usage exceeds 48KB, the CU_FUNC_ATTRIBUTE_MAX_DYNAMIC_SHARED_SIZE_BYTES attribute of the CUDA kernel must be set appropriately. This work takes care of that by setting the attribute as required. It also adds some debug prints for better visibility and troubleshooting.
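A minimal sketch of the attribute-setting step described above (names are illustrative; the actual wrapper code in the runtime differs):

  #include "cuda.h"
  #include <cstdio>

  // Kernels requesting more than 48KB of dynamic shared memory must opt in
  // explicitly, otherwise the launch fails with CUDA_ERROR_INVALID_VALUE.
  static void setMaxDynamicSharedMemory(CUfunction function, int smemBytes) {
    if (smemBytes > 48 * 1024) {
      CUresult result = cuFuncSetAttribute(
          function, CU_FUNC_ATTRIBUTE_MAX_DYNAMIC_SHARED_SIZE_BYTES,
          smemBytes);
      if (result != CUDA_SUCCESS)
        fprintf(stderr, "cuFuncSetAttribute failed: %d\n",
                static_cast<int>(result));
    }
  }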
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D156874
This work introduces the `MLIR_CUDA_DEBUG` environment variable and a `debug_print` function to make it possible to debug the runtimes.
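A minimal sketch of such an environment-gated debug print (the actual `debug_print` implementation may differ):

  #include <cstdarg>
  #include <cstdio>
  #include <cstdlib>

  // Emit output only when MLIR_CUDA_DEBUG is set and non-zero, so regular
  // runs stay silent.
  static void debugPrintSketch(const char *fmt, ...) {
    static const bool enabled = [] {
      const char *env = getenv("MLIR_CUDA_DEBUG");
      return env && *env && *env != '0';
    }();
    if (!enabled)
      return;
    va_list args;
    va_start(args, fmt);
    vfprintf(stderr, fmt, args);
    va_end(args);
  }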
Reviewed By: aartbik
Differential Revision: https://reviews.llvm.org/D156232
The op creates a tensor map descriptor object representing a tiled memory region. The descriptor is used by the Tensor Memory Accelerator (TMA). The `tensor` is the source tensor to be tiled. The `boxDimensions` are the sizes of the tiled memory region in each dimension.
The pattern here lowers `tma.create.descriptor` to a runtime function call that eventually calls the CUDA driver's `cuTensorMapEncodeTiled`. For more information, see:
https://docs.nvidia.com/cuda/cuda-driver-api/group__CUDA__TENSOR__MEMORY.html
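A minimal sketch of the eventual driver call, for a hypothetical 64x64 float tile of a 1024x1024 row-major matrix (the actual lowering fills these values from the op's operands):

  #include "cuda.h" // requires CUDA 12+ for the tensor map API

  static CUtensorMap encodeTileSketch(void *globalAddress) {
    CUtensorMap tensorMap;
    cuuint64_t globalDim[2] = {1024, 1024};
    // One stride per dimension except the innermost, in bytes.
    cuuint64_t globalStrides[1] = {1024 * sizeof(float)};
    cuuint32_t boxDim[2] = {64, 64}; // the op's `boxDimensions`
    cuuint32_t elementStrides[2] = {1, 1};
    (void)cuTensorMapEncodeTiled(
        &tensorMap, CU_TENSOR_MAP_DATA_TYPE_FLOAT32, /*tensorRank=*/2,
        globalAddress, globalDim, globalStrides, boxDim, elementStrides,
        CU_TENSOR_MAP_INTERLEAVE_NONE, CU_TENSOR_MAP_SWIZZLE_NONE,
        CU_TENSOR_MAP_L2_PROMOTION_NONE, CU_TENSOR_MAP_FLOAT_OOB_FILL_NONE);
    return tensorMap;
  }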
Depends on D155453
Reviewed By: nicolasvasilache
Differential Revision: https://reviews.llvm.org/D155680
(1) Without the check, the results may silently be wrong, so the check is needed.
(2) Add a pruning step to guarantee the 2:4 property.
Note that, in the longer run, we may want to split out the pruning step
somehow, or make it optional.
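For reference, a conceptual sketch of the 2:4 constraint (this only illustrates the idea; the actual pruning step lives in the sparse support runtime): in every group of four consecutive values, at most two may be nonzero.

  #include <algorithm>
  #include <cmath>
  #include <cstddef>

  // Zero the two smallest-magnitude values in each group of four so the
  // 2:4 sparsity property holds.
  static void prune24Sketch(float *values, size_t n) {
    for (size_t i = 0; i + 4 <= n; i += 4) {
      size_t idx[4] = {i, i + 1, i + 2, i + 3};
      // Sort the group's indices by descending magnitude.
      std::sort(idx, idx + 4, [&](size_t a, size_t b) {
        return std::fabs(values[a]) > std::fabs(values[b]);
      });
      values[idx[2]] = 0.0f;
      values[idx[3]] = 0.0f;
    }
  }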
Reviewed By: K-Wu
Differential Revision: https://reviews.llvm.org/D155320
Also makes some minor consistency edits in the cuSparseLt wrapper lib.
Reviewed By: Peiming, K-Wu
Differential Revision: https://reviews.llvm.org/D155139
Currently crashes if the function isn't void when specifying
'-entry-point-result=void'.
Reviewed By: jpienaar
Differential Revision: https://reviews.llvm.org/D154352
SubtargetFeature.h is currently part of MC while it doesn't depend on
anything in MC. Since some LLVM components might have the need to work
with target features without necessarily needing MC, it might be
worthwhile to move SubtargetFeature.h to a different location. This will
reduce the dependencies of said components.
Note that I chose TargetParser as the destination because that's where
Triple lives, and SubtargetFeatures feels related to that.
This issue came up during a JITLink review (D149522). JITLink would
like to avoid a dependency on MC while still needing to store target
features.
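For users of the header, the visible change is just the include path:

  // Before the move, the header lived in MC:
  // #include "llvm/MC/SubtargetFeature.h"
  // After the move, it lives next to Triple in TargetParser:
  #include "llvm/TargetParser/SubtargetFeature.h"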
Reviewed By: MaskRay, arsenm
Differential Revision: https://reviews.llvm.org/D150549
This reverts commit bba2b656110209a3d9863b92c060082479b06ab1.
Reason: Broke the HWASan buildbot. See https://reviews.llvm.org/D153250
for more information.
This reverts commit 9119325a5666e557a19f38a05525578b556c215b.
A buildbot is broken, probably because of this change breaking the
SHARED_LIBS=ON build more.
There are two ways to make symbols from a shared library visible in the
execution engine: exporting the symbols with public visibility, or
implementing a loading/unloading mechanism that registers the exported
symbols explicitly. The latter had only been available in the JIT runner
until recently, but https://reviews.llvm.org/D153029 makes it available
in any usage of the execution engine (including the Python bindings).
This patch makes the runner utils library use the latter mechanism
instead of the former, i.e., it makes all of its symbols private and
implements the init/destroy functions of the loading mechanism to
control explicitly which symbols it registers.
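A minimal sketch of that mechanism, assuming the `__mlir_runner_init`/`__mlir_runner_destroy` protocol described further below (the actual runner utils code registers many more symbols):

  #include "llvm/ADT/StringMap.h"
  #include <cstdio>

  // Hidden by default: not resolvable by a plain visibility-based lookup.
  extern "C" __attribute__((visibility("hidden"))) void printNewline() {
    fputc('\n', stdout);
  }

  // Init hook of the loading protocol: register exactly the symbols the
  // library wants to expose to jitted code.
  extern "C" __attribute__((visibility("default"))) void
  __mlir_runner_init(llvm::StringMap<void *> &exportSymbols) {
    exportSymbols["printNewline"] = reinterpret_cast<void *>(&printNewline);
  }

  // Destroy hook: release anything the init hook acquired (nothing here).
  extern "C" __attribute__((visibility("default"))) void
  __mlir_runner_destroy() {}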
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D153250
The async runtime library explicitly registers the symbols it exports
with the loading mechanism of the execution engine. This works even
though these symbols were marked as hidden in the library. However, if
used outside the execution engine, such as with `lli --dlopen` or if AOT
compiled, these hidden symbols would not be found. This patch thus marks
all symbols that are part of the API as visible.
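A minimal sketch of the visibility marking (the macro name here is illustrative; the async runtime uses its own export macro):

  #include <cstdint>

  // Re-export API symbols even when the library is built with
  // -fvisibility=hidden, so dlopen/AOT users can resolve them.
  #define API_EXPORT __attribute__((visibility("default")))

  extern "C" API_EXPORT void mlirAsyncRuntimeAddRef(void *ptr, int64_t count);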
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D153348
In https://reviews.llvm.org/D153029, I moved the loading/unloading
mechanisms of shared libraries from the JIT runner to the execution
engine in order to make that mechanism available in the latter
(including its Python bindings). However, I realized that I introduced a
small change in semantics: previously, the JIT runner checked for the
presence of init/destroy functions and only loaded the library as
JITDyLib if they were not present. After I moved the code, all libraries
were loaded as JITDyLib, even if they registered their symbols
explicitly in their init function. I am not sure if this is really a
problem, but (1) the previous behavior was different, and (2) I guess it
could cause a problem if some symbols are exported through the init
function *and* have public visibility. This patch reestablishes the
original behaviour in the new place of the code.
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D153249
This updates the code comments about the library registration mechanism,
which changed in https://reviews.llvm.org/D153029, and which should have
been updated as part of that patch.
Reviewed By: ingomueller-net
Differential Revision: https://reviews.llvm.org/D153147
Both the mlir-cpu-runner and the execution engine allow providing a
list of shared libraries that should be loaded into the process such
that the jitted code can use the symbols from those libraries. The
runner had implemented a protocol that allowed libraries to control
which symbols they want to provide in that context (with a function
called __mlir_runner_init). In the absence of that, the runner would rely
on the loading mechanism of the execution engine, which didn't do
anything particular with the symbols, i.e., only symbols with public
visibility were visible to the jitted code.
Libraries used a mix of the two mechanisms: while the runner utils and C
runner utils libs (and potentially others) used public visibility, the
async runtime lib (as the only one in the monorepo) used the loading
protocol. As a consequence, the async runtime library could not be used
through the Python bindings of the execution engine.
This patch moves the loading protocol from the runner to the execution
engine. For the runner, this should not change anything: it lets the
execution engine handle the loading, which now implements the same
protocol that the runner had implemented before. However, the Python
bindings now get to benefit from the loading protocol as well, so the
async runtime library (and potentially other out-of-tree libraries) can
now be used in that context.
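For example, a hedged sketch of loading a library through the execution engine's C++ API (assuming the `ExecutionEngineOptions::sharedLibPaths` option; if the library defines the init/destroy hooks, its symbols are registered explicitly rather than resolved by public visibility):

  #include "mlir/ExecutionEngine/ExecutionEngine.h"
  #include "llvm/ADT/SmallVector.h"

  llvm::Expected<std::unique_ptr<mlir::ExecutionEngine>>
  createEngineSketch(mlir::ModuleOp module) {
    llvm::SmallVector<llvm::StringRef> libs = {"libmlir_async_runtime.so"};
    mlir::ExecutionEngineOptions options;
    options.sharedLibPaths = libs;
    return mlir::ExecutionEngine::create(module, options);
  }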
Reviewed By: mehdi_amini
Differential Revision: https://reviews.llvm.org/D153029
Add a 16-bit version of cudaMemset in cudaRuntimeWrappers and update the GPU-to-LLVM lowering.
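A minimal sketch of such a wrapper (the name is illustrative; the real wrapper mirrors the existing 32-bit one and is driven by the updated lowering):

  #include "cuda.h"
  #include <cstddef>
  #include <cstdint>

  // Fill `count` 16-bit elements at `dst` with `value` on `stream`.
  extern "C" void gpuMemset16Sketch(void *dst, uint16_t value, size_t count,
                                    CUstream stream) {
    (void)cuMemsetD16Async(reinterpret_cast<CUdeviceptr>(dst), value, count,
                           stream);
  }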
Reviewed By: bondhugula
Differential Revision: https://reviews.llvm.org/D151642
Even though this feature was deprecated in release 11.2,
libraries before this version still support the feature,
which is why we are making it available under a macro.
Reviewed By: K-Wu
Differential Revision: https://reviews.llvm.org/D152290
The CMake logic that finds the CUDA paths exposes some paths to search for
the CUDA library; we need to propagate these through the call to
find_library(). This was already done for cuSparse but not for CUDA.
Differential Revision: https://reviews.llvm.org/D151645