mirror of
https://github.com/capstone-engine/llvm-capstone.git
synced 2024-11-30 09:01:19 +00:00
763109e346
Setting thread block size with `maxntid` on the kernel has great performance benefits. In this way, downstream PTX compiler can do better register allocation. MLIR's `gpu.launch` and `gpu.launch_func` already has an attribute (`known_block_size`) that keeps the thread block size when it is known. This PR simply uses this attribute to set `maxntid`. |
||
---|---|---|
.. | ||
benchmark/python | ||
cmake/modules | ||
docs | ||
examples | ||
include | ||
lib | ||
python | ||
test | ||
tools | ||
unittests | ||
utils | ||
.clang-format | ||
.clang-tidy | ||
CMakeLists.txt | ||
LICENSE.TXT | ||
README.md |
Multi-Level Intermediate Representation
See https://mlir.llvm.org/ for more information.