## History

### v0.7.16

[Release](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.7.16)

#### Linux arm64

| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.8.3 | 3.10, 3.11, 3.12, 3.13, 3.14 | 2.10, 2.9 | 12.6, 12.8, 13.0 |

#### Linux x86_64

| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.6.3, 2.7.4, 2.8.3 | 3.10, 3.11, 3.12, 3.13, 3.14 | 2.10, 2.6, 2.7, 2.8, 2.9 | 12.4, 12.6, 12.8, 13.0 |

#### Manylinux 2_24 x86_64

| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.6.3, 2.7.4, 2.8.3 | 3.10, 3.11, 3.12, 3.13, 3.14 | 2.10, 2.6, 2.7, 2.8, 2.9 | 12.6, 12.8, 13.0 |

#### Manylinux 2_34 arm64

| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.8.3 | 3.10, 3.11, 3.12, 3.13, 3.14 | 2.10, 2.9 | 12.6, 12.8, 13.0 |

#### Manylinux2014 x86_64

| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.6.3, 2.7.4, 2.8.3 | 3.10, 3.11, 3.12, 3.13 | 2.6 | 12.4 |

### v0.7.13

[Release](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.7.13)

#### Windows x86_64

| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.8.3 | 3.10, 3.11, 3.12, 3.13, 3.14 | 2.10, 2.9 | 12.8, 13.0 |

### v0.7.15

[Release](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.7.15)

#### Linux x86_64

| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.6.3, 2.7.4, 2.8.3 | 3.10, 3.11, 3.12, 3.13, 3.14 | 2.10 | 12.6 |

#### Manylinux 2_24 x86_64

| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.6.3, 2.7.4, 2.8.3 | 3.10, 3.11, 3.12, 3.13, 3.14 | 2.10 | 12.6 |

### v0.7.11

[Release](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.7.11)

#### Linux x86_64

| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.6.3, 2.7.4, 2.8.3 | 3.10, 3.11, 3.12, 3.13, 3.14 | 2.8, 2.9 | 12.9, 13.1 |

#### Manylinux 2_24 x86_64

| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.6.3, 2.7.4, 2.8.3 | 3.10, 3.11, 3.12, 3.13, 3.14 | 2.8, 2.9 | 12.9, 13.1 |

#### Windows x86_64

| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.8.3 | 3.10, 3.11, 3.12, 3.13 | 2.5, 2.6, 2.7, 2.8, 2.9 | 12.8 |

### v0.7.7

[Release](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.7.7)

#### Windows x86_64

| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.8.3 | 3.10, 3.11, 3.13 | 2.5, 2.7, 2.8 | 12.8 |

### v0.7.6

[Release](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.7.6)

#### Windows x86_64

| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.8.3 | 3.12 | 2.9 | 12.8 |

### v0.7.2

[Release](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.7.2)

#### Linux x86_64

| Flash-Attention | Python | PyTorch | CUDA |
| ------------------- | ---------------------- | ------------------ | ---------- |
| 2.6.3, 2.7.4, 2.8.3 | 3.10, 3.11, 3.12, 3.13 | 2.5, 2.6, 2.7, 2.8 | 12.8, 12.9 |

#### Manylinux 2_24 x86_64

| Flash-Attention | Python | PyTorch | CUDA |
| ------------------- | ---------------------- | ------------- | ---------- |
| 2.6.3, 2.7.4, 2.8.3 | 3.10, 3.11, 3.12, 3.13 | 2.6, 2.7, 2.8 | 12.8, 12.9 |

#### Manylinux2014 x86_64

| Flash-Attention | Python | PyTorch | CUDA |
| ------------------- | ---------------------- | ------- | ---- |
| 2.6.3, 2.7.4, 2.8.3 | 3.10, 3.11, 3.12, 3.13 | 2.5 | 12.8 |

### v0.7.0

[Release](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.7.0)

#### Linux x86_64

| Flash-Attention | Python | PyTorch | CUDA |
| ------------------- | ---------------- | ------- | ---------- |
| 2.6.3, 2.7.4, 2.8.3 | 3.10, 3.11, 3.12 | 2.9 | 12.8, 13.0 |

### v0.6.9

[Release](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.6.9)

#### Linux x86_64

| Flash-Attention | Python | PyTorch | CUDA |
| --------------- | ------ | ------- | ---- |
| 2.6.3 | 3.14 | 2.9 | 13.0 |

### v0.6.4

[Release](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.6.4)

#### Linux arm64

| Flash-Attention | Python | PyTorch | CUDA |
| --------------- | ---------------- | ------------------ | ---------------- |
| 2.7.4, 2.8.3 | 3.10, 3.11, 3.12 | 2.5, 2.6, 2.7, 2.9 | 12.4, 12.8, 13.0 |

### v0.6.3

[Release](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.6.3)

#### Linux arm64

| Flash-Attention | Python | PyTorch | CUDA |
| --------------- | ---------------- | ------------------ | ---------------- |
| 2.6.3 | 3.10, 3.11, 3.12 | 2.5, 2.6, 2.7, 2.9 | 12.4, 12.8, 13.0 |

### v0.5.4

[Release](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.5.4)

#### Linux x86_64

| Flash-Attention | Python | PyTorch | CUDA |
| ------------------- | ---------------- | ----------------------- | ---------------------- |
| 2.6.3, 2.7.4, 2.8.3 | 3.10, 3.11, 3.12 | 2.5, 2.6, 2.7, 2.8, 2.9 | 12.4, 12.6, 12.8, 13.0 |

### v0.4.22

[Release](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.4.22)

#### Linux x86_64

| Flash-Attention | Python | PyTorch | CUDA |
| --------------- | ---------------------- | ------- | ---------- |
| 2.8.1 | 3.10, 3.11, 3.12, 3.13 | 2.9 | 12.8, 13.0 |

### v0.4.18

[Release](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.4.18)

#### Linux x86_64

| Flash-Attention | Python | PyTorch | CUDA |
| --------------- | ---------------------- | ------- | ---- |
| 2.6.3, 2.8.3 | 3.10, 3.11, 3.12, 3.13 | 2.9 | 13.0 |

### v0.4.17

[Release](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.4.17)

#### Linux x86_64

| Flash-Attention | Python | PyTorch | CUDA |
| --------------- | ---------------------- | ------- | ---------- |
| 2.6.3, 2.8.3 | 3.10, 3.11, 3.12, 3.13 | 2.9 | 12.6, 12.8 |

### v0.4.16

[Release](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.4.16)

#### Linux x86_64

| Flash-Attention | Python | PyTorch | CUDA |
| --------------- | ------ | ------------------ | ---------- |
| 2.6.3, 2.8.3 | 3.9 | 2.5, 2.6, 2.7, 2.8 | 12.4, 12.6 |

### v0.4.15

[Release](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.4.15)

#### Linux x86_64

| Flash-Attention | Python | PyTorch | CUDA |
| --------------- | ---------------- | ------- | ---------- |
| 2.8.3 | 3.11, 3.12, 3.13 | 2.9 | 12.6, 12.8 |

#### Windows x86_64

| Flash-Attention | Python | PyTorch | CUDA |
| --------------- | ---------------- | ------- | ---- |
| 2.8.3 | 3.11, 3.12, 3.13 | 2.9 | 12.6 |

### v0.4.12

[Release](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.4.12)

#### Linux x86_64

| Flash-Attention | Python | PyTorch | CUDA |
| --------------- | ------ | ------------- | ---------------------- |
| 2.8.3 | 3.13 | 2.6, 2.7, 2.8 | 12.4, 12.6, 12.8, 12.9 |

#### Windows x86_64

| Flash-Attention | Python | PyTorch | CUDA |
| --------------- | ------ | ------------- | ---------- |
| 2.8.2 | 3.13 | 2.6, 2.7, 2.8 | 12.4, 12.6 |

### v0.4.11

[Release](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.4.11)

#### Linux x86_64

| Flash-Attention | Python | PyTorch | CUDA |
| --------------- | ---------------- | ------------------ | ---------------------- |
| 2.8.3 | 3.10, 3.11, 3.12 | 2.5, 2.6, 2.7, 2.8 | 12.4, 12.6, 12.8, 12.9 |

### v0.4.10

[Release](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.4.10)

#### Windows x86_64

| Flash-Attention | Python | PyTorch | CUDA |
| --------------- | ---------------- | -------- | ---- |
| 2.7.4, 2.8.2 | 3.10, 3.11, 3.12 | 2.7, 2.8 | 12.8 |

### v0.4.9

[Release](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.4.9)

#### Windows x86_64

| Flash-Attention | Python | PyTorch | CUDA |
| --------------- | ------ | ------- | ---- |
| 2.7.4 | 3.11 | 2.7 | 12.8 |

### v0.3.18

[Release](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.3.18)

#### Linux x86_64

| Flash-Attention | Python | PyTorch | CUDA |
| --------------- | ---------------- | ------------------ | ---------------- |
| 2.7.4 | 3.10, 3.11, 3.12 | 2.5, 2.6, 2.7, 2.8 | 12.4, 12.8, 12.9 |

### v0.3.14

[Release](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.3.14)

#### Linux x86_64

| Flash-Attention | Python | PyTorch | CUDA |
| --------------- | ---------------- | -------------------------- | ---------------------- |
| 2.6.3, 2.8.2 | 3.10, 3.11, 3.12 | 2.5.1, 2.6.0, 2.7.1, 2.8.0 | 12.4.1, 12.8.1, 12.9.1 |

### v0.3.13

[Release](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.3.13)

#### Linux x86_64

| Flash-Attention | Python | PyTorch | CUDA |
| --------------- | ---------------- | -------------------------- | ------ |
| 2.8.1 | 3.10, 3.11, 3.12 | 2.4.1, 2.5.1, 2.6.0, 2.7.1 | 12.8.1 |

### v0.3.12

[Release](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.3.12)

#### Linux x86_64

| Flash-Attention | Python | PyTorch | CUDA |
| --------------- | ---------------- | -------------------------- | -------------- |
| 2.8.0 | 3.10, 3.11, 3.12 | 2.4.1, 2.5.1, 2.6.0, 2.7.1 | 12.4.1, 12.8.1 |

### v0.3.10

[Release](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.3.10)

#### Linux x86_64

| Flash-Attention | Python | PyTorch | CUDA |
| --------------- | ---------------- | ------- | ------ |
| 2.7.4 | 3.10, 3.11, 3.12 | 2.7.1 | 12.8.1 |

### v0.3.9

[Release](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.3.9)

#### Linux x86_64

| Flash-Attention | Python | PyTorch | CUDA |
| ------------------- | ---------------- | ------- | ------ |
| 2.4.3, 2.5.9, 2.6.3 | 3.10, 3.11, 3.12 | 2.7.1 | 12.8.1 |

#### Windows x86_64

| Flash-Attention | Python | PyTorch | CUDA |
| ------------------- | ---------------- | ------------------- | ------ |
| 2.5.9, 2.6.3, 2.7.4 | 3.10, 3.11, 3.12 | 2.4.1, 2.5.1, 2.6.0 | 12.4.1 |

> [!IMPORTANT]
> ⚠️ Building flash-attn v2.7.4 with CUDA 12.8 on Windows cannot be completed because of GitHub Actions’ processing-time limits. I plan to add a self-hosted Windows runner in the future to resolve this issue.

### v0.3.1

[Release](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.3.1)

#### Windows x86_64

| Flash-Attention | Python | PyTorch | CUDA |
| --------------- | ------ | ------- | ------ |
| 2.6.3 | 3.11 | 2.6.0 | 12.6.3 |

Starting with this version, wheels for Windows are released. They have not been tested thoroughly yet, so reports on how they work are welcome.

### v0.2.1

[Release](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.2.1)

| Flash-Attention | Python | PyTorch | CUDA |
| -------------------------- | ---------------- | ----------------- | ------ |
| 2.4.3, 2.5.9, 2.6.3, 2.7.4 | 3.10, 3.11, 3.12 | 2.8.0.dev20250523 | 12.8.1 |

### v0.2.0

[Release](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.2.0)

| Flash-Attention | Python | PyTorch | CUDA |
| ------------------- | ---------------- | ----------------- | ------ |
| 2.4.3, 2.5.9, 2.6.3 | 3.10, 3.11, 3.12 | 2.8.0.dev20250523 | 12.8.1 |

### v0.1.0

[Release](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.1.0)

| Flash-Attention | Python | PyTorch | CUDA |
| -------------------------- | ---------------- | ------- | ------ |
| 2.4.3, 2.5.9, 2.6.3, 2.7.4 | 3.10, 3.11, 3.12 | 2.7.0 | 12.8.1 |

v2.7.4 and v2.7.4.post1 are the same version.
Starting with this release, self-hosted runners are used to build some wheels.

### v0.0.9

[Release](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.0.9)

| Flash-Attention | Python | PyTorch | CUDA |
| ------------------- | ---------------- | ------- | ------ |
| 2.4.3, 2.5.9, 2.6.3 | 3.10, 3.11, 3.12 | 2.7.0 | 12.8.1 |

### v0.0.8

[Release](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.0.8)

| Flash-Attention | Python | PyTorch | CUDA |
| -------------------------------- | ---------------- | -------------------------- | ---------------------- |
| 2.4.3, 2.5.9, 2.6.3, 2.7.4.post1 | 3.10, 3.11, 3.12 | 2.4.1, 2.5.1, 2.6.0, 2.7.0 | 11.8.0, 12.4.1, 12.6.3 |

### v0.0.7

Skipped for experimental reasons.

### v0.0.6

[Release](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.0.6)

| Flash-Attention | Python | PyTorch | CUDA |
| -------------------------------- | ---------------- | --------------------------------- | -------------- |
| 2.4.3, 2.5.9, 2.6.3, 2.7.4.post1 | 3.10, 3.11, 3.12 | 2.2.2, 2.3.1, 2.4.1, 2.5.1, 2.6.0 | 12.4.1, 12.6.3 |

### v0.0.5

[Release](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.0.5)

| Flash-Attention | Python | PyTorch | CUDA |
| ------------------ | ---------------- | ----------------------------------------------- | -------------- |
| 2.6.3, 2.7.4.post1 | 3.10, 3.11, 3.12 | 2.0.1, 2.1.2, 2.2.2, 2.3.1, 2.4.1, 2.5.1, 2.6.0 | 12.4.1, 12.6.3 |

### v0.0.4

[Release](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.0.4)

| Flash-Attention | Python | PyTorch | CUDA |
| --------------- | ---------------- | ---------------------------------------- | ---------------------- |
| 2.7.3 | 3.10, 3.11, 3.12 | 2.0.1, 2.1.2, 2.2.2, 2.3.1, 2.4.1, 2.5.1 | 11.8.0, 12.1.1, 12.4.1 |

### v0.0.3

[Release](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.0.3)

| Flash-Attention | Python | PyTorch | CUDA |
| --------------- | ---------------- | ---------------------------------------- | ---------------------- |
| 2.7.2.post1 | 3.10, 3.11, 3.12 | 2.0.1, 2.1.2, 2.2.2, 2.3.1, 2.4.1, 2.5.1 | 11.8.0, 12.1.1, 12.4.1 |

### v0.0.2

[Release](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.0.2)

| Flash-Attention | Python | PyTorch | CUDA |
| -------------------------------- | ---------------- | ---------------------------------------- | ---------------------- |
| 2.4.3, 2.5.6, 2.6.3, 2.7.0.post2 | 3.10, 3.11, 3.12 | 2.0.1, 2.1.2, 2.2.2, 2.3.1, 2.4.1, 2.5.1 | 11.8.0, 12.1.1, 12.4.1 |

### v0.0.1

[Release](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.0.1)

| Flash-Attention | Python | PyTorch | CUDA |
| --------------------------------- | ---------------- | ---------------------------------------- | ---------------------- |
| 1.0.9, 2.4.3, 2.5.6, 2.5.9, 2.6.3 | 3.10, 3.11, 3.12 | 2.0.1, 2.1.2, 2.2.2, 2.3.1, 2.4.1, 2.5.0 | 11.8.0, 12.1.1, 12.4.1 |

### v0.0.0

[Release](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.0.0)

| Flash-Attention | Python | PyTorch | CUDA |
| -------------------------- | ---------- | ---------------------------------------- | ---------------------- |
| 2.4.3, 2.5.6, 2.5.9, 2.6.3 | 3.11, 3.12 | 2.0.1, 2.1.2, 2.2.2, 2.3.1, 2.4.1, 2.5.0 | 11.8.0, 12.1.1, 12.4.1 |
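
To decide which release to look at, it can help to compare a local environment against one row of the tables above. The following is a minimal sketch, not part of this repository: `parse_cell`, `major_minor`, and `row_covers` are hypothetical helper names, and the comparison on major.minor only is an assumption (newer tables list `2.9`, older ones `2.7.1`). A matching row is treated here only as a hint; whether a wheel exists for a specific combination should still be confirmed on the linked release page.

```python
from __future__ import annotations


def parse_cell(cell: str) -> set[str]:
    """Split a table cell such as "3.10, 3.11" into a set of version strings."""
    return {item.strip() for item in cell.split(",") if item.strip()}


def major_minor(version: str) -> str:
    """Reduce "2.9.1+cu128" or "12.8.1" to "2.9" / "12.8" (string compare, so 3.10 != 3.1)."""
    core = version.split("+", 1)[0]
    return ".".join(core.split(".")[:2])


def row_covers(row: dict[str, str], flash_attn: str, python: str, torch: str, cuda: str) -> bool:
    """Return True if every requested version appears in the corresponding cell of the row."""

    def listed(column: str, wanted: str) -> bool:
        return major_minor(wanted) in {major_minor(v) for v in parse_cell(row[column])}

    # Flash-Attention is matched exactly; the other columns on major.minor only (assumption).
    return (
        flash_attn in parse_cell(row["Flash-Attention"])
        and listed("Python", python)
        and listed("PyTorch", torch)
        and listed("CUDA", cuda)
    )


# Row copied from the v0.7.16 "Linux arm64" table.
row = {
    "Flash-Attention": "2.8.3",
    "Python": "3.10, 3.11, 3.12, 3.13, 3.14",
    "PyTorch": "2.10, 2.9",
    "CUDA": "12.6, 12.8, 13.0",
}

print(row_covers(row, "2.8.3", "3.12", "2.9.1+cu128", "12.8"))  # True
print(row_covers(row, "2.8.3", "3.12", "2.8.0", "12.8"))        # False: PyTorch 2.8 is not listed
```

The local values can be taken from `sys.version_info`, `torch.__version__`, and `torch.version.cuda` when PyTorch is installed; they are passed in as plain strings here so the sketch runs without it.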