## History
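
Each table below lists the Flash-Attention, Python, PyTorch, and CUDA versions covered by a release on a given platform. As a minimal sketch, a table row can be checked against a local environment as follows. The helper names (`major_minor`, `parse_row`, `row_lists`) are hypothetical and not part of this project, and the comparison on `major.minor` only is an assumption. The tables do not state whether every combination in a row was built, so a match means only that each version appears in the row.

```python
# Hypothetical helpers for reading one row of the tables in this file.
COLUMNS = ("flash_attention", "python", "pytorch", "cuda")


def major_minor(version: str) -> str:
    """Reduce e.g. '2.9.1+cu128' or '12.8.1' to 'major.minor'."""
    return ".".join(version.split("+")[0].split(".")[:2])


def parse_row(row: str) -> dict[str, set[str]]:
    """Split a Markdown table row into one set of versions per column."""
    cells = [cell.strip() for cell in row.strip().strip("|").split("|")]
    if len(cells) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} cells, got {len(cells)}")
    parsed = {}
    for name, cell in zip(COLUMNS, cells):
        versions = {v.strip() for v in cell.split(",")}
        # Flash-Attention versions are kept in full; the other columns are
        # normalized to major.minor (an assumption of this sketch).
        if name == "flash_attention":
            parsed[name] = versions
        else:
            parsed[name] = {major_minor(v) for v in versions}
    return parsed


def row_lists(row: str, python: str, pytorch: str, cuda: str) -> bool:
    """Return True if each given version appears in its column of the row."""
    parsed = parse_row(row)
    return (
        major_minor(python) in parsed["python"]
        and major_minor(pytorch) in parsed["pytorch"]
        and major_minor(cuda) in parsed["cuda"]
    )


# Row copied from the v0.7.16 "Linux arm64" table.
ROW = "| 2.8.3 | 3.10, 3.11, 3.12, 3.13, 3.14 | 2.10, 2.9 | 12.6, 12.8, 13.0 |"
print(row_lists(ROW, "3.12.4", "2.9.1+cu128", "12.8"))  # True
```

In a real environment, the three arguments could come from `platform.python_version()`, `torch.__version__`, and `torch.version.cuda`.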
### v0.7.16
[Release](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.7.16)
#### Linux arm64
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.8.3 | 3.10, 3.11, 3.12, 3.13, 3.14 | 2.10, 2.9 | 12.6, 12.8, 13.0 |
#### Linux x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.6.3, 2.7.4, 2.8.3 | 3.10, 3.11, 3.12, 3.13, 3.14 | 2.10, 2.6, 2.7, 2.8, 2.9 | 12.4, 12.6, 12.8, 13.0 |
#### Manylinux 2_24 x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.6.3, 2.7.4, 2.8.3 | 3.10, 3.11, 3.12, 3.13, 3.14 | 2.10, 2.6, 2.7, 2.8, 2.9 | 12.6, 12.8, 13.0 |
#### Manylinux 2_34 arm64
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.8.3 | 3.10, 3.11, 3.12, 3.13, 3.14 | 2.10, 2.9 | 12.6, 12.8, 13.0 |
#### Manylinux2014 x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.6.3, 2.7.4, 2.8.3 | 3.10, 3.11, 3.12, 3.13 | 2.6 | 12.4 |
### v0.7.13
[Release](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.7.13)
#### Windows x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.8.3 | 3.10, 3.11, 3.12, 3.13, 3.14 | 2.10, 2.9 | 12.8, 13.0 |
### v0.7.15
[Release](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.7.15)
#### Linux x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.6.3, 2.7.4, 2.8.3 | 3.10, 3.11, 3.12, 3.13, 3.14 | 2.10 | 12.6 |
#### Manylinux 2_24 x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.6.3, 2.7.4, 2.8.3 | 3.10, 3.11, 3.12, 3.13, 3.14 | 2.10 | 12.6 |
### v0.7.11
[Release](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.7.11)
#### Linux x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.6.3, 2.7.4, 2.8.3 | 3.10, 3.11, 3.12, 3.13, 3.14 | 2.8, 2.9 | 12.9, 13.1 |
#### Manylinux 2_24 x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.6.3, 2.7.4, 2.8.3 | 3.10, 3.11, 3.12, 3.13, 3.14 | 2.8, 2.9 | 12.9, 13.1 |
#### Windows x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.8.3 | 3.10, 3.11, 3.12, 3.13 | 2.5, 2.6, 2.7, 2.8, 2.9 | 12.8 |
### v0.7.7
[Release](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.7.7)
#### Windows x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.8.3 | 3.10, 3.11, 3.13 | 2.5, 2.7, 2.8 | 12.8 |
### v0.7.6
[Release](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.7.6)
#### Windows x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| --- | --- | --- | --- |
| 2.8.3 | 3.12 | 2.9 | 12.8 |
### v0.7.2
[Release](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.7.2)
#### Linux x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| ------------------- | ---------------------- | ------------------ | ---------- |
| 2.6.3, 2.7.4, 2.8.3 | 3.10, 3.11, 3.12, 3.13 | 2.5, 2.6, 2.7, 2.8 | 12.8, 12.9 |
#### Manylinux 2_24 x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| ------------------- | ---------------------- | ------------- | ---------- |
| 2.6.3, 2.7.4, 2.8.3 | 3.10, 3.11, 3.12, 3.13 | 2.6, 2.7, 2.8 | 12.8, 12.9 |
#### Manylinux2014 x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| ------------------- | ---------------------- | ------- | ---- |
| 2.6.3, 2.7.4, 2.8.3 | 3.10, 3.11, 3.12, 3.13 | 2.5 | 12.8 |
### v0.7.0
[Release](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.7.0)
#### Linux x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| ------------------- | ---------------- | ------- | ---------- |
| 2.6.3, 2.7.4, 2.8.3 | 3.10, 3.11, 3.12 | 2.9 | 12.8, 13.0 |
### v0.6.9
[Release](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.6.9)
#### Linux x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| --------------- | ------ | ------- | ---- |
| 2.6.3 | 3.14 | 2.9 | 13.0 |
### v0.6.4
[Release](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.6.4)
#### Linux arm64
| Flash-Attention | Python | PyTorch | CUDA |
| --------------- | ---------------- | ------------------ | ---------------- |
| 2.7.4, 2.8.3 | 3.10, 3.11, 3.12 | 2.5, 2.6, 2.7, 2.9 | 12.4, 12.8, 13.0 |
### v0.6.3
[Release](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.6.3)
#### Linux arm64
| Flash-Attention | Python | PyTorch | CUDA |
| --------------- | ---------------- | ------------------ | ---------------- |
| 2.6.3 | 3.10, 3.11, 3.12 | 2.5, 2.6, 2.7, 2.9 | 12.4, 12.8, 13.0 |
### v0.5.4
[Release](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.5.4)
#### Linux x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| ------------------- | ---------------- | ----------------------- | ---------------------- |
| 2.6.3, 2.7.4, 2.8.3 | 3.10, 3.11, 3.12 | 2.5, 2.6, 2.7, 2.8, 2.9 | 12.4, 12.6, 12.8, 13.0 |
### v0.4.22
[Release](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.4.22)
#### Linux x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| --------------- | ---------------------- | ------- | ---------- |
| 2.8.1 | 3.10, 3.11, 3.12, 3.13 | 2.9 | 12.8, 13.0 |
### v0.4.18
[Release](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.4.18)
#### Linux x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| --------------- | ---------------------- | ------- | ---- |
| 2.6.3, 2.8.3 | 3.10, 3.11, 3.12, 3.13 | 2.9 | 13.0 |
### v0.4.17
[Release](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.4.17)
#### Linux x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| --------------- | ---------------------- | ------- | ---------- |
| 2.6.3, 2.8.3 | 3.10, 3.11, 3.12, 3.13 | 2.9 | 12.6, 12.8 |
### v0.4.16
[Release](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.4.16)
#### Linux x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| --------------- | ------ | ------------------ | ---------- |
| 2.6.3, 2.8.3 | 3.9 | 2.5, 2.6, 2.7, 2.8 | 12.4, 12.6 |
### v0.4.15
[Release](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.4.15)
#### Linux x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| --------------- | ---------------- | ------- | ---------- |
| 2.8.3 | 3.11, 3.12, 3.13 | 2.9 | 12.6, 12.8 |
#### Windows x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| --------------- | ---------------- | ------- | ---- |
| 2.8.3 | 3.11, 3.12, 3.13 | 2.9 | 12.6 |
### v0.4.12
[Release](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.4.12)
#### Linux x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| --------------- | ------ | ------------- | ---------------------- |
| 2.8.3 | 3.13 | 2.6, 2.7, 2.8 | 12.4, 12.6, 12.8, 12.9 |
#### Windows x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| --------------- | ------ | ------------- | ---------- |
| 2.8.2 | 3.13 | 2.6, 2.7, 2.8 | 12.4, 12.6 |
### v0.4.11
[Release](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.4.11)
#### Linux x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| --------------- | ---------------- | ------------------ | ---------------------- |
| 2.8.3 | 3.10, 3.11, 3.12 | 2.5, 2.6, 2.7, 2.8 | 12.4, 12.6, 12.8, 12.9 |
### v0.4.10
[Release](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.4.10)
#### Windows x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| --------------- | ---------------- | -------- | ---- |
| 2.7.4, 2.8.2 | 3.10, 3.11, 3.12 | 2.7, 2.8 | 12.8 |
### v0.4.9
[Release](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.4.9)
#### Windows x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| --------------- | ------ | ------- | ---- |
| 2.7.4 | 3.11 | 2.7 | 12.8 |
### v0.3.18
[Release](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.3.18)
#### Linux x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| --------------- | ---------------- | ------------------ | ---------------- |
| 2.7.4 | 3.10, 3.11, 3.12 | 2.5, 2.6, 2.7, 2.8 | 12.4, 12.8, 12.9 |
### v0.3.14
[Release](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.3.14)
#### Linux x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| --------------- | ---------------- | -------------------------- | ---------------------- |
| 2.6.3, 2.8.2 | 3.10, 3.11, 3.12 | 2.5.1, 2.6.0, 2.7.1, 2.8.0 | 12.4.1, 12.8.1, 12.9.1 |
### v0.3.13
[Release](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.3.13)
#### Linux x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| --------------- | ---------------- | -------------------------- | ------ |
| 2.8.1 | 3.10, 3.11, 3.12 | 2.4.1, 2.5.1, 2.6.0, 2.7.1 | 12.8.1 |
### v0.3.12
[Release](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.3.12)
#### Linux x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| --------------- | ---------------- | -------------------------- | -------------- |
| 2.8.0 | 3.10, 3.11, 3.12 | 2.4.1, 2.5.1, 2.6.0, 2.7.1 | 12.4.1, 12.8.1 |
### v0.3.10
[Release](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.3.10)
#### Linux x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| --------------- | ---------------- | ------- | ------ |
| 2.7.4 | 3.10, 3.11, 3.12 | 2.7.1 | 12.8.1 |
### v0.3.9
[Release](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.3.9)
#### Linux x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| ------------------- | ---------------- | ------- | ------ |
| 2.4.3, 2.5.9, 2.6.3 | 3.10, 3.11, 3.12 | 2.7.1 | 12.8.1 |
#### Windows x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| ------------------- | ---------------- | ------------------- | ------ |
| 2.5.9, 2.6.3, 2.7.4 | 3.10, 3.11, 3.12 | 2.4.1, 2.5.1, 2.6.0 | 12.4.1 |
> [!IMPORTANT]
> ⚠️ Building flash-attn v2.7.4 with CUDA 12.8 on Windows cannot be completed because of GitHub Actions processing-time limits. In the future, I plan to add a self-hosted Windows runner to resolve this issue.
### v0.3.1
[Release](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.3.1)
#### Windows x86_64
| Flash-Attention | Python | PyTorch | CUDA |
| --------------- | ------ | ------- | ------ |
| 2.6.3 | 3.11 | 2.6.0 | 12.6.3 |

Starting with this version, wheels for Windows are released.
They have not been tested thoroughly yet, so reports on how they work are welcome.
### v0.2.1
[Release](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.2.1)
| Flash-Attention | Python | PyTorch | CUDA |
| -------------------------- | ---------------- | ----------------- | ------ |
| 2.4.3, 2.5.9, 2.6.3, 2.7.4 | 3.10, 3.11, 3.12 | 2.8.0.dev20250523 | 12.8.1 |
### v0.2.0
[Release](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.2.0)
| Flash-Attention | Python | PyTorch | CUDA |
| ------------------- | ---------------- | ----------------- | ------ |
| 2.4.3, 2.5.9, 2.6.3 | 3.10, 3.11, 3.12 | 2.8.0.dev20250523 | 12.8.1 |
### v0.1.0
[Release](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.1.0)
| Flash-Attention | Python | PyTorch | CUDA |
| -------------------------- | ---------------- | ------- | ------ |
| 2.4.3, 2.5.9, 2.6.3, 2.7.4 | 3.10, 3.11, 3.12 | 2.7.0 | 12.8.1 |

Flash-Attention v2.7.4 and v2.7.4.post1 are the same version.
Starting with this release, some wheels are built on self-hosted runners.
### v0.0.9
[Release](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.0.9)
| Flash-Attention | Python | PyTorch | CUDA |
| ------------------- | ---------------- | ------- | ------ |
| 2.4.3, 2.5.9, 2.6.3 | 3.10, 3.11, 3.12 | 2.7.0 | 12.8.1 |
### v0.0.8
[Release](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.0.8)
| Flash-Attention | Python | PyTorch | CUDA |
| -------------------------------- | ---------------- | -------------------------- | ---------------------- |
| 2.4.3, 2.5.9, 2.6.3, 2.7.4.post1 | 3.10, 3.11, 3.12 | 2.4.1, 2.5.1, 2.6.0, 2.7.0 | 11.8.0, 12.4.1, 12.6.3 |
### v0.0.7
Skipped for experimental reasons.
### v0.0.6
[Release](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.0.6)
| Flash-Attention | Python | PyTorch | CUDA |
| -------------------------------- | ---------------- | --------------------------------- | -------------- |
| 2.4.3, 2.5.9, 2.6.3, 2.7.4.post1 | 3.10, 3.11, 3.12 | 2.2.2, 2.3.1, 2.4.1, 2.5.1, 2.6.0 | 12.4.1, 12.6.3 |
### v0.0.5
[Release](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.0.5)
| Flash-Attention | Python | PyTorch | CUDA |
| ------------------ | ---------------- | ----------------------------------------------- | -------------- |
| 2.6.3, 2.7.4.post1 | 3.10, 3.11, 3.12 | 2.0.1, 2.1.2, 2.2.2, 2.3.1, 2.4.1, 2.5.1, 2.6.0 | 12.4.1, 12.6.3 |
### v0.0.4
[Release](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.0.4)
| Flash-Attention | Python | PyTorch | CUDA |
| --------------- | ---------------- | ---------------------------------------- | ---------------------- |
| 2.7.3 | 3.10, 3.11, 3.12 | 2.0.1, 2.1.2, 2.2.2, 2.3.1, 2.4.1, 2.5.1 | 11.8.0, 12.1.1, 12.4.1 |
### v0.0.3
[Release](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.0.3)
| Flash-Attention | Python | PyTorch | CUDA |
| --------------- | ---------------- | ---------------------------------------- | ---------------------- |
| 2.7.2.post1 | 3.10, 3.11, 3.12 | 2.0.1, 2.1.2, 2.2.2, 2.3.1, 2.4.1, 2.5.1 | 11.8.0, 12.1.1, 12.4.1 |
### v0.0.2
[Release](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.0.2)
| Flash-Attention | Python | PyTorch | CUDA |
| -------------------------------- | ---------------- | ---------------------------------------- | ---------------------- |
| 2.4.3, 2.5.6, 2.6.3, 2.7.0.post2 | 3.10, 3.11, 3.12 | 2.0.1, 2.1.2, 2.2.2, 2.3.1, 2.4.1, 2.5.1 | 11.8.0, 12.1.1, 12.4.1 |
### v0.0.1
[Release](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.0.1)
| Flash-Attention | Python | PyTorch | CUDA |
| --------------------------------- | ---------------- | ---------------------------------------- | ---------------------- |
| 1.0.9, 2.4.3, 2.5.6, 2.5.9, 2.6.3 | 3.10, 3.11, 3.12 | 2.0.1, 2.1.2, 2.2.2, 2.3.1, 2.4.1, 2.5.0 | 11.8.0, 12.1.1, 12.4.1 |
### v0.0.0
[Release](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/tag/v0.0.0)
| Flash-Attention | Python | PyTorch | CUDA |
| -------------------------- | ---------- | ---------------------------------------- | ---------------------- |
| 2.4.3, 2.5.6, 2.5.9, 2.6.3 | 3.11, 3.12 | 2.0.1, 2.1.2, 2.2.2, 2.3.1, 2.4.1, 2.5.0 | 11.8.0, 12.1.1, 12.4.1 |