mirror of
https://github.com/BillyOutlast/flash-attention-prebuild-wheels-rocm.git
synced 2026-07-01 01:27:54 -04:00
Update README.md
@@ -15,11 +15,11 @@ The built packages are available on the [release page](https://github.com/mjun08
|
|||||||
## Table of Contents
|
## Table of Contents
|
||||||
|
|
||||||
- [Install](#install)
|
- [Install](#install)
|
||||||
|
- [Self-build runner](#self-build)
|
||||||
- [Packages](#packages)
|
- [Packages](#packages)
|
||||||
- [Linux x86_64](#linux-x86_64)
|
- [Linux x86_64](#linux-x86_64)
|
||||||
- [Windows x86_64](#windows-x86_64)
|
- [Windows x86_64](#windows-x86_64)
|
||||||
- [History](#history)
|
- [History](#history)
|
||||||
- [Self-build runner](#self-build-runner)
|
|
||||||
- [Original Repository](#original-repository)
|
- [Original Repository](#original-repository)
|
||||||
|
|
||||||
## Install
|
## Install
|
||||||
@@ -33,7 +33,7 @@ flash_attn-[flash_attn Version]+cu[CUDA Version]torch[PyTorch Version]-cp[Python
|
|||||||
flash_attn-2.6.3+cu124torch2.5-cp312-cp312-linux_x86_64.whl
|
flash_attn-2.6.3+cu124torch2.5-cp312-cp312-linux_x86_64.whl
|
||||||
```
|
```
|
||||||
|
|
||||||
2. Find the matching wheel in the table below or on the [releases](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases) page
|
2. Find the matching wheel in the [Packages section](#packages) below or on the [releases](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases) page
|
||||||
|
|
||||||
3. Direct Install or Download and Local Install
|
3. Direct Install or Download and Local Install
|
||||||
|
|
||||||
@@ -46,6 +46,61 @@ wget https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/downlo
|
|||||||
pip install ./flash_attn-2.6.3+cu124torch2.5-cp312-cp312-linux_x86_64.whl
|
pip install ./flash_attn-2.6.3+cu124torch2.5-cp312-cp312-linux_x86_64.whl
|
||||||
```
|
```
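The wheel file name used above can be assembled from the versions installed locally. The following is a minimal sketch, not part of this repository: `wheel_name` is a hypothetical helper, it assumes a Linux x86_64 target, and it assumes the CUDA and Python versions are given as `major.minor` (the example file name uses `cu124` and `cp312`).

```bash
# Hypothetical helper: build the wheel file name from four version strings.
# Arguments: flash_attn version, CUDA version, PyTorch version, Python version.
wheel_name() (
  # Drop the dots: 12.4 -> 124, 3.12 -> 312
  cuda_tag="$(printf '%s' "$2" | tr -d '.')"
  py_tag="$(printf '%s' "$4" | tr -d '.')"
  printf 'flash_attn-%s+cu%storch%s-cp%s-cp%s-linux_x86_64.whl\n' \
    "$1" "$cuda_tag" "$3" "$py_tag" "$py_tag"
)

wheel_name 2.6.3 12.4 2.5 3.12
# prints flash_attn-2.6.3+cu124torch2.5-cp312-cp312-linux_x86_64.whl
```

The result is only a candidate name; check that the file actually exists on the releases page before installing it.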
|
||||||
|
|
||||||
|
## Self build
|
||||||
|
|
||||||
|
If you cannot find the version you are looking for, you can fork this repository and create a wheel on GitHub Actions.
|
||||||
|
|
||||||
|
1. Fork this repository
|
||||||
|
2. Edit the workflow file [`.github/workflows/build.yml`](https://github.com/mjun0812/flash-attention-prebuild-wheels/blob/main/.github/workflows/build.yml) to set the versions you want to build.
|
||||||
|
3. Push a tag matching `v*.*.*` to trigger the build workflow.
|
||||||
|
|
||||||
|
Note that some combinations of versions may fail to build.
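The tagging step can be sketched as follows. This is an illustration under assumptions: `is_release_tag` is a hypothetical helper, the tag `v0.1.0` is made up, and the remote is assumed to be called `origin`. The shell `case` glob only approximates the `v*.*.*` tag filter, whose exact matching rules are defined by GitHub Actions.

```bash
# Hypothetical helper: succeed if the tag name looks like v*.*.*
is_release_tag() {
  case "$1" in
    v*.*.*) return 0 ;;
    *) return 1 ;;
  esac
}

tag="v0.1.0"  # hypothetical tag name
if is_release_tag "$tag"; then
  # Print the commands instead of running them, so nothing is pushed by accident.
  echo "git tag $tag"
  echo "git push origin $tag"
fi
```

Run the two printed commands in your fork once the workflow file is committed.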
|
||||||
|
|
||||||
|
### Self-Hosted Runner Build
|
||||||
|
|
||||||
|
Some version combinations cannot be built on GitHub-hosted runners because of job time limits.
For those versions, use a self-hosted runner.
|
||||||
|
|
||||||
|
```bash
git clone https://github.com/mjun0812/flash-attention-prebuild-wheels.git
# The runner files live inside the cloned repository
cd flash-attention-prebuild-wheels/self-hosted-runner
cp env.template env
```
|
||||||
|
|
||||||
|
Edit the `env` file to set the environment variables.
|
||||||
|
|
||||||
|
```bash
# Edit env
PERSONAL_ACCESS_TOKEN=[GitHub Personal Access Token]
```
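Before building the container, it can help to confirm that the placeholder in `env` was actually replaced. A minimal sketch, assuming the `KEY=value` layout shown above; `check_env` is a hypothetical helper and not part of this repository.

```bash
# Hypothetical helper: fail if PERSONAL_ACCESS_TOKEN is missing, empty,
# or still set to the bracketed placeholder from the template.
check_env() {
  # Take the last PERSONAL_ACCESS_TOKEN assignment in the file, if any.
  pat_value="$(grep -E '^PERSONAL_ACCESS_TOKEN=' "$1" 2>/dev/null | tail -n 1 | cut -d= -f2-)"
  case "$pat_value" in
    '' | \[*)
      echo "PERSONAL_ACCESS_TOKEN is not set in $1" >&2
      return 1
      ;;
  esac
  return 0
}
```

For example, `check_env env && docker compose build` stops before the build if the token is missing.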
|
||||||
|
|
||||||
|
Edit the `compose.yml` file if you use a repository forked from this one.
|
||||||
|
|
||||||
|
```yaml
services:
  runner:
    privileged: true
    build:
      context: .
      dockerfile: Dockerfile
      args:
        REPOSITORY_URL: [Target Repository URL]
        PERSONAL_ACCESS_TOKEN: $PERSONAL_ACCESS_TOKEN
        GH_RUNNER_VERSION: 2.324.0
        RUNNER_NAME: self-hosted-runner
        RUNNER_GROUP: default
        RUNNER_LABELS: self-hosted
        TARGET_ARCH: x64
```
|
||||||
|
|
||||||
|
Then build and run the Docker container.
|
||||||
|
|
||||||
|
```bash
# Build and run
docker compose build
docker compose up -d
```
|
||||||
|
|
||||||
## Packages
|
## Packages
|
||||||
|
|
||||||
### Linux x86_64
|
### Linux x86_64
|
||||||
@@ -1129,59 +1184,6 @@ Skip for experimental reasons.
|
|||||||
| -------------------------- | ---------- | ---------------------------------------- | ---------------------- |
|
| -------------------------- | ---------- | ---------------------------------------- | ---------------------- |
|
||||||
| 2.4.3, 2.5.6, 2.5.9, 2.6.3 | 3.11, 3.12 | 2.0.1, 2.1.2, 2.2.2, 2.3.1, 2.4.1, 2.5.0 | 11.8.0, 12.1.1, 12.4.1 |
|
| 2.4.3, 2.5.6, 2.5.9, 2.6.3 | 3.11, 3.12 | 2.0.1, 2.1.2, 2.2.2, 2.3.1, 2.4.1, 2.5.0 | 11.8.0, 12.1.1, 12.4.1 |
|
||||||
|
|
||||||
## Self build runner
|
|
||||||
|
|
||||||
If you want to build the wheels yourself, you can fork this repository and run the build workflow.
|
|
||||||
|
|
||||||
1. Fork this repository
|
|
||||||
2. Edit the workflow file `.github/workflows/build.yml` to set the versions you want to build.
|
|
||||||
3. Push a tag matching `v*.*.*` to trigger the build workflow.
|
|
||||||
|
|
||||||
### Self-Hosted Runner Build
|
|
||||||
|
|
||||||
Some version combinations cannot be built on GitHub-hosted runners because of job time limits.
For those versions, use a self-hosted runner.
|
|
||||||
|
|
||||||
```bash
git clone https://github.com/mjun0812/flash-attention-prebuild-wheels.git
# The runner files live inside the cloned repository
cd flash-attention-prebuild-wheels/self-hosted-runner
cp env.template env
```
|
|
||||||
|
|
||||||
Edit the `env` file to set the environment variables.
|
|
||||||
|
|
||||||
```bash
# Edit env
PERSONAL_ACCESS_TOKEN=[GitHub Personal Access Token]
```
|
|
||||||
|
|
||||||
Edit the `compose.yml` file if you use a repository forked from this one.
|
|
||||||
|
|
||||||
```yaml
services:
  runner:
    privileged: true
    build:
      context: .
      dockerfile: Dockerfile
      args:
        REPOSITORY_URL: [Target Repository URL]
        PERSONAL_ACCESS_TOKEN: $PERSONAL_ACCESS_TOKEN
        GH_RUNNER_VERSION: 2.324.0
        RUNNER_NAME: self-hosted-runner
        RUNNER_GROUP: default
        RUNNER_LABELS: self-hosted
        TARGET_ARCH: x64
```
|
|
||||||
|
|
||||||
Then build and run the Docker container.
|
|
||||||
|
|
||||||
```bash
# Build and run
docker compose build
docker compose up -d
```
|
|
||||||
|
|
||||||
## Original Repository
|
## Original Repository
|
||||||
|
|
||||||
[repo](https://github.com/Dao-AILab/flash-attention)
|
[repo](https://github.com/Dao-AILab/flash-attention)
|
||||||
|
|||||||