diff --git a/README.md b/README.md
index 15ef2eb..fe510bf 100644
--- a/README.md
+++ b/README.md
@@ -15,11 +15,11 @@ The built packages are available on the [release page](https://github.com/mjun08
 ## Table of Contents
 
 - [Install](#install)
+- [Self-build runner](#self-build)
 - [Packages](#packages)
   - [Linux x86_64](#linux-x86_64)
   - [Windows x86_64](#windows-x86_64)
 - [History](#history)
-- [Self-build runner](#self-build-runner)
 - [Original Repository](#original-repository)
 
 ## Install
@@ -33,7 +33,7 @@ flash_attn-[flash_attn Version]+cu[CUDA Version]torch[PyTorch Version]-cp[Python
 flash_attn-2.6.3+cu124torch2.5-cp312-cp312-linux_x86_64.whl
 ```
 
-2. Find the corresponding version of a wheel from the below table and [releases](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases)
+2. Find the corresponding version of a wheel from the below [Package section](#packages) and [releases](https://github.com/mjun0812/flash-attention-prebuild-wheels/releases)
 
 3. Direct Install or Download and Local Install
 
@@ -46,6 +46,61 @@ wget https://github.com/mjun0812/flash-attention-prebuild-wheels/releases/downlo
 pip install ./flash_attn-2.6.3+cu124torch2.5-cp312-cp312-linux_x86_64.whl
 ```
 
+## Self build
+
+If you cannot find the version you are looking for, you can fork this repository and create a wheel on GitHub Actions.
+
+1. Fork this repository
+2. Edit workflow file [`.github/workflows/build.yml`](https://github.com/mjun0812/flash-attention-prebuild-wheels/blob/main/.github/workflows/build.yml) to set the version you want to build.
+3. Add tag `v*.*.*` to trigger the build workflow.
+
+Please note that depending on the combination of versions, it may not be possible to build.
+
+### Self-Hosted Runner Build
+
+In some version combinations, you cannot build wheels on GitHub-hosted runners due to job time limitations.
+To build the wheels for these versions, you can use self-hosted runners.
+
+```bash
+git clone https://github.com/mjun0812/flash-attention-prebuild-wheels.git
+cd self-hosted-runner
+cp env.template env
+```
+
+Edit `env` file to set the environment variables.
+
+```bash
+# Edit env
+PERSONAL_ACCESS_TOKEN=[Github Personal Access Token]
+```
+
+Edit compose.yml file if you use repository folked from this repository.
+
+```yaml
+services:
+  runner:
+    privileged: true
+    build:
+      context: .
+      dockerfile: Dockerfile
+      args:
+        REPOSITORY_URL: [Target Repository URL]
+        PERSONAL_ACCESS_TOKEN: $PERSONAL_ACCESS_TOKEN
+        GH_RUNNER_VERSION: 2.324.0
+        RUNNER_NAME: self-hosted-runner
+        RUNNER_GROUP: default
+        RUNNER_LABELS: self-hosted
+        TARGET_ARCH: x64
+```
+
+Then, build and run the docker container.
+
+```bash
+# Build and run
+docker compose build
+docker compose up -d
+```
+
 ## Packages
 
 ### Linux x86_64
@@ -1129,59 +1184,6 @@ Skip for experimental reasons.
 | -------------------------- | ---------- | ---------------------------------------- | ---------------------- |
 | 2.4.3, 2.5.6, 2.5.9, 2.6.3 | 3.11, 3.12 | 2.0.1, 2.1.2, 2.2.2, 2.3.1, 2.4.1, 2.5.0 | 11.8.0, 12.1.1, 12.4.1 |
 
-## Self build runner
-
-If you want to build the wheels yourself, you can fork this repository and run the build workflow.
-
-1. Fork this repository
-2. Edit workflow file `.github/workflows/build.yml` to set the version you want to build.
-3. Add tag `v*.*.*` to trigger the build workflow.
-
-### Self-Hosted Runner Build
-
-In some version combinations, you cannot build wheels on GitHub-hosted runners due to job time limitations.
-To build the wheels for these versions, you can use self-hosted runners.
-
-```bash
-git clone https://github.com/mjun0812/flash-attention-prebuild-wheels.git
-cd self-hosted-runner
-cp env.template env
-```
-
-Edit `env` file to set the environment variables.
-
-```bash
-# Edit env
-PERSONAL_ACCESS_TOKEN=[Github Personal Access Token]
-```
-
-Edit compose.yml file if you use repository folked from this repository.
-
-```yaml
-services:
-  runner:
-    privileged: true
-    build:
-      context: .
-      dockerfile: Dockerfile
-      args:
-        REPOSITORY_URL: [Target Repository URL]
-        PERSONAL_ACCESS_TOKEN: $PERSONAL_ACCESS_TOKEN
-        GH_RUNNER_VERSION: 2.324.0
-        RUNNER_NAME: self-hosted-runner
-        RUNNER_GROUP: default
-        RUNNER_LABELS: self-hosted
-        TARGET_ARCH: x64
-```
-
-Then, build and run the docker container.
-
-```bash
-# Build and run
-docker compose build
-docker compose up -d
-```
-
 ## Original Repository
 
 [repo](https://github.com/Dao-AILab/flash-attention)