- Rename docs/ to doc/ (contains packages.md, release_history.md, etc.)
- Rename pages/ to docs/ (contains search page index.html)
- Update all references in README.md, workflows, and Python scripts
GitHub Pages only supports / (the repository root) or /docs as the source directory when publishing from a branch.
- Add interactive search page to find and filter prebuilt wheels
- Support filtering by Platform, Flash-Attention, Python, PyTorch, and CUDA versions
- Include copy buttons for install commands and URLs
- Auto-select the result when only one matches the filters
- Cache API responses in localStorage for 1 hour
- Add search page link to README.md
- Add CUDA 12.9 and 13.1 support to Linux self-hosted matrix
- Add Python 3.14 support for Linux self-hosted runners
- Add several version exclusions for specific Torch and CUDA combinations
- Temporarily disable Torch 2.10.0 in all build matrices
- Enable Linux self-hosted runner builds in the main configuration
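
The matrix exclusions and the temporarily disabled Torch version listed above can be sketched as a simple filter over the version product. This is a minimal illustration only, not the actual create_matrix.py: the function name, the matrix keys, the version lists, and the excluded pair are all hypothetical placeholders.

```python
from itertools import product

# All values below are hypothetical placeholders, not the project's real matrix.
TORCH_VERSIONS = ["2.8.0", "2.9.0", "2.10.0"]
CUDA_VERSIONS = ["12.9", "13.1"]
DISABLED_TORCH = {"2.10.0"}            # Torch versions switched off everywhere
EXCLUDED_PAIRS = {("2.8.0", "13.1")}   # (torch, cuda) combinations to skip


def build_matrix(torch_versions, cuda_versions, disabled_torch, excluded_pairs):
    """Return the matrix entries that survive both filters."""
    return [
        {"torch-version": torch, "cuda-version": cuda}
        for torch, cuda in product(torch_versions, cuda_versions)
        if torch not in disabled_torch and (torch, cuda) not in excluded_pairs
    ]


matrix = build_matrix(TORCH_VERSIONS, CUDA_VERSIONS, DISABLED_TORCH, EXCLUDED_PAIRS)
for entry in matrix:
    print(entry)
```

Keeping the disabled versions and the excluded pairs in separate sets means re-enabling a Torch version later is a one-line change that leaves the pair exclusions untouched.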
Apply the same fix to _build_windows.yml and _build_windows_code_build.yml
to ensure gh CLI uses the correct repository context when uploading
release assets.
Remove working-directory from Upload Release Asset step to ensure
gh CLI uses the correct repository (flash-attention-prebuild-wheels)
instead of the cloned flash-attention repository.
- Replace manual GITHUB_REF string manipulation with github.ref_name in _build_windows.yml, _build_windows_code_build.yml, and _build_windows_self_host.yml.
- Add _build_windows_self_host.yml for self-hosted Windows wheel builds.
- Integrate self-hosted Windows build job into main build.yml workflow.
- Update create_matrix.py to include and enable Windows self-hosted build matrix.
- Implement comprehensive cleanup steps in the self-hosted runner workflow to ensure a clean state for subsequent runs.
- Fix wheel path issue: build_windows.ps1 changes into the
flash-attention directory, which produced a doubled path such as
flash-attention/flash-attention/dist
- Add working-directory to Install Test and Upload steps for explicit
directory control
- Add log grouping (::group::) in build_windows.ps1 for collapsible
logs in GitHub Actions
- Suppress verbose output with pip -q, git clone -q, and
NINJA_STATUS=""
- Add pwsh and vswhere to prerequisites list
- Increase timeout to 2160 minutes (36 hours) for long builds
- Improve CUDA cleanup by using the proper Windows uninstaller
- Update README platform table and manylinux compatibility note
- Add .github/workflows/test-windows-self-hosted.yml for Windows self-hosted runner testing.
- Update README.md with comprehensive self-hosted runner setup guides for Linux, ARM64, and Windows.
- Update self-hosted-runner/compose.yml to enable both x86_64 and ARM64 runner services.
- Add a note about manylinux_2_28 and update the sponsor list in README.md.
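
The doubled wheel path mentioned above (flash-attention/flash-attention/dist) is what happens when a path relative to the workspace root is resolved after the working directory has already moved into flash-attention. The sketch below reproduces that effect. It is an illustration under assumptions, not code from the workflows: the `/work` root and the `resolve` helper are hypothetical, and POSIX-style separators are used for readability even though the affected runner is Windows.

```python
from pathlib import PurePosixPath


def resolve(cwd: str, relative: str) -> str:
    """Join a relative path onto a working directory, as a shell would."""
    return str(PurePosixPath(cwd) / relative)


WORKSPACE = "/work"  # hypothetical checkout root

# Intended location of the built wheels, seen from the workspace root.
expected = resolve(WORKSPACE, "flash-attention/dist")

# After the build script has changed into flash-attention, the same
# relative path points one level too deep.
doubled = resolve(f"{WORKSPACE}/flash-attention", "flash-attention/dist")

# Setting the working directory explicitly and using a path relative to it
# lands on the intended location again.
fixed = resolve(f"{WORKSPACE}/flash-attention", "dist")

print(expected)
print(doubled)
print(fixed)
```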