ROCm 7.1 Automated Docker Environment
A comprehensive Docker-based environment for running AI workloads on AMD GPUs with ROCm 7.1 support. This project provides optimized containers for Ollama LLM inference and Stable Diffusion image generation.
Sponsored by https://shad-base.com
🚀 Features
- ROCm 7.1 Support: Latest AMD GPU compute platform
- Ollama Integration: Optimized LLM inference with ROCm backend
- Stable Diffusion: AI image generation with AMD GPU acceleration
- Multi-GPU Support: Automatic detection and utilization of multiple AMD GPUs
- Performance Optimized: Tuned for maximum throughput and minimal latency
- Easy Deployment: One-command setup with Docker Compose
📋 Prerequisites
Hardware Requirements
- AMD GPU: RDNA 2/3 architecture (RX 6000/7000 series or newer)
- Memory: 16GB+ system RAM recommended
- VRAM: 8GB+ GPU memory for large models
Software Requirements
- Linux Distribution: Ubuntu 22.04+, Fedora 38+, or compatible
- Docker: 24.0+ with BuildKit support
- Docker Compose: 2.20+
- Podman (alternative): 4.0+
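Before proceeding, you can quickly confirm the toolchain meets these minimums (a minimal sketch; rocminfo requires the host ROCm packages to be installed):
# Check Docker and Compose versions against the requirements above
docker --version          # expect 24.0 or newer
docker-compose --version  # expect 2.20 or newer
# Confirm the host ROCm stack can see your GPU
rocminfo | grep -i gfx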
Supported GPUs
- Radeon RX 7900 XTX/XT
- Radeon RX 7800/7700 XT
- Radeon RX 6950/6900/6800/6700 XT
- AMD APUs with RDNA graphics (limited performance)
🛠️ Installation
1. Clone Repository
git clone https://github.com/BillyOutlast/rocm-automated.git
cd rocm-automated
2. Set GPU Override (if needed)
For newer or unsupported GPU architectures:
# Check your GPU architecture
rocminfo | grep "Name:"
# Set override for newer GPUs (example for RX 7000 series)
export HSA_OVERRIDE_GFX_VERSION=11.0.0
3. Download and Start Services
# Pull the latest prebuilt images and start all services
docker-compose up -d
# View logs
docker-compose logs -f
Alternative: Build Images Locally
If you prefer to build the images locally instead of using prebuilt ones:
# Make build script executable
chmod +x build.sh
# Build all Docker images
./build.sh
# Then start services
docker-compose up -d
🐳 Docker Images
Available Prebuilt Images
- getterup/ollama-rocm7.1:latest: Ollama with ROCm 7.1 backend for LLM inference
- getterup/stable-diffusion.cpp-rocm7.1:gfx1151: Stable Diffusion with ROCm 7.1 acceleration
- getterup/comfyui:rocm7.1: ComfyUI with ROCm 7.1 support
- ghcr.io/open-webui/open-webui:main: Web interface for Ollama
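To fetch the prebuilt images ahead of time (optional, since docker-compose pulls missing images on first start):
# Pre-pull the prebuilt images listed above
docker pull getterup/ollama-rocm7.1:latest
docker pull getterup/stable-diffusion.cpp-rocm7.1:gfx1151
docker pull getterup/comfyui:rocm7.1
docker pull ghcr.io/open-webui/open-webui:main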
What's Included
These prebuilt images come with:
- ROCm 7.1 runtime libraries
- GPU-specific optimizations
- Performance tuning for inference workloads
- Ready-to-run configurations
Build Process (Optional)
The automated build script can create custom images with:
- ROCm 7.1 runtime libraries
- GPU-specific optimizations
- Performance tuning for inference workloads
📊 Services
Ollama LLM Service
Port: 11434
Container: ollama
Features:
- Multi-model support (Llama, Mistral, CodeLlama, etc.)
- ROCm-optimized inference engine
- Flash Attention support
- Quantized model support (Q4, Q8)
Usage Examples
# Pull a model
docker exec ollama ollama pull llama3.2
# Run inference
curl -X POST http://localhost:11434/api/generate \
-H "Content-Type: application/json" \
-d '{"model": "llama3.2", "prompt": "Hello, world!"}'
# Chat interface
curl -X POST http://localhost:11434/api/chat \
-H "Content-Type: application/json" \
-d '{"model": "llama3.2", "messages": [{"role": "user", "content": "Hi there!"}]}'
Stable Diffusion Service
Port: 7860
Container: stable-diffusion.cpp
Features:
- Text-to-image generation
- ROCm acceleration
- Multiple model formats
- Customizable parameters
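Usage Example
The exact entrypoint depends on the image; as a sketch, assuming the container ships the stable-diffusion.cpp sd binary and a checkpoint mounted under /app, a text-to-image run could look like:
# Hypothetical paths and model filename; adjust to match your container layout
docker exec stable-diffusion.cpp \
  sd -m /app/models/sd-v1-5.safetensors \
  -p "a photo of an astronaut riding a horse on mars" \
  -o /app/output.png --steps 20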
⚙️ Configuration
Environment Variables
Ollama Service
environment:
- OLLAMA_DEBUG=1 # Debug level (0-2)
- OLLAMA_FLASH_ATTENTION=true # Enable flash attention
- OLLAMA_KV_CACHE_TYPE="q8_0" # KV cache quantization
- ROCR_VISIBLE_DEVICES=0 # GPU selection
- OLLAMA_KEEP_ALIVE=-1 # Keep models loaded
- OLLAMA_MAX_LOADED_MODELS=1 # Max concurrent models
GPU Configuration
environment:
- HSA_OVERRIDE_GFX_VERSION="11.5.1" # GPU architecture override
- HSA_ENABLE_SDMA=0 # Disable SDMA for stability
Volume Mounts
volumes:
- ./ollama:/root/.ollama:Z # Model storage
- ./stable-diffusion.cpp:/app:Z # SD model storage
Device Access
devices:
- /dev/kfd:/dev/kfd # ROCm compute device
- /dev/dri:/dev/dri # GPU render nodes
group_add:
- video # Video group access
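Putting the fragments together, these settings can be applied through a compose override file rather than editing docker-compose.yaml directly (a sketch; the service name ollama and the values below mirror the examples above):
# Create an override file that compose merges with docker-compose.yaml automatically
cat > docker-compose.override.yaml <<'EOF'
services:
  ollama:
    environment:
      - OLLAMA_FLASH_ATTENTION=true
      - ROCR_VISIBLE_DEVICES=0
    volumes:
      - ./ollama:/root/.ollama:Z
    devices:
      - /dev/kfd:/dev/kfd
      - /dev/dri:/dev/dri
    group_add:
      - video
EOF
# Recreate the service with the merged configuration
docker-compose up -d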
🔧 Performance Tuning
GPU Selection
For multi-GPU systems, specify the preferred device:
# List available GPUs
rocminfo
# Set specific GPU
export ROCR_VISIBLE_DEVICES=0
Memory Optimization
# For large models, increase system memory limits
echo 'vm.max_map_count=262144' | sudo tee -a /etc/sysctl.conf
sudo sysctl -p
Model Optimization
- Use quantized models (Q4_K_M, Q8_0) for better performance
- Enable flash attention for transformer models
- Adjust context length based on available VRAM
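For example (the model tag below is illustrative; available quantization tags vary per model in the Ollama library):
# Pull a quantized variant and cap the context window to fit available VRAM
docker exec ollama ollama pull llama3.2:3b-instruct-q4_K_M
curl -X POST http://localhost:11434/api/generate \
  -H "Content-Type: application/json" \
  -d '{"model": "llama3.2:3b-instruct-q4_K_M", "prompt": "Hello!", "options": {"num_ctx": 4096}}'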
🚨 Troubleshooting
Common Issues
GPU Not Detected
# Check ROCm installation
rocminfo
# Verify device permissions
ls -la /dev/kfd /dev/dri/
# Check container access
docker exec ollama rocminfo
Memory Issues
# Check VRAM usage
rocm-smi
# Monitor system memory
free -h
# Reduce model size or use quantization
Performance Issues
# Enable performance mode
echo performance | sudo tee /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor
# Check GPU clocks
rocm-smi -d 0 --showclocks
Debug Commands
# View Ollama logs
docker-compose logs -f ollama
# Check GPU utilization
watch -n 1 rocm-smi
# Test GPU compute
docker exec ollama rocminfo | grep "Compute Unit"
📁 Project Structure
rocm-automated/
├── build.sh # Automated build script
├── docker-compose.yaml # Service orchestration
├── Dockerfile.rocm-7.1 # Base ROCm image
├── Dockerfile.ollama-rocm-7.1 # Ollama with ROCm
├── Dockerfile.stable-diffusion.cpp-rocm7.1-gfx1151 # Stable Diffusion
├── ollama/ # Ollama data directory
└── stable-diffusion.cpp/ # SD model storage
🤝 Contributing
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
📝 License
This project is licensed under the MIT License - see the LICENSE file for details.
🙏 Acknowledgments
- ROCm Platform - AMD's open-source GPU compute platform
- Ollama - Local LLM inference engine
- Stable Diffusion CPP - Efficient SD implementation
- rjmalagon/ollama-linux-amd-apu - AMD APU optimizations
- ComfyUI - Advanced node-based interface for Stable Diffusion workflows
- phueper/ollama-linux-amd-apu - Enhanced Ollama build with ROCm 7 optimizations
📞 Support
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- ROCm Documentation: AMD ROCm Docs
🏷️ Version History
- v1.0.0: Initial release with ROCm 7.1 support
- v1.1.0: Added Ollama integration and multi-GPU support
- v1.2.0: Performance optimizations and Stable Diffusion support
⚠️ Known Hardware Limitations
External GPU Enclosures
- AOOSTAR AG02 eGPU: The ASM246X chipset is known to have compatibility issues with Linux and may downgrade the link to PCIe x1 at 8 GT/s (tested on Fedora 42). This can hurt performance with large models that require significant VRAM transfers.
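To check whether an eGPU link has negotiated the expected speed and width, inspect the LnkSta field (the bus address below is an example; take it from the lspci listing for your GPU):
# Find the GPU's PCI address, then check the negotiated link (LnkSta)
lspci | grep -iE "vga|display"
sudo lspci -vv -s 03:00.0 | grep -i LnkSta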
Mini PCs
- Minisforum MS-A1: Testing by Level1Techs showed Resizable BAR issues with eGPUs over USB4 connections, which may result in reduced performance or compatibility problems with ROCm workloads.
⭐ Star this repository if it helped you! ⭐
Made with ❤️ for the AMD GPU community