Proxmox ROCm LXC Toolkit
Toolkit for building an unprivileged Ubuntu 24.04 LXC on Proxmox and installing ROCm 7.2 with AMD's official Ubuntu package-manager method.
Hardware-specific tuning guide:
Troubleshooting guide:
What this includes
- scripts/create_rocm_lxc.sh: Creates an unprivileged Ubuntu 24.04 container using community-scripts ct/ubuntu.sh.
- scripts/configure_gpu_passthrough.sh: Adds /dev/kfd and /dev/dri passthrough and cgroup permissions to the LXC config.
- scripts/install_rocm_in_ct.sh: Registers the ROCm 7.2 noble apt repos and installs a chosen ROCm meta package.
- scripts/create_pytorch_venv_in_ct.sh: Creates a Python 3.12 venv in the CT and installs AMD ROCm 7.2 PyTorch wheels.
- scripts/manage_llm_backends_in_ct.sh: Installs/updates vLLM, Ollama, and llama.cpp and configures systemd services under a nologin service user.
- scripts/test_llm_backends_in_ct.sh: Runs health checks for vLLM, Ollama, and llama.cpp services/endpoints.
- scripts/expose_ollama_in_ct.sh: Sets Ollama to listen on a network address/port via a systemd override.
- scripts/close_ollama_network_in_ct.sh: Removes the Ollama network override and returns to service defaults.
- scripts/set_ollama_memory_profile_in_ct.sh: Applies a safe Ollama memory profile (context/parallel/model limits) and restarts the service.
Requirements
- Proxmox VE host with a working AMD GPU stack exposing:
  - /dev/kfd
  - /dev/dri
- Template and container storage names available in Proxmox (for example local and local-lvm).
- Run scripts on the Proxmox host as root.
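A quick preflight check for the device nodes listed above could look like the sketch below. The helper name check_device_nodes is hypothetical and not part of this toolkit.

```shell
# Preflight sketch: confirm the GPU device nodes exist on the Proxmox host.
# check_device_nodes is a hypothetical helper, not part of this toolkit.
check_device_nodes() {
  local missing=0 node
  for node in "$@"; do
    if [ -e "$node" ]; then
      echo "ok: $node"
    else
      echo "missing: $node" >&2
      missing=1
    fi
  done
  # Non-zero exit status when at least one node is absent.
  return "$missing"
}

# Example (run on the host as root):
#   check_device_nodes /dev/kfd /dev/dri || echo "AMD GPU stack not ready"
```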
Quick start
- Create unprivileged container:
chmod +x scripts/*.sh
sudo bash ./scripts/create_rocm_lxc.sh \
--ctid 120 \
--hostname rocm-ct \
--template-storage local \
--container-storage local-lvm
This script uses:
- https://raw.githubusercontent.com/community-scripts/ProxmoxVE/main/ct/ubuntu.sh
- (internally by that script) https://raw.githubusercontent.com/community-scripts/ProxmoxVE/main/install/ubuntu-install.sh
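After creation, it can be worth confirming that the container really is unprivileged. A minimal sketch, assuming `pct config` prints an `unprivileged: 1` line for such containers; the helper name is_unprivileged_config is hypothetical:

```shell
# Sketch: detect the "unprivileged: 1" line in `pct config` output.
# is_unprivileged_config is a hypothetical helper; it reads config text on stdin.
is_unprivileged_config() {
  grep -Eq '^unprivileged:[[:space:]]*1[[:space:]]*$'
}

# Example (on the Proxmox host):
#   pct config 120 | is_unprivileged_config && echo "CT 120 is unprivileged"
```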
- Configure GPU passthrough on host:
sudo bash ./scripts/configure_gpu_passthrough.sh --ctid 120
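The exact config lines the script writes are not shown here. As background, LXC cgroup device rules (for example `lxc.cgroup2.devices.allow: c <major>:<minor> rwm`) refer to device major/minor numbers, and these can be read from the host's device nodes. The sketch below assumes GNU stat, as shipped with Proxmox; the helper name dev_major_minor is hypothetical.

```shell
# Sketch: print "major:minor" in decimal for a character device node.
# dev_major_minor is a hypothetical helper, not part of this toolkit.
# GNU stat reports %t (major) and %T (minor) in hexadecimal.
dev_major_minor() {
  local node="$1" hex_major hex_minor
  [ -c "$node" ] || { echo "not a character device: $node" >&2; return 1; }
  hex_major="$(stat -c '%t' "$node")"
  hex_minor="$(stat -c '%T' "$node")"
  # printf converts the 0x-prefixed hex values to decimal.
  printf '%d:%d\n' "0x${hex_major}" "0x${hex_minor}"
}

# Example (on the Proxmox host):
#   dev_major_minor /dev/kfd
#   for n in /dev/dri/*; do [ -c "$n" ] && echo "$n $(dev_major_minor "$n")"; done
```

Comparing these numbers with the entries in the container's LXC config is one way to spot a mismatch after a host kernel or driver update.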
- Install ROCm in container:
sudo bash ./scripts/install_rocm_in_ct.sh --ctid 120 --package rocm
- Optional manual checks:
pct exec 120 -- bash -lc '/opt/rocm/bin/rocminfo | head -n 40'
pct exec 120 -- bash -lc '/opt/rocm/bin/rocm-smi || true'
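To reduce the rocminfo output to the part that usually matters, the GPU architecture names can be extracted. This sketch assumes rocminfo reports GPU agents with gfx-prefixed names; list_gfx_targets is a hypothetical helper.

```shell
# Sketch: list the unique gfx architecture names found in text on stdin.
# list_gfx_targets is a hypothetical helper, not part of this toolkit.
list_gfx_targets() {
  grep -oE 'gfx[0-9a-f]+' | sort -u
}

# Example (on the Proxmox host):
#   pct exec 120 -- /opt/rocm/bin/rocminfo | list_gfx_targets
# No output means no GPU agent was visible inside the CT.
```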
- Create PyTorch ROCm venv in CT:
sudo bash ./scripts/create_pytorch_venv_in_ct.sh --ctid 120 --venv-path /opt/rocm-pytorch-venv
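A quick way to confirm that the venv's PyTorch build sees the GPU is sketched below. It assumes the standard venv layout (`<venv>/bin/python`) and relies on ROCm builds of PyTorch exposing the GPU through the `torch.cuda` API and reporting a HIP version in `torch.version.hip`. The helper name check_torch_gpu is hypothetical.

```shell
# Sketch: print the torch version, HIP version, and GPU availability from the CT venv.
# check_torch_gpu is a hypothetical helper, not part of this toolkit.
check_torch_gpu() {
  local ctid="$1" venv="$2"
  pct exec "$ctid" -- "$venv/bin/python" -c \
    'import torch; print(torch.__version__, torch.version.hip, torch.cuda.is_available())'
}

# Example (on the Proxmox host):
#   check_torch_gpu 120 /opt/rocm-pytorch-venv
# A final "True" indicates that PyTorch can reach the GPU.
```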
- Install LLM backends + services in CT:
sudo bash ./scripts/manage_llm_backends_in_ct.sh --ctid 120 --action install --backend all --venv-path /opt/rocm-pytorch-venv
By default, this creates the service user llm-svc (login shell /usr/sbin/nologin) and runs the services as that user.
You can override it with --service-user <name>.
Update later:
sudo bash ./scripts/manage_llm_backends_in_ct.sh --ctid 120 --action update --backend all --venv-path /opt/rocm-pytorch-venv
Backend-specific examples:
sudo bash ./scripts/manage_llm_backends_in_ct.sh --ctid 120 --action install --backend vllm --venv-path /opt/rocm-pytorch-venv
sudo bash ./scripts/manage_llm_backends_in_ct.sh --ctid 120 --action install --backend ollama
sudo bash ./scripts/manage_llm_backends_in_ct.sh --ctid 120 --action install --backend llama-cpp
- Test backend services/endpoints:
sudo bash ./scripts/test_llm_backends_in_ct.sh --ctid 120 --backend all
Verbose diagnostics on failures:
sudo bash ./scripts/test_llm_backends_in_ct.sh --ctid 120 --backend all --verbose
Test one backend only:
sudo bash ./scripts/test_llm_backends_in_ct.sh --ctid 120 --backend vllm
sudo bash ./scripts/test_llm_backends_in_ct.sh --ctid 120 --backend ollama
sudo bash ./scripts/test_llm_backends_in_ct.sh --ctid 120 --backend llama-cpp
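Services can take a while to come up after an install or restart, so a health check that runs too early may fail spuriously. The checks performed by the test script are not shown here; the sketch below is a generic retry wrapper (hypothetical helper name retry) that can be put around any probe command.

```shell
# Sketch: run a command up to N times, sleeping DELAY seconds between attempts.
# retry is a hypothetical helper, not part of this toolkit.
# Usage: retry <attempts> <delay-seconds> <command> [args...]
retry() {
  local attempts="$1" delay="$2" i=1
  shift 2
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    # Sleep only when another attempt follows.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}

# Example: wait for the Ollama API to answer (replace <ct-ip> with the CT address):
#   retry 10 2 curl -fsS http://<ct-ip>:11434/api/tags
```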
- Apply safe Ollama memory profile (recommended for large models):
sudo bash ./scripts/set_ollama_memory_profile_in_ct.sh --ctid 120
Use built-in presets:
sudo bash ./scripts/set_ollama_memory_profile_in_ct.sh --ctid 120 --preset safe
sudo bash ./scripts/set_ollama_memory_profile_in_ct.sh --ctid 120 --preset balanced
sudo bash ./scripts/set_ollama_memory_profile_in_ct.sh --ctid 120 --preset max
Preset selection quick guide:
| Model size (typical) | Suggested preset | Notes |
|---|---|---|
| 7B to 14B | max | Highest throughput, more aggressive memory/concurrency. |
| 20B to 32B | balanced | Best first choice for stability on large models. |
| 30B+ with load failures/OOM | safe | Uses lower context, KEEP_ALIVE=0, and higher GPU overhead reserve. |
If a model fails to load, step down from max → balanced → safe before changing manual values.
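The step-down order can be captured in a small helper. This is only a sketch: next_preset is a hypothetical name, and how a load failure is detected is left to the caller.

```shell
# Sketch: return the next more conservative preset, or fail when none is left.
# next_preset is a hypothetical helper, not part of this toolkit.
next_preset() {
  case "$1" in
    max) echo balanced ;;
    balanced) echo safe ;;
    *) return 1 ;;  # "safe" (or an unknown name) has no further fallback
  esac
}

# Example: after a failed load with the current preset, move one step down.
#   preset=max
#   preset="$(next_preset "$preset")" &&
#     sudo bash ./scripts/set_ollama_memory_profile_in_ct.sh --ctid 120 --preset "$preset"
```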
Custom values example:
sudo bash ./scripts/set_ollama_memory_profile_in_ct.sh --ctid 120 --context-length 4096 --num-parallel 1 --max-loaded-models 1 --flash-attention false --keep-alive 0 --gpu-overhead-bytes 2147483648
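The --gpu-overhead-bytes value in the example above is 2 GiB written out in bytes (2 × 1024³ = 2147483648). A small conversion helper avoids typing such numbers by hand; gib_to_bytes is a hypothetical name.

```shell
# Sketch: convert whole GiB to bytes using shell arithmetic.
# gib_to_bytes is a hypothetical helper, not part of this toolkit.
gib_to_bytes() {
  echo $(( $1 * 1024 * 1024 * 1024 ))
}

# Example:
#   sudo bash ./scripts/set_ollama_memory_profile_in_ct.sh --ctid 120 \
#     --gpu-overhead-bytes "$(gib_to_bytes 2)"
```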
Expose Ollama to LAN
Use the helper script:
sudo bash ./scripts/expose_ollama_in_ct.sh --ctid 120
Optional custom bind/port:
sudo bash ./scripts/expose_ollama_in_ct.sh --ctid 120 --listen 0.0.0.0 --port 11434
Revert (remove network override):
sudo bash ./scripts/close_ollama_network_in_ct.sh --ctid 120
By default, Ollama listens on localhost only. To expose it manually, without the helper script, add a systemd override inside the CT:
pct exec 120 -- bash -lc 'install -d /etc/systemd/system/ollama.service.d'
pct exec 120 -- bash -lc 'cat >/etc/systemd/system/ollama.service.d/network.conf <<"EOF"
[Service]
Environment=OLLAMA_HOST=0.0.0.0:11434
EOF'
pct exec 120 -- bash -lc 'systemctl daemon-reload && systemctl restart ollama'
Verify bind address and test from another machine:
pct exec 120 -- bash -lc 'ss -ltnp | grep 11434 || true'
curl http://<ct-ip>:11434/api/tags
If unreachable, allow TCP/11434 in Proxmox firewall (Datacenter/Node/CT) and CT firewall (ufw) as needed.
Ollama has no built-in auth by default, so only expose on trusted networks or behind an authenticated reverse proxy.
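One way to narrow the exposure is to bind Ollama to a single CT address instead of 0.0.0.0. The sketch below renders the same kind of systemd drop-in as the manual steps above for a given host and port. The helper name render_ollama_override and the address 192.168.1.50 are hypothetical.

```shell
# Sketch: print a systemd drop-in that sets OLLAMA_HOST to <host>:<port>.
# render_ollama_override is a hypothetical helper, not part of this toolkit.
render_ollama_override() {
  local host="$1" port="${2:-11434}"
  # Reject empty or non-numeric ports.
  case "$port" in
    ''|*[!0-9]*) echo "invalid port: $port" >&2; return 1 ;;
  esac
  if [ "$port" -lt 1 ] || [ "$port" -gt 65535 ]; then
    echo "port out of range: $port" >&2
    return 1
  fi
  printf '[Service]\nEnvironment=OLLAMA_HOST=%s:%s\n' "$host" "$port"
}

# Example (on the Proxmox host; 192.168.1.50 stands in for the CT's address):
#   render_ollama_override 192.168.1.50 > /tmp/network.conf
#   pct exec 120 -- install -d /etc/systemd/system/ollama.service.d
#   pct push 120 /tmp/network.conf /etc/systemd/system/ollama.service.d/network.conf
#   pct exec 120 -- bash -lc 'systemctl daemon-reload && systemctl restart ollama'
```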
ROCm package options
install_rocm_in_ct.sh defaults to rocm, but you can pass alternatives, for example:
- rocm-hip-runtime
- rocm-opencl-runtime
- rocm-ml-libraries
Example:
sudo bash ./scripts/install_rocm_in_ct.sh --ctid 120 --package rocm-hip-runtime
Notes for unprivileged LXC
- Device passthrough to unprivileged containers can be sensitive to host kernel/driver updates.
- If your workload runs as a non-root user inside the CT, ensure that user is in the video and render groups:
pct exec 120 -- bash -lc 'usermod -aG video,render <your-user>'
- A CT restart is often required after changing LXC device mappings.
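Group membership can be verified from the host before starting a workload. This sketch checks a space-separated group list, such as the output of `id -nG`, for both groups; has_gpu_groups is a hypothetical helper.

```shell
# Sketch: succeed only if the given group list contains both video and render.
# has_gpu_groups is a hypothetical helper, not part of this toolkit.
has_gpu_groups() {
  local groups=" $1 " g
  for g in video render; do
    case "$groups" in
      *" $g "*) ;;
      *) echo "missing group: $g" >&2; return 1 ;;
    esac
  done
  return 0
}

# Example (on the Proxmox host):
#   has_gpu_groups "$(pct exec 120 -- id -nG <your-user>)" && echo "groups ok"
# Group changes take effect for new login sessions and restarted services.
```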
Community scripts (direct usage)
If you want to run the upstream script directly (interactive), use:
bash -c "$(curl -fsSL https://raw.githubusercontent.com/community-scripts/ProxmoxVE/main/ct/ubuntu.sh)"
Repository:
Alignment with AMD docs
ROCm install flow follows AMD’s Ubuntu package-manager guidance for ROCm 7.2 and Ubuntu 24.04 (noble):
- GPG key to /etc/apt/keyrings/rocm.gpg
- rocm/apt/7.2 + graphics/7.2/ubuntu apt repos
- apt preference pin (Pin-Priority: 600)
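For orientation, an apt preference pin with priority 600 typically looks like the output of the sketch below. The origin `repo.radeon.com` and the file name in the usage comment are assumptions based on the usual form of AMD's instructions; confirm both against the referenced AMD guide. The helper name render_rocm_pin is hypothetical.

```shell
# Sketch: print an apt preferences stanza that pins an origin at priority 600.
# render_rocm_pin is a hypothetical helper; the default origin is an assumption.
render_rocm_pin() {
  local origin="${1:-repo.radeon.com}"
  printf 'Package: *\nPin: release o=%s\nPin-Priority: 600\n' "$origin"
}

# Example (inside the CT; the file name is an assumption):
#   render_rocm_pin > /etc/apt/preferences.d/rocm-pin-600
```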
References:
- https://rocm.docs.amd.com/projects/install-on-linux/en/latest/install/install-methods/package-manager/package-manager-ubuntu.html
- https://rocm.docs.amd.com/projects/radeon-ryzen/en/latest/docs/install/installryz/native_linux/install-pytorch.html
- https://docs.vllm.ai/en/latest/getting_started/installation/gpu/
- https://docs.ollama.com/linux
- https://github.com/ggml-org/llama.cpp