Proxmox ROCm LXC Toolkit

Toolkit for building an unprivileged Ubuntu 24.04 LXC on Proxmox and installing ROCm 7.2 with AMD's official Ubuntu package-manager method.

Hardware-specific tuning guide:

Troubleshooting guide:

What this includes

  • scripts/create_rocm_lxc.sh
    • Creates an unprivileged Ubuntu 24.04 container using community-scripts ct/ubuntu.sh.
  • scripts/configure_gpu_passthrough.sh
    • Adds /dev/kfd and /dev/dri passthrough entries and the matching cgroup device permissions to the container's LXC config.
  • scripts/install_rocm_in_ct.sh
    • Registers ROCm 7.2 noble apt repos and installs a chosen ROCm meta package.
  • scripts/create_pytorch_venv_in_ct.sh
    • Creates a Python 3.12 venv in the CT and installs AMD ROCm 7.2 PyTorch wheels.
  • scripts/manage_llm_backends_in_ct.sh
    • Installs/updates vLLM, Ollama, and llama.cpp and configures systemd services under a nologin service user.
  • scripts/test_llm_backends_in_ct.sh
    • Runs health checks for vLLM, Ollama, and llama.cpp services/endpoints.
  • scripts/expose_ollama_in_ct.sh
    • Sets Ollama to listen on a network address/port via systemd override.
  • scripts/close_ollama_network_in_ct.sh
    • Removes the Ollama network override and returns to service defaults.
  • scripts/set_ollama_memory_profile_in_ct.sh
    • Applies a safe Ollama memory profile (context/parallel/model limits) and restarts the service.

Requirements

  • Proxmox VE host with a working AMD GPU stack exposing:
    • /dev/kfd
    • /dev/dri
  • Template and container storage names available in Proxmox (for example local and local-lvm).
  • Run scripts on the Proxmox host as root.
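
A quick preflight check for the two device nodes listed above can save a failed run later. The following is a minimal sketch, not part of the toolkit: missing_gpu_nodes is a hypothetical helper name, and it takes the device directory as an argument only so it can be tried against a scratch directory.

```shell
# Hypothetical helper (not shipped with this toolkit): print every required
# GPU device node that is missing under the given directory (default: /dev).
missing_gpu_nodes() {
    local dev_dir="${1:-/dev}"
    local node
    for node in kfd dri; do
        # -e matches both the kfd character device and the dri directory
        [ -e "${dev_dir}/${node}" ] || printf '%s\n' "${dev_dir}/${node}"
    done
}
```

Running missing_gpu_nodes /dev on the Proxmox host prints nothing when both nodes exist. Any path it prints points to a host driver problem that should be fixed before creating the container.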

Quick start

  1. Create unprivileged container:
chmod +x scripts/*.sh

sudo bash ./scripts/create_rocm_lxc.sh \
	--ctid 120 \
	--hostname rocm-ct \
	--template-storage local \
	--container-storage local-lvm

This script uses:

  • https://raw.githubusercontent.com/community-scripts/ProxmoxVE/main/ct/ubuntu.sh
  • (internally by that script) https://raw.githubusercontent.com/community-scripts/ProxmoxVE/main/install/ubuntu-install.sh
  2. Configure GPU passthrough on host:
sudo bash ./scripts/configure_gpu_passthrough.sh --ctid 120
  3. Install ROCm in container:
sudo bash ./scripts/install_rocm_in_ct.sh --ctid 120 --package rocm
  4. Optional manual checks:
pct exec 120 -- bash -lc '/opt/rocm/bin/rocminfo | head -n 40'
pct exec 120 -- bash -lc '/opt/rocm/bin/rocm-smi || true'
  5. Create PyTorch ROCm venv in CT:
sudo bash ./scripts/create_pytorch_venv_in_ct.sh --ctid 120 --venv-path /opt/rocm-pytorch-venv
  6. Install LLM backends + services in CT:
sudo bash ./scripts/manage_llm_backends_in_ct.sh --ctid 120 --action install --backend all --venv-path /opt/rocm-pytorch-venv

By default, this creates a service user named llm-svc (login shell /usr/sbin/nologin) and runs the services as that user. You can override the user with --service-user <name>.

Update later:

sudo bash ./scripts/manage_llm_backends_in_ct.sh --ctid 120 --action update --backend all --venv-path /opt/rocm-pytorch-venv

Backend-specific examples:

sudo bash ./scripts/manage_llm_backends_in_ct.sh --ctid 120 --action install --backend vllm --venv-path /opt/rocm-pytorch-venv
sudo bash ./scripts/manage_llm_backends_in_ct.sh --ctid 120 --action install --backend ollama
sudo bash ./scripts/manage_llm_backends_in_ct.sh --ctid 120 --action install --backend llama-cpp
  7. Test backend services/endpoints:
sudo bash ./scripts/test_llm_backends_in_ct.sh --ctid 120 --backend all

Verbose diagnostics on failures:

sudo bash ./scripts/test_llm_backends_in_ct.sh --ctid 120 --backend all --verbose

Test one backend only:

sudo bash ./scripts/test_llm_backends_in_ct.sh --ctid 120 --backend vllm
sudo bash ./scripts/test_llm_backends_in_ct.sh --ctid 120 --backend ollama
sudo bash ./scripts/test_llm_backends_in_ct.sh --ctid 120 --backend llama-cpp
  8. Apply safe Ollama memory profile (recommended for large models):
sudo bash ./scripts/set_ollama_memory_profile_in_ct.sh --ctid 120

Use built-in presets:

sudo bash ./scripts/set_ollama_memory_profile_in_ct.sh --ctid 120 --preset safe
sudo bash ./scripts/set_ollama_memory_profile_in_ct.sh --ctid 120 --preset balanced
sudo bash ./scripts/set_ollama_memory_profile_in_ct.sh --ctid 120 --preset max

Preset selection quick guide:

Model size (typical)          Suggested preset   Notes
7B to 14B                     max                Highest throughput, more aggressive memory/concurrency.
20B to 32B                    balanced           Best first choice for stability on large models.
30B+ with load failures/OOM   safe               Uses lower context, KEEP_ALIVE=0, and higher GPU overhead reserve.

If a model fails to load, step down from max → balanced → safe before changing manual values.

Custom values example:

sudo bash ./scripts/set_ollama_memory_profile_in_ct.sh --ctid 120 --context-length 4096 --num-parallel 1 --max-loaded-models 1 --flash-attention false --keep-alive 0 --gpu-overhead-bytes 2147483648
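
To make the custom flags above more concrete, here is a sketch of the kind of systemd drop-in they could translate into. The mapping is an assumption about how the script works, not taken from its source. The environment variable names are Ollama's own server settings, while render_ollama_memory_dropin and gib_to_bytes are hypothetical helpers.

```shell
# Hypothetical helper: convert GiB to bytes for --gpu-overhead-bytes.
gib_to_bytes() {
    echo $(( $1 * 1024 * 1024 * 1024 ))
}

# Hypothetical helper: print a [Service] drop-in for the given limits.
# Argument order mirrors the flags in the custom values example:
#   context-length, num-parallel, max-loaded-models,
#   flash-attention, keep-alive, gpu-overhead-bytes
render_ollama_memory_dropin() {
    printf '[Service]\n'
    printf 'Environment=OLLAMA_CONTEXT_LENGTH=%s\n' "$1"
    printf 'Environment=OLLAMA_NUM_PARALLEL=%s\n' "$2"
    printf 'Environment=OLLAMA_MAX_LOADED_MODELS=%s\n' "$3"
    printf 'Environment=OLLAMA_FLASH_ATTENTION=%s\n' "$4"
    printf 'Environment=OLLAMA_KEEP_ALIVE=%s\n' "$5"
    printf 'Environment=OLLAMA_GPU_OVERHEAD=%s\n' "$6"
}
```

For instance, gib_to_bytes 2 yields 2147483648, the overhead value used in the example above. Printing the rendered drop-in first makes it easy to compare against the override the script actually wrote inside the CT.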

Expose Ollama to LAN

Use the helper script:

sudo bash ./scripts/expose_ollama_in_ct.sh --ctid 120

Optional custom bind/port:

sudo bash ./scripts/expose_ollama_in_ct.sh --ctid 120 --listen 0.0.0.0 --port 11434

Revert (remove network override):

sudo bash ./scripts/close_ollama_network_in_ct.sh --ctid 120

Ollama's default bind address is 127.0.0.1:11434 (localhost only). To expose it to your network manually, without the helper script, create a systemd override inside the CT:

pct exec 120 -- bash -lc 'install -d /etc/systemd/system/ollama.service.d'
pct exec 120 -- bash -lc 'cat >/etc/systemd/system/ollama.service.d/network.conf <<"EOF"
[Service]
Environment=OLLAMA_HOST=0.0.0.0:11434
EOF'
pct exec 120 -- bash -lc 'systemctl daemon-reload && systemctl restart ollama'

Verify bind address and test from another machine:

pct exec 120 -- bash -lc 'ss -ltnp | grep 11434 || true'
curl http://<ct-ip>:11434/api/tags

If unreachable, allow TCP/11434 in Proxmox firewall (Datacenter/Node/CT) and CT firewall (ufw) as needed. Ollama has no built-in auth by default, so only expose on trusted networks or behind an authenticated reverse proxy.
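
When the endpoint is unreachable, it helps to first tell a localhost-only bind apart from a firewall problem. The sketch below assumes the column layout of ss -ltn, with the local address in the fourth field. classify_listen_scope is a hypothetical helper, and it treats only 127.0.0.1 and [::1] as loopback.

```shell
# Hypothetical helper: read `ss -ltn` output on stdin and report how the
# given TCP port is bound: "loopback", "network", or "not-listening".
classify_listen_scope() {
    awk -v port="$1" '
        $1 == "LISTEN" {
            # Field 4 is Local Address:Port; the port follows the last colon.
            n = split($4, parts, ":")
            if (parts[n] != port) next
            addr = substr($4, 1, length($4) - length(parts[n]) - 1)
            if (addr == "127.0.0.1" || addr == "[::1]") {
                if (scope == "") scope = "loopback"
            } else {
                scope = "network"
            }
        }
        END { print (scope == "" ? "not-listening" : scope) }
    '
}
```

A possible invocation from the host is pct exec 120 -- ss -ltn | classify_listen_scope 11434. A result of loopback means the override is not active, while network with a failing curl points to a firewall rule.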

ROCm package options

install_rocm_in_ct.sh defaults to rocm, but you can pass alternatives, for example:

  • rocm-hip-runtime
  • rocm-opencl-runtime
  • rocm-ml-libraries

Example:

sudo bash ./scripts/install_rocm_in_ct.sh --ctid 120 --package rocm-hip-runtime

Notes for unprivileged LXC

  • Device passthrough to unprivileged containers can be sensitive to host kernel/driver updates.
  • If your workload runs as a non-root user inside the CT, ensure that user is in video/render groups:
pct exec 120 -- bash -lc 'usermod -aG video,render <your-user>'
  • A CT restart is often required after changing LXC device mappings.
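
For orientation, the device mappings mentioned above usually take the form of a few lines in /etc/pve/lxc/<ctid>.conf. The sketch below prints one common shape of those lines. It is an assumption about what configure_gpu_passthrough.sh writes, not a copy of its output, and both function names are hypothetical. The DRM major number is 226 on Linux, while the kfd major is allocated dynamically and must be read from the host.

```shell
# Hypothetical helper: convert the hex major printed by `stat -c %t` to decimal.
hex_to_dec() {
    printf '%d\n' "0x$1"
}

# Hypothetical helper: print LXC config lines for /dev/dri and /dev/kfd.
# $1 = DRM major (226 on Linux), $2 = kfd major (read it from the host).
render_gpu_passthrough_lines() {
    printf 'lxc.cgroup2.devices.allow: c %s:* rwm\n' "$1"
    printf 'lxc.cgroup2.devices.allow: c %s:* rwm\n' "$2"
    printf 'lxc.mount.entry: /dev/dri dev/dri none bind,optional,create=dir\n'
    printf 'lxc.mount.entry: /dev/kfd dev/kfd none bind,optional,create=file\n'
}
```

On the host, the kfd major can be obtained with hex_to_dec "$(stat -c %t /dev/kfd)". Comparing the printed lines with the container's config file is a quick way to spot a stale major number after a kernel or driver update.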

Community scripts (direct usage)

If you want to run the upstream script directly (interactive), use:

bash -c "$(curl -fsSL https://raw.githubusercontent.com/community-scripts/ProxmoxVE/main/ct/ubuntu.sh)"

Repository:

Alignment with AMD docs

The ROCm install flow follows AMD's Ubuntu package-manager guidance for ROCm 7.2 and Ubuntu 24.04 (noble):

  • GPG key to /etc/apt/keyrings/rocm.gpg
  • rocm/apt/7.2 + graphics/7.2/ubuntu apt repos
  • apt preference pin (Pin-Priority: 600)
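
The three items above can be sketched as the file contents they would produce. This is an illustration under assumptions, not the script's actual code: the repo.radeon.com host and the exact deb line layout are recalled from AMD's install guide and should be checked against it, and both function names are hypothetical.

```shell
# Hypothetical helper: print the two apt source lines for a ROCm release.
# $1 = ROCm version (e.g. 7.2), $2 = Ubuntu codename (e.g. noble)
render_rocm_apt_sources() {
    keyring=/etc/apt/keyrings/rocm.gpg
    printf 'deb [arch=amd64 signed-by=%s] https://repo.radeon.com/rocm/apt/%s %s main\n' \
        "$keyring" "$1" "$2"
    printf 'deb [arch=amd64 signed-by=%s] https://repo.radeon.com/graphics/%s/ubuntu %s main\n' \
        "$keyring" "$1" "$2"
}

# Hypothetical helper: print the apt preference that pins the AMD repo at 600.
render_rocm_apt_pin() {
    printf 'Package: *\nPin: release o=repo.radeon.com\nPin-Priority: 600\n'
}
```

Calling render_rocm_apt_sources 7.2 noble shows what to expect under /etc/apt/sources.list.d/ inside the CT, which is useful when checking why apt resolves a package from Ubuntu's archive instead of AMD's repo.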

Reference:
