
10.6. Docker Deployment

Deploy Backend.AI GO Server in containerized environments for cloud servers, CI/CD pipelines, and Kubernetes clusters.

The aigo-server binary is the headless server component of Backend.AI GO. It runs without a desktop GUI and exposes a Management API plus an OpenAI-compatible inference API.

Quick Start

Run the server with a single command:

docker run -d \
  --name aigo-server \
  -p 8001:8001 \
  -p 8000:8000 \
  -v aigo-models:/data/models \
  ghcr.io/lablup/backend-ai-go-server:latest

Then open http://localhost:8001 in your browser to access the web UI.
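You can also probe both APIs from the command line. The health path is documented in the Health Monitoring section below; the chat-completions route is an assumption based on OpenAI compatibility, and the model name is a placeholder:

```shell
# Management API (port 8001): health check
curl http://localhost:8001/api/v1/health

# OpenAI-compatible inference API (port 8000); "your-model" is a placeholder
curl http://localhost:8000/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "your-model", "messages": [{"role": "user", "content": "Hello"}]}'
```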

Prerequisites

  • Docker Engine 24.0 or later
  • At least 8 GB of RAM (16+ GB recommended for larger models)
  • For GPU acceleration: see the GPU Setup section

Initial Setup

1. Generate an Admin API Key

Before starting the server for the first time, generate a master key:

docker run --rm ghcr.io/lablup/backend-ai-go-server:latest --generate-admin-key

Save the generated key — it is displayed only once.
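If the command prints only the key to stdout (an assumption worth verifying against your version), you can capture it into your shell environment in one step:

```shell
# Capture the generated key directly; assumes the key is the only stdout output
AIGO_MASTER_KEY=$(docker run --rm ghcr.io/lablup/backend-ai-go-server:latest --generate-admin-key)
export AIGO_MASTER_KEY
```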

2. Start with the Master Key

Pass the key as an environment variable:

docker run -d \
  --name aigo-server \
  -p 8001:8001 \
  -p 8000:8000 \
  -e AIGO_MASTER_KEY=sk-admin-your-key-here \
  -v aigo-models:/data/models \
  -v aigo-data:/data \
  ghcr.io/lablup/backend-ai-go-server:latest

Docker Compose

The project includes a docker-compose.yml for a complete setup:

# Clone the repository (or download docker-compose.yml)
git clone https://github.com/lablup/backend.ai-go.git
cd backend.ai-go

# Generate admin key and save it
docker compose run --rm aigo-server --generate-admin-key

# Set the master key in your environment
export AIGO_MASTER_KEY=sk-admin-your-generated-key

# Start the server
docker compose up -d

# View logs
docker compose logs -f

# Stop
docker compose down

Access the web UI at http://localhost:8001.

Configuration

Environment Variables

All configuration options can be set via environment variables:

Variable           Default         Description
AIGO_HOST          0.0.0.0         Bind address
AIGO_PORT          8001            Management API port
AIGO_ROUTER_PORT   8000            Inference router port
AIGO_MODELS_DIR    /data/models    Models storage directory
AIGO_ENGINES_DIR   /data/engines   Inference engine directory
AIGO_STATIC_DIR    /var/www/aigo   Web UI static files directory
AIGO_MASTER_KEY    (none)          Master API key for initial admin setup
AIGO_LOG_LEVEL     info            Log level: trace, debug, info, warn, error
AIGO_CONFIG        (auto)          Path to TOML configuration file
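When setting several variables at once, an env file keeps the run command short. The values below simply mirror the table above:

```shell
# Write an env file: one VAR=value per line, no quoting needed
cat > aigo.env <<'EOF'
AIGO_PORT=8001
AIGO_ROUTER_PORT=8000
AIGO_LOG_LEVEL=debug
AIGO_MASTER_KEY=sk-admin-your-key-here
EOF

# Pass the whole file with --env-file
docker run -d --name aigo-server \
  -p 8001:8001 -p 8000:8000 \
  --env-file aigo.env \
  -v aigo-models:/data/models \
  ghcr.io/lablup/backend-ai-go-server:latest
```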

Configuration File

For more advanced configuration, mount a TOML file:

# Generate example configuration
docker run --rm ghcr.io/lablup/backend-ai-go-server:latest --generate-config > aigo-server.toml

# Edit the configuration
nano aigo-server.toml

# Start with the config file
docker run -d \
  --name aigo-server \
  -p 8001:8001 \
  -p 8000:8000 \
  -v $(pwd)/aigo-server.toml:/data/config/config.toml:ro \
  -v aigo-models:/data/models \
  -v aigo-data:/data \
  ghcr.io/lablup/backend-ai-go-server:latest \
  --config /data/config/config.toml --external
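As a rough sketch, the file likely mirrors the environment variables above. The authoritative key names come from the --generate-config output, so treat these as hypothetical placeholders:

```toml
# Hypothetical layout; generate the real template with --generate-config
host = "0.0.0.0"
port = 8001
router_port = 8000
models_dir = "/data/models"
log_level = "info"
```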

Validate Configuration

Test your configuration before starting:

docker run --rm \
  -v $(pwd)/aigo-server.toml:/data/config/config.toml:ro \
  ghcr.io/lablup/backend-ai-go-server:latest \
  --validate-config --config /data/config/config.toml

Persistent Storage

Model files are large (4–80+ GB each). Always use named volumes or bind mounts to persist them across container restarts.

Named Volumes (recommended)

docker volume create aigo-models
docker volume create aigo-data

docker run -d \
  -v aigo-models:/data/models \
  -v aigo-data:/data \
  ghcr.io/lablup/backend-ai-go-server:latest

Bind Mounts (for direct filesystem access)

docker run -d \
  -v /mnt/storage/models:/data/models \
  -v /mnt/storage/data:/data \
  ghcr.io/lablup/backend-ai-go-server:latest

GPU Setup

NVIDIA GPU

Install the NVIDIA Container Toolkit, then use the GPU-enabled compose file:

docker compose -f docker-compose.yml -f docker-compose.gpu.yml up -d

Or with plain Docker:

docker run -d \
  --gpus all \
  --name aigo-server \
  -p 8001:8001 \
  -p 8000:8000 \
  -v aigo-models:/data/models \
  ghcr.io/lablup/backend-ai-go-server:latest

Verify GPU access:

docker exec aigo-server nvidia-smi

AMD GPU (ROCm)

For AMD GPU support, expose the ROCm device files:

docker run -d \
  --device /dev/kfd:/dev/kfd \
  --device /dev/dri:/dev/dri \
  --group-add video \
  --group-add render \
  --name aigo-server \
  -p 8001:8001 \
  -p 8000:8000 \
  -v aigo-models:/data/models \
  ghcr.io/lablup/backend-ai-go-server:latest

Apple Silicon

Apple Silicon (Metal) acceleration is available only in the native macOS desktop application. The Docker image targets Linux (amd64) and does not support Metal.

TLS / HTTPS

The server does not terminate TLS directly. Use a reverse proxy in front of it.

nginx

server {
    listen 443 ssl http2;
    server_name aigo.example.com;

    ssl_certificate     /etc/ssl/certs/aigo.crt;
    ssl_certificate_key /etc/ssl/private/aigo.key;

    location / {
        proxy_pass         http://localhost:8001;
        proxy_set_header   Host $host;
        proxy_set_header   X-Real-IP $remote_addr;
        proxy_set_header   X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header   X-Forwarded-Proto $scheme;
        proxy_read_timeout 300s;
        proxy_send_timeout 300s;
    }
}

Caddy (automatic HTTPS)

aigo.example.com {
    reverse_proxy localhost:8001
}

Traefik (with Docker labels)

Add labels to the aigo-server service in docker-compose.yml:

labels:
  - "traefik.enable=true"
  - "traefik.http.routers.aigo.rule=Host(`aigo.example.com`)"
  - "traefik.http.routers.aigo.entrypoints=websecure"
  - "traefik.http.routers.aigo.tls.certresolver=myresolver"
  - "traefik.http.services.aigo.loadbalancer.server.port=8001"

Health Monitoring

Health Check Endpoint

# Check server health
curl http://localhost:8001/api/v1/health

# Expected response:
# {"status":"ok","version":"1.2.0"}

Docker Health Status

docker inspect aigo-server --format='{{.State.Health.Status}}'
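If the image does not define its own HEALTHCHECK, you can add one in docker-compose.yml against the health endpoint above. This sketch assumes curl is available inside the image; swap in wget if it is not:

```yaml
services:
  aigo-server:
    # ... existing service definition ...
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8001/api/v1/health"]
      interval: 30s
      timeout: 5s
      retries: 3
      start_period: 30s
```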

Updating

Pull New Image

docker compose pull
docker compose up -d

Backup Before Update

# Backup data volume
docker run --rm \
  -v aigo-data:/data \
  -v $(pwd)/backups:/backup \
  alpine tar czf /backup/aigo-data-$(date +%Y%m%d).tar.gz /data

# Backup models list (the models themselves are large; back up metadata)
docker exec aigo-server ls /data/models > models-list-$(date +%Y%m%d).txt
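To restore, reverse the backup with the same kind of throwaway container. The date-stamped filename here is an example; substitute your actual archive name:

```shell
# Restore the aigo-data volume from a backup archive
docker run --rm \
  -v aigo-data:/data \
  -v $(pwd)/backups:/backup \
  alpine sh -c "cd / && tar xzf /backup/aigo-data-20240101.tar.gz"
```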

Kubernetes

For production Kubernetes deployments, use the manifests in the k8s/ directory:

# Create namespace and deploy
kubectl apply -f k8s/persistent-volumes.yaml
kubectl apply -f k8s/secrets.yaml     # Edit with your actual key first
kubectl apply -f k8s/deployment.yaml

# Check status
kubectl get pods -n aigo
kubectl logs -n aigo deploy/aigo-server

# Access via port-forward (for testing)
kubectl port-forward -n aigo svc/aigo-server 8001:8001

For HTTPS with an Ingress controller:

kubectl apply -f k8s/ingress.yaml

For NVIDIA GPU support in Kubernetes:

kubectl apply -f k8s/nvidia-gpu-deployment.yaml

Troubleshooting

Container exits immediately

# Check the container logs
docker logs aigo-server

# Common causes:
# - Port already in use: change -p 8001:8001 to -p 8002:8001
# - Permission denied on volume: check volume ownership
# - Invalid configuration: run with --validate-config first

Models directory permission error

# Fix volume ownership
docker run --rm -u root \
  -v aigo-models:/data/models \
  debian:bookworm-slim \
  chown -R 1000:1000 /data/models

Cannot connect from outside the container

Ensure the server inside the container binds to 0.0.0.0 (not 127.0.0.1):

docker run -d \
  -e AIGO_HOST=0.0.0.0 \
  -p 8001:8001 \
  ghcr.io/lablup/backend-ai-go-server:latest --external

Check your firewall allows traffic on port 8001.

GPU not detected

# NVIDIA: verify toolkit installation
nvidia-container-cli --version
docker run --rm --gpus all nvidia/cuda:12.3.2-base-ubuntu22.04 nvidia-smi

# AMD: verify device access
ls -la /dev/kfd /dev/dri