12.6. Docker Deployment

Deploy Backend.AI GO Server in containerized environments for cloud servers, CI/CD pipelines, and Kubernetes clusters.

The aigo-server binary is the headless server component of Backend.AI GO. It runs without a desktop GUI and exposes a Management API plus an OpenAI-compatible inference API.

Quick Start

Run the server with a single command:

docker run -d \
  --name aigo-server \
  -p 8001:8001 \
  -p 8000:8000 \
  -v aigo-models:/data/models \
  ghcr.io/lablup/backend-ai-go-server:latest

Then open http://localhost:8001 in your browser to access the web UI.

Prerequisites

  • Docker Engine 24.0 or later
  • At least 8 GB of RAM (16+ GB recommended for larger models)
  • For GPU acceleration: see the GPU Setup section

Initial Setup

1. Start the Server

docker run -d \
  --name aigo-server \
  -p 8001:8001 \
  -p 8000:8000 \
  -v aigo-models:/data/models \
  -v aigo-data:/data \
  ghcr.io/lablup/backend-ai-go-server:latest

2. Create the Admin Account

Open http://localhost:8001 in your browser. The server detects that no admin account exists and shows the Initial Setup screen. Enter a username and a strong password (at least 12 characters, with a digit or symbol). The server creates the admin account and logs you in immediately.

Once the admin account exists, the server sends every later browser visit to the Sign in screen. SDK and curl clients authenticate with either an X-API-Key header or an Authorization: Bearer header, using access keys you create under Settings > API Keys after signing in.
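
As a rough sketch of what such a client call might look like, the helpers below build either header and pass it to curl. The function names and the AIGO_API_KEY variable are hypothetical, and the /v1/models path on the router port is an assumption based on the usual OpenAI API layout, not something this page confirms.

```shell
# Print the auth header for the chosen scheme ("api-key" or "bearer").
aigo_auth_header() {
  scheme=$1
  key=$2
  case "$scheme" in
    api-key) printf 'X-API-Key: %s\n' "$key" ;;
    bearer)  printf 'Authorization: Bearer %s\n' "$key" ;;
    *)       echo "unknown scheme: $scheme" >&2; return 1 ;;
  esac
}

# GET a URL with the chosen auth header; -f makes curl fail on HTTP errors.
aigo_get() {
  header=$(aigo_auth_header "$1" "$2") || return 1
  curl -fsS -H "$header" "$3"
}

# Example (hypothetical key variable; /v1/models is an assumed path):
#   aigo_get bearer "$AIGO_API_KEY" http://localhost:8000/v1/models
```

Keeping the header construction in one place makes it easy to switch between the two schemes without touching each request.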

Docker Compose

The project includes a docker-compose.yml for a complete setup:

# Clone the repository (or download docker-compose.yml)
git clone https://github.com/lablup/backend.ai-go.git
cd backend.ai-go

# Start the server
docker compose up -d

# View logs
docker compose logs -f

# Stop
docker compose down

Access the web UI at http://localhost:8001. On first visit, the server shows the Initial Setup screen where you create the admin account with a username and password.

Configuration

Environment Variables

All configuration options can be set via environment variables:

Variable           Default         Description
AIGO_HOST          0.0.0.0         Bind address
AIGO_PORT          8001            Management API port
AIGO_ROUTER_PORT   8000            Inference router port
AIGO_MODELS_DIR    /data/models    Models storage directory
AIGO_ENGINES_DIR   /data/engines   Inference engine directory
AIGO_STATIC_DIR    /var/www/aigo   Web UI static files directory
AIGO_LOG_LEVEL     info            Log level: trace, debug, info, warn, error
AIGO_CONFIG        (auto)          Path to TOML configuration file
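
Because -p maps a host port to the port the server listens on inside the container, changing AIGO_PORT or AIGO_ROUTER_PORT means the container side of each mapping has to change with it. A minimal sketch, assuming the server listens on exactly the ports these variables name; aigo_port_flags is a hypothetical helper, not part of the project:

```shell
# Print matching -e and -p flags for custom management and router ports.
aigo_port_flags() {
  mgmt=$1
  router=$2
  printf '%s\n' "-e AIGO_PORT=$mgmt -e AIGO_ROUTER_PORT=$router -p $mgmt:$mgmt -p $router:$router"
}

# Example (the unquoted expansion is intentional so the flags split into words):
#   docker run -d --name aigo-server $(aigo_port_flags 9001 9000) \
#     -v aigo-models:/data/models \
#     ghcr.io/lablup/backend-ai-go-server:latest
```

If you only want a different host port, leave the variables alone and change the left side of -p instead (for example -p 9001:8001).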

Configuration File

For more advanced configuration, mount a TOML file:

# Generate example configuration
docker run --rm ghcr.io/lablup/backend-ai-go-server:latest --generate-config > aigo-server.toml

# Edit the configuration
nano aigo-server.toml

# Start with the config file
docker run -d \
  --name aigo-server \
  -p 8001:8001 \
  -p 8000:8000 \
  -v "$(pwd)/aigo-server.toml":/data/config/config.toml:ro \
  -v aigo-models:/data/models \
  -v aigo-data:/data \
  ghcr.io/lablup/backend-ai-go-server:latest \
  --config /data/config/config.toml --external

Validate Configuration

Test your configuration before starting:

docker run --rm \
  -v "$(pwd)/aigo-server.toml":/data/config/config.toml:ro \
  ghcr.io/lablup/backend-ai-go-server:latest \
  --validate-config --config /data/config/config.toml
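
The validate and start steps can be chained so the server only launches when the file passes. The sketch below assumes --validate-config exits with a non-zero status on an invalid file, which this page does not state; the function names are hypothetical, and DOCKER is overridable only so the flow can be dry-run with a stub.

```shell
# DOCKER can be overridden (for example DOCKER=echo) to preview the commands.
DOCKER=${DOCKER:-docker}
IMAGE=ghcr.io/lablup/backend-ai-go-server:latest
CONF_IN_CONTAINER=/data/config/config.toml

# Run the image's --validate-config check against a host config file.
aigo_validate() {
  "$DOCKER" run --rm \
    -v "$1:$CONF_IN_CONTAINER:ro" \
    "$IMAGE" --validate-config --config "$CONF_IN_CONTAINER"
}

# Start the server only when validation succeeds.
# Assumption: --validate-config exits non-zero for an invalid file.
aigo_start_validated() {
  if ! aigo_validate "$1"; then
    echo "configuration rejected; server not started" >&2
    return 1
  fi
  "$DOCKER" run -d --name aigo-server \
    -p 8001:8001 -p 8000:8000 \
    -v "$1:$CONF_IN_CONTAINER:ro" \
    -v aigo-models:/data/models \
    -v aigo-data:/data \
    "$IMAGE" --config "$CONF_IN_CONTAINER" --external
}

# Example: aigo_start_validated "$(pwd)/aigo-server.toml"
```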

Persistent Storage

Model files are large (4–80+ GB each). Always use named volumes or bind mounts to persist them across container restarts.

docker volume create aigo-models
docker volume create aigo-data

docker run -d \
  -v aigo-models:/data/models \
  -v aigo-data:/data \
  ghcr.io/lablup/backend-ai-go-server:latest

Bind Mounts (for direct filesystem access)

docker run -d \
  -v /mnt/storage/models:/data/models \
  -v /mnt/storage/data:/data \
  ghcr.io/lablup/backend-ai-go-server:latest

GPU Setup

NVIDIA GPU

Install the NVIDIA Container Toolkit, then use the GPU-enabled compose file:

docker compose -f docker-compose.yml -f docker-compose.gpu.yml up -d

Or with plain Docker:

docker run -d \
  --gpus all \
  --name aigo-server \
  -p 8001:8001 \
  -p 8000:8000 \
  -v aigo-models:/data/models \
  ghcr.io/lablup/backend-ai-go-server:latest

Verify GPU access:

docker exec aigo-server nvidia-smi

AMD GPU (ROCm)

For AMD GPU support, expose the ROCm device files:

docker run -d \
  --device /dev/kfd:/dev/kfd \
  --device /dev/dri:/dev/dri \
  --group-add video \
  --group-add render \
  --name aigo-server \
  -p 8001:8001 \
  -p 8000:8000 \
  -v aigo-models:/data/models \
  ghcr.io/lablup/backend-ai-go-server:latest

Apple Silicon

Apple Silicon (Metal) acceleration is available only in the native macOS desktop application. The Docker image targets Linux (amd64) and does not support Metal.

TLS / HTTPS

The server does not terminate TLS directly. Use a reverse proxy in front of it.

nginx

server {
    listen 443 ssl http2;
    server_name aigo.example.com;

    ssl_certificate     /etc/ssl/certs/aigo.crt;
    ssl_certificate_key /etc/ssl/private/aigo.key;

    location / {
        proxy_pass         http://localhost:8001;
        proxy_set_header   Host $host;
        proxy_set_header   X-Real-IP $remote_addr;
        proxy_set_header   X-Forwarded-For $proxy_add_x_forwarded_for;
        proxy_set_header   X-Forwarded-Proto $scheme;
        proxy_read_timeout 300s;
        proxy_send_timeout 300s;
    }
}

Caddy (automatic HTTPS)

aigo.example.com {
    reverse_proxy localhost:8001
}

Traefik (with Docker labels)

Add labels to the aigo-server service in docker-compose.yml:

labels:
  - "traefik.enable=true"
  - "traefik.http.routers.aigo.rule=Host(`aigo.example.com`)"
  - "traefik.http.routers.aigo.entrypoints=websecure"
  - "traefik.http.routers.aigo.tls.certresolver=myresolver"
  - "traefik.http.services.aigo.loadbalancer.server.port=8001"

Health Monitoring

Health Check Endpoint

# Check server health
curl http://localhost:8001/api/v1/health

# Expected response:
# {"status":"ok","version":"1.2.0"}
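
For scripts, it can help to turn that response into an exit status. This is a sketch that assumes the response keeps the shape shown above; the function names are hypothetical, and the sed expression is a stand-in for a proper JSON parser such as jq.

```shell
# Extract the "status" field from the health JSON (prints nothing if absent).
health_status() {
  printf '%s\n' "$1" |
    sed -n 's/.*"status"[[:space:]]*:[[:space:]]*"\([^"]*\)".*/\1/p'
}

# Return 0 only when the endpoint answers and reports status "ok".
aigo_is_healthy() {
  body=$(curl -fsS --max-time 5 "${1:-http://localhost:8001/api/v1/health}") || return 1
  [ "$(health_status "$body")" = "ok" ]
}

# Example: aigo_is_healthy && echo "server is up"
```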

Docker Health Status

docker inspect aigo-server --format='{{.State.Health.Status}}'
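
In a deployment script you may want to block until the container reports healthy. The loop below is one possible sketch; it assumes the image defines a Docker health check (otherwise .State.Health is empty), and wait_healthy is a hypothetical helper name.

```shell
DOCKER=${DOCKER:-docker}

# Poll the container's health status until it is "healthy" or time runs out.
# Usage: wait_healthy <container> [timeout_seconds]
wait_healthy() {
  name=$1
  remaining=${2:-60}
  while [ "$remaining" -gt 0 ]; do
    # Docker reports "starting", "healthy", or "unhealthy".
    status=$("$DOCKER" inspect --format '{{.State.Health.Status}}' "$name" 2>/dev/null) || status=""
    case "$status" in
      healthy)   return 0 ;;
      unhealthy) echo "$name is unhealthy" >&2; return 1 ;;
    esac
    sleep 1
    remaining=$((remaining - 1))
  done
  echo "timed out waiting for $name" >&2
  return 1
}

# Example: docker compose up -d && wait_healthy aigo-server 120
```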

Updating

Pull New Image

docker compose pull
docker compose up -d

Backup Before Update

# Back up the data volume
docker run --rm \
  -v aigo-data:/data \
  -v "$(pwd)/backups":/backup \
  alpine tar czf /backup/aigo-data-$(date +%Y%m%d).tar.gz /data

# Back up the models list (the models themselves are large, so save only the file listing)
docker exec aigo-server ls /data/models > models-list-$(date +%Y%m%d).txt
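
If an update goes wrong, the archive can be unpacked back into the volume. The following is a sketch rather than a documented procedure: it assumes the archive was made with the tar command above (which stores paths as data/...), that the server is stopped during the restore, and the helper names are hypothetical.

```shell
DOCKER=${DOCKER:-docker}

# Name of the archive the backup command produces for a YYYYMMDD stamp.
backup_file() {
  printf 'aigo-data-%s.tar.gz\n' "$1"
}

# Restore ./backups/aigo-data-<stamp>.tar.gz into the aigo-data volume.
# Extracting at / puts the archived data/... paths back under /data.
aigo_restore() {
  archive=$(backup_file "$1")
  [ -f "backups/$archive" ] || { echo "missing backups/$archive" >&2; return 1; }
  "$DOCKER" run --rm \
    -v aigo-data:/data \
    -v "$(pwd)/backups":/backup:ro \
    alpine tar xzf "/backup/$archive" -C /
}

# Example:
#   docker stop aigo-server && aigo_restore 20250101 && docker start aigo-server
```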

Kubernetes

For production Kubernetes deployments, use the manifests in the k8s/ directory:

# Create namespace and deploy
kubectl apply -f k8s/persistent-volumes.yaml
kubectl apply -f k8s/secrets.yaml
kubectl apply -f k8s/deployment.yaml

# Check status
kubectl get pods -n aigo
kubectl logs -n aigo deploy/aigo-server

# Access via port-forward (for testing)
kubectl port-forward -n aigo svc/aigo-server 8001:8001

For HTTPS with an Ingress controller:

kubectl apply -f k8s/ingress.yaml

For NVIDIA GPU support in Kubernetes:

kubectl apply -f k8s/nvidia-gpu-deployment.yaml

Troubleshooting

Container exits immediately

# Check the container logs
docker logs aigo-server

# Common causes:
# - Port already in use: change -p 8001:8001 to -p 8002:8001
# - Permission denied on volume: check volume ownership
# - Invalid configuration: run with --validate-config first

Models directory permission error

# Fix volume ownership
docker run --rm -u root \
  -v aigo-models:/data/models \
  debian:bookworm-slim \
  chown -R 1000:1000 /data/models

Cannot connect from outside the container

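Make sure the server inside the container binds to 0.0.0.0 (not 127.0.0.1); a server that listens only on the container's loopback interface cannot be reached through published ports: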

docker run -d \
  -e AIGO_HOST=0.0.0.0 \
  -p 8001:8001 \
  ghcr.io/lablup/backend-ai-go-server:latest --external

Check your firewall allows traffic on port 8001.

GPU not detected

# NVIDIA: verify toolkit installation
nvidia-container-cli --version
docker run --rm --gpus all nvidia/cuda:12.3.2-base-ubuntu22.04 nvidia-smi

# AMD: verify device access
ls -la /dev/kfd /dev/dri