
9.9. Container Troubleshooting Guide

This guide covers common issues with container execution and multi-channel messaging in Backend.AI GO.

Container Runtime Issues

Runtime Not Detected

Symptom: Settings > Container shows "Not available" for the runtime.

Diagnosis:

curl http://localhost:55765/api/v1/container/runtime
# Expected: "available": true
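For scripted checks, the same test can be wrapped in a small helper. This is a minimal sketch, assuming the endpoint returns JSON that contains an "available" boolean as in the expected output above. The function name runtime_available is hypothetical, and the grep pattern is a rough match rather than a real JSON parser.

```shell
# Succeeds (exit 0) when the JSON on stdin contains "available": true.
# Hypothetical helper: a pattern match, not a full JSON parser.
runtime_available() {
  grep -Eq '"available"[[:space:]]*:[[:space:]]*true'
}

# Usage (requires Backend.AI GO to be running):
#   curl -s http://localhost:55765/api/v1/container/runtime | runtime_available \
#     && echo "runtime available" || echo "runtime NOT available"
```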

Solutions:

Apple Container:

  1. Verify Apple Container is installed:

    which container
    container --version
    
  2. Start the container system service:

    container system start
    
  3. If the command is not found, reinstall from the Apple Container GitHub releases.

Docker:

  1. Verify Docker is running:

    docker version
    
  2. If Docker is not running, start Docker Desktop (macOS/Windows) or the Docker daemon (Linux):

    # Linux only
    sudo systemctl start docker
    sudo systemctl enable docker
    
  3. Verify your user is in the docker group (Linux):

    groups $USER | grep docker
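    # If docker is not listed, add your user to the group:
    sudo usermod -aG docker "$USER"
    # Apply the new group in the current shell (or log out and back in):
    newgrp docker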
    

Runtime Detected but Commands Fail

Symptom: Runtime shows as available but container operations fail.

Solutions:

  • Restart the container runtime. For Docker Desktop, quit and reopen the application; for Apple Container, run container system stop && container system start
  • Check available disk space; the runtime needs free space for image layers and container filesystems
  • Review the Backend.AI GO application logs for detailed error messages

Image Build Issues

Build Fails Immediately

Symptom: Image build starts but fails right away.

Diagnosis:

curl http://localhost:55765/api/v1/container/image/status

Solutions:

  • Ensure the container runtime is running before attempting a build
  • Check available disk space (the agent runner image is approximately 1-2 GB)
  • Look at the output field in the build status response for the error
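The disk space check can be scripted before starting a build. This is a sketch under assumptions: the helper name has_free_space is hypothetical, the 2 GB threshold comes from the upper end of the image size estimate above, and the directory to check depends on where your runtime stores images (Docker Desktop, for example, keeps them inside its own virtual disk).

```shell
# Succeeds when the filesystem holding DIR has at least MIN_KB kilobytes free.
# Hypothetical helper; which DIR to check depends on your runtime's storage location.
has_free_space() {
  dir=$1
  min_kb=$2
  # df -P prints one POSIX-format line per filesystem; with -k, column 4 is free space in KB.
  avail_kb=$(df -Pk "$dir" | awk 'NR == 2 { print $4 }')
  [ "$avail_kb" -ge "$min_kb" ]
}

# Usage: require about 2 GB (2 * 1024 * 1024 KB) before building.
#   has_free_space "$HOME" 2097152 || echo "low disk space" >&2
```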

Build Hangs Indefinitely

Symptom: Build progress indicator runs for a long time with no result.

Solutions:

  • Network issues can cause base image downloads to hang; check your internet connection
  • If using a proxy, ensure Docker/Apple Container is configured to use it
  • Cancel the build and try again after verifying network connectivity

Image Becomes Outdated

Symptom: Container agents fail with missing tools or incompatible dependencies.

Solution: Rebuild the image after a Backend.AI GO update:

  • Settings > Container > Image → Click Rebuild Image

Mount Security Issues

Mount Rejected: "Path contains blocked pattern"

Symptom: Mount validation fails with a blocked pattern error.

Cause: The host path contains a blocked pattern (e.g., .ssh, .env, .aws).

Solution: Use a different path that does not contain sensitive directory names, or copy only the necessary files to an approved location.
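To check a path before submitting it, a rough pre-check can look for the example names above. This is only a sketch: the real validator's pattern list and matching rules are not documented here, so the three names and the component-wise match are assumptions, and has_blocked_component is a hypothetical helper.

```shell
# Succeeds when any path component is exactly .ssh, .env, or .aws.
# Assumption: the real validator may block more patterns or match differently.
has_blocked_component() {
  # Wrapping the path in slashes lets one pattern match first, middle, and last components.
  case "/$1/" in
    */.ssh/*|*/.env/*|*/.aws/*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage:
#   has_blocked_component "$HOME/.ssh/keys" && echo "would be rejected"
```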

Mount Rejected: "Path not under any allowed root"

Symptom: Mount validation fails even though the path does not contain blocked patterns.

Cause: The path is not under any configured allowed root.

Solution:

  1. Go to Settings > Container > Mount Security.

  2. Add the parent directory of the path you want to use as an allowed root.

  3. Be specific — avoid adding broad roots like your home directory.
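The "under an allowed root" rule can be pictured as a prefix test on whole path components. The sketch below is illustrative only; is_under_root is a hypothetical helper, and it assumes both arguments are already absolute, canonical paths.

```shell
# Succeeds when PATH equals ROOT or lies beneath it.
# Compares whole components, so /data/projects2 is NOT under /data/projects.
is_under_root() {
  p=${1%/}   # drop one trailing slash, if any
  r=${2%/}
  case "$p/" in
    "$r"/*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage: a narrow root accepts only its own subtree.
#   is_under_root /data/projects/app/src /data/projects/app && echo "allowed"
```

A narrow root such as a single project directory keeps unrelated files out of reach, which is why broad roots are discouraged.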

Mount Rejected: Path Resolves Through a Symlink

Symptom: A path is rejected even though the literal path looks safe.

Cause: The path resolves via symlinks to a blocked location.

Solution: Use the canonical (real) path, not a symlink. The mount validator canonicalizes all paths before checking.
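To see what a directory path resolves to, you can print its physical location. This is a small sketch using only POSIX shell features; canonical_dir is a hypothetical helper, it works for directories only, and it may not match the validator's canonicalization in every edge case.

```shell
# Prints the physical (symlink-free) path of a directory, or fails if it does not exist.
canonical_dir() {
  # The subshell keeps the caller's working directory unchanged.
  ( cd "$1" 2>/dev/null && pwd -P )
}

# Usage: if the output differs from the input, the path goes through a symlink.
#   canonical_dir "$HOME/workspace"
```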


Credential Proxy Issues

API Calls Fail Inside Container

Symptom: Container agents report authentication errors when calling the API.

Diagnosis Steps:

  1. Verify the proxy is running:

    curl http://localhost:55765/api/v1/container/credential-proxy/status
    # Expected: "running": true
    
  2. Verify the credential mapping exists:

    curl http://localhost:55765/api/v1/container/credential-proxy/status
    # Check "mappingCount" > 0
    
  3. Check the container environment:

    • ANTHROPIC_BASE_URL should point to http://host-gateway:3001
    • ANTHROPIC_API_KEY should be CREDENTIAL_PROXY_PLACEHOLDER
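The environment check in step 3 can be run as a function inside the container. This is a sketch: check_proxy_env is a hypothetical helper, and it accepts any http host on port 3001 because the host-gateway address varies by platform.

```shell
# Succeeds when the two proxy-related variables look as described above.
# Run inside the container; prints each mismatch to stderr.
check_proxy_env() {
  status=0
  case "${ANTHROPIC_BASE_URL:-}" in
    http://*:3001|http://*:3001/*) ;;
    *)
      echo "unexpected ANTHROPIC_BASE_URL: ${ANTHROPIC_BASE_URL:-<unset>}" >&2
      status=1
      ;;
  esac
  if [ "${ANTHROPIC_API_KEY:-}" != "CREDENTIAL_PROXY_PLACEHOLDER" ]; then
    echo "ANTHROPIC_API_KEY is not the proxy placeholder" >&2
    status=1
  fi
  return "$status"
}

# Usage:
#   check_proxy_env && echo "container environment looks correct"
```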

Solutions:

  • Restart the credential proxy: Settings > Container > Credential Proxy → Toggle off and on
  • Verify the upstream URL in the credential mapping is correct and reachable
  • Check that the real API key is valid

Proxy Starts But Requests Fail

Symptom: Proxy status shows running, but API calls from containers still fail.

Solutions:

  • Verify the host-gateway address is correct for your platform:
    • Apple Container: typically 192.168.64.1
    • Docker on Linux: typically 172.17.0.1
    • Docker on macOS/Windows: host.docker.internal, or check docker network inspect bridge
  • Check firewall rules are not blocking port 3001

IPC Communication Issues

Container Does Not Respond to Follow-Up Messages

Symptom: Steering messages sent during Cowork or Squad execution are not received by the container.

Diagnosis:

Check if the IPC directory structure exists:

DATA_DIR/sessions/{group}/{session}/ipc/input/

Solutions:

  • Verify the container is still running (check run history)
  • Ensure the IPC directories were created: use POST /api/v1/container/ipc/directories
  • Check that the container has the IPC directory mounted at /workspace/ipc
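The directory check can be scripted on the host. This is a sketch under assumptions: check_ipc_dirs is a hypothetical helper, the session directory is passed in because DATA_DIR, {group}, and {session} depend on your installation, and ipc/messages is assumed to sit next to ipc/input.

```shell
# Succeeds when the expected IPC subdirectories exist under SESSION_DIR.
# Assumption: ipc/messages lives beside ipc/input in the same session directory.
check_ipc_dirs() {
  session_dir=$1
  status=0
  for sub in ipc/input ipc/messages; do
    if [ ! -d "$session_dir/$sub" ]; then
      echo "missing: $session_dir/$sub" >&2
      status=1
    fi
  done
  return "$status"
}

# Usage (substitute your own data directory, group, and session):
#   check_ipc_dirs "$DATA_DIR/sessions/$GROUP/$SESSION"
```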

Container Sends Messages But They Are Not Delivered

Symptom: Containers write to ipc/messages/ but messages do not appear in the channel.

Solutions:

  • Verify the channel is connected: Settings > Channels > {Channel}
  • Check the message router status in the application logs
  • Verify the chat JID in the message matches a connected channel's JID format

Channel Connection Issues

Telegram Bot Does Not Respond

Solutions:

  1. Verify the bot token is valid: Settings > Channels > Telegram > Validate Token

  2. Check the connection status:

    curl http://localhost:55765/api/v1/channels/telegram
    
  3. In group chats, ensure the bot is a member of the group and has message reading permissions.

  4. Test with a direct message to the bot first.

Slack Bot Not Receiving Messages

Solutions:

  1. Verify both tokens are valid (bot token xoxb-... and app token xapp-...)

  2. Verify Socket Mode is enabled in the Slack app configuration

  3. Check that the required bot scopes are granted:

    • app_mentions:read, channels:history, chat:write, groups:history, im:history, im:read, im:write
  4. Ensure the bot is installed to your workspace and added to the channels you want it to monitor
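If you have the granted scopes as a comma-separated list, the comparison against the required set in step 3 can be automated. This is a sketch: missing_scopes is a hypothetical helper, and how you obtain the list is an assumption (Slack Web API responses usually report it in an x-oauth-scopes header, and the app's OAuth & Permissions page also shows it).

```shell
# Scopes listed in step 3 above.
REQUIRED_SCOPES="app_mentions:read channels:history chat:write groups:history im:history im:read im:write"

# Prints every required scope that is absent from the comma-separated list in $1.
missing_scopes() {
  granted=",$1,"   # surrounding commas let each scope match as a whole entry
  for scope in $REQUIRED_SCOPES; do
    case "$granted" in
      *",$scope,"*) ;;
      *) echo "$scope" ;;
    esac
  done
}

# Usage: empty output means all required scopes are granted.
#   missing_scopes "chat:write,im:read,im:write"
```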

Discord Bot Not Responding in Channels

Solutions:

  1. Verify the Message Content Intent is enabled in the Discord Developer Portal

  2. Ensure the bot is invited to the server with the correct permissions

  3. In text channels, the bot only responds to @mentions — verify you are mentioning the bot

  4. Check the gateway connection status:

    curl http://localhost:55765/api/v1/channels/discord
    

WhatsApp Webhook Verification Fails

Solutions:

  1. Verify the webhook URL is publicly accessible (HTTPS required):

    curl "https://your-domain.com/api/v1/channels/whatsapp/webhook?hub.mode=subscribe&hub.verify_token=your-token&hub.challenge=test"
    # Should return: test
    
  2. Ensure the verify token in Backend.AI GO matches the one you entered in Meta's developer portal

  3. Check that your HTTPS certificate is valid (self-signed certificates are not accepted by WhatsApp)


Task Scheduling Issues

Scheduled Tasks Do Not Run

Diagnosis:

curl http://localhost:55765/api/v1/container/schedules
# Check "enabled": true and "nextRun" timestamp

Solutions:

  • Verify enabled is true on the schedule
  • Check that the container runtime is available
  • Review the schedule's run logs for errors:

    curl http://localhost:55765/api/v1/container/schedules/{id}/logs
    

Cron Tasks Fire at Wrong Times

Solutions:

  • Verify your system timezone is configured correctly
  • Use a cron expression validator (e.g., crontab.guru) to check your expression
  • Remember that cron uses local system time, not UTC
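To see how far local time is from UTC on the machine running Backend.AI GO, print both clocks side by side. This is a minimal sketch; show_clock_offset is a hypothetical helper built on the standard date command. For example, on a machine at +0900 a schedule of 0 9 * * * fires at 09:00 local time, which is 00:00 UTC.

```shell
# Prints local time, UTC time, and the local UTC offset (for example +0900).
show_clock_offset() {
  echo "local:  $(date '+%Y-%m-%d %H:%M:%S %Z')"
  echo "utc:    $(date -u '+%Y-%m-%d %H:%M:%S')"
  echo "offset: $(date '+%z')"
}

show_clock_offset
```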

Interval Tasks Drift Over Time

Solutions:

  • The scheduler already guards against drift: next_run is computed from the scheduled time, not from the actual execution time, so small delays should not accumulate
  • Verify system clock is accurate (sync with NTP)
  • Check for excessively long task execution times that might delay the scheduler
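The drift-prevention idea can be illustrated with integer timestamps. This is a sketch, not the scheduler's actual code: the function name next_run and its arguments are hypothetical, and skipping slots that were missed during a long run is an assumption about the catch-up behavior.

```shell
# Computes the next run time (epoch seconds) anchored to the schedule.
# Arguments: last scheduled time, interval in seconds, current time.
next_run() {
  scheduled=$1
  interval=$2
  now=$3
  next=$(( scheduled + interval ))
  # Assumption: if the task overran one or more slots, jump to the first slot after "now".
  if [ "$next" -le "$now" ]; then
    missed=$(( (now - next) / interval + 1 ))
    next=$(( next + missed * interval ))
  fi
  echo "$next"
}

# A task scheduled at t=1000 every 60 s that finishes at t=1007 runs next at 1060,
# not 1067, so the 7 s of execution time does not accumulate.
next_run 1000 60 1007
```

Anchoring on the scheduled time keeps every run on the original grid (1000, 1060, 1120, ...), whereas adding the interval to the finish time would shift each run by the previous run's duration.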

Security Event Issues

Unexpected "violation" Events in Audit Log

Symptom: The audit log shows permission violations for container operations you expect to be allowed.

Diagnosis:

curl "http://localhost:55765/api/v1/container/audit-log?severity=violation"

Solutions:

  • Check whether a container in a non-main group is trying to send messages to a different group's chat JID. This is blocked by design, so the violation is expected and correct
  • If the violation is unexpected, review the group namespace configuration

Security Audit Marks Containers as Non-Compliant

Symptom: Periodic security audits report containers as non-compliant.

Solutions:

  • Verify you have not modified the container environment variables (especially ANTHROPIC_API_KEY)
  • Check that mounts have not changed since the container started
  • Review the audit log details for the specific check that failed

Getting Help

If you cannot resolve an issue using this guide:

  1. Enable Debug Logging in Settings > Advanced

  2. Collect the application logs:

    aigo logs --last 500
    
  3. Export the container audit log:

    curl "http://localhost:55765/api/v1/container/audit-log?limit=500" > audit.json
    
  4. Report the issue with the collected logs at the Backend.AI GO GitHub repository.