CLI Reference¶
The backend-ai-go CLI tool provides command-line access to the Backend.AI GO Management API. Use this tool to manage local models, control inference servers, monitor system resources, and interact with loaded models from the terminal.
Installation¶
The CLI is included with the Backend.AI GO distribution. If you are building from source:
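Since the tool is written in Go, a standard `go build` from the repository root should produce the binary. The output name and module path below are a sketch, not the project's documented build steps:

```shell
# Build the CLI binary in the current directory (paths are illustrative).
go build -o backend-ai-go .

# Or install it directly into $GOPATH/bin (or $GOBIN, if set).
go install .
```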
Usage¶
Global Options¶
| Option | Short | Environment Variable | Description |
|---|---|---|---|
| `--endpoint` | `-e` | `BACKEND_AI_GO_ENDPOINT` | Management API endpoint (URL or configured name). |
| `--token` | `-t` | `BACKEND_AI_GO_TOKEN` | API authentication token. |
| `--output` | `-o` | `BACKEND_AI_GO_OUTPUT` | Output format: `console`, `json`, or `yaml`. |
| `--quiet` | `-q` | | Suppress non-essential output. |
| `--verbose` | `-v` | | Enable verbose output. |
| `--no-verify-ssl` | | | Skip SSL certificate verification. |
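Global options can be supplied per invocation or exported once per shell session via their environment variables. A sketch, with a placeholder endpoint URL:

```shell
# Per-invocation flags:
backend-ai-go --endpoint http://localhost:8080 --output yaml system health

# Equivalent via environment variables:
export BACKEND_AI_GO_ENDPOINT=http://localhost:8080
export BACKEND_AI_GO_OUTPUT=yaml
backend-ai-go system health
```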
Commands¶
config - Configuration Management¶
Manage CLI configuration settings.
- `backend-ai-go config path`: Show the configuration file path.
- `backend-ai-go config get <KEY>`: Get a configuration value.
- `backend-ai-go config set <KEY> <VALUE>`: Set a configuration value.
- `backend-ai-go config list`: List all configuration values.
- `backend-ai-go config reset`: Reset the configuration to defaults.
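For example, to locate the configuration file and review the current settings:

```shell
backend-ai-go config path
backend-ai-go config list
```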
model - Local Model Management¶
Manage models stored on the local disk.
- `backend-ai-go model list`: List all local models.
- `backend-ai-go model info <MODEL_ID>`: Get detailed information about a specific model.
- `backend-ai-go model refresh`: Refresh the model index (scan for new files).
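A typical workflow after copying new model files onto disk is to rescan the index and verify the model appears. The model ID below is a placeholder:

```shell
backend-ai-go model refresh
backend-ai-go model list
backend-ai-go model info my-model   # "my-model" is a hypothetical model ID
```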
loaded - Loaded Model Operations¶
Control models currently loaded into memory for inference.
- `backend-ai-go loaded list`: List currently loaded models.
- `backend-ai-go loaded info <ID>`: Get details of a loaded model instance.
- `backend-ai-go loaded load [OPTIONS] <MODEL_ID>`: Load a model into memory. Options:
    - `-c, --context-length <INT>`: Override the context length.
    - `-g, --gpu-layers <INT>`: Number of layers to offload to the GPU (`-1` for all).
    - `-t, --threads <INT>`: Number of threads to use.
    - `-a, --alias <STRING>`: Model alias for routing.
    - `--tool-calling`: Enable tool-calling capabilities.
    - `--mmproj <PATH>`: Path to the mmproj file for vision models.
- `backend-ai-go loaded unload <ID>`: Unload a model to free resources.
- `backend-ai-go loaded health <ID>`: Check the health status of a loaded model.
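These commands compose into a load/verify/unload lifecycle. The model ID and alias below are placeholders, and `<ID>` is taken from the `loaded list` output:

```shell
backend-ai-go loaded load --gpu-layers -1 --alias chat my-model   # "my-model" is hypothetical
backend-ai-go loaded list
backend-ai-go loaded health <ID>
backend-ai-go loaded unload <ID>
```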
router - Router Control¶
Manage the Continuum Router service.
- `backend-ai-go router status`: Get the current status of the router.
- `backend-ai-go router start`: Start the router service.
- `backend-ai-go router stop`: Stop the router service.
- `backend-ai-go router restart`: Restart the router service.
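For example, to restart the router and confirm it came back up:

```shell
backend-ai-go router restart
backend-ai-go router status
```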
system - System Monitoring¶
Monitor hardware resources and API status.
- `backend-ai-go system info`: Get general system information (OS, architecture).
- `backend-ai-go system metrics`: Get current system metrics (CPU, RAM usage).
- `backend-ai-go system gpu`: Get detailed GPU information.
- `backend-ai-go system health`: Check the overall API health.
- `backend-ai-go system version`: Get the API server version.
Examples¶
List all available models in JSON format:
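This combines `model list` with the global `--output` flag:

```shell
backend-ai-go model list --output json
```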
Load a model with custom GPU layers:
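Using the `-g, --gpu-layers` option of `loaded load`; the model ID and layer count below are placeholders:

```shell
backend-ai-go loaded load --gpu-layers 32 my-model   # "my-model" is hypothetical
```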
Check system GPU status:
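```shell
backend-ai-go system gpu
```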