8.5. Remote Model Control
Backend.AI GO allows the head node to load and unload models on connected remote nodes without requiring direct access to those machines. This completes the distributed inference workflow: you can connect nodes, route inference requests across them, and manage which models run on each node, all from a single interface.
Overview
When a remote node is registered, the head node stores its base URL and API key. Remote model control uses this information to send model management commands to the remote node's Management API on your behalf.
The following operations are supported:
- Load a model — instruct a remote node to start serving a model
- Unload a model — stop a model that is running on a remote node
- List loaded models — query which models are currently active on a remote node
API Reference
All three operations are available as both Tauri IPC commands (desktop app) and REST API endpoints (headless mode).
Load a Model on a Remote Node
Tauri IPC: load_model_on_remote_node
REST: POST /api/v1/nodes/{fingerprint}/models/load
Request body:
{
  "modelId": "llama-3-8b-q4",
  "modelPath": "/path/to/model.gguf",
  "contextLength": 4096,
  "gpuLayers": -1
}
| Field | Type | Required | Description |
|---|---|---|---|
| modelId | string | Yes | Model identifier to load |
| modelPath | string | No | Absolute path to the model file on the remote node |
| contextLength | number | No | Override the default context length |
| gpuLayers | number | No | Number of GPU layers (-1 for all) |
Response:
On failure, success is false and an error field describes the problem.
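As an illustration of the REST form of this call, the following is a minimal sketch that builds the load request and checks the result. The head-node base URL is a hypothetical value supplied by the caller, no head-node authentication is shown, and the `RemoteOpResult` type covers only the `success` and `error` fields mentioned above; the real response may contain more.

```typescript
// Request fields as documented in the table above.
interface LoadModelRequest {
  modelId: string;
  modelPath?: string;
  contextLength?: number;
  gpuLayers?: number;
}

// Only the fields this page mentions; the real response may carry more.
interface RemoteOpResult {
  success: boolean;
  error?: string;
}

// Plain description of an HTTP request, so the sketch stays transport-agnostic.
interface HttpRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Builds the load request. `headBaseUrl` is a hypothetical head-node address.
function buildLoadRequest(
  headBaseUrl: string,
  fingerprint: string,
  req: LoadModelRequest,
): HttpRequest {
  const base = headBaseUrl.replace(/\/+$/, ""); // drop trailing slashes
  return {
    url: `${base}/api/v1/nodes/${encodeURIComponent(fingerprint)}/models/load`,
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(req),
  };
}

// Throws when the parsed response reports a failure.
function assertLoaded(result: RemoteOpResult): void {
  if (!result.success) {
    throw new Error(result.error ?? "remote load failed without an error message");
  }
}
```

The returned object can be handed to any HTTP client, for example `fetch(r.url, { method: r.method, headers: r.headers, body: r.body })`, and the parsed JSON response passed to `assertLoaded`.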
Unload a Model from a Remote Node
Tauri IPC: unload_model_on_remote_node
REST: POST /api/v1/nodes/{fingerprint}/models/unload
Request body:
Response:
List Models on a Remote Node
Tauri IPC: get_remote_node_models
REST: GET /api/v1/nodes/{fingerprint}/models
Response:
{
  "data": [
    {
      "id": "llama-3-8b-q4",
      "path": "/models/llama-3-8b-q4.gguf",
      "endpoint": "http://127.0.0.1:8081",
      "healthy": true
    }
  ]
}
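A client consuming this response might type and validate it along the following lines. This is a sketch based only on the example body above: the `RemoteModel` interface mirrors the four fields shown, and `healthyModelIds` is a hypothetical helper name, not part of the product.

```typescript
// Shape of one entry in the "data" array shown above.
interface RemoteModel {
  id: string;
  path: string;
  endpoint: string;
  healthy: boolean;
}

// Runtime check for untrusted JSON: true only if all four fields have the expected types.
function isRemoteModel(value: unknown): value is RemoteModel {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.id === "string" &&
    typeof v.path === "string" &&
    typeof v.endpoint === "string" &&
    typeof v.healthy === "boolean"
  );
}

// Parses a response body and returns the IDs of models reported as healthy.
// Entries that do not match the expected shape are skipped.
function healthyModelIds(body: string): string[] {
  const parsed: unknown = JSON.parse(body);
  if (typeof parsed !== "object" || parsed === null) {
    throw new Error("response is not a JSON object");
  }
  const data = (parsed as { data?: unknown }).data;
  if (!Array.isArray(data)) throw new Error("missing data array");
  return data
    .filter(isRemoteModel)
    .filter((m) => m.healthy)
    .map((m) => m.id);
}
```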
Real-Time Updates
When a model is successfully loaded or unloaded on a remote node, the head node emits a distributed-pool-changed event. The frontend nodeStore subscribes to this event and automatically:
- Refreshes the remote model list for the affected node (fetchRemoteModels).
- Refreshes the overall distributed pool status (getDistributedPoolStatus).
This keeps the UI in sync without requiring manual page refreshes.
In headless mode the same event is broadcast over the Server-Sent Events (SSE) stream at GET /api/v1/events.
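The refresh behavior described above can be sketched as an event handler. Two things here are assumptions rather than documented facts: that the SSE stream delivers the event as a named event called `distributed-pool-changed`, and that its payload is JSON carrying a `fingerprint` field. `EventSourceLike` and `PoolRefreshers` are hypothetical types introduced for the sketch.

```typescript
// Minimal subset of an EventSource-style interface used by this sketch.
interface EventSourceLike {
  addEventListener(type: string, listener: (ev: { data: string }) => void): void;
}

// The two refresh steps named in the list above.
interface PoolRefreshers {
  fetchRemoteModels: (fingerprint: string) => void;
  getDistributedPoolStatus: () => void;
}

function subscribeToPoolChanges(source: EventSourceLike, refresh: PoolRefreshers): void {
  source.addEventListener("distributed-pool-changed", (ev) => {
    // ASSUMPTION: the payload is JSON with an optional `fingerprint` string.
    let fingerprint: string | undefined;
    try {
      const payload: unknown = JSON.parse(ev.data);
      if (typeof payload === "object" && payload !== null) {
        const fp = (payload as Record<string, unknown>).fingerprint;
        if (typeof fp === "string") fingerprint = fp;
      }
    } catch {
      // Non-JSON payload: skip the per-node refresh and refresh the pool only.
    }
    if (fingerprint !== undefined) refresh.fetchRemoteModels(fingerprint);
    refresh.getDistributedPoolStatus();
  });
}
```

In a browser, an `EventSource` connected to `GET /api/v1/events` could play the role of `source`, possibly with a type cast.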
Timeouts
Operations use conservative timeouts to account for large models:
| Operation | Timeout |
|---|---|
| Load model | 600 seconds (10 minutes) |
| Unload model | 60 seconds |
| List models | 30 seconds |
If a timeout occurs, the response contains "success": false with an error message indicating the timeout duration.
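A REST client that sets its own HTTP deadline may want it to outlast the head node's timeout, so that it receives the `"success": false` response instead of aborting first. The sketch below encodes the table above and adds a margin; the 5-second default margin and the helper names are assumptions made for illustration.

```typescript
type RemoteOp = "load" | "unload" | "list";

// Head-node timeouts from the table above, in milliseconds.
const HEAD_TIMEOUT_MS: Record<RemoteOp, number> = {
  load: 600_000,
  unload: 60_000,
  list: 30_000,
};

// Client-side deadline: the head-node timeout plus a margin.
// The default margin of 5 seconds is a hypothetical choice, not a documented value.
function clientDeadlineMs(op: RemoteOp, marginMs: number = 5_000): number {
  return HEAD_TIMEOUT_MS[op] + marginMs;
}
```

The result can be passed to `AbortSignal.timeout(ms)` and used as the `signal` option of `fetch`.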
Error Handling
| Condition | Behavior |
|---|---|
| Node fingerprint not registered | Returns 404 Not Found |
| Remote node unreachable (connection refused) | Returns error with connection message |
| Remote node request times out | Returns error with timeout duration |
| Remote node returns non-2xx response | Parses the remote error body and returns it |
| Empty or path-traversal model ID | Rejected before the request is sent |
Model IDs are validated before any network request is made. Empty model IDs and IDs containing .. sequences are rejected to prevent URL path manipulation.
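The two validation rules stated above can be sketched as a small pure function. This mirrors only the rules named on this page; the actual validator may apply further checks, and `validateModelId` is a hypothetical name.

```typescript
type ValidationResult = { ok: true } | { ok: false; reason: string };

// Rejects empty IDs and IDs containing a ".." sequence, as described above.
function validateModelId(modelId: string): ValidationResult {
  if (modelId.length === 0) {
    return { ok: false, reason: "model ID is empty" };
  }
  if (modelId.indexOf("..") !== -1) {
    return { ok: false, reason: "model ID contains '..'" };
  }
  return { ok: true };
}
```

Running the check before building the request keeps invalid IDs from ever reaching the network layer.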
Frontend Store
The useNodeStore Zustand store exposes the following selectors and actions for remote model control:
| Export | Type | Description |
|---|---|---|
| useRemoteModels(fingerprint) | selector | Cached models for a specific node |
| useAllRemoteModels() | selector | All remote models keyed by fingerprint |
| useIsRemoteModelOperating(fingerprint) | selector | Whether a load/unload is in progress |
| loadModelOnNode(fingerprint, request) | action | Load a model on a remote node |
| unloadModelOnNode(fingerprint, modelId) | action | Unload a model from a remote node |
| fetchRemoteModels(fingerprint) | action | Refresh the model list for a node |
Unload performs an optimistic update: the model is removed from the local cache immediately, and the distributed-pool-changed event triggers a full refresh from the backend.
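The optimistic step can be illustrated without Zustand as a pure state transition. This is a sketch, not the store's actual code: the cache shape (model lists keyed by fingerprint) is inferred from the selector descriptions above, and `removeModelOptimistically` is a hypothetical helper.

```typescript
interface RemoteModel {
  id: string;
  path: string;
  endpoint: string;
  healthy: boolean;
}

// ASSUMPTION: the cache maps a node fingerprint to that node's model list.
type RemoteModelCache = Record<string, RemoteModel[]>;

// Returns a new cache with `modelId` removed from one node's list.
// The input cache is not mutated, so a later backend refresh can simply replace it.
function removeModelOptimistically(
  cache: RemoteModelCache,
  fingerprint: string,
  modelId: string,
): RemoteModelCache {
  const current = cache[fingerprint];
  if (current === undefined) return cache; // unknown node: nothing to change
  return { ...cache, [fingerprint]: current.filter((m) => m.id !== modelId) };
}
```

Because the subsequent refresh overwrites the cache with the backend's view, a failed unload would make the model reappear in the list.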
Security
All requests to remote nodes use Authorization: Bearer <api_key> headers. The API key is stored in the local node registry and never exposed in logs or the UI. Model IDs are percent-encoded in URLs to prevent path injection.
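The header format and the effect of percent-encoding can be shown with standard JavaScript functions. The helper names below are hypothetical, and the sketch does not reproduce the head node's actual request code or the remote Management API's URL layout.

```typescript
// Builds the Authorization header described above. The key is only placed
// in the header value and is not logged anywhere in this sketch.
function remoteAuthHeaders(apiKey: string): Record<string, string> {
  return { Authorization: `Bearer ${apiKey}` };
}

// Encodes a model ID so it occupies exactly one URL path segment:
// characters such as "/" and spaces are percent-encoded.
function encodeModelIdSegment(modelId: string): string {
  return encodeURIComponent(modelId);
}
```

For example, `encodeModelIdSegment("org/model v1")` yields `org%2Fmodel%20v1`, so a slash inside an ID cannot introduce an extra path segment.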
Related Pages
- Distributed Routing — Automatic request routing across nodes
- Manual Registration — Connect to remote nodes
- Auto-Discovery — Discover nodes on the local network