10.5. Remote Model Control¶

Backend.AI GO lets the head node load and unload models on connected remote nodes without direct access to those machines. This completes the distributed inference workflow: you can connect nodes, route inference requests across them, and manage which models run on each node, all from a single interface.

Overview¶

When a remote node is registered, the head node stores its base URL and API key. Remote model control uses this information to send model management commands to the remote node's Management API on your behalf.

The following operations are supported:

Load a model — instruct a remote node to start serving a model
Unload a model — stop a model that is running on a remote node
List loaded models — query which models are currently active on a remote node

API Reference¶

All three operations are available as both Tauri IPC commands (desktop app) and REST API endpoints (headless mode).

Load a Model on a Remote Node¶

Tauri IPC: load_model_on_remote_node

REST: POST /api/v1/nodes/{fingerprint}/models/load

Request body:

{
  "modelId": "llama-3-8b-q4",
  "modelPath": "/path/to/model.gguf",
  "contextLength": 4096,
  "gpuLayers": -1
}

Field	Type	Required	Description
`modelId`	string	Yes	Model identifier to load
`modelPath`	string	No	Absolute path to the model file on the remote node
`contextLength`	number	No	Override the default context length
`gpuLayers`	number	No	Number of model layers to offload to the GPU (-1 for all)

Response:

{
  "success": true,
  "modelId": "llama-3-8b-q4",
  "nodeFingerprint": "fp_abc123"
}

On failure, success is false and an error field describes the problem.
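
As a rough illustration of the REST form, the sketch below assembles the URL and request options for the load endpoint without sending anything. The base URL `http://localhost:8000` is a placeholder, the helper name `buildLoadRequest` is hypothetical, and any authentication the head node's own REST API may require is omitted because this page does not describe it.

```typescript
// Request body fields as documented in the table above.
interface LoadModelRequest {
  modelId: string;
  modelPath?: string;
  contextLength?: number;
  gpuLayers?: number;
}

interface PreparedRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Hypothetical helper: builds the URL and fetch options for the load endpoint.
// `headBaseUrl` is an assumption: the address where the head node's REST API listens.
function buildLoadRequest(
  headBaseUrl: string,
  fingerprint: string,
  body: LoadModelRequest,
): PreparedRequest {
  const base = headBaseUrl.replace(/\/+$/, ""); // drop trailing slashes
  return {
    url: `${base}/api/v1/nodes/${encodeURIComponent(fingerprint)}/models/load`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(body),
    },
  };
}

const loadRequest = buildLoadRequest("http://localhost:8000/", "fp_abc123", {
  modelId: "llama-3-8b-q4",
  gpuLayers: -1,
});
// To send it: await fetch(loadRequest.url, loadRequest.init)
```

Keeping request construction separate from the network call makes the request easy to inspect or log before it is sent.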

Unload a Model from a Remote Node¶

Tauri IPC: unload_model_on_remote_node

REST: POST /api/v1/nodes/{fingerprint}/models/unload

Request body:

{
  "modelId": "llama-3-8b-q4"
}

Response:

{
  "success": true,
  "modelId": "llama-3-8b-q4",
  "nodeFingerprint": "fp_abc123"
}
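
Load and unload return the same response shape, so a caller can handle both with one small function. The sketch below is illustrative only: `describeResult` is a hypothetical name, and treating `modelId` and `nodeFingerprint` as optional on failure is an assumption, since this page only states that `success` is false and an `error` field is present.

```typescript
// Response shape for load/unload as shown above.
// Assumption: fields other than `success` may be missing on failure.
interface RemoteModelOpResult {
  success: boolean;
  modelId?: string;
  nodeFingerprint?: string;
  error?: string;
}

// Hypothetical helper: turns a result into a one-line status message.
function describeResult(op: "load" | "unload", result: RemoteModelOpResult): string {
  if (result.success) {
    return `${op} ok: ${result.modelId} on ${result.nodeFingerprint}`;
  }
  return `${op} failed: ${result.error ?? "unknown error"}`;
}
```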

List Models on a Remote Node¶

Tauri IPC: get_remote_node_models

REST: GET /api/v1/nodes/{fingerprint}/models

Response:

{
  "data": [
    {
      "id": "llama-3-8b-q4",
      "path": "/models/llama-3-8b-q4.gguf",
      "endpoint": "http://127.0.0.1:8081",
      "healthy": true
    }
  ]
}
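
A client will often want only the models that are ready to serve. The following sketch types the list response as shown above and picks out the healthy entries; the type and function names are hypothetical and not part of the documented API.

```typescript
// Entry and envelope shapes taken from the sample response above.
interface RemoteModel {
  id: string;
  path: string;
  endpoint: string;
  healthy: boolean;
}

interface RemoteModelList {
  data: RemoteModel[];
}

// Hypothetical helper: IDs of the models that report `healthy: true`.
function healthyModelIds(list: RemoteModelList): string[] {
  return list.data.filter((m) => m.healthy).map((m) => m.id);
}
```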

Real-Time Updates¶

When a model is successfully loaded or unloaded on a remote node, the head node emits a distributed-pool-changed event. The frontend nodeStore subscribes to this event and automatically:

Refreshes the remote model list for the affected node (fetchRemoteModels).
Refreshes the overall distributed pool status (getDistributedPoolStatus).

This keeps the UI in sync without requiring manual page refreshes.

In headless mode the same event is broadcast over the Server-Sent Events (SSE) stream at GET /api/v1/events.
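
For headless clients that read the stream without a browser `EventSource`, the sketch below parses SSE text and keeps only pool-change events. It assumes the event name travels in the standard SSE `event:` field, which this page does not confirm, and it handles complete text only, not frames split across network chunks. The function names are hypothetical.

```typescript
interface SseEvent {
  event: string;
  data: string;
}

// Parses SSE frames from a block of text. Frames are separated by a blank line.
function parseSseFrames(text: string): SseEvent[] {
  const events: SseEvent[] = [];
  for (const frame of text.split(/\r?\n\r?\n/)) {
    let event = "message"; // default event type in the SSE format
    const dataLines: string[] = [];
    for (const line of frame.split(/\r?\n/)) {
      if (line.startsWith("event:")) {
        event = line.slice("event:".length).trim();
      } else if (line.startsWith("data:")) {
        // A single leading space after the colon is not part of the value.
        dataLines.push(line.slice("data:".length).replace(/^ /, ""));
      }
    }
    // Frames without data (including comments and the trailing empty frame) are skipped.
    if (dataLines.length > 0) {
      events.push({ event, data: dataLines.join("\n") });
    }
  }
  return events;
}

// Assumption: pool changes arrive with `event: distributed-pool-changed`.
function poolChangeEvents(text: string): SseEvent[] {
  return parseSseFrames(text).filter((e) => e.event === "distributed-pool-changed");
}
```

On receiving such an event, a client would typically re-query `GET /api/v1/nodes/{fingerprint}/models` for the affected node, mirroring what the desktop frontend does.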

Timeouts¶

Each operation has a generous timeout so that large models have enough time to load:

Operation	Timeout
Load model	600 seconds (10 minutes)
Unload model	60 seconds
List models	30 seconds

If a timeout occurs, the response contains "success": false with an error message indicating the timeout duration.
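
A client calling the REST API may want its own timeout to be slightly longer than the head node's, so that it receives the head node's error response instead of aborting first. The sketch below encodes the table above; the 5-second margin and the helper name are assumptions chosen for illustration.

```typescript
type RemoteOp = "load" | "unload" | "list";

// Head-node timeouts from the table above, in milliseconds.
const REMOTE_OP_TIMEOUT_MS: Record<RemoteOp, number> = {
  load: 600_000,
  unload: 60_000,
  list: 30_000,
};

// Hypothetical helper: client-side timeout with a safety margin (assumed default: 5 s).
function clientTimeoutMs(op: RemoteOp, marginMs: number = 5_000): number {
  return REMOTE_OP_TIMEOUT_MS[op] + marginMs;
}

// Example use with fetch:
//   fetch(url, { method: "POST", signal: AbortSignal.timeout(clientTimeoutMs("load")) })
```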

Error Handling¶

Condition	Behavior
Node fingerprint not registered	Returns 404 Not Found
Remote node unreachable (connection refused)	Returns error with connection message
Remote node request times out	Returns error with timeout duration
Remote node returns non-2xx response	Parses the remote error body and returns it
Empty or path-traversal model ID	Rejected before the request is sent

Model IDs are validated before any network request is made. Empty model IDs and IDs containing .. sequences are rejected to prevent URL path manipulation.
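
The two documented rejection rules can be sketched as a small check that runs before any request is built. This is not the head node's actual validator, which may apply further rules; the function name and error strings are hypothetical.

```typescript
// Returns an error message if the ID is rejected, or null if it passes.
// Only the two rules stated above are implemented here.
function validateModelId(modelId: string): string | null {
  if (modelId.length === 0) {
    return "model ID must not be empty";
  }
  if (modelId.includes("..")) {
    return "model ID must not contain '..'";
  }
  return null;
}
```

Running the same check on the client avoids a round trip for input that would be rejected anyway.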

Frontend Store¶

The useNodeStore Zustand store exposes the following selectors and actions for remote model control:

Export	Type	Description
`useRemoteModels(fingerprint)`	selector	Cached models for a specific node
`useAllRemoteModels()`	selector	All remote models keyed by fingerprint
`useIsRemoteModelOperating(fingerprint)`	selector	Whether a load/unload is in progress
`loadModelOnNode(fingerprint, request)`	action	Load a model on a remote node
`unloadModelOnNode(fingerprint, modelId)`	action	Unload a model from a remote node
`fetchRemoteModels(fingerprint)`	action	Refresh the model list for a node

Unload performs an optimistic update: the model is removed from the local cache immediately, and the distributed-pool-changed event triggers a full refresh from the backend.
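
The optimistic step can be pictured as a pure function over the cached model lists. The sketch below is not the store's actual code: the cache shape (a record keyed by fingerprint) and the function name are assumptions made for illustration.

```typescript
interface CachedModel {
  id: string;
}

// Assumed cache shape: model lists keyed by node fingerprint.
type RemoteModelCache = Record<string, CachedModel[]>;

// Returns a new cache with `modelId` removed from the given node's list.
// The input is not mutated, which suits Zustand-style state updates.
function removeModelOptimistically(
  cache: RemoteModelCache,
  fingerprint: string,
  modelId: string,
): RemoteModelCache {
  const models = cache[fingerprint];
  if (!models) {
    return cache; // nothing cached for this node
  }
  return { ...cache, [fingerprint]: models.filter((m) => m.id !== modelId) };
}
```

If the unload later fails, the subsequent refresh from the backend restores the model to the list, so the optimistic removal does not leave the cache permanently wrong.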

Security¶

All requests to remote nodes use Authorization: Bearer <api_key> headers. The API key is stored in the local node registry and never exposed in logs or the UI. Model IDs are percent-encoded in URLs to prevent path injection.
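
To make the two mechanisms concrete, the sketch below builds a Bearer header and percent-encodes a model ID with the standard `encodeURIComponent`. The helper name is hypothetical, and how the head node places the encoded ID in the remote Management API URL is not documented here, so no remote path is shown.

```typescript
interface RemoteRequestParts {
  headers: Record<string, string>;
  encodedModelId: string;
}

// Hypothetical helper: the auth header and URL-safe model ID for a remote call.
function remoteRequestParts(apiKey: string, modelId: string): RemoteRequestParts {
  return {
    headers: { Authorization: `Bearer ${apiKey}` },
    // "/" becomes "%2F", so the ID cannot introduce extra path segments.
    encodedModelId: encodeURIComponent(modelId),
  };
}
```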

Related Pages¶

Distributed Routing: Automatic request routing across nodes
Manual Registration: Connect to remote nodes
Auto-Discovery: Discover nodes on the local network
