
8.5. Remote Model Control

Backend.AI GO lets the head node load and unload models on connected remote nodes without requiring direct access to those machines. This completes the distributed inference workflow: you can connect nodes, route inference requests across them, and manage which models run on each node, all from a single interface.

Overview

When a remote node is registered, the head node stores its base URL and API key. Remote model control uses this information to send model management commands to the remote node's Management API on your behalf.

The following operations are supported:

  • Load a model — instruct a remote node to start serving a model
  • Unload a model — stop a model that is running on a remote node
  • List loaded models — query which models are currently active on a remote node

API Reference

All three operations are available as both Tauri IPC commands (desktop app) and REST API endpoints (headless mode).

Load a Model on a Remote Node

Tauri IPC: load_model_on_remote_node

REST: POST /api/v1/nodes/{fingerprint}/models/load

Request body:

{
  "modelId": "llama-3-8b-q4",
  "modelPath": "/path/to/model.gguf",
  "contextLength": 4096,
  "gpuLayers": -1
}

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| modelId | string | Yes | Model identifier to load |
| modelPath | string | No | Absolute path to the model file on the remote node |
| contextLength | number | No | Override the default context length |
| gpuLayers | number | No | Number of GPU layers (-1 for all) |

Response:

{
  "success": true,
  "modelId": "llama-3-8b-q4",
  "nodeFingerprint": "fp_abc123"
}

On failure, success is false and an error field describes the problem.
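
The load request above could be assembled from TypeScript roughly as follows. This is a minimal sketch, not the project's client code: `buildLoadRequest`, `PreparedRequest`, and the `headNodeUrl` parameter are hypothetical names, the `Content-Type` header is an assumption, and any authentication the head node's REST API may require is left out.

```typescript
// Request body for the load endpoint; only modelId is required.
interface LoadModelRequest {
  modelId: string;
  modelPath?: string;     // absolute path on the remote node
  contextLength?: number; // overrides the default context length
  gpuLayers?: number;     // -1 requests all layers on the GPU
}

// Hypothetical container for a request that has not been sent yet.
interface PreparedRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Assembles the REST call without sending it, so it can be inspected or
// handed to any HTTP client. headNodeUrl is an assumed base URL.
function buildLoadRequest(
  headNodeUrl: string,
  fingerprint: string,
  request: LoadModelRequest,
): PreparedRequest {
  const base = headNodeUrl.replace(/\/+$/, ""); // drop trailing slashes
  return {
    url: `${base}/api/v1/nodes/${encodeURIComponent(fingerprint)}/models/load`,
    method: "POST",
    headers: { "Content-Type": "application/json" }, // assumed header
    body: JSON.stringify(request),
  };
}
```

The prepared request can be passed to an HTTP client such as `fetch(prepared.url, prepared)`. The caller should then inspect the `success` field of the parsed body and read `error` when it is false.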

Unload a Model from a Remote Node

Tauri IPC: unload_model_on_remote_node

REST: POST /api/v1/nodes/{fingerprint}/models/unload

Request body:

{
  "modelId": "llama-3-8b-q4"
}

Response:

{
  "success": true,
  "modelId": "llama-3-8b-q4",
  "nodeFingerprint": "fp_abc123"
}

List Models on a Remote Node

Tauri IPC: get_remote_node_models

REST: GET /api/v1/nodes/{fingerprint}/models

Response:

{
  "data": [
    {
      "id": "llama-3-8b-q4",
      "path": "/models/llama-3-8b-q4.gguf",
      "endpoint": "http://127.0.0.1:8081",
      "healthy": true
    }
  ]
}
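
A caller will often want only the models that are ready to serve. The sketch below types the list response shown above and picks out the healthy entries; the interface names and the `healthyEndpoints` helper are illustrative, not part of the documented API.

```typescript
// Shape of one entry in the "data" array of the list response.
interface RemoteModel {
  id: string;
  path: string;
  endpoint: string;
  healthy: boolean;
}

interface RemoteModelList {
  data: RemoteModel[];
}

// Returns a map from model ID to endpoint, keeping healthy models only.
function healthyEndpoints(list: RemoteModelList): Record<string, string> {
  const result: Record<string, string> = {};
  for (const model of list.data) {
    if (model.healthy) {
      result[model.id] = model.endpoint;
    }
  }
  return result;
}
```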

Real-Time Updates

When a model is successfully loaded or unloaded on a remote node, the head node emits a distributed-pool-changed event. The frontend nodeStore subscribes to this event and automatically:

  1. Refreshes the remote model list for the affected node (fetchRemoteModels).
  2. Refreshes the overall distributed pool status (getDistributedPoolStatus).

This keeps the UI in sync without requiring manual page refreshes.

In headless mode the same event is broadcast over the Server-Sent Events (SSE) stream at GET /api/v1/events.
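
For a headless client that reads the SSE stream as raw text, the standard `event:` / `data:` framing can be parsed as sketched below. Whether the head node sends distributed-pool-changed in the `event:` field and what its payload contains are not specified here, so both are assumptions; `parseSseChunk` and `poolChangedPayloads` are hypothetical helper names.

```typescript
interface SseEvent {
  event: string; // defaults to "message" when no event: field is present
  data: string;
}

// Parses complete SSE blocks (separated by a blank line) from a text chunk.
// Comment lines starting with ":" and blocks without data are skipped.
function parseSseChunk(text: string): SseEvent[] {
  const events: SseEvent[] = [];
  for (const block of text.split(/\r?\n\r?\n/)) {
    let event = "message";
    const dataLines: string[] = [];
    for (const line of block.split(/\r?\n/)) {
      if (line.indexOf("event:") === 0) {
        event = line.slice("event:".length).trim();
      } else if (line.indexOf("data:") === 0) {
        // A single leading space after the colon is not part of the value.
        dataLines.push(line.slice("data:".length).replace(/^ /, ""));
      }
    }
    if (dataLines.length > 0) {
      events.push({ event, data: dataLines.join("\n") });
    }
  }
  return events;
}

// Assumption: the event name is carried in the SSE event: field.
function poolChangedPayloads(text: string): string[] {
  return parseSseChunk(text)
    .filter((e) => e.event === "distributed-pool-changed")
    .map((e) => e.data);
}
```

On receiving such an event, a client would typically re-query GET /api/v1/nodes/{fingerprint}/models for the affected node, mirroring what the frontend store does. In a browser, the built-in `EventSource` handles this framing and a hand-written parser is unnecessary.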

Timeouts

Operations use conservative timeouts to account for large models:

| Operation | Timeout |
| --- | --- |
| Load model | 600 seconds (10 minutes) |
| Unload model | 60 seconds |
| List models | 30 seconds |

If a timeout occurs, the response contains "success": false with an error message indicating the timeout duration.
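
One practical consequence for REST callers: a client-side deadline shorter than the values above would abort the request before the head node can report its own result. The sketch below derives a client deadline from the documented timeouts plus a small margin; the helper name and the 5-second default margin are assumptions chosen for illustration.

```typescript
type RemoteOperation = "load" | "unload" | "list";

// Head-node timeouts in seconds, taken from the table above.
const HEAD_NODE_TIMEOUT_SECONDS: Record<RemoteOperation, number> = {
  load: 600,
  unload: 60,
  list: 30,
};

// Returns a client-side deadline in milliseconds that outlasts the head
// node's own timeout, so the head node's timeout error can still arrive.
function clientTimeoutMs(op: RemoteOperation, marginSeconds: number = 5): number {
  return (HEAD_NODE_TIMEOUT_SECONDS[op] + marginSeconds) * 1000;
}
```

The returned value can be used with whatever cancellation mechanism the HTTP client offers, for example an `AbortController` paired with a timer.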

Error Handling

| Condition | Behavior |
| --- | --- |
| Node fingerprint not registered | Returns 404 Not Found |
| Remote node unreachable (connection refused) | Returns error with connection message |
| Remote node request times out | Returns error with timeout duration |
| Remote node returns non-2xx response | Parses the remote error body and returns it |
| Empty or path-traversal model ID | Rejected before the request is sent |

Model IDs are validated before any network request is made. Empty model IDs and IDs containing .. sequences are rejected to prevent URL path manipulation.
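
The two validation rules can be mirrored on the client to fail fast before calling the API. This is a sketch of the checks as described, not the head node's actual implementation; the function name and the error strings are made up for the example.

```typescript
// Returns null when the ID passes both documented checks,
// otherwise a human-readable reason for the rejection.
function validateModelId(modelId: string): string | null {
  if (modelId.length === 0) {
    return "model ID must not be empty";
  }
  if (modelId.indexOf("..") !== -1) {
    return "model ID must not contain '..'";
  }
  return null;
}
```

For instance, `validateModelId("llama-3-8b-q4")` returns null, while `validateModelId("../secrets")` returns a rejection reason. Server-side validation still applies regardless of any client-side check.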

Frontend Store

The useNodeStore Zustand store exposes the following selectors and actions for remote model control:

| Export | Type | Description |
| --- | --- | --- |
| useRemoteModels(fingerprint) | selector | Cached models for a specific node |
| useAllRemoteModels() | selector | All remote models keyed by fingerprint |
| useIsRemoteModelOperating(fingerprint) | selector | Whether a load/unload is in progress |
| loadModelOnNode(fingerprint, request) | action | Load a model on a remote node |
| unloadModelOnNode(fingerprint, modelId) | action | Unload a model from a remote node |
| fetchRemoteModels(fingerprint) | action | Refresh the model list for a node |

Unload performs an optimistic update: the model is removed from the local cache immediately, and the distributed-pool-changed event triggers a full refresh from the backend.
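
The optimistic step amounts to a pure state update on a cache keyed by fingerprint. The sketch below illustrates the idea only; the real store's internal state shape is not documented here, so `RemoteModelCache`, `CachedModel`, and `removeModelOptimistically` are assumed names.

```typescript
// Minimal cached entry; the real store likely keeps more fields.
interface CachedModel {
  id: string;
}

// Assumed cache shape: remote models keyed by node fingerprint.
type RemoteModelCache = Record<string, CachedModel[]>;

// Returns a new cache with the model removed from one node's list.
// The input is never mutated, so subscribers see a fresh object.
function removeModelOptimistically(
  cache: RemoteModelCache,
  fingerprint: string,
  modelId: string,
): RemoteModelCache {
  const current = cache[fingerprint];
  if (current === undefined) {
    return cache; // unknown node: nothing to update
  }
  return {
    ...cache,
    [fingerprint]: current.filter((model) => model.id !== modelId),
  };
}
```

Because the later refresh replaces the cached list with the backend's answer, a failed unload corrects itself: the model reappears once the list is fetched again.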

Security

All requests to remote nodes use Authorization: Bearer <api_key> headers. The API key is stored in the local node registry and never exposed in logs or the UI. Model IDs are percent-encoded in URLs to prevent path injection.
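
The two measures above, the bearer header and percent-encoding, can be illustrated as follows. The remote Management API's URL layout is not documented in this section, so the `/models/{id}` path in this sketch is purely hypothetical, as is the `buildRemoteRequest` helper; only the header format and the use of percent-encoding come from the text.

```typescript
interface RemoteRequest {
  url: string;
  headers: Record<string, string>;
}

// Builds a request target for a remote node. The "/models/" path segment
// is a placeholder, not the real Management API route.
function buildRemoteRequest(baseUrl: string, apiKey: string, modelId: string): RemoteRequest {
  const base = baseUrl.replace(/\/+$/, ""); // drop trailing slashes
  return {
    // encodeURIComponent escapes "/", "?", "#" and spaces, so the ID
    // stays within a single path segment.
    url: `${base}/models/${encodeURIComponent(modelId)}`,
    headers: { Authorization: `Bearer ${apiKey}` },
  };
}
```

Note that `encodeURIComponent` leaves `.` unescaped, which is one reason a separate check for `..` sequences is still needed alongside encoding.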