Multi-Node Overview

Backend.AI GO is designed not just as a standalone application but as a node in a larger computational mesh. This architecture allows you to scale beyond the limits of your local hardware by connecting to other Backend.AI GO instances, Continuum Routers, or enterprise-grade Backend.AI Clusters.

Why Multi-Node?

Local inference is great for privacy and low latency, but it is constrained by your hardware (VRAM, compute power). Multi-node capabilities allow you to:

  • Access Larger Models: Run models that don't fit in your local GPU's VRAM by offloading them to a server or cluster.
  • Scale Throughput: Distribute inference requests across multiple nodes to handle higher concurrency.
  • Centralize Management: Use Backend.AI GO as a client to manage and utilize resources from a powerful central server.

Node Types

In the Backend.AI GO ecosystem, there are three primary types of connections:

  1. Local Node: Your current machine running Backend.AI GO. It runs its own Continuum Router and manages local models.
  2. Peer Node (Backend.AI GO): Another computer running Backend.AI GO. You can connect to it directly to use its loaded models.
  3. Backend.AI Cluster: An enterprise-grade cluster managed by Backend.AI Core. This provides massive scalability, user management, and security features.

How It Works

Backend.AI GO uses a Mesh Networking approach.

  1. The Continuum Router: At the heart of every node is the Continuum Router. It acts as an API gateway that routes your prompts to the appropriate backend (local process, remote peer, or cloud API).
  2. Unified API: Whether the model is running locally on your laptop or on an H100 cluster in a data center, Backend.AI GO treats it the same way. You simply select the model from the dropdown, and the system handles the routing (see the request sketch after this list).
  3. Security: Connections to Clusters are secured using HMAC-SHA256 authentication (Access Key & Secret Key); a signing sketch follows the first example below.
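To make the unified API concrete, here is a minimal sketch of sending a prompt through the local Continuum Router. It assumes the router exposes an OpenAI-compatible chat-completions endpoint; the port (8000), path, and model name are illustrative assumptions, not values taken from Backend.AI GO's configuration.

```python
# Minimal sketch: sending a prompt through the local Continuum Router.
# The endpoint URL, port, and model name below are assumptions for
# illustration; substitute the values your router actually exposes.
import json
import urllib.request

ROUTER_URL = "http://localhost:8000/v1/chat/completions"  # hypothetical local router address

payload = {
    "model": "llama-3-8b-instruct",  # whichever model you selected, local or remote
    "messages": [{"role": "user", "content": "Hello from a mesh node!"}],
}

req = urllib.request.Request(
    ROUTER_URL,
    data=json.dumps(payload).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
with urllib.request.urlopen(req) as resp:
    print(json.load(resp)["choices"][0]["message"]["content"])
```

Because the router is the single entry point, this same request shape works whether the model runs in a local process, on a peer node, or in a remote cluster.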
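For the cluster connection, the sketch below shows the general HMAC-SHA256 request-signing pattern with an Access Key and Secret Key. The exact canonical string and header layout Backend.AI Core expects are assumptions here; consult the cluster's API reference for the authoritative format.

```python
# General HMAC-SHA256 request-signing pattern with an access/secret key pair.
# The canonical-string layout and Authorization header format below are
# assumed for illustration; the real cluster API may differ.
import hashlib
import hmac
from datetime import datetime, timezone

ACCESS_KEY = "AKIA..."           # issued by the cluster administrator
SECRET_KEY = b"your-secret-key"  # never embed real secrets in source code

def sign_request(method: str, path: str) -> dict:
    """Return auth headers for a request, signed with HMAC-SHA256."""
    date = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    # Canonical string: an assumed composition of method, path, and timestamp.
    canonical = f"{method}\n{path}\n{date}"
    signature = hmac.new(SECRET_KEY, canonical.encode(), hashlib.sha256).hexdigest()
    return {
        "Date": date,
        "Authorization": f"BackendAI signMethod=HMAC-SHA256, credential={ACCESS_KEY}:{signature}",
    }

print(sign_request("GET", "/v1/models"))
```

The key point is that the Secret Key never travels over the wire: only the derived signature does, and the server recomputes it from its own copy of the key to verify the request.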

Getting Started

To expand your compute capabilities, you can:

  • Connect to a Peer Node: Link to another computer running Backend.AI GO and use its loaded models directly.
  • Connect to a Backend.AI Cluster: Register your Access Key and Secret Key to tap into enterprise-scale resources.