Multi-Node Overview¶
Backend.AI GO is designed not just as a standalone application but as a node in a larger computational mesh. This architecture allows you to scale beyond the limits of your local hardware by connecting to other Backend.AI GO instances, Continuum Routers, or enterprise-grade Backend.AI Clusters.
Why Multi-Node?¶
Local inference is great for privacy and low latency, but it is constrained by your hardware (VRAM, compute power). Multi-node capabilities allow you to:
- Access Larger Models: Run models that exceed your local GPU's memory by offloading them to a server or cluster.
- Scale Throughput: Distribute inference requests across multiple nodes to handle higher concurrency.
- Centralize Management: Use Backend.AI GO as a client to manage and utilize resources from a powerful central server.
Node Types¶
In the Backend.AI GO ecosystem, there are three primary node types you can connect to:
- Local Node: Your current machine running Backend.AI GO. It runs its own Continuum Router and manages local models.
- Peer Node (Backend.AI GO): Another computer running Backend.AI GO. You can connect to it directly to use its loaded models.
- Backend.AI Cluster: An enterprise-grade cluster managed by Backend.AI Core. This provides massive scalability, user management, and security features.
How It Works¶
Backend.AI GO uses a Mesh Networking approach.
- The Continuum Router: At the heart of every node is the Continuum Router. It acts as an API gateway that routes your prompts to the appropriate backend (local process, remote peer, or cloud API).
- Unified API: Whether the model is running locally on your laptop or on an H100 cluster in a data center, Backend.AI GO treats it the same way. You simply select the model from the dropdown, and the system handles the routing.
- Security: Connections to Clusters are secured using HMAC-SHA256 authentication (Access Key & Secret Key).
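To make the authentication step concrete, here is a minimal sketch of HMAC-SHA256 request signing with an Access Key and Secret Key. The canonical string (method, path, date) and the header format shown are illustrative assumptions; the exact fields your cluster signs may differ, so consult your cluster's API documentation.

```python
import hashlib
import hmac
from datetime import datetime, timezone

def sign_request(secret_key: str, method: str, path: str, date: str) -> str:
    """Compute an HMAC-SHA256 signature over a canonical request string.

    The canonical string below is illustrative; the exact fields the
    cluster signs may differ.
    """
    canonical = f"{method}\n{path}\n{date}"
    digest = hmac.new(secret_key.encode(), canonical.encode(), hashlib.sha256)
    return digest.hexdigest()

# Hypothetical credentials, for illustration only.
access_key = "AKIA-EXAMPLE"
secret_key = "SECRET-EXAMPLE"

date = datetime.now(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
signature = sign_request(secret_key, "GET", "/v1/models", date)
headers = {
    # Assumed header layout; check your cluster's auth spec.
    "Authorization": f"signMethod=HMAC-SHA256, credential={access_key}:{signature}",
    "Date": date,
}
```

Because the signature is derived from the Secret Key rather than containing it, the key itself never travels over the wire.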
Getting Started¶
To expand your compute capabilities, you can:
- Manually Register a remote node or cluster.
- Use Auto-Discovery to find resources (services) available on connected nodes.
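The discovery step above can be sketched as follows. This example assumes each node exposes an OpenAI-compatible `/v1/models` endpoint (an assumption, not a documented guarantee); the `list_models` helper and the node names are hypothetical, and the merge step simply builds a unified catalog mapping each model to the nodes that serve it.

```python
import json
import urllib.request

def list_models(base_url: str) -> list[str]:
    """Fetch model IDs from a node, assuming an OpenAI-compatible
    /v1/models endpoint; adjust the path if your node differs."""
    with urllib.request.urlopen(f"{base_url}/v1/models") as resp:
        data = json.load(resp)
    return [m["id"] for m in data.get("data", [])]

def merge_model_catalogs(catalogs: dict[str, list[str]]) -> dict[str, list[str]]:
    """Build a unified catalog: model ID -> nodes that serve it."""
    merged: dict[str, list[str]] = {}
    for node, models in catalogs.items():
        for model in models:
            merged.setdefault(model, []).append(node)
    return merged

# Hypothetical example: both nodes serve an 8B model,
# but only the remote peer can host the 70B model.
catalogs = {
    "local": ["llama-3-8b"],
    "peer-gpu-box": ["llama-3-8b", "llama-3-70b"],
}
print(merge_model_catalogs(catalogs))
```

In practice the Continuum Router performs this aggregation for you; the sketch only illustrates why the same model can appear once in the dropdown even when several nodes serve it.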