7.3. Auto-Discovery

Backend.AI GO uses mDNS (Multicast DNS), the protocol behind Apple's Bonjour on macOS, to automatically announce and discover instances on your local network. When mDNS advertising is enabled, peers can see your node and initiate a connection without any manual IP configuration.

How mDNS Discovery Works

mDNS is a zero-configuration protocol that operates over UDP port 5353 using multicast addresses (224.0.0.251 for IPv4, ff02::fb for IPv6). Nodes send announcements to the multicast group, and all other nodes on the same subnet receive them without a central DNS server.

Backend.AI GO registers under the service type _bago._tcp.local. (the trailing dot is part of the fully qualified name). Every instance on the network that is browsing for this type will see the announcement.
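To make the wire format concrete, here is a minimal sketch of the browse query a discoverer would multicast for this service type. It hand-builds a standard DNS question for PTR records using only the Python standard library. The helper names (encode_dns_name, build_ptr_query) are illustrative and not part of Backend.AI GO, which may use an mDNS library or the operating system daemon instead.

```python
import struct

MDNS_GROUP_V4 = "224.0.0.251"   # IPv4 multicast group used by mDNS
MDNS_PORT = 5353                # UDP port used by mDNS
SERVICE_TYPE = "_bago._tcp.local."


def encode_dns_name(name: str) -> bytes:
    """Encode a dotted name as DNS labels: length byte + bytes, ending in 0x00."""
    out = b""
    for label in name.rstrip(".").split("."):
        raw = label.encode("ascii")
        if not 0 < len(raw) <= 63:
            raise ValueError(f"invalid DNS label: {label!r}")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_ptr_query(service_type: str = SERVICE_TYPE) -> bytes:
    """Build a one-question DNS query asking for PTR records of service_type."""
    # Header: ID=0, flags=0 (standard query), QDCOUNT=1, AN/NS/AR counts=0.
    header = struct.pack("!6H", 0, 0, 1, 0, 0, 0)
    # Question: encoded name, QTYPE=12 (PTR), QCLASS=1 (IN).
    question = encode_dns_name(service_type) + struct.pack("!2H", 12, 1)
    return header + question
```

Sending the result of build_ptr_query() in a UDP datagram to (MDNS_GROUP_V4, MDNS_PORT) asks every responder on the link to answer with its matching service instances. Whether the datagram actually leaves the machine depends on local firewall and interface settings.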

Node A (advertiser)                  Node B (discoverer)
        |                                    |
        |--- mDNS announcement ----------->  |  (UDP 5353 multicast)
        |    _bago._tcp.local.               |
        |    TXT: name, version,             |
        |         fingerprint, port,         |
        |         engines, models,           |
        |         accelerator                |
        |                                    |
        |                    [Node B sees A] |
        |                                    |
        |--- Re-announcement every ~20-30s ->| (daemon keeps it alive)
        |                                    |
        |--- Graceful departure -----------> | (ServiceRemoved event)
        |    (on shutdown)                   |

Each instance also runs a discovery listener, so the same node can advertise and discover peers simultaneously.

TXT Record Format

When advertising, Backend.AI GO includes the following data in mDNS TXT records:

| Field | Required | Description | Example |
|---|---|---|---|
| name | Yes | Human-readable node name | My GO Node |
| fingerprint | Yes | Unique node identifier | fp_a1b2c3... |
| port | Yes | Management API port | 11434 |
| version | Yes | Application version | 0.9.0 |
| engines | No | Comma-separated installed engines | llama-cpp,mlx |
| models | No | Count of locally available models | 5 |
| accelerator | No | Primary hardware accelerator | metal, cuda, rocm, cpu |

The engines and models fields are omitted when no engines are installed or when the engine manager is unavailable. The accelerator field is omitted when no accelerator is detected.

Capability data updates automatically when:

  • An engine is installed or removed (triggered by the engines-changed event).
  • The model library changes (triggered by the models-updated event).
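The field rules above can be sketched as a small builder that leaves out optional keys, plus an encoder for the standard TXT wire format (each key=value string prefixed by a one-byte length). The function names and parameters are hypothetical. The sketch also assumes that an unavailable engine manager is represented by engines=None, which the documentation does not specify.

```python
from typing import Dict, Optional, Sequence


def build_txt_properties(
    name: str,
    fingerprint: str,
    port: int,
    version: str,
    engines: Optional[Sequence[str]] = None,   # None: engine manager unavailable (assumption)
    model_count: Optional[int] = None,
    accelerator: Optional[str] = None,
) -> Dict[str, str]:
    """Assemble TXT key/value pairs, omitting optional fields that do not apply."""
    props = {
        "name": name,
        "fingerprint": fingerprint,
        "port": str(port),
        "version": version,
    }
    # engines and models are both omitted when no engines are installed.
    if engines:
        props["engines"] = ",".join(engines)
        if model_count is not None:
            props["models"] = str(model_count)
    # accelerator is omitted when none is detected.
    if accelerator:
        props["accelerator"] = accelerator
    return props


def encode_txt_rdata(props: Dict[str, str]) -> bytes:
    """Encode pairs as DNS TXT RDATA: one length-prefixed 'key=value' per entry."""
    out = b""
    for key, value in props.items():
        entry = f"{key}={value}".encode("utf-8")
        if len(entry) > 255:
            raise ValueError(f"TXT entry for {key!r} exceeds 255 bytes")
        out += bytes([len(entry)]) + entry
    return out
```

Because each TXT entry is limited to 255 bytes, a long engines list is the field most likely to hit the limit, which is one reason to advertise a model count rather than model names.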

Setting Up mDNS Advertising

Enable Advertising on This Node

  1. Open Settings in Backend.AI GO.
  2. Navigate to the Nodes tab.
  3. Under Node Sharing, find the Node Information section and set a Node Name (required before advertising can be enabled).
  4. Under Network Advertising, toggle Advertise on local network (Bonjour/mDNS) to ON.

Once enabled, this instance announces itself every 20–30 seconds. Other Backend.AI GO instances on the same subnet will see it appear in their discovery list.

Discover Nodes on the Network

  1. Open Settings > Nodes > Connected Nodes.
  2. Click Add Node.
  3. The Add Node dialog shows a live list of discovered nodes.
  4. Select a node and click Connect, then complete the connection using the connection key or QR code from the target node.

Discovered nodes display:

  • Node name and IP address
  • Software version
  • Installed inference engines (e.g., llama-cpp, mlx)
  • Number of locally available models
  • Hardware accelerator type (e.g., metal, cuda, rocm, cpu)

Self-Node Filtering

Each node has a unique fingerprint generated at first launch. The discovery service uses this fingerprint to filter out its own mDNS announcement. Your own node never appears in the discovered-nodes list even if it is advertising.
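As a rough illustration of this filtering step, the sketch below drops any announcement whose fingerprint matches the local node. The Announcement type and exclude_self function are hypothetical names chosen for the example, not the application's actual data model.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass(frozen=True)
class Announcement:
    """Minimal stand-in for a parsed mDNS announcement (illustrative only)."""
    name: str
    fingerprint: str


def exclude_self(
    announcements: Iterable[Announcement], own_fingerprint: str
) -> List[Announcement]:
    """Return only announcements that did not originate from this node."""
    return [a for a in announcements if a.fingerprint != own_fingerprint]
```

Matching on the fingerprint rather than the IP address keeps the filter correct on machines with several network interfaces, where the same node can be seen under more than one address.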

Stale Node Detection

If a node shuts down abruptly without sending a departure announcement, the discovery service marks it stale and removes it automatically.

  • Staleness threshold: A node is considered stale if it has not been re-announced within 90 seconds. mDNS daemons typically re-announce every 20–30 seconds, so 90 seconds allows at least three consecutive missed announcements before a node is treated as offline.
  • Periodic cleanup: Every 30 seconds, the discovery service removes stale entries from the node registry.
  • Read-time filtering: get_discovered_nodes() also filters stale nodes on every call, providing a safety net between scheduled cleanups.

When a node comes back online it re-announces and immediately reappears in the peer list.
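The staleness rules above can be sketched as a small registry that records when each fingerprint was last announced. Only the 90-second threshold, the 30-second cleanup interval, and the get_discovered_nodes() name come from this page. The NodeRegistry class, its other methods, and the injectable clock are assumptions made for the example.

```python
import time
from typing import Callable, Dict, List

STALE_AFTER_SECONDS = 90.0        # threshold stated in this section
CLEANUP_INTERVAL_SECONDS = 30.0   # how often a scheduler would call cleanup()


class NodeRegistry:
    """Tracks the last announcement time per node fingerprint (illustrative)."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._last_seen: Dict[str, float] = {}

    def record_announcement(self, fingerprint: str) -> None:
        """Insert or refresh a node whenever an announcement arrives."""
        self._last_seen[fingerprint] = self._clock()

    def remove(self, fingerprint: str) -> None:
        """Handle a graceful departure (for example a ServiceRemoved event)."""
        self._last_seen.pop(fingerprint, None)

    def _is_stale(self, seen_at: float) -> bool:
        return self._clock() - seen_at > STALE_AFTER_SECONDS

    def cleanup(self) -> List[str]:
        """Periodic cleanup: delete stale entries and return their fingerprints."""
        stale = [fp for fp, t in self._last_seen.items() if self._is_stale(t)]
        for fp in stale:
            del self._last_seen[fp]
        return stale

    def get_discovered_nodes(self) -> List[str]:
        """Read-time filtering: never return a stale node, even before cleanup."""
        return [fp for fp, t in self._last_seen.items() if not self._is_stale(t)]
```

Using a monotonic clock rather than wall-clock time avoids marking every node stale, or keeping dead nodes alive, when the system time is adjusted.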

Troubleshooting Discovery

Nodes Not Appearing

  • Same subnet required: mDNS multicast is link-local and does not cross router boundaries. Both nodes must be on the same Layer 2 network segment. Nodes on separate VLANs or routed subnets cannot discover each other.
  • Firewall rules: Ensure UDP port 5353 is open for multicast traffic on all participating machines. Many operating system firewalls block multicast by default on non-trusted network profiles.
  • mDNS advertising not enabled: Confirm the Advertise on local network toggle is ON in Settings > Nodes > Node Sharing for the node you expect to discover.
  • Node name not set: The advertising toggle is disabled until a node name is configured. Check that a name is set before enabling advertising.
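For the first check in this list, a quick way to rule out a routed-subnet problem is to compare the two nodes' addresses under their prefix length. This is a minimal sketch using Python's standard ipaddress module. The same_subnet helper is a hypothetical name, and a match is only a hint: two hosts can share an IP subnet and still be separated by VLAN or wireless client isolation.

```python
import ipaddress


def same_subnet(ip_a: str, ip_b: str, prefix_len: int) -> bool:
    """Return True if both addresses fall in the same network for prefix_len."""
    net_a = ipaddress.ip_interface(f"{ip_a}/{prefix_len}").network
    net_b = ipaddress.ip_interface(f"{ip_b}/{prefix_len}").network
    return net_a == net_b
```

For example, same_subnet("192.168.1.10", "192.168.2.10", 24) is False, which points to a router between the two nodes and explains why discovery does not work.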

Nodes Disappearing Unexpectedly

A node vanishes from the peer list if its mDNS announcements are not received within 90 seconds. This can happen when:

  • A firewall blocks outgoing multicast packets on the advertising node.
  • The node's network interface changes (e.g., switching from Wi-Fi to Ethernet).
  • The system enters sleep and the mDNS daemon pauses.

Platform-Specific Notes

| Platform | mDNS Implementation | Notes |
|---|---|---|
| macOS | Bonjour (built-in) | Fully supported. No additional software required. |
| Linux | Avahi (common) or systemd-resolved | Install avahi-daemon if mDNS is not working. Ensure port 5353 is allowed in ufw or firewalld. |
| Windows | Windows mDNS (built-in since Windows 10) | Supported. Ensure the Network Discovery feature is enabled in Network & Sharing Center. |

Connection Still Requires Approval

mDNS discovery only announces that a node exists. Connecting to a discovered node still requires a connection key or QR code from the target node (generated in Settings > Nodes > Node Sharing > Generate Connection Key). Discovery lowers the barrier by eliminating manual IP lookup, but does not bypass the authorization step.

Remote Model Discovery After Connection

Once two nodes are connected, Backend.AI GO performs a catalog exchange:

  1. Handshake: On connection, the remote node sends its list of active backends (loaded models and services).
  2. Dynamic updates: When a model is loaded or unloaded on the remote node, your local instance is notified on the next refresh.
  3. Model selector: Remote models appear in the chat interface's model selector, grouped by node name.

Local Model Discovery

On the local machine, Backend.AI GO also monitors the file system and active processes:

  • Folder monitoring: When a model is downloaded to the models/ directory, Backend.AI GO detects the file type (GGUF, safetensors) and determines the appropriate runner (llama.cpp, MLX).
  • Process binding: When a model loads, it registers with the local router, which exposes the capability to the frontend UI.
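The folder-monitoring step can be pictured as a lookup from file type to runner. The sketch below is an assumption on two counts: it pairs GGUF with llama.cpp and safetensors with MLX based on the order in which they are listed above, and it detects the type from the file extension, whereas the application may inspect file contents instead. The names RUNNER_BY_SUFFIX and pick_runner are illustrative.

```python
from pathlib import Path
from typing import Optional

# Assumed pairing of model file type to runner (not confirmed by this page).
RUNNER_BY_SUFFIX = {
    ".gguf": "llama.cpp",
    ".safetensors": "MLX",
}


def pick_runner(model_path: str) -> Optional[str]:
    """Return the runner name for a model file, or None if the type is unknown."""
    suffix = Path(model_path).suffix.lower()
    return RUNNER_BY_SUFFIX.get(suffix)
```

Returning None for unknown file types lets the caller ignore unrelated files in the models/ directory, such as README or configuration files, instead of raising an error.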