9.8. Offline-Only Setup

Some environments—classified networks, industrial control rooms, healthcare facilities—have no internet access by design. Backend.AI GO can run entirely offline once you have the required models and engines on the machine.

This guide walks through the complete workflow: preparing model packages on a connected machine, transferring them to the air-gapped target, importing them, and configuring Backend.AI GO for fully offline operation.

Overview

graph LR
    A[Online Machine] -->|Export .baimodel| B[USB / Internal Network]
    B -->|Transfer| C[Air-Gapped Machine]
    C -->|Import .baimodel| D[Backend.AI GO<br/>Offline]

The workflow has three phases:

  1. Prepare — On an internet-connected machine, download models and engines, then export models as .baimodel packages.
  2. Transfer — Move the .baimodel files and engine binaries to the air-gapped machine via USB drive, internal file server, or any approved transfer method.
  3. Import & Configure — On the air-gapped machine, import the packages and disable online features.

Prerequisites

  • Backend.AI GO installed on both the online and offline machines
  • A transfer medium (USB drive, internal network share, etc.) with enough capacity for your model files (typically 2–20 GB per model)
  • Administrative access on both machines to install engines and manage files

Phase 1: Prepare on the Online Machine

Step 1: Download Models

  1. Open Backend.AI GO on the internet-connected machine.

  2. Go to the Search page (Hugging Face icon in the sidebar).

  3. Find and download the models you need. For offline use, consider:

    • General chat: Llama 3.2 3B Instruct, Qwen 2.5 7B Instruct
    • Translation: TranslateGemma 4B Instruct
    • Coding: Qwen 2.5 Coder 7B Instruct
    • Image generation: Stable Diffusion models (GGUF format)
  4. Wait for each download to complete. You can monitor progress in the floating Download Queue panel.

Choose the Right Size

Smaller models (1–4B parameters) require less RAM and disk space, making them practical for constrained offline environments. Larger models (7–14B) produce better results but need more resources.
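As a rough sanity check when sizing offline hardware, the weight file dominates memory use: parameters × bits per weight ÷ 8, plus runtime overhead for the KV cache and engine. The sketch below is an illustrative rule of thumb only, not an exact formula; real usage depends on the engine, quantization, and context length.

```python
def estimated_ram_gb(params_billion: float, bits_per_weight: int = 4,
                     overhead: float = 1.2) -> float:
    """Rough RAM estimate for a quantized model.

    Illustrative rule of thumb only: weights = params * bits / 8,
    scaled by an assumed ~20% overhead for KV cache and runtime.
    """
    weight_bytes = params_billion * 1e9 * bits_per_weight / 8
    return weight_bytes * overhead / 1e9

# A 7B model at 4-bit quantization lands very roughly in the 4-5 GB range.
print(f"{estimated_ram_gb(7):.1f} GB")
```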

Step 2: Install Engines

Engines are the inference backends that run models. You need to install them on the online machine first, then transfer them along with the model packages.

  1. Go to the Engines page in the sidebar.

  2. Download the engine variant that matches the offline machine's hardware:

    | Offline Machine Hardware | Engine Variant |
    |--------------------------|----------------|
    | NVIDIA GPU               | llama.cpp with CUDA |
    | AMD GPU                  | llama.cpp with ROCm/HIP |
    | Apple Silicon            | llama.cpp with Metal, MLX, or MLXcel |
    | CPU only                 | llama.cpp CPU |
  3. Note the engine version and variant for later reference.

Engine Compatibility

The engine installed on the offline machine must match its hardware exactly. An engine built for CUDA will not work on a machine without an NVIDIA GPU.

Step 3: Export Models as .baimodel Packages

The .baimodel format is a portable ZIP archive containing the model file, metadata, and integrity checksums. It enables model distribution without requiring internet access on the target machine.

  1. Go to the Models page (local models list).

  2. Find the model you want to export and click the three-dot menu (or right-click).

  3. Select Export Model.

  4. In the Export dialog:

    • Review the model name, format, and size.
    • If the model has a vision projector (mmproj), choose whether to include it.
    • Note the Total package size — you will need at least this much space on your transfer medium.
  5. Click Export and choose a save location (e.g., a USB drive or network share).

  6. Wait for the export to complete. The progress bar shows the hashing, packaging, and verification phases.

  7. Repeat for each model you want to transfer.

What's Inside a .baimodel Package

A .baimodel file is a ZIP archive (entries are stored uncompressed with ZIP's STORE method for speed) containing:

  • manifest.json — Package metadata: model name, format, checksums, creation info
  • metadata.json — Backend.AI GO model metadata (optional)
  • model/ — The model file(s) (GGUF, mmproj, etc.)

SHA256 checksums are calculated during export and verified on import to ensure file integrity.
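The layout above can also be inspected programmatically. The sketch below is a minimal verifier that assumes a hypothetical manifest schema (a "files" list with "path" and "sha256" fields); the actual manifest.json keys may differ, so treat this as illustrative rather than the application's real import logic.

```python
import hashlib
import json
import zipfile

def verify_baimodel(package_path: str) -> bool:
    """Check every file listed in manifest.json against its SHA256.

    NOTE: the manifest schema used here ({"files": [{"path", "sha256"}]})
    is an assumption for illustration; match it to the real manifest.
    """
    with zipfile.ZipFile(package_path) as zf:
        manifest = json.loads(zf.read("manifest.json"))
        for entry in manifest.get("files", []):
            digest = hashlib.sha256(zf.read(entry["path"])).hexdigest()
            if digest != entry["sha256"]:
                return False  # corrupted or tampered file
    return True
```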

Phase 2: Transfer to the Air-Gapped Machine

Gather the following files for transfer:

| Item | Location | Purpose |
|------|----------|---------|
| .baimodel packages | Exported in Step 3 | Model files with metadata |
| Engine packages | Installed in Step 2 | Inference backends matching the offline hardware |
| Backend.AI GO installer | Official download site | Application installer (if not already installed) |

Transfer Methods

  • USB drive — Copy files to a USB drive, scan for malware per your organization's policy, then connect to the air-gapped machine.
  • Internal file server — Upload to an approved internal network share accessible from the air-gapped network segment.
  • Optical media — Burn to DVD/Blu-ray for write-once, tamper-evident transfers.

Verify File Integrity

After transferring, verify that file sizes match the originals. The .baimodel import process includes SHA256 checksum verification, so corrupted files will be detected automatically during import.
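Size checks catch truncated copies but not bit-level corruption. For an end-to-end check before importing, you can compare SHA256 digests computed on both machines; a minimal sketch:

```python
import hashlib

def sha256_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream the file in 1 MiB chunks so multi-GB model packages
    never need to fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

# Run on both machines and compare the printed digests, e.g.:
# print(sha256_file("TranslateGemma-4B-Instruct.baimodel"))
```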

Phase 3: Import and Configure on the Offline Machine

Step 1: Install Backend.AI GO

If Backend.AI GO is not already installed on the air-gapped machine, install it using the transferred installer package. See the Installation guide for platform-specific instructions.

Step 2: Install Engines

Before importing models, ensure the correct inference engine is available on the offline machine. If the offline machine was set up while it still had internet access, the engines may already be installed. Check the Engines page to verify.

If engines are not installed, you will need to transfer and install them manually. The engine files are located in the Backend.AI GO application data directory on the online machine.

Step 3: Import .baimodel Packages

  1. Open Backend.AI GO on the air-gapped machine.

  2. Go to the Models page.

  3. Click the Import button (or use the menu).

  4. In the Import dialog, click Select File and navigate to your .baimodel file.

  5. The dialog validates the package and shows a preview:

    • Model name, format, and quantization
    • Publisher and repository information
    • File count and total size
    • Any validation warnings
  6. If the model already exists on this machine, toggle Replace if model already exists to overwrite it.

  7. Click Import to begin the extraction.

  8. The progress bar shows extracting and verifying phases. SHA256 checksums are verified automatically.

  9. Once complete, the model appears in your local models list, ready to be loaded.

  10. Repeat for each .baimodel package.

Drag and Drop

You can also drag a .baimodel file directly onto the Models page to start the import process.

Step 4: Configure for Offline Operation

With no internet connection, certain features that depend on external services should be disabled to avoid unnecessary connection attempts and error messages.

Disable Update Checks

  1. Go to Settings > Advanced.

  2. Set the Update channel so the application does not contact external update servers, or dismiss any update notifications that appear.

Since the machine has no internet access, update checks cannot succeed; disabling them avoids unnecessary connection attempts and timeout delays.

Avoid Online-Dependent Features

The following features require an internet connection and will not function on the air-gapped machine:

| Feature | Location | Behavior When Offline |
|---------|----------|-----------------------|
| Hugging Face Search | Search page | Search results will not load |
| Model Downloads | Search page | Downloads will fail |
| Engine Registry | Engines page > Available Engines | Available engines list will not load |
| Cloud Providers | Cloud Integration | Cloud model requests will fail |
| App Updates | Automatic | Update checks will time out |

These features degrade gracefully — the application continues to work normally with locally available models and engines.

Use Local Models Only

  1. Load a model from the Models page by clicking its Load button.

  2. Once loaded, all core features work offline:

    • Chat: Full conversation capabilities with the loaded model
    • Translation: Text and file translation (with a translation-capable model)
    • Image Generation: Create images (with a Stable Diffusion model)
    • API Server: OpenAI-compatible API accessible from local applications
    • Cowork: Tool calling and agent workflows (with local tools only)
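Because the API Server is OpenAI-compatible, any local application can talk to it over localhost with standard-library tools. The sketch below is illustrative: the port (8000) and model name are placeholder assumptions, so check the API Server page in Backend.AI GO for the actual values on your machine.

```python
import json
import urllib.request

# Placeholder values -- confirm the real port and model name on the
# API Server page in Backend.AI GO.
BASE_URL = "http://localhost:8000/v1"

def build_chat_request(prompt: str,
                       model: str = "llama-3.2-3b-instruct") -> dict:
    """Assemble an OpenAI-style chat completion payload."""
    return {"model": model,
            "messages": [{"role": "user", "content": prompt}]}

def chat(prompt: str) -> str:
    """Send the request to the local server; no data leaves the machine."""
    req = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(build_chat_request(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.loads(resp.read())
    return body["choices"][0]["message"]["content"]
```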

Example: Setting Up a Secure Translation Workstation

Here is a complete example of setting up an air-gapped translation workstation:

On the online machine:

  1. Download TranslateGemma 4B Instruct from the Search page.
  2. Download the llama.cpp engine matching the offline machine's hardware.
  3. Export the model as TranslateGemma-4B-Instruct.baimodel.
  4. Copy the .baimodel file to a USB drive.

On the air-gapped machine:

  1. Install Backend.AI GO (if needed).
  2. Verify the llama.cpp engine is installed in the Engines page.
  3. Import TranslateGemma-4B-Instruct.baimodel from the USB drive.
  4. Load the model from the Models page.
  5. Open the Translation page and begin translating documents.

All translation happens locally — no data leaves the machine.

Keeping Offline Machines Updated

To update models or the application on the air-gapped machine:

  1. New models: Export new .baimodel packages from the online machine and repeat the import process.
  2. Application updates: Download the new installer on the online machine, transfer it, and install on the air-gapped machine.
  3. Engine updates: Download updated engine packages on the online machine, transfer, and install.

Version Tracking

Keep a record of which model versions and engine versions are installed on each air-gapped machine. The .baimodel manifest includes version information that you can reference.

Troubleshooting

| Problem | Solution |
|---------|----------|
| Import fails with "validation failed" | The .baimodel file may have been corrupted during transfer. Re-export from the online machine and transfer again. |
| Import fails with "checksum verification failed" | File integrity was compromised during transfer. Verify the file size matches the original and re-transfer. |
| Model loads but inference is slow | Check that the correct engine variant is installed for the machine's hardware (e.g., CUDA for NVIDIA GPUs). CPU-only inference is significantly slower. |
| "Model already exists" error | Enable the Replace if model already exists toggle in the Import dialog. |
| Engine not available | The engine must be installed before importing models. Check the Engines page and install the appropriate engine for your hardware. |
| Features show loading or errors | Online-dependent features (Search, Cloud Integration, Available Engines) require internet access and will not function on an air-gapped machine. Use only local models and installed engines. |