9.8. Offline-Only Setup

Some environments—classified networks, industrial control rooms, healthcare facilities—have no internet access by design. Backend.AI GO can run entirely offline once you have the required models and engines on the machine.

This guide walks through the complete workflow: preparing model packages on a connected machine, transferring them to the air-gapped target, importing them, and configuring Backend.AI GO for fully offline operation.

Overview

graph LR
    A[Online Machine] -->|Export .baimodel| B[USB / Internal Network]
    B -->|Transfer| C[Air-Gapped Machine]
    C -->|Import .baimodel| D[Backend.AI GO<br/>Offline]

The workflow has three phases:

  1. Prepare — On an internet-connected machine, download models and engines, then export models as .baimodel packages.
  2. Transfer — Move the .baimodel files and engine binaries to the air-gapped machine via USB drive, internal file server, or any approved transfer method.
  3. Import & Configure — On the air-gapped machine, import the packages and disable online features.

Prerequisites

  • Backend.AI GO installed on both the online and offline machines
  • A transfer medium (USB drive, internal network share, etc.) with enough capacity for your model files (typically 2–20 GB per model)
  • Administrative access on both machines to install engines and manage files

Phase 1: Prepare on the Online Machine

Step 1: Download Models

  1. Open Backend.AI GO on the internet-connected machine.

  2. Go to the Search page (Hugging Face icon in the sidebar).

  3. Find and download the models you need. For offline use, consider:

    • General chat: Llama 3.2 3B Instruct, Qwen 2.5 7B Instruct
    • Translation: TranslateGemma 4B Instruct
    • Coding: Qwen 2.5 Coder 7B Instruct
    • Image generation: Stable Diffusion models (GGUF format)
  4. Wait for each download to complete. You can monitor progress in the floating Download Queue panel.

Choose the Right Size

Smaller models (1–4B parameters) require less RAM and disk space, making them practical for constrained offline environments. Larger models (7–14B) produce better results but need more resources.
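As a rough sanity check when sizing offline hardware, the weight file dominates memory use: parameters × bits per weight ÷ 8, plus runtime overhead for the KV cache and engine. The sketch below is an illustrative rule of thumb only, not an exact formula; real usage depends on the engine, quantization, and context length.

```python
def estimated_ram_gb(params_billion: float, bits_per_weight: int = 4,
                     overhead: float = 1.2) -> float:
    """Rough RAM estimate for a quantized model.

    Illustrative rule of thumb only: weights = params * bits / 8,
    scaled by an assumed ~20% overhead for KV cache and runtime.
    """
    weight_bytes = params_billion * 1e9 * bits_per_weight / 8
    return weight_bytes * overhead / 1e9

# A 7B model at 4-bit quantization lands very roughly in the 4-5 GB range.
print(f"{estimated_ram_gb(7):.1f} GB")
```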

Step 2: Install Engines

Engines are the inference backends that run models. You need to install them on the online machine first, then transfer them along with the model packages.

  1. Go to the Engines page in the sidebar.

  2. Download the engine variant that matches the offline machine's hardware:

    | Offline Machine Hardware | Engine Variant |
    |--------------------------|----------------|
    | NVIDIA GPU               | llama.cpp with CUDA |
    | AMD GPU                  | llama.cpp with ROCm/HIP |
    | Apple Silicon            | llama.cpp with Metal, MLX, or MLXcel |
    | CPU only                 | llama.cpp CPU |
  3. Note the engine version and variant for later reference.

Engine Compatibility

The engine installed on the offline machine must match its hardware exactly. An engine built for CUDA will not work on a machine without an NVIDIA GPU.

Step 3: Export Models as .baimodel Packages

The .baimodel format is a portable ZIP archive containing the model file, metadata, and integrity checksums. It enables model distribution without requiring internet access on the target machine.

  1. Go to the Models page (local models list).

  2. Find the model you want to export and click the three-dot menu (or right-click).

  3. Select Export Model.

  4. In the Export dialog:

    • Review the model name, format, and size.
    • If the model has a vision projector (mmproj), choose whether to include it.
    • Note the Total package size — you will need at least this much space on your transfer medium.
  5. Click Export and choose a save location (e.g., a USB drive or network share).

  6. Wait for the export to complete. The progress bar shows the hashing, packaging, and verification phases.

  7. Repeat for each model you want to transfer.

What's Inside a .baimodel Package

A .baimodel file is a ZIP archive (entries are stored uncompressed with ZIP's STORE method for speed) containing:

  • manifest.json — Package metadata: model name, format, checksums, creation info
  • metadata.json — Backend.AI GO model metadata (optional)
  • model/ — The model file(s) (GGUF, mmproj, etc.)

SHA256 checksums are calculated during export and verified on import to ensure file integrity.
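The layout above can also be inspected programmatically. The sketch below is a minimal verifier that assumes a hypothetical manifest schema (a "files" list with "path" and "sha256" fields); the actual manifest.json keys may differ, so treat this as illustrative rather than the application's real import logic.

```python
import hashlib
import json
import zipfile

def verify_baimodel(package_path: str) -> bool:
    """Check every file listed in manifest.json against its SHA256.

    NOTE: the manifest schema used here ({"files": [{"path", "sha256"}]})
    is an assumption for illustration; match it to the real manifest.
    """
    with zipfile.ZipFile(package_path) as zf:
        manifest = json.loads(zf.read("manifest.json"))
        for entry in manifest.get("files", []):
            digest = hashlib.sha256(zf.read(entry["path"])).hexdigest()
            if digest != entry["sha256"]:
                return False  # corrupted or tampered file
    return True
```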

Phase 2: Transfer to the Air-Gapped Machine

Gather the following files for transfer:

| Item | Location | Purpose |
|------|----------|---------|
| .baimodel packages | Exported in Step 3 | Model files with metadata |
| Engine packages | Installed in Step 2 | Inference backends matching the offline hardware |
| Backend.AI GO installer | Official download site | Application installer (if not already installed) |

Transfer Methods

  • USB drive — Copy files to a USB drive, scan for malware per your organization's policy, then connect to the air-gapped machine.
  • Internal file server — Upload to an approved internal network share accessible from the air-gapped network segment.
  • Optical media — Burn to DVD/Blu-ray for write-once, tamper-evident transfers.

Verify File Integrity

After transferring, verify that file sizes match the originals. The .baimodel import process includes SHA256 checksum verification, so corrupted files will be detected automatically during import.
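Size checks catch truncated copies but not bit-level corruption. For an end-to-end check before importing, you can compare SHA256 digests computed on both machines; a minimal sketch:

```python
import hashlib

def sha256_file(path: str, chunk_size: int = 1 << 20) -> str:
    """Stream the file in 1 MiB chunks so multi-GB model packages
    never need to fit in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

# Run on both machines and compare the printed digests, e.g.:
# print(sha256_file("TranslateGemma-4B-Instruct.baimodel"))
```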

Phase 3: Import and Configure on the Offline Machine

Step 1: Install Backend.AI GO

If Backend.AI GO is not already installed on the air-gapped machine, install it using the transferred installer package. See the Installation guide for platform-specific instructions.

Step 2: Install Engines

Before importing models, ensure the correct inference engine is available on the offline machine. If the offline machine was set up while it still had internet access, the engines may already be installed. Check the Engines page to verify.

If engines are not installed, you will need to transfer and install them manually. The engine files are located in the Backend.AI GO application data directory on the online machine.

Step 3: Import .baimodel Packages

  1. Open Backend.AI GO on the air-gapped machine.

  2. Go to the Models page.

  3. Click the Import button (or use the menu).

  4. In the Import dialog, click Select File and navigate to your .baimodel file.

  5. The dialog validates the package and shows a preview:

    • Model name, format, and quantization
    • Publisher and repository information
    • File count and total size
    • Any validation warnings
  6. If the model already exists on this machine, toggle Replace if model already exists to overwrite it.

  7. Click Import to begin the extraction.

  8. The progress bar shows extracting and verifying phases. SHA256 checksums are verified automatically.

  9. Once complete, the model appears in your local models list, ready to be loaded.

  10. Repeat for each .baimodel package.

Drag and Drop

You can also drag a .baimodel file directly onto the Models page to start the import process.

Step 4: Configure for Offline Operation

With no internet connection, certain features that depend on external services should be disabled to avoid unnecessary connection attempts and error messages.

Disable Update Checks

  1. Go to Settings > Advanced.

  2. Set the Update channel so the application does not contact external update servers, or dismiss any update notifications that appear.

Since the machine has no internet access, update checks cannot succeed; disabling them avoids unnecessary connection attempts and timeout delays.

Avoid Online-Dependent Features

The following features require an internet connection and will not function on the air-gapped machine:

| Feature | Location | Behavior When Offline |
|---------|----------|-----------------------|
| Hugging Face Search | Search page | Search results will not load |
| Model Downloads | Search page | Downloads will fail |
| Engine Registry | Engines page > Available Engines | Available engines list will not load |
| Cloud Providers | Cloud Integration | Cloud model requests will fail |
| App Updates | Automatic | Update checks will time out |

These features degrade gracefully — the application continues to work normally with locally available models and engines.

Use Local Models Only

  1. Load a model from the Models page by clicking its Load button.

  2. Once loaded, all core features work offline:

    • Chat: Full conversation capabilities with the loaded model
    • Translation: Text and file translation (with a translation-capable model)
    • Image Generation: Create images (with a Stable Diffusion model)
    • API Server: OpenAI-compatible API accessible from local applications
    • Cowork: Tool calling and agent workflows (with local tools only)
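Because the API Server is OpenAI-compatible, any local application can talk to it over localhost with standard-library tools. The sketch below is illustrative: the port (8000) and model name are placeholder assumptions, so check the API Server page in Backend.AI GO for the actual values on your machine.

```python
import json
import urllib.request

# Placeholder values -- confirm the real port and model name on the
# API Server page in Backend.AI GO.
BASE_URL = "http://localhost:8000/v1"

def build_chat_request(prompt: str,
                       model: str = "llama-3.2-3b-instruct") -> dict:
    """Assemble an OpenAI-style chat completion payload."""
    return {"model": model,
            "messages": [{"role": "user", "content": prompt}]}

def chat(prompt: str) -> str:
    """Send the request to the local server; no data leaves the machine."""
    req = urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(build_chat_request(prompt)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.loads(resp.read())
    return body["choices"][0]["message"]["content"]
```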

Example: Setting Up a Secure Translation Workstation

Here is a complete example of setting up an air-gapped translation workstation:

On the online machine:

  1. Download TranslateGemma 4B Instruct from the Search page.
  2. Download the llama.cpp engine matching the offline machine's hardware.
  3. Export the model as TranslateGemma-4B-Instruct.baimodel.
  4. Copy the .baimodel file to a USB drive.

On the air-gapped machine:

  1. Install Backend.AI GO (if needed).
  2. Verify the llama.cpp engine is installed in the Engines page.
  3. Import TranslateGemma-4B-Instruct.baimodel from the USB drive.
  4. Load the model from the Models page.
  5. Open the Translation page and begin translating documents.

All translation happens locally — no data leaves the machine.

Keeping Offline Machines Updated

To update models or the application on the air-gapped machine:

  1. New models: Export new .baimodel packages from the online machine and repeat the import process.
  2. Application updates: Download the new installer on the online machine, transfer it, and install on the air-gapped machine.
  3. Engine updates: Download updated engine packages on the online machine, transfer, and install.

Version Tracking

Keep a record of which model versions and engine versions are installed on each air-gapped machine. The .baimodel manifest includes version information that you can reference.

Troubleshooting

| Problem | Solution |
|---------|----------|
| Import fails with "validation failed" | The .baimodel file may have been corrupted during transfer. Re-export from the online machine and transfer again. |
| Import fails with "checksum verification failed" | File integrity was compromised during transfer. Verify the file size matches the original and re-transfer. |
| Model loads but inference is slow | Check that the correct engine variant is installed for the machine's hardware (e.g., CUDA for NVIDIA GPUs). CPU-only inference is significantly slower. |
| "Model already exists" error | Enable the Replace if model already exists toggle in the Import dialog. |
| Engine not available | The engine must be installed before importing models. Check the Engines page and install the appropriate engine for your hardware. |
| Features show loading or errors | Online-dependent features (Search, Cloud Integration, Available Engines) require internet access and will not function on an air-gapped machine. Use only local models and installed engines. |