2.2. Downloading Models¶

Backend.AI GO allows you to explore and download the latest open-source models directly from Hugging Face, the world's largest AI model repository.

What is Hugging Face?¶

Hugging Face is like the "GitHub of AI." It's a hub where researchers and developers share their trained models. Backend.AI GO integrates directly with Hugging Face, allowing you to find the models you need without leaving the app.

Figure: Hugging Face model browser

Searching for Models¶

Open the Search (Hugging Face icon) tab in the sidebar.
Use the search bar at the top to enter a specific model or organization name (e.g., Meta-Llama, Qwen).
Format Filtering: Use tags to find compatible models. Look for:
- GGUF: The model file format used by llama.cpp; runs on a wide range of hardware (CPU and GPU).
- MLX: Native format for Apple Silicon Macs.

Filtering Models¶

Backend.AI GO provides powerful filtering options to help you find the right model quickly.

Capability Filters¶

The filter bar at the top shows model counts by capability:

All: Shows all available models with total count.
Text: Text generation and processing models.
Vision: Models that can process images and visual input.
Image: Image generation models (DALL-E style).
Code: Code generation and understanding models.
Embedding: Text embedding and vector generation models.
TTS/STT: Text-to-speech and speech-to-text models.

You can select multiple capabilities to narrow down your search. The count next to each filter shows how many models match that capability.

Advanced Filters¶

Click Advanced Filters to access additional filtering options:

Size Ranges: Filter models by file size:
- < 4GB: Small models suitable for devices with limited memory.
- 4-8GB: Medium-sized models with good performance.
- > 8GB: Large models for high-end hardware.
Quantization Types: Filter by quantization format:
- Q4_K_M, Q4_K_S: 4-bit quantization (smallest, fastest).
- Q5_K_M, Q5_K_S: 5-bit quantization (balanced).
- Q6_K, Q8_0: Higher precision (better quality).
- F16, F32: Half/full precision (highest quality, largest size).
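
To make the size ranges and quantization tags above concrete, here is a minimal Python sketch of how a list of model files could be sorted into the same buckets. It is an illustration only, not Backend.AI GO's implementation: the names `size_bucket`, `quant_tag`, and `matches_filters` are hypothetical, the boundaries are assumed to be decimal gigabytes with 4 GB and 8 GB falling in the middle bucket, and the tag is assumed to appear in the file name, as it commonly does for GGUF files on Hugging Face.

```python
import re
from typing import Optional, Set

GB = 1000 ** 3  # assumption: decimal gigabytes; the app may count in GiB instead

# Common llama.cpp-style quantization tags as they appear in file names.
QUANT_PATTERN = re.compile(
    r"(Q\d_K_[SML]|Q\d_K|Q\d_[01]|BF16|F16|F32)", re.IGNORECASE
)


def size_bucket(size_bytes: int) -> str:
    """Map a file size to one of the three size ranges (boundaries assumed)."""
    if size_bytes < 4 * GB:
        return "< 4GB"
    if size_bytes <= 8 * GB:
        return "4-8GB"
    return "> 8GB"


def quant_tag(filename: str) -> Optional[str]:
    """Return the quantization tag found in a file name, or None."""
    match = QUANT_PATTERN.search(filename)
    return match.group(1).upper() if match else None


def matches_filters(filename: str, size_bytes: int,
                    buckets: Set[str], quants: Set[str]) -> bool:
    """True when a file falls in a selected size range and quantization type."""
    return size_bucket(size_bytes) in buckets and quant_tag(filename) in quants
```

For example, `matches_filters("Qwen2.5-7B-Instruct-Q4_K_M.gguf", 5 * GB, {"4-8GB"}, {"Q4_K_M", "Q5_K_M"})` evaluates to `True` under these assumptions.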

Filter Presets¶

Save your frequently used filter combinations as presets:

Configure your desired filters (capabilities, size ranges, quantization types).
Click Save as preset... in the Advanced Filters panel.
Enter a name for your preset and click Save.
Your saved presets appear in the Presets section for quick access.

To delete a preset, click the trash icon next to it.

Choosing the Right Variant (Quantization)¶

Models often come in multiple "Quantization" levels (e.g., Q4_K_M, Q8_0).

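What is Quantization?: A compression technique that stores model weights at lower numeric precision, which reduces file size and memory use and usually speeds up inference, at a small cost in output quality.
Recommendation: For most users, Q4_K_M or Q5_K_M offers a good balance between speed, size, and output quality.
RAM Requirements: Ensure the file size of the chosen variant is smaller than your computer's available RAM (or GPU VRAM), and leave some headroom, because the runtime also needs memory for the context beyond the model file itself.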
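
As a rough worked example of the sizing rule above, the sketch below estimates a variant's file size from its parameter count and checks it against available memory. The function names, the 4.5 bits-per-weight figure, and the 20% headroom are illustrative assumptions, not values taken from Backend.AI GO; real quantized files carry extra scale data and metadata, so treat the result as an approximation.

```python
def estimate_size_bytes(num_params: int, bits_per_weight: float) -> int:
    """Approximate file size: parameters x bits per weight / 8 bits per byte."""
    return int(num_params * bits_per_weight / 8)


def fits_in_memory(file_size_bytes: int, available_bytes: int,
                   headroom_percent: int = 20) -> bool:
    """Check that the model plus an assumed headroom fits in RAM or VRAM.

    The 20% default is a hypothetical allowance for context and runtime
    buffers; the actual overhead depends on context length and backend.
    """
    needed = file_size_bytes + file_size_bytes * headroom_percent // 100
    return needed <= available_bytes


# A 7-billion-parameter model at an assumed 4.5 bits per weight:
size = estimate_size_bytes(7_000_000_000, 4.5)  # 3_937_500_000 bytes (~3.9 GB)
print(fits_in_memory(size, 8 * 10**9))   # True: fits in 8 GB with headroom
print(fits_in_memory(size, 4 * 10**9))   # False: too tight for 4 GB
```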

Managing Downloads¶

Queue: You can request multiple downloads at once. They are added to a queue and downloaded one at a time.
Progress: Check the Downloads tab to see real-time progress, speed, and estimated time remaining.
Location: By default, models are saved in the application data directory. You can change this path in Settings.

Disk Space Management¶

Backend.AI GO includes disk space management features that help you avoid running out of storage for your models.

Pre-Download Space Validation¶

Before starting any download, the application automatically checks if you have sufficient disk space:

Automatic Check: The system validates available space against the required size (plus a 10% safety margin) before initiating a download.
Clear Warnings: If there is insufficient space, you will receive a clear notification showing the required space, available space, and the shortfall.
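
The check described above can be sketched in a few lines of Python. This is a minimal illustration of the "required size plus 10% margin" rule, not the application's actual code: `check_space` and `has_space_for` are hypothetical names, and only `shutil.disk_usage` is a real standard-library call.

```python
import shutil
from typing import Tuple


def check_space(required_bytes: int, free_bytes: int,
                margin_percent: int = 10) -> Tuple[bool, int]:
    """Return (ok, shortfall_bytes) for a download of the given size.

    The margin is rounded up so the check never under-reserves.
    """
    margin = (required_bytes * margin_percent + 99) // 100
    needed = required_bytes + margin
    shortfall = max(0, needed - free_bytes)
    return shortfall == 0, shortfall


def has_space_for(models_dir: str, required_bytes: int) -> bool:
    """Check the volume that holds models_dir (the path must exist)."""
    free = shutil.disk_usage(models_dir).free
    ok, _ = check_space(required_bytes, free)
    return ok


# A 1,000-byte download needs 1,100 bytes free with the 10% margin.
print(check_space(1000, 1100))  # (True, 0)
print(check_space(1000, 1000))  # (False, 100)
```

Returning the shortfall alongside the pass/fail result makes it straightforward to show the required space, available space, and missing amount in a warning.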

Disk Usage Panel¶

The Settings page includes a Disk Usage panel that provides a visual overview of your storage:

Segmented Progress Bar: Shows the breakdown of disk usage with color-coded segments for downloaded models and other files.
Storage Breakdown: Displays the total size of all downloaded models and the number of model files.
Mount Point: Shows which disk volume is being used for model storage.

Model Directory Migration¶

If you need to move your models to a different location (e.g., to an external drive or a larger partition), Backend.AI GO supports smart migration:

Same-Volume Move: When the source and destination are on the same disk volume, the move is instant (file system rename operation).
Cross-Volume Copy: When moving to a different volume, files are copied with progress tracking, then the originals are removed after verification.
Progress Tracking: During cross-volume migrations, you can monitor the progress including bytes transferred, files processed, and current file being copied.
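
The two migration paths above can be sketched as follows. This is an illustrative Python outline under stated assumptions, not Backend.AI GO's implementation: `same_volume` and `migrate_models` are hypothetical names, volumes are compared through the `st_dev` field of `os.stat`, the destination directory is assumed not to exist yet, and a simple file-size comparison stands in for whatever verification the application actually performs.

```python
import os
import shutil
from pathlib import Path
from typing import Callable, Optional


def same_volume(a: Path, b: Path) -> bool:
    """True when two existing paths report the same device ID."""
    return os.stat(a).st_dev == os.stat(b).st_dev


def migrate_models(src_dir, dst_dir,
                   on_progress: Optional[Callable[[int, int, str], None]] = None) -> str:
    """Move src_dir to dst_dir; returns "renamed" or "copied"."""
    src, dst = Path(src_dir), Path(dst_dir)
    dst.parent.mkdir(parents=True, exist_ok=True)

    # Same volume: a rename only updates directory entries, so it is near-instant.
    if same_volume(src, dst.parent):
        os.rename(src, dst)
        return "renamed"

    # Different volume: copy file by file, verify, then remove the originals.
    files = [p for p in src.rglob("*") if p.is_file()]
    for done, source_file in enumerate(files, start=1):
        target = dst / source_file.relative_to(src)
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(source_file, target)
        if target.stat().st_size != source_file.stat().st_size:
            raise OSError(f"size mismatch after copying {source_file.name}")
        if on_progress is not None:
            on_progress(done, len(files), source_file.name)
    shutil.rmtree(src)
    return "copied"
```

Deleting the originals only after every file has been copied and checked means an interrupted cross-volume migration leaves the source directory intact.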

Download Retry and Resume¶

Backend.AI GO includes robust handling for download interruptions:

Auto-Retry: If a download fails due to a transient network error (timeouts, connection resets, or server errors), the system automatically retries up to 3 times with exponential backoff delays.
Resume Downloads: If a download is interrupted (even after closing the app), you can resume it from where it left off. Click the Retry button in the notification center to continue downloading.
Manual Retry: For permanent failures, you can manually retry a download from the notification center. The download will resume from any existing partial file.
Progress Preservation: Partial downloads are preserved, so you never lose progress on large model files.
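
For readers curious how retry and resume typically work, here is a minimal Python sketch of both ideas. It is not the application's code: `backoff_delays`, `with_retries`, and `resume_headers` are hypothetical names, and the 1-second base delay with doubling is an assumed schedule (the guide only states "up to 3 times with exponential backoff"). Resuming relies on the standard HTTP `Range` request header, which asks the server for the bytes after the part already on disk.

```python
import os
import time
from typing import Callable, Dict, List, TypeVar

T = TypeVar("T")

# Errors treated as transient in this sketch (timeouts, connection resets).
TRANSIENT_ERRORS = (TimeoutError, ConnectionError)


def backoff_delays(max_retries: int = 3, base_seconds: float = 1.0) -> List[float]:
    """Delay before each retry: base, 2x base, 4x base, ..."""
    return [base_seconds * 2 ** attempt for attempt in range(max_retries)]


def with_retries(operation: Callable[[], T], max_retries: int = 3,
                 base_seconds: float = 1.0,
                 sleep: Callable[[float], None] = time.sleep) -> T:
    """Run operation, retrying transient failures up to max_retries times."""
    delays = backoff_delays(max_retries, base_seconds)
    for attempt in range(max_retries + 1):
        try:
            return operation()
        except TRANSIENT_ERRORS:
            if attempt == max_retries:
                raise  # retries exhausted: surface the error for a manual retry
            sleep(delays[attempt])
    raise RuntimeError("unreachable")


def resume_headers(partial_path: str) -> Dict[str, str]:
    """Build a Range header that continues after the existing partial file."""
    offset = os.path.getsize(partial_path) if os.path.exists(partial_path) else 0
    return {"Range": f"bytes={offset}-"} if offset > 0 else {}
```

With these defaults, `backoff_delays()` yields waits of 1, 2, and 4 seconds, and a partial file of 1,024 bytes produces the header `Range: bytes=1024-`.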

Importing Local Files¶

If you already have a .gguf file downloaded elsewhere, Backend.AI GO provides two convenient ways to import it:

Using the Import Dialog¶

Go to the Models tab.
Click the Import button in the header.
An import dialog will open with two options:
- Drag and Drop: Simply drag your .gguf file from your file manager and drop it onto the drop zone.
- Browse Files: Click the drop zone or use the Browse Files button to open a file picker.
The file will be validated and copied to your models directory with progress indication.

Drag and Drop Features¶

The import dialog provides visual feedback throughout the process:

Drop Zone Animation: A pulsing dashed border indicates when you're dragging a file over the drop zone.
File Type Validation: Only .gguf files are accepted. Invalid file types show a clear error message.
Progress Tracking: Watch the import progress with a visual progress bar for each file.
Error Handling: If something goes wrong, you'll see a clear error message with the option to try again.
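
If you want to check a file yourself before importing it, the following Python sketch shows one way to do so. It assumes nothing about how Backend.AI GO validates files beyond the `.gguf` extension rule stated above; `validate_gguf` is a hypothetical helper. The extra check relies on the fact that GGUF files begin with the four ASCII bytes `GGUF`.

```python
from pathlib import Path
from typing import Tuple

GGUF_MAGIC = b"GGUF"  # the first four bytes of every GGUF file


def validate_gguf(path) -> Tuple[bool, str]:
    """Return (is_valid, message) for a candidate model file."""
    file_path = Path(path)
    if file_path.suffix.lower() != ".gguf":
        return False, f"unsupported file type: {file_path.suffix or '(none)'}"
    if not file_path.is_file():
        return False, "file not found"
    with file_path.open("rb") as handle:
        if handle.read(4) != GGUF_MAGIC:
            return False, "file does not start with the GGUF magic bytes"
    return True, "ok"
```

Checking the header in addition to the extension catches files that were renamed to `.gguf` or whose download was corrupted at the start.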

Keyboard Accessibility¶

The drop zone is fully keyboard accessible:

Press Tab to focus the drop zone.
Press Enter or Space to open the file picker.