Troubleshooting Guide

This comprehensive guide addresses common issues you might encounter while using Backend.AI GO.

1. Installation & Startup

App does not open (No response)

  • Check Processes: Sometimes a background process from a previous session might be stuck. Open Task Manager (Windows) or Activity Monitor (macOS) and kill any backend-ai-go or llama-server processes.

  • Reinstall: The installation might be corrupted. Reinstall the latest version.
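
The process check above can be done from a terminal. A sketch for macOS/Linux (on Windows, use Task Manager as described, or `taskkill` from an elevated Command Prompt):

```shell
# List any leftover Backend.AI GO or llama-server processes (macOS/Linux)
pgrep -fl 'backend-ai-go|llama-server'

# Terminate them if found (add -9 only if a normal kill has no effect)
pkill -f 'backend-ai-go|llama-server'
```

On Windows, the equivalent is `taskkill /F /IM llama-server-x86_64-pc-windows-msvc.exe`.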

"App is damaged and can't be opened" (macOS)

This is a common macOS security message for apps not downloaded from the App Store.

  • Solution: Right-click the app icon and select Open. Click Open again in the dialog box.

  • Terminal Fix: If that fails, run this command in Terminal:

    xattr -cr /Applications/Backend.AI\ GO.app
    

"Windows protected your PC" (SmartScreen)

  • Solution: Click More info, then click the Run anyway button.

App crashes immediately on launch

  • Config Reset: Your configuration file might be corrupted. Try renaming or deleting the config folder:
    • Windows: %APPDATA%\backend.ai.go
    • macOS: ~/Library/Application Support/backend.ai.go
    • Linux: ~/.config/backend.ai.go
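
Renaming is safer than deleting, since the old config is kept as a backup. A sketch using the paths above (the `.bak` suffix is just a convention, not an official reset mechanism):

```shell
# macOS: move the config folder aside so the app recreates a fresh one
mv ~/Library/Application\ Support/backend.ai.go ~/Library/Application\ Support/backend.ai.go.bak

# Linux
mv ~/.config/backend.ai.go ~/.config/backend.ai.go.bak
```

On Windows, rename `%APPDATA%\backend.ai.go` in File Explorer, or run `ren %APPDATA%\backend.ai.go backend.ai.go.bak` in Command Prompt.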

UI layout appears broken or behaves unexpectedly

  • Reset UI State: If sidebar positions, tab selections, or scroll positions seem corrupted, you can reset all UI preferences:
    1. Navigate to Settings > UI
    2. Find the UI State section
    3. Click Reset UI State to restore default layout

"Port 8000/8001 is already in use"

Backend.AI GO uses port 8000 for the API server and 8001 for management.

  • Identify Conflict: Another app (like a local web server or another AI tool) might be using these ports.

  • Change Port: (Future feature) Currently, you must stop the conflicting application.
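
To find out which application is holding the port, a sketch (`lsof` on macOS/Linux, `netstat` on Windows):

```shell
# macOS / Linux: show the process listening on port 8000
lsof -nP -iTCP:8000 -sTCP:LISTEN

# Windows (Command Prompt): find the owning PID, then look up its name
netstat -ano | findstr :8000
tasklist /FI "PID eq <pid-from-netstat>"
```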

"Binary not found" or "Inference Engine Not Found" (Windows)

This error occurs when the inference engine (llama-server) cannot be located. This can happen due to:

  • Antivirus Interference: Some antivirus software may quarantine or delete the llama-server-x86_64-pc-windows-msvc.exe binary, mistaking it for a threat.

    • Solution: Add an exception for the Backend.AI GO installation directory in your antivirus settings.
    • Common paths to whitelist:
      • %LOCALAPPDATA%\Programs\backend.ai.go\
      • %LOCALAPPDATA%\backend.ai.go\engines\
      • %APPDATA%\backend.ai.go\
  • Incomplete Installation: The installation might have been interrupted.

    • Solution: Reinstall Backend.AI GO from the official release.
  • Missing Engine: The bundled engine might not be present. Backend.AI GO now supports downloadable engines.

    • Solution: Navigate to Settings > Engines and download a compatible inference engine for your system.
  • File Permission Issues: Windows security policies might prevent access to the binary.

    • Solution: Try running the application as Administrator once, or check folder permissions.

Troubleshooting Steps:

  1. Check the notification center for an "Inference Engine Not Found" message
  2. Click "Download Engine" to navigate to the Engines page
  3. Download a compatible engine (e.g., llama-cpp with CUDA or CPU support)
  4. Retry loading your model

Debug Information: The error message will show the expected binary name and searched paths to help identify where the engine should be located.

2. Model Download & Management

Download stuck at 0%

  • Firewall: Check if your firewall or antivirus is blocking the connection to Hugging Face.

  • Proxy: If you are behind a corporate proxy, ensure HTTP_PROXY and HTTPS_PROXY env vars are set correctly.
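
For example, on macOS/Linux you can set the proxy variables in the shell you launch the app from (replace the host and port with your actual proxy; a sketch):

```shell
# Replace proxy.example.com:8080 with your corporate proxy address
export HTTP_PROXY="http://proxy.example.com:8080"
export HTTPS_PROXY="http://proxy.example.com:8080"

# Verify they are set
env | grep -i proxy
```

On Windows, use `set HTTP_PROXY=...` in the same Command Prompt session, or set the variables in System Environment Variables.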

"Download Failed (Network Error)"

  • Resume: Click the retry button. The download supports resuming.

  • Disk Space: Verify you have enough space. A "Network Error" can sometimes mask a "Disk Full" error.

Slow Download Speed

  • Hugging Face Mirror: In some regions, the connection to Hugging Face might be slow.

  • External Download: You can download the .gguf file using a browser or download manager, then use the Import feature.
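
A resume-capable download can also be done with `curl` (the repository and file names below are placeholders — copy the real download URL from the model's Hugging Face page):

```shell
# -L follows redirects, -C - resumes a partially downloaded file,
# -O keeps the original file name
curl -L -C - -O "https://huggingface.co/<user>/<repo>/resolve/main/<model-file>.gguf"
```

Once the file is complete, add it through the Import feature.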

Model not showing in Library

  • Refresh: Click the refresh icon in the library header.

  • File Extension: Ensure the file ends in .gguf.

  • Location: Verify the file is in the correct models directory (visible in Settings).

Cannot delete model (File Locked)

  • Unload First: Ensure the model is not currently loaded.

  • Restart App: Fully quit Backend.AI GO to release any file locks.

Multi-part model shows "Incomplete" badge

Large models (70B+ parameters) are often split into multiple files (e.g., model.gguf-00001-of-00006.gguf). If you see an "Incomplete" warning badge on a model card:

  • Check Missing Parts: Hover over the badge to see which parts are missing (e.g., "Missing parts: 2, 3, 5 of 6").

  • Resume Download: Go to the Downloads tab and resume any paused or failed downloads for the missing parts.

  • Manual Download: If the download was interrupted, you may need to re-download the missing parts from Hugging Face.

  • File Verification: Ensure all part files are in the same directory and follow the naming pattern model.gguf-NNNNN-of-NNNNN.gguf.

  • Cannot Load: Incomplete multi-part models cannot be loaded until all parts are present. The Load button will be disabled.
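
A quick way to check which parts are present is to test each expected file name against the pattern above. A sketch — run it in your models directory and adjust the base name and part count for your model:

```shell
# Check a 6-part model split as model.gguf-NNNNN-of-NNNNN.gguf
base="model.gguf"
total=6
tot=$(printf '%05d' "$total")
for i in $(seq 1 "$total"); do
  part=$(printf '%05d' "$i")
  f="${base}-${part}-of-${tot}.gguf"
  if [ -f "$f" ]; then
    echo "part $part: present"
  else
    echo "part $part: MISSING"
  fi
done
```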

3. Loading & Inference

App freezes during loading

  • Large Model: Loading a Solar-Open-100B or gpt-oss-120B model into RAM can take time. Wait for 2-3 minutes.

  • System RAM: If your system runs out of RAM, the OS will start swapping, which causes extreme slowdowns. Ensure you are not loading a model larger than your physical RAM.

"Out of Memory (OOM)" Error

  • Reduce Context: Lower the Context Length in Model Settings (e.g., from 8192 to 4096).

  • Adjust GPU Offloading: If splitting the model between CPU and GPU, offload fewer layers to the GPU if VRAM is the bottleneck, or more if system RAM is the bottleneck.

  • Smaller Quantization: Switch from Q8_0 to Q4_K_M.

Using CPU instead of GPU

  • Check Settings: In Model Settings, ensure GPU Layers is not set to 0. Set it to Max or -1.

  • Drivers:

    • Windows: Install the latest NVIDIA Studio or Game Ready drivers.
    • Linux: Ensure CUDA toolkit is installed and nvidia-smi works.
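
On Linux, you can confirm the driver stack is visible before changing app settings (a sketch):

```shell
# If this prints your GPU, the NVIDIA driver is installed and working
if command -v nvidia-smi >/dev/null; then
  nvidia-smi --query-gpu=name,driver_version,memory.total --format=csv
else
  echo "nvidia-smi not found: install or repair the NVIDIA driver and CUDA toolkit"
fi
```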

MLX model fails to load (macOS)

  • Chip Compatibility: MLX requires Apple Silicon (M1/M2/M3/M4/M5). It does not work on Intel Macs.

  • OS Version: Ensure you are on macOS 13.0 (Ventura) or later.

Model outputs gibberish

  • Repetition: If the model repeats words or phrases, increase the Frequency Penalty.

  • Temperature: If the output is incoherent, lower the Temperature (e.g., to 0.7).

  • Template: The internal prompt template might be mismatched. Try manually selecting the correct template (ChatML, Llama 3, etc.) in settings.

Slow generation speed (Low TPS)

  • Power Mode: Ensure your laptop is plugged in.

  • Background Apps: Close web browsers or Electron apps (Slack, Discord) to free up memory bandwidth.

4. Chat & Features

Enter key does not send message

  • Loading State: Check if the model is still loading or generating a response.

  • Generation: You cannot send a message while the AI is typing. Click Stop first.

Missing chat history

  • Database: Chat history is stored locally. If you cleared your app data or reinstalled cleanly, history might be gone.

  • Search: Use the search bar in the sidebar; the chat might just be scrolled out of view.

Image attachment fails

  • Model Support: Ensure you are using a Multimodal model (e.g., Llama-3.2-Vision, Qwen2-VL). Standard text models cannot "see" images.

"Thinking Block" not visible

  • Model Support: Only reasoning models (DeepSeek-R1, Qwen3-Thinking) output thinking traces. Standard models will just give the answer directly.

Tool calling errors

  • Model Capability: Ensure the model supports tool calling. Small models (< 7B) often struggle with tool formatting.

  • Permission: Did you deny the permission request? Check the chat for a permission prompt.

Model answers in English instead of your language

  • System Prompt: Add a system prompt: "You are a helpful assistant. Always answer in the user's language."

  • Model Bias: Some smaller models are heavily English-biased. Try a larger model or one fine-tuned for your language.

5. Cloud Integration

"Invalid Key" error

  • Whitespace: Check for hidden spaces at the start or end of your API key.

  • Quota: Check if you have run out of credits/quota on the provider's platform (OpenAI, Anthropic).

Cloud models not listed

  • Refresh: It might take a moment to fetch the list.

  • Key Permissions: Ensure your API key has permissions to list models.

Remote vLLM connection failed

  • Network: Can you ping the server IP?

  • Firewall: Ensure port 8000 is open on the remote server.

  • CORS: If running vLLM, ensure you launched it with --cors-allow-origins "*".

Viewing Provider Connection Errors

If a cloud provider fails to connect or returns errors, you can view detailed error logs in the Logs panel:

  1. Open the Logs panel from the sidebar
  2. Filter by API source to see provider-related messages
  3. Look for error entries that include:
    • Connection failures (timeout, unreachable host)
    • Authentication errors (401, 403)
    • Rate limiting (429)
    • Server errors (5xx)

Credential Safety

Connection error logs automatically sanitize URLs to remove any embedded credentials, ensuring your API keys are never exposed in log files.


6. macOS Specific Issues

"System Not Supported" on macOS

  • Intel Macs: Backend.AI GO requires Apple Silicon (M1/M2/M3/M4/M5). Intel-based Macs are not supported. If you need to run AI on an Intel Mac, consider using the Cloud Integration features.

For general questions about features or usage, please visit the FAQ page.

Still stuck? Join our Discord community or open an issue on GitHub.