fix: improve error message when LlamaModel fails to load#2187

Open
Anai-Guo wants to merge 1 commit into abetlen:main from Anai-Guo:fix-improve-model-load-error
Conversation

@Anai-Guo
Summary

Fixes #2145

When llama_model_load_from_file returns None, the previous error was:

ValueError: Failed to load model from file: /path/to/model.gguf

With verbose=False (the default), the llama.cpp log is suppressed, so users have no way to know why loading failed. This is particularly confusing for large BF16 models where the silent failure is typically OOM.

Changes

The error message now includes:

  1. File size in GB — makes OOM the obvious first suspect
  2. Common causes — RAM/VRAM, unsupported format, corrupt file
  3. Actionable tips — verbose=True to see the llama.cpp log, n_gpu_layers=-1 for GPU offloading

Before

ValueError: Failed to load model from file: /root/.cache/.../Qwen3.5-35B-A3B-Uncensored-HauhauCS-Aggressive-BF16.gguf

After

ValueError: Failed to load model from file: /root/.cache/.../model.gguf (file size: 67.5 GB).
Common causes: insufficient RAM or VRAM for the model size, unsupported quantization format, or corrupt file.
Tip: set verbose=True to see the full llama.cpp log, or use n_gpu_layers=-1 to offload layers to GPU.
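
The improved message above could be produced by a helper along these lines (a minimal sketch; the function name and structure are hypothetical, not the PR's actual code):

```python
import os


def build_load_error(path: str) -> ValueError:
    # Hypothetical helper: include the file size so OOM is the obvious
    # first suspect, then list common causes and actionable tips.
    size_gb = os.path.getsize(path) / (1024 ** 3)
    msg = (
        f"Failed to load model from file: {path} (file size: {size_gb:.1f} GB).\n"
        "Common causes: insufficient RAM or VRAM for the model size, "
        "unsupported quantization format, or corrupt file.\n"
        "Tip: set verbose=True to see the full llama.cpp log, "
        "or use n_gpu_layers=-1 to offload layers to GPU."
    )
    return ValueError(msg)
```

In the actual code path, such a helper would be invoked where llama_model_load_from_file returns None, replacing the bare ValueError raised previously.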

When llama_model_load_from_file returns None, the error was a generic
'Failed to load model from file'. With verbose=False (the default), the
llama.cpp log is suppressed, leaving users with no actionable information.

The error message now:
- includes the file size in GB to help diagnose OOM
- lists common causes (insufficient RAM/VRAM, unsupported format, corrupt file)
- suggests verbose=True to see the full llama.cpp error log
- suggests n_gpu_layers=-1 for GPU offloading

Fixes abetlen#2145
