# Ollama Model Setup Guide

This guide covers setting up the LLM and embedding models for SEA-Forge™.

## Quick Start

```bash
# Install Ollama (if not already installed)
curl -fsSL https://ollama.com/install.sh | sh

# Pull required models
ollama pull llama3.2              # Chat/completion model
ollama pull embeddinggemma:300m   # Embedding model (recommended)
```
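If you prefer to script this check, the sketch below can report which required models are still missing. It is a hypothetical helper, assuming the default port and the `models`/`name` shape of Ollama's `/api/tags` response:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # default OLLAMA_BASE_URL
REQUIRED = {"llama3.2", "embeddinggemma:300m"}

def missing_models(tags_body: bytes, required=REQUIRED) -> set:
    """Return the required models absent from an /api/tags response body."""
    installed = {m["name"] for m in json.loads(tags_body).get("models", [])}
    # A pulled model may be listed with an explicit tag, e.g. "llama3.2:latest".
    return {
        want for want in required
        if not any(name == want or name.startswith(want + ":") for name in installed)
    }

if __name__ == "__main__":
    try:
        with urllib.request.urlopen(f"{OLLAMA_URL}/api/tags", timeout=5) as resp:
            print("missing:", missing_models(resp.read()) or "none")
    except OSError as exc:  # Ollama not running / unreachable
        print("could not reach Ollama:", exc)
```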
## Default Configuration

The default `.env` configuration uses Ollama's native models:

```env
OLLAMA_BASE_URL=http://localhost:11434
LLM_DEFAULT_MODEL=ollama/llama3.2
LLM_DEFAULT_EMBEDDING_MODEL=embeddinggemma:300m
```

## Low-Memory Systems (Q8_0 Quantized)

For systems with limited RAM/VRAM, use the Q8_0 quantized embedding model. There are three ways to get it:

### Option 1: Automated Import

```bash
just ollama-import-gguf
```

This automatically downloads and imports the Q8_0 build of `unsloth/EmbeddingGemma-300M-GGUF`.

### Option 2: Manual Import

If you already have the GGUF file:

```bash
# 1. Navigate to your GGUF location
cd models/models

# 2. Create a Modelfile
cat > Modelfile.embeddinggemma <<EOF
FROM ./embeddinggemma-300M-Q8_0.gguf
EOF

# 3. Import into Ollama
ollama create embeddinggemma:300m-q8_0 -f Modelfile.embeddinggemma

# 4. Verify
ollama list | grep embeddinggemma
```

### Option 3: Download from HuggingFace

```bash
# Download the GGUF file
wget -P models/models \
  "https://huggingface.co/unsloth/EmbeddingGemma-300M-GGUF/resolve/main/EmbeddingGemma-300M-Q8_0.gguf"

# Then follow the Option 2 steps (on case-sensitive filesystems, rename
# the file if its case doesn't match the FROM line in the Modelfile)
```

## Configure Your .env

After importing the Q8_0 model, update your `.env`. Keep only the line that matches your system (with most dotenv loaders, the last assignment wins):

```env
# Default (Ollama native - higher quality)
LLM_DEFAULT_EMBEDDING_MODEL=embeddinggemma:300m

# Override for low-memory systems
LLM_DEFAULT_EMBEDDING_MODEL=embeddinggemma:300m-q8_0
```

## Model Specifications

| Model | Dimensions | Size | Use Case |
|-------|------------|------|----------|
| `embeddinggemma:300m` | 384 | ~600MB | Default, highest quality |
| `embeddinggemma:300m-q8_0` | 384 | ~300MB | Low-memory systems |
| `llama3.2` | N/A | ~2GB | Chat/completion |

**Note:** Both embedding models output 384-dimensional vectors, compatible with `PGVECTOR_DIM=384`.
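Before relying on `PGVECTOR_DIM=384`, a small Python sketch can confirm that an imported model really emits 384-dimensional vectors. The helper names are hypothetical; the sketch assumes the default base URL and the `embeddings` field of Ollama's `/api/embed` response:

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # default OLLAMA_BASE_URL
EXPECTED_DIM = 384                     # must match PGVECTOR_DIM

def build_embed_request(model: str, text: str) -> urllib.request.Request:
    """Build a POST request for Ollama's /api/embed endpoint."""
    payload = json.dumps({"model": model, "input": text}).encode()
    return urllib.request.Request(
        f"{OLLAMA_URL}/api/embed",
        data=payload,
        headers={"Content-Type": "application/json"},
    )

def embedding_dim(response_body: bytes) -> int:
    """Return the dimension of the first vector in an /api/embed response."""
    return len(json.loads(response_body)["embeddings"][0])

if __name__ == "__main__":
    try:
        req = build_embed_request("embeddinggemma:300m", "dimension check")
        with urllib.request.urlopen(req, timeout=30) as resp:
            dim = embedding_dim(resp.read())
        print("OK" if dim == EXPECTED_DIM else f"MISMATCH: got {dim}")
    except OSError as exc:  # Ollama not running / unreachable
        print("could not reach Ollama:", exc)
```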

## Verify Setup

```bash
# Check Ollama is running
curl http://localhost:11434/api/tags

# List available models
ollama list

# Test the embedding model (embedding-only models can't be used with
# `ollama run`, so call the embeddings API instead)
curl http://localhost:11434/api/embed \
  -d '{"model": "embeddinggemma:300m-q8_0", "input": "Test embedding"}'
```

## Troubleshooting

### Ollama not running

```bash
ollama serve &
```

### Model not found

```bash
# Re-import the model
cd models/models
ollama create embeddinggemma:300m-q8_0 -f Modelfile.embeddinggemma
```

### Out of memory

Switch to the Q8_0 quantized embedding model described under "Low-Memory Systems" above, and set `LLM_DEFAULT_EMBEDDING_MODEL=embeddinggemma:300m-q8_0` in your `.env`.

## File Locations

```
models/
└── models/
    ├── embeddinggemma-300M-Q8_0.gguf    # Downloaded GGUF file
    └── Modelfile.embeddinggemma         # Ollama import config
```

See Also: