ADR-007: Walking Skeleton Vector Memory

Status: Accepted
Version: 1.0
Date: 2026-01-01
Supersedes: N/A
Related ADRs: ADR-006 (Ingest Pipeline)
Related PRDs: PRD-MEMORY-001


Context

The Walking Skeleton requires semantic search capability to retrieve relevant policies based on query similarity. After ingesting policies (S1A), we need a vector memory layer that:

  1. Stores embeddings (384-dimensional vectors)
  2. Performs similarity search (cosine distance)
  3. Supports the downstream RAG query flow (S1D)
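The similarity measure in requirement 2 can be sketched in plain Python. This is an illustrative helper, not the service's actual code; note that pgvector's cosine operator returns a distance, i.e. 1 minus the similarity:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors (e.g. 384-dim embeddings)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def cosine_distance(a: list[float], b: list[float]) -> float:
    """Cosine distance as pgvector reports it: 1 - cosine similarity."""
    return 1.0 - cosine_similarity(a, b)
```

Identical vectors give distance 0; orthogonal vectors give distance 1, which is why lower scores rank higher in the search results.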

Per P001-SKELETON, this is the second component in the golden thread: “Index it, and query it.”

Decision

Implement a minimal vector memory service for Cycle S1B with:

  1. Embedding Model: EmbeddingGemma via llama.cpp (local, deterministic)
  2. Vector Store: pgvector extension in PostgreSQL
  3. Search Algorithm: Cosine similarity with HNSW indexing
  4. API: Simple query interface (text → top-k similar policies)
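The query interface in point 4 can be illustrated with a brute-force, in-memory sketch. All names here (`VectorMemory`, the injected `embed` callable) are hypothetical; the real service would embed via EmbeddingGemma through llama.cpp and delegate storage and approximate search to pgvector's HNSW index rather than scanning every row:

```python
from typing import Callable

class VectorMemory:
    """Minimal in-memory stand-in for the pgvector-backed store (illustration only)."""

    def __init__(self, embed: Callable[[str], list[float]]):
        self.embed = embed  # e.g. EmbeddingGemma via llama.cpp in the real service
        self.items: list[tuple[str, list[float]]] = []

    def add(self, policy_id: str, text: str) -> None:
        """Embed a policy's text and store the vector alongside its id."""
        self.items.append((policy_id, self.embed(text)))

    def query(self, text: str, k: int = 5) -> list[tuple[str, float]]:
        """Return the top-k (policy_id, cosine_distance) pairs, closest first."""
        q = self.embed(text)

        def dist(v: list[float]) -> float:
            dot = sum(x * y for x, y in zip(q, v))
            nq = sum(x * x for x in q) ** 0.5
            nv = sum(x * x for x in v) ** 0.5
            return 1.0 - dot / (nq * nv)

        scored = [(pid, dist(v)) for pid, v in self.items]
        return sorted(scored, key=lambda s: s[1])[:k]
```

The brute-force scan is exact but O(n) per query; HNSW trades a small amount of recall for sub-linear search, which is why the decision pairs cosine distance with an HNSW index.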

Rationale

Local-First Stack

Alternatives Considered

Alternative               Rejected Because
OpenAI Embeddings API     Non-deterministic, requires internet access, incurs cost, needs API keys
Sentence Transformers     Heavier Python runtime, slower than llama.cpp
Milvus / Qdrant           Over-engineered for the skeleton, adds service overhead
FAISS                     Requires separate index management; pgvector is simpler

Consequences

Positive

Negative

Implementation Notes

Success Criteria


Next Steps