ADR-009: Walking Skeleton RAG Query Orchestration

Status: Accepted
Version: 1.0
Date: 2026-01-01
Supersedes: N/A
Related ADRs: ADR-006 (Ingest), ADR-007 (Memory), ADR-008 (Governance)
Related PRDs: PRD-QUERY-001


Context

The Walking Skeleton requires end-to-end query orchestration to complete the golden thread. With policy ingestion (S1A), retrieval (S1B), and governance (S1C) in place, we need an orchestration layer that:

  1. Accepts natural language queries
  2. Retrieves relevant policies via semantic search
  3. Enforces governance checks
  4. Synthesizes answers using RAG

Per P001-SKELETON, this is the final component: “Ask ‘What is this policy?’ → Retrieve → Synthesize answer”.

Decision

Implement a minimal RAG query service for Cycle S1D with:

  1. Orchestration: Semantic Kernel (SK) framework
  2. Query Flow: NL query → embeddings → similarity search → governance → synthesis
  3. LLM: Local model via llama.cpp (Gemma-2B or Phi-3)
  4. Response: Structured answer with sources and confidence
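The query flow and response shape above can be sketched in plain Python as follows. This is a hedged illustration only: it deliberately does not call the Semantic Kernel or llama.cpp APIs, and instead takes each stage as an injected callable. The names `QueryResponse`, `run_query`, `embed`, `search`, `is_allowed`, and `synthesize` are hypothetical, as is the choice to report the best similarity score as the confidence value.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# A search hit is (policy_id, policy_text, similarity_score).
Hit = Tuple[str, str, float]


@dataclass
class QueryResponse:
    """Structured answer with sources and confidence (item 4)."""
    answer: str
    sources: List[str] = field(default_factory=list)
    confidence: float = 0.0


def run_query(question: str,
              embed: Callable[[str], List[float]],
              search: Callable[[List[float], int], List[Hit]],
              is_allowed: Callable[[str], bool],
              synthesize: Callable[[str, List[str]], str],
              k: int = 3) -> QueryResponse:
    """NL query -> embeddings -> similarity search -> governance -> synthesis."""
    query_vec = embed(question)
    hits = search(query_vec, k)
    # Governance runs before synthesis so blocked text never reaches the LLM.
    permitted = [hit for hit in hits if is_allowed(hit[0])]
    if not permitted:
        return QueryResponse(answer="No permitted policy matched the query.")
    answer = synthesize(question, [text for _, text, _ in permitted])
    # Assumption: confidence is the highest similarity among permitted hits.
    confidence = max(score for _, _, score in permitted)
    return QueryResponse(answer, [pid for pid, _, _ in permitted], confidence)


# Usage with stub stages; in the real service these would be backed by the
# embedding model, the vector store, the governance layer, and the local LLM.
response = run_query(
    "What is this policy?",
    embed=lambda text: [float(len(text))],
    search=lambda vec, k: [("POL-1", "Leave policy text", 0.9),
                           ("POL-2", "Restricted text", 0.8)][:k],
    is_allowed=lambda pid: pid != "POL-2",
    synthesize=lambda q, ctx: f"Based on {len(ctx)} source(s): {ctx[0]}",
)
print(response)
```

Passing the stages in as callables keeps the flow testable without a model loaded, and each stub could later be swapped for a Semantic Kernel function without changing the order of the steps.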

Rationale

Semantic Kernel as Orchestrator

Alternatives Considered

Alternative            Rejected Because
---------------------  -----------------------------------------------
LangChain              Heavier, Python-centric, more complex
LlamaIndex             More opinionated, steeper learning curve
Custom orchestration   Reinventing the wheel, harder to maintain
OpenAI API             Non-local, non-deterministic, requires API keys

Consequences

Positive

Negative

Implementation Notes

Success Criteria


Next Steps: