Implementation Plan: Walking Skeleton Runtime

Phase 2 Artifact — Implements the runtime components for the Walking Skeleton validated in P001-SKELETON.

Purpose

Wire the validated Walking Skeleton specs (ingest, memory, governance, query) to real runtime implementations: local embeddings via llama.cpp, vector storage via pgvector, policy enforcement via OPA, and RAG orchestration via Semantic Kernel.

Prerequisite: P001-SKELETON (Walking Skeleton) — ✅ Complete


Upstream/Downstream Dependencies

| Dependency | Plan | Reason |
|---|---|---|
| ⬆️ Depends on | P001-SKELETON | SEA-DSL specs validated |
| ⬆️ Depends on | P005 Knowledge Graph Runtime | Oxigraph adapter |
| ⬆️ Depends on | P019b LLM Provider Abstraction | LiteLLM for fallback providers |
| ⬇️ Needed by | P020 Cognitive Extension Layer | Full RAG pipeline |
| ⬇️ Needed by | P015 Agent Society Runtime | Governance integration |

Scope

In Scope (This Plan)

| Component | Technology | Purpose |
|---|---|---|
| Embedding generator | llama.cpp + llama-cpp-python | Local-first embeddings (Gemma 2B) |
| Vector store adapter | pgvector | Semantic search on policy embeddings |
| Knowledge graph adapter | Oxigraph | RDF triple storage and SPARQL queries |
| Policy engine adapter | OPA (Rego) | Access control enforcement |
| RAG orchestrator | Semantic Kernel | Query → retrieve → synthesize pipeline |
| Docker Compose | infra/docker/ | Dev stack for all services |
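
The query → retrieve → synthesize flow in the table can be pictured with a small sketch. It is written in plain Python rather than Semantic Kernel, only to show the control flow between the ports. Every class and method name here is hypothetical, and running the policy check before retrieval is an assumption, not something the plan prescribes.

```python
from dataclasses import dataclass
from typing import Protocol, Sequence


class EmbeddingGenerator(Protocol):
    def embed(self, text: str) -> list[float]: ...


class VectorStore(Protocol):
    def search(self, vector: Sequence[float], top_k: int) -> list[str]: ...


class PolicyEngine(Protocol):
    def allow(self, subject: str, action: str) -> bool: ...


class Synthesizer(Protocol):
    def synthesize(self, question: str, passages: list[str]) -> str: ...


@dataclass
class RagOrchestrator:
    """Query -> retrieve -> synthesize, gated by a policy check."""

    embedder: EmbeddingGenerator
    store: VectorStore
    policy: PolicyEngine
    synthesizer: Synthesizer

    def answer(self, subject: str, question: str, top_k: int = 3) -> str:
        # Assumption: governance is checked before any store is touched.
        if not self.policy.allow(subject, "query"):
            raise PermissionError(f"{subject} is not allowed to query")
        vector = self.embedder.embed(question)
        passages = self.store.search(vector, top_k)
        return self.synthesizer.synthesize(question, passages)


# Hypothetical in-memory stand-ins so the sketch runs without any service.
class FakeEmbedder:
    def embed(self, text: str) -> list[float]:
        return [float(len(text)), 1.0]


class FakeStore:
    def search(self, vector: Sequence[float], top_k: int) -> list[str]:
        return ["Policy P1 restricts export of customer data."][:top_k]


class AllowAnalysts:
    def allow(self, subject: str, action: str) -> bool:
        return subject == "analyst" and action == "query"


class EchoSynthesizer:
    def synthesize(self, question: str, passages: list[str]) -> str:
        return f"{question} -> {' '.join(passages)}"


orchestrator = RagOrchestrator(
    FakeEmbedder(), FakeStore(), AllowAnalysts(), EchoSynthesizer()
)
print(orchestrator.answer("analyst", "What is this policy?"))
```

Because the orchestrator depends only on the four protocols, each Wave 1 adapter can be swapped in without changing this class.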

Out of Scope (Covered by Other Plans)


Pre-Flight Validation

Spec Validation

| Check | Requirement | Pass |
|---|---|---|
| S1A ingest.sea | sea validate docs/specs/ingest/ingest.sea | [x] |
| S1B memory.sea | sea validate docs/specs/memory/memory.sea | [x] |
| S1C governance.sea | sea validate docs/specs/governance/governance.sea | [x] |
| S1D query.sea | sea validate docs/specs/query/query.sea | [x] |

Infrastructure Prerequisites

| Check | Command | Pass |
|---|---|---|
| Docker available | docker --version | [x] |
| PostgreSQL + pgvector | docker compose up pgvector | [x] |
| Oxigraph running | docker compose up oxigraph | [x] |
| OPA running | docker compose up opa | [x] |
| Ollama or llama.cpp | ollama --version OR llama-cli --version | [x] |
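
The tool checks above could be automated with a small pre-flight script. This is a minimal sketch that only tests whether the binaries are on PATH; it does not start or probe the containers. The function name and the message wording are hypothetical.

```python
import shutil

# Tool names come from the table above; "ollama" and "llama-cli" are alternatives.
REQUIRED = ["docker"]
ANY_OF = ["ollama", "llama-cli"]


def preflight(which=shutil.which) -> list[str]:
    """Return human-readable problems; an empty list means the tools are present."""
    problems = [f"missing required tool: {tool}" for tool in REQUIRED if which(tool) is None]
    if not any(which(tool) for tool in ANY_OF):
        problems.append("need one of: " + ", ".join(ANY_OF))
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print(problem)
```

The `which` parameter is injectable so the check can be exercised in tests without the tools installed.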

Proposed Cycles (Worktree-First)

| Cycle | Worktree | Branch | Wave | Implements |
|---|---|---|---|---|
| C1A | ../SEA-p2-c1A | cycle/p2-c1A-embedding-port | 1 | Embedding port + llama.cpp adapter |
| C1B | ../SEA-p2-c1B | cycle/p2-c1B-vector-port | 1 | Vector store port + pgvector adapter |
| C1C | ../SEA-p2-c1C | cycle/p2-c1C-kg-port | 1 | Knowledge graph port + Oxigraph adapter |
| C1D | ../SEA-p2-c1D | cycle/p2-c1D-policy-port | 1 | Policy port + OPA adapter |
| C2A | ../SEA-p2-c2A | cycle/p2-c2A-docker-stack | 2 | Docker Compose for dev stack |
| C3A | ../SEA-p2-c3A | cycle/p2-c3A-rag-orchestrator | 3 | Semantic Kernel RAG pipeline |
| C4A | ../SEA-p2-c4A | cycle/p2-c4A-e2e-integration | 4 | E2E integration test |
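
One way to turn the cycle table into worktrees is to generate the `git worktree add` commands per wave. This is a sketch: the helper name is hypothetical, and it assumes each branch is created fresh from the current HEAD with `-b`.

```python
# (cycle, worktree path, branch, wave), copied from the cycle table above.
CYCLES = [
    ("C1A", "../SEA-p2-c1A", "cycle/p2-c1A-embedding-port", 1),
    ("C1B", "../SEA-p2-c1B", "cycle/p2-c1B-vector-port", 1),
    ("C1C", "../SEA-p2-c1C", "cycle/p2-c1C-kg-port", 1),
    ("C1D", "../SEA-p2-c1D", "cycle/p2-c1D-policy-port", 1),
    ("C2A", "../SEA-p2-c2A", "cycle/p2-c2A-docker-stack", 2),
    ("C3A", "../SEA-p2-c3A", "cycle/p2-c3A-rag-orchestrator", 3),
    ("C4A", "../SEA-p2-c4A", "cycle/p2-c4A-e2e-integration", 4),
]


def worktree_commands(wave: int) -> list[str]:
    """Build one `git worktree add` command per cycle in the given wave."""
    return [
        f"git worktree add -b {branch} {path}"
        for _cycle, path, branch, cycle_wave in CYCLES
        if cycle_wave == wave
    ]


for command in worktree_commands(1):
    print(command)
```

Printing the commands instead of executing them keeps the sketch side-effect free. The four Wave 1 commands are independent of one another, which is what allows those cycles to proceed in parallel.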

Wave 1 (Parallel — Ports + Adapters)


Wave 2 (Depends on Wave 1)


Wave 3 (Depends on Wave 2)


Wave 4 (Final Validation)


Dependencies Introduced

| Dependency | Type | Version | Package | Justification |
|---|---|---|---|---|
| llama-cpp-python | Python | 0.3.x | llama-cpp-python | Local llama.cpp bindings |
| semantic-kernel | Python | 1.x | semantic-kernel | RAG orchestration framework |
| pg | Node | 8.x | pg | PostgreSQL client for pgvector |
| @pgtyped/runtime | Node | 2.x | @pgtyped/runtime | Type-safe SQL queries |
| GGUF model | Model | | gemma-2-2b-it.Q4_K_M.gguf | Local embedding model |

Expected Filetree After Implementation

libs/skeleton/
├── embedding/
│   ├── ports/src/lib/embedding-generator.port.ts
│   └── adapters/src/lib/
│       ├── llama-cpp.adapter.ts
│       └── fake.adapter.ts
├── vector/
│   ├── ports/src/lib/vector-store.port.ts
│   └── adapters/src/lib/
│       ├── pgvector.adapter.ts
│       └── fake.adapter.ts
├── graph/
│   ├── ports/src/lib/triple-store.port.ts
│   └── adapters/src/lib/
│       ├── oxigraph.adapter.ts
│       └── fake.adapter.ts
├── policy/
│   ├── ports/src/lib/policy-engine.port.ts
│   └── adapters/src/lib/
│       ├── opa.adapter.ts
│       └── fake.adapter.ts
└── query/
    └── application/src/lib/rag-orchestrator.service.ts

services/
├── embedding/
│   ├── src/main.py
│   └── Dockerfile
└── query/
    ├── src/main.py
    └── Dockerfile

infra/docker/
├── docker-compose.skeleton.yml
├── postgres/init.sql
├── oxigraph/Dockerfile
└── opa/policies/skeleton.rego

tests/skeleton/
└── test_walking_skeleton_runtime.py
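
Each library in the tree pairs a port with a real adapter and a fake adapter. The sketch below shows that pairing for the embedding and vector libraries. It is written in Python for brevity, although the tree places the ports in TypeScript files. All class names are hypothetical, and the hash-based vectors carry no semantic meaning; they only make tests deterministic.

```python
import hashlib
import math
from typing import Protocol


class EmbeddingGeneratorPort(Protocol):
    dimensions: int

    def embed(self, text: str) -> list[float]: ...


class FakeEmbeddingAdapter:
    """Deterministic stand-in: hashes text into a unit-length vector."""

    def __init__(self, dimensions: int = 384) -> None:
        self.dimensions = dimensions

    def embed(self, text: str) -> list[float]:
        raw: list[float] = []
        counter = 0
        while len(raw) < self.dimensions:
            digest = hashlib.sha256(f"{counter}:{text}".encode("utf-8")).digest()
            raw.extend(byte / 255.0 - 0.5 for byte in digest)
            counter += 1
        raw = raw[: self.dimensions]
        norm = math.sqrt(sum(x * x for x in raw)) or 1.0
        return [x / norm for x in raw]


class FakeVectorStoreAdapter:
    """In-memory store that ranks rows by cosine similarity."""

    def __init__(self) -> None:
        self._rows: dict[str, list[float]] = {}

    def upsert(self, key: str, vector: list[float]) -> None:
        self._rows[key] = vector

    def search(self, vector: list[float], top_k: int = 1) -> list[str]:
        # Vectors are unit length, so the dot product equals cosine similarity.
        def score(key: str) -> float:
            return sum(a * b for a, b in zip(self._rows[key], vector))

        return sorted(self._rows, key=score, reverse=True)[:top_k]


embedder = FakeEmbeddingAdapter()
store = FakeVectorStoreAdapter()
store.upsert("retention", embedder.embed("data retention policy"))
store.upsert("export", embedder.embed("export policy"))
print(store.search(embedder.embed("export policy")))
```

With fakes like these, the orchestrator and the E2E test can be developed before the llama.cpp and pgvector adapters exist.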

Validation & Verification

Pre-Merge Checks (Per Cycle)

E2E Integration Test (C4A)

# Start full skeleton stack
just skeleton-up

# Run E2E test
python3 tests/skeleton/test_walking_skeleton_runtime.py

# Expected output:
# ✓ Parse policy.sea file
# ✓ Generate embedding (384 dimensions)
# ✓ Store in pgvector
# ✓ Store RDF triples in Oxigraph
# ✓ Query: "What is this policy?"
# ✓ OPA policy check: ALLOW
# ✓ RAG response synthesized

Manual Verification Checklist

  1. curl http://localhost:8002/v1/embeddings returns 384-dim vector
  2. curl http://localhost:7878/query?query=SELECT... returns SPARQL results
  3. curl http://localhost:8181/v1/data/skeleton/allow returns policy decision
  4. curl http://localhost:8003/query returns RAG-synthesized answer
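
The OPA item in the checklist can also be scripted. The sketch below posts an input document to OPA's Data API and reads the boolean decision. The URL comes from the checklist; the shape of the input document is a hypothetical example, and treating an undefined result as a deny is a design assumption.

```python
import json
import urllib.request

OPA_URL = "http://localhost:8181/v1/data/skeleton/allow"  # from the checklist above


def build_opa_request(input_doc: dict, url: str = OPA_URL) -> urllib.request.Request:
    """Wrap the input document in the {"input": ...} envelope OPA expects."""
    body = json.dumps({"input": input_doc}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_decision(response_body: bytes) -> bool:
    # OPA omits "result" when the rule is undefined; this sketch treats that as deny.
    return json.loads(response_body).get("result") is True


def is_allowed(input_doc: dict, opener=urllib.request.urlopen) -> bool:
    with opener(build_opa_request(input_doc)) as response:
        return parse_decision(response.read())
```

Calling `is_allowed({"subject": "analyst", "action": "query"})` against a running stack would return the decision; the field names in that input are illustrative and must match whatever `skeleton.rego` actually reads.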

Risks & Mitigations

| Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|
| llama.cpp build failure | Medium | High | Provide pre-built wheels in CI |
| GGUF model download slow | Low | Medium | Cache in Docker volume |
| pgvector dimension mismatch | Low | High | Validate dimensions at adapter layer |
| OPA policy too restrictive | Low | Medium | Start with permissive policy, iterate |
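
The "validate dimensions at adapter layer" mitigation might look like the sketch below, which refuses to serialize a vector whose length differs from the column's declared size. The 384 figure is taken from the expected E2E output in this plan; the real value depends on whichever embedding model is deployed. The function and exception names are hypothetical.

```python
import math
from typing import Sequence

EXPECTED_DIMENSIONS = 384  # assumption: matches the vector(N) column definition


class DimensionMismatchError(ValueError):
    """Raised before any SQL is sent, so a bad vector never reaches the database."""


def to_pgvector_literal(
    vector: Sequence[float], expected: int = EXPECTED_DIMENSIONS
) -> str:
    """Validate a vector and render it in pgvector's text format, e.g. [1.0,2.5]."""
    if len(vector) != expected:
        raise DimensionMismatchError(
            f"expected {expected} dimensions, got {len(vector)}"
        )
    if not all(math.isfinite(x) for x in vector):
        raise ValueError("vector contains NaN or infinite values")
    return "[" + ",".join(repr(float(x)) for x in vector) + "]"


print(to_pgvector_literal([1, 2.5, -0.25], expected=3))
```

The resulting string can be passed as a query parameter and cast with `::vector` on the SQL side. Failing at the adapter gives a clearer error than the one the database raises when a vector does not fit the column.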

Open Questions

  1. GGUF model hosting: Download on-demand or bundle in Docker image?
  2. Semantic Kernel language: Python or TypeScript SDK?

References

| Document | Purpose |
|---|---|
| P001-SKELETON | Validated specs |
| P019b LLM Provider | LiteLLM integration |
| P005 Knowledge Graph | Oxigraph patterns |
| dependency_gap_analysis.md | Technology selection |