Implementation Plan: Semantic Core Services

Implement the Semantic Core pillar: SEA-DSL registry/validation, policy evaluation, Knowledge Graph service, and Temporal DB integration for long-term semantic memory.

Provenance & Traceability

Architectural Decisions (ADRs)

| ADR ID | Decision Title | Impact on This Plan |
| --- | --- | --- |
| ADR-004 | Semantic Core Formalization | SEA-DSL as source of truth. |
| ADR-006 | Knowledge Layer Graph | KGS is a first-class projection of semantic core. |

Product Requirements (PRDs)

| PRD ID | Requirement Title | Satisfied By (SDS) | Acceptance Criteria |
| --- | --- | --- | --- |
| PRD-001 | Unified Business Semantics | SDS-002 | DSL parsing + validation |
| PRD-002 | Automated Rule Enforcement | SDS-002 | Policy evaluation |
| PRD-003 | Semantic Context for AI | SDS-003 | Queryable semantic context |
| PRD-004 | Dynamic Knowledge Discovery | SDS-003 | SPARQL query support |

Software Design Specifications (SDS)

| SDS ID | Service/Component | Bounded Context | SEA-DSL Spec File | Implementation Status |
| --- | --- | --- | --- | --- |
| SDS-002 | Semantic Core DSL Service | semantic-core | N/A | MVP (doc) |
| SDS-003 | Knowledge Graph Service | semantic-core | N/A | MVP (doc) |
| SDS-015 | Temporal Database Service | semantic-core | N/A | Designed |

Technical Stack Selection

Updated 2025-12-30: Selected local-first stack for sovereignty and governance. Added installation details per gap analysis section 1.3.

| Capability | Selection | Version | Justification |
| --- | --- | --- | --- |
| Knowledge Graph | oxigraph (Rust) | 0.4.x | High-performance RDF store with SPARQL/SHACL support. |
| Vector Store | pgvector (PostgreSQL) | 0.8.x | ACID guarantees, relational joins, reliable scaling. |
| Time-Series DB | timescaledb (PostgreSQL) | 2.17.x | Track metric evolution and pattern detection with native PostgreSQL integration. |
| Embedding Model | EmbeddingGemma 300M | 4-bit GGUF via llama.cpp | Deterministic, local-first embeddings (<100ms inference). |

Installation Commands

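-- PostgreSQL extensions (run in a migration or a docker-compose init script)
CREATE EXTENSION IF NOT EXISTS vector;       -- pgvector
CREATE EXTENSION IF NOT EXISTS timescaledb;  -- timescaledb (needs shared_preload_libraries = 'timescaledb')

# llama.cpp for embeddings
mise use llama-cpp@latest

# Python bindings (quote the extra so shells such as zsh do not expand the brackets)
pip install pgvector "psycopg[binary]" llama-cpp-python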

Model Configuration

| Model | File | Dimensions | Notes |
| --- | --- | --- | --- |
| EmbeddingGemma 300M 4-bit | EmbeddingGemma-300M-Q4_K_M.gguf | 768 | Store in models/ directory. Pin seed for determinism. |
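
Since the configured model emits 768-dimensional vectors, a guard at the write boundary can catch a mismatched model file before a vector reaches storage. The following is a minimal sketch under stated assumptions: `EMBEDDING_DIMENSIONS` and `toPgVectorLiteral` are hypothetical names that do not appear in this plan, and the only external fact relied on is pgvector's bracketed text input format (`[v1,v2,...]`).

```typescript
// Hypothetical constant mirroring the "Dimensions" value in the model configuration.
const EMBEDDING_DIMENSIONS = 768;

/**
 * Serialize an embedding into pgvector's text input format, e.g. "[0.1,0.2]".
 * Throws if the vector has the wrong length or contains NaN/Infinity.
 */
function toPgVectorLiteral(
  embedding: number[],
  expectedDimensions: number = EMBEDDING_DIMENSIONS
): string {
  if (embedding.length !== expectedDimensions) {
    throw new Error(
      "Expected " + expectedDimensions + " dimensions, got " + embedding.length
    );
  }
  for (let i = 0; i < embedding.length; i++) {
    // Global isFinite rejects NaN, Infinity and -Infinity.
    if (!isFinite(embedding[i])) {
      throw new Error("Non-finite value at index " + i);
    }
  }
  return "[" + embedding.join(",") + "]";
}
```

The returned string could then be bound as a query parameter and cast with `::vector` on the PostgreSQL side; whether the real adapter serializes by hand or relies on a driver helper is left open here.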

Architecture and Design

Components & Ports (No Technical Debt)

To prevent coupling to specific drivers (like pgvector or Oxigraph), all domain services communicate via Hexagonal Ports.

// Semantic Memory Port (Wraps pgvector + Gemma)
interface SemanticMemoryPort {
  storeObservation(content: string, metadata: Metadata): Promise<VectorId>;
  findSimilar(query: string, limit: number): Promise<ScoredObservation[]>;
}

// Knowledge Graph Port (Wraps Oxigraph)
interface KnowledgeGraphPort {
  querySparql(query: string): Promise<SparqlResult>;
  validateShacl(shapeId: string, dataId: string): Promise<ValidationReport>;
}
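
To illustrate how a port keeps domain code independent of the driver, here is a sketch of an in-memory adapter for `SemanticMemoryPort`. It is not the planned pgvector adapter. The shapes of `VectorId`, `Metadata`, and `ScoredObservation` are assumptions (the plan names them but does not define them), and `toyEmbed`, `InMemorySemanticMemory`, and `recall` are hypothetical helpers. The letter-count embedding merely stands in for EmbeddingGemma so the example runs without a model.

```typescript
// Assumed shapes: the plan names these types but does not define them.
type VectorId = string;
type Metadata = Record<string, string>;
interface ScoredObservation {
  id: VectorId;
  content: string;
  metadata: Metadata;
  score: number; // cosine similarity, higher means closer
}

interface SemanticMemoryPort {
  storeObservation(content: string, metadata: Metadata): Promise<VectorId>;
  findSimilar(query: string, limit: number): Promise<ScoredObservation[]>;
}

// Toy stand-in for EmbeddingGemma: a 26-slot count of the letters a-z.
function toyEmbed(text: string): number[] {
  const vector: number[] = [];
  for (let i = 0; i < 26; i++) vector.push(0);
  const lower = text.toLowerCase();
  for (let i = 0; i < lower.length; i++) {
    const index = lower.charCodeAt(i) - 97; // 'a' has char code 97
    if (index >= 0 && index < 26) vector[index] += 1;
  }
  return vector;
}

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // zero vectors match nothing
  return dot / Math.sqrt(normA * normB);
}

// In-memory adapter: satisfies the port without touching PostgreSQL.
class InMemorySemanticMemory implements SemanticMemoryPort {
  private rows: { id: VectorId; content: string; metadata: Metadata; embedding: number[] }[] = [];

  storeObservation(content: string, metadata: Metadata): Promise<VectorId> {
    const id = "obs-" + (this.rows.length + 1);
    this.rows.push({ id: id, content: content, metadata: metadata, embedding: toyEmbed(content) });
    return Promise.resolve(id);
  }

  findSimilar(query: string, limit: number): Promise<ScoredObservation[]> {
    const queryEmbedding = toyEmbed(query);
    const scored = this.rows.map((row) => ({
      id: row.id,
      content: row.content,
      metadata: row.metadata,
      score: cosineSimilarity(queryEmbedding, row.embedding),
    }));
    scored.sort((x, y) => y.score - x.score); // best match first
    return Promise.resolve(scored.slice(0, limit));
  }
}

// Domain code depends only on the port, so adapters can be swapped freely.
function recall(memory: SemanticMemoryPort, question: string): Promise<string[]> {
  return memory.findSimilar(question, 3).then((hits) => hits.map((h) => h.content));
}
```

An adapter like this is mainly useful in unit tests; a production adapter would implement the same two methods on top of pgvector and the real embedding model, and `recall` would not need to change.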

Design Principles Applied

Dependency Justification

| Dependency | Type | Version | Justification |
| --- | --- | --- | --- |
| oxigraph | Driver | latest | Production-grade RDF store |
| pgvector | Driver | 0.5+ | Vector storage in Postgres |
| llama.cpp | Runtime | latest | Run EmbeddingGemma locally |

Proposed Cycles

| Cycle | Branch | Wave | Files Modified | Files Created | Specs Implemented |
| --- | --- | --- | --- | --- | --- |
| C1A | cycle/p019-c1a-dsl-service | 1 | docs/specs/semantic-core/sds/002-semantic-core-dsl-service.md | | DSL registry + policy evaluation |
| C1B | cycle/p019-c1b-kgs-service | 1 | docs/specs/semantic-core/sds/003-knowledge-graph-service.md | | SPARQL queries + snapshots |
| C2A | cycle/p019-c2a-temporal-db | 2 | docs/specs/semantic-core/sds/015-temporal-database-service.md | | Pattern Oracle integration |

Task Breakdown

Wave 1 (Parallel)

Wave 2 (Depends on Wave 1)


Validation & Verification

Spec Validation

Implementation Validation


Risks & Mitigation

| Risk | Likelihood | Impact | Mitigation Strategy |
| --- | --- | --- | --- |
| Graph queries become too slow | Medium | Medium | Start with SPARQL-first + materialized projections. |
| Embedding model drift | Low | High | Version control GGUF model files; pin seeds. |
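
One way to make the embedding-drift mitigation checkable at startup is to compare a committed manifest against what the runtime actually loaded. The sketch below is an assumption-laden illustration: `ModelPin`, its field names, and `diffModelPin` are all hypothetical. Computing the checksum of the GGUF file is out of scope; it is assumed to be supplied by the caller.

```typescript
// Hypothetical manifest shape; none of these field names come from the plan.
interface ModelPin {
  file: string;       // GGUF file name under models/
  sha256: string;     // checksum recorded when the model file was committed
  dimensions: number; // embedding width the storage schema expects
  seed: number;       // pinned seed
}

// Return the names of the fields that differ; an empty array means no mismatch.
function diffModelPin(expected: ModelPin, actual: ModelPin): string[] {
  const mismatches: string[] = [];
  if (expected.file !== actual.file) mismatches.push("file");
  // Hex digests are compared case-insensitively.
  if (expected.sha256.toLowerCase() !== actual.sha256.toLowerCase()) {
    mismatches.push("sha256");
  }
  if (expected.dimensions !== actual.dimensions) mismatches.push("dimensions");
  if (expected.seed !== actual.seed) mismatches.push("seed");
  return mismatches;
}

// Example values; the checksum and seed are placeholders, not real data.
const committedPin: ModelPin = {
  file: "EmbeddingGemma-300M-Q4_K_M.gguf",
  sha256: "placeholder-checksum",
  dimensions: 768,
  seed: 42,
};
```

A service could refuse to start, or at least log a warning, whenever `diffModelPin` returns a non-empty list; which reaction is appropriate is a policy decision this plan does not specify.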