SDS-015: Temporal Database Service

sds:
  id: SDS-015
  title: Temporal Database Service
  bounded_context: semantic-core
  satisfies_prds: [PRD-020]
  satisfies_adrs: [ADR-004, ADR-029, ADR-030]
  version: 1.2.0
  status: approved
  owners: ["sea-core-team"]
  created: 2025-12-01
  updated: 2025-12-30

Document Type

Software Design Specification (SDS)

Purpose

Specifies the Temporal Database Subsystem (system.temporal.api), which functions as the “Semantic Memory” of the enterprise. It includes the Vector Store, Pattern Oracle, and Metric Indexer.

Dependencies

| Dependency | Type | Version | Justification |
|---|---|---|---|
| pgvector | PostgreSQL Extension | 0.8.x | ACID guarantees, relational joins, reliable scaling for vector storage |
| TimescaleDB | PostgreSQL Extension | 2.17.x | Track metric evolution and pattern detection with native PostgreSQL integration |
| EmbeddingGemma | GGUF Model | 300M-Q4_K_M | Deterministic, local-first embeddings (<100ms inference) |
| llama.cpp | Runtime | latest | Run EmbeddingGemma locally |

Port References

This service integrates with the following hexagonal ports:

| Port | Location | Purpose |
|---|---|---|
| `VectorStorePort` | libs/skeleton/vector/ports | Vector storage and similarity search |
| `EmbeddingGeneratorPort` | libs/skeleton/embedding/ports | Generate embeddings from text |

VectorStorePort Methods

store(input: StoreVectorInput): Promise<void> — Store embedding with content
storeBatch(inputs: StoreVectorInput[]): Promise<void> — Batch store embeddings
similaritySearch(input: SimilaritySearchInput): Promise<VectorSearchResult[]> — Find similar vectors
delete(id: string): Promise<void> — Delete vector by ID
exists(id: string): Promise<boolean> — Check if vector exists
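
The method list above can be exercised without a database through an in-memory test double. The following is a minimal Python sketch, not the pgvector adapter: the field names on `StoreVectorInput` and `VectorSearchResult`, the flattened `(query, k)` search arguments, and the use of cosine similarity are all assumptions, since this SDS does not define those shapes. Methods are synchronous for brevity, whereas the port itself returns promises.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class StoreVectorInput:
    # Field names are assumptions; the SDS does not define this type's shape.
    id: str
    content: str
    embedding: list[float]


@dataclass(frozen=True)
class VectorSearchResult:
    # Field names are assumptions as well.
    id: str
    content: str
    similarity: float


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between a and b; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class InMemoryVectorStore:
    """Test double mirroring the VectorStorePort method list."""

    def __init__(self) -> None:
        self._rows: dict[str, StoreVectorInput] = {}

    def store(self, item: StoreVectorInput) -> None:
        self._rows[item.id] = item  # storing the same ID again overwrites

    def store_batch(self, items: list[StoreVectorInput]) -> None:
        for item in items:
            self.store(item)

    def similarity_search(self, query: list[float], k: int = 5) -> list[VectorSearchResult]:
        # Brute-force scan: score every row, then keep the k most similar.
        scored = [
            VectorSearchResult(r.id, r.content, cosine_similarity(query, r.embedding))
            for r in self._rows.values()
        ]
        return sorted(scored, key=lambda r: r.similarity, reverse=True)[:k]

    def delete(self, id: str) -> None:
        self._rows.pop(id, None)  # deleting a missing ID is a no-op

    def exists(self, id: str) -> bool:
        return id in self._rows
```

A double like this keeps unit tests for the Pattern Oracle independent of PostgreSQL; the brute-force scan is only suitable for small fixtures.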

EmbeddingGeneratorPort Methods

generate(text: string): Promise<number[]> — Generate embedding from text
generateBatch(texts: string[]): Promise<number[][]> — Batch generate embeddings
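
For tests that need embeddings without loading the GGUF model, a deterministic stand-in with the same two methods can be useful. The sketch below is hypothetical and not semantically meaningful: it derives a unit-length vector from SHA-256 digests of the text, so it only guarantees stable output and the 768-dimension shape listed under Model Configuration. The class name `HashEmbeddingGenerator` is an assumption.

```python
import hashlib
import math

EMBEDDING_DIM = 768  # matches the dimension listed under Model Configuration


class HashEmbeddingGenerator:
    """Deterministic stand-in for EmbeddingGeneratorPort (stable, not semantic)."""

    def __init__(self, dim: int = EMBEDDING_DIM) -> None:
        self.dim = dim

    def generate(self, text: str) -> list[float]:
        values: list[float] = []
        counter = 0
        # Chain digests until enough bytes are available for `dim` components.
        while len(values) < self.dim:
            digest = hashlib.sha256(f"{counter}:{text}".encode("utf-8")).digest()
            values.extend((b - 128) / 128 for b in digest)  # map bytes to [-1, 1)
            counter += 1
        values = values[: self.dim]
        # Normalise to unit length so cosine similarity behaves as expected.
        norm = math.sqrt(sum(v * v for v in values)) or 1.0
        return [v / norm for v in values]

    def generate_batch(self, texts: list[str]) -> list[list[float]]:
        return [self.generate(t) for t in texts]
```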

API Specification: System.Temporal.Api

@namespace "system.temporal.api"
@version "1.2.0"
@owner "architect-sovereign"

// ==========================================================================
// ENTITIES: The Temporal Database Components
// ==========================================================================

Entity "VectorStore" in system.temporal.api
@rationale "The embedded storage engine (pgvector) for high-dimensional semantic embeddings."
// Core: implementation using embedding-gemma-300M and pgvector.

Entity "PatternOracle" in system.temporal.api
@rationale "The reasoning engine that recommends architectural and code patterns via similarity rankings."
// Core: uses multi-factor ranking (similarity + performance + governance).

Entity "MetricIndexer" in system.temporal.api
@rationale "The time-series processor that indexes operational telemetry into the pattern store."
// Core: ingests data from OpenObserve/Prometheus to compute success and error rates.

// ==========================================================================
// RESOURCES: Semantic Memory Artifacts
// ==========================================================================

Resource "PatternEmbedding" in system.temporal.api
@rationale "A vector representation of a spec, flow, or code snippet."

Resource "PatternRecommendation" in system.temporal.api
@rationale "A ranked set of candidate patterns with associated metadata and performance scores."

Resource "OperationalMetric" in system.temporal.api
@rationale "Time-series data points (latency, error_rate, drift) tied to a specific Identity Token."

// ==========================================================================
// PORTS: Hexagonal Boundaries
// ==========================================================================

Port "VectorStore" direction outbound
@rationale "Interface to pgvector for semantic storage and retrieval."

Port "EmbeddingGenerator" direction outbound
@rationale "Interface to local embedding model (EmbeddingGemma via llama.cpp)."

// ==========================================================================
// FLOWS: The Temporal API Surface
// ==========================================================================

// --- Vector Store (pgvector) Operations ---

/**
 * @cqrs { "kind": "command" }
 * @tx { "transactional": true }
 * @idempotency { "enabled": true, "key": "content_hash" }
 */
Flow "StoreEmbedding" from "SEACore" to "VectorStore"
@rationale "Indexes new semantic deltas or code patterns into the vector database."

/**
 * @cqrs { "kind": "query" }
 * @read_model { "name": "VectorSimilarityProjection" }
 */
Flow "RetrieveSimilar" from "PatternOracle" to "VectorStore"
@rationale "Performs a k-nearest neighbor search to find patterns with high semantic similarity."

// --- Pattern Oracle Operations ---

/**
 * @cqrs { "kind": "query" }
 * @read_model { "name": "PatternSearchProjection" }
 */
Flow "QueryPatterns" from "AI-Agent" to "PatternOracle"
@rationale "Allows agents to search for historical solutions using natural language or spec snippets."

/**
 * @cqrs { "kind": "event" }
 * @outbox { "mode": "required" }
 */
Flow "RecommendPatterns" from "PatternOracle" to "VibesPro™"
@rationale "Proactively suggests templates during the generation planning phase."

// --- Metric Indexer Operations ---

/**
 * @cqrs { "kind": "command" }
 * @tx { "transactional": true }
 * @idempotency { "enabled": true, "key": "metric_id+timestamp" }
 */
Flow "IndexTelemetry" from "GovernedSpeed™" to "MetricIndexer"
@rationale "Ingests runtime governance signals and performance metrics for pattern weighting."

/**
 * @cqrs { "kind": "event" }
 * @outbox { "mode": "required" }
 */
Flow "RefreshStats" from "MetricIndexer" to "PatternOracle"
@rationale "Updates the 'Pattern Oracle' with new success rates and latency distributions."

// ==========================================================================
// POLICIES: Temporal Invariants
// ==========================================================================

Policy "SemanticMemoryIntegrity" per Constraint Obligation priority 10
@rationale "Every pattern in the Vector Store must be uniquely identified by an Identity Token."
as: forall p in resources where p.name = "PatternEmbedding": exists_unique n in kgs.nodes: (n.id = p.token_id)

Policy "RecommendationSafetyGate" per Constraint Obligation priority 10
@rationale "Patterns with a success_rate below the threshold must be flagged as high-risk."
as: forall r in resources where r.name = "PatternRecommendation":
    if r.success_rate < 0.7 then r.risk_level = "High"

Policy "LocalFirstPrivacy" per Constraint Obligation priority 10
@rationale "Embeddings and metric indexing must occur on-device by default to maintain privacy."
as: forall f in flows where f.from = "MetricIndexer": (f.is_local = true)
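
Two of the policies above translate directly into executable checks. The following Python sketch is illustrative only: the 0.7 threshold and the "High" risk level come from RecommendationSafetyGate, while the `pattern_id` field, the `"Unrated"` default, and both function names are assumptions introduced for the example.

```python
from collections import Counter
from dataclasses import dataclass

SUCCESS_RATE_THRESHOLD = 0.7  # from the RecommendationSafetyGate policy


@dataclass
class PatternRecommendation:
    pattern_id: str              # hypothetical field
    success_rate: float
    risk_level: str = "Unrated"  # hypothetical default; the SDS only names "High"


def apply_safety_gate(recs: list[PatternRecommendation]) -> list[PatternRecommendation]:
    """Flag every recommendation whose success_rate is below the threshold."""
    for rec in recs:
        if rec.success_rate < SUCCESS_RATE_THRESHOLD:
            rec.risk_level = "High"
    return recs


def integrity_violations(embedding_token_ids: list[str], kgs_node_ids: list[str]) -> list[str]:
    """Return token IDs that do not match exactly one KGS node (exists_unique)."""
    counts = Counter(kgs_node_ids)
    return [t for t in embedding_token_ids if counts[t] != 1]
```

A token that matches no node and a token that matches two nodes both count as SemanticMemoryIntegrity violations, because `exists_unique` requires exactly one match.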

Component Details

Vector Store (pgvector): PostgreSQL extension for vector storage with ACID guarantees.
Pattern Oracle: Recommender system calculating combinedScore from similarity + performance + governance.
Metric Indexer: Bridge between Runtime and Semantic Memory, ingesting telemetry from observability stack.
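
The Pattern Oracle's combinedScore can be pictured as a weighted sum of the three factors. This SDS does not specify the weights or the scale of each factor, so the sketch below assumes every factor is normalised to [0, 1] and uses placeholder weights; `Candidate`, `WEIGHTS`, and `rank` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    pattern_id: str
    similarity: float   # e.g. cosine similarity from the Vector Store, in [0, 1]
    performance: float  # e.g. success rate from the Metric Indexer, in [0, 1]
    governance: float   # e.g. share of governance checks passed, in [0, 1]


# Illustrative weights only; the SDS does not specify them.
WEIGHTS = {"similarity": 0.5, "performance": 0.3, "governance": 0.2}


def combined_score(c: Candidate) -> float:
    """Weighted sum of the three ranking factors."""
    return (
        WEIGHTS["similarity"] * c.similarity
        + WEIGHTS["performance"] * c.performance
        + WEIGHTS["governance"] * c.governance
    )


def rank(candidates: list[Candidate]) -> list[Candidate]:
    """Order candidates from highest to lowest combined score."""
    return sorted(candidates, key=combined_score, reverse=True)
```

With these placeholder weights, a pattern that is slightly less similar but has a much better operational record can outrank the closest semantic match, which is the point of multi-factor ranking.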

Model Configuration

| Model | File | Dimensions | Notes |
|---|---|---|---|
| EmbeddingGemma 300M 4-bit | `EmbeddingGemma-300M-Q4_K_M.gguf` | 768 | Store in `models/` directory. Pin seed for determinism. |

Installation Commands

-- PostgreSQL extensions (in docker-compose or migration)
CREATE EXTENSION IF NOT EXISTS vector;       -- pgvector
CREATE EXTENSION IF NOT EXISTS timescaledb;  -- TimescaleDB

# llama.cpp for embeddings
mise use llama-cpp@latest

# Python bindings (quote the extra so the shell does not expand the brackets)
pip install pgvector "psycopg[binary]" llama-cpp-python

CQRS Flow Summary

| Flow | Kind | Key Annotations |
|---|---|---|
| StoreEmbedding | command | transactional, idempotent by content_hash |
| RetrieveSimilar | query | reads VectorSimilarityProjection |
| QueryPatterns | query | reads PatternSearchProjection |
| RecommendPatterns | event | outbox required for reliable delivery |
| IndexTelemetry | command | transactional, idempotent by metric_id+timestamp |
| RefreshStats | event | outbox required for reliable delivery |
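
The interplay between an idempotent, transactional command and an outbox-backed event can be sketched for the IndexTelemetry and RefreshStats pair. The example below is an assumption-heavy illustration: it uses Python's built-in `sqlite3` as a stand-in for PostgreSQL, and the table names, column names, and payload shape are hypothetical. The composite primary key plays the role of the `metric_id+timestamp` idempotency key, and the outbox row is written in the same transaction as the metric.

```python
import json
import sqlite3


def open_db() -> sqlite3.Connection:
    """Create an in-memory database with hypothetical metrics and outbox tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE metrics (
            metric_id TEXT NOT NULL,
            ts        TEXT NOT NULL,
            value     REAL NOT NULL,
            PRIMARY KEY (metric_id, ts)  -- idempotency key: metric_id+timestamp
        );
        CREATE TABLE outbox (
            id      INTEGER PRIMARY KEY AUTOINCREMENT,
            event   TEXT NOT NULL,
            payload TEXT NOT NULL
        );
        """
    )
    return conn


def index_telemetry(conn: sqlite3.Connection, metric_id: str, ts: str, value: float) -> bool:
    """Return True if the metric was newly indexed, False if it was a replay."""
    with conn:  # one transaction: commit on success, roll back on error
        cur = conn.execute(
            "INSERT OR IGNORE INTO metrics (metric_id, ts, value) VALUES (?, ?, ?)",
            (metric_id, ts, value),
        )
        if cur.rowcount == 0:
            return False  # duplicate key: nothing stored, no event queued
        # Queue the RefreshStats event atomically with the metric write.
        conn.execute(
            "INSERT INTO outbox (event, payload) VALUES (?, ?)",
            ("RefreshStats", json.dumps({"metric_id": metric_id, "ts": ts})),
        )
        return True
```

Replaying the same command leaves exactly one metric row and one outbox row. A separate relay process, not shown here, would read the outbox and deliver the events.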

Analogy

If the KGS is the “working memory”, the Vector Store is the “long-term library”, the Metric Indexer is the research assistant, and the Pattern Oracle is the head librarian.

Testing Strategy

Unit Tests: Embedding generation determinism (fixed seed → identical output)
Integration Tests: Vector storage and retrieval with pgvector
Performance Tests: Similarity search latency under load
Determinism Tests: Same text → same embedding → same storage ID
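
The determinism chain (same text, same embedding, same storage ID) can be checked with a small helper that accepts any embedding function. This is a sketch under stated assumptions: this SDS does not define how storage IDs are derived, so the example hashes the packed embedding with SHA-256, and both `storage_id` and `unstable_texts` are hypothetical names.

```python
import hashlib
import struct
from typing import Callable, Sequence


def storage_id(embedding: Sequence[float]) -> str:
    """Assumed ID scheme: SHA-256 over the embedding packed as little-endian doubles."""
    packed = struct.pack(f"<{len(embedding)}d", *embedding)
    return hashlib.sha256(packed).hexdigest()


def unstable_texts(
    generate: Callable[[str], Sequence[float]],
    texts: Sequence[str],
    runs: int = 3,
) -> list[str]:
    """Return the texts whose storage ID differs across repeated generations."""
    unstable = []
    for text in texts:
        ids = {storage_id(generate(text)) for _ in range(runs)}
        if len(ids) > 1:
            unstable.append(text)
    return unstable
```

A determinism test would then assert that `unstable_texts(generator.generate, fixtures)` is empty. Exact equality is intentional here, since any floating-point drift would change the derived ID.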

Source: transcripts/SEA-Forge™-notebookLM-notes.md
Updated: 2025-12-30 - Added @cqrs annotations per Flow Annotation Contract

References

SDS-050: Semantic Identity Provenance