PRD-INGEST-001: SEA-DSL Policy Ingestion
Type
Functional
Priority
Critical
MVP Status
✅ MVP (Walking Skeleton Cycle S1A)
When a user provides a SEA-DSL policy file, the system shall parse, validate, and index it for future retrieval and governance enforcement.
User Story
As a policy author, I want to ingest a SEA-DSL policy file into the system, so that the policy can be queried, enforced, and governed at runtime.
Acceptance Criteria
AC-001.1: Parse SEA-DSL File
- Given a valid .sea policy file
- When the ingest service processes the file
- Then it shall produce a valid AST JSON representation
AC-001.2: Generate RDF Triples
- Given a valid AST JSON
- When the triple generator processes it
- Then it shall produce RDF triples in N-Triples format
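A minimal sketch of the AST-to-triples step. The AST shape (a `name` plus a list of `rules`, each with an `expression`) and the `https://example.org/sea#` namespace are assumptions made for illustration; the real shapes come from tree-sitter-sea and the Semantic Core vocabulary. Output lines are sorted so the same AST always yields the same N-Triples text.

```python
from urllib.parse import quote

# Hypothetical namespace, used only for this sketch.
NS = "https://example.org/sea#"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def escape_literal(value: str) -> str:
    """Escape a string for use inside an N-Triples literal."""
    return (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def ast_to_ntriples(ast: dict) -> list[str]:
    """Turn the assumed AST shape into sorted N-Triples lines."""
    # Percent-encode the name so the resulting IRI stays valid.
    policy = f"<{NS}policy/{quote(ast['name'], safe='')}>"
    lines = [
        f"{policy} <{RDF_TYPE}> <{NS}Policy> .",
        f'{policy} <{NS}name> "{escape_literal(ast["name"])}" .',
    ]
    for index, rule in enumerate(ast.get("rules", [])):
        rule_iri = f"{policy[:-1]}/rule/{index}>"
        lines.append(f"{policy} <{NS}hasRule> {rule_iri} .")
        lines.append(
            f'{rule_iri} <{NS}expression> "{escape_literal(rule["expression"])}" .'
        )
    return sorted(lines)
```

Each returned string is one triple terminated by ` .`, so joining the list with newlines gives a document that a triple store can load as N-Triples.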
AC-001.3: Store Triples in Oxigraph
- Given valid RDF triples
- When the storage adapter writes them
- Then they shall be queryable via SPARQL
AC-001.4: Generate Embeddings
- Given policy text content
- When the embedding generator processes it
- Then it shall produce a 384-dimensional vector using EmbeddingGemma
AC-001.5: Store Embeddings in pgvector
- Given a valid embedding vector
- When the storage adapter writes it
- Then it shall be searchable via cosine similarity
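For reference, cosine similarity between two embeddings can be sketched in plain Python as below. pgvector's `<=>` operator returns cosine distance, which is one minus this value, so a smaller distance means a closer match. The function names are illustrative and not part of the storage adapter.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def cosine_distance(a: list[float], b: list[float]) -> float:
    """Distance form: 1 - cosine similarity."""
    return 1.0 - cosine_similarity(a, b)
```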
AC-001.6: Idempotent Re-Ingestion
- Given a previously ingested policy
- When the same policy is ingested again
- Then it shall update existing records without duplication
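One way to satisfy this criterion is to key every record on a stable policy identifier and compare a content hash before writing. The sketch below uses an in-memory dictionary as a stand-in for the real storage adapters; the class name, the `policy_id` key, and the returned status strings are assumptions for illustration.

```python
import hashlib


class InMemoryPolicyStore:
    """Stand-in for the real storage adapters, keyed by a stable policy ID."""

    def __init__(self) -> None:
        self.records: dict[str, dict] = {}

    def upsert(self, policy_id: str, source: str) -> str:
        """Insert or replace a policy; never create a second record."""
        digest = hashlib.sha256(source.encode("utf-8")).hexdigest()
        existing = self.records.get(policy_id)
        if existing is not None and existing["sha256"] == digest:
            return "unchanged"  # identical content, nothing to write
        outcome = "created" if existing is None else "updated"
        self.records[policy_id] = {"sha256": digest, "source": source}
        return outcome
```

Ingesting the same file twice leaves exactly one record, and the second call reports `unchanged` rather than writing again.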
AC-001.7: Error Handling
- Given an invalid .sea file
- When the ingest service processes it
- Then it shall return a clear parse error with line/column information
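A possible shape for the error payload is sketched below. The `ParseError` type and the `file:line:column: message` layout are assumptions, not a defined contract. Tree-sitter reports node positions as zero-based row and column, so a conversion to the one-based values users expect is likely needed somewhere; the helper here shows the equivalent conversion from a character offset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ParseError:
    message: str
    line: int    # 1-based
    column: int  # 1-based

    def format(self, filename: str) -> str:
        """Render in the common `file:line:column: message` style."""
        return f"{filename}:{self.line}:{self.column}: {self.message}"


def position_from_offset(source: str, offset: int) -> tuple[int, int]:
    """Convert a character offset into a 1-based (line, column) pair."""
    if not 0 <= offset <= len(source):
        raise ValueError("offset out of range")
    line = source.count("\n", 0, offset) + 1
    # rfind returns -1 when no newline precedes the offset,
    # which makes the first character of the file column 1.
    last_newline = source.rfind("\n", 0, offset)
    return line, offset - last_newline
```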
Dependencies
- tree-sitter-sea (SEA-DSL grammar)
- Oxigraph (RDF triple store)
- pgvector (vector storage extension)
- llama.cpp + EmbeddingGemma model
- PostgreSQL database
- ADRs: ADR-006 (Ingest Pipeline), ADR-004 (Semantic Core)
- SDS: SDS-INGEST-010 (Ingest Service)
- Plan: P001-SKELETON (Walking Skeleton)
Success Metrics
- Parse Success Rate: >99% for valid .sea files
- Ingestion Latency: <500ms for a typical policy file (p95)
- Storage Write Success: 100% (atomic transaction)
- Idempotency Check: 100% (no duplicate records)
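The p95 latency target can be checked against recorded samples with a nearest-rank percentile, sketched below. The function names and the choice of the nearest-rank method are assumptions; a metrics backend may interpolate differently and report slightly different values.

```python
import math


def percentile_nearest_rank(samples: list[float], percent: float) -> float:
    """Return the nearest-rank percentile of a non-empty sample list."""
    if not samples:
        raise ValueError("no samples")
    if not 0 < percent <= 100:
        raise ValueError("percent must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(percent * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


def meets_latency_target(samples_ms: list[float], target_ms: float = 500.0) -> bool:
    """True when the p95 of the samples is strictly below the target."""
    return percentile_nearest_rank(samples_ms, 95) < target_ms
```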
Non-Functional Requirements
- NFR-001.1: Deterministic parsing (same input → same AST)
- NFR-001.2: Local-first (no external API dependencies)
- NFR-001.3: Atomic writes (both Oxigraph and pgvector succeed or fail together)
- NFR-001.4: Observability (emit structured logs + OpenTelemetry traces)
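Oxigraph and PostgreSQL are separate systems, so NFR-001.3 needs some coordination above the two adapters. One possible approach is a compensating write: snapshot the first store, write to both, and restore the snapshot if the second write fails. The sketch below uses hypothetical in-memory stores and a hypothetical `write_both` helper. It does not cover a process crash between the two writes, which a real design would need to address separately.

```python
_MISSING = object()  # sentinel for "key had no previous value"


class InMemoryStore:
    """Stand-in for a storage adapter; can simulate a failing write."""

    def __init__(self, fail_on_write: bool = False) -> None:
        self.data: dict = {}
        self.fail_on_write = fail_on_write

    def snapshot(self, key):
        return self.data.get(key, _MISSING)

    def write(self, key, value) -> None:
        if self.fail_on_write:
            raise RuntimeError("simulated write failure")
        self.data[key] = value

    def restore(self, key, previous) -> None:
        if previous is _MISSING:
            self.data.pop(key, None)
        else:
            self.data[key] = previous


def write_both(graph_store, vector_store, key, triples, embedding) -> None:
    """Write to both stores, undoing the first write if the second fails."""
    previous = graph_store.snapshot(key)
    graph_store.write(key, triples)
    try:
        vector_store.write(key, embedding)
    except Exception:
        graph_store.restore(key, previous)  # compensate, then re-raise
        raise
```

If the vector write raises, the graph store ends up exactly as it was before the call, so callers see either both writes or neither.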
Out of Scope (for MVP)
- Multi-file ingestion batch processing
- Incremental policy updates
- Policy versioning and rollback
- Schema migration for policy changes
Next Steps:
- Design SDS-INGEST-010
- Implement Cycle S1A vertical slice
- Write integration tests per acceptance criteria