Implementation Plan: Simulation and Replay

Execute “what-if” scenarios against deterministic manifests and snapshots, replaying commands/events to compare simulated outcomes with observed production traces. Use results to feed governance (risk/evidence) and temporal memory (pattern oracle).

Provenance & Traceability

Architectural Decisions (ADRs)

ADR ID	Decision Title	Impact on This Plan
ADR-036	Simulation and Replay Architecture	Primary: Deterministic simulation kernel with snapshot isolation.
ADR-029	Observability Stack Architecture	Replay comparison depends on trace/log correlation and privacy rules.
ADR-032	NATS JetStream Messaging	Event replay should leverage versioned subjects and durable streams when applicable.

Product Requirements (PRDs)

PRD ID	Requirement Title	Satisfied By (SDS)	Acceptance Criteria
PRD-024	Simulation & Replay Platform	SDS-053	Deterministic simulation from snapshot + reproducible diffs

Software Design Specifications (SDS)

SDS ID	Service/Component	Bounded Context	SEA-DSL Spec File	Implementation Status
SDS-053	Simulation and Replay Service	`shared`	N/A	Draft

Dependencies (Existing Specs This Plan Builds On)

Snapshots and semantic state: docs/specs/semantic-core/sds/003-knowledge-graph-service.md
Manifest projection: docs/specs/shared/reference/011-manifest-schema.md
Observability envelope: docs/specs/shared/sds/030-semantic-observability.md
Evidence submission: docs/specs/shared/sds/047-governedspeed-governance-runtime.sds.yaml
Temporal memory: docs/specs/semantic-core/sds/015-temporal-database-service.md

Architecture and Design

Design Principles Applied

Isomorphism: Simulation inputs are (manifest + snapshot + scenario), and outputs are structured artifacts suitable for evidence submission.
Invariants: Simulation must be deterministic given identical inputs.
Idempotency: Replays are safe to repeat; duplicate runs are deduplicated by scenario/run ID.
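The determinism and idempotency principles can be illustrated with a minimal sketch. It assumes the run ID is derived by hashing a canonical serialization of the three inputs. That derivation, along with the names `canonical_bytes`, `run_id`, `RunStore`, and `toy_simulate`, is hypothetical and not taken from SDS-053.

```python
import hashlib
import json


def canonical_bytes(obj) -> bytes:
    """Serialize with sorted keys and fixed separators so equal inputs yield equal bytes."""
    return json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")


def run_id(manifest: dict, snapshot: dict, scenario: dict) -> str:
    """Hypothetical run ID: SHA-256 over the canonical form of the three inputs."""
    payload = canonical_bytes(
        {"manifest": manifest, "snapshot": snapshot, "scenario": scenario}
    )
    return hashlib.sha256(payload).hexdigest()


class RunStore:
    """In-memory stand-in for a result store that deduplicates by run ID."""

    def __init__(self):
        self._results = {}
        self.executions = 0  # counts how often the simulation actually ran

    def run_once(self, manifest, snapshot, scenario, simulate):
        rid = run_id(manifest, snapshot, scenario)
        if rid not in self._results:
            self.executions += 1
            self._results[rid] = simulate(manifest, snapshot, scenario)
        return rid, self._results[rid]


def toy_simulate(manifest, snapshot, scenario):
    # Pure function of its inputs: no clock, no unseeded randomness, no I/O.
    return {"total": snapshot["balance"] + sum(scenario["deltas"])}
```

Submitting the same inputs twice, even with dictionary keys in a different order, maps to the same run ID, so the simulation body executes only once.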

Expected Filetree

/
├── docs/specs/**                                  # (new) PRD/SDS for simulation & replay
├── docs/specs/shared/sds/030-semantic-observability.md
└── docs/specs/shared/sds/043-risk-evidence-service.md

Proposed Cycles

Cycle	Branch	Wave	Files Modified	Files Created	Specs Implemented
C1A	`cycle/p011-c1a-specs`	1	—	`docs/specs//prd/-simulation-and-replay.md`, `docs/specs//sds/-simulation-and-replay.md`	NEW PRD/SDS
C1B	`cycle/p011-c1b-scenario-schema`	1	`schemas/**`	`schemas/simulation/*`	Scenario + result schemas
C2A	`cycle/p011-c2a-evidence-integration`	2	`docs/specs/shared/sds/043-risk-evidence-service.md` (if needed)	—	Evidence bundle format for simulations

Task Breakdown

Wave 1 (Parallel)

C1A: Author missing PRD/SDS for simulation & replay ✅ PR #124
- Implements: Deterministic simulation contract, scenario schema, replay comparison strategy
- Satisfies: PRD-024
- Files: docs/specs/shared/adr/036-simulation-replay-architecture.md, docs/specs/shared/prd/024-simulation-replay-platform.md, docs/specs/shared/sds/053-simulation-replay-service.md (approved)
C1B: Define scenario input and simulation result schemas ✅ PR #126
- Implements: Scenario IDs, expected diffs, privacy/payload mode, correlation IDs
- Satisfies: Deterministic ingestion into evidence + temporal memory
- Files: schemas/simulation/*.schema.json + examples + README
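
As an illustration of the fields C1B names (scenario ID, expected diffs, privacy/payload mode, correlation ID), the following sketch models a scenario input with basic validation. The field names, the `Scenario` class, and the payload mode values are assumptions made for this example; the authoritative definitions are in schemas/simulation/*.schema.json.

```python
from dataclasses import dataclass

# Hypothetical payload modes; the real set is defined by the schemas and SDS-030.
PAYLOAD_MODES = {"full", "redacted", "metadata_only"}


@dataclass(frozen=True)
class Scenario:
    """Illustrative scenario input; not the actual schema."""

    scenario_id: str
    correlation_id: str
    payload_mode: str
    expected_diffs: tuple = ()  # immutable so the scenario can be hashed and compared

    def __post_init__(self):
        if not self.scenario_id:
            raise ValueError("scenario_id must be non-empty")
        if not self.correlation_id:
            raise ValueError("correlation_id must be non-empty")
        if self.payload_mode not in PAYLOAD_MODES:
            raise ValueError(f"unknown payload_mode: {self.payload_mode!r}")
```

Freezing the dataclass keeps a scenario immutable once validated, which fits the requirement that identical inputs produce identical runs.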

Wave 2 (Depends on Wave 1)

C2A: Integrate simulation outputs as governance evidence artifacts ✅ PR #128
- Implements: Evidence submission and retrieval of simulation runs
- Satisfies: Evidence capture in the style of PRD-010 REQ-GS-003/004 (by analogy)
- Files: docs/specs/shared/sds/047-governedspeed-governance-runtime.sds.yaml

Validation & Verification

Spec Validation

Simulation inputs/outputs are schema-validated ✅ schemas/simulation/*.schema.json
Replay comparisons respect privacy and observability constraints (SDS-030) ✅ documented in SDS-053

Implementation Validation

Same snapshot + same scenario yields identical results (byte-identical where applicable)
Differences between simulated and observed traces are summarized as evidence artifacts
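
Both checks above can be sketched in a few lines. This is a minimal illustration that assumes results are JSON-serializable and that traces are ordered lists of events compared by position. The function names and the shape of the diff summary are hypothetical, not the format defined for evidence artifacts.

```python
import json


def canonical_bytes(result) -> bytes:
    """Stable byte form of a result, used for byte-identical comparison."""
    return json.dumps(result, sort_keys=True, separators=(",", ":")).encode("utf-8")


def is_deterministic(simulate, snapshot, scenario, runs: int = 3) -> bool:
    """Run the simulation several times and require a single distinct output."""
    outputs = {canonical_bytes(simulate(snapshot, scenario)) for _ in range(runs)}
    return len(outputs) == 1


def diff_traces(simulated: list, observed: list) -> dict:
    """Summarize positional differences between simulated and observed events."""
    mismatches = [
        {"index": i, "simulated": s, "observed": o}
        for i, (s, o) in enumerate(zip(simulated, observed))
        if s != o
    ]
    return {
        "matched": min(len(simulated), len(observed)) - len(mismatches),
        "mismatches": mismatches,
        "missing_in_simulation": observed[len(simulated):],
        "extra_in_simulation": simulated[len(observed):],
    }
```

A positional comparison is the simplest possible strategy; a real comparison would likely align events by correlation ID rather than by index.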

Open Questions

Where should the Simulation & Replay bounded context live (semantic-core, cognitive-extension, or shared)? ✅ Resolved: shared (per SDS-053)
What is the canonical “snapshot” representation to simulate against (KGS snapshot vs manifest-only vs both)? ✅ Resolved: Bundle (manifest + KG + config) - documented in snapshot.schema.json
Should replay source be production traces, JetStream streams, or both? ✅ Resolved: JetStream + traces (per SDS-053 integration points)

Risks & Mitigation

Risk	Likelihood	Impact	Mitigation Strategy
Simulation introduces non-determinism	Medium	High	Require deterministic inputs; record all random seeds; forbid time-based branching.
Replay leaks sensitive production data	Medium	High	Use payload modes and scrubbing; restrict access via SDS-031 authority rules.
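
The two mitigations can be sketched as follows: a simulation step that takes its seed and logical time as explicit inputs, and a scrubber that redacts sensitive keys before a payload leaves the replay boundary. The function names and the `SENSITIVE_KEYS` set are hypothetical; actual payload modes and scrubbing rules are governed by the specs referenced above.

```python
import random

# Hypothetical set of keys to redact; the real rules come from the privacy specs.
SENSITIVE_KEYS = frozenset({"email", "ssn", "token"})


def simulate_step(state: dict, seed: int, logical_time: int) -> dict:
    """Deterministic step: randomness and time are inputs, never ambient state."""
    rng = random.Random(seed)  # local seeded generator, not the module-level RNG
    jitter = rng.randint(0, 9)
    # The seed is recorded in the output so the run can be reproduced later.
    return {"value": state["value"] + jitter, "at": logical_time, "seed": seed}


def scrub(payload, sensitive=SENSITIVE_KEYS):
    """Return a copy of the payload with sensitive keys redacted at any depth."""
    if isinstance(payload, dict):
        return {
            key: "[REDACTED]" if key in sensitive else scrub(value, sensitive)
            for key, value in payload.items()
        }
    if isinstance(payload, list):
        return [scrub(item, sensitive) for item in payload]
    return payload
```

Passing `logical_time` as a parameter, instead of reading the wall clock, is one way to honor the rule against time-based branching.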

Rollback Strategy

Roll back by restricting simulation to synthetic scenarios only, until production replay is demonstrably safe and governed.

Reference Documents

docs/specs/shared/reference/011-manifest-schema.md
docs/specs/shared/sds/030-semantic-observability.md