Implementation Plan: Simulation and Replay

Execute “what-if” scenarios against deterministic manifests and snapshots, replaying commands/events to compare simulated outcomes with observed production traces. Use results to feed governance (risk/evidence) and temporal memory (pattern oracle).
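The replay-and-compare loop described above can be sketched as follows. This is a minimal illustration, not the SDS-053 design: `Event`, `replay`, `diff_traces`, and the `apply_credit` reducer are hypothetical names, and state is assumed to be a flat dictionary that a pure reducer function updates.

```python
from dataclasses import dataclass
from typing import Any, Callable, Iterable

State = dict[str, Any]


@dataclass(frozen=True)
class Event:
    """Hypothetical replayable event; field names are assumptions."""
    seq: int                 # position in the replayed stream
    kind: str
    payload: dict[str, Any]


def replay(snapshot: State, events: Iterable[Event],
           apply: Callable[[State, Event], State]) -> list[State]:
    """Fold events over a copy of the snapshot; return the state after each event."""
    state = dict(snapshot)   # never mutate the caller's snapshot
    trace = []
    for event in sorted(events, key=lambda e: e.seq):
        state = apply(state, event)
        trace.append(state)
    return trace


def diff_traces(simulated: list[State], observed: list[State]):
    """Return (step, key, simulated_value, observed_value) for every mismatch."""
    diffs = []
    for step, (sim, obs) in enumerate(zip(simulated, observed)):
        for key in sorted(sim.keys() | obs.keys()):
            if sim.get(key) != obs.get(key):
                diffs.append((step, key, sim.get(key), obs.get(key)))
    if len(simulated) != len(observed):
        shorter = min(len(simulated), len(observed))
        diffs.append((shorter, "<length>", len(simulated), len(observed)))
    return diffs


def apply_credit(state: State, event: Event) -> State:
    """Example pure reducer: returns a new state instead of mutating the old one."""
    if event.kind == "credit":
        return {**state, "balance": state["balance"] + event.payload["amount"]}
    return state
```

Because the reducer is pure and events are ordered by `seq`, running `replay` twice on the same snapshot and events yields the same trace, which is what makes the resulting diff reproducible.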

Provenance & Traceability

Architectural Decisions (ADRs)

| ADR ID | Decision Title | Impact on This Plan |
| --- | --- | --- |
| ADR-036 | Simulation and Replay Architecture | Primary: Deterministic simulation kernel with snapshot isolation. |
| ADR-029 | Observability Stack Architecture | Replay comparison depends on trace/log correlation and privacy rules. |
| ADR-032 | NATS JetStream Messaging | Event replay should leverage versioned subjects and durable streams when applicable. |

Product Requirements (PRDs)

| PRD ID | Requirement Title | Satisfied By (SDS) | Acceptance Criteria |
| --- | --- | --- | --- |
| PRD-024 | Simulation & Replay Platform | SDS-053 | Deterministic simulation from snapshot + reproducible diffs |

Software Design Specifications (SDS)

| SDS ID | Service/Component | Bounded Context | SEA-DSL Spec File | Implementation Status |
| --- | --- | --- | --- | --- |
| SDS-053 | Simulation and Replay Service | shared | N/A | Draft |

Dependencies (Existing Specs This Plan Builds On)


Architecture and Design

Design Principles Applied

Expected Filetree

/
├── docs/specs/**                                  # (new) PRD/SDS for simulation & replay
├── docs/specs/shared/sds/030-semantic-observability.md
└── docs/specs/shared/sds/043-risk-evidence-service.md

Proposed Cycles

| Cycle | Branch | Wave | Files Modified | Files Created | Specs Implemented |
| --- | --- | --- | --- | --- | --- |
| C1A | cycle/p011-c1a-specs | 1 | | docs/specs/*/prd/*-simulation-and-replay.md, docs/specs/*/sds/*-simulation-and-replay.md | NEW PRD/SDS |
| C1B | cycle/p011-c1b-scenario-schema | 1 | schemas/** | schemas/simulation/* | Scenario + result schemas |
| C2A | cycle/p011-c2a-evidence-integration | 2 | docs/specs/shared/sds/043-risk-evidence-service.md (if needed) | | Evidence bundle format for simulations |

Task Breakdown

Wave 1 (Parallel)

Wave 2 (Depends on Wave 1)


Validation & Verification

Spec Validation

Implementation Validation


Open Questions

  1. Where should the Simulation & Replay bounded context live (semantic-core, cognitive-extension, or shared)? ✅ Resolved: shared (per SDS-053)
  2. What is the canonical “snapshot” representation to simulate against (KGS snapshot vs manifest-only vs both)? ✅ Resolved: Bundle (manifest + KG + config) - documented in snapshot.schema.json
  3. Should replay source be production traces, JetStream streams, or both? ✅ Resolved: JetStream + traces (per SDS-053 integration points)
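As an illustration of the resolved snapshot bundle (manifest + KG + config), the sketch below models the bundle as a single value with a content digest, so two simulation runs can confirm they started from the same snapshot. The field names and the `digest` method are assumptions made for this example; the authoritative shape is whatever snapshot.schema.json defines.

```python
import hashlib
import json
from dataclasses import asdict, dataclass
from typing import Any


@dataclass(frozen=True)
class SnapshotBundle:
    """Hypothetical in-memory form of the snapshot bundle; field names are assumptions."""
    manifest: dict[str, Any]
    knowledge_graph: dict[str, Any]
    config: dict[str, Any]

    def digest(self) -> str:
        """SHA-256 over a canonical JSON encoding (sorted keys, no extra whitespace)."""
        canonical = json.dumps(asdict(self), sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```

Sorting keys before hashing means the digest depends only on the bundle's content, not on the insertion order of its dictionaries. Recording this digest alongside each simulation result would let a later replay verify its starting point.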

Risks & Mitigation

| Risk | Likelihood | Impact | Mitigation Strategy |
| --- | --- | --- | --- |
| Simulation introduces non-determinism | Medium | High | Require deterministic inputs; record all random seeds; forbid time-based branching. |
| Replay leaks sensitive production data | Medium | High | Use payload modes and scrubbing; restrict access via SDS-031 authority rules. |
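One way to apply the non-determinism mitigation is to route all randomness and time through an explicit context object, so simulation code never touches the global random generator or the wall clock. The sketch below assumes a hypothetical `SimulationContext` with a recorded seed and a logical clock; neither name comes from the specs.

```python
import random
from dataclasses import dataclass, field


@dataclass
class SimulationContext:
    """Hypothetical holder for every non-deterministic input of a run."""
    seed: int                  # recorded with the scenario so runs can be repeated
    start_time: float          # logical epoch seconds from the scenario, not the wall clock
    tick_seconds: float = 1.0
    _rng: random.Random = field(init=False, repr=False)
    _now: float = field(init=False, repr=False)

    def __post_init__(self) -> None:
        self._rng = random.Random(self.seed)   # private generator, isolated from global state
        self._now = self.start_time

    def random(self) -> float:
        return self._rng.random()

    def now(self) -> float:
        return self._now

    def advance(self, ticks: int = 1) -> float:
        """Move the logical clock forward; time changes only when the simulation says so."""
        self._now += ticks * self.tick_seconds
        return self._now


def run(ctx: SimulationContext, steps: int) -> list[tuple[float, float]]:
    """Toy simulation loop: one logical tick and one random draw per step."""
    return [(ctx.advance(), ctx.random()) for _ in range(steps)]
```

Two contexts built with the same seed and start time produce identical runs. Any branching on time then depends on the logical clock, which is itself part of the recorded input.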

Rollback Strategy

  1. Restrict simulation to synthetic scenarios only until production replay is safe and governed.
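The rollback posture above could be enforced with a simple gate in front of the replay entry point. This is a sketch under assumptions: the `ReplaySource` values mirror the sources named in the open questions, while `production_replay_enabled` is a hypothetical flag standing in for whatever governance control is eventually chosen.

```python
from enum import Enum


class ReplaySource(Enum):
    SYNTHETIC = "synthetic"
    JETSTREAM = "jetstream"
    TRACES = "traces"


class ReplayNotPermitted(Exception):
    """Raised when a production-derived source is requested while the gate is closed."""


def check_source(source: ReplaySource, production_replay_enabled: bool) -> None:
    """Allow synthetic scenarios always; allow production sources only when enabled."""
    if source is not ReplaySource.SYNTHETIC and not production_replay_enabled:
        raise ReplayNotPermitted(
            f"replay from {source.value!r} is disabled; only synthetic scenarios are allowed"
        )
```

Rolling back then amounts to setting the flag to false, with no code change required.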

Reference Documents