Runtime Behavior Correlation Implementation Plan

For Claude: REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.

Goal: Build a production-grade runtime behavior correlation system that ties tri-signal telemetry (OTel traces/logs/metrics) to the ADR/PRD/SDS/SEA truth chain and manifests, detects behavioral drift, and surfaces results in the Workbench UI and CI with high performance and privacy compliance.

Architecture: A telemetry ingest + normalization layer emits BehaviorEvidence, a correlation engine links evidence to provenance nodes, a drift classifier scores divergence, and results persist to the KG (authoritative) with a Postgres summary index. UI + CI consume summaries and drill into KG evidence.
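The architecture above implies a small evidence model at its center. A minimal sketch, with field names that are assumptions rather than the real workbench-bff models:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class BehaviorEvidence:
    """Normalized fact extracted from one telemetry signal.

    Field names are illustrative; the real model lives in workbench-bff.
    """
    flow: Optional[str] = None         # SEA flow name, e.g. "CreateOrder"
    context: Optional[str] = None      # bounded context that emitted the signal
    signal: str = "trace"              # which signal produced it: trace | log | metric
    span_kinds: tuple[str, ...] = ()   # observed span kinds, e.g. ("http", "db")
```

Under this sketch, the normalizer (Task 3) emits `BehaviorEvidence`, the correlator (Task 4) links it to SEA nodes, and the drift classifier (Task 5) compares `span_kinds` against the spec's expected spans.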

Tech Stack: Python (workbench-bff), OTel/OTLP, OpenObserve, Knowledge Graph (Oxigraph), Postgres, React (Workbench), Nx, Vitest, Pytest.


Task 1: Define Spec Changes (ADR/PRD/SDS/SEA)

Files:

Step 1: Write the failing spec assertions

Add a new SDS section defining:

Step 2: Validate specs

Run:

```bash
just sds-validate docs/specs/shared/sds/0xx-runtime-behavior-correlation.md
```

Expected: PASS

Step 3: Commit

```bash
git add docs/specs/shared/adr/029-observability-stack-architecture.md \
  docs/specs/shared/sds/030-semantic-observability.md \
  docs/specs/shared/sds/0xx-runtime-behavior-correlation.md \
  docs/specs/shared/prd/025-workbench-ui.md

git commit -m "spec: add runtime behavior correlation SDS"
```

Task 2: Update Generators (if new spec outputs required)

Files:

Step 1: Add new manifest/IR mapping rules

Update AST/IR/Manifest mapping to include correlation schema if needed.
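If a mapping rule turns out to be needed, it could be as small as ensuring every generated manifest carries a behavior section. A hypothetical sketch (key names are assumptions, not the real generator schema):

```python
def extend_manifest(manifest: dict) -> dict:
    """Return a copy of the manifest with an empty behavior section if absent.

    "behavior", "flows", and "expected_spans" are illustrative key names.
    """
    out = dict(manifest)
    out.setdefault("behavior", {
        "flows": [],           # SEA flow names the service implements
        "expected_spans": [],  # span kinds each flow should produce
    })
    return out
```

Keeping the rule idempotent like this preserves the pipeline's deterministic-output guarantee checked in Step 2.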

Step 2: Regenerate outputs

Run:

```bash
just pipeline shared
```

Expected: Deterministic output

Step 3: Commit

```bash
git add generators/ tools/

git commit -m "feat: extend generators for behavior correlation"
```

Task 3: Implement Behavior Evidence Normalizer

Files:

Step 1: Write the failing test

```python
def test_normalize_trace_with_semantic_tags():
    normalizer = BehaviorNormalizer()
    evidence = normalizer.normalize_trace({"resource": {"attributes": {"sea.flow": "CreateOrder"}}})
    assert evidence.flow == "CreateOrder"
```

Step 2: Run test to verify it fails

Run:

```bash
cd services/workbench-bff && python -m pytest tests/test_behavior_normalizer.py -v
```

Expected: FAIL (BehaviorNormalizer not defined)

Step 3: Write minimal implementation

```python
class BehaviorNormalizer:
    def normalize_trace(self, trace: dict) -> BehaviorEvidence:
        return BehaviorEvidence(flow=trace["resource"]["attributes"].get("sea.flow"))
```

Step 4: Run test to verify it passes

```bash
cd services/workbench-bff && python -m pytest tests/test_behavior_normalizer.py -v
```

Expected: PASS

Step 5: Commit

```bash
git add services/workbench-bff/src/adapters/behavior_normalizer.py \
  services/workbench-bff/tests/test_behavior_normalizer.py

git commit -m "feat: add behavior evidence normalizer"
```

Task 4: Implement Correlation Engine

Files:

Step 1: Write the failing test

```python
def test_correlate_exact_flow_match():
    correlator = BehaviorCorrelator()
    result = correlator.correlate(flow="CreateOrder", context="semantic-core")
    assert result.confidence == 1.0
```

Step 2: Run test to verify it fails

Run:

```bash
cd services/workbench-bff && python -m pytest tests/test_behavior_correlator.py -v
```

Expected: FAIL

Step 3: Write minimal implementation

```python
class BehaviorCorrelator:
    def correlate(self, flow: str, context: str) -> CorrelationResult:
        return CorrelationResult(confidence=1.0, spec_node_id=f"sea:{context}:{flow}")
```

Step 4: Run test to verify it passes

```bash
cd services/workbench-bff && python -m pytest tests/test_behavior_correlator.py -v
```

Expected: PASS

Step 5: Commit

```bash
git add services/workbench-bff/src/adapters/behavior_correlator.py \
  services/workbench-bff/tests/test_behavior_correlator.py

git commit -m "feat: add behavior correlation engine"
```

Task 5: Implement Drift Classifier

Files:

Step 1: Write the failing test

```python
def test_classify_missing_db_span_is_medium():
    classifier = BehaviorDriftClassifier()
    severity = classifier.classify(expected=["db"], observed=["http"])
    assert severity == DriftSeverity.MEDIUM
```

Step 2: Run test to verify it fails

```bash
cd services/workbench-bff && python -m pytest tests/test_behavior_drift_classifier.py -v
```

Expected: FAIL

Step 3: Write minimal implementation

```python
class BehaviorDriftClassifier:
    def classify(self, expected: list[str], observed: list[str]) -> DriftSeverity:
        return DriftSeverity.MEDIUM if "db" in expected and "db" not in observed else DriftSeverity.NONE
```

Step 4: Run test to verify it passes

```bash
cd services/workbench-bff && python -m pytest tests/test_behavior_drift_classifier.py -v
```

Expected: PASS

Step 5: Commit

```bash
git add services/workbench-bff/src/adapters/behavior_drift_classifier.py \
  services/workbench-bff/tests/test_behavior_drift_classifier.py

git commit -m "feat: add behavioral drift classifier"
```

Task 6: Persist Correlations (KG + Postgres)

Files:

Step 1: Write failing tests

Add tests to assert KG writer creates observed_behavior edge and Postgres summary persists.
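A sketch of what the failing tests could assert, using a pure helper for the edge shape (class, function, and node-id formats here are assumptions, not the real adapters):

```python
def observed_behavior_triple(spec_node_id: str, evidence_id: str) -> tuple[str, str, str]:
    """Shape of the KG edge linking a spec node to runtime evidence."""
    return (spec_node_id, "observed_behavior", evidence_id)

def test_kg_writer_creates_observed_behavior_edge():
    # The real test would call the KG writer and query Oxigraph for the edge;
    # this only pins down the triple's shape.
    triple = observed_behavior_triple("sea:semantic-core:CreateOrder", "evidence:abc123")
    assert triple == ("sea:semantic-core:CreateOrder", "observed_behavior", "evidence:abc123")
```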

Step 2: Implement KG writer + Postgres indexer

Step 3: Run tests

```bash
cd services/workbench-bff && python -m pytest tests/test_behavior_storage.py -v
```

Step 4: Commit

```bash
git add services/workbench-bff/src/adapters/behavior_kg_writer.py \
  services/workbench-bff/src/adapters/behavior_indexer.py \
  services/workbench-bff/src/models.py

git commit -m "feat: persist behavior correlation results"
```

Task 7: Add API Routes

Files:

Step 1: Write failing tests

```text
GET /behavior/summary -> 200 with list
```
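One way to pin down the response before implementing the route is a pure serializer the handler would call against rows from the Postgres summary index (field names are assumptions):

```python
def behavior_summary(rows: list[dict]) -> list[dict]:
    """Shape each summary-index row into the /behavior/summary payload.

    Keys are illustrative; the real fields come from the summary index schema.
    """
    return [
        {
            "flow": r["flow"],
            "context": r["context"],
            "drift_severity": r.get("drift_severity", "NONE"),
            "confidence": r.get("confidence", 0.0),
        }
        for r in rows
    ]
```

A pure function like this keeps the route test fast: the HTTP test only checks status and wiring, while the payload shape is covered without a database.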

Step 2: Implement routes

Step 3: Run tests

```bash
cd services/workbench-bff && python -m pytest tests/test_behavior_routes.py -v
```

Step 4: Commit

```bash
git add services/workbench-bff/src/api/behavior_routes.py \
  services/workbench-bff/main.py

git commit -m "feat: add behavior correlation API routes"
```

Task 8: Workbench UI

Files:

Step 1: Write failing tests

```text
RuntimeCorrelationDashboard renders summary cards
```

Step 2: Implement UI

Step 3: Run tests

```bash
pnpm nx test workbench --testFile=RuntimeCorrelationDashboard
```

Step 4: Commit

```bash
git add apps/workbench/src/pages/RuntimeCorrelationDashboard.tsx \
  apps/workbench/src/components/BehaviorDriftCard.tsx \
  apps/workbench/src/types/behavior.ts \
  apps/workbench/src/lib/behavior-api.ts \
  apps/workbench/src/pages/ProvenanceExplorer.tsx \
  apps/workbench/src/App.tsx

git commit -m "feat: add runtime correlation UI"
```

Task 9: CI Drift Gate

Files:

Step 1: Add script

Implement --warn and --fail modes with thresholds.
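The gate's decision logic, sketched in Python for clarity (the script itself is shell; the threshold values and flag semantics here are assumptions):

```python
def gate_exit_code(high: int, medium: int, mode: str, medium_threshold: int = 5) -> int:
    """Decide the CI exit code from drift counts.

    mode: "warn" reports but never fails; "fail" enforces thresholds.
    Returns 0 (pass) or 1 (fail the build).
    """
    if mode == "warn":
        return 0
    if high > 0 or medium > medium_threshold:
        return 1
    return 0
```

Starting the workflow with `--warn` (as in Step 2) lets teams observe drift rates before flipping the step to `--fail`.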

Step 2: Wire CI

Add a step:

```yaml
- name: Behavior Drift Gate
  run: ./scripts/ci/behavior_drift_gate.sh --warn
```

Step 3: Commit

```bash
git add scripts/ci/behavior_drift_gate.sh .github/workflows/ci.yml

git commit -m "chore: add behavior drift gate"
```

Task 10: Validation + Determinism

Step 1: Run spec guard

```bash
just spec-guard
```

Step 2: Run tests

```bash
just test-python
just test-ts
```

Step 3: Determinism check

```bash
just ci-determinism
```

Step 4: Commit (if needed)

```bash
git status --porcelain
```

Expected: clean


Execution Options

Plan complete and saved to docs/plans/2026-01-23-runtime-behavior-correlation.md.

Two execution options:

  1. Subagent-Driven (this session) - I dispatch fresh subagent per task, review between tasks, fast iteration
  2. Parallel Session (separate) - Open new session with executing-plans, batch execution with checkpoints

Which approach?