SEA-Forge Capability Report (Code-Backed)
Scope: This report is evidence-based and grounded in the repository code and config as they exist in this workspace. It does not rely on README claims.
SEA-DSL is treated as the canonical semantic code for SEA-Forge; projections and compiler outputs are derived artifacts.
1. Core Capabilities (Implemented)
Semantic modeling / specifications
- SEA DSL compilation artifacts (SEA → AST → IR → Manifest)
- What it does: Parses SEA DSL into AST JSON, compiles AST to IR, then IR to a manifest JSON used downstream.
- Evidence:
just/62-compiler.just (pipeline/parse/compile steps), tools/ast_to_ir.py, tools/ir_to_manifest.py.
- Status: End-to-end usable for generating .ast.json, .ir.json, and .manifest.json when a sea parser is installed.
- SDS YAML validation and SDS→manifest compilation
- What it does: Validates SDS YAML against schema and can compile SDS directly to manifest.
- Evidence:
tools/validate_sds.py, tools/codegen/sds_to_manifest.py, just/62-compiler.just (sds-validate, sds-compile).
- Status: End-to-end usable.
- Flow annotation linting (CQRS/runtime metadata)
- What it does: Lints SEA AST to enforce CQRS annotations and runtime metadata (tx, idempotency, outbox, read_model).
- Evidence:
tools/flow_lint.py.
- Status: End-to-end usable.
- Full spec pipeline (SDS/SEA → AST → IR → Manifest)
- What it does: Orchestrates validation + compilation steps per bounded context.
- Evidence:
just/62-compiler.just (pipeline, pipeline-all).
- Status: End-to-end usable.
- IR → Knowledge Graph snapshot
- What it does: Generates deterministic RDF triples and snapshot metadata from SEA IR, with optional SHACL validation.
- Evidence:
tools/ir_to_kgs.py, tools/kg_validate.py.
- Status: End-to-end usable (requires Python deps for RDF/SHACL).
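As an illustration of the flow annotation linting listed above, here is a minimal sketch of a lint pass over an AST-like dictionary. The AST shape (a "flows" list with an "annotations" mapping) and the set of CQRS kinds are assumptions made for this example; only the metadata keys (tx, idempotency, outbox, read_model) come from the report, and the real tools/flow_lint.py may work differently.

```python
from typing import Any

# Metadata keys named in the report; the surrounding AST layout is hypothetical.
REQUIRED_RUNTIME_KEYS = ("tx", "idempotency", "outbox", "read_model")
# Hypothetical set of accepted CQRS annotation values.
CQRS_KINDS = {"command", "query", "event"}


def lint_flows(ast: dict[str, Any]) -> list[str]:
    """Return one error string per missing or invalid annotation."""
    errors: list[str] = []
    for flow in ast.get("flows", []):
        name = flow.get("name", "<unnamed>")
        annotations = flow.get("annotations", {})
        kind = annotations.get("cqrs")
        if kind not in CQRS_KINDS:
            errors.append(f"{name}: missing or invalid cqrs annotation ({kind!r})")
        for key in REQUIRED_RUNTIME_KEYS:
            if key not in annotations:
                errors.append(f"{name}: missing runtime metadata '{key}'")
    return errors


sample_ast = {
    "flows": [
        {
            "name": "PlaceOrder",
            "annotations": {
                "cqrs": "command",
                "tx": "required",
                "idempotency": "key",
                "outbox": True,
                "read_model": False,
            },
        },
        # This flow has a CQRS kind but none of the runtime metadata keys.
        {"name": "ListOrders", "annotations": {"cqrs": "query"}},
    ]
}
lint_errors = lint_flows(sample_ast)
for message in lint_errors:
    print("ERROR", message)
```

A real linter would likely distinguish errors from warnings and relax some keys for queries; this sketch treats every missing key as an error to keep the rule easy to read.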
Validation / governance / invariant checking
- Schema validation for AST/IR/manifest
- What it does: Enforces JSON schema constraints during compilation steps.
- Evidence:
tools/ast_to_ir.py, tools/ir_to_manifest.py, tools/schemas/*.json.
- Status: End-to-end usable.
- SHACL validation & shape linting
- What it does: Validates RDF graphs against SHACL, lints shapes, and performs shape impact analysis.
- Evidence:
tools/kg_validate.py, tools/shape_lint.py, tools/shape_impact_analysis.py, just/63-kg.just.
- Status: End-to-end usable.
- Governance gateway (OPA-backed)
- What it does: Evaluates LLM requests against policies, supports enforcement modes, audit logging, and policy reload.
- Evidence:
services/policy-gateway/main.py, services/policy-gateway/src/api/routes.py, infra/opa/.
- Status: End-to-end usable (depends on OPA + config).
Code / artifact generation
- Manifest → generated code
- What it does: Renders code into src/gen/** from manifest using Jinja templates.
- Evidence:
tools/codegen/gen.py, tools/codegen/templates/*.
- Status: Partially implemented (file is handwritten with --hand-write-ok bypass notice).
- Nx generators (bounded context, adapter, API surface)
- What it does: Scaffolds hexagonal architecture projects, adapters, and API surfaces.
- Evidence:
generators/ (e.g., generators/bounded-context, generators/adapter, generators/api-surface).
- Status: End-to-end usable.
- Post-codegen gap report
- What it does: Reports missing adapters, packages, and env vars; can populate .env and check versions online.
- Evidence:
tools/codegen/gap_report.py.
- Status: Partially implemented (handwritten with --hand-write-ok bypass notice; uses online lookups).
- SEA-Forge CLI (Rust)
- What it does: Setup, validate specs, manage services (up/down/status/logs/watch), apply declared state.
- Evidence:
apps/sea-forge-cli/src/main.rs, apps/sea-forge-cli/src/commands/*.rs.
- Status: End-to-end usable (requires Rust build and just/Docker tooling).
- Ops runner (Workbench)
- What it does: Allowlisted, auditable execution of just commands (spec validation, pipeline, gap report), with logs and status.
- Evidence:
services/workbench-bff/src/adapters/ops_runner.py, services/workbench-bff/src/api/ops_routes.py.
- Status: End-to-end usable.
- CLI utilities for validation and drift
- What it does: Drift scan/remediate, KG validation, shape lint, schema compatibility, etc.
- Evidence:
tools/drift_heal.py, tools/kg_validate.py, tools/schema-compat-check.py.
- Status: End-to-end usable.
Persistence / provenance / history
- Provenance graph (spec → artifact lineage)
- What it does: Builds a DAG of ADR/PRD/SDS/SEA → manifest → artifact with hashes and timestamps.
- Evidence:
services/workbench-bff/src/adapters/provenance_registry.py, services/workbench-bff/src/api/provenance_routes.py.
- Status: Partially implemented (graph build works; historical compare is not implemented).
- Manifest registry and history
- What it does: Indexes manifests, validates payloads, serves versions and diffs.
- Evidence:
services/workbench-bff/src/adapters/manifest_registry.py, services/workbench-bff/src/api/routes.py.
- Status: End-to-end usable.
- Behavior correlation summaries (DB-backed)
- What it does: Stores behavior drift summaries in PostgreSQL for the Workbench UI.
- Evidence:
services/workbench-bff/src/adapters/behavior_indexer.py, services/workbench-bff/src/api/database.py.
- Status: End-to-end usable, but depends on DB setup.
AI / agent integration hooks
- A2A protocol gateway
- What it does: Agent card discovery, task lifecycle endpoints, OpenAI-compatible routes, MCP routes, and WebSocket updates.
- Evidence:
services/a2a/src/main.py, services/a2a/src/api/routes.py, services/a2a/src/api/openai_routes.py, services/a2a/src/api/mcp_routes.py.
- Status: End-to-end usable.
- LLM provider service (LiteLLM-based)
- What it does: OpenAI-compatible chat completions and embeddings, optional Policy Gateway routing.
- Evidence:
services/llm-provider/src/main.py, services/llm-provider/src/api/routes.py, services/llm-provider/src/adapters/litellm_adapter.py.
- Status: End-to-end usable (depends on model/provider credentials).
- Policy Gateway (OPA enforcement + audit)
- What it does: Evaluates prompt policies, proxies LLM requests, supports hot reload and audit queries.
- Evidence:
services/policy-gateway/src/api/routes.py, services/policy-gateway/main.py.
- Status: End-to-end usable (depends on OPA + config).
- Embedding service (local llama.cpp + fallback)
- What it does: OpenAI-compatible embeddings, uses local model or deterministic fallback.
- Evidence:
services/embedding/src/main.py.
- Status: End-to-end usable.
- Knowledge Graph service (Oxigraph-backed)
- What it does: SPARQL endpoint, event-to-RDF projection, snapshot retrieval, SHACL validation.
- Evidence:
services/knowledge-graph/main.py, services/knowledge-graph/src/api/routes.py.
- Status: End-to-end usable.
Infrastructure / orchestration
- Local orchestration and service recipes
- What it does: Docker compose definitions, OPA policies, NATS, OTEL, and just recipes.
- Evidence:
infra/docker/docker-compose.*.yml, infra/opa/, infra/nats/, justfile and just/*.
- Status: End-to-end usable.
- Messaging worker (outbox/inbox/DLQ)
- What it does: NATS JetStream worker for outbox publishing, inbox consumption, and DLQ replay.
- Evidence:
apps/sea-mq-worker/src/main.rs and related modules.
- Status: End-to-end usable (depends on Postgres + NATS).
2. Supported Workflows
A. Spec pipeline: SDS/SEA → AST → IR → Manifest
- Inputs:
docs/specs/<ctx>/<ctx>.sea, optional docs/specs/<ctx>/<ctx>.sds.yaml.
- Outputs:
<ctx>.ast.json, <ctx>.ir.json, <ctx>.manifest.json in docs/specs/<ctx>/.
- How to run:
just pipeline <ctx>.
- Guarantees: Schema validation on AST/IR/manifest, deterministic compilation in ast_to_ir.py/ir_to_manifest.py.
- Evidence:
just/62-compiler.just, tools/ast_to_ir.py, tools/ir_to_manifest.py.
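The staged, schema-gated shape of workflow A can be sketched as follows. This is a toy model, not the repository's compiler: the function names, the document shapes, and the "required keys" check standing in for JSON Schema validation are all assumptions for illustration. It shows how sorting inputs and serializing with sorted keys makes the output independent of input ordering.

```python
import json
from typing import Any


def require_keys(doc: dict[str, Any], keys: list[str], stage: str) -> None:
    """Stand-in for JSON Schema validation: fail fast if a key is missing."""
    missing = [k for k in keys if k not in doc]
    if missing:
        raise ValueError(f"{stage}: missing required keys {missing}")


def toy_ast_to_ir(ast: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical AST -> IR step; sorts names so output order is stable."""
    require_keys(ast, ["context", "entities"], "ast")
    return {
        "context": ast["context"],
        "nodes": sorted(entity["name"] for entity in ast["entities"]),
    }


def toy_ir_to_manifest(ir: dict[str, Any]) -> dict[str, Any]:
    """Hypothetical IR -> manifest step."""
    require_keys(ir, ["context", "nodes"], "ir")
    return {
        "context": ir["context"],
        "components": [{"name": node} for node in ir["nodes"]],
    }


def run_pipeline(ast: dict[str, Any]) -> str:
    """Run both stages, gate the result, and serialize deterministically."""
    manifest = toy_ir_to_manifest(toy_ast_to_ir(ast))
    require_keys(manifest, ["context", "components"], "manifest")
    return json.dumps(manifest, sort_keys=True, indent=2)


ast_a = {"context": "orders", "entities": [{"name": "Order"}, {"name": "Customer"}]}
ast_b = {"entities": [{"name": "Customer"}, {"name": "Order"}], "context": "orders"}
print(run_pipeline(ast_a) == run_pipeline(ast_b))
```

Each stage validates its input before transforming it, so a malformed AST stops the run before any downstream file would be written.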
B. SDS-only compile → manifest
- Inputs:
*.sds.yaml.
- Outputs:
*.manifest.json.
- How to run:
just sds-compile <file>.
- Guarantees: SDS schema validation (if run); deterministic output from compiler.
- Evidence:
tools/validate_sds.py, tools/codegen/sds_to_manifest.py, just/62-compiler.just.
C. Manifest inspection and diff (Workbench BFF)
- Inputs: manifest files and history/index (artifacts/manifests/...).
- Outputs: manifest summaries, full payload, YAML render, version history, JSON patch diffs.
- How to run: Workbench BFF routes /manifests, /manifests/{id}, /diff.
- Guarantees: Manifest payloads validated against schema.
- Evidence:
services/workbench-bff/src/api/routes.py, services/workbench-bff/src/adapters/manifest_registry.py.
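To make the "JSON patch diffs" output of workflow C concrete, here is a minimal sketch that produces RFC 6902-style operations between two manifest versions. It is an assumption that the BFF emits this exact format; the sketch only handles nested objects with string keys and treats lists and scalars as atomic values, which a full implementation would refine.

```python
from typing import Any


def escape_token(token: str) -> str:
    """Escape a key for use in a JSON Pointer (RFC 6901)."""
    return token.replace("~", "~0").replace("/", "~1")


def json_patch_diff(old: Any, new: Any, path: str = "") -> list[dict[str, Any]]:
    """Return add/remove/replace operations that turn `old` into `new`."""
    if isinstance(old, dict) and isinstance(new, dict):
        ops: list[dict[str, Any]] = []
        for key in sorted(old.keys() | new.keys()):
            child = f"{path}/{escape_token(key)}"
            if key not in new:
                ops.append({"op": "remove", "path": child})
            elif key not in old:
                ops.append({"op": "add", "path": child, "value": new[key]})
            else:
                ops.extend(json_patch_diff(old[key], new[key], child))
        return ops
    if old != new:
        # Lists and scalars are replaced wholesale in this sketch.
        return [{"op": "replace", "path": path, "value": new}]
    return []


manifest_v1 = {"context": "orders", "version": 1, "components": {"api": {"port": 8080}}}
manifest_v2 = {
    "context": "orders",
    "version": 2,
    "components": {"api": {"port": 8081}, "worker": {}},
}
patch = json_patch_diff(manifest_v1, manifest_v2)
for op in patch:
    print(op)
```

Iterating keys in sorted order keeps the patch stable across runs, which matters if diffs are stored or compared.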
D. Provenance graph queries
- Inputs: Specs and manifest files in docs/specs/ and artifacts/manifests/.
- Outputs: lineage graph and impact lists.
- How to run: Workbench BFF /provenance/* endpoints.
- Guarantees: Hashing for nodes; lineage based on file discovery rules.
- Evidence:
services/workbench-bff/src/adapters/provenance_registry.py, services/workbench-bff/src/api/provenance_routes.py.
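The lineage graph and impact lists of workflow D can be pictured with a small hashed DAG. The class and method names below are hypothetical and the node contents are placeholders; the sketch only assumes what the report states, namely that nodes carry content hashes and that edges follow the spec-to-artifact direction.

```python
import hashlib
from collections import deque


def content_hash(text: str) -> str:
    """SHA-256 of the node content, used as the node's fingerprint."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


class ProvenanceGraph:
    """Toy DAG: node id -> content hash, plus downstream adjacency sets."""

    def __init__(self) -> None:
        self.hashes: dict[str, str] = {}
        self.downstream: dict[str, set[str]] = {}

    def add_node(self, node_id: str, content: str) -> None:
        self.hashes[node_id] = content_hash(content)
        self.downstream.setdefault(node_id, set())

    def add_edge(self, source: str, target: str) -> None:
        self.downstream.setdefault(source, set()).add(target)
        self.downstream.setdefault(target, set())

    def impacted_by(self, node_id: str) -> list[str]:
        """Breadth-first walk of everything derived from `node_id`."""
        seen: set[str] = set()
        queue = deque([node_id])
        while queue:
            for nxt in sorted(self.downstream.get(queue.popleft(), ())):
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
        return sorted(seen)


graph = ProvenanceGraph()
chain = ["adr", "prd", "sds", "sea", "manifest", "artifact"]
for node in chain:
    graph.add_node(node, f"contents of {node}")
for source, target in zip(chain, chain[1:]):
    graph.add_edge(source, target)

print(graph.impacted_by("sds"))
```

An impact list is then just the set of downstream nodes reachable from a changed spec.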
E. Drift detection and remediation
- Inputs: Provenance graph + file hashes, manifests, local repo state.
- Outputs: drift reports, classifications, remediation suggestions, audit log entries, optional PR.
- How to run:
python tools/drift_heal.py scan/check/remediate, or Workbench BFF /drift/* endpoints.
- Guarantees: Hash-based drift detection with optional deep diff, safety policy for auto-fix.
- Evidence:
tools/drift_heal.py, services/workbench-bff/src/adapters/drift_detector.py, services/workbench-bff/src/adapters/remediation_engine.py, services/workbench-bff/src/api/drift_routes.py.
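Hash-based drift detection, as named in the guarantees above, can be sketched by comparing current file hashes against hashes recorded at the last generation. The `Baseline` type and the outcome labels here are hypothetical and deliberately neutral; mapping them onto the report's drift classes would presumably need the optional deep diff, which this sketch does not attempt.

```python
import hashlib
from dataclasses import dataclass


def sha256_text(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class Baseline:
    """Hashes recorded when spec and artifact were last known to agree."""

    spec_hash: str
    artifact_hash: str


def detect_drift(baseline: Baseline, spec_text: str, artifact_text: str) -> str:
    """Compare current hashes with the baseline and name what moved."""
    spec_changed = sha256_text(spec_text) != baseline.spec_hash
    artifact_changed = sha256_text(artifact_text) != baseline.artifact_hash
    if spec_changed and artifact_changed:
        return "both_changed"
    if spec_changed:
        return "spec_changed"
    if artifact_changed:
        return "artifact_changed"
    return "in_sync"


spec_v1 = "entity Order { id: UUID }"
code_v1 = "class Order: ..."
baseline = Baseline(sha256_text(spec_v1), sha256_text(code_v1))

print(detect_drift(baseline, spec_v1, code_v1))
print(detect_drift(baseline, spec_v1 + " // edited", code_v1))
```

Hash comparison is cheap and catches any byte-level change, which is why a deeper diff is only needed once a mismatch has been found.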
F. Knowledge Graph ingestion and querying
- Inputs: SPARQL queries or event projections.
- Outputs: SPARQL results, RDF triples, snapshot metadata.
- How to run: Knowledge Graph service endpoints /kg/sparql, /kg/projection, /kg/snapshot/{id}.
- Guarantees: SHACL validation available for graphs; snapshot IDs are content-addressable.
- Evidence:
services/knowledge-graph/src/api/routes.py, tools/ir_to_kgs.py.
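One way to obtain content-addressable snapshot IDs is to hash a canonical serialization of the triples. The sketch below assumes triples are plain strings without blank nodes (which would need proper graph canonicalization) and uses a hypothetical "sha256:" prefix; the actual ID scheme in tools/ir_to_kgs.py may differ.

```python
import hashlib
from typing import Iterable

Triple = tuple[str, str, str]


def snapshot_id(triples: Iterable[Triple]) -> str:
    """Hash the sorted, de-duplicated triples so equal graphs get equal IDs."""
    canonical = "\n".join(f"{s} {p} {o} ." for s, p, o in sorted(set(triples)))
    return "sha256:" + hashlib.sha256(canonical.encode("utf-8")).hexdigest()


graph_a = [
    ("<urn:order>", "<urn:hasField>", "<urn:id>"),
    ("<urn:order>", "<urn:type>", "<urn:Entity>"),
]
# Same triples in a different order, with one duplicate.
graph_b = [graph_a[1], graph_a[0], graph_a[1]]

print(snapshot_id(graph_a) == snapshot_id(graph_b))
```

Because the ID depends only on the set of triples, regenerating a snapshot from unchanged IR yields the same ID, and any change to the graph yields a different one.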
G. Governance evaluation and proxying of LLM calls
- Inputs: Prompt + context; or OpenAI-compatible chat/embeddings requests.
- Outputs: policy decision + audit entry; or proxied LLM responses.
- How to run: Policy Gateway /policy/evaluate, /policy/chat/completions, /policy/embeddings, /governance/audit.
- Guarantees: OPA evaluation and enforcement mode, audit logging.
- Evidence:
services/policy-gateway/src/api/routes.py, services/policy-gateway/main.py.
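The interaction between an OPA decision and an enforcement mode in workflow G can be sketched as a small decision table. The mode names ("enforce", "log_only", "disabled") and the audit entry fields are assumptions for illustration; only the block/log/pass outcomes and the always-on audit entry are taken from the report. The OPA call itself is replaced by a boolean argument.

```python
from dataclasses import dataclass, field

# Hypothetical mode names; the gateway's real configuration values may differ.
MODES = ("enforce", "log_only", "disabled")


@dataclass
class ToyGateway:
    mode: str
    audit_log: list[dict] = field(default_factory=list)

    def __post_init__(self) -> None:
        if self.mode not in MODES:
            raise ValueError(f"unknown enforcement mode: {self.mode!r}")

    def decide(self, request_id: str, opa_allow: bool) -> str:
        """Map (policy result, mode) to an action and record an audit entry."""
        if self.mode == "disabled" or opa_allow:
            action = "pass"
        elif self.mode == "enforce":
            action = "block"
        else:  # log_only: let the request through but flag the violation
            action = "log"
        self.audit_log.append(
            {"request_id": request_id, "allow": opa_allow, "action": action}
        )
        return action


enforcing = ToyGateway(mode="enforce")
observing = ToyGateway(mode="log_only")
print(enforcing.decide("req-1", opa_allow=False))
print(observing.decide("req-2", opa_allow=False))
```

Writing the audit entry on every path, including "pass", is what lets audit queries reconstruct decisions regardless of mode.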
H. Workbench Ops workflows
- Inputs: Ops action + context (validate specs, regenerate code, gap report).
- Outputs: run status, streamed logs, persisted run metadata.
- How to run: Workbench BFF /ops/* endpoints and Workbench UI.
- Guarantees: Allowlisted commands only; timeouts and output caps.
- Evidence:
services/workbench-bff/src/adapters/ops_runner.py, services/workbench-bff/src/api/ops_routes.py, apps/workbench/src/pages/Operations.tsx.
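The three guarantees of workflow H (allowlist, timeout, output cap) fit in a short runner. This is a sketch under assumptions: the function name, the result fields, and the limits are hypothetical, and the demo allowlist invokes the Python interpreter instead of just so the example runs anywhere. In the Workbench, entries would map to just recipes.

```python
import subprocess
import sys

# Demo allowlist: action name -> fixed argv. Callers never supply raw commands.
DEMO_ALLOWLIST: dict[str, list[str]] = {
    "hello": [sys.executable, "-c", "print('hello')"],
    "noisy": [sys.executable, "-c", "print('x' * 5000)"],
    "slow": [sys.executable, "-c", "import time; time.sleep(5)"],
}


def run_action(
    action: str,
    allowlist: dict[str, list[str]],
    timeout_s: float = 10.0,
    max_output: int = 4096,
) -> dict:
    """Run an allowlisted action with a timeout and a cap on captured output."""
    if action not in allowlist:
        raise PermissionError(f"action {action!r} is not allowlisted")
    try:
        proc = subprocess.run(
            allowlist[action],
            capture_output=True,
            text=True,
            timeout=timeout_s,
            check=False,
        )
    except subprocess.TimeoutExpired:
        return {"status": "timeout", "output": "", "truncated": False}
    output = proc.stdout + proc.stderr
    return {
        "status": "ok" if proc.returncode == 0 else "failed",
        "output": output[:max_output],
        "truncated": len(output) > max_output,
    }


print(run_action("hello", DEMO_ALLOWLIST)["status"])
```

Keying the allowlist by action name rather than accepting argv from the caller means a request can only select a predefined command, never compose one.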
3. Guardrails, Constraints, and Guarantees
- Schema gates: AST, IR, and manifest are validated against JSON schemas before proceeding (tools/ast_to_ir.py, tools/ir_to_manifest.py, tools/schemas/*).
- Flow annotation invariants: CQRS annotations and runtime metadata are enforced by tools/flow_lint.py (errors/warnings and GHA output supported).
- SDS schema validation: SDS YAML is validated with JSON Schema (tools/validate_sds.py).
- SHACL compliance: KG validation and shape linting for RDF graphs (tools/kg_validate.py, tools/shape_lint.py).
- Ops safety: Workbench Ops only runs allowlisted commands with timeouts and log limits (services/workbench-bff/src/adapters/ops_runner.py).
- Drift remediation safety policy: Disallows editing generated paths and limits file types and diff size before auto-fix (services/workbench-bff/src/adapters/remediation_engine.py).
- Governance enforcement: Policy Gateway can block/log/pass based on OPA results; audit logging is enforced (services/policy-gateway/main.py, services/policy-gateway/src/api/routes.py).
- Audit logging: Workbench BFF creates tamper-evident audit events for operations (services/workbench-bff/src/api/audit.py).
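The report does not say how the audit events are made tamper-evident. A common technique is hash chaining, where each entry's hash covers the previous entry's hash; the sketch below shows that technique as an assumption, with hypothetical function names and event fields, and is not a description of services/workbench-bff/src/api/audit.py.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(prev_hash: str, event: dict) -> str:
    body = json.dumps(event, sort_keys=True)
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_event(log: list[dict], event: dict) -> None:
    """Append an event whose hash commits to the whole preceding chain."""
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"event": event, "prev": prev, "hash": entry_hash(prev, event)})


def verify_chain(log: list[dict]) -> bool:
    """Recompute every hash; any edited or removed entry breaks the chain."""
    prev = GENESIS
    for entry in log:
        if entry["prev"] != prev or entry["hash"] != entry_hash(prev, entry["event"]):
            return False
        prev = entry["hash"]
    return True


audit_log: list[dict] = []
append_event(audit_log, {"actor": "alice", "action": "pipeline", "context": "orders"})
append_event(audit_log, {"actor": "bob", "action": "gap-report", "context": "orders"})
print(verify_chain(audit_log))

# Tampering with an earlier event is detected on verification.
audit_log[0]["event"]["actor"] = "mallory"
print(verify_chain(audit_log))
```

Hash chaining makes edits detectable but does not prevent them; truncating the tail of the log also goes unnoticed unless the latest hash is stored somewhere else.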
4. Drift / Misalignment Handling
- Spec↔artifact drift detection: Hash-based comparisons with optional deep diffs and drift classification (benign/spec_stale/code_stale/semantic). (services/workbench-bff/src/adapters/drift_detector.py)
- Provenance graph: DAG linking ADR→PRD→SDS→SEA→manifest→artifact, used by drift detector. (services/workbench-bff/src/adapters/provenance_registry.py)
- Remediation engine: Suggests actions and can run just pipeline <context>, optionally creating PRs. (services/workbench-bff/src/adapters/remediation_engine.py, services/workbench-bff/src/adapters/pr_creator.py)
- KG drift check: Warn-only diff of inferred triples vs baseline. (tools/kg_drift_check.py)
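A warn-only diff of inferred triples against a baseline reduces to two set differences plus an exit code that never signals failure. The function names and the report layout below are hypothetical, chosen only to illustrate the idea behind the KG drift check.

```python
Triple = tuple[str, str, str]


def kg_drift(baseline: set[Triple], inferred: set[Triple]) -> dict[str, list[Triple]]:
    """Return triples that appeared or disappeared relative to the baseline."""
    return {
        "added": sorted(inferred - baseline),
        "removed": sorted(baseline - inferred),
    }


def warn_only_check(baseline: set[Triple], inferred: set[Triple]) -> int:
    """Print warnings for any drift, but always return exit code 0."""
    report = kg_drift(baseline, inferred)
    for triple in report["added"]:
        print("WARN added:", " ".join(triple))
    for triple in report["removed"]:
        print("WARN removed:", " ".join(triple))
    return 0  # warn-only: drift is reported, never treated as a failure


baseline_triples = {("order", "type", "Entity"), ("order", "hasField", "id")}
inferred_triples = {("order", "type", "Entity"), ("order", "hasField", "total")}
exit_code = warn_only_check(baseline_triples, inferred_triples)
```

Returning 0 regardless of the diff keeps the check informational, so it can run in CI without blocking merges while the baseline is still settling.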
Limitations / gaps:
- Historical provenance comparison is explicitly not implemented (raises NotImplementedError). (services/workbench-bff/src/adapters/provenance_registry.py, services/workbench-bff/src/api/provenance_routes.py)
- Behavior correlation scan progress is simulated, and summary data is mocked (no live OpenObserve ingestion in the current code path). (services/workbench-bff/src/api/behavior_routes.py)
5. What Is Clearly NOT Implemented (Yet)
- Historical provenance comparison: compare_versions is not implemented and endpoints note placeholder behavior. (services/workbench-bff/src/adapters/provenance_registry.py, services/workbench-bff/src/api/provenance_routes.py)
- Workbench auth: BFF auth is a stub using headers; production OIDC is not implemented here. (services/workbench-bff/src/api/auth.py)
- Runtime behavior correlation ingestion: Scan uses mocked summaries and simulated steps; no real OpenObserve queries in current implementation. (services/workbench-bff/src/api/behavior_routes.py)
- Code generation pipeline completeness: tools/codegen/gen.py and tools/codegen/gap_report.py are handwritten with --hand-write-ok bypass comments, indicating incomplete generator/spec alignment.
- LSP implementation: just/30-cli.just references apps/sea-lsp/src, but no apps/sea-lsp/ directory exists in this repo.
- Policy Gateway evidence export: /governance/export returns a zip built from in-memory mock data. (services/policy-gateway/src/api/routes.py)
6. One-Paragraph “What This System Really Is”
SEA-Forge today is a spec-first toolchain and runtime suite that compiles SEA DSL and SDS YAML into JSON artifacts (AST/IR/manifest), validates them via schema and linting rules, and exposes services for governance (OPA-backed Policy Gateway), LLM access (LiteLLM provider), a Knowledge Graph API, and a Workbench UI/BFF for manifest inspection, provenance, drift detection, and ops automation; it also includes Rust CLI tooling and a NATS/DB worker for outbox/inbox processing, with some areas (behavior correlation ingestion, historical provenance diffs, and generator completeness) explicitly stubbed or marked as handwritten.
7. Evidence-Backed Capability List
- SEA-Forge currently supports SEA DSL → AST → IR → Manifest compilation (tools/ast_to_ir.py, tools/ir_to_manifest.py, just/62-compiler.just).
- SEA-Forge currently supports SDS YAML validation and SDS → manifest compilation (tools/validate_sds.py, tools/codegen/sds_to_manifest.py).
- SEA-Forge currently supports Flow annotation linting for CQRS/runtime metadata (tools/flow_lint.py).
- SEA-Forge currently supports manifest inspection and diffing via Workbench BFF (services/workbench-bff/src/api/routes.py).
- SEA-Forge currently supports provenance graph construction and lineage queries (services/workbench-bff/src/adapters/provenance_registry.py, services/workbench-bff/src/api/provenance_routes.py).
- SEA-Forge currently supports drift detection, analysis, and remediation workflows (services/workbench-bff/src/adapters/drift_detector.py, services/workbench-bff/src/adapters/remediation_engine.py, tools/drift_heal.py).
- SEA-Forge currently supports Knowledge Graph queries and event projections (services/knowledge-graph/src/api/routes.py).
- SEA-Forge currently supports OPA-backed governance enforcement and auditing (services/policy-gateway/src/api/routes.py).
- SEA-Forge currently supports OpenAI-compatible LLM access via LiteLLM (services/llm-provider/src/api/routes.py).
- SEA-Forge currently supports A2A agent gateway endpoints (services/a2a/src/api/routes.py).
- SEA-Forge currently supports Ops execution with allowlisted commands and streamed logs (services/workbench-bff/src/adapters/ops_runner.py).
- SEA-Forge currently supports NATS-backed outbox/inbox processing (apps/sea-mq-worker/src/main.rs).
8. Notes for Documentation Accuracy
Observed mismatches or overstatements vs code:
- README: “Provenance and history for every change.”
- Code builds a provenance graph and manifest history, but historical lineage comparison is not implemented, and the history relies on an index/history file rather than full git history. (services/workbench-bff/src/adapters/provenance_registry.py)
- README: “Policy Gateway intercepts all LLM calls.”
- In code, Policy Gateway use is conditional (enabled by config). The LLM provider can bypass the gateway when it is disabled or when PolicyGatewayBypassError occurs. (services/llm-provider/src/adapters/litellm_adapter.py)
- README: “Runtime behavior correlation / observed reality.”
- Current Workbench behavior correlation endpoints use mock summaries and simulated scan progress; no OpenObserve ingestion is wired in the scan path. (services/workbench-bff/src/api/behavior_routes.py)
- README: “Generative Engine 100%”
- Core generator scripts tools/codegen/gen.py and tools/codegen/gap_report.py are explicitly handwritten with --hand-write-ok bypass markers, indicating generator/spec alignment is incomplete.