ENGINEERING SOP

Structural-First, Port-Driven Development Process

Purpose This SOP defines the invariant development process for projects with strong structural dependencies (monorepos, generators, ports/adapters, feature flags, observability). The goal is repeatability, low ambiguity, and safe iteration from day one.

Phase 1 — Bootstrap the Development Environment

Objective: A clean machine can reach “tests pass + app boots” with minimal manual steps.

Steps

Pin runtimes and core tools (language, package manager).
Provision OS-level dependencies deterministically.
Auto-load environment configuration on directory entry.
Define canonical commands:
- setup – install toolchain + deps
- dev up – start local services
- test – run baseline tests
- dev – run primary app(s)

Acceptance Criteria

A fresh clone + setup works without tribal knowledge.
dev up starts all required local services.
test passes without external dependencies.
dev launches a runnable system (even if minimal).

Phase 2 — Establish the Repository Spine

Objective: Define the shape of the system before writing behavior.

Steps

Initialize workspace/monorepo.
Define top-level topology:
- apps (runtime entrypoints)
- libs/modules (shared, layered code)
Add formatting, linting, licensing, and repo policies.
Create a minimal README with the canonical commands.

Acceptance Criteria

The project graph is visible and understandable.
There is one obvious way to start developing.
Formatting and linting are enforced automatically.

Phase 3 — Lock Architecture & Dependency Rules

Objective: Make architectural violations hard or impossible.

Steps

Define architectural layers (e.g., domain, application, adapters, UI).
Enforce dependency constraints between layers.
Introduce ports as the only way core logic interacts with external systems.
Centralize contracts (API schemas, shared types, invariants).

Acceptance Criteria

Domain code cannot import infrastructure or frameworks.
UI cannot import backend internals.
Violations fail lint/build, not code review.

Phase 4 — Stand Up Local-First Runtime Dependencies

Objective: Local development mirrors production shape.

Steps

Define required services (DB, cache, queues, flags, storage, observability).
Script lifecycle commands: up, down, reset, seed.
Standardize configuration via env + typed validation.
Ensure full environment reset is cheap and reliable.

Feature Flags via OpenFeature + Flipt

Use the OpenFeature SDK with Flipt (binary, not Docker) for vendor-neutral feature flag management.

Install & Configure

# Install Flipt binary via mise -- version 2.4.0
mise use flipt 

# Install OpenFeature SDK -- 
pnpm add @openfeature/server-sdk @openfeature/flipt-provider

features.yaml Example

Create infra/flipt/features/features.yaml:

namespace: default
flags:
  - key: new-dashboard
    name: New Dashboard Experience
    type: BOOLEAN_FLAG_TYPE
    enabled: true

segments:
  - key: beta-users
    name: Beta Users
    match_type: ANY_MATCH_TYPE
    constraints:
      - type: STRING_CONSTRAINT_TYPE
        property: plan
        operator: eq
        value: beta

Provider Registration

import { OpenFeature } from '@openfeature/server-sdk';
import { FliptProvider } from '@openfeature/flipt-provider';

await OpenFeature.setProviderAndWait(new FliptProvider({
  url: process.env.FLIPT_URL ?? 'http://localhost:8080',
  namespace: 'default',
}));

export const featureClient = OpenFeature.getClient();

evalContext Mapping

targetingKey → Flipt’s entityId (required for rollouts)
All other properties → Flipt’s context map

Port Abstraction (Testability)

// Domain code uses port, not OpenFeature directly
export interface FeatureFlagsPort {
  isEnabled(flag: string, context: FlagContext): Promise<boolean>;
}

Acceptance Criteria

Local stack starts and stops predictably.
Resetting state is a supported operation.
No hardcoded endpoints or credentials.
Feature flags: Flipt binary starts with flags-up, reads from features.yaml.
OpenFeature: Provider registered at bootstrap, evalContext mapped correctly.
Testability: Domain code uses FeatureFlagsPort, tests use FakeFeatureFlags.

Phase 5 — Observability & Ops Baseline (Before Features)

Objective: Debuggability exists before complexity.

Canonical Reference: ADR-029: Observability Stack Architecture

Stack Components

Component	Role
OpenTelemetry	Instrumentation SDK (traces, metrics, logs)
OTel Collector v0.142.0	Telemetry pipeline (receive, process, export)
OpenObserve v0.30.2	Unified backend + dashboards (replaces Prometheus/Grafana)
Vanta	Continuous compliance automation (SOC2, ISO 27001)
Logfire v4.16.0	Python-native structured logging with trace correlation

Steps

Define logging conventions (structured, correlated via trace_id/span_id).
Add baseline metrics and tracing via OpenTelemetry SDK.
Implement health checks and dependency checks.
Decide audit boundaries (what must be recorded immutably).

Semantic Context Requirements (ADR-029 Invariants)

All telemetry MUST include these resource attributes:

Attribute	Description
`sea.domain`	Bounded context (e.g., `governance`)
`sea.concept`	Domain concept (e.g., `PolicyRule`)
`sea.regime_id`	Active compliance regime ID
`sea.platform`	Always `sea-forge`

Acceptance Criteria

A single request produces logs + traces you can inspect.
Dependency failures are visible and actionable.
Health endpoints reflect real system state.
OpenTelemetry: All services use OTel SDK, not custom instrumentation.
Semantic context: sea.domain, sea.concept, sea.regime_id present in all telemetry.
Compliance: Metrics export to Vanta for evidence collection.
Python services: Use Logfire with trace correlation.

Phase 6 — Define Contracts Before Implementation

Objective: Freeze boundaries early to reduce rework.

Steps

Define API request/response schemas.
Define async/job/event payload schemas.
Define structured outputs for nondeterministic systems (e.g., LLMs).
Define a stable error taxonomy across boundaries.

Acceptance Criteria

Contracts are versioned and validated.
Clients/types are generated or centrally enforced.
Core logic does not depend on raw external output.

Phase 7 — Generate Scaffolding (Structure via Automation)

Objective: Humans design structure once; generators reproduce it forever.

Steps

Use generators for apps, libs, modules, test harnesses.
Create custom generators for recurring patterns:
- new bounded context
- new adapter (real + fake)
- new API surface
Ensure every generated project has standard targets:
- lint, test, build

Acceptance Criteria

Adding a module is one command.
New code immediately conforms to architecture rules.
No copy-paste scaffolding.

Phase 8 — Build the Testing Harness (Before Behavior)

Objective: Every layer has a test strategy before features land.

Recommended Stack

Layer	TypeScript	Python	Purpose
Unit	Vitest 3.x	pytest 8.x	Fast, isolated tests for domain logic
Integration	Vitest + Testcontainers	pytest + testcontainers-python	Adapter tests with real DBs/services
E2E/Browser	Playwright 1.49.x	Playwright	Cross-browser UI tests
API Mocking	MSW 2.x	respx	Network-level request interception
Fixtures	@faker-js/faker	factory_boy	Deterministic test data

Steps

Define the test pyramid explicitly:
- Domain unit tests
- Application/use-case tests
- Adapter integration tests
- Minimal E2E tests
Provide deterministic fixtures and builders.
Ensure tests run locally and in CI.

Vitest + Nx Configuration (Critical for Stability)

Pitfall Avoidance: Vitest + Nx integration requires matching versions and proper plugin configuration.

1. Install (version-locked to Nx)

# Nx 19+ includes @nx/vite with Vitest support
nx add @nx/vite
pnpm add -D vitest @vitest/coverage-v8 vite-tsconfig-paths

2. Configure nx.json for Vitest plugin

{
  "plugins": [
    {
      "plugin": "@nx/vite/plugin",
      "options": {
        "buildTargetName": "build",
        "testTargetName": "test",
        "serveTargetName": "serve"
      }
    }
  ]
}

3. Project vite.config.ts (with Vitest inline)

/// <reference types="vitest" />
import { defineConfig } from 'vite';
import tsconfigPaths from 'vite-tsconfig-paths';

export default defineConfig({
  plugins: [tsconfigPaths()],
  test: {
    globals: true,
    environment: 'node', // or 'jsdom' for React
    include: ['src/**/*.{test,spec}.{ts,tsx}'],
    coverage: {
      provider: 'v8',
      reporter: ['text', 'lcov'],
      exclude: ['node_modules/', 'src/**/*.d.ts'],
    },
    // CRITICAL: Disable watch mode for CI
    watch: false,
  },
});

4. Run Tests via Nx

# Single project
nx test my-lib

# Affected only (CI)
nx affected -t test

# With coverage
nx test my-lib --coverage

Python Testing with pytest

# Install
pip install pytest pytest-cov pytest-asyncio testcontainers factory_boy

# Run
pytest tests/ -v --cov=src --cov-report=html

pytest.ini

[pytest]
asyncio_mode = auto
testpaths = tests
python_files = test_*.py
python_functions = test_*
addopts = -v --tb=short

Playwright E2E Setup

# Install
pnpm add -D @playwright/test
npx playwright install chromium

# Run
nx e2e my-app-e2e

playwright.config.ts

import { defineConfig } from '@playwright/test';

export default defineConfig({
  testDir: './e2e',
  timeout: 30_000,
  retries: process.env.CI ? 2 : 0,
  use: {
    baseURL: 'http://localhost:4200',
    trace: 'on-first-retry',
    screenshot: 'only-on-failure',
  },
  webServer: {
    command: 'nx serve my-app',
    url: 'http://localhost:4200',
    reuseExistingServer: !process.env.CI,
  },
});

Testcontainers for Integration Tests

import { PostgreSqlContainer } from '@testcontainers/postgresql';

describe('UserRepository', () => {
  let container: PostgreSqlContainer;
  let connectionString: string;

  beforeAll(async () => {
    container = await new PostgreSqlContainer().start();
    connectionString = container.getConnectionUri();
  });

  afterAll(async () => {
    await container.stop();
  });

  it('should save and retrieve user', async () => {
    const repo = new UserRepository(connectionString);
    // ...
  });
});

MSW for API Mocking

import { http, HttpResponse } from 'msw';
import { setupServer } from 'msw/node';

const server = setupServer(
  http.get('/api/users', () => {
    return HttpResponse.json([{ id: 1, name: 'Test User' }]);
  })
);

beforeAll(() => server.listen());
afterEach(() => server.resetHandlers());
afterAll(() => server.close());

Acceptance Criteria

Core logic can be tested without real infrastructure.
Integration tests are reproducible.
E2E tests are few and stable.
Vitest: Runs via nx test, version matches @nx/vite.
Playwright: Runs via nx e2e, traces on failure.
pytest: Runs via just test-python, async tests supported.

Phase 9 — Implement Features as Vertical Slices

Objective: Each increment is production-shaped and safe.

Required Order (Do Not Skip)

Domain model + invariants
Use case / application logic
Ports (interfaces + fakes)
Adapters (real implementations)
Wiring (API/UI)
Observability hooks
Tests at appropriate layers
Feature flags (if rollout risk exists)

Pre-Implementation Checklist

Lesson Learned: Infrastructure issues commonly derail implementation. Verify before coding.

Generators compile and ready: just generator-check
SEA-DSL has @cqrs annotations: just flow-lint
All dependencies installed: just setup
Environment healthy: just doctor
tsconfig.base.json has path mappings for new libraries

Generator Usage Patterns

Critical: Prefer just recipes. Use direct Nx invocation as fallback only.

Just Recipes (Preferred):

# Bounded context (creates domain/ports/adapters layers)
just generator-bc <name>
just generator-bc <name> sea typescript    # With explicit scope/lang

# Adapter pair (fake + real implementation)
just generator-adapter <name> <ctx> http

# API surface (routes, handlers)
just generator-api <name> <ctx>

# List available generators
just generator-list

# Build generators (required after changes to generators/)
just generator-build

Fallback (Direct Nx Invocation):

pnpm nx g @sea/generators:bounded-context <name> --scope=sea
pnpm nx g @sea/generators:adapter <name> --context=<ctx> --backend=http
pnpm nx g @sea/generators:api-surface <name> --context=<ctx>

After running generators:

Add path mappings to tsconfig.base.json:

"@sea/<name>-domain": ["libs/<name>/domain/src/index.ts"],
"@sea/<name>-ports": ["libs/<name>/ports/src/index.ts"],
"@sea/<name>-adapters": ["libs/<name>/adapters/src/index.ts"]

Install commonly needed dependencies: pnpm add uuid && pnpm add -D @types/uuid

Polyglot Implementation Strategy

For bounded contexts requiring both Python and TypeScript:

Layer	TypeScript Location	Python Location
Domain Types	`domain/src/lib/types.ts`	`domain/src/gen/types.py`
Ports	`ports/src/lib/*.port.ts`	`ports/src/gen/ports.py`
Adapters	`adapters/src/lib/*.adapter.ts`	`services/<ctx>/src/adapters/`
Tests	`adapters/src/lib/*.spec.ts`	`tests/<ctx>/test_*.py`

Python Service Structure:

services/<name>/
├── pyproject.toml          # Dependencies (use poetry or pip)
├── Dockerfile              # Container build
├── main.py                 # FastAPI entry point
└── src/
    ├── api/routes.py       # HTTP endpoints
    ├── adapters/<name>.py  # Real adapter implementation
    └── config/settings.py  # Pydantic settings

Fake-First Development

Lesson Learned: Create fakes before real adapters. This enables:

Parallel development (UI can proceed while backend is built)

Deterministic testing (no external dependencies)

Contract verification (fake implements same port interface)

// 1. Define the port first
export interface LlmProviderPort {
  completeChat(request: ChatRequest): Promise<ChatCompletion>;
}

// 2. Implement the fake immediately
export class FakeLlmAdapter implements LlmProviderPort {
  async completeChat(request: ChatRequest): Promise<ChatCompletion> {
    // Deterministic response for testing
    return {
      id: `fake-${Date.now()}`,
      content: `Echo: ${request.messages[0]?.content ?? ''}`,
      // ...
    };
  }
}

// 3. Write tests against the fake
// 4. Then implement the real adapter

Common Pitfalls

Pitfall	Symptom	Prevention
Wrong generator prefix	“Cannot find generator ‘sea:adapter’”	Use `@sea/generators:adapter`
Missing path mappings	“Cannot find module ‘@sea/myctx-domain’”	Add to tsconfig.base.json
Generator type errors	TypeScript compilation fails	Use literal unions, not `string`
Missing @cqrs annotations	flow_lint.py fails	Add annotations before pipeline
Python imports fail	ModuleNotFoundError	Use `pip install -e .` in service dir

Acceptance Criteria

The slice can be enabled/disabled safely.
At least one meaningful test covers the slice.
The slice emits useful telemetry.
Generators compile before use.
Fakes exist for all ports (testability).
Both Python and TypeScript types align (if polyglot).

Phase 10 — Feature Flags & Rollout Discipline

Objective: Enable speed without permanent complexity.

Steps

Classify flags (release, experiment, ops, tiering).
Assign owner and expiry for every flag.
Enforce server-side authority for security.
Schedule regular flag cleanup.

Flag Types

Type	Purpose	Typical Lifespan
`release`	Gate incomplete features	Days to weeks
`experiment`	A/B testing, metrics-driven	Weeks to months
`ops`	Circuit breakers, kill switches	Permanent
`tiering`	Entitlement, plan-based access	Permanent

Acceptance Criteria

Flags do not accumulate indefinitely.
Rollouts can occur without redeploys.
Disabled paths are tested.
Every flag has an owner and expiry date documented.
Expired flags generate CI warnings.

Phase 11 — CI/CD & Release Mechanics

Objective: Fast feedback, safe merges, predictable releases.

Steps

Use graph-aware CI (only build/test what changed).
Enforce quality gates (lint, types, tests).
Produce immutable release artifacts.
Define environment promotion flow.

Smart Sync Check (Spec-First CI)

Lesson Learned: Spec-first development means specs change before generated code. CI must accommodate this workflow.

File Category	CI Behavior
Only `/src/gen/`	Determinism check runs — must match pipeline output
`docs/specs/` or `generators/`	Determinism check skipped — stale code expected
Both categories	Determinism check skipped — spec changes take precedence

Workflow after spec PR merges:

just pipeline <context>   # Regenerate from updated specs
git add -A && git commit -m "chore: regenerate from spec updates"

Acceptance Criteria

CI is fast enough developers don’t bypass it.
Releases are reproducible from tagged commits.
Rollbacks are possible.

Phase 12 — Continuous Hardening & Debt Management

Objective: Prevent slow decay of system quality.

Steps

Add resilience patterns (timeouts, retries, idempotency).
Rotate secrets and audit dependencies.
Track intentional shortcuts with payoff windows.
Set and monitor performance budgets.

Acceptance Criteria

Failures degrade gracefully.
Known risks are visible and tracked.
Technical debt is intentional, not accidental.

Invariant Checklist (Use This Everywhere)

Can a new dev be productive in <30 minutes?
Are architectural violations prevented by tooling?
Does all external interaction go through ports?
Can core logic run against fakes?
Is observability present before feature depth?
Does every feature ship as a vertical slice?
Are feature flags owned and removable?
Is there one canonical deployment shape?

1) ENGINEERING SOP — 1-Page Printable Checklist Template

Use this as a literal checklist. If a phase isn’t satisfied, don’t move forward.

Phase 1 — Bootstrap Dev Environment

Runtime versions pinned (language, package manager)
OS/system deps declared deterministically
Env auto-loaded on directory entry
One-command bootstrap documented

Pass if: clean clone → setup → dev up → test → dev works

Phase 2 — Repository Spine

Workspace/monorepo initialized
Apps vs libs/modules clearly separated
Formatting + linting enforced
README lists only canonical commands

Pass if: new dev knows where to start in <5 minutes

Phase 3 — Architecture & Dependency Rules

Layers explicitly defined (domain / app / adapters / UI)
Import constraints enforced by tooling
External systems reachable only via ports
Contracts live in one canonical place

Pass if: architectural violations fail automatically

Phase 4 — Local-First Runtime Stack

Pass if: local mirrors production shape, flags work via OpenFeature

Phase 5 — Observability Baseline (ADR-029)

OpenTelemetry SDK installed (Node.js + Python)
OTel Collector configured (infra/otel/otel-collector-config.yaml)
PII scrubbing processor enabled in Collector
Semantic context attributes set (sea.domain, sea.concept, sea.regime_id)
Logfire configured for Python services
Health + readiness checks implemented
Audit boundaries defined (if required)
Metrics exported to Vanta for compliance evidence

Pass if: “what broke?” is answerable immediately, all telemetry uses OTel

Phase 6 — Contracts Before Code

API schemas defined and versioned
Async/job/event schemas defined
Structured outputs for nondeterministic systems
Stable error taxonomy established

Pass if: boundaries are frozen before implementation

Phase 7 — Generator-Driven Structure

Apps/libs created via generators (just generator-bc, just generator-adapter)
Custom generators for recurring patterns exist
Every project has lint/test/build targets
Generator Pre-Flight (before using)
- Generators compile: just generator-build (fallback: pnpm nx run generators:build)
- Use just recipes: just generator-bc <name> (fallback: pnpm nx g @sea/generators:bounded-context)
- Types use literal unions (not string) for narrowed params

Pass if: adding a module is one command, generators compile cleanly

Phase 8 — Testing Harness

Pass if: just test-ts and just test-e2e both pass

Phase 9 — Vertical Slice Delivery

Pre-Implementation Checks:

Generators compile: just generator-check (fallback: pnpm nx run generators:build)
SEA-DSL flow annotations valid: just flow-lint
Dependencies installed: just setup
Environment healthy: just doctor
Path mappings exist in tsconfig.base.json for new libs

For each feature:

Post-Generator Checklist:

Path mappings added to tsconfig.base.json:

"@sea/<ctx>-domain": ["libs/<ctx>/domain/src/index.ts"]

Common deps installed: pnpm add uuid && pnpm add -D @types/uuid
Python service has pip install -e . run

Pass if: slice is production-shaped, reversible, and fakes exist for all ports

Phase 10 — Feature Flags Discipline

Flag type classified (release/experiment/ops/tiering)
Owner + expiry defined in features.yaml comments
Server authoritative for security (Flipt server-side)
Cleanup scheduled (CI check for expired flags)
OpenFeature provider registered, evalContext mapped

Pass if: flags don’t accumulate, governance is automated

Phase 11 — CI/CD & Releases

Graph-aware CI
Quality gates enforced
Immutable build artifacts
Promotion/rollback path defined

Pass if: releases are boring and repeatable

Phase 12 — Continuous Hardening

Timeouts/retries/idempotency where needed
Secrets rotation policy
Debt items tracked with payoff windows
Performance budgets monitored

Pass if: quality improves over time