Complete the SEA-Forge™ code generation pipeline so that running python tools/codegen/gen.py <context>.manifest.json produces fully working code with no TODOs for all 17 bounded contexts.

Current State

What Works

What’s Broken

Task 1: Create SDS YAML Files for All Contexts

File Location Pattern

docs/specs/<context>/<context>.sds.yaml

SDS YAML Schema (Authoritative Reference)

Required Structure for Each Context

sds:
  id: SDS-<NNN>
  title: <Context Name> Context SDS
  bounded_context: <context-name>
  version: 1.0.0
  status: draft

# CRITICAL: Define fields for ALL aggregates in the manifest
domain:
  aggregates:
    - name: <AggregateName>
      root: true
      identity: <IdentityField>
      fields:
        id: string
        # Add all fields with types: string, number, boolean, datetime, object, object[], uuid
        field_name: type
      invariants: []

# Define commands with input/output schemas
cqrs:
  commands:
    - id: CMD-<NNN>
      name: <CommandName>
      input:
        field_name: type
      output: <AggregateName>
      transactional: true
      emits: [<EventName>]
      touches:
        aggregates: [<AggregateName>]

  queries:
    - id: QRY-<NNN>
      name: <QueryName>
      input:
        id: string
      output: <AggregateName>
      consistency: eventual
      read_model:
        source: primary_db

# Define events with payload schemas
events:
  - id: EVT-<NNN>
    name: <EventName>
    payload:
      id: string
      timestamp: datetime
    publish:
      topic: <context>.<event_name>.v1
      delivery: at_least_once

Contexts Requiring SDS YAML Creation

Priority 1 - Have commands/queries but no field definitions:

| Context | Aggregates | Commands | Queries | Status |
|---------|------------|----------|---------|--------|
| llm-provider | 9 | 2 | 2 | Needs SDS YAML |
| context | 21 | 3 | 4 | Needs SDS YAML |
| sea-api | 9 | 2 | 1 | Needs SDS YAML |

Priority 2 - Have aggregates but no commands/queries:

| Context | Aggregates | Commands | Queries | Status |
|---------|------------|----------|---------|--------|
| governance-runtime | 5 | 0 | 0 | Needs CQRS + fields |
| governance | 3 | 0 | 0 | Needs CQRS + fields |
| documentation | 17 | 0 | 0 | Needs CQRS + fields |
| ingest | 2 | 0 | 0 | Needs CQRS + fields |
| query | 3 | 0 | 0 | Needs CQRS + fields |
| federal | 4 | 0 | 0 | Needs CQRS + fields |
| developer-tooling | 6 | 0 | 0 | Needs CQRS + fields |

Priority 3 - Already have field definitions (verify only):

| Context | Aggregates | Commands | Queries | Status |
|---------|------------|----------|---------|--------|
| semantic-core | 4 | 4 | 5 | ✅ Complete |
| cognitive-extension | 5 | 0 | 0 | Has fields, needs CQRS |
| memory | 3 | 2 | 2 | Has fields, verify |
| healthcare | 4 | 4 | 4 | Has fields, verify |
| finance | 4 | 0 | 0 | Has fields, needs CQRS |
| architectural-governance | 4 | 0 | 0 | Has fields, needs CQRS |

How to Derive Field Definitions

- name: LlmProvider
  root: true
  identity: id
  fields:
    id: string
    name: string
    endpoint: string
    api_key_ref: string  # Reference to secret, not actual key
    model_specs: object[]
    is_active: boolean
    created_at: datetime
    updated_at: datetime

Task 2: Update ir_to_manifest.py to Read SDS YAML

File Location

tools/ir_to_manifest.py

Current Behavior (BROKEN)

aggregates = {sym["name"]: {"fields": {}} for sym in entities.values()}

Required Changes

Dependencies to Add

# At top of file
import yaml  # pip install pyyaml
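
One way the empty `{"fields": {}}` placeholders could be filled from the SDS YAML is sketched below. This is an illustration under assumptions, not the actual change to `tools/ir_to_manifest.py`: the helper names (`load_sds_fields`, `fields_by_aggregate`, `merge_fields`) are hypothetical, and the sketch assumes the SDS file follows the `domain.aggregates[*].fields` layout shown in Task 1.

```python
from __future__ import annotations

from pathlib import Path
from typing import Any


def fields_by_aggregate(doc: dict[str, Any]) -> dict[str, dict[str, str]]:
    """Index `domain.aggregates[*].fields` of a parsed SDS document by aggregate name."""
    aggregates = (doc.get("domain") or {}).get("aggregates") or []
    return {agg["name"]: dict(agg.get("fields") or {}) for agg in aggregates if "name" in agg}


def load_sds_fields(sds_path: Path) -> dict[str, dict[str, str]]:
    """Parse an SDS YAML file and return its field definitions per aggregate."""
    import yaml  # PyYAML, imported lazily so the pure helpers need no third-party package

    doc = yaml.safe_load(sds_path.read_text(encoding="utf-8")) or {}
    return fields_by_aggregate(doc)


def merge_fields(
    aggregates: dict[str, dict[str, Any]],
    sds_fields: dict[str, dict[str, str]],
) -> dict[str, dict[str, Any]]:
    """Return a copy of `aggregates` with `fields` taken from the SDS where available.

    Aggregates that the SDS does not describe keep whatever fields they already had,
    so a missing SDS entry shows up later as an empty `fields` dict instead of an error.
    """
    merged: dict[str, dict[str, Any]] = {}
    for name, agg in aggregates.items():
        fields = sds_fields.get(name, agg.get("fields") or {})
        merged[name] = {**agg, "fields": dict(fields)}
    return merged
```

With helpers like these, the existing dict comprehension could stay as it is and be followed by something like `aggregates = merge_fields(aggregates, load_sds_fields(sds_path))`, where `sds_path` points at `docs/specs/<context>/<context>.sds.yaml`.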

Verification Steps

1. Verify SDS YAML files exist

ls docs/specs/*/*.sds.yaml 2>/dev/null | wc -l
# Should output: 17 (one per context)

2. Regenerate all manifests

for ctx in llm-provider context sea-api governance-runtime governance \
           documentation ingest query federal developer-tooling \
           cognitive-extension memory healthcare finance architectural-governance; do
  python tools/ir_to_manifest.py docs/specs/$ctx/$ctx.ir.json > docs/specs/$ctx/$ctx.manifest.json
done

3. Check manifests have field definitions

python3 -c "
import json
from pathlib import Path

SUFFIX = '.manifest.json'
for mf in sorted(Path('docs/specs').rglob('*' + SUFFIX)):
    # read_text closes the file for us, unlike a bare open()
    data = json.loads(mf.read_text(encoding='utf-8'))
    aggs = data.get('model', {}).get('aggregates', {})
    aggs_with_fields = sum(1 for a in aggs.values() if a.get('fields'))
    ctx = mf.name[:-len(SUFFIX)]  # context name without the .manifest.json suffix
    if len(aggs) != aggs_with_fields:
        print(f'❌ {ctx}: {aggs_with_fields}/{len(aggs)} aggregates have fields')
    else:
        print(f'✅ {ctx}: All {len(aggs)} aggregates have fields')
"

4. Generate code for all contexts

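for ctx in semantic-core llm-provider context sea-api governance-runtime governance \
           documentation ingest query federal developer-tooling \
           cognitive-extension memory healthcare finance architectural-governance; do
  python tools/codegen/gen.py docs/specs/$ctx/$ctx.manifest.json
done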

5. Check for TODOs in generated code

grep -rn "TODO\|Implement" libs/*/application/src/gen/ libs/*/domain/src/gen/ 2>/dev/null
# Should output: nothing (no matches)

6. Run tests

python -m pytest libs/*/adapters/tests/ -v

Example: Complete llm-provider SDS YAML

sds:
  id: SDS-049
  title: LLM Provider Context SDS
  bounded_context: llm-provider
  version: 1.0.0
  status: draft

domain:
  aggregates:
    - name: LlmProvider
      root: true
      identity: id
      fields:
        id: string
        name: string
        provider_type: string  # openai, anthropic, ollama, openrouter
        endpoint: string
        api_key_ref: string
        is_active: boolean
        created_at: datetime
        updated_at: datetime

    - name: ModelSpec
      identity: id
      fields:
        id: string
        provider_id: string
        model_name: string
        context_window: number
        max_tokens: number
        supports_streaming: boolean
        supports_functions: boolean
        cost_per_1k_input: number
        cost_per_1k_output: number

    - name: ChatMessage
      identity: id
      fields:
        id: string
        role: string  # system, user, assistant, function
        content: string
        name: string
        function_call: object
        created_at: datetime

    - name: ChatCompletion
      identity: id
      fields:
        id: string
        provider_id: string
        model: string
        messages: object[]
        response: string
        usage: object
        finish_reason: string
        created_at: datetime
        latency_ms: number

    - name: Embedding
      identity: id
      fields:
        id: string
        provider_id: string
        model: string
        input_text: string
        vector: object  # float array
        dimensions: number
        created_at: datetime

    - name: FallbackChain
      identity: id
      fields:
        id: string
        name: string
        provider_ids: object[]  # ordered list
        fallback_on_error: boolean
        fallback_on_rate_limit: boolean
        created_at: datetime

    - name: ProviderConfig
      identity: id
      fields:
        id: string
        provider_id: string
        timeout_ms: number
        max_retries: number
        rate_limit_rpm: number
        rate_limit_tpm: number

    - name: ProviderHealth
      identity: id
      fields:
        id: string
        provider_id: string
        is_healthy: boolean
        last_check_at: datetime
        error_count: number
        avg_latency_ms: number

    - name: TokenUsage
      identity: id
      fields:
        id: string
        provider_id: string
        model: string
        prompt_tokens: number
        completion_tokens: number
        total_tokens: number
        cost: number
        timestamp: datetime

cqrs:
  commands:
    - id: CMD-001
      name: CompleteChat
      description: Send chat messages and receive completion
      input:
        provider_id: string
        model: string
        messages: object[]
        temperature: number
        max_tokens: number
      output: ChatCompletion
      transactional: false
      emits: [ChatCompleted]
      touches:
        aggregates: [ChatCompletion, TokenUsage]
      preconditions:
        - provider must be active
        - model must be supported

    - id: CMD-002
      name: GenerateEmbedding
      description: Generate vector embedding for text
      input:
        provider_id: string
        model: string
        input_text: string
      output: Embedding
      transactional: false
      emits: [EmbeddingGenerated]
      touches:
        aggregates: [Embedding, TokenUsage]

  queries:
    - id: QRY-001
      name: GetProvider
      description: Retrieve provider by ID
      input:
        id: string
      output: LlmProvider
      consistency: strong
      read_model:
        source: primary_db

    - id: QRY-002
      name: ListProviders
      description: List all active providers
      input: {}
      output: LlmProvider[]
      consistency: eventual
      read_model:
        source: primary_db

events:
  - id: EVT-001
    name: ChatCompleted
    payload:
      completion_id: string
      provider_id: string
      model: string
      tokens_used: number
      latency_ms: number
      timestamp: datetime
    publish:
      topic: llm-provider.chat_completed.v1
      delivery: at_least_once

  - id: EVT-002
    name: EmbeddingGenerated
    payload:
      embedding_id: string
      provider_id: string
      dimensions: number
      timestamp: datetime
    publish:
      topic: llm-provider.embedding_generated.v1
      delivery: at_least_once
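
The topics in this example follow the `<context>.<event_name>.v1` pattern from the schema, with the PascalCase event name converted to snake_case. A small helper along these lines could keep topic names consistent when writing the remaining SDS files. This is a sketch: `event_topic` is a hypothetical name, and the conversion rule is inferred from the two topics above.

```python
import re


def event_topic(context: str, event_name: str, version: int = 1) -> str:
    """Build a topic such as 'llm-provider.chat_completed.v1' from a PascalCase event name."""
    # Insert an underscore before every capital letter except the first, then lowercase.
    # Note: runs of capitals (e.g. 'LLMUpdated') would be split letter by letter.
    snake = re.sub(r"(?<!^)(?=[A-Z])", "_", event_name).lower()
    return f"{context}.{snake}.v{version}"


print(event_topic("llm-provider", "ChatCompleted"))       # llm-provider.chat_completed.v1
print(event_topic("llm-provider", "EmbeddingGenerated"))  # llm-provider.embedding_generated.v1
```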

Prompt: Complete the SDS YAML → Manifest Pipeline for All Bounded Contexts

Objective