For Claude: REQUIRED SUB-SKILL: Use superpowers:executing-plans to implement this plan task-by-task.
Goal: Complete all tasks listed in tmp/coderabbit_results.md with spec-first changes, updated handlers/adapters/tools, tests, and just ci passing.
Architecture: Apply spec-first updates to SDS/SEA-DSL where required, then regenerate artifacts via the pipeline. Implement handwritten logic in adapters and application handlers to use transactions/outbox, safe parsing, and defensive response handling. Update IR→manifest merging to preserve IR metadata and handle mixed input shapes. Align documentation instructions and policies with the repository conventions. Follow TDD for every behavioral change.
Tech Stack: Python (handlers/tools), pytest, SDS/SEA-DSL pipeline (just pipeline llm-provider), Markdown documentation.
Files:
- tmp/coderabbit_results.md

Step 1: Review tasks and classify generated vs handwritten
Step 2: Note spec-first items
Files:
- tools/tests/test_ir_to_manifest.py
- tools/ir_to_manifest.py

Step 1: Write failing test for command merge + nullable type parsing
```python
from tools.ir_to_manifest import merge_sds_cqrs


def test_merge_commands_preserves_existing_input_and_parses_nullable_types():
    cqrs = {
        "commands": {
            "CreateThing": {
                "input": {
                    "existing": {"type": "string", "required": True, "validation": {"min": 1}}
                }
            }
        },
        "queries": {},
    }
    sds = {
        "cqrs": {
            "commands": [
                {
                    "name": "CreateThing",
                    "input": {
                        "existing": {"type": "string"},
                        "optional_field": "number | null",
                    },
                }
            ]
        }
    }
    merged = merge_sds_cqrs(cqrs, sds, "ctx")
    cmd_input = merged["commands"]["CreateThing"]["input"]
    assert cmd_input["existing"]["validation"] == {"min": 1}
    assert cmd_input["optional_field"]["type"] == "number"
    assert cmd_input["optional_field"]["required"] is False
```
Step 2: Run test to verify it fails
Run: python -m pytest tools/tests/test_ir_to_manifest.py::test_merge_commands_preserves_existing_input_and_parses_nullable_types -v
Expected: FAIL (merge or nullable parsing missing).
Step 3: Write failing test for list inputs and query merge
```python
def test_merge_queries_accepts_list_inputs_and_merges():
    cqrs = {
        "commands": {},
        "queries": {"FindThing": {"input": {"pre": {"type": "string", "required": True}}}},
    }
    sds = {"cqrs": {"queries": [{"name": "FindThing", "input": ["id", "scope"]}]}}
    merged = merge_sds_cqrs(cqrs, sds, "ctx")
    qry_input = merged["queries"]["FindThing"]["input"]
    assert qry_input["pre"]["type"] == "string"
    assert qry_input["id"]["type"] == "string"
    assert qry_input["scope"]["required"] is True
```
Step 4: Run test to verify it fails
Run: python -m pytest tools/tests/test_ir_to_manifest.py::test_merge_queries_accepts_list_inputs_and_merges -v
Expected: FAIL.
Step 5: Implement minimal merge + parsing in tools/ir_to_manifest.py

Merge SDS command/query inputs into the existing CQRS maps without clobbering IR metadata, accept both dict and list input shapes, and parse nullable type strings by splitting on `|` and trimming.

Step 6: Run tests
Run: python -m pytest tools/tests/test_ir_to_manifest.py -v
Expected: PASS.
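A minimal sketch of what Step 5 could implement; the helper name `_parse_type` and the string/required defaults are assumptions for illustration, not the repo's existing API:

```python
def _parse_type(type_str):
    """Parse a type string like 'number | null' into (base_type, required)."""
    parts = [p.strip() for p in str(type_str).split("|")]
    required = "null" not in parts
    base = next((p for p in parts if p != "null"), "string")
    return base, required


def merge_sds_cqrs(cqrs, sds, context):
    """Merge SDS command/query inputs into the IR CQRS dict,
    preserving metadata already present in the IR."""
    for kind in ("commands", "queries"):
        for item in sds.get("cqrs", {}).get(kind, []):
            target = cqrs.setdefault(kind, {}).setdefault(item["name"], {})
            merged_input = target.setdefault("input", {})
            raw = item.get("input", {})
            # A bare list of field names defaults each to a required string.
            if isinstance(raw, list):
                raw = {name: "string" for name in raw}
            for field, spec in raw.items():
                if field in merged_input:
                    continue  # keep existing IR metadata (validation, etc.)
                if isinstance(spec, str):
                    base, required = _parse_type(spec)
                    merged_input[field] = {"type": base, "required": required}
                else:
                    merged_input[field] = dict(spec)
    return cqrs
```

This shape satisfies both tests above: existing fields keep their IR metadata, `"number | null"` becomes an optional number, and list inputs expand to required string fields.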
Files:
- docs/specs/llm-provider/llm-provider.sds.yaml

Step 1: Update nullable fields
- ModelSpec.context_window, ModelSpec.max_tokens, ModelSpec.supports_streaming, ModelSpec.supports_functions, ModelSpec.cost_per_1k_input, ModelSpec.cost_per_1k_output → add `| null`.
- ProviderHealth.avg_latency_ms → add `| null`.

Step 2: Update QRY-002 output and description

- Cover the case where ProviderHealth[].provider_id is null.

Step 3: Validate and regenerate

Run:
- just sds-validate docs/specs/llm-provider/llm-provider.sds.yaml
- just pipeline llm-provider

Files:
- libs/llm-provider/ports/src/lib/llm_provider_port.py
- libs/llm-provider/application/src/impl/list_available_models_handler_impl.py
- libs/llm-provider/application/src/impl/generate_embedding_handler_impl.py
- libs/llm-provider/application/src/impl/complete_chat_handler_impl.py
- libs/llm-provider/application/src/impl/get_provider_health_handler_impl.py
- libs/llm-provider/adapters/src/lib/litellm_adapter.py

Step 1: Update/extend unit tests (RED)
- libs/llm-provider/application/tests/unit/test_list_models_handler.py — assert result.data matches mocked models.
- libs/llm-provider/application/tests/unit/test_generate_embedding_handler.py
- libs/llm-provider/application/tests/unit/test_complete_chat_handler.py — assert save call count is exactly 2; cover the response.message fallback.
- libs/llm-provider/application/tests/unit/test_get_provider_health_handler.py
Step 2: Run each test to verify failure

Run each pytest target individually and confirm expected failures.
Step 3: Implement minimal fixes (GREEN)
- Ports: make EmbeddingResponse.dimensions computed and HealthStatus.latency_ms optional.
- List models handler: default to None for unknown ModelSpec fields.
- Generate embedding handler: coerce with int(response.dimensions) and log exceptions; save embedding + outbox in a transaction; no direct publish.
- Complete chat handler: guard response.message access; use a transaction for saves; save to the outbox; log exceptions; re-raise asyncio.CancelledError and include the exception type in the error string.
- Get provider health handler: when provider_id is None, list providers and check each; safely parse latency.
- LiteLLM adapter: handle missing/empty choices; select the model by provider_id for the health check.

Step 4: Run updated tests

Run pytest for each updated module; confirm all green.
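The defensive handler shape described above can be sketched as follows; `FakeUnitOfWork`, the staged-tuple transaction, and the `(ok, err)` return are illustrative stand-ins for the repo's real unit-of-work and result types:

```python
import asyncio
import contextlib
import logging

logger = logging.getLogger(__name__)


class FakeUnitOfWork:
    """Toy unit-of-work: records are only committed if the block exits cleanly."""

    def __init__(self):
        self.saved = []

    @contextlib.asynccontextmanager
    async def transaction(self):
        staged = []
        yield staged
        self.saved.extend(staged)  # commit only when no exception escaped


async def generate_embedding(uow, provider_response):
    try:
        # Providers sometimes return dimensions as a string; coerce defensively.
        dimensions = int(provider_response["dimensions"])
        async with uow.transaction() as tx:
            tx.append(("embedding", dimensions))
            # Write the event to the outbox inside the same transaction
            # instead of publishing to the bus directly.
            tx.append(("outbox", "EmbeddingGenerated"))
        return True, None
    except asyncio.CancelledError:
        raise  # never swallow task cancellation
    except Exception as exc:
        logger.exception("embedding generation failed")
        # Include the exception type so callers can distinguish failures.
        return False, f"{type(exc).__name__}: {exc}"


uow = FakeUnitOfWork()
ok, err = asyncio.run(generate_embedding(uow, {"dimensions": "1536"}))
```

The same pattern (transaction + outbox, CancelledError re-raise, typed error string) applies to the chat and health handlers.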
Files:
- .github/copilot-instructions.md
- .github/instructions/workflow.instructions.md
- .github/instructions/quality.instructions.md
- .github/instructions/sea-dsl.instructions.md
- .github/instructions/technical-debt.instructions.md
- .github/instructions/architecture.instructions.md
- .agents/architecture.instructions.md
- .agents/code-style.instructions.md
- AGENTS.md (policy clarification if needed)

Step 1: Write/update text per CodeRabbit tasks
- Add just sea-validate to the quality checklist.
- Fix the @cqrs JSON example.
- Update .agents/code-style.instructions.md with tracked/timeboxed markers.

Step 2: No tests required
Files:
- tmp/coderabbit_results.md

Step 1: Check off all tasks

Replace [ ] with [x] after completion.

Step 2: Run targeted tests

- python -m pytest tools/tests/test_ir_to_manifest.py -v
- python -m pytest libs/llm-provider/application/tests/unit/test_list_models_handler.py -v
- python -m pytest libs/llm-provider/application/tests/unit/test_generate_embedding_handler.py -v
- python -m pytest libs/llm-provider/application/tests/unit/test_complete_chat_handler.py -v
- python -m pytest libs/llm-provider/application/tests/unit/test_get_provider_health_handler.py -v

Step 3: Run full CI
Run: just ci
Expected: PASS.
Step 4: Summarize changes

Include the final just ci output in the summary.