Phase 9 Artifact — This plan enables vertical slice delivery per ENGINEERING.SOP.md.
Provide a unified LLM provider interface via LiteLLM, enabling SEA™ services to interact with OpenAI, Anthropic, Ollama, and OpenRouter through a single abstraction layer with Policy Gateway integration.
Critical ordering — This plan must be implemented after its dependencies and before services that consume it.
| Dependency | Plan | Reason |
|---|---|---|
| ⬆️ Depends on | P007 GovernedSpeed™ Runtime | LLM calls route through Policy Gateway (SDS-047) |
| ⬆️ Depends on | P009 Observability | LLM calls emit OpenTelemetry spans |
| ⬇️ Needed by | P020 Cognitive Extension Layer | Context Analyzer and Artifact Engine use LLM |
| ⬇️ Needed by | P019 Semantic Core Services | Embedding generation for Knowledge Graph |
| ⬇️ Needed by | P013 Generative Synthesis | LLM for code generation |
STOP. Before implementing, validate all input specifications.
| Check | Requirement | Pass |
|---|---|---|
| ADR-035 exists | File: docs/specs/shared/adr/035-llm-provider-abstraction.md | [x] |
| Has Context section | Explains multi-provider LLM integration needs | [x] |
| Has Decision section | LiteLLM as unified abstraction | [x] |
| Has Constraints section | MUST/MUST NOT statements | [x] |
| Has Consequences section | Trade-offs documented | [x] |
| References prior ADRs | ADR-028 (GovernedSpeed™ LLMOps) | [x] |
| Check | Requirement | Pass |
|---|---|---|
| PRD-010 exists | docs/specs/shared/prd/010-ai-governance-runtime.md | [x] |
| Has Satisfies: ADR-028 | Traces to governance decision | [x] |
| Uses EARS notation | When/The system shall… | [x] |
| Each REQ has ID | REQ-GS-001..005 | [x] |
| Check | Requirement | Pass |
|---|---|---|
| SDS-049 exists | docs/specs/shared/sds/049-llm-provider-service.md | [x] |
| Has domain glossary | LlmProvider, ChatCompletion, Embedding | [x] |
| Entities defined | ProviderConfig, ModelSpec | [x] |
| Flows defined | CMD-LLM-001..002, QRY-LLM-001..002 | [x] |
| Ports defined | LlmProviderPort, PolicyGatewayPort, ProviderConfigPort | [x] |
| Invariants defined | POL-LLM-001..005, INV-LLM-001..004 | [x] |
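The ports and value objects named above can be sketched in TypeScript. The identifiers (ChatMessage, Embedding, LlmProviderPort) follow the SDS-049 glossary; the field and method shapes below are illustrative assumptions, not the actual spec:

```typescript
// Illustrative sketch of SDS-049 domain types and PORT-LLM-001.
// Field and method signatures are assumptions for illustration only;
// the real definitions live in libs/llm-provider/{domain,ports}.

export type ChatRole = "system" | "user" | "assistant";

export interface ChatMessage {
  role: ChatRole;
  content: string;
}

export interface ChatCompletion {
  model: string;
  message: ChatMessage;
}

export interface Embedding {
  model: string;
  vector: number[];
}

// PORT-LLM-001: every provider adapter (LiteLLM HTTP client, fake) implements this.
export interface LlmProviderPort {
  completeChat(model: string, messages: ChatMessage[]): Promise<ChatCompletion>;
  generateEmbedding(model: string, text: string): Promise<Embedding>;
}

// Small helper used throughout the examples in this plan.
export const userMessage = (content: string): ChatMessage => ({
  role: "user",
  content,
});
```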
Complete traceability from ADR → PRD → SDS → Implementation.
```mermaid
graph TD
    ADR35[ADR-035: LLM Provider Abstraction] --> PRD10[PRD-010: AI Governance Runtime]
    ADR28[ADR-028: GovernedSpeed™ LLMOps] --> PRD10
    PRD10 --> SDS49[SDS-049: LLM Provider Service]
    SDS49 --> SDS47[SDS-047: GovernedSpeed™ Runtime]
    SDS49 --> C1A[C1A: Domain + Ports]
    SDS49 --> C1B[C1B: LiteLLM Adapter]
    SDS49 --> C2A[C2A: Fake Adapter + Tests]
    style ADR35 fill:#e1f5ff
    style ADR28 fill:#e1f5ff
    style PRD10 fill:#fff4e1
    style SDS49 fill:#e8f5e9
    style SDS47 fill:#e8f5e9
```
| ADR ID | PRD ID | SDS Element | Cycle |
|---|---|---|---|
| ADR-035 | PRD-010 | Entity: ProviderConfig | C1A |
| ADR-035 | PRD-010 | Port: LlmProviderPort | C1A |
| ADR-035 | PRD-010 | Flow: CMD-LLM-001 | C1B |
| ADR-035 | PRD-010 | Adapter: LiteLLMAdapter | C1B |
| ADR-035 | PRD-010 | Adapter: FakeLlmAdapter | C2A |
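CMD-LLM-001 routes every completion through the Policy Gateway (SDS-047) before it reaches a provider. A minimal handler sketch, assuming hypothetical port shapes (the `allow` and `completeChat` signatures are illustrative, not taken from the SDS):

```typescript
// Sketch of the CMD-LLM-001 flow: policy check first, then provider call.
// Port shapes are assumed for illustration; real signatures live in SDS-049.

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface PolicyGatewayPort {
  // Returns true when the POL-LLM-* policies allow this action/model pair.
  allow(action: string, model: string): Promise<boolean>;
}

interface LlmProviderPort {
  completeChat(
    model: string,
    messages: ChatMessage[],
  ): Promise<{ content: string }>;
}

export class CompleteChatHandler {
  constructor(
    private readonly policy: PolicyGatewayPort,
    private readonly provider: LlmProviderPort,
  ) {}

  async execute(
    model: string,
    messages: ChatMessage[],
  ): Promise<{ content: string }> {
    if (!(await this.policy.allow("llm.chat.complete", model))) {
      throw new Error(`Policy Gateway denied chat completion for ${model}`);
    }
    return this.provider.completeChat(model, messages);
  }
}
```

Keeping the gateway behind its own port means C2A tests can stub it out while integration tests exercise the real SDS-047 runtime.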
Rollout is gated by the `llm.provider.enabled` feature flag.

| Cycle | Worktree | Branch | Wave | Implements |
|---|---|---|---|---|
| C1A | ../SEA-p31-c1A | cycle/p31-c1A-llm-domain | 1 | Domain + Ports |
| C1B | ../SEA-p31-c1B | cycle/p31-c1B-litellm-adapter | 1 | LiteLLM Adapter |
| C2A | ../SEA-p31-c2A | cycle/p31-c2A-fake-tests | 2 | Fake Adapter + Tests |
| C3A | ../SEA-p31-c3A | cycle/p31-c3A-ts-client | 3 | TypeScript HTTP Client |
Files per worktree:

- `../SEA-p31-c1A` (`just generator-bc llm-provider`):
  - libs/llm-provider/domain/src/lib/provider-config.ts
  - libs/llm-provider/domain/src/lib/chat-message.ts
  - libs/llm-provider/domain/src/lib/embedding.ts
  - libs/llm-provider/ports/src/lib/llm-provider.port.ts
- `../SEA-p31-c1B`:
  - services/llm-provider/src/adapters/litellm_adapter.py
  - services/llm-provider/src/api/routes.py
  - services/llm-provider/Dockerfile
- `../SEA-p31-c2A`:
  - libs/llm-provider/adapters/src/lib/fake.adapter.ts
  - libs/llm-provider/adapters/src/lib/fake.adapter.spec.ts
- `../SEA-p31-c3A`:
  - libs/llm-provider/adapters/src/lib/litellm.adapter.ts
  - libs/llm-provider/adapters/src/lib/litellm.adapter.spec.ts

| Generator | Command | When to Use |
|---|---|---|
| Bounded Context | `just generator-bc llm-provider` | Create llm-provider context |
| Adapter | `just generator-adapter litellm llm-provider` | Create LiteLLM adapter |
| Adapter | `just generator-adapter fake llm-provider` | Create Fake adapter |
```text
libs/llm-provider/
├── domain/
│   └── src/
│       ├── index.ts
│       └── lib/
│           ├── provider-config.ts        # [SDS-049: Entity ProviderConfig]
│           ├── model-spec.ts             # [SDS-049: Entity ModelSpec]
│           ├── chat-message.ts           # [SDS-049: VO ChatMessage]
│           └── embedding.ts              # [SDS-049: VO Embedding]
├── ports/
│   └── src/
│       └── lib/
│           ├── llm-provider.port.ts      # [SDS-049: PORT-LLM-001]
│           └── policy-gateway.port.ts    # [SDS-049: PORT-LLM-002]
├── adapters/
│   └── src/
│       └── lib/
│           ├── litellm.adapter.ts        # HTTP client to Python service
│           ├── litellm.adapter.spec.ts
│           ├── fake.adapter.ts           # Deterministic test double
│           └── fake.adapter.spec.ts
└── application/
    └── src/
        └── lib/
            ├── complete-chat.handler.ts       # [SDS-049: CMD-LLM-001]
            └── generate-embedding.handler.ts  # [SDS-049: CMD-LLM-002]

services/llm-provider/                    # Python FastAPI service
├── src/
│   ├── adapters/
│   │   └── litellm_adapter.py            # LiteLLM wrapper
│   ├── api/
│   │   └── routes.py                     # OpenAI-compatible API
│   └── main.py
├── pyproject.toml
└── Dockerfile
```
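The deterministic fake in C2A lets unit tests run without Ollama or any network provider. A sketch, with an assumed port shape (the real interface is defined in llm-provider.port.ts):

```typescript
// Sketch of a deterministic FakeLlmAdapter test double (C2A).
// The port shape here is assumed for illustration.

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

interface LlmProviderPort {
  completeChat(
    model: string,
    messages: ChatMessage[],
  ): Promise<{ content: string }>;
  generateEmbedding(model: string, text: string): Promise<{ vector: number[] }>;
}

export class FakeLlmAdapter implements LlmProviderPort {
  // Canned replies keyed by the last user message; falls back to an echo.
  constructor(private readonly canned: Record<string, string> = {}) {}

  async completeChat(model: string, messages: ChatMessage[]) {
    const last = messages[messages.length - 1]?.content ?? "";
    return { content: this.canned[last] ?? `fake(${model}): ${last}` };
  }

  async generateEmbedding(_model: string, text: string) {
    // Deterministic pseudo-embedding derived from character codes,
    // so spec assertions are stable across runs.
    const vector = Array.from(text)
      .slice(0, 8)
      .map((c) => c.charCodeAt(0) / 255);
    return { vector };
  }
}
```

Because responses depend only on the inputs, fake.adapter.spec.ts can assert exact outputs without mocking frameworks or retries.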
| Dependency | Type | Version | Package | Justification |
|---|---|---|---|---|
| litellm | Python | 1.56.x | litellm | Unified LLM provider abstraction |
| ollama | Docker | 0.5.x | ollama/ollama | Local development provider |
| fastapi | Python | 0.115.x | fastapi | LLM service API framework |
| opentelemetry-instrumentation-litellm | Python | — | opentelemetry-instrumentation-litellm | OTel integration |
Exit criteria:

- `just spec-guard` passes
- `just test` passes
- `just nx-run workspace check` passes
- `git diff --exit-code` succeeds (repo clean after generation)
- Spans carry `llm.model` and `llm.provider` attributes
```sh
# Start Ollama locally
just dev-up

# Test LLM service
curl -X POST http://localhost:8001/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "ollama/llama3.2", "messages": [{"role": "user", "content": "Hello"}]}'

# Run adapter tests
just test libs/llm-provider
```
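The C3A TypeScript client talks to the same OpenAI-compatible endpoint the curl smoke test exercises. A sketch of the request path, assuming the payload follows the OpenAI chat-completions shape that LiteLLM serves (the base URL and response parsing are illustrative):

```typescript
// Sketch of the litellm.adapter.ts request path (C3A).
// Endpoint URL and response shape are assumptions based on the
// OpenAI-compatible API exposed by the Python service.

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Building the request separately keeps it unit-testable without a network.
export function buildChatRequest(model: string, messages: ChatMessage[]) {
  return {
    url: "http://localhost:8001/v1/chat/completions",
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({ model, messages }),
    },
  };
}

export async function completeChat(
  model: string,
  messages: ChatMessage[],
): Promise<string> {
  const { url, init } = buildChatRequest(model, messages);
  const res = await fetch(url, init);
  if (!res.ok) throw new Error(`LLM service returned ${res.status}`);
  const data = await res.json();
  // OpenAI-compatible response: take the first choice's message content.
  return data.choices[0].message.content as string;
}
```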
| Risk | Likelihood | Impact | Mitigation |
|---|---|---|---|
| LiteLLM version incompatibility | Medium | Medium | Pin version, test upgrades in CI |
| Ollama unavailable in CI | Low | High | Use FakeLlmAdapter for unit tests; Ollama only for integration |
| Policy Gateway latency | Low | Medium | Cache policy decisions; monitor latency |
| Provider API changes | Low | Medium | LiteLLM handles provider differences |
| Document | Purpose |
|---|---|
| ADR-035 | Architecture decision |
| SDS-049 | Service design |
| SDS-047 | GovernedSpeed™ Runtime (Policy Gateway) |
| ADR-028 | Governance requirements |
| dependency_gap_analysis.md | LiteLLM selection rationale |
Note: SDS-042 (Policy Gateway Service) has been superseded by SDS-047 per ADR-031.