SDS-049: LLM Provider Service

Status: Draft
Version: 1.0
Date: 2025-12-30
Satisfies: [PRD-010, ADR-035, ADR-028]
Bounded-Context: llm-provider


0. Isomorphism Declaration

Declares the structure-preserving mappings for this SDS. Enables verification that translation is lossless.

| Spec Section | SEA-DSL Target | Cardinality | Verification |
|---|---|---|---|
| 2.1 Entities | Entity nodes | 1:1 | Field names + types match |
| 2.2 Value Objects | ValueObject nodes | 1:1 | Constraints preserved |
| 3. Invariants | Policy nodes | 1:1 | Boolean expression verbatim |
| 4. Commands | Flow @command | 1:1 | Input/output schema identical |
| 5. Queries | Flow @query | 1:1 | Input/output schema identical |
| 6. Events | Event nodes | 1:1 | Payload schema identical |
| 7. Ports | Interface definitions | 1:1 | Method signatures preserved |

1. Domain Glossary

Prevents naming drift. Establishes canonical vocabulary for isomorphic translation.

| Term | Definition | Type | Notes | Canonical? |
|---|---|---|---|---|
| LlmProvider | Abstraction layer for LLM interactions | Service | Uses LiteLLM | |
| ChatCompletion | Request/response for conversational inference | VO | Messages → Completion | |
| Embedding | Vector representation of text | VO | Dimensions vary by model | |
| ProviderConfig | Configuration for a specific LLM provider | Entity | Per-provider settings | |
| ModelSpec | Specification of an available model | Entity | Name, capabilities, limits | |
| FallbackChain | Ordered list of providers for resilience | VO | Try in order | |
| ChatMessage | Single message in a conversation | VO | Role + content | |
| TokenUsage | Token consumption metrics | VO | Prompt + completion tokens | |

Isomorphism Rule: Only terms marked Canonical? ✓ may appear in SEA-DSL.


2. Domain Model (Canonical Tables)

2.1 Entities

| Entity | Fields | Identity | Aggregate Root | Entity Invariants |
|---|---|---|---|---|
| ProviderConfig | providerId: ProviderId, name: string, apiKeyEnvVar: string, baseUrl: string, isActive: boolean | providerId | yes | POL-LLM-001 |
| ModelSpec | modelId: ModelId, providerId: ProviderId, name: string, maxTokens: number, supportsStreaming: boolean, supportsEmbeddings: boolean | modelId | no | POL-LLM-002 |

2.2 Value Objects

| Value Object | Fields | Constraints | Self-Invariants |
|---|---|---|---|
| ChatMessage | role: RoleType, content: string, name?: string | content.length <= 100000 | Non-empty content |
| ChatCompletion | id: string, model: string, message: ChatMessage, usage: TokenUsage, finishReason: FinishReason | | Valid finish reason |
| Embedding | vector: number[], model: string, dimensions: number | dimensions = vector.length | Positive dimensions |
| TokenUsage | promptTokens: number, completionTokens: number, totalTokens: number | total = prompt + completion | Non-negative values |
| FallbackChain | providers: ProviderId[], strategy: FallbackStrategy | length >= 1 | At least one provider |
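The constraints and self-invariants above can be checked when a value object is constructed. The following is a minimal sketch, assuming Python dataclasses as the host representation; the spec does not prescribe one, and the snake_case field names are a Python adaptation of the camelCase fields in the table.

```python
from dataclasses import dataclass
from typing import List, Optional

MAX_CONTENT_LENGTH = 100_000  # ChatMessage constraint: content.length <= 100000


@dataclass(frozen=True)
class ChatMessage:
    role: str  # expected to be one of the RoleType values
    content: str
    name: Optional[str] = None

    def __post_init__(self) -> None:
        if not self.content:
            raise ValueError("content must be non-empty")
        if len(self.content) > MAX_CONTENT_LENGTH:
            raise ValueError("content exceeds 100000 characters")


@dataclass(frozen=True)
class TokenUsage:
    prompt_tokens: int
    completion_tokens: int
    total_tokens: int

    def __post_init__(self) -> None:
        if min(self.prompt_tokens, self.completion_tokens, self.total_tokens) < 0:
            raise ValueError("token counts must be non-negative")
        if self.total_tokens != self.prompt_tokens + self.completion_tokens:
            raise ValueError("total_tokens must equal prompt_tokens + completion_tokens")


@dataclass(frozen=True)
class Embedding:
    vector: List[float]
    model: str
    dimensions: int

    def __post_init__(self) -> None:
        if self.dimensions <= 0:
            raise ValueError("dimensions must be positive")
        if self.dimensions != len(self.vector):
            raise ValueError("dimensions must equal len(vector)")
```

Freezing the dataclasses keeps the value objects immutable, so an instance that passed validation cannot later drift out of its constraints.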

2.3 Enums

| Enum | Values | Notes |
|---|---|---|
| RoleType | system, user, assistant, tool | OpenAI-compatible roles |
| FinishReason | stop, length, content_filter, tool_calls | Why generation ended |
| FallbackStrategy | sequential, round_robin, lowest_latency | How to select fallback |
| ProviderType | ollama, openai, anthropic, openrouter | Supported providers |

3. Invariants & Policies

Becomes SEA™ Policy nodes. Enables generated validators.

3.1 Entity Invariants

| POL-ID | Applies-To | Invariant Type | Rule (Boolean Expression) | Error Code | Error Message | Satisfies |
|---|---|---|---|---|---|---|
| POL-LLM-001 | ProviderConfig | Entity | config.name != null AND config.name.length > 0 | INVALID_PROVIDER | Provider must have a name | REQ-LLM-001 |
| POL-LLM-002 | ModelSpec | Entity | model.maxTokens > 0 | INVALID_MODEL | Model must have positive max tokens | REQ-LLM-001 |

3.2 Process Invariants

| POL-ID | Process | Invariant Type | Rule | Enforcement | Satisfies |
|---|---|---|---|---|---|
| POL-LLM-003 | ChatCompletion | Process | messages.length > 0 | CompleteChat handler | REQ-LLM-002 |
| POL-LLM-004 | Embedding | Process | input.trim().length > 0 | GenerateEmbedding handler | REQ-LLM-002 |
| POL-LLM-005 | PolicyGateway | Process | NOT bypassGateway OR environment = development | All handlers | INV-LLM-001 |

Isomorphism Guarantee: Each POL-ID maps to exactly one Policy node in SEA-DSL.
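The rules in 3.1 and 3.2 translate almost mechanically into validator functions. The sketch below is illustrative only: `PolicyViolation` and the function names are hypothetical, and it assumes that entity invariants raise with the listed error code while process invariants, which have no error code in the tables, return a boolean for the handler to act on.

```python
from typing import Optional, Sequence


class PolicyViolation(Exception):
    """Hypothetical error type carrying the Error Code column."""

    def __init__(self, code: str, message: str) -> None:
        super().__init__(message)
        self.code = code


def check_provider_name(name: Optional[str]) -> None:
    # POL-LLM-001: config.name != null AND config.name.length > 0
    if name is None or len(name) == 0:
        raise PolicyViolation("INVALID_PROVIDER", "Provider must have a name")


def check_max_tokens(max_tokens: int) -> None:
    # POL-LLM-002: model.maxTokens > 0
    if not max_tokens > 0:
        raise PolicyViolation("INVALID_MODEL", "Model must have positive max tokens")


def messages_non_empty(messages: Sequence[object]) -> bool:
    # POL-LLM-003: messages.length > 0
    return len(messages) > 0


def embedding_input_non_empty(text: str) -> bool:
    # POL-LLM-004: input.trim().length > 0
    return len(text.strip()) > 0


def gateway_bypass_allowed(bypass_gateway: bool, environment: str) -> bool:
    # POL-LLM-005: NOT bypassGateway OR environment = development
    return (not bypass_gateway) or environment == "development"
```

Each function holds exactly one rule, which mirrors the one-policy-node-per-POL-ID mapping and keeps the boolean expression easy to compare against the table.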


4. CQRS: Commands

Generates command DTO + handler + event emission rules.

CMD-LLM-001: CompleteChat

| Field | Value |
|---|---|
| CMD-ID | CMD-LLM-001 |
| Input | messages: ChatMessage[], model?: string, temperature?: number, maxTokens?: number, stream?: boolean |
| Preconditions | POL-LLM-003 (non-empty messages) |
| State Changes | None (stateless) |
| Emits Events | EVT-LLM-001 (ChatCompleted) |
| Satisfies | REQ-LLM-002 |

Idempotency Specification:

| Idempotent? | Key Expression | Strategy | Behavior on Duplicate |
|---|---|---|---|
| no | N/A | Stateless | Different response each call |

CMD-LLM-002: GenerateEmbedding

| Field | Value |
|---|---|
| CMD-ID | CMD-LLM-002 |
| Input | input: string \| string[], model?: string |
| Preconditions | POL-LLM-004 (non-empty input) |
| State Changes | None (stateless) |
| Emits Events | EVT-LLM-002 (EmbeddingGenerated) |
| Satisfies | REQ-LLM-003 |

Idempotency Specification:

| Idempotent? | Key Expression | Strategy | Behavior on Duplicate |
|---|---|---|---|
| yes | hash(input + model) | Cache | Same embedding returned |
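One way to realize the `hash(input + model)` key and the cache strategy is sketched below. The spec names neither a hash function nor a cache store, so SHA-256 over a JSON serialization and an in-memory dictionary are assumptions, and `CachedEmbedder` is a hypothetical name.

```python
import hashlib
import json
from typing import Callable, Dict, List, Union

EmbeddingInput = Union[str, List[str]]
Vectors = List[List[float]]


def embedding_cache_key(input_value: EmbeddingInput, model: str) -> str:
    # Serialize deterministically so the same (input, model) pair always
    # produces the same key; a string and a one-element list stay distinct.
    payload = json.dumps({"input": input_value, "model": model}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class CachedEmbedder:
    """Wraps an embedding backend and returns the cached result on duplicates."""

    def __init__(self, embed: Callable[[EmbeddingInput, str], Vectors]) -> None:
        self._embed = embed
        self._cache: Dict[str, Vectors] = {}
        self.backend_calls = 0

    def generate(self, input_value: EmbeddingInput, model: str) -> Vectors:
        key = embedding_cache_key(input_value, model)
        if key not in self._cache:
            self.backend_calls += 1
            self._cache[key] = self._embed(input_value, model)
        return self._cache[key]
```

Serializing to JSON rather than concatenating strings avoids collisions such as input `"ab"` with model `"c"` versus input `"a"` with model `"bc"`.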

5. CQRS: Queries

Generates query DTO + handler. Queries are inherently idempotent.

QRY-LLM-001: ListAvailableModels

| Field | Value |
|---|---|
| QRY-ID | QRY-LLM-001 |
| Input | providerId?: ProviderId, capability?: Capability |
| Output | ModelSpec[] |
| Read Model | Provider configuration |
| Consistency | strong |
| Satisfies | REQ-LLM-001 |
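A handler for this query reduces to filtering the configured models by the optional inputs. The sketch below assumes that `Capability` takes the values `"streaming"` and `"embeddings"` and maps onto the `supportsStreaming` and `supportsEmbeddings` flags of ModelSpec; the spec does not define `Capability`, so both the values and the mapping are hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ModelSpec:
    model_id: str
    provider_id: str
    name: str
    max_tokens: int
    supports_streaming: bool
    supports_embeddings: bool


def list_available_models(
    models: List[ModelSpec],
    provider_id: Optional[str] = None,
    capability: Optional[str] = None,  # assumed values: "streaming", "embeddings"
) -> List[ModelSpec]:
    result = []
    for model in models:
        if provider_id is not None and model.provider_id != provider_id:
            continue
        if capability == "streaming" and not model.supports_streaming:
            continue
        if capability == "embeddings" and not model.supports_embeddings:
            continue
        result.append(model)
    return result
```

Omitting both arguments returns every configured model, which matches the optional markers on the query input.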

QRY-LLM-002: GetProviderHealth

| Field | Value |
|---|---|
| QRY-ID | QRY-LLM-002 |
| Input | providerId?: ProviderId |
| Output | ProviderHealthDto { providerId, status, latencyMs, lastChecked } |
| Read Model | Health check cache |
| Consistency | eventual |
| Satisfies | Operational |

6. Domain Events

Generates event contracts + observability bindings.

EVT-LLM-001: ChatCompleted

| Field | Value |
|---|---|
| EVT-ID | EVT-LLM-001 |
| Payload | model: string, provider: ProviderType, usage: TokenUsage, latencyMs: number, completedAt: DateTime |
| Published To | llm.events |
| Consumers | metrics (UpdateCounters), observability (RecordSpan) |
| Delivery | fire-and-forget |
| Satisfies | INV-LLM-004 |

EVT-LLM-002: EmbeddingGenerated

| Field | Value |
|---|---|
| EVT-ID | EVT-LLM-002 |
| Payload | model: string, provider: ProviderType, dimensions: number, inputCount: number, latencyMs: number |
| Published To | llm.events |
| Consumers | metrics (UpdateCounters) |
| Delivery | fire-and-forget |
| Satisfies | INV-LLM-004 |

EVT-LLM-003: ProviderFailed

| Field | Value |
|---|---|
| EVT-ID | EVT-LLM-003 |
| Payload | provider: ProviderType, error: string, fallbackUsed: ProviderId?, failedAt: DateTime |
| Published To | llm.events |
| Consumers | alerts (NotifyOnCall), metrics (IncrementErrors) |
| Delivery | at-least-once |
| Satisfies | Operational |
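The ProviderFailed payload pairs naturally with the `sequential` FallbackStrategy from 2.3. The following sketch tries each provider in order and records one event per failure. It assumes that `fallbackUsed` holds the next provider attempted (or nothing when the chain is exhausted), which the spec does not state; `complete_with_fallback` and `AllProvidersFailed` are hypothetical names, and events are returned to the caller rather than published to `llm.events`.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, List, Optional, Tuple


@dataclass(frozen=True)
class ProviderFailed:
    provider: str
    error: str
    fallback_used: Optional[str]
    failed_at: datetime


class AllProvidersFailed(Exception):
    def __init__(self, events: List[ProviderFailed]) -> None:
        super().__init__("all providers in the fallback chain failed")
        self.events = events


def complete_with_fallback(
    providers: List[str], call: Callable[[str], str]
) -> Tuple[str, List[ProviderFailed]]:
    if not providers:
        raise ValueError("FallbackChain requires at least one provider")
    events: List[ProviderFailed] = []
    for index, provider in enumerate(providers):
        try:
            return call(provider), events
        except Exception as exc:
            # Record the failure and the provider that will be tried next, if any.
            next_provider = providers[index + 1] if index + 1 < len(providers) else None
            events.append(
                ProviderFailed(provider, str(exc), next_provider, datetime.now(timezone.utc))
            )
    raise AllProvidersFailed(events)
```

Because ProviderFailed is delivered at-least-once, a real publisher would need to persist or retry these events; the sketch only shows how they are produced.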

7. Ports

Hexagonal ports define the isomorphic boundary between domain and infrastructure.

| PORT-ID | Port Name | Direction | Methods | Used By | Backed By Adapter |
|---|---|---|---|---|---|
| PORT-LLM-001 | LlmProviderPort | outbound | completeChat(req): ChatCompletion, generateEmbedding(req): Embedding[], listModels(): ModelSpec[], healthCheck(): HealthStatus | Cognitive services | LiteLLMAdapter, FakeLlmAdapter |
| PORT-LLM-002 | PolicyGatewayPort | outbound | filterRequest(req): FilterDecision, filterResponse(res): FilterDecision | LiteLLMAdapter | SDS-047 client |
| PORT-LLM-003 | ProviderConfigPort | inbound | loadConfig(): ProviderConfig[] | Startup | EnvConfigAdapter |

Isomorphism Guarantee: Port method signatures in spec = interface definitions in code.
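To illustrate how a port decouples callers from adapters, here is a reduced Python sketch of LlmProviderPort with a deterministic FakeLlmAdapter. It shows only two of the four port methods, and the simplified `ChatRequest` and `ChatCompletion` shapes are assumptions made to keep the example short; they are not the full schemas from sections 2 and 4.

```python
from dataclasses import dataclass
from typing import Dict, Protocol, Sequence


@dataclass(frozen=True)
class ChatRequest:
    messages: Sequence[Dict[str, str]]
    model: str = "fake-model"


@dataclass(frozen=True)
class ChatCompletion:
    id: str
    model: str
    content: str
    finish_reason: str


class LlmProviderPort(Protocol):
    def complete_chat(self, req: ChatRequest) -> ChatCompletion: ...

    def health_check(self) -> str: ...


class FakeLlmAdapter:
    """Test double: echoes the last message so responses are deterministic."""

    def complete_chat(self, req: ChatRequest) -> ChatCompletion:
        last = req.messages[-1]["content"]
        return ChatCompletion(
            id=f"fake-{len(req.messages)}",
            model=req.model,
            content=f"echo: {last}",
            finish_reason="stop",
        )

    def health_check(self) -> str:
        return "healthy"


def ask(port: LlmProviderPort, text: str) -> str:
    # Domain code depends only on the port, never on a concrete adapter.
    request = ChatRequest(messages=[{"role": "user", "content": text}])
    return port.complete_chat(request).content
```

Since `Protocol` uses structural typing, FakeLlmAdapter satisfies the port without inheriting from it, which keeps test doubles free of infrastructure imports.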


8. Adapters (Implementation Choices)

Becomes manifest runtime selections.

| Category | Choice | Notes |
|---|---|---|
| LLM Client | LiteLLM (Python) | Unified 100+ provider support |
| Config | Environment variables | LLM_* prefix |
| Policy Gateway | HTTP client | Forwards to SDS-047 |
| Observability | OpenTelemetry | Spans on all LLM calls |
| Testing | FakeLlmAdapter | Deterministic responses |

9. DI/Wiring

Makes composition root generation deterministic.

| Port | Adapter Implementation | Lifetime | Invariant |
|---|---|---|---|
| LlmProviderPort | LiteLLMAdapter | singleton | Reuses connection pool |
| LlmProviderPort | FakeLlmAdapter | singleton | Test double |
| PolicyGatewayPort | HttpPolicyGatewayClient | singleton | Proxies to SDS-047 |
| ProviderConfigPort | EnvConfigAdapter | singleton | Loaded at startup |
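A composition root that honors the singleton lifetime can be as small as a cached factory. This sketch is an assumption-laden illustration: the `LLM_ADAPTER` environment variable is hypothetical (it only follows the `LLM_*` prefix named in section 8), and both adapter classes are empty placeholders standing in for the real implementations.

```python
import os
from functools import lru_cache


class LiteLLMAdapter:
    """Placeholder for the real adapter."""

    name = "litellm"


class FakeLlmAdapter:
    """Placeholder for the test double."""

    name = "fake"


@lru_cache(maxsize=None)
def get_llm_provider():
    # Read the selection once; lru_cache then returns the same instance on
    # every later call, which gives the singleton lifetime from the table.
    kind = os.getenv("LLM_ADAPTER", "litellm")  # hypothetical variable name
    if kind == "fake":
        return FakeLlmAdapter()
    return LiteLLMAdapter()
```

Tests that need a different adapter can set the variable and call `get_llm_provider.cache_clear()` before resolving the provider again.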

10. Invariant Summary

Consolidated view of all invariants for verification.

| INV-ID | Statement | Type | Defined In | Enforced By |
|---|---|---|---|---|
| POL-LLM-001 | Provider must have name | Entity | SDS 3.1 | Config loader |
| POL-LLM-002 | Model must have positive max tokens | Entity | SDS 3.1 | Config loader |
| POL-LLM-003 | Messages must be non-empty | Process | SDS 3.2 | CompleteChat handler |
| POL-LLM-004 | Input must be non-empty | Process | SDS 3.2 | GenerateEmbedding handler |
| POL-LLM-005 | Gateway bypass only in development | Process | SDS 3.2 | All handlers |
| INV-LLM-001 | Production calls route through gateway | System | ADR-035 | HTTPS proxy |
| INV-LLM-004 | All calls emit OTel spans | System | ADR-035 | Adapter |

11. LiteLLM Integration

Python Usage

```python
import os

from litellm import completion, embedding

# Chat completion (routes through the Policy Gateway in production)
response = completion(
    model="ollama/llama3.2",  # or "gpt-4o", "claude-3-5-sonnet-20241022"
    messages=[{"role": "user", "content": "Hello!"}],
    api_base=os.getenv("LLM_POLICY_GATEWAY_URL"),  # Policy Gateway proxy
)

# Embeddings
vectors = embedding(
    model="text-embedding-3-small",
    input=["Hello world", "Goodbye world"],
)
```

TypeScript Client (via HTTP)

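```typescript
// libs/llm-provider/src/adapters/litellm.adapter.ts
export class LiteLLMAdapter implements LlmProviderPort {
  // Base URL of the LiteLLM HTTP endpoint (the Policy Gateway in production).
  constructor(private readonly baseUrl: string) {}

  async completeChat(request: ChatRequest): Promise<ChatCompletion> {
    const response = await fetch(`${this.baseUrl}/v1/chat/completions`, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        model: request.model ?? 'ollama/llama3.2',
        messages: request.messages,
        temperature: request.temperature,
      }),
    });
    // Surface HTTP failures instead of parsing an error body as a completion.
    if (!response.ok) {
      throw new Error(`LLM request failed with HTTP ${response.status}`);
    }
    // The body is OpenAI-shaped; mapping it onto ChatCompletion is omitted here.
    return response.json();
  }

  // generateEmbedding, listModels, and healthCheck are omitted from this excerpt.
}
```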
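The `/v1/chat/completions` endpoint returns an OpenAI-compatible body (`choices`, `usage`, snake_case keys), which does not match the ChatCompletion value object from section 2.2 field for field. A mapping step along the following lines would close that gap. It is written in Python for consistency with the other sketches, `to_chat_completion` is a hypothetical helper, and it assumes only the first choice is of interest.

```python
from typing import Any, Dict


def to_chat_completion(body: Dict[str, Any]) -> Dict[str, Any]:
    """Map an OpenAI-compatible response body onto the ChatCompletion fields."""
    choice = body["choices"][0]  # assumption: only the first choice is used
    usage = body.get("usage") or {}
    return {
        "id": body["id"],
        "model": body["model"],
        "message": {
            "role": choice["message"]["role"],
            "content": choice["message"].get("content"),
        },
        "usage": {
            "promptTokens": usage.get("prompt_tokens", 0),
            "completionTokens": usage.get("completion_tokens", 0),
            "totalTokens": usage.get("total_tokens", 0),
        },
        "finishReason": choice["finish_reason"],
    }
```

Keeping the mapping in one function means the adapter stays the only place that knows the wire format, so the port signature does not depend on it.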