SDS-049: LLM Provider Service

Status: Draft
Version: 1.0
Date: 2025-12-30
Satisfies: [PRD-010, ADR-035, ADR-028]
Bounded-Context: llm-provider

0. Isomorphism Declaration

Declares the structure-preserving mappings for this SDS. Enables verification that translation is lossless.

Spec Section	SEA-DSL Target	Cardinality	Verification
2.1 Entities	`Entity` nodes	1:1	Field names + types match
2.2 Value Objects	`ValueObject` nodes	1:1	Constraints preserved
3. Invariants	`Policy` nodes	1:1	Boolean expression verbatim
4. Commands	`Flow @command`	1:1	Input/output schema identical
5. Queries	`Flow @query`	1:1	Input/output schema identical
6. Events	`Event` nodes	1:1	Payload schema identical
7. Ports	Interface definitions	1:1	Method signatures preserved

1. Domain Glossary

Prevents naming drift. Establishes canonical vocabulary for isomorphic translation.

Term	Definition	Type	Notes	Canonical?
LlmProvider	Abstraction layer for LLM interactions	Service	Uses LiteLLM	✓
ChatCompletion	Request/response for conversational inference	VO	Messages → Completion	✓
Embedding	Vector representation of text	VO	Dimensions vary by model	✓
ProviderConfig	Configuration for a specific LLM provider	Entity	Per-provider settings	✓
ModelSpec	Specification of an available model	Entity	Name, capabilities, limits	✓
FallbackChain	Ordered list of providers for resilience	VO	Try in order	✓
ChatMessage	Single message in a conversation	VO	Role + content	✓
TokenUsage	Token consumption metrics	VO	Prompt + completion tokens	✓

Isomorphism Rule: Only terms marked Canonical? ✓ may appear in SEA-DSL.

2. Domain Model (Canonical Tables)

2.1 Entities

Entity	Fields	Identity	Aggregate Root	Entity Invariants
ProviderConfig	`providerId: ProviderId`, `name: string`, `apiKeyEnvVar: string`, `baseUrl: string`, `isActive: boolean`	`providerId`	yes	POL-LLM-001
ModelSpec	`modelId: ModelId`, `providerId: ProviderId`, `name: string`, `maxTokens: number`, `supportsStreaming: boolean`, `supportsEmbeddings: boolean`	`modelId`	no	POL-LLM-002

2.2 Value Objects

Value Object	Fields	Constraints	Self-Invariants
ChatMessage	`role: RoleType`, `content: string`, `name?: string`	content.length <= 100000	Non-empty content
ChatCompletion	`id: string`, `model: string`, `message: ChatMessage`, `usage: TokenUsage`, `finishReason: FinishReason`	—	Valid finish reason
Embedding	`vector: number[]`, `model: string`, `dimensions: number`	dimensions = vector.length	Positive dimensions
TokenUsage	`promptTokens: number`, `completionTokens: number`, `totalTokens: number`	totalTokens = promptTokens + completionTokens	Non-negative values
FallbackChain	`providers: ProviderId[]`, `strategy: FallbackStrategy`	providers.length >= 1	At least one provider

2.3 Enums

Enum	Values	Notes
RoleType	`system`, `user`, `assistant`, `tool`	OpenAI-compatible roles
FinishReason	`stop`, `length`, `content_filter`, `tool_calls`	Why generation ended
FallbackStrategy	`sequential`, `round_robin`, `lowest_latency`	How to select fallback
ProviderType	`ollama`, `openai`, `anthropic`, `openrouter`	Supported providers
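
The constraints and self-invariants in 2.2 can be enforced at construction time. The following is a minimal sketch, not the generated code: it assumes Python dataclasses with snake_case field names (the spec uses camelCase), and the constants `ROLE_TYPES` and `MAX_CONTENT_LENGTH` are illustrative names for the values listed in the tables above.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

ROLE_TYPES = {"system", "user", "assistant", "tool"}  # RoleType values (2.3)
MAX_CONTENT_LENGTH = 100_000  # ChatMessage constraint (2.2)


@dataclass(frozen=True)
class ChatMessage:
    role: str
    content: str
    name: Optional[str] = None

    def __post_init__(self) -> None:
        if self.role not in ROLE_TYPES:
            raise ValueError(f"unknown role: {self.role!r}")
        if not self.content:
            raise ValueError("content must be non-empty")
        if len(self.content) > MAX_CONTENT_LENGTH:
            raise ValueError("content exceeds 100000 characters")


@dataclass(frozen=True)
class TokenUsage:
    prompt_tokens: int
    completion_tokens: int
    total_tokens: int

    def __post_init__(self) -> None:
        if min(self.prompt_tokens, self.completion_tokens, self.total_tokens) < 0:
            raise ValueError("token counts must be non-negative")
        if self.total_tokens != self.prompt_tokens + self.completion_tokens:
            raise ValueError("total_tokens must equal prompt_tokens + completion_tokens")


@dataclass(frozen=True)
class Embedding:
    vector: Tuple[float, ...]
    model: str
    dimensions: int

    def __post_init__(self) -> None:
        if self.dimensions <= 0:
            raise ValueError("dimensions must be positive")
        if self.dimensions != len(self.vector):
            raise ValueError("dimensions must equal len(vector)")
```

Because the dataclasses are frozen and validate in `__post_init__`, an invalid value object cannot exist once constructed, which keeps the checks out of the handlers.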

3. Invariants & Policies

Becomes SEA™ Policy nodes. Enables generated validators.

3.1 Entity Invariants

POL-ID	Applies-To	Invariant Type	Rule (Boolean Expression)	Error Code	Error Message	Satisfies
POL-LLM-001	ProviderConfig	Entity	`config.name != null AND config.name.length > 0`	`INVALID_PROVIDER`	Provider must have a name	REQ-LLM-001
POL-LLM-002	ModelSpec	Entity	`model.maxTokens > 0`	`INVALID_MODEL`	Model must have positive max tokens	REQ-LLM-001

3.2 Process Invariants

POL-ID	Process	Invariant Type	Rule	Enforcement	Satisfies
POL-LLM-003	ChatCompletion	Process	`messages.length > 0`	CompleteChat handler	REQ-LLM-002
POL-LLM-004	Embedding	Process	`input.trim().length > 0`	GenerateEmbedding handler	REQ-LLM-002
POL-LLM-005	PolicyGateway	Process	`NOT bypassGateway OR environment = development`	All handlers	INV-LLM-001

Isomorphism Guarantee: Each POL-ID maps to exactly one Policy node in SEA-DSL.
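
To illustrate how the Boolean expressions above could translate into validators, here is a sketch with one predicate per POL-ID. The function names and the string literal `"development"` are assumptions; the spec only fixes the expressions themselves.

```python
from typing import Optional, Sequence


def pol_llm_001(name: Optional[str]) -> bool:
    """config.name != null AND config.name.length > 0"""
    return name is not None and len(name) > 0


def pol_llm_002(max_tokens: int) -> bool:
    """model.maxTokens > 0"""
    return max_tokens > 0


def pol_llm_003(messages: Sequence[object]) -> bool:
    """messages.length > 0"""
    return len(messages) > 0


def pol_llm_004(input_text: str) -> bool:
    """input.trim().length > 0"""
    return len(input_text.strip()) > 0


def pol_llm_005(bypass_gateway: bool, environment: str) -> bool:
    """NOT bypassGateway OR environment = development"""
    return (not bypass_gateway) or environment == "development"
```

Read as an implication, POL-LLM-005 says that bypassing the gateway is allowed only in development: `pol_llm_005(True, "production")` is the single combination that fails.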

4. CQRS: Commands

Generates command DTO + handler + event emission rules.

CMD-LLM-001: CompleteChat

Field	Value
CMD-ID	CMD-LLM-001
Input	`messages: ChatMessage[]`, `model?: string`, `temperature?: number`, `maxTokens?: number`, `stream?: boolean`
Preconditions	POL-LLM-003 (non-empty messages)
State Changes	None (stateless)
Emits Events	EVT-LLM-001 (ChatCompleted)
Satisfies	REQ-LLM-002

Idempotency Specification:

Idempotent?	Key Expression	Strategy	Behavior on Duplicate
no	N/A	Stateless	Different response each call

CMD-LLM-002: GenerateEmbedding

Field	Value
CMD-ID	CMD-LLM-002
Input	`input: string | string[]`, `model?: string`
Preconditions	POL-LLM-004 (non-empty input)
State Changes	None (stateless)
Emits Events	EVT-LLM-002 (EmbeddingGenerated)
Satisfies	REQ-LLM-003

Idempotency Specification:

Idempotent?	Key Expression	Strategy	Behavior on Duplicate
yes	`hash(input + model)`	Cache	Same embedding returned
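
One way to realize the `hash(input + model)` key and the cache strategy is sketched below. The spec does not name a hash function or a cache store; SHA-256, the in-memory dict, and the names `embedding_cache_key` and `CachedEmbedder` are assumptions made for illustration. The key is built from a JSON encoding rather than raw concatenation so that `["ab", "c"]` and `["a", "bc"]` do not collide.

```python
import hashlib
import json
from typing import Callable, Dict, List, Union

# Hypothetical signature of the underlying embedding call: (texts, model) -> vectors.
EmbedFn = Callable[[List[str], str], List[List[float]]]


def _as_list(input_: Union[str, List[str]]) -> List[str]:
    return [input_] if isinstance(input_, str) else list(input_)


def embedding_cache_key(input_: Union[str, List[str]], model: str) -> str:
    # JSON keeps the boundaries between inputs and model unambiguous.
    payload = json.dumps({"input": _as_list(input_), "model": model}, ensure_ascii=False)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class CachedEmbedder:
    """Returns the stored embedding when the same (input, model) pair is seen again."""

    def __init__(self, embed_fn: EmbedFn) -> None:
        self._embed_fn = embed_fn
        self._cache: Dict[str, List[List[float]]] = {}

    def generate(self, input_: Union[str, List[str]], model: str) -> List[List[float]]:
        key = embedding_cache_key(input_, model)
        if key not in self._cache:
            self._cache[key] = self._embed_fn(_as_list(input_), model)
        return self._cache[key]
```

A duplicate command then costs one dictionary lookup instead of a provider call, which matches the "Same embedding returned" behavior in the table.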

5. CQRS: Queries

Generates query DTO + handler. Queries are inherently idempotent.

QRY-LLM-001: ListAvailableModels

Field	Value
QRY-ID	QRY-LLM-001
Input	`providerId?: ProviderId`, `capability?: Capability`
Output	`ModelSpec[]`
Read Model	Provider configuration
Consistency	strong
Satisfies	REQ-LLM-001

QRY-LLM-002: GetProviderHealth

Field	Value
QRY-ID	QRY-LLM-002
Input	`providerId?: ProviderId`
Output	`ProviderHealthDto { providerId, status, latencyMs, lastChecked }`
Read Model	Health check cache
Consistency	eventual
Satisfies	Operational

6. Domain Events

Generates event contracts + observability bindings.

EVT-LLM-001: ChatCompleted

Field	Value
EVT-ID	EVT-LLM-001
Payload	`model: string`, `provider: ProviderType`, `usage: TokenUsage`, `latencyMs: number`, `completedAt: DateTime`
Published To	`llm.events`
Consumers	`metrics` (UpdateCounters), `observability` (RecordSpan)
Delivery	fire-and-forget
Satisfies	INV-LLM-004

EVT-LLM-002: EmbeddingGenerated

Field	Value
EVT-ID	EVT-LLM-002
Payload	`model: string`, `provider: ProviderType`, `dimensions: number`, `inputCount: number`, `latencyMs: number`
Published To	`llm.events`
Consumers	`metrics` (UpdateCounters)
Delivery	fire-and-forget
Satisfies	INV-LLM-004

EVT-LLM-003: ProviderFailed

Field	Value
EVT-ID	EVT-LLM-003
Payload	`provider: ProviderType`, `error: string`, `fallbackUsed?: ProviderId`, `failedAt: DateTime`
Published To	`llm.events`
Consumers	`alerts` (NotifyOnCall), `metrics` (IncrementErrors)
Delivery	at-least-once
Satisfies	Operational
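
The interaction between a `sequential` FallbackChain and EVT-LLM-003 can be sketched as follows. This is an illustration under assumptions, not the specified handler: providers are plain strings, `call` and `publish` are hypothetical callables standing in for the adapter and the `llm.events` publisher, and `fallback_used` is interpreted as the next provider that will be tried.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Callable, Optional, Sequence


@dataclass(frozen=True)
class ProviderFailed:
    """Payload of EVT-LLM-003."""
    provider: str
    error: str
    fallback_used: Optional[str]
    failed_at: datetime


def complete_with_fallback(
    providers: Sequence[str],
    call: Callable[[str], str],
    publish: Callable[[ProviderFailed], None],
) -> str:
    """Try each provider in order; publish ProviderFailed for every failure."""
    if not providers:
        raise ValueError("FallbackChain requires at least one provider")
    last_error: Optional[Exception] = None
    for index, provider in enumerate(providers):
        try:
            return call(provider)
        except Exception as exc:  # any provider error triggers the fallback
            has_next = index + 1 < len(providers)
            next_provider = providers[index + 1] if has_next else None
            publish(ProviderFailed(provider, str(exc), next_provider,
                                   datetime.now(timezone.utc)))
            last_error = exc
    raise RuntimeError("all providers in the fallback chain failed") from last_error
```

When the last provider also fails, the event carries `fallback_used=None` and the caller receives a single error that chains the final provider exception.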

7. Ports

Hexagonal ports define the isomorphic boundary between domain and infrastructure.

PORT-ID	Port Name	Direction	Methods	Used By	Backed By Adapter
PORT-LLM-001	LlmProviderPort	outbound	`completeChat(req): ChatCompletion`, `generateEmbedding(req): Embedding[]`, `listModels(): ModelSpec[]`, `healthCheck(): HealthStatus`	Cognitive services	LiteLLMAdapter, FakeLlmAdapter
PORT-LLM-002	PolicyGatewayPort	outbound	`filterRequest(req): FilterDecision`, `filterResponse(res): FilterDecision`	LiteLLMAdapter	SDS-047 client
PORT-LLM-003	ProviderConfigPort	inbound	`loadConfig(): ProviderConfig[]`	Startup	EnvConfigAdapter

Isomorphism Guarantee: Port method signatures in spec = interface definitions in code.
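
As a sketch of what PORT-LLM-001 might look like in code, the block below declares the port as a `typing.Protocol` and pairs it with a deterministic test double. Several details are assumptions: requests and responses are plain dicts rather than the value objects from 2.2, `HealthStatus` (not defined in this SDS) is modeled as a string, and the model name `fake-model` and provider id `fake` are made up. Method names stay in the spec's camelCase so they can be compared 1:1 with the table.

```python
from typing import Any, Dict, List, Protocol, runtime_checkable


@runtime_checkable
class LlmProviderPort(Protocol):
    """PORT-LLM-001: outbound port used by cognitive services."""

    def completeChat(self, req: Dict[str, Any]) -> Dict[str, Any]: ...
    def generateEmbedding(self, req: Dict[str, Any]) -> List[Dict[str, Any]]: ...
    def listModels(self) -> List[Dict[str, Any]]: ...
    def healthCheck(self) -> str: ...


class FakeLlmAdapter:
    """Test double: output depends only on the request, so tests are repeatable."""

    MODEL = "fake-model"  # hypothetical model name

    def completeChat(self, req: Dict[str, Any]) -> Dict[str, Any]:
        prompt = req["messages"][-1]["content"]
        reply = f"echo: {prompt}"
        prompt_tokens, completion_tokens = len(prompt.split()), len(reply.split())
        return {
            "id": "fake-completion",
            "model": req.get("model", self.MODEL),
            "message": {"role": "assistant", "content": reply},
            "usage": {
                "promptTokens": prompt_tokens,
                "completionTokens": completion_tokens,
                "totalTokens": prompt_tokens + completion_tokens,
            },
            "finishReason": "stop",
        }

    def generateEmbedding(self, req: Dict[str, Any]) -> List[Dict[str, Any]]:
        raw = req["input"]
        texts = [raw] if isinstance(raw, str) else list(raw)
        model = req.get("model", self.MODEL)
        results = []
        for text in texts:
            # Two cheap, deterministic features stand in for a real embedding.
            vector = [float(len(text)), float(sum(map(ord, text)) % 97)]
            results.append({"vector": vector, "model": model, "dimensions": len(vector)})
        return results

    def listModels(self) -> List[Dict[str, Any]]:
        return [{"modelId": self.MODEL, "providerId": "fake", "name": self.MODEL,
                 "maxTokens": 4096, "supportsStreaming": False,
                 "supportsEmbeddings": True}]

    def healthCheck(self) -> str:
        return "healthy"
```

`isinstance(FakeLlmAdapter(), LlmProviderPort)` only checks that the four methods exist; signature-level conformance would still need a static type checker.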

8. Adapters (Implementation Choices)

Becomes manifest runtime selections.

Category	Choice	Notes
LLM Client	LiteLLM (Python)	Unified interface to 100+ providers
Config	Environment variables	`LLM_*` prefix
Policy Gateway	HTTP client	Forwards to SDS-047
Observability	OpenTelemetry	Spans on all LLM calls
Testing	FakeLlmAdapter	Deterministic responses

9. DI/Wiring

Makes composition root generation deterministic.

Port	Adapter Implementation	Lifetime	Invariant
LlmProviderPort	LiteLLMAdapter	singleton	Reuses connection pool
LlmProviderPort	FakeLlmAdapter	singleton	Test double
PolicyGatewayPort	HttpPolicyGatewayClient	singleton	Proxies to SDS-047
ProviderConfigPort	EnvConfigAdapter	singleton	Loaded at startup
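
The wiring table can be read as input to a composition root. The sketch below shows one possible shape, assuming a hand-rolled container and a `use_fake_llm` flag to choose between the two `LlmProviderPort` adapters; the spec does not say how that choice is made, and the adapter classes here are empty placeholders.

```python
from typing import Any, Callable, Dict


class Container:
    """Tiny composition root: one lazily created instance per port (singleton lifetime)."""

    def __init__(self) -> None:
        self._factories: Dict[str, Callable[[], Any]] = {}
        self._instances: Dict[str, Any] = {}

    def register_singleton(self, port: str, factory: Callable[[], Any]) -> None:
        self._factories[port] = factory

    def resolve(self, port: str) -> Any:
        if port not in self._instances:
            self._instances[port] = self._factories[port]()
        return self._instances[port]


# Placeholder adapters; real ones would implement the ports from section 7.
class LiteLLMAdapter: ...
class FakeLlmAdapter: ...
class HttpPolicyGatewayClient: ...
class EnvConfigAdapter: ...


def build_container(use_fake_llm: bool = False) -> Container:
    """Register exactly one adapter per port, mirroring the wiring table."""
    container = Container()
    llm_adapter = FakeLlmAdapter if use_fake_llm else LiteLLMAdapter
    container.register_singleton("LlmProviderPort", llm_adapter)
    container.register_singleton("PolicyGatewayPort", HttpPolicyGatewayClient)
    container.register_singleton("ProviderConfigPort", EnvConfigAdapter)
    return container
```

Resolving the same port twice returns the same object, which is what lets `LiteLLMAdapter` reuse its connection pool.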

10. Invariant Summary

Consolidated view of all invariants for verification.

INV-ID	Statement	Type	Defined In	Enforced By
POL-LLM-001	Provider must have name	Entity	SDS 3.1	Config loader
POL-LLM-002	Model must have positive max tokens	Entity	SDS 3.1	Config loader
POL-LLM-003	Messages must be non-empty	Process	SDS 3.2	CompleteChat handler
POL-LLM-004	Input must be non-empty	Process	SDS 3.2	GenerateEmbedding handler
POL-LLM-005	Gateway bypass only in development	Process	SDS 3.2	All handlers
INV-LLM-001	Production calls route through gateway	System	ADR-035	HTTPS proxy
INV-LLM-004	All calls emit OTel spans	System	ADR-035	Adapter

11. LiteLLM Integration

Python Usage

import os

from litellm import completion, embedding

# Chat completion (routes through Policy Gateway in production)
response = completion(
    model="ollama/llama3.2",  # or "gpt-4o", "claude-3-5-sonnet-20241022"
    messages=[{"role": "user", "content": "Hello!"}],
    api_base=os.getenv("LLM_POLICY_GATEWAY_URL"),  # Policy Gateway proxy
)

# Embeddings
vectors = embedding(
    model="text-embedding-3-small",
    input=["Hello world", "Goodbye world"],
)

Adapter Note: LiteLLM embedding responses may arrive as typed objects or as raw dicts (observed with Ollama). Implementations must handle both shapes, reading `embedding` from the object attribute or from the dict key, whichever is present.
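
A minimal sketch of shape-tolerant extraction follows. It assumes the OpenAI-style layout, where the response has a `data` list whose items carry an `embedding` field; the helper names `_field` and `extract_embeddings` are illustrative and not part of LiteLLM.

```python
from typing import Any, List


def _field(obj: Any, name: str) -> Any:
    """Read `name` from a dict key or, failing that, from an object attribute."""
    if isinstance(obj, dict):
        return obj[name]
    return getattr(obj, name)


def extract_embeddings(response: Any) -> List[List[float]]:
    """Return one vector per input, whether items are typed objects or raw dicts."""
    return [list(_field(item, "embedding")) for item in _field(response, "data")]
```

Keeping the shape check in one helper means the adapter handles mixed responses (a typed wrapper around raw dict items) without extra branches.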

TypeScript Client (via HTTP)

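// libs/llm-provider/src/adapters/litellm.adapter.ts
export class LiteLLMAdapter implements LlmProviderPort {
  // baseUrl is the OpenAI-compatible endpoint; in production this is the Policy Gateway proxy.
  constructor(private readonly baseUrl: string) {}

  async completeChat(request: ChatRequest): Promise<ChatCompletion> {
    const response = await fetch(`${this.baseUrl}/v1/chat/completions`, {
      method: 'POST',
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({
        model: request.model ?? 'ollama/llama3.2',
        messages: request.messages,
        temperature: request.temperature,
        max_tokens: request.maxTokens,
      }),
    });
    // fetch only rejects on network errors, so HTTP error statuses must be checked explicitly.
    if (!response.ok) {
      throw new Error(`LLM request failed: ${response.status} ${response.statusText}`);
    }
    // The OpenAI-compatible body still needs mapping to the ChatCompletion value object (section 2.2).
    return response.json();
  }

  // generateEmbedding, listModels, and healthCheck are omitted here for brevity.
}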

ADRs: ADR-035, ADR-028
PRDs: PRD-010
Other SDS: SDS-047 (Policy Gateway Service), SDS-047 (Risk & Evidence Service)