ADR-035: LLM Provider Abstraction Architecture

Status: Accepted
Version: 1.0
Date: 2025-12-30
Supersedes: N/A
Related ADRs: ADR-028 (GovernedSpeed™ LLMOps)
Related PRDs: PRD-010 (AI Governance Runtime)


Context

SEA™ requires integration with multiple LLM providers to support:

  1. Local development — Ollama for offline, cost-free iteration
  2. Production workloads — OpenAI and Anthropic for production-quality inference
  3. Cost optimization — OpenRouter for model arbitrage and fallbacks
  4. Governance compliance — All LLM calls must route through Policy Gateway (SDS-047)

Forces at play:

  1. Governance vs. velocity: every LLM call must route through the Policy Gateway (SDS-047) without slowing local iteration
  2. Flexibility vs. lock-in: model quality, pricing, and availability shift faster than provider-specific integrations can be maintained
  3. Local-first vs. production parity: offline development against Ollama must exercise the same code path as cloud providers
  4. Cost vs. quality: production inference spend must be tunable through fallbacks and model arbitrage, without code changes

Key insight: A unified LLM abstraction layer eliminates provider lock-in while maintaining governance compliance and enabling local-first development.

Decision

Adopt LiteLLM as the canonical LLM provider abstraction layer, exposing a single LlmProviderPort interface for all AI interactions.

Core Principles

  1. Unified API — All LLM calls use LiteLLM’s OpenAI-compatible interface
  2. Provider Agnostic — Configuration-driven provider switching (no code changes)
  3. Policy Gateway Integration — LiteLLM proxies through SDS-047 for governance
  4. Local-First — Ollama as default development provider
  5. Fallback Chains — Automatic failover (e.g., Anthropic → OpenAI → Ollama); see the configuration sketch below
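
A minimal sketch of principles 2 and 5, assuming a hypothetical ProviderConfig shape read from the LLM_* environment variables defined later in this ADR; the names are illustrative, not part of the spec.

```typescript
// Hypothetical config shape; field names are illustrative, not mandated by this ADR.
export interface ProviderConfig {
  provider: 'ollama' | 'openai' | 'anthropic' | 'openrouter';
  model: string;
  fallbackModels: string[];
  policyGatewayUrl?: string;
}

// Resolve the active provider purely from environment variables (principle 2):
// switching providers is a configuration change, never a code change.
export function loadProviderConfig(env: NodeJS.ProcessEnv = process.env): ProviderConfig {
  return {
    provider: (env.LLM_PROVIDER ?? 'ollama') as ProviderConfig['provider'],
    model: env.LLM_MODEL ?? 'llama3.2',
    // Comma-separated fallback chain, e.g. "gpt-4o,claude-3-sonnet" (principle 5).
    fallbackModels: (env.LLM_FALLBACK_MODELS ?? '')
      .split(',')
      .map((m) => m.trim())
      .filter(Boolean),
    policyGatewayUrl: env.LLM_POLICY_GATEWAY_URL,
  };
}
```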

Supported Providers

| Provider | Use Case | Configuration |
|----------|----------|---------------|
| Ollama | Local development, air-gapped environments | ollama/llama3.2 |
| OpenAI | Production inference, embeddings | gpt-4o, text-embedding-3-small |
| Anthropic | Production inference, long context | claude-3-5-sonnet-20241022 |
| OpenRouter | Cost optimization, model diversity | openrouter/anthropic/claude-3-opus |

Reference Architecture

┌─────────────────────────────────────────────────────────────────┐
│  SEA™ LLM Provider Architecture                                  │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  ┌──────────────────┐                                           │
│  │ Cognitive Service│                                           │
│  │ (Artifact Engine)│                                           │
│  └────────┬─────────┘                                           │
│           │ uses                                                │
│           ▼                                                     │
│  ┌──────────────────┐                                           │
│  │ LlmProviderPort  │ ◄── Hexagonal Port (Interface)            │
│  └────────┬─────────┘                                           │
│           │ implements                                          │
│           ▼                                                     │
│  ┌──────────────────┐    ┌────────────────────┐                 │
│  │ LiteLLMAdapter   │───►│ Policy Gateway     │ (SDS-047)       │
│  │ (Production)     │    │ (PII, Jailbreak)   │                 │
│  └────────┬─────────┘    └─────────┬──────────┘                 │
│           │                        │                            │
│           ▼                        ▼                            │
│  ┌─────────────────────────────────────────────────────────┐    │
│  │                    LiteLLM Router                       │    │
│  │  ┌─────────┐  ┌─────────┐  ┌─────────┐  ┌────────────┐  │    │
│  │  │ Ollama  │  │ OpenAI  │  │Anthropic│  │ OpenRouter │  │    │
│  │  └─────────┘  └─────────┘  └─────────┘  └────────────┘  │    │
│  └─────────────────────────────────────────────────────────┘    │
│                                                                 │
│  Testing:                                                       │
│  ┌──────────────────┐                                           │
│  │ FakeLlmAdapter   │ ◄── Deterministic responses for tests     │
│  └──────────────────┘                                           │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
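
The adapter's request path can be sketched as follows, assuming the Policy Gateway (SDS-047) fronts a LiteLLM proxy that speaks the OpenAI-compatible chat completions API at the LLM_POLICY_GATEWAY_URL base path; the function name and option shape are illustrative.

```typescript
// Illustrative sketch: the adapter never imports a vendor SDK. It posts an
// OpenAI-compatible request to the Policy Gateway, which proxies to LiteLLM.
export async function complete(
  prompt: string,
  opts: { gatewayUrl: string; model: string; apiKey?: string },
): Promise<string> {
  const res = await fetch(`${opts.gatewayUrl}/chat/completions`, {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      ...(opts.apiKey ? { Authorization: `Bearer ${opts.apiKey}` } : {}),
    },
    body: JSON.stringify({
      model: opts.model, // e.g. "ollama/llama3.2" or "claude-3-5-sonnet-20241022"
      messages: [{ role: 'user', content: prompt }],
    }),
  });
  if (!res.ok) {
    throw new Error(`LLM call failed: ${res.status} ${res.statusText}`);
  }
  const body = (await res.json()) as {
    choices: Array<{ message: { content: string } }>;
  };
  return body.choices[0]?.message?.content ?? '';
}
```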

Rationale

  1. LiteLLM over raw SDKs: LiteLLM supports 100+ providers behind a single OpenAI-compatible interface and ships retry logic, fallbacks, and observability, eliminating per-provider adapter maintenance
  2. Policy Gateway routing: All LLM calls route through SDS-047 for PII detection, jailbreak prevention, and audit logging per ADR-028
  3. Ollama for local-first: Developers iterate without API keys or cloud costs; same code path as production
  4. OpenRouter for flexibility: Access to 100+ models through a single API key, with cost optimization and provider redundancy
  5. Port/Adapter pattern: LlmProviderPort enables testing with FakeLlmAdapter for deterministic unit tests (both sketched below)
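
To make point 5 concrete, here is a hedged sketch of the port and its test double, following the file paths listed under Isomorphic Guarantees; the exact method surface is illustrative.

```typescript
// libs/llm-provider/src/ports/llm-provider.port.ts (illustrative surface)
export interface CompletionRequest {
  prompt: string;
  model?: string;     // overrides the configured LLM_MODEL when set
  maxTokens?: number;
}

export interface LlmProviderPort {
  complete(request: CompletionRequest): Promise<string>;
}

// libs/llm-provider/src/adapters/fake.adapter.ts (illustrative)
// Deterministic responses keyed by prompt, so unit tests never touch the network.
export class FakeLlmAdapter implements LlmProviderPort {
  constructor(private readonly responses: Record<string, string> = {}) {}

  async complete(request: CompletionRequest): Promise<string> {
    return this.responses[request.prompt] ?? 'fake-completion';
  }
}
```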

Constraints (MUST/MUST NOT)

These constraints are critical for generator choices and flow directly into manifests and SEA-DSL.

Isomorphic Guarantees

Defines structure-preserving mappings from this ADR to implementation.

| Spec Concept | Implementation Target | Mapping Rule |
|--------------|------------------------|--------------|
| LlmProviderPort | libs/llm-provider/src/ports/llm-provider.port.ts | 1:1 interface |
| LiteLLMAdapter | libs/llm-provider/src/adapters/litellm.adapter.ts | 1:1 implementation |
| FakeLlmAdapter | libs/llm-provider/src/adapters/fake.adapter.ts | 1:1 test double |
| ProviderConfig | Environment variables LLM_* | 1:1 config mapping |

System Invariants

Non-negotiable truths that must hold across the system.

| INV-ID | Invariant | Type | Enforcement |
|--------|-----------|------|-------------|
| INV-LLM-001 | All production LLM calls route through Policy Gateway | System | HTTPS proxy config |
| INV-LLM-002 | LlmProviderPort is the only LLM interface | System | Nx module boundaries |
| INV-LLM-003 | Ollama available for local development | Process | Docker Compose config |
| INV-LLM-004 | All LLM calls emit OpenTelemetry spans | System | Adapter instrumentation |
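
As one hedged way to satisfy INV-LLM-004, the adapter can wrap every provider call in an OpenTelemetry span; the span and attribute names below are illustrative, not a mandated convention.

```typescript
import { trace, SpanStatusCode } from '@opentelemetry/api';

const tracer = trace.getTracer('llm-provider');

// Wrap any provider call so that every completion emits a span (INV-LLM-004).
export async function withLlmSpan<T>(model: string, call: () => Promise<T>): Promise<T> {
  return tracer.startActiveSpan('llm.completion', async (span) => {
    span.setAttribute('llm.model', model); // illustrative attribute key
    try {
      return await call();
    } catch (err) {
      span.recordException(err as Error);
      span.setStatus({ code: SpanStatusCode.ERROR });
      throw err;
    } finally {
      span.end();
    }
  });
}
```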

Quality Attributes

| Attribute | Target | Rationale |
|-----------|--------|-----------|
| Latency | <50ms overhead | LiteLLM adds minimal proxy overhead |
| Availability | 99.5% with fallbacks | Fallback chains ensure resilience |
| Testability | 100% mockable | FakeLlmAdapter for unit tests |
| Observability | Full trace coverage | OTel spans on all calls |

Bounded Contexts Impacted

Consequences

Benefits

Trade-offs


Configuration

Environment Variables

# Provider selection
LLM_PROVIDER=ollama           # ollama | openai | anthropic | openrouter
LLM_MODEL=llama3.2            # Model name per provider
LLM_FALLBACK_MODELS=gpt-4o,claude-3-sonnet  # Comma-separated fallback chain

# API keys (not needed for Ollama)
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
OPENROUTER_API_KEY=sk-or-...

# Policy Gateway (required for production)
LLM_POLICY_GATEWAY_URL=http://policy-gateway:8080/v1
LLM_BYPASS_GATEWAY=false      # Only true for local development
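
A hedged sketch of how the LLM_BYPASS_GATEWAY flag could be guarded so that INV-LLM-001 still holds outside local development; the NODE_ENV check is an assumption, not a mandated mechanism.

```typescript
// Illustrative guard: refuse to bypass the Policy Gateway unless we are clearly
// running in local development (INV-LLM-001). The NODE_ENV check is an assumption.
export function resolveGatewayUrl(env: NodeJS.ProcessEnv = process.env): string | null {
  const bypass = env.LLM_BYPASS_GATEWAY === 'true';
  if (bypass && env.NODE_ENV !== 'development') {
    throw new Error('LLM_BYPASS_GATEWAY=true is only permitted in local development');
  }
  if (bypass) {
    return null; // caller talks to the provider (e.g. local Ollama) directly
  }
  const url = env.LLM_POLICY_GATEWAY_URL;
  if (!url) {
    throw new Error('LLM_POLICY_GATEWAY_URL is required when the gateway is not bypassed');
  }
  return url;
}
```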

Docker Compose (Development)

services:
  ollama:
    image: ollama/ollama:latest
    ports:
      - "11434:11434"
    volumes:
      - ollama-models:/root/.ollama
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:11434/api/tags"]
      interval: 10s
      timeout: 5s
      retries: 3

volumes:
  ollama-models:
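
For local setups, a small hedged sketch that probes the same /api/tags endpoint the health check above uses before routing requests to Ollama; the base URL default and timeout are assumptions.

```typescript
// Probe the Ollama endpoint used by the Compose health check before selecting it.
export async function ollamaIsUp(baseUrl = 'http://localhost:11434'): Promise<boolean> {
  try {
    const res = await fetch(`${baseUrl}/api/tags`, {
      signal: AbortSignal.timeout(2_000), // arbitrary 2-second timeout
    });
    return res.ok;
  } catch {
    return false;
  }
}
```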

Success Criteria

Another developer can read this ADR and understand:

  1. The architectural guardrails for LLM provider integration
  2. Why LiteLLM was chosen over provider-specific SDKs
  3. How to configure providers for development vs production
  4. The isomorphic mappings that guarantee spec-to-code fidelity