ADR-028: GovernedSpeed™ LLMOps Architecture

Status: Accepted
Version: 1.0
Date: 2025-12-25
Supersedes: N/A
Related ADRs: ADR-012, ADR-022, ADR-025
Related PRDs: PRD-010


Context

Organizations deploying AI at enterprise scale face a critical tension: governance processes designed to ensure safety, fairness, and compliance often conflict with delivery velocity. Traditional approaches treat governance as a gate—a checkpoint that blocks deployment until manual review completes. This creates adversarial dynamics between risk teams and delivery teams.

Forces at play:

  1. Regulatory pressure: NIST AI RMF, ISO 42001, and the EU AI Act all require documented risk management
  2. Velocity demands: Business units expect rapid iteration on AI capabilities
  3. Post-deployment risk: Bias, hallucination, and safety failures discovered after deployment cause reputational and financial damage
  4. Evidence gaps: Audit trails are often reconstructed after the fact rather than captured inline

Key insight: Governance and velocity are not trade-offs when governance is embedded inline with delivery rather than applied as an afterthought.

Decision

Adopt the GovernedSpeed™ architecture pattern: embed AI governance controls directly into the development and runtime lifecycle using Policy-as-Code, automated compliance gates, and real-time policy enforcement.

Core Principles

  1. Governance Inline, Not Afterward — Policy checks occur at commit time, not post-deployment
  2. Policy-as-Code — Governance rules are machine-readable YAML, not prose documents
  3. Sidecar Enforcement — Runtime policies are enforced via sidecar containers without modifying inference code
  4. Evidence-by-Default — Audit trails are generated automatically as pipeline artifacts
  5. Fail-Closed Gates — Threshold violations block progression; no silent failures
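
Principles 2 and 5 together imply a gate that reads machine-readable thresholds and blocks on any violation. The following is a minimal sketch; the policy structure and threshold values are illustrative stand-ins, not taken from a real policy file.

```python
# Sketch of a fail-closed pre-commit gate. POLICY mirrors the shape a
# Policy-as-Code YAML file might take after parsing; values are illustrative.
POLICY = {
    "thresholds": [
        # block_above=True: gate fails when the metric exceeds the target
        {"metric": "harmful_rate", "target": 0.005, "block_above": True},
        {"metric": "subgroup_delta", "target": 0.05, "block_above": True},
        # block_above=False: gate fails when the metric falls below the target
        {"metric": "pass_at_5", "target": 0.82, "block_above": False},
    ]
}

def evaluate_gate(metrics: dict, policy: dict) -> list[str]:
    """Return a list of violations; an empty list means the gate passes."""
    violations = []
    for rule in policy["thresholds"]:
        name, target = rule["metric"], rule["target"]
        if name not in metrics:
            # Fail closed: a missing measurement is itself a violation.
            violations.append(f"{name}: no measurement submitted")
            continue
        value = metrics[name]
        if rule["block_above"] and value > target:
            violations.append(f"{name}: {value} > {target}")
        elif not rule["block_above"] and value < target:
            violations.append(f"{name}: {value} < {target}")
    return violations
```

In a CI/CD pipeline, a non-empty result would map to a non-zero exit code so the merge is blocked rather than silently allowed.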

Reference Architecture

┌─────────────────────────────────────────────────────────────┐
│  GovernedSpeed™ LLMOps Pipeline                              │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  1. Code Commit                                             │
│     ↓                                                       │
│  2. Pre-Commit Gate (Policy-as-Code)                        │
│     • Validates quality/fairness/safety metrics             │
│     • Blocks merge if thresholds violated                   │
│     ↓                                                       │
│  3. CI/CD Pipeline                                          │
│     • Builds artifacts                                       │
│     • Runs evaluation suite                                 │
│     • Submits evidence to Risk & Evidence Service           │
│     ↓                                                       │
│  4. Runtime Enforcement (Policy Gateway Sidecar)            │
│     • Filters prompts (PII, jailbreaks)                     │
│     • Monitors outputs (copyright, toxicity)                │
│     • Proxies to LLM with policy checks                     │
│     ↓                                                       │
│  5. Continuous Monitoring                                   │
│     • Tracks quality drift (PSI > 0.2 → retrain)            │
│     • Alerts on fairness violations                         │
│     • Generates compliance snapshots                        │
│                                                             │
└─────────────────────────────────────────────────────────────┘
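
The prompt-filtering stage in step 4 can be sketched as follows. The two regex patterns are illustrative placeholders; a production gateway would use dedicated PII and jailbreak detectors rather than hand-written expressions.

```python
import re

# Sketch of the Policy Gateway sidecar's prompt screen (pipeline step 4).
# Patterns are illustrative examples only, not an exhaustive PII detector.
PII_PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),        # US SSN shape
    re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),  # email address
]

def screen_prompt(prompt: str) -> tuple[bool, str]:
    """Return (allowed, reason). Fail closed: any PII hit blocks the call."""
    for pattern in PII_PATTERNS:
        if pattern.search(prompt):
            return False, f"blocked: matched {pattern.pattern}"
    return True, "ok"
```

Because the sidecar proxies the request, this check runs without any change to the inference code itself (INV-GS-001).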

Rationale

  1. Pre-commit validation catches issues when they are cheapest to fix (developer’s machine, not production)
  2. Sidecar pattern ensures inference code remains unchanged—policy enforcement is a cross-cutting concern
  3. SHA256 evidence hashing provides cryptographic tamper-evidence for audit trails
  4. OpenTelemetry metrics enable real-time observability with vendor-neutral instrumentation
  5. Framework synthesis (NIST AI RMF + ISO 42001 + EU AI Act) provides regulatory coverage across jurisdictions
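
The evidence hashing in point 3 can be sketched with the standard library. The key detail is canonical serialization: sorted keys and fixed separators make the digest deterministic for the same content, which is what gives the trail its tamper-evidence.

```python
import hashlib
import json

def evidence_hash(artifact: dict) -> str:
    """Hash an evidence artifact for tamper-evidence (rationale point 3).

    Serializing with sorted keys and fixed separators makes the hash
    deterministic: identical content always yields an identical digest,
    regardless of key insertion order.
    """
    canonical = json.dumps(artifact, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()
```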

Constraints (MUST/MUST NOT)

These constraints are critical for generator choices and flow directly into manifests and SEA-DSL.

Isomorphic Guarantees

Defines structure-preserving mappings from this ADR to implementation.

Spec Concept        Implementation Target                        Mapping Rule
Policy Rule         policies/adr-006.embedded-governance.yaml    1:1; YAML rules[] array
Threshold           thresholds section in policy YAML            1:1; metric → target → block_above
Evidence Artifact   RES /evidence POST with hash                 1:1; JSON content → SHA256
Governance Metric   OpenTelemetry Gauge/Counter                  1:1; metric name == spec metric
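
The last mapping rule (metric name == spec metric) lends itself to an automated fidelity check. A minimal sketch, assuming the spec and the runtime each expose their metric names as a set:

```python
# Sketch of a spec-to-code fidelity check for the Governance Metric row:
# every metric named in the spec must have an identically named runtime
# metric, and vice versa. The set inputs are assumed, not a real API.
def check_metric_mapping(spec_metrics: set[str],
                         runtime_metrics: set[str]) -> list[str]:
    """Return mapping violations; an empty list means the 1:1 mapping holds."""
    errors = []
    for missing in sorted(spec_metrics - runtime_metrics):
        errors.append(f"spec metric '{missing}' has no runtime counterpart")
    for extra in sorted(runtime_metrics - spec_metrics):
        errors.append(f"runtime metric '{extra}' is not in the spec")
    return errors
```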

System Invariants

Non-negotiable truths that must hold across the system.

INV-ID       Invariant                                             Type     Enforcement
INV-GS-001   Policy enforcement must not modify inference code     System   Sidecar architecture
INV-GS-002   All evidence artifacts must be SHA256 hashed          System   RES submission validator
INV-GS-003   Governance gates must block on threshold violations   Process  CI/CD gate exit code
INV-GS-004   Quality: pass@5 ≥ 0.82                                Entity   Pre-merge check
INV-GS-005   Fairness: subgroup_delta ≤ 0.05                       Entity   Pre-merge check
INV-GS-006   Safety: harmful_rate ≤ 0.005                          Entity   Pre-merge check
INV-GS-007   Drift: PSI ≤ 0.2 or trigger retraining                Process  Monitoring alert
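
The drift check behind INV-GS-007 uses the Population Stability Index. A minimal sketch over pre-binned proportions, using the standard PSI formula; the binning step itself is assumed to have happened upstream.

```python
import math

def psi(expected: list[float], actual: list[float]) -> float:
    """Population Stability Index over pre-binned proportions (INV-GS-007).

    Inputs are per-bin proportions that each sum to 1. PSI sums
    (actual - expected) * ln(actual / expected) across bins; a small
    floor avoids log(0) and division by zero for empty bins.
    """
    eps = 1e-6
    total = 0.0
    for e, a in zip(expected, actual):
        e, a = max(e, eps), max(a, eps)
        total += (a - e) * math.log(a / e)
    return total

def needs_retraining(expected: list[float], actual: list[float],
                     threshold: float = 0.2) -> bool:
    """True when drift exceeds the ADR's PSI threshold of 0.2."""
    return psi(expected, actual) > threshold
```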

Quality Attributes

Attribute      Target                            Rationale
Latency        Policy Gateway < 10ms overhead    User experience
Availability   99.5% for governance services     Production reliability
Auditability   All decisions logged with hash    Compliance requirement
Idempotency    Evidence submission retry-safe    Distributed system reliability
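
The idempotency attribute can be sketched by using the content hash as the idempotency key, so a retried submission of the same artifact is recognized rather than stored twice. The in-memory store below stands in for the Risk & Evidence Service; its interface is an assumption for illustration.

```python
import hashlib
import json

# Retry-safe evidence submission sketch. The dict stands in for the
# Risk & Evidence Service's storage; the content hash doubles as the
# idempotency key, so duplicate POSTs are detected, not re-recorded.
_STORE: dict[str, dict] = {}

def submit_evidence(artifact: dict) -> tuple[str, bool]:
    """Return (evidence_id, created). A retried submission of the same
    content returns the same id with created=False."""
    canonical = json.dumps(artifact, sort_keys=True, separators=(",", ":"))
    evidence_id = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    if evidence_id in _STORE:
        return evidence_id, False  # safe retry: already recorded
    _STORE[evidence_id] = artifact
    return evidence_id, True
```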

Bounded Contexts Impacted

Consequences

Benefits

Trade-offs


Success Criteria

Another developer can read this ADR and understand:

  1. The architectural guardrails that constrain LLMOps implementation choices.
  2. The isomorphic mappings that guarantee spec-to-code fidelity for governance rules.
  3. The system invariants that must never be violated during AI deployment.