Wire the existing Policy Gateway to the LiteLLM Provider using dual-mode routing with a fail-closed circuit breaker, automated port conflict detection, OPA policy versioning, and circuit breaker observability.
Configure dual-mode routing with fail-closed default — Set LLM_ENABLE_POLICY_GATEWAY=false (dev) and LLM_ENABLE_POLICY_GATEWAY=true (prod), LLM_POLICY_GATEWAY_URL=http://policy-gateway:8080/v1, POLICY_GATEWAY_FAIL_MODE=closed in .env.example; enforce the prod default in the policy-gateway container entrypoint by failing fast when SEA_ENV=production and LLM_ENABLE_POLICY_GATEWAY!=true; update Policy Gateway’s llm_provider_url to http://llm-provider:8001/v1 in settings.py
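The fail-fast rule in this step can be sketched as a small check that the container entrypoint runs before starting the service. This is a minimal sketch, not the project's actual entrypoint: the function name `check_gateway_env` and the exception `GatewayConfigError` are hypothetical, and it assumes the entrypoint can call into Python. Only the `SEA_ENV` and `LLM_ENABLE_POLICY_GATEWAY` variables come from the plan.

```python
from typing import Mapping


class GatewayConfigError(RuntimeError):
    """Raised when the environment violates the production default (hypothetical name)."""


def check_gateway_env(env: Mapping[str, str]) -> None:
    """Fail fast when production runs without the Policy Gateway enabled.

    Assumption: values are compared case-insensitively after trimming whitespace.
    """
    sea_env = env.get("SEA_ENV", "").strip().lower()
    enabled = env.get("LLM_ENABLE_POLICY_GATEWAY", "").strip().lower()

    # In production the gateway must be explicitly enabled; a missing or
    # misspelled value is treated the same as "false".
    if sea_env == "production" and enabled != "true":
        raise GatewayConfigError(
            "SEA_ENV=production requires LLM_ENABLE_POLICY_GATEWAY=true"
        )
```

An entrypoint could call `check_gateway_env(os.environ)` and exit with a non-zero status on `GatewayConfigError`, so a misconfigured production container never starts serving traffic.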
Create port registry and pre-commit validation — Add port allocation table (Policy Gateway: 8080, LLM Provider: 8001) to 008-technical-specifications.md, create infra/docker/scripts/validate-ports.sh scanning script, add pre-commit hook in .pre-commit-config.yaml, and integrate into just doctor
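The plan implements this check as a shell script (validate-ports.sh); the core logic is sketched here in Python purely for illustration. The function names and the `(service, port)` input shape are assumptions. Only the two registry entries (Policy Gateway: 8080, LLM Provider: 8001) come from the plan.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Port allocation table from the plan.
PORT_REGISTRY: Dict[str, int] = {"policy-gateway": 8080, "llm-provider": 8001}


def find_port_conflicts(allocations: Iterable[Tuple[str, int]]) -> Dict[int, List[str]]:
    """Return every port that is claimed by more than one service."""
    by_port = defaultdict(set)
    for service, port in allocations:
        by_port[port].add(service)
    return {port: sorted(names) for port, names in by_port.items() if len(names) > 1}


def find_registry_drift(
    allocations: Iterable[Tuple[str, int]],
    registry: Dict[str, int] = PORT_REGISTRY,
) -> List[Tuple[str, int, int]]:
    """Return (service, actual_port, registered_port) for services that
    are in the registry but configured with a different port."""
    return [
        (service, port, registry[service])
        for service, port in allocations
        if service in registry and registry[service] != port
    ]
```

A pre-commit hook would fail the commit whenever either function returns a non-empty result for the ports scanned from the compose and settings files.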
Dockerize services with conditional startup — Add policy-gateway and llm-provider services to docker-compose.skeleton.yml with profiles: ["ai-governance"], health checks, depends_on: [opa], and conditional startup driven by compose profiles (not env toggles) to avoid ambiguous prod behavior
Add OpenAI-compatible proxy path: Create a /v1/chat/completions route in routes.py that delegates to the existing /policy/chat/completions handler and shares the exact auth and rate-limiting middleware chain; preserve X-Forwarded-* headers and OpenTelemetry span context propagation
Implement PolicyGatewayPort with circuit breaker — Create HttpPolicyGatewayClient in policy-gateway.adapter.ts and policy_gateway_client.py with circuit breaker (5 failures → open, 30s timeout) that emits policy_gateway.circuit_breaker.state_change OpenTelemetry events and respects POLICY_GATEWAY_FAIL_MODE
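The breaker behaviour described in this step can be sketched as a small state machine. This is a minimal sketch under stated assumptions, not the adapter itself: it reads "5 failures" as five consecutive failures and "30s timeout" as the time the breaker stays open before allowing one half-open trial call. The class and parameter names are hypothetical, and the `on_state_change` callback stands in for emitting the policy_gateway.circuit_breaker.state_change event.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")
CLOSED, OPEN, HALF_OPEN = "closed", "open", "half_open"


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected because the breaker is open."""


class CircuitBreaker:
    def __init__(
        self,
        failure_threshold: int = 5,
        reset_timeout_s: float = 30.0,
        clock: Callable[[], float] = time.monotonic,
        on_state_change: Optional[Callable[[str, int], None]] = None,
    ) -> None:
        self.failure_threshold = failure_threshold
        self.reset_timeout_s = reset_timeout_s
        self._clock = clock  # injectable so tests need not sleep
        self._on_state_change = on_state_change
        self.state = CLOSED
        self.failure_count = 0
        self._opened_at = 0.0

    def _transition(self, new_state: str) -> None:
        # Notify only on real changes so observers see one event per transition.
        if new_state != self.state:
            self.state = new_state
            if self._on_state_change is not None:
                self._on_state_change(new_state, self.failure_count)

    def call(self, fn: Callable[[], T]) -> T:
        if self.state == OPEN:
            if self._clock() - self._opened_at < self.reset_timeout_s:
                raise CircuitOpenError("policy gateway circuit is open")
            self._transition(HALF_OPEN)  # timeout elapsed: allow a trial call
        try:
            result = fn()
        except Exception:
            self.failure_count += 1
            # A failed trial call, or reaching the threshold, opens the circuit.
            if self.state == HALF_OPEN or self.failure_count >= self.failure_threshold:
                self._opened_at = self._clock()
                self._transition(OPEN)
            raise
        self.failure_count = 0
        self._transition(CLOSED)
        return result
```

With POLICY_GATEWAY_FAIL_MODE=closed, a caller would treat `CircuitOpenError` as a blocked request rather than bypassing the gateway and calling the LLM provider directly.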
Add OPA policy versioning — Add # Policy Version: 1.0.0 header to prompt_policy.rego and output_policy.rego, add policy_version field to SDS-047 schema, create docs/specs/shared/reference/opa-policy-migration-guide.md
Create OPA policy test suite — Add prompt_policy_test.rego and output_policy_test.rego covering 3 enforcement modes, PII patterns, jailbreak detection, token budgets; add just opa-test recipe to justfile
Add circuit breaker observability — Update SDS-030 with policy_gateway.circuit_breaker.state_change metric specification, implement OpenTelemetry event emission in circuit breaker adapters with attributes: state (open/closed/half_open), failure_count, timestamp
Create integration test suite and CI validation — Add tests/integration/test_policy_gateway_litellm.py validating blocked/allowed prompts, fail-closed behavior, circuit breaker recovery; add just ai-stack-up, just ai-stack-test to justfile; update .github/workflows/ci.yml to run port validation and OPA tests