NATS HA + Secrets Runbook

Purpose

Operate the NATS JetStream HA cluster and manage production secrets with SOPS, while development environments continue to use environment variables.

Scope

Local HA (Docker)

Start

SEA_NATS_PROFILE=nats-ha just dev-up

Verify health

curl http://localhost:8222/healthz
curl http://localhost:8223/healthz
curl http://localhost:8224/healthz
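
The three checks above can be folded into one helper that reports per-port status and fails if any node is unhealthy. This is a minimal sketch: the function name `check_nats_health` is hypothetical, and it assumes the monitoring ports are the ones listed above.

```shell
# Probe /healthz on each monitoring port passed as an argument.
# Prints one status line per port and returns non-zero if any probe fails.
check_nats_health() {
  nats_failed=0
  for nats_port in "$@"; do
    # -f makes curl fail on HTTP error codes; --max-time bounds a hung node.
    if curl -fsS --max-time 3 "http://localhost:${nats_port}/healthz" >/dev/null 2>&1; then
      echo "port ${nats_port}: ok"
    else
      echo "port ${nats_port}: FAILED"
      nats_failed=1
    fi
  done
  return "$nats_failed"
}
```

Usage: `check_nats_health 8222 8223 8224`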

Verify JetStream replication

curl -s http://localhost:8222/jsz | jq '.meta_cluster'
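
To read the replication state at a glance, the `/jsz` output can be reduced to a short report. The sketch below assumes the `meta_cluster` object that `/jsz` returns in clustered mode, with its `leader`, `cluster_size`, and `replicas` fields. The function name `meta_replica_report` is hypothetical.

```shell
# Read a /jsz JSON document on stdin and print the meta leader, the cluster
# size, and one line per replica as seen by the queried server.
meta_replica_report() {
  jq -r '.meta_cluster
    | "leader=\(.leader) size=\(.cluster_size)",
      (.replicas[]? | "replica=\(.name) current=\(.current) offline=\(.offline // false)")'
}
```

Usage: `curl -s http://localhost:8222/jsz | meta_replica_report`. A replica that reports `current=false` or `offline=true` is worth investigating before running the failover check.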

Failover check

docker stop sea-nats-1

Expected: publishing and consuming continue, and a new leader is elected.
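
One way to confirm the election is to poll a surviving node until it reports a meta leader. This is a sketch under assumptions: the function name `wait_for_meta_leader` is hypothetical, and it relies on the `meta_cluster.leader` field of the `/jsz` response.

```shell
# Poll a monitoring endpoint until the JetStream meta group reports a leader.
# $1: base monitoring URL, $2: number of attempts, one per second (default 30).
wait_for_meta_leader() {
  nats_url=$1
  nats_tries=${2:-30}
  while [ "$nats_tries" -gt 0 ]; do
    # An unreachable node or a missing leader both yield an empty string.
    nats_leader=$(curl -fsS --max-time 2 "${nats_url}/jsz" 2>/dev/null \
      | jq -r '.meta_cluster.leader // empty' 2>/dev/null) || nats_leader=""
    if [ -n "$nats_leader" ]; then
      echo "meta leader: ${nats_leader}"
      return 0
    fi
    nats_tries=$((nats_tries - 1))
    sleep 1
  done
  echo "no meta leader reported by ${nats_url}" >&2
  return 1
}
```

Usage: `wait_for_meta_leader http://localhost:8223 30`, assuming the stopped container is not the one serving port 8223. Restart the stopped node afterwards with `docker start sea-nats-1`.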

If cluster does not form

K3s/Mesh (Kubernetes)

Check pods and services

kubectl -n <namespace> get pods -l app=nats
kubectl -n <namespace> get svc nats nats-headless

Routing env

Health and replication

# port-forward blocks, so run it in the background (or in a separate terminal)
kubectl -n <namespace> port-forward svc/nats 8222:8222 &
sleep 2
curl -s http://localhost:8222/jsz | jq '.meta_cluster'

Failover check

kubectl -n <namespace> delete pod nats-0

Expected: new leader elected, streams remain available.
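
After the delete, it can help to wait for the replacement pod before judging the cluster healthy. The sketch below assumes a controller recreates `nats-0` (the pod name suggests a StatefulSet, but that is an assumption). The function name `wait_for_nats_pod` is hypothetical.

```shell
# Wait for a recreated pod to become Ready, retrying because the pod may not
# exist yet in the moment right after deletion.
# $1: namespace, $2: pod name, $3: number of attempts (default 12).
wait_for_nats_pod() {
  k8s_ns=$1
  k8s_pod=$2
  k8s_tries=${3:-12}
  while [ "$k8s_tries" -gt 0 ]; do
    if kubectl -n "$k8s_ns" wait --for=condition=Ready "pod/${k8s_pod}" --timeout=10s >/dev/null 2>&1; then
      echo "${k8s_pod} is Ready"
      return 0
    fi
    k8s_tries=$((k8s_tries - 1))
    sleep 1
  done
  echo "${k8s_pod} did not become Ready" >&2
  return 1
}
```

Usage: `wait_for_nats_pod <namespace> nats-0`, then repeat the replication check above to confirm the pod has rejoined the meta group.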

Production Secrets (SOPS)

Required keys

Manage secrets

Warning: Do not pass secrets as command arguments; use interactive mode or file redirection.

Interactive (Recommended):

just sops-init
just sops-edit

File-based:

just sops-add NATS_AUTH_TOKEN < token.txt
just sops-add NATS_TLS_CERT < cert.pem
just sops-add NATS_TLS_KEY < key.pem
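
The file-based flow needs a plaintext file to exist briefly. The sketch below shows one way to create `token.txt` so that only the owner can read it. It assumes a random 64-character hex string is an acceptable token format, which this runbook does not specify, and the function name `write_token_file` is hypothetical.

```shell
# Write a random 64-hex-character token to the path given as $1.
# The subshell keeps the restrictive umask from leaking into the caller.
# Any existing file is removed first so the new permissions apply.
write_token_file() {
  (
    umask 077
    rm -f "$1"
    head -c 32 /dev/urandom | od -An -tx1 | tr -d ' \n' > "$1"
  )
}
```

Usage: `write_token_file token.txt`, then `just sops-add NATS_AUTH_TOKEN < token.txt`, then `rm -f token.txt` so the plaintext does not linger on disk.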

Notes

Troubleshooting

Cluster does not form

JetStream not available

Auth/TLS failures