Integration Handbook Epic
User Journey
The Integration bounded context provides the messaging infrastructure and API patterns that SEA™ Forge services use to communicate reliably. It implements NATS/JetStream for event streaming, the outbox/inbox pattern for transactional messaging, API patterns for REST endpoints, and runbooks for debugging, DLQ processing, and troubleshooting.
Jobs to be Done & EARS Requirements
Job: Start NATS Infrastructure
User Story: As a platform engineer, I want to start NATS and PostgreSQL for messaging, so that services have reliable communication infrastructure.
EARS Requirement:
- While starting infrastructure, when docker-compose up is executed, the integration context shall:
- Start NATS container on ports 4222 (client connections) and 8222 (monitoring)
- Start PostgreSQL container on port 5432
- Verify both services report “running” status
- Expose NATS monitoring UI at http://localhost:8222
- Configure PostgreSQL with the DATABASE_URL environment variable
Job: Initialize JetStream Stream
User Story: As a messaging architect, I want to create a NATS JetStream stream with retention policies, so that events are durably stored and replayable.
EARS Requirement:
- While configuring JetStream, when stream is created, the integration context shall:
- Create Stream with command: nats stream add SEA_EVENTS
- Subject pattern: sea.event.> (wildcard for all event types)
- Retention policy: limits (bounded by max age, max bytes, and max messages). The alternatives are interest-based retention (messages deleted once all consumers acknowledge) and work-queue retention (each message delivered to exactly one consumer); SEA_EVENTS uses limits.
- Max age: 7d (7 days)
- Max bytes: 10GB
- Storage: file (durable storage)
- Create Consumer with command: nats consumer add SEA_EVENTS sea__default
- Filter: sea.event.>
- Ack policy: explicit
- Ack wait: 120s
- Max deliver: 20 (retry limit before DLQ)
- Pull mode: true
- Verify Creation:
- Confirm stream exists with nats stream info SEA_EVENTS
- Confirm consumer exists with nats consumer info SEA_EVENTS sea__default
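Collected in one place, the stream and consumer settings above can be sketched as plain data. The dict layout below is illustrative (it is not the NATS client API), with 7d and 10GB translated into base units a client library would typically take:

```python
# Plain-data sketch of the SEA_EVENTS stream and sea__default consumer
# configuration. The dict layout is illustrative, not the NATS client API;
# the values mirror the spec above.
STREAM_CONFIG = {
    "name": "SEA_EVENTS",
    "subjects": ["sea.event.>"],       # wildcard for all event types
    "retention": "limits",             # alternatives: interest, workqueue
    "max_age_seconds": 7 * 24 * 3600,  # 7d
    "max_bytes": 10 * 1024**3,         # 10GB
    "storage": "file",                 # durable storage
}

CONSUMER_CONFIG = {
    "stream": "SEA_EVENTS",
    "name": "sea__default",
    "filter_subject": "sea.event.>",
    "ack_policy": "explicit",
    "ack_wait_seconds": 120,
    "max_deliver": 20,                 # retry limit before DLQ
    "pull": True,
}
```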
Job: Run Database Migrations
User Story: As a database administrator, I want to create outbox and inbox tables, so that transactional messaging is enabled.
EARS Requirement:
- While running migrations, when just db-migrate is executed, the integration context shall:
- Create Outbox Table (outbox_events):
id: UUID primary key
aggregate_type: Entity type string
aggregate_id: Entity instance ID
event_type: Domain event name
payload: JSONB event data
created_at: Timestamp
published: Boolean (false until sent)
publish_error: Text error if failed
- Create Inbox Table (inbox_messages):
id: UUID primary key
message_id: Unique message identifier
source_context: Originating bounded context
event_type: Domain event name
payload: JSONB event data
processed: Boolean (false until handled)
processing_error: Text error if failed
created_at: Timestamp
delivered_at: Timestamp
- Create Indexes for query performance
- Return migration confirmation with table counts
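The two table shapes above can be sketched in SQL. The snippet below uses SQLite as a self-contained stand-in for PostgreSQL, so UUID and JSONB columns become TEXT and the index name is illustrative; this is a sketch of the schema, not the production migration:

```python
import sqlite3

# Illustrative DDL for the outbox/inbox tables. SQLite stands in for
# PostgreSQL here, so UUID -> TEXT and JSONB -> TEXT; production types differ.
DDL = """
CREATE TABLE IF NOT EXISTS outbox_events (
    id             TEXT PRIMARY KEY,       -- UUID in PostgreSQL
    aggregate_type TEXT NOT NULL,          -- entity type string
    aggregate_id   TEXT NOT NULL,          -- entity instance ID
    event_type     TEXT NOT NULL,          -- domain event name
    payload        TEXT NOT NULL,          -- JSONB in PostgreSQL
    created_at     TEXT NOT NULL,
    published      INTEGER NOT NULL DEFAULT 0,
    publish_error  TEXT
);
CREATE INDEX IF NOT EXISTS idx_outbox_unpublished
    ON outbox_events (published, created_at);

CREATE TABLE IF NOT EXISTS inbox_messages (
    id               TEXT PRIMARY KEY,
    message_id       TEXT NOT NULL UNIQUE,  -- enables idempotency checks
    source_context   TEXT NOT NULL,
    event_type       TEXT NOT NULL,
    payload          TEXT NOT NULL,
    processed        INTEGER NOT NULL DEFAULT 0,
    processing_error TEXT,
    created_at       TEXT NOT NULL,
    delivered_at     TEXT
);
"""

def migrate(conn: sqlite3.Connection) -> list:
    """Apply the DDL and return the table names as migration confirmation."""
    conn.executescript(DDL)
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name"
    ).fetchall()
    return [r[0] for r in rows]

conn = sqlite3.connect(":memory:")
tables = migrate(conn)
```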
Job: Publish Outbox Event
User Story: As a service developer, I want to write domain events to the outbox within the same transaction as state changes, so that messaging is reliable.
EARS Requirement:
- While processing domain commands, when an event is emitted, the integration context shall:
- Start Transaction:
- Begin database transaction
- Perform domain state changes
- Insert event into outbox_events table
- Set Outbox Fields:
id: Generate UUID
aggregate_type: Entity type (e.g., “Order”)
aggregate_id: Entity instance ID
event_type: Event name (e.g., “OrderCreated”)
payload: Event data as JSONB
created_at: Current timestamp
published: false (not yet sent)
- Commit Transaction:
- Both state change and outbox entry succeed or fail together
- Return success confirmation
- Background Worker picks up unpublished events for NATS publishing
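The transaction flow above can be sketched as follows, with SQLite standing in for PostgreSQL. The orders table and create_order helper are hypothetical; the outbox columns follow the schema in this document:

```python
import json
import sqlite3
import uuid
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id TEXT PRIMARY KEY, status TEXT)")
conn.execute("""
CREATE TABLE outbox_events (
    id TEXT PRIMARY KEY, aggregate_type TEXT, aggregate_id TEXT,
    event_type TEXT, payload TEXT, created_at TEXT,
    published INTEGER DEFAULT 0, publish_error TEXT
)""")

def create_order(conn, order_id: str) -> None:
    """Perform the state change and the outbox insert in one transaction:
    both commit together or roll back together."""
    with conn:  # sqlite3 context manager: BEGIN ... COMMIT/ROLLBACK
        conn.execute("INSERT INTO orders (id, status) VALUES (?, 'created')",
                     (order_id,))
        conn.execute(
            "INSERT INTO outbox_events "
            "(id, aggregate_type, aggregate_id, event_type, payload, created_at) "
            "VALUES (?, ?, ?, ?, ?, ?)",
            (str(uuid.uuid4()), "Order", order_id, "OrderCreated",
             json.dumps({"order_id": order_id}),
             datetime.now(timezone.utc).isoformat()),
        )

create_order(conn, "order-1")
# A background worker would now poll: SELECT ... WHERE published = 0
```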
Job: Process Inbox Message
User Story: As a service developer, I want to receive events from the inbox with idempotency, so that duplicate deliveries are processed only once.
EARS Requirement:
- While consuming messages, when a message is delivered to inbox, the integration context shall:
- Write to Inbox:
- Insert into inbox_messages table with:
message_id: Unique ID from NATS
source_context: Publishing bounded context
event_type: Domain event name
payload: Event data as JSONB
processed: false
- Check Idempotency:
- Query for existing record with same message_id
- If exists, skip processing (already handled)
- If new, proceed with message handling
- Process Message:
- Invoke message handler with payload
- Update processed to true on success
- Set processing_error on failure
- Record delivered_at timestamp
- Ack NATS Message after successful processing
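A sketch of the insert-then-process flow, again with SQLite standing in for PostgreSQL. The handle_delivery helper is hypothetical; the UNIQUE constraint on message_id provides the idempotency check:

```python
import json
import sqlite3
import uuid
from datetime import datetime, timezone

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE inbox_messages (
    id TEXT PRIMARY KEY, message_id TEXT NOT NULL UNIQUE, source_context TEXT,
    event_type TEXT, payload TEXT, processed INTEGER DEFAULT 0,
    processing_error TEXT, created_at TEXT, delivered_at TEXT
)""")

def handle_delivery(conn, message_id, source, event_type, payload, handler):
    """Insert-then-process. INSERT OR IGNORE plus the UNIQUE constraint makes
    duplicate deliveries a no-op; returns True only when this call processed
    the message (duplicates return False but should still ack the NATS msg)."""
    now = datetime.now(timezone.utc).isoformat()
    with conn:
        cur = conn.execute(
            "INSERT OR IGNORE INTO inbox_messages "
            "(id, message_id, source_context, event_type, payload, created_at) "
            "VALUES (?, ?, ?, ?, ?, ?)",
            (str(uuid.uuid4()), message_id, source, event_type,
             json.dumps(payload), now))
        if cur.rowcount == 0:
            return False  # already seen: skip processing
        try:
            handler(payload)
            processed, error = 1, None
        except Exception as exc:  # record the failure for later inspection
            processed, error = 0, str(exc)
        conn.execute(
            "UPDATE inbox_messages SET processed = ?, processing_error = ?, "
            "delivered_at = ? WHERE message_id = ?",
            (processed, error, now, message_id))
        return processed == 1
```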
Job: Route to Dead Letter Queue
User Story: As a system operator, I want failed messages to route to DLQ after retry exhaustion, so that they can be analyzed and replayed.
EARS Requirement:
- While processing messages, when max delivery count is exceeded (20), the integration context shall:
- Detect Exhaustion:
- Count delivery attempts for message
- If count >= max_deliver (20), route to DLQ
- Route to DLQ:
- Publish to DLQ subject: sea.dlq.<original_subject>
- Include original message content
- Add metadata: failure count, error messages, timestamps
- Ack Original Message to remove from active queue
- Log DLQ Event for monitoring and alerting
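The exhaustion check and subject mapping above can be sketched as pure functions. The subject scheme follows this document; the function and metadata key names are hypothetical:

```python
# Retry limit from the sea__default consumer configuration.
MAX_DELIVER = 20

def is_exhausted(num_delivered: int, max_deliver: int = MAX_DELIVER) -> bool:
    """True once the delivery count has reached the retry limit."""
    return num_delivered >= max_deliver

def dlq_subject(original_subject: str) -> str:
    """Map sea.event.<suffix> to sea.dlq.<suffix>, per the routing rule."""
    prefix = "sea.event."
    if not original_subject.startswith(prefix):
        raise ValueError(f"unexpected subject: {original_subject}")
    return "sea.dlq." + original_subject[len(prefix):]

def dlq_metadata(failure_count: int, last_error: str, failed_at: str) -> dict:
    """Metadata attached alongside the original payload (names illustrative)."""
    return {
        "Dlq-Failure-Count": str(failure_count),
        "Dlq-Last-Error": last_error,
        "Dlq-Failed-At": failed_at,
    }
```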
Job: Replay Dead Letter Messages
User Story: As a system operator, I want to replay messages from DLQ after fixing issues, so that I can recover from transient failures.
EARS Requirement:
- While managing DLQ, when replay is requested, the integration context shall:
- List DLQ Messages:
- Query DLQ stream for messages
- Show metadata: original subject, failure count, error, timestamps
- Select Messages for replay:
- Choose specific message ID or replay all
- Reset delivery count to 0
- Clear error metadata
- Republish to Original Subject:
- Publish to sea.event.<original_subject>
- Preserve original payload and headers
- Generate new message ID
- Ack DLQ Message to remove from DLQ
- Return replay confirmation with message count
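The replay flow above can be sketched over an in-memory DLQ. The publish callable stands in for a NATS client, and the message dict shape and replay helper are hypothetical:

```python
import uuid

def replay(dlq_messages: list, publish, message_id=None) -> int:
    """Republish DLQ entries to their original subject, then remove them
    from the DLQ. Replays one entry by ID, or all when message_id is None."""
    selected = [m for m in dlq_messages
                if message_id is None or m["id"] == message_id]
    for msg in selected:
        # sea.dlq.<suffix> maps back to sea.event.<suffix>
        subject = "sea.event." + msg["subject"].removeprefix("sea.dlq.")
        # Clear failure metadata, keep other headers, assign a fresh message ID
        headers = {k: v for k, v in msg["headers"].items()
                   if not k.startswith("Dlq-")}
        headers["Nats-Msg-Id"] = str(uuid.uuid4())
        publish(subject, msg["payload"], headers)
        dlq_messages.remove(msg)  # ack: remove from the DLQ
    return len(selected)
```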
Job: Design REST API Endpoints
User Story: As an API designer, I want to follow SEA™ API patterns for consistent REST endpoints, so that services integrate seamlessly.
EARS Requirement:
- While designing APIs, when endpoint is created, the integration context shall:
- Follow Naming Conventions:
- Use kebab-case for resource names (e.g., /bounded-contexts)
- Use plural nouns for collections (e.g., /orders, not /order)
- Nest resources logically (e.g., /orders/{id}/items)
- Use HTTP Methods Semantically:
GET: Retrieve resource (idempotent, safe)
POST: Create resource (non-idempotent)
PUT: Replace resource (idempotent)
PATCH: Partial update (not inherently idempotent; can be made idempotent using conditional requests or idempotent patch semantics)
DELETE: Remove resource (idempotent)
- Return Standard Status Codes:
- 200 OK: Successful GET/PUT/PATCH/DELETE
- 201 Created: Successful POST
- 204 No Content: Successful DELETE with no body
- 400 Bad Request: Invalid input
- 404 Not Found: Resource doesn’t exist
- 409 Conflict: Resource state conflict
- 422 Unprocessable Entity: Validation failed
- 500 Internal Server Error: Unexpected error
- Include Error Responses with:
error: Machine-readable error code
message: Human-readable description
details: Additional context (validation errors, field names)
request_id: For correlation
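The error body above can be sketched as a small builder. The field names follow the spec; the helper itself is illustrative:

```python
import uuid

def error_response(error: str, message: str, details=None, request_id=None) -> dict:
    """Build the standard error body: machine-readable code, human-readable
    message, optional details, and a request_id for log correlation."""
    return {
        "error": error,
        "message": message,
        "details": details or {},
        "request_id": request_id or str(uuid.uuid4()),
    }

body = error_response(
    "validation_failed",
    "Order quantity must be positive",
    details={"quantity": "must be greater than 0"},
)
```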
Job: Debug API Integration Issues
User Story: As a developer, I want to diagnose API integration problems using runbooks, so that I can resolve issues quickly.
EARS Requirement:
- While debugging APIs, when runbook steps are followed, the integration context shall:
- Check Service Health:
- Verify service is running (process status, container status)
- Check health endpoint returns 200 OK
- Review logs for startup errors
- Test API Endpoint:
- Use curl or HTTP client to call endpoint
- Check response status code and body
- Verify response headers (Content-Type, correlation-id)
- Analyze Error Response:
- Parse error code and message
- Check error details for specific issues
- Correlate request_id with service logs
- Review Metrics:
- Check request rate and error rate
- Monitor latency percentiles (p50, p95, p99)
- Identify anomalies or spikes
- Follow Runbook Steps:
- Start with api_debugging.md
- Proceed to specific troubleshooting based on symptoms
- Document resolution for future reference
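The first triage decision in the runbook can be sketched as a status-code dispatch. The triage helper and its messages are hypothetical, mapping each response class to a next step:

```python
def triage(status: int, body: dict) -> str:
    """Suggest the next debugging step from a response (illustrative mapping)."""
    if 200 <= status < 300:
        return "healthy: no action needed"
    if status in (400, 422):
        return f"client error ({body.get('error')}): check request payload"
    if status == 404:
        return "not found: verify resource ID and route"
    if status == 409:
        return "conflict: inspect current resource state"
    if status >= 500:
        return f"server error: search logs for request_id={body.get('request_id')}"
    return "unexpected status: start with api_debugging.md"
```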
Domain Entities Summary
Root Aggregates
- OutboxEvent: Domain event awaiting publication with id, aggregate_type, aggregate_id, event_type, payload, created_at, published, and publish_error
- InboxMessage: Received message with id, message_id, source_context, event_type, payload, processed, processing_error, and delivered_at
- DLQMessage: Dead letter queue message with original payload, failure count, error details, and timestamps
- APIEndpoint: REST API definition with route, method, request schema, response schema, and error codes
Value Objects
- JetStreamStream: NATS stream configuration with name, subjects, retention, max-age, and storage
- JetStreamConsumer: NATS consumer configuration with filter, ack policy, wait time, and max-deliver
- APIRequest: HTTP request with method, path, headers, body, and correlation-id
- APIResponse: HTTP response with status code, body, headers, and error details
Policy Rules
- TransactionalOutbox: Outbox writes must be in same transaction as state changes
- IdempotentInbox: Messages with same message_id must be processed only once
- DLQAfterRetries: Messages route to DLQ after max_deliver attempts (20)
- StandardHTTPMethods: API endpoints must use HTTP methods semantically
Integration Points
- NATS/JetStream: Message broker for event streaming and durable storage
- PostgreSQL: Outbox and inbox tables for transactional messaging
- API Services: REST endpoints following SEA™ API patterns
- Monitoring Systems: Metrics and logs for API debugging and DLQ monitoring
Success Metrics
- Message Delivery Success: >99.5% of messages delivered without DLQ
- Outbox Draining: <5 seconds from outbox write to NATS publish
- API Response Time: p95 <200ms for typical endpoints
- DLQ Replay Success: >90% of replayed messages process successfully
Non-Functional Requirements
- NFR-001: Outbox writes are atomic with state changes
- NFR-002: Inbox processing is idempotent by message_id
- NFR-003: JetStream persists messages for 7 days
- NFR-004: API endpoints return consistent error response format