# ⚡ Performance Tuning

How to optimize cognitive workflow performance across latency, token cost, accuracy, and memory.


## Key Metrics

| Metric | Description | Target |
|---|---|---|
| Latency | End-to-end workflow time | < 30s |
| Token Cost | Total tokens per workflow | Minimize |
| Convergence | Rounds to stable state | < 5 |
| Accuracy | Output quality | > 90% |

## Latency Optimization

### Reduce Round Count

```yaml
execution:
  maxRounds: 3                # Down from 5
  convergenceThreshold: 0.90  # Lower threshold
```

A lower `convergenceThreshold` lets the workflow declare a stable state sooner, at some risk of stopping before the specialists fully agree.

### Increase Parallelism

All specialists run in parallel by default. Ensure no serial dependencies:

```yaml
specialists:
  - agentId: "agent-1"  # These run
  - agentId: "agent-2"  # in parallel
  - agentId: "agent-3"  # automatically
```

### Use Faster Models

```yaml
# For speed-critical agents
agents:
  - agentId: "fast-classifier"
    model: "claude-haiku"  # Faster than Sonnet
    maxOutputTokens: 200
```

## Token Cost Optimization

### Tighter Token Budgets

```yaml
specialists:
  - agentId: "semantic-mapper"
    maxOutputTokens: 300  # Down from 500
```

### Compact Output Schemas

Short keys shrink every round's payload, since schema key names are repeated in each agent output:

```json
// ❌ Verbose
{ "findings": [{ "text": "...", "source": "...", "confidence": 0.9 }] }

// ✅ Compact
{ "f": ["..."], "c": [0.9] }
```

### Model Selection by Task

| Task | Model | Cost |
|---|---|---|
| Classification | Haiku | Low |
| Analysis | Sonnet | Medium |
| Reasoning | Opus | High |
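
Applied to a single workflow, the table suggests mixing tiers per agent. A sketch using the agent IDs from elsewhere on this page; exact model identifiers depend on your deployment:

```yaml
agents:
  - agentId: "fast-classifier"
    model: "claude-haiku"   # classification: cheap and fast
  - agentId: "semantic-mapper"
    model: "claude-sonnet"  # analysis: mid-cost
  - agentId: "rule-analyst"
    model: "claude-opus"    # reasoning: highest cost, strongest output
```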

## Accuracy Optimization

### More Rounds

```yaml
execution:
  maxRounds: 7  # Up from 5
```

### Higher Temperature for Creativity

```yaml
agents:
  - agentId: "creative-agent"
    temperature: 0.9  # More creative
```

### Lower Temperature for Precision

```yaml
agents:
  - agentId: "rule-analyst"
    temperature: 0.2  # More deterministic
```

## Memory Optimization

### State Size Limits

```yaml
validator:
  checks:
    - type: "state-size"
      maxBytes: 30000  # Reduce from 50000
      onViolation: "clip"
```

### Delta-Only Updates

Ensure agents produce deltas, not full state replacements:

```yaml
aggregator:
  strategy: "delta-merge"  # Not "full-replace"
```

## Monitoring

### Enable Detailed Tracing

```yaml
observability:
  tracing:
    enabled: true
    detailLevel: "full"
  metrics:
    enabled: true
    buckets: [0.1, 0.5, 1, 2, 5, 10, 30]  # latency histogram boundaries, in seconds
```

### View Performance Dashboard

```bash
just cognitive-perf-dashboard --workflow-id <id>
```

## Benchmarking

```bash
# Run benchmark suite
just cognitive-benchmark \
  --config router-config.yaml \
  --test-cases tests/benchmark-cases.json \
  --iterations 10

# Output:
# Avg Latency: 2.3s
# Avg Tokens: 1500
# Avg Rounds: 3.2
# Accuracy: 94%
```
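
One practical pattern: run the suite against a baseline config and a tuned config over the same test cases, then compare the four averages; a latency win that pushes Accuracy below the 90% target is a regression, not an optimization. The config file names below are illustrative:

```bash
# Before/after comparison, using the same flags as above
just cognitive-benchmark --config baseline-config.yaml \
  --test-cases tests/benchmark-cases.json --iterations 10
just cognitive-benchmark --config tuned-config.yaml \
  --test-cases tests/benchmark-cases.json --iterations 10
```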

## Tuning Tradeoffs

| Want | Adjust | Tradeoff |
|---|---|---|
| Lower latency | Fewer rounds, faster models | May reduce accuracy |
| Lower cost | Smaller budgets, cheaper models | May reduce quality |
| Higher accuracy | More rounds, better models | Increases latency and cost |
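
Combining the knobs from earlier sections, a speed-leaning profile might look like the sketch below. Every field appears elsewhere on this page, but the combination and values are illustrative starting points, not recommendations:

```yaml
# Illustrative speed-leaning profile: trades some accuracy for latency and cost
execution:
  maxRounds: 3                # fewer rounds -> lower latency
  convergenceThreshold: 0.90  # settle sooner
agents:
  - agentId: "fast-classifier"
    model: "claude-haiku"     # cheaper, faster tier
    maxOutputTokens: 200      # tight budget -> fewer tokens per round
```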