# ⚡ Performance Tuning

How to optimize cognitive workflow performance across latency, token cost, accuracy, and memory.


## Key Metrics

| Metric | Description | Target |
|---|---|---|
| Latency | End-to-end workflow time | < 30s |
| Token Cost | Total tokens per workflow | Minimize |
| Convergence | Rounds to stable state | < 5 |
| Accuracy | Output quality | > 90% |

## Latency Optimization

### Reduce Round Count

```yaml
execution:
  maxRounds: 3                # Down from 5
  convergenceThreshold: 0.90  # Lower threshold
```

A lower `convergenceThreshold` lets the workflow declare a stable state sooner, at some risk of stopping before the specialists fully agree.

### Increase Parallelism

All specialists run in parallel by default. Ensure no serial dependencies:

```yaml
specialists:
  - agentId: "agent-1"  # These run
  - agentId: "agent-2"  # in parallel
  - agentId: "agent-3"  # automatically
```

### Use Faster Models

```yaml
# For speed-critical agents
agents:
  - agentId: "fast-classifier"
    model: "claude-haiku"  # Faster than Sonnet
    maxOutputTokens: 200
```

## Token Cost Optimization

### Tighter Token Budgets

```yaml
specialists:
  - agentId: "semantic-mapper"
    maxOutputTokens: 300  # Down from 500
```

### Compact Output Schemas

Short keys shrink every round's payload, since schema key names are repeated in each agent output:

```json
// ❌ Verbose
{ "findings": [{ "text": "...", "source": "...", "confidence": 0.9 }] }

// ✅ Compact
{ "f": ["..."], "c": [0.9] }
```

### Model Selection by Task

| Task | Model | Cost |
|---|---|---|
| Classification | Haiku | Low |
| Analysis | Sonnet | Medium |
| Reasoning | Opus | High |
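
Applied to a single workflow, the table suggests mixing tiers per agent. A sketch using the agent IDs from elsewhere on this page; exact model identifiers depend on your deployment:

```yaml
agents:
  - agentId: "fast-classifier"
    model: "claude-haiku"   # classification: cheap and fast
  - agentId: "semantic-mapper"
    model: "claude-sonnet"  # analysis: mid-cost
  - agentId: "rule-analyst"
    model: "claude-opus"    # reasoning: highest cost, strongest output
```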

## Accuracy Optimization

### More Rounds

```yaml
execution:
  maxRounds: 7  # Up from 5
```

### Higher Temperature for Creativity

```yaml
agents:
  - agentId: "creative-agent"
    temperature: 0.9  # More creative
```

### Lower Temperature for Precision

```yaml
agents:
  - agentId: "rule-analyst"
    temperature: 0.2  # More deterministic
```

## Memory Optimization

### State Size Limits

```yaml
validator:
  checks:
    - type: "state-size"
      maxBytes: 30000  # Reduce from 50000
      onViolation: "clip"
```

### Delta-Only Updates

Ensure agents produce deltas, not full state replacements:

```yaml
aggregator:
  strategy: "delta-merge"  # Not "full-replace"
```

## Monitoring

### Enable Detailed Tracing

```yaml
observability:
  tracing:
    enabled: true
    detailLevel: "full"
  metrics:
    enabled: true
    buckets: [0.1, 0.5, 1, 2, 5, 10, 30]  # latency histogram boundaries, in seconds
```

### View Performance Dashboard

```bash
just cognitive-perf-dashboard --workflow-id <id>
```

## Benchmarking

```bash
# Run benchmark suite
just cognitive-benchmark \
  --config router-config.yaml \
  --test-cases tests/benchmark-cases.json \
  --iterations 10

# Output:
# Avg Latency: 2.3s
# Avg Tokens: 1500
# Avg Rounds: 3.2
# Accuracy: 94%
```
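
One practical pattern: run the suite against a baseline config and a tuned config over the same test cases, then compare the four averages; a latency win that pushes Accuracy below the 90% target is a regression, not an optimization. The config file names below are illustrative:

```bash
# Before/after comparison, using the same flags as above
just cognitive-benchmark --config baseline-config.yaml \
  --test-cases tests/benchmark-cases.json --iterations 10
just cognitive-benchmark --config tuned-config.yaml \
  --test-cases tests/benchmark-cases.json --iterations 10
```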

## Tuning Tradeoffs

| Want | Adjust | Tradeoff |
|---|---|---|
| Lower latency | Fewer rounds, faster models | May reduce accuracy |
| Lower cost | Smaller budgets, cheaper models | May reduce quality |
| Higher accuracy | More rounds, better models | Increases latency and cost |
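
Combining the knobs from earlier sections, a speed-leaning profile might look like the sketch below. Every field appears elsewhere on this page, but the combination and values are illustrative starting points, not recommendations:

```yaml
# Illustrative speed-leaning profile: trades some accuracy for latency and cost
execution:
  maxRounds: 3                # fewer rounds -> lower latency
  convergenceThreshold: 0.90  # settle sooner
agents:
  - agentId: "fast-classifier"
    model: "claude-haiku"     # cheaper, faster tier
    maxOutputTokens: 200      # tight budget -> fewer tokens per round
```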