This page covers optimizing cognitive workflow performance: which metrics to track, and which configuration knobs trade latency, cost, and accuracy against each other.
| Metric | Description | Target |
|---|---|---|
| Latency | End-to-end workflow time | < 30s |
| Token Cost | Total tokens per workflow | Minimize |
| Convergence | Rounds to stable state | < 5 |
| Accuracy | Output quality | > 90% |
To reduce latency, cap the number of rounds and relax the convergence threshold:

```yaml
execution:
  maxRounds: 3                 # Down from 5
  convergenceThreshold: 0.90   # Lower threshold
```
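How these two settings interact, as a minimal sketch: the loop stops as soon as either limit is hit. The `runRound` and `similarity` functions and the config shape here are illustrative assumptions, not the engine's actual API.

```typescript
// Sketch of a round loop: stop when consecutive states are similar
// enough (convergenceThreshold) or the round cap (maxRounds) is hit.
interface ExecutionConfig {
  maxRounds: number;
  convergenceThreshold: number;
}

async function deliberate(
  config: ExecutionConfig,
  runRound: (state: string) => Promise<string>,
  similarity: (a: string, b: string) => number, // 0..1
): Promise<{ state: string; rounds: number }> {
  let state = "";
  for (let round = 1; round <= config.maxRounds; round++) {
    const next = await runRound(state);
    // Lowering the threshold (e.g. 0.95 -> 0.90) accepts "close enough"
    // states earlier, trading some stability for fewer rounds.
    if (similarity(state, next) >= config.convergenceThreshold) {
      return { state: next, rounds: round };
    }
    state = next;
  }
  return { state, rounds: config.maxRounds };
}
```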
All specialists run in parallel by default. Ensure no serial dependencies:
```yaml
specialists:
  - agentId: "agent-1"   # These run
  - agentId: "agent-2"   # in parallel
  - agentId: "agent-3"   # automatically
```
```yaml
# For speed-critical agents
agents:
  - agentId: "fast-classifier"
    model: "claude-haiku"   # Faster than Sonnet
    maxOutputTokens: 200
```
To lower token cost, tighten each specialist's output budget:

```yaml
specialists:
  - agentId: "semantic-mapper"
    maxOutputTokens: 300   # Down from 500
```
Prefer compact state encodings; terse keys and parallel arrays cost fewer tokens every round:

```json
// ❌ Verbose
{ "findings": [{ "text": "...", "source": "...", "confidence": 0.9 }] }

// ✅ Compact
{ "f": ["..."], "c": [0.9] }
```
Match the model tier to the task's difficulty:

| Task | Model | Cost |
|---|---|---|
| Classification | Haiku | Low |
| Analysis | Sonnet | Medium |
| Reasoning | Opus | High |
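One way to encode that table in a router, sketched below. The task labels mirror the table; the model identifiers and the function itself are illustrative assumptions.

```typescript
// Sketch: route each task class to the cheapest model that handles it,
// mirroring the table above. Model names are illustrative.
type Task = "classification" | "analysis" | "reasoning";

const MODEL_FOR_TASK: Record<Task, string> = {
  classification: "claude-haiku",  // Low cost
  analysis: "claude-sonnet",       // Medium cost
  reasoning: "claude-opus",        // High cost
};

function pickModel(task: Task): string {
  return MODEL_FOR_TASK[task];
}
```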
When accuracy matters more than latency, allow more rounds:

```yaml
execution:
  maxRounds: 7   # Up from 5
```
Tune temperature per agent. Raise it for exploratory work:

```yaml
agents:
  - agentId: "creative-agent"
    temperature: 0.9   # More creative
```
And lower it where outputs must be reproducible:

```yaml
agents:
  - agentId: "rule-analyst"
    temperature: 0.2   # More deterministic
```
Bound shared-state growth with a size check:

```yaml
validator:
  checks:
    - type: "state-size"
      maxBytes: 30000    # Reduce from 50000
      onViolation: "clip"
```
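What `onViolation: "clip"` implies, sketched below. The clipping policy shown (dropping the lowest-confidence findings first) is an assumption for illustration; the real validator's policy may differ.

```typescript
// Sketch: enforce a byte budget by dropping the lowest-confidence
// findings first, keeping the state under maxBytes.
interface Finding { text: string; confidence: number }

function byteSize(findings: Finding[]): number {
  return new TextEncoder().encode(JSON.stringify(findings)).length;
}

function clipToBudget(findings: Finding[], maxBytes: number): Finding[] {
  // Sort a copy by descending confidence, then trim from the tail.
  const kept = [...findings].sort((a, b) => b.confidence - a.confidence);
  while (kept.length > 0 && byteSize(kept) > maxBytes) {
    kept.pop(); // drop the least confident finding
  }
  return kept;
}
```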
Ensure agents produce deltas, not full state replacements:
```yaml
aggregator:
  strategy: "delta-merge"   # Not "full-replace"
```
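The difference between the two strategies, sketched below. The `Delta` shape is an assumption about what agents emit, used here only to show why deltas are cheaper.

```typescript
// Sketch: under delta-merge, each agent emits only the keys it changed,
// and the aggregator folds them into the shared state.
type State = Record<string, unknown>;
type Delta = { set?: Record<string, unknown>; remove?: string[] };

function applyDelta(state: State, delta: Delta): State {
  const next: State = { ...state, ...(delta.set ?? {}) };
  for (const key of delta.remove ?? []) delete next[key];
  return next;
}

// full-replace would instead transmit and store the entire State per
// agent per round, which is what inflates token cost.
function mergeDeltas(base: State, deltas: Delta[]): State {
  return deltas.reduce(applyDelta, base);
}
```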
To find bottlenecks, enable tracing and latency metrics:

```yaml
observability:
  tracing:
    enabled: true
    detailLevel: "full"
  metrics:
    enabled: true
    buckets: [0.1, 0.5, 1, 2, 5, 10, 30]
```
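The `buckets` values read as latency histogram bounds in seconds. A sketch of how an observation lands in one, assuming per-bucket counting with an implicit overflow bucket:

```typescript
// Sketch: count a latency observation in the first bucket whose upper
// bound (seconds) it does not exceed; anything above 30s overflows.
const BUCKETS = [0.1, 0.5, 1, 2, 5, 10, 30];

function bucketIndex(latencySeconds: number): number {
  const i = BUCKETS.findIndex((bound) => latencySeconds <= bound);
  return i === -1 ? BUCKETS.length : i; // last slot = overflow
}

// e.g. a 2.3s workflow lands in the "<= 5" bucket:
// bucketIndex(2.3) === 4
```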
View the collected metrics for a single run:

```bash
just cognitive-perf-dashboard --workflow-id <id>
```
To compare configurations quantitatively, run the benchmark suite against a fixed test set:

```bash
# Run benchmark suite
just cognitive-benchmark \
  --config router-config.yaml \
  --test-cases tests/benchmark-cases.json \
  --iterations 10

# Output:
# Avg Latency: 2.3s
# Avg Tokens: 1500
# Avg Rounds: 3.2
# Accuracy: 94%
```
Every knob is a tradeoff:

| Want | Adjust | Tradeoff |
|---|---|---|
| Lower latency | Fewer rounds, faster models | May reduce accuracy |
| Lower cost | Smaller budgets, cheaper models | May reduce quality |
| Higher accuracy | More rounds, better models | Increases latency and cost |