
AgentFleet: Multi-Agent Coordination

AgentFleet enables multiple AI agents to work together on complex tasks. Think of it like a Kubernetes Deployment - multiple specialized pods working in parallel toward a common goal.

Quick Reference: Coordination Modes

| Mode | Description | Use Case |
|------|-------------|----------|
| peer | All agents parallel + consensus OR aggregation | Code review, multi-expert analysis |
| hierarchical | Manager coordinates workers | Complex orchestration |
| pipeline | Sequential handoff | Data transformation |
| swarm | Self-organizing, load balanced | High-volume parallel |
| tiered | Tier-based parallel | Multi-model RCA |
| deep | Iterative planning + execution | Complex investigations |

Fleet Patterns for Agentic Ops

Choose the right fleet pattern for your use case:

| Use Case | Fleet Pattern | Why |
|----------|---------------|-----|
| Code Review | Peer + Aggregation | Multiple specialists (security, quality) provide complementary findings |
| Root Cause Analysis | Tiered or Deep | Data collectors → Reasoners → Synthesis |
| Incident Response | Hierarchical | Coordinator delegates to specialists |
| Multi-Model Validation | Peer + Consensus | Multiple LLMs validate same hypothesis |
| Data Pipeline | Pipeline | Sequential transformation stages |
| High-Volume Processing | Swarm | Self-organizing load balancing |
| Complex Investigation | Deep | Iterative planning, execution, re-planning |
| Change Approval | Peer + Consensus | Voting/unanimous agreement |
| Parallel Data Collection | Peer + Aggregation | Logs + Metrics + Traces agents |

Quick Decision Guide

Need multiple perspectives on SAME task?
└── Yes: Use Peer + Consensus (pick best)
└── No: Different specialists?
    └── Yes: Use Peer + Aggregation (collect all)
    └── No: Sequential processing?
        └── Yes: Use Pipeline
        └── No: Complex orchestration?
            └── Yes: Use Hierarchical
            └── No: Iterative investigation?
                └── Yes: Use Deep
                └── No: High-volume?
                    └── Yes: Use Swarm
                    └── No: Multi-tier RCA?
                        └── Yes: Use Tiered

Agent-First Architecture

AOF uses a simple, composable model - Agent for 95% of tasks, Fleet when sophisticated reasoning is needed:

| Concept | What It Is | Example |
|---------|------------|---------|
| Agent | Single-purpose specialist | k8s-agent, prometheus-agent |
| Fleet | Team of agents for a purpose | devops-fleet, rca-fleet |
| Flow | Event routing to fleets | Telegram → Fleet → Response |

The key insight: Don't build "super agents" with many tools. Build focused agents, then compose them into fleets.

# ✅ GOOD: Fleet composes single-purpose agents
apiVersion: aof.dev/v1alpha1
kind: AgentFleet
metadata:
  name: devops-fleet
spec:
  agents:
    - ref: library/k8s-agent.yaml    # kubectl, helm only
    - ref: library/docker-agent.yaml # docker only
    - ref: library/git-agent.yaml    # git only

# ❌ BAD: One agent with too many tools
spec:
  tools: [kubectl, docker, git, terraform, aws, helm]
  # Hard to maintain, not reusable, unfocused

Why Use Fleets?

The Single Agent Problem

A single AI agent, even a powerful one like Claude or GPT-4, has limitations:

  • Single perspective: One model, one viewpoint
  • Blind spots: May miss domain-specific issues
  • Hallucination risk: No cross-validation
  • Single point of failure: If it's wrong, you're wrong

The Fleet Solution

Fleets solve these problems through specialization and consensus:

┌─────────────────────────────────────────────┐
│                 SINGLE AGENT                │
│ ┌─────────────────────────────────────────┐ │
│ │ Generalist Agent                        │ │
│ │ - Knows security (somewhat)             │ │
│ │ - Knows performance (somewhat)          │ │
│ │ - Knows style (somewhat)                │ │
│ │ - Single point of failure               │ │
│ └─────────────────────────────────────────┘ │
└─────────────────────────────────────────────┘

┌─────────────────────────────────────────────┐
│                 AGENT FLEET                 │
│  ┌───────────┐ ┌───────────┐ ┌───────────┐  │
│  │  Security │ │Performance│ │  Quality  │  │
│  │ Specialist│ │ Specialist│ │ Specialist│  │
│  │ (focused) │ │ (focused) │ │ (focused) │  │
│  └─────┬─────┘ └─────┬─────┘ └─────┬─────┘  │
│        └─────────────┼─────────────┘        │
│                      ▼                      │
│                ┌───────────┐                │
│                │ CONSENSUS │                │
│                └───────────┘                │
│                      ▼                      │
│          Unified, validated result          │
└─────────────────────────────────────────────┘

When to Use Fleet vs Single Agent

Use a Single Agent When:

  • Task is straightforward (answer questions, write simple code)
  • You need conversational memory across turns
  • Cost is a primary concern
  • Latency is critical (sub-second responses)

Use a Fleet When:

| Scenario | Why Fleet Wins |
|----------|----------------|
| Code Review | 3 specialists catch more than 1 generalist |
| Incident Response | Parallel analysis of logs, metrics, traces |
| Critical Decisions | Consensus reduces hallucination risk |
| Cost Optimization | Cheap models in parallel vs expensive single model |
| Compliance/Audit | "3 agents agreed on this decision" |
| Specialization | Focused instructions → better results |

Cost Comparison

Single Claude Opus:
- 1 call × $15/1M tokens
- 1 perspective
- ~30s response time

Fleet of 3 Gemini Flash:
- 3 parallel calls × $0.075/1M tokens
- 3 perspectives + consensus
- ~5s response time (parallel execution!)

Coordination Modes

AOF supports six coordination modes for different use cases:

1. Peer Mode (Default)

All agents work as equals, executing in parallel. Results can be:

  • Consensus (default): Pick the best result when agents compete on the same task
  • Aggregation: Merge ALL results when agents provide complementary information

Peer Mode with Consensus (Competing Perspectives)

Use consensus when multiple agents tackle the same problem and you want to pick the best answer:

coordination:
  mode: peer
  distribution: round-robin
  consensus:
    algorithm: majority
    min_votes: 2

Best for: Multiple RCA hypotheses, voting scenarios, redundancy

Peer Mode with Aggregation (Complementary Specialists)

Use aggregation when specialists provide different but complementary information:

coordination:
  mode: peer
  distribution: round-robin
  aggregation: merge # Collect ALL agent results

Best for: Code review (security + quality), multi-expert analysis, parallel data collection

Aggregation Options:

  • merge - Collect and combine all agent results into a structured output
  • consensus - Fall back to consensus (default)
  • manager_synthesis - Use a manager to synthesize (requires hierarchical mode)

How it works:

  1. Task submitted to all agents simultaneously
  2. Each agent executes independently (in parallel)
  3. Results collected from all agents
  4. Either consensus picks one OR aggregation merges all
      ┌──────────┐
      │   Task   │
      └────┬─────┘
           │
   ┌───────┼───────┐
   ▼       ▼       ▼
┌──────┐┌──────┐┌──────┐
│Agent1││Agent2││Agent3│   (parallel)
└──┬───┘└──┬───┘└──┬───┘
   │       │       │
   └───────┼───────┘
           ▼
    ┌──────────────┐
    │ Consensus OR │
    │ Aggregation  │
    └──────────────┘

When to Use Which:

| Scenario | Use | Example |
|----------|-----|---------|
| Same task, pick best | consensus | 3 models diagnose same issue |
| Different specialists, need all | aggregation: merge | Security + Quality reviewers |
| Voting/approval | consensus | 3 agents approve deployment |
| Parallel data collection | aggregation: merge | Logs + Metrics + Traces agents |

2. Hierarchical Mode

A manager agent coordinates worker agents, delegating tasks and synthesizing results.

coordination:
  mode: hierarchical

agents:
  - name: coordinator
    role: manager
    spec:
      instructions: |
        You coordinate the team. Analyze tasks and delegate to specialists.
        Synthesize their findings into a final report.

  - name: log-analyzer
    role: worker
    spec:
      instructions: Focus only on log analysis.

  - name: metrics-analyzer
    role: worker
    spec:
      instructions: Focus only on metrics analysis.

Best for: Complex orchestration, incident response, multi-stage workflows

How it works:

  1. Manager agent receives task
  2. Manager analyzes and decides delegation strategy
  3. Workers execute their assigned portions
  4. Manager synthesizes final result
         ┌──────────┐
         │   Task   │
         └────┬─────┘
              │
         ┌────▼─────┐
         │ Manager  │
         └────┬─────┘
              │ delegates
    ┌─────────┼─────────┐
    ▼         ▼         ▼
┌────────┐┌────────┐┌────────┐
│Worker 1││Worker 2││Worker 3│
└───┬────┘└───┬────┘└───┬────┘
    │         │         │
    └─────────┼─────────┘
              ▼ results
         ┌──────────┐
         │ Manager  │
         │synthesize│
         └──────────┘

3. Pipeline Mode

Agents execute sequentially, with each agent's output becoming the next agent's input.

coordination:
  mode: pipeline

agents:
  - name: data-collector
    spec:
      instructions: Collect and format raw data.

  - name: analyzer
    spec:
      instructions: Analyze the collected data.

  - name: reporter
    spec:
      instructions: Generate a human-readable report.

Best for: Data transformation, multi-stage processing, ETL workflows

How it works:

  1. First agent processes original input
  2. Output passed to next agent as input
  3. Continues through all agents sequentially
  4. Final agent produces the result
┌──────────┐    ┌──────────┐    ┌──────────┐
│ Agent 1  │───▶│ Agent 2  │───▶│ Agent 3  │
│(collect) │    │(analyze) │    │(report)  │
└──────────┘    └──────────┘    └──────────┘
  input ──▶ intermediate ──▶ output

4. Swarm Mode

Self-organizing, dynamic coordination with intelligent load balancing.

coordination:
  mode: swarm
  distribution: least-loaded

Best for: High-volume task processing, load-balanced workloads

How it works:

  1. Task arrives
  2. System selects least-loaded idle agent
  3. Agent processes task
  4. Metrics tracked for future balancing
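
A complete swarm fleet can be as small as one replicated worker. The sketch below is illustrative only (the fleet name, agent name, and instructions are invented for this example); it combines the mode: swarm block above with the replicas field covered under Best Practices:

apiVersion: aof.dev/v1
kind: AgentFleet
metadata:
  name: ticket-triage-swarm # hypothetical
spec:
  agents:
    - name: triage-worker # hypothetical
      replicas: 5 # 5 instances share the load
      spec:
        model: google:gemini-2.5-flash
        instructions: Classify each support ticket by severity and owning team.

  coordination:
    mode: swarm
    distribution: least-loaded # route each task to the least-busy replica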

5. Tiered Mode (Multi-Model RCA)

Tier-based parallel execution with consensus at each tier. Designed for multi-model scenarios like Root Cause Analysis where cheap data collectors feed reasoning models.

coordination:
  mode: tiered
  consensus:
    algorithm: weighted
    min_confidence: 0.6
  tiered:
    pass_all_results: true
    final_aggregation: manager_synthesis

agents:
  # Tier 1: Data Collectors (cheap models, parallel)
  - name: loki-collector
    tier: 1
    spec:
      model: google:gemini-2.0-flash # ~$0.075/1M tokens

  - name: prometheus-collector
    tier: 1
    spec:
      model: google:gemini-2.0-flash

  # Tier 2: Reasoning Models (multi-model consensus)
  - name: claude-analyzer
    tier: 2
    weight: 1.5 # Higher weight for Claude
    spec:
      model: anthropic:claude-sonnet-4-20250514

  - name: gemini-analyzer
    tier: 2
    weight: 1.0
    spec:
      model: google:gemini-2.5-pro

  # Tier 3: Coordinator (synthesis)
  - name: rca-coordinator
    tier: 3
    role: manager
    spec:
      model: anthropic:claude-sonnet-4-20250514

Best for: Multi-model RCA, complex analysis workflows, cost-optimized multi-perspective analysis

How it works:

  1. Tier 1 agents execute in parallel (cheap data collection)
  2. Results from Tier 1 are passed to Tier 2 agents
  3. Tier 2 agents analyze with different LLMs (multi-model consensus)
  4. Tier 3 manager synthesizes final result
┌─────────────────────────────────────────────────┐
│                TIERED EXECUTION                 │
│                                                 │
│  TIER 1: Data Collectors (parallel)             │
│  ┌────────┐ ┌────────┐ ┌────────┐ ┌────────┐    │
│  │  Loki  │ │  Prom  │ │  K8s   │ │  Git   │    │
│  │  Logs  │ │Metrics │ │ State  │ │Changes │    │
│  └───┬────┘ └───┬────┘ └───┬────┘ └───┬────┘    │
│      └──────────┼──────────┼──────────┘         │
│                 ▼ (collected data)              │
│                                                 │
│  TIER 2: Reasoning Models (multi-model)         │
│  ┌────────────┐ ┌────────────┐ ┌────────────┐   │
│  │   Claude   │ │   Gemini   │ │   GPT-4    │   │
│  │ (wt: 1.5)  │ │ (wt: 1.0)  │ │ (wt: 1.0)  │   │
│  └─────┬──────┘ └─────┬──────┘ └─────┬──────┘   │
│        └──────────────┼──────────────┘          │
│                       ▼ (weighted consensus)    │
│                                                 │
│  TIER 3: Coordinator                            │
│  ┌────────────────────────────────────────┐     │
│  │            RCA Coordinator             │     │
│  │       (synthesize final report)        │     │
│  └────────────────────────────────────────┘     │
│                       ▼                         │
│                Final RCA Report                 │
└─────────────────────────────────────────────────┘

Tiered Configuration Options:

tiered:
  # Pass all results to next tier (vs just consensus winner)
  pass_all_results: true

  # Final tier aggregation strategy
  final_aggregation: manager_synthesis # or: consensus, merge

  # Per-tier consensus configuration
  tier_consensus:
    "1":
      algorithm: first_wins # Fast data collection
    "2":
      algorithm: weighted   # Multi-model consensus
      min_votes: 2

6. Deep Mode (Iterative Planning + Execution)

Deep mode adds an agentic loop pattern: planning → execution → re-planning. Unlike the other modes, which execute once and return, deep mode iterates until the goal is achieved or the iteration limit is reached.

coordination:
  mode: deep
  deep:
    max_iterations: 10
    planning: true
    memory: true

Best for: Complex investigations, root cause analysis, multi-step reasoning tasks

What Deep Mode Adds:

  1. Planning - LLM generates investigation steps before execution
  2. Iteration - Execute steps until goal achieved (not just once)
  3. Re-planning - Adjust plan based on findings mid-execution
  4. Memory - Persist findings across iterations for context

How it works:

User: /fleet rca "why is API returning 500 errors"


┌─────────────────────────────────────────────────────────────┐
│ DEEP MODE = Agentic Loop (like Claude Code)                 │
│                                                             │
│ 1. PLAN: Generate investigation steps                       │
│    → Check pods, fetch logs, query metrics, correlate       │
│                                                             │
│ 2. EXECUTE: Run each step using appropriate tools           │
│    → kubectl get pods → kubectl logs → promql query         │
│                                                             │
│ 3. ITERATE: Continue until goal achieved                    │
│    → Found OOM? Check memory limits. Still unclear? Dig.    │
│                                                             │
│ 4. SYNTHESIZE: Produce final answer with evidence           │
│    → Root cause + evidence + recommendations                │
└─────────────────────────────────────────────────────────────┘

Example Output:

🔍 RCA Fleet - Investigating...

📋 Plan:
1. Check pod status
2. Fetch error logs
3. Query metrics
4. Correlate findings

⏳ Step 1/4: Checking pods...
→ 2/3 pods in CrashLoopBackOff

⏳ Step 2/4: Fetching logs...
→ OutOfMemoryError at 14:32

⏳ Step 3/4: Querying metrics...
→ Memory at 98% before crash

⏳ Step 4/4: Correlating...

✅ Root Cause: Memory leak in API service

Evidence:
• Pods crashing with OOMKilled
• Memory at 98% before crash
• 3x traffic spike at 14:30

Recommendations:
1. Increase memory limits (immediate)
2. Profile for memory leak (long-term)

Deep Configuration Options:

coordination:
  mode: deep
  deep:
    # Safety limit - stop after N iterations
    max_iterations: 10

    # Enable planning phase (LLM generates steps)
    planning: true

    # Persist findings in memory across iterations
    memory: true

    # Optional: Override model for planning
    planner_model: anthropic:claude-sonnet-4-20250514

    # Optional: Custom planning prompt
    planner_prompt: |
      You are planning an investigation. Generate steps to find the root cause.

    # Optional: Custom synthesis prompt
    synthesizer_prompt: |
      Synthesize findings into a clear report with evidence.

When to Use Deep vs Other Modes:

| Scenario | Recommended Mode |
|----------|------------------|
| Multiple perspectives on same task | Peer + Consensus |
| Manager delegates to workers | Hierarchical |
| Sequential data transformation | Pipeline |
| High-volume parallel processing | Swarm |
| Multi-model analysis (cheap → expensive) | Tiered |
| Complex investigation needing iteration | Deep |

Consensus Algorithms

When using Peer or Tiered modes, consensus determines how agent results are aggregated.

Supported Algorithms

| Algorithm | Rule | Use Case |
|-----------|------|----------|
| majority | >50% must agree | Code review, incident triage |
| unanimous | 100% must agree | Critical deployments, security |
| weighted | Votes weighted by role | Senior > Junior reviewers |
| first_wins | First response wins | Time-critical scenarios |
| human_review | Always flag for human | High-stakes decisions |

Algorithm Details

Majority

More than 50% of agents must agree on the result. Fast and tolerant of outliers.

Unanimous

100% of agents must agree. Use for critical decisions where false positives are costly.
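
For example, a production deployment gate might require every agent to agree before proceeding (a minimal sketch reusing the consensus schema shown under Configuration below):

coordination:
  mode: peer
  consensus:
    algorithm: unanimous # 100% agreement required
    timeout_ms: 60000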

Weighted

Each agent has a configurable weight. Senior reviewers can count more than juniors.

agents:
  - name: senior-reviewer
    weight: 2.0 # Counts as 2 votes

  - name: junior-reviewer
    weight: 1.0 # Counts as 1 vote

consensus:
  algorithm: weighted
  weights:
    senior-reviewer: 2.0
    junior-reviewer: 1.0

FirstWins

First agent to respond wins. Use when speed matters more than consensus.
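
A minimal configuration (same consensus schema as the other algorithms):

consensus:
  algorithm: first_wins # return the first response received
  allow_partial: true   # accept the result even if other agents fail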

HumanReview

Always flags for human operator decision. Use for high-stakes scenarios.

consensus:
  algorithm: human_review
  min_confidence: 0.9 # If confidence below this, definitely needs review

Configuration

coordination:
  mode: peer
  consensus:
    algorithm: majority    # majority, unanimous, weighted, first_wins, human_review
    min_votes: 2           # Minimum responses required
    timeout_ms: 60000      # Max wait time (60 seconds)
    allow_partial: true    # Accept result if some agents fail
    min_confidence: 0.7    # Below this, flag for human review
    weights:               # Per-agent weights (for weighted algorithm)
      senior-reviewer: 2.0
      junior-reviewer: 1.0

Example: Code Review Consensus

Security Agent:   "CRITICAL: SQL injection on line 42"
Performance Agent: "No SQL issues, but N+1 query problem"
Quality Agent: "SQL injection on line 42, missing tests"

Consensus (majority):
- SQL injection CONFIRMED (2/3 agree)
- N+1 query flagged (1/3, noted but not critical)
- Missing tests flagged (1/3, noted)

Task Distribution Strategies

How tasks are assigned to agents (most relevant for Hierarchical and Swarm modes, though any mode that sets a distribution can use them):

| Strategy | Description | Best For |
|----------|-------------|----------|
| round-robin | Cycle through agents | Even distribution |
| least-loaded | Agent with fewest tasks | Load balancing |
| random | Random selection | Simple scenarios |
| skill-based | Match agent skills to task | Specialized work |
| sticky | Same task type → same agent | Caching benefits |

coordination:
  mode: hierarchical
  distribution: least-loaded # or round-robin, random, skill-based, sticky

Agent Roles

Agents can have defined roles that affect their behavior in the fleet:

| Role | Description | Used In |
|------|-------------|---------|
| worker | Regular task executor | All modes |
| manager | Coordinator/orchestrator | Hierarchical mode |
| specialist | Domain expert | Any mode |
| validator | Review/validation | Quality gates |

agents:
  - name: security-expert
    role: specialist
    spec:
      instructions: You are a security specialist...

  - name: team-lead
    role: manager
    spec:
      instructions: You coordinate the team...

Communication Patterns

Agents can communicate through shared memory and messaging:

Shared Memory Types

| Type | Description | Use Case |
|------|-------------|----------|
| in_memory | RAM-based | Single process, testing |
| redis | Distributed cache | Multi-instance, real-time |
| sqlite | Local database | Persistent, single node |
| postgres | Distributed DB | Production, multi-node |

Message Patterns

| Pattern | Description | Use Case |
|---------|-------------|----------|
| direct | Point-to-point | Agent-to-agent |
| broadcast | All agents receive | Announcements |
| pub_sub | Topic-based | Event-driven |
| request_reply | Request-response | Queries |

communication:
  pattern: broadcast
  broadcast:
    channel: team-updates
    include_sender: false

shared:
  memory:
    type: redis
    url: redis://localhost:6379
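
For production multi-node setups you would swap the memory type. This is a sketch only: whether postgres takes the same url-style key as redis is an assumption based on the redis example, so check the schema for your version:

shared:
  memory:
    type: postgres
    url: postgres://aof:aof@db:5432/fleet # assumed url-style config, mirroring the redis example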

Real-World Examples

Example 1: Code Review Fleet (with Aggregation)

Two specialists review code in parallel. Uses aggregation to collect ALL findings (not consensus):

apiVersion: aof.dev/v1
kind: AgentFleet
metadata:
  name: code-review-fleet
spec:
  agents:
    - name: security-reviewer
      role: specialist
      spec:
        model: google:gemini-2.5-flash
        instructions: |
          Focus on security: SQL injection, XSS, authentication,
          secrets in code, dependency vulnerabilities.
          Format: ## Security Review with Critical/High/Medium sections.

    - name: quality-reviewer
      role: specialist
      spec:
        model: google:gemini-2.5-flash
        instructions: |
          Focus on quality: SOLID principles, error handling,
          code structure, naming conventions.
          Format: ## Quality Review with Issues/Suggestions/Score.

  coordination:
    mode: peer
    distribution: round-robin
    aggregation: merge # Collect ALL findings from both specialists

Run it:

aofctl run fleet code-review-fleet.yaml \
  --input "Review: function login(user, pass) { const query = 'SELECT * FROM users WHERE name=' + user; return db.query(query); }"

Output (both reviews merged):

╔═══════════════════════════════════════════════════════════╗
║                  FLEET RESULTS (2 agents)                  ║
╚═══════════════════════════════════════════════════════════╝

┌─ security-reviewer
│ ## Security Review
│ ### Critical Issues
│ * SQL Injection: Direct concatenation of user input...

┌─ quality-reviewer
│ ## Quality Review
│ ### Issues Found
│ 1. Unused parameter 'pass'
│ 2. No error handling
│ ### Score: 2/10

Example 1b: Code Review Fleet (with Consensus)

Same task but with consensus - picks the most agreed-upon findings:

coordination:
  mode: peer
  distribution: round-robin
  consensus:
    algorithm: majority
    min_votes: 2

Use consensus when: Multiple agents analyze the same aspect and you want to validate findings (e.g., 3 security reviewers must agree on vulnerabilities).

Example 2: Incident Response Fleet

Hierarchical coordination with a manager orchestrating specialists:

apiVersion: aof.dev/v1
kind: AgentFleet
metadata:
  name: incident-response-team
spec:
  agents:
    - name: incident-commander
      role: manager
      spec:
        model: google:gemini-2.5-flash
        instructions: |
          You are the Incident Commander. When an incident arrives:
          1. Assess severity and impact
          2. Delegate investigation to specialists
          3. Coordinate response actions
          4. Synthesize findings into an incident report
        tools:
          - shell

    - name: log-investigator
      role: specialist
      spec:
        model: google:gemini-2.5-flash
        instructions: |
          You analyze logs. Search for errors, exceptions, and anomalies.
          Report timeline of events and root cause indicators.
        tools:
          - shell
          - read_file

    - name: metrics-analyst
      role: specialist
      spec:
        model: google:gemini-2.5-flash
        instructions: |
          You analyze metrics and dashboards. Look for:
          - Resource exhaustion (CPU, memory, disk)
          - Traffic anomalies
          - Error rate spikes
        tools:
          - shell

  coordination:
    mode: hierarchical
    distribution: skill-based
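
Run it the same way as the code review fleet (the incident description below is illustrative):

aofctl run fleet incident-response-team.yaml \
  --input "API latency spiked to 5s at 14:30, error rate at 12%"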

Example 3: Data Pipeline Fleet

Sequential processing with each stage building on the previous:

apiVersion: aof.dev/v1
kind: AgentFleet
metadata:
  name: data-pipeline
spec:
  agents:
    - name: collector
      spec:
        model: google:gemini-2.5-flash
        instructions: |
          Collect and normalize raw data from the input source.
          Output clean, structured JSON.

    - name: analyzer
      spec:
        model: google:gemini-2.5-flash
        instructions: |
          Analyze the structured data. Identify patterns, anomalies,
          and key insights. Output analysis summary.

    - name: reporter
      spec:
        model: google:gemini-2.5-flash
        instructions: |
          Generate a human-readable report from the analysis.
          Include executive summary, key findings, and recommendations.

  coordination:
    mode: pipeline
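
Feed it raw data as input (same CLI flags as the earlier examples; the file name is illustrative):

aofctl run fleet data-pipeline.yaml \
  --input "$(cat raw-events.json)"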

Fleet Metrics

Fleets automatically track execution metrics:

metrics:
  total_tasks: 150
  completed_tasks: 145
  failed_tasks: 5
  avg_task_duration_ms: 2340
  active_agents: 3
  total_agents: 3
  consensus_rounds: 145 # For peer mode

Access via CLI:

aofctl describe fleet code-review-team

Token Usage Tracking

Fleet execution automatically tracks token usage for each agent and provides aggregated totals. This is useful for cost monitoring and optimization.

Token Usage in Output

After fleet execution, token usage is displayed in the completion summary:

╭─────────────────────────────────────────────────────────────╮
│ 🚀 FLEET EXECUTION COMPLETE                                 │
├─────────────────────────────────────────────────────────────┤
│ Fleet: code-review-fleet                                    │
│ Duration: 5.23s                                             │
├─────────────────────────────────────────────────────────────┤
│ Token Usage:                                                │
│   Input:  1,234 tokens                                      │
│   Output:   567 tokens                                      │
│   Total:  1,801 tokens                                      │
╰─────────────────────────────────────────────────────────────╯

Token Usage in JSON Output

When using --output json, token usage is included in the result:

{
  "results": [...],
  "agent_count": 2,
  "usage": {
    "input_tokens": 1234,
    "output_tokens": 567,
    "total_tokens": 1801
  }
}

Per-Agent Token Tracking

Each agent's token usage is tracked individually and included in the result when using aggregation: merge:

{
  "results": [
    {
      "agent": "security-reviewer",
      "response": "...",
      "input_tokens": 612,
      "output_tokens": 284
    },
    {
      "agent": "quality-reviewer",
      "response": "...",
      "input_tokens": 622,
      "output_tokens": 283
    }
  ],
  "usage": {
    "input_tokens": 1234,
    "output_tokens": 567,
    "total_tokens": 1801
  }
}

Cost Estimation

Use token counts to estimate costs based on your model's pricing:

# Gemini 2.5 Flash pricing (~$0.075/1M input, ~$0.30/1M output)
# 1,234 input + 567 output ≈ $0.00026

# Claude Sonnet pricing (~$3/1M input, ~$15/1M output)
# Same tokens ≈ $0.012
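
If you script this, the usage block from --output json (shown above) pipes cleanly into jq. A sketch using the Gemini 2.5 Flash rates from the comment above:

aofctl run fleet code-review-fleet.yaml --input "..." --output json \
  | jq '.usage | (.input_tokens * 0.075 + .output_tokens * 0.30) / 1e6'
# → 0.00026265 (estimated USD for this run)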

Best Practices

1. Choose the Right Mode

| Situation | Recommended Mode |
|-----------|------------------|
| Need multiple perspectives | Peer + Consensus |
| Complex orchestration | Hierarchical |
| Sequential processing | Pipeline |
| High-volume, uniform tasks | Swarm |

2. Optimize Agent Instructions

Each agent should have focused, specific instructions:

# ❌ Bad: Too generic
instructions: Review the code for issues.

# ✅ Good: Focused and specific
instructions: |
  You are a SECURITY specialist. Focus ONLY on:
  - SQL injection vulnerabilities
  - XSS attack vectors
  - Authentication/authorization flaws
  - Secrets or credentials in code
  - Insecure dependencies

  Ignore performance and style issues - other agents handle those.

3. Use Appropriate Consensus

| Scenario | Algorithm |
|----------|-----------|
| General review | majority |
| Security-critical | unanimous |
| Mixed seniority | weighted |
| Time-critical | first_wins |

4. Set Reasonable Timeouts

consensus:
  timeout_ms: 60000   # 60 seconds for most tasks
  allow_partial: true # Don't fail if one agent is slow

5. Use Replicas for Scaling

agents:
  - name: worker
    replicas: 3 # 3 instances for load balancing

Summary

| Aspect | Single Agent | Fleet |
|--------|--------------|-------|
| Perspectives | 1 | Multiple |
| Reliability | Single point of failure | Consensus-validated |
| Cost | Can be expensive | Often cheaper (parallel cheap models) |
| Latency | Sequential | Parallel execution |
| Specialization | Generalist | Deep expertise per agent |
| Audit Trail | Limited | Full consensus history |

Rule of thumb: Use fleets for anything critical, multi-perspective, or high-volume. Use single agents for simple, conversational, or cost-sensitive tasks.

