
Quickstart: Test Multi-Model RCA in 5 Minutes

Test AOF's multi-model consensus and tiered execution with either mock data or real infrastructure.

Two Options

| Option | Requirements | Best For |
|--------|--------------|----------|
| Mock Fleet | API key only | Learning, demos, testing |
| Real Fleet | Prometheus + Loki + K8s | Production RCA |

Prerequisites

  1. AOF installed:

    # Check installation
    aofctl version

    # If not installed:
    curl -sSL https://docs.aof.sh/install.sh | bash
  2. At least ONE API key (Gemini is cheapest):

    # Option 1: Google (cheapest - recommended for testing)
    export GOOGLE_API_KEY=your-key-here

    # Option 2: Anthropic
    export ANTHROPIC_API_KEY=your-key-here

    # Option 3: OpenAI
    export OPENAI_API_KEY=your-key-here
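Before running the demo, you can confirm a key is actually exported. A minimal preflight sketch (the variable names are the ones listed above; the function name is illustrative):

```shell
# Hedged sketch: check that at least one of the provider keys from this
# quickstart is set in the current shell before launching a fleet.
has_api_key() {
  [ -n "${GOOGLE_API_KEY:-}" ] || [ -n "${ANTHROPIC_API_KEY:-}" ] || [ -n "${OPENAI_API_KEY:-}" ]
}

if has_api_key; then
  echo "API key found - ready to run the demo"
else
  echo "No API key set - export one of the keys listed above" >&2
fi
```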

Option A: Mock Fleet (No Infrastructure)

Run the Demo

Step 1: Clone the Examples (if needed)

# If you don't have the examples
git clone https://github.com/agenticdevops/aof.git
cd aof

Step 2: Run the Mock Fleet

# Run mock fleet (simulates real data)
aofctl run fleet examples/fleets/mock/multi-model-rca-mock.yaml \
--input "Investigate: API returning 500 errors since the last deployment"

Step 3: Watch the Tiered Execution

You'll see output like:

[FLEET] Initializing multi-model-rca-demo with 6 agents
[FLEET] Mode: tiered (3 tiers)

[TIER 1] Starting 3 data simulators in parallel...
[AGENT] log-simulator: Generating simulated log data
[AGENT] metrics-simulator: Generating simulated metrics
[AGENT] changes-simulator: Generating simulated changes
[TIER 1] Complete (consensus: first_wins)

[TIER 2] Starting 2 reasoning models in parallel...
[AGENT] reasoning-model-1: Analyzing with 5-Whys approach
[AGENT] reasoning-model-2: Analyzing with correlation approach
[TIER 2] Complete (consensus: weighted, confidence: 0.85)

[TIER 3] Starting coordinator synthesis...
[AGENT] rca-coordinator: Generating final report
[TIER 3] Complete

[FLEET] Execution complete

Step 4: Review the Output

The final output is a complete RCA report:

# Root Cause Analysis Report

## Executive Summary
The API 500 errors began at 14:02 UTC following a deployment at 13:58.
Root cause: Database connection pool exhaustion due to increased memory limits.

## Root Cause (Consensus)
**Category**: config
**Description**: Memory limit change caused connection pool starvation
**Confidence**: HIGH (2/2 models agreed)

### Evidence
1. Deployment at 13:58 changed memory limits
2. Connection pool errors started at 14:02 (4 minutes later)
3. Metrics show connection pool exhaustion

## Model Agreement Matrix
| Finding | Model 1 | Model 2 | Consensus |
|---------|---------|---------|-----------|
| Config change is root cause | ✓ | ✓ | HIGH |
| Memory pressure contributed | ✓ | ✓ | HIGH |
| Rollback recommended | ✓ | ✓ | HIGH |

## Immediate Actions
- [ ] Rollback deployment abc123 (Priority: HIGH)
- [ ] Increase connection pool size (Priority: MEDIUM)

---
*Generated by AOF Multi-Model RCA Demo*

Try Different Scenarios (Mock)

Scenario 1: High Latency

aofctl run fleet examples/fleets/mock/multi-model-rca-mock.yaml \
--input "Investigate: P99 latency increased from 50ms to 2s on user service"

Scenario 2: Memory Issues

aofctl run fleet examples/fleets/mock/multi-model-rca-mock.yaml \
--input "Investigate: Pods crashing with OOMKilled in production"

Scenario 3: Database Problems

aofctl run fleet examples/fleets/mock/multi-model-rca-mock.yaml \
--input "Investigate: Database connection timeouts affecting checkout flow"

Scenario 4: Cascading Failure

aofctl run fleet examples/fleets/mock/multi-model-rca-mock.yaml \
--input "Investigate: Multiple services returning errors after auth service upgrade"

Option B: Real Fleet (With Infrastructure)

Use this when you have actual Prometheus, Loki, and Kubernetes.

Prerequisites

Verify your infrastructure:

# Check Kubernetes
kubectl get nodes

# Check Prometheus (default: NodePort 30400)
curl -s "http://localhost:30400/api/v1/query?query=up" | jq '.status'

# Check Loki (default: NodePort 30700)
curl -s "http://localhost:30700/loki/api/v1/labels" | jq '.status'

Run the Real Fleet

# Run against real infrastructure
aofctl run fleet examples/fleets/real/multi-model-rca-real.yaml \
--input "Investigate: High memory usage in monitoring namespace"

Real Scenarios to Try

Check Memory Issues

aofctl run fleet examples/fleets/real/multi-model-rca-real.yaml \
--input "Investigate: Which pods are using the most memory?"

Check Pod Health

aofctl run fleet examples/fleets/real/multi-model-rca-real.yaml \
--input "Investigate: Are there any unhealthy pods or recent restarts?"

Check Recent Changes

aofctl run fleet examples/fleets/real/multi-model-rca-real.yaml \
--input "Investigate: What deployments or config changes happened in the last hour?"

Full RCA

aofctl run fleet examples/fleets/real/multi-model-rca-real.yaml \
--input "Investigate: Prometheus pod keeps restarting - perform full RCA"

Custom Endpoints

If your Prometheus/Loki are on different ports:

# Edit the fleet file or set environment variables
PROMETHEUS_URL=http://your-prometheus:9090 \
LOKI_URL=http://your-loki:3100 \
aofctl run fleet examples/fleets/real/multi-model-rca-real.yaml \
--input "Investigate: Issue description"

Understanding the Output

Tier Execution

| Tier | Purpose | Models Used | Consensus |
|------|---------|-------------|-----------|
| 1 | Data Collection | Gemini Flash (cheap) | first_wins |
| 2 | Reasoning | Multiple models | weighted |
| 3 | Synthesis | Single coordinator | first_wins |
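The tier layout maps onto the fleet YAML. A minimal sketch stitched together from the fragments shown in this guide (the `mode` field name, nesting, and the Tier 1 model string are assumptions; see the AgentFleet YAML Spec for the authoritative schema):

```yaml
# Illustrative sketch only - not a complete, validated fleet file
coordination:
  mode: tiered                # assumed field name; the run log reports "Mode: tiered"
  consensus:
    algorithm: weighted
    timeout_ms: 300000

agents:
  - name: log-simulator       # Tier 1: cheap data collection
    tier: 1
    spec:
      model: google:gemini-flash   # placeholder; use the model id from the example fleet
  - name: reasoning-model-1   # Tier 2: weighted reasoning
    tier: 2
    weight: 2.0
    spec:
      model: anthropic:claude-sonnet-4-20250514
```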

Confidence Scoring

  • HIGH (>0.8): All models agree strongly
  • MEDIUM (0.5-0.8): Majority agrees, some uncertainty
  • LOW (<0.5): Models disagree, needs human review
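The score-to-label mapping in the bullets above can be expressed as a one-liner. Illustrative only (the thresholds come from this page; the function name is made up, and the real scoring happens inside AOF):

```shell
# Map a consensus confidence score to the labels above.
# >0.8 = HIGH, 0.5-0.8 = MEDIUM, <0.5 = LOW
confidence_label() {
  awk -v s="$1" 'BEGIN {
    if (s > 0.8)       print "HIGH"
    else if (s >= 0.5) print "MEDIUM"
    else               print "LOW"
  }'
}

confidence_label 0.85   # HIGH (the Tier 2 consensus confidence from the mock run)
```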

Cost Breakdown

| Tier | Agents | Est. Tokens | Est. Cost |
|------|--------|-------------|-----------|
| 1 | 3 | ~15K | ~$0.01 |
| 2 | 2 | ~20K | ~$0.05 |
| 3 | 1 | ~10K | ~$0.02 |
| **Total** | 6 | ~45K | ~$0.08 |
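The totals row is just the sum of the per-tier estimates, which you can sanity-check with a one-liner:

```shell
# Arithmetic check of the totals row (values copied from the table)
awk 'BEGIN { printf "tokens=%dK cost=$%.2f\n", 15 + 20 + 10, 0.01 + 0.05 + 0.02 }'
# tokens=45K cost=$0.08
```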

Customize the Demo

Use Different Models

Edit the spec.model field in each agent:

# Use Claude instead of Gemini
spec:
  model: anthropic:claude-sonnet-4-20250514

Adjust Weights

Give more weight to models you trust more:

- name: reasoning-model-1
  tier: 2
  weight: 2.0  # Counts as 2 votes

Change Consensus Algorithm

coordination:
  consensus:
    algorithm: unanimous  # Require all models to agree

Next Steps

  1. Full Tutorial: Multi-Model RCA with Tiered Execution
  2. Architecture Guide: Multi-Model Consensus Architecture
  3. Production Setup: Multi-Model RCA Fleet (with real observability)
  4. Fleet Reference: AgentFleet YAML Spec

Troubleshooting

"No API key found"

# Check your environment
echo $GOOGLE_API_KEY
echo $ANTHROPIC_API_KEY
echo $OPENAI_API_KEY

# At least one must be set
export GOOGLE_API_KEY=your-key-here

"Model not available"

The demo uses Gemini by default. If you only have OpenAI:

# Edit the mock fleet file to use GPT-4o instead
sed -i 's/google:gemini/openai:gpt-4o/g' examples/fleets/mock/multi-model-rca-mock.yaml

"Timeout"

Increase the timeout:

coordination:
  consensus:
    timeout_ms: 300000  # 5 minutes

That's it! You've tested multi-model consensus and tiered execution. The same architecture works with real observability data in production.