Building a Discord Ops Bot
Build a complete DevOps bot for Discord with slash commands, embeds, buttons, and multi-agent coordination.
What We're Building
A Discord bot that:
- Responds to slash commands for DevOps operations
- Shows rich status embeds with color-coded information
- Uses buttons for interactive workflows
- Coordinates multi-agent tasks
- Supports role-based access control
Prerequisites
- Completed Discord Quickstart
- Kubernetes cluster access (local minikube or remote)
- Basic familiarity with Discord bot development
Architecture Overview
Discord Server
      ↓
Discord Gateway → AOF Webhook
                      ↓
                Ed25519 Verify
                      ↓
               Parse Interaction
                      ↓
             ┌────────┴────────┐
             ↓                 ↓
       Slash Command       Component
             ↓                 ↓
         Route to            Handle
        Agent/Fleet          Action
             ↓                 ↓
      Embed + Buttons ← Format Response
             ↓
       Discord Reply
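The "Parse Interaction" branch in the diagram can be sketched as a small classifier. This is a minimal illustration, not AOF's actual code: `classify_interaction` is a hypothetical helper, and it assumes the payload has already passed Ed25519 verification. The numeric interaction types (1, 2, 3) come from the Discord API, and the `component:` prefix follows the convention used later in this guide.

```python
# Interaction type values from the Discord API.
PING = 1
APPLICATION_COMMAND = 2
MESSAGE_COMPONENT = 3


def classify_interaction(payload: dict) -> tuple:
    """Map a verified interaction payload to (kind, routing_key).

    Hypothetical helper for illustration only.
    """
    interaction_type = payload.get("type")
    data = payload.get("data") or {}
    if interaction_type == PING:
        # Discord sends PING to validate the webhook endpoint.
        return ("ping", "")
    if interaction_type == APPLICATION_COMMAND:
        # Slash commands are routed by name, e.g. "/status".
        return ("command", "/" + data["name"])
    if interaction_type == MESSAGE_COMPONENT:
        # Buttons and select menus are routed by their custom_id.
        return ("component", "component:" + data["custom_id"])
    raise ValueError(f"unsupported interaction type: {interaction_type!r}")


if __name__ == "__main__":
    print(classify_interaction({"type": 2, "data": {"name": "status"}}))
    print(classify_interaction({"type": 3, "data": {"custom_id": "refresh_status"}}))
```

The left branch of the diagram corresponds to the `"command"` result and the right branch to `"component"`.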
Step 1: Enhanced Trigger Configuration
Create a comprehensive trigger with commands and role restrictions:
# triggers/discord-ops.yaml
apiVersion: aof.dev/v1
kind: Trigger
metadata:
  name: discord-ops
  labels:
    platform: discord
    environment: production
spec:
  type: Discord
  config:
    bot_token: ${DISCORD_BOT_TOKEN}
    application_id: ${DISCORD_APPLICATION_ID}
    public_key: ${DISCORD_PUBLIC_KEY}
    # Restrict to specific servers
    guild_ids:
      - ${DISCORD_GUILD_ID}
    # Role restrictions
    allowed_roles:
      - ${DISCORD_ADMIN_ROLE}
      - ${DISCORD_DEVOPS_ROLE}
  # Command definitions
  commands:
    /help:
      agent: devops
      description: "Show available commands"
    /status:
      agent: k8s-ops
      description: "Cluster status dashboard"
    /pods:
      agent: k8s-ops
      description: "List pods in namespace"
    /logs:
      agent: k8s-ops
      description: "View pod logs"
    /deploy:
      agent: deployer
      description: "Deploy application"
    /scale:
      agent: k8s-ops
      description: "Scale deployment"
    /diagnose:
      fleet: rca-fleet
      description: "Root cause analysis"
    /incident:
      flow: incident-flow
      description: "Start incident response"
  default_agent: devops
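To make the routing in this trigger concrete, here is a rough sketch of how a command might be resolved to its target. `resolve_target` and the in-memory `COMMANDS` table are hypothetical and only mirror the YAML above; the fallback to `default_agent` for unknown commands is an assumption about how AOF behaves, not something this guide confirms.

```python
# A subset of the command table from the trigger above.
COMMANDS = {
    "/help": {"agent": "devops"},
    "/status": {"agent": "k8s-ops"},
    "/deploy": {"agent": "deployer"},
    "/diagnose": {"fleet": "rca-fleet"},
    "/incident": {"flow": "incident-flow"},
}
DEFAULT_AGENT = "devops"


def resolve_target(command: str) -> tuple:
    """Return (kind, name) where kind is 'agent', 'fleet', or 'flow'.

    Hypothetical helper: unknown commands fall back to the default agent.
    """
    entry = COMMANDS.get(command, {})
    for kind in ("agent", "fleet", "flow"):
        if kind in entry:
            return (kind, entry[kind])
    return ("agent", DEFAULT_AGENT)


if __name__ == "__main__":
    for cmd in ("/status", "/diagnose", "/incident", "/unknown"):
        print(cmd, "->", resolve_target(cmd))
```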
Step 2: Specialized Agents
DevOps Agent
# agents/devops.yaml
apiVersion: aof.dev/v1alpha1
kind: Agent
metadata:
  name: devops
  labels:
    platform: discord
spec:
  model: google:gemini-2.5-flash
  temperature: 0
  max_tokens: 2048
  description: "General DevOps assistant for Discord"
  tools:
    - kubectl
    - docker
    - helm
    - aws
  system_prompt: |
    You are a DevOps assistant in Discord.

    ## Response Format
    Use Discord embed formatting:
    - Title with emoji prefix
    - Color-coded status (green=success, yellow=warning, red=error)
    - Inline fields for metrics
    - Footer with timestamp

    ## Available Commands
    - /status - Cluster dashboard
    - /pods [namespace] - List pods
    - /logs <pod> - View logs
    - /deploy <app> <version> - Deploy
    - /scale <deployment> <replicas> - Scale
    - /diagnose - Root cause analysis
    - /incident - Start incident response

    ## Example Status Response
    Title: 📊 Cluster Status
    Color: 0x00FF00 (green for healthy)
    Fields:
    - Nodes: ✅ 3/3 Ready (inline)
    - Pods: ⚠️ 45/48 Running (inline)
    - Services: ✅ 12/12 Active (inline)
    Description:
    All core services healthy. 3 pods pending in staging namespace.
    Buttons: [Refresh] [View Pods] [View Logs]
Kubernetes Ops Agent
# agents/k8s-ops.yaml
apiVersion: aof.dev/v1alpha1
kind: Agent
metadata:
  name: k8s-ops
  labels:
    platform: discord
    specialty: kubernetes
spec:
  model: google:gemini-2.5-flash
  temperature: 0
  max_tokens: 2048
  description: "Kubernetes operations specialist"
  tools:
    - kubectl
    - helm
  system_prompt: |
    You are a Kubernetes specialist in Discord.

    ## Response Guidelines
    - Use embed format for structured data
    - Color code by severity:
      - 0x00FF00 (green) - All healthy
      - 0xFFFF00 (yellow) - Warnings present
      - 0xFF0000 (red) - Errors/failures
    - Show pod status in table format
    - Highlight issues prominently

    ## Pod Status Format
    Title: 📦 Pods in {namespace}

    Pod              Status       Restarts  Age
    ─────────────────────────────────────────
    api-abc123       ✅ Running   0         2d
    web-xyz789       ⚠️ Pending   0         5m
    worker-def456    ❌ Error     5         1h

    Footer: {count} pods | {healthy} healthy | {issues} issues
    Buttons: [Describe] [Logs] [Events] [Refresh]

    ## Component Handling
    When receiving component:approve_restart:
    - Execute kubectl rollout restart
    - Update message with result
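The severity colors and pod table that this prompt asks for can also be computed directly. The sketch below is illustrative only: `pods_embed` is a hypothetical helper, the pod records are hand-written sample data, and treating any status other than Running or Pending as an error is an assumption.

```python
# Severity colors from the response guidelines above.
GREEN, YELLOW, RED = 0x00FF00, 0xFFFF00, 0xFF0000
ICONS = {"Running": "✅", "Pending": "⚠️"}  # anything else is shown as ❌


def pods_embed(namespace: str, pods: list) -> dict:
    """Build an embed dict with a pod table, severity color, and footer."""
    healthy = sum(1 for p in pods if p["status"] == "Running")
    issues = len(pods) - healthy
    if any(p["status"] not in ICONS for p in pods):
        color = RED       # at least one pod in an error state
    elif issues:
        color = YELLOW    # only pending pods, no errors
    else:
        color = GREEN
    rows = [f"{'Pod':<16}{'Status':<12}{'Restarts':<10}Age"]
    for p in pods:
        status = f"{ICONS.get(p['status'], '❌')} {p['status']}"
        rows.append(f"{p['name']:<16}{status:<12}{p['restarts']:<10}{p['age']}")
    return {
        "title": f"📦 Pods in {namespace}",
        "description": "\n".join(rows),
        "color": color,
        "footer": {"text": f"{len(pods)} pods | {healthy} healthy | {issues} issues"},
    }


if __name__ == "__main__":
    sample = [
        {"name": "api-abc123", "status": "Running", "restarts": 0, "age": "2d"},
        {"name": "web-xyz789", "status": "Pending", "restarts": 0, "age": "5m"},
        {"name": "worker-def456", "status": "Error", "restarts": 5, "age": "1h"},
    ]
    print(pods_embed("default", sample)["description"])
```

Column alignment with emoji depends on the client font, so the table widths here are approximate.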
Deployer Agent
# agents/deployer.yaml
apiVersion: aof.dev/v1alpha1
kind: Agent
metadata:
  name: deployer
  labels:
    platform: discord
    specialty: deployments
spec:
  model: google:gemini-2.5-flash
  temperature: 0
  max_tokens: 2048
  description: "Deployment manager"
  tools:
    - kubectl
    - helm
    - argocd
  system_prompt: |
    You are a deployment manager in Discord.

    ## Deployment Request Format
    Title: 🚀 Deployment Request
    Fields:
    - Application: {app} (inline)
    - Version: {current} → {new} (inline)
    - Environment: {env} (inline)
    Description:
    Changes:
    • Feature: New user dashboard
    • Fix: Login timeout issue
    Requested by: @{user}
    Color: 0x0099FF (blue for pending)
    Buttons:
    [✅ Approve] [❌ Reject] [📋 View Changes]

    ## Post-Deployment
    Title: ✅ Deployment Complete
    Fields:
    - Duration: 45s
    - Replicas: 3/3 Ready
    - Status: ✅ Healthy
    Color: 0x00FF00 (green)
    Buttons: [View Logs] [Rollback]
Step 3: RCA Fleet
# fleets/rca-fleet.yaml
apiVersion: aof.dev/v1alpha1
kind: Fleet
metadata:
  name: rca-fleet
  labels:
    platform: discord
spec:
  description: "Multi-agent root cause analysis"
  agents:
    - name: symptom-collector
      role: "Gather symptoms and current state"
      model: google:gemini-2.5-flash
      tools: [kubectl, prometheus]
    - name: log-analyzer
      role: "Analyze logs for errors"
      model: google:gemini-2.5-flash
      tools: [kubectl, loki]
    - name: metric-analyzer
      role: "Check metrics for anomalies"
      model: google:gemini-2.5-flash
      tools: [prometheus, grafana]
    - name: synthesizer
      role: "Synthesize findings into RCA"
      model: google:gemini-2.5-flash
  workflow:
    mode: parallel
    consensus: required
    timeout: 300
  output_format: |
    Title: 🔍 Root Cause Analysis
    Fields:
    - Severity: {severity} (inline)
    - Duration: {duration} (inline)
    - Services: {affected_count} affected (inline)
    Description:
    **Summary:** {summary}
    **Root Cause:**
    {root_cause}
    **Timeline:**
    {timeline}
    **Remediation:**
    {remediation_steps}
    Color: Based on severity
    Buttons: [Apply Fix] [View Details] [Dismiss]
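One way to picture `mode: parallel` with a `timeout` is a fan-out over the three collecting agents followed by a synthesis step. The sketch below rests on several assumptions: `run_agent` is a hypothetical stand-in for a real agent call, running the synthesizer after the others is a guess about the fleet's behavior, and `consensus: required` is not modeled at all.

```python
import asyncio


async def run_agent(name: str, incident: str) -> dict:
    """Hypothetical stand-in for invoking one fleet agent."""
    await asyncio.sleep(0)  # a real call would await the model and tools here
    return {"agent": name, "finding": f"{name} examined: {incident}"}


async def run_rca(incident: str, timeout: float = 300) -> dict:
    """Run the collecting agents concurrently, then combine their findings."""
    collectors = ["symptom-collector", "log-analyzer", "metric-analyzer"]
    # gather() runs the agents concurrently; wait_for() enforces the
    # fleet-level timeout and raises asyncio.TimeoutError if it is exceeded.
    findings = await asyncio.wait_for(
        asyncio.gather(*(run_agent(name, incident) for name in collectors)),
        timeout=timeout,
    )
    return {
        "agent": "synthesizer",
        "inputs": [f["agent"] for f in findings],
        "summary": "; ".join(f["finding"] for f in findings),
    }


if __name__ == "__main__":
    print(asyncio.run(run_rca("api-server latency spike")))
```

Because `gather` preserves argument order, the synthesizer sees findings in the same order the agents are listed.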
Step 4: Custom Slash Commands
Define more detailed commands with options:
# Full command definitions
commands:
  - name: agent
    description: "Manage AOF agents"
    options:
      - name: action
        type: 3  # STRING
        description: "Action to perform"
        required: true
        choices:
          - name: run
            value: run
          - name: status
            value: status
          - name: stop
            value: stop
      - name: agent_id
        type: 3  # STRING
        description: "Agent ID"
        required: true
  - name: deploy
    description: "Deploy application"
    options:
      - name: application
        type: 3  # STRING
        description: "Application name"
        required: true
        autocomplete: true
      - name: version
        type: 3  # STRING
        description: "Version to deploy"
        required: true
      - name: environment
        type: 3  # STRING
        description: "Target environment"
        required: true
        choices:
          - name: development
            value: dev
          - name: staging
            value: staging
          - name: production
            value: prod
  - name: scale
    description: "Scale deployment"
    options:
      - name: deployment
        type: 3  # STRING
        description: "Deployment name"
        required: true
      - name: replicas
        type: 4  # INTEGER
        description: "Number of replicas"
        required: true
        min_value: 0
        max_value: 100
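Discord enforces option types and ranges in the client, but a defensive server-side check can still be useful before acting on a command. The following sketch validates values against the `scale` definition above; `validate_options` is a hypothetical helper, not part of AOF, and only the STRING (3) and INTEGER (4) option types used in this guide are handled.

```python
STRING, INTEGER = 3, 4  # Discord application command option types

# Mirrors the options of the "scale" command defined above.
SCALE_OPTIONS = [
    {"name": "deployment", "type": STRING, "required": True},
    {"name": "replicas", "type": INTEGER, "required": True,
     "min_value": 0, "max_value": 100},
]


def validate_options(spec: list, values: dict) -> list:
    """Return a list of human-readable errors (empty when valid)."""
    errors = []
    for opt in spec:
        name = opt["name"]
        if name not in values:
            if opt.get("required"):
                errors.append(f"{name}: required")
            continue
        value = values[name]
        if opt["type"] == STRING and not isinstance(value, str):
            errors.append(f"{name}: expected string")
        elif opt["type"] == INTEGER:
            # bool is a subclass of int in Python, so reject it explicitly.
            if isinstance(value, bool) or not isinstance(value, int):
                errors.append(f"{name}: expected integer")
            elif not opt.get("min_value", value) <= value <= opt.get("max_value", value):
                errors.append(f"{name}: out of range")
    return errors


if __name__ == "__main__":
    print(validate_options(SCALE_OPTIONS, {"deployment": "api", "replicas": 3}))
    print(validate_options(SCALE_OPTIONS, {"deployment": "api", "replicas": 101}))
```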
Step 5: Component Handlers
Handle button clicks and select menus:
# In agent system prompt
system_prompt: |
  ## Component Handling
  When you receive a message with "component:" prefix,
  it's a button or select menu interaction.

  Handle these component IDs:
  - component:refresh_status → Re-run status check
  - component:view_pods → List all pods
  - component:view_logs → Show recent logs
  - component:approve_deploy → Execute deployment
  - component:reject_deploy → Cancel deployment
  - component:apply_fix → Apply recommended fix
  - component:rollback → Rollback last deployment

  Response format for component interactions:
  - Update the original message if possible
  - Use ephemeral message for confirmations
  - Add new buttons for next actions
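Some buttons in this guide carry a per-request suffix in their `custom_id` (for example `approve_deploy_abc123`), so a handler has to split the action from the identifier. Here is a minimal sketch of that split, assuming the `<action>_<id>` naming convention; `parse_custom_id` is a hypothetical helper, and the action list is taken from the prompt above.

```python
from typing import Optional

# Component actions named in the prompt above.
KNOWN_ACTIONS = (
    "refresh_status", "view_pods", "view_logs",
    "approve_deploy", "reject_deploy", "apply_fix", "rollback",
)


def parse_custom_id(custom_id: str) -> tuple:
    """Split a custom_id into (action, target_id or None).

    Actions are matched longest-first so that a longer action name is
    never shadowed by a shorter one sharing the same prefix.
    """
    for action in sorted(KNOWN_ACTIONS, key=len, reverse=True):
        if custom_id == action:
            return (action, None)
        if custom_id.startswith(action + "_"):
            target: Optional[str] = custom_id[len(action) + 1:]
            return (action, target)
    raise ValueError(f"unknown component: {custom_id!r}")


if __name__ == "__main__":
    print(parse_custom_id("refresh_status"))
    print(parse_custom_id("approve_deploy_abc123"))
```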
Step 6: Embed Templates
Status Dashboard
{
  "embeds": [{
    "title": "📊 Cluster Dashboard",
    "description": "Real-time cluster health overview",
    "color": 65280,
    "fields": [
      { "name": "🖥️ Nodes", "value": "✅ 3/3 Ready", "inline": true },
      { "name": "📦 Pods", "value": "⚠️ 45/48 Running", "inline": true },
      { "name": "🔗 Services", "value": "✅ 12/12 Active", "inline": true },
      { "name": "💾 PVCs", "value": "✅ 8/8 Bound", "inline": true },
      { "name": "🔐 Secrets", "value": "✅ 15 Active", "inline": true },
      { "name": "⚙️ ConfigMaps", "value": "✅ 23 Active", "inline": true }
    ],
    "footer": { "text": "Last updated" },
    "timestamp": "2024-01-15T10:30:00.000Z"
  }],
  "components": [{
    "type": 1,
    "components": [
      { "type": 2, "style": 1, "label": "🔄 Refresh", "custom_id": "refresh_status" },
      { "type": 2, "style": 2, "label": "📦 Pods", "custom_id": "view_pods" },
      { "type": 2, "style": 2, "label": "📋 Logs", "custom_id": "view_logs" },
      { "type": 2, "style": 2, "label": "⚠️ Alerts", "custom_id": "view_alerts" }
    ]
  }]
}
Deployment Approval
{
  "embeds": [{
    "title": "🚀 Deployment Approval Required",
    "description": "A new deployment is waiting for approval",
    "color": 255,
    "fields": [
      { "name": "Application", "value": "api-server", "inline": true },
      { "name": "Version", "value": "v2.1.0 → v2.2.0", "inline": true },
      { "name": "Environment", "value": "production", "inline": true },
      { "name": "Requested By", "value": "<@123456789>", "inline": true },
      { "name": "Changes", "value": "• Bug fixes\n• Performance improvements", "inline": false }
    ],
    "footer": { "text": "Deployment ID: deploy-abc123" }
  }],
  "components": [
    {
      "type": 1,
      "components": [
        { "type": 2, "style": 3, "label": "✅ Approve", "custom_id": "approve_deploy_abc123" },
        { "type": 2, "style": 4, "label": "❌ Reject", "custom_id": "reject_deploy_abc123" },
        { "type": 2, "style": 2, "label": "📋 View Changes", "custom_id": "view_changes_abc123" }
      ]
    }
  ]
}
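Templates like these are easier to keep consistent when generated from parameters. The sketch below builds the approval message programmatically; `approval_message` is a hypothetical helper, and the default color is the 0x0099FF "pending" blue from the deployer prompt rather than the 255 (0x0000FF) used in the template. Component type 1 is an action row, type 2 is a button, and styles 2, 3, and 4 are secondary, success, and danger in the Discord API.

```python
import json

PENDING_BLUE = 0x0099FF  # from the deployer prompt; the template uses 255


def approval_message(app: str, current: str, new: str, env: str,
                     user_id: str, deploy_id: str,
                     color: int = PENDING_BLUE) -> dict:
    """Build an approval embed with buttons scoped to one deployment."""

    def button(style: int, label: str, action: str) -> dict:
        # Suffix the action with the deployment ID so clicks can be traced
        # back to the request they belong to.
        return {"type": 2, "style": style, "label": label,
                "custom_id": f"{action}_{deploy_id}"}

    return {
        "embeds": [{
            "title": "🚀 Deployment Approval Required",
            "description": "A new deployment is waiting for approval",
            "color": color,
            "fields": [
                {"name": "Application", "value": app, "inline": True},
                {"name": "Version", "value": f"{current} → {new}", "inline": True},
                {"name": "Environment", "value": env, "inline": True},
                {"name": "Requested By", "value": f"<@{user_id}>", "inline": True},
            ],
            "footer": {"text": f"Deployment ID: deploy-{deploy_id}"},
        }],
        "components": [{
            "type": 1,
            "components": [
                button(3, "✅ Approve", "approve_deploy"),
                button(4, "❌ Reject", "reject_deploy"),
                button(2, "📋 View Changes", "view_changes"),
            ],
        }],
    }


if __name__ == "__main__":
    msg = approval_message("api-server", "v2.1.0", "v2.2.0",
                           "production", "123456789", "abc123")
    print(json.dumps(msg, ensure_ascii=False, indent=2))
```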
Step 7: Role-Based Access
Implement role checks in your configuration:
# triggers/discord-ops.yaml
spec:
  config:
    # Role IDs from Discord server
    allowed_roles:
      - "111111111111111111"  # Admin
      - "222222222222222222"  # DevOps
      - "333333333333333333"  # SRE
  # Command-specific role requirements
  commands:
    /deploy:
      agent: deployer
      description: "Deploy application"
      required_roles:
        - "111111111111111111"  # Admin
        - "222222222222222222"  # DevOps
    /status:
      agent: k8s-ops
      description: "View status"
      # No role restriction - all allowed_roles can use
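The two-level check described by this configuration can be sketched as follows. `is_authorized` is a hypothetical helper and the semantics are an assumption based on the comments above: a member first needs any role from `allowed_roles`, and a command with `required_roles` additionally needs at least one of those. In a Discord interaction payload, the member's role IDs arrive as a list of strings under `member.roles`.

```python
# Role IDs copied from the example configuration above.
ADMIN, DEVOPS, SRE = "111111111111111111", "222222222222222222", "333333333333333333"
ALLOWED_ROLES = {ADMIN, DEVOPS, SRE}
REQUIRED_ROLES = {"/deploy": {ADMIN, DEVOPS}}  # /status has no extra requirement


def is_authorized(command: str, member_roles: list) -> bool:
    """Return True if a member with these role IDs may run the command."""
    roles = set(member_roles)
    if not roles & ALLOWED_ROLES:
        return False  # fails the trigger-wide restriction
    required = REQUIRED_ROLES.get(command)
    # Commands without required_roles are open to every allowed role.
    return required is None or bool(roles & required)


if __name__ == "__main__":
    print(is_authorized("/status", [SRE]))   # SRE may view status
    print(is_authorized("/deploy", [SRE]))   # but may not deploy
```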
Step 8: Testing
Test Slash Commands
/help
/status
/pods namespace:default
/deploy application:api-server version:v2.2.0 environment:staging
Test Button Interactions
- Run /status to get a status embed
- Click the "Refresh" button
- Verify the embed updates
Test Deployment Flow
- Run /deploy application:test version:v1.0.0 environment:dev
- See approval embed appear
- Click "Approve" button
- Verify deployment proceeds
Step 9: Production Deployment
Docker Deployment
# Build stage
FROM rust:1.75 AS builder
WORKDIR /app
COPY . .
RUN cargo build --release

# Runtime stage
FROM debian:bookworm-slim
COPY --from=builder /app/target/release/aofctl /usr/local/bin/
COPY config/ /app/config/
WORKDIR /app
CMD ["aofctl", "daemon", "start"]
Kubernetes Deployment
apiVersion: apps/v1
kind: Deployment
metadata:
  name: aof-discord-bot
spec:
  replicas: 2
  selector:
    matchLabels:
      app: aof-discord-bot
  template:
    metadata:
      labels:
        app: aof-discord-bot
    spec:
      containers:
        - name: aofctl
          image: aof/aofctl:latest
          ports:
            - containerPort: 8080
          env:
            - name: DISCORD_BOT_TOKEN
              valueFrom:
                secretKeyRef:
                  name: discord-credentials
                  key: bot_token
            - name: DISCORD_APPLICATION_ID
              valueFrom:
                secretKeyRef:
                  name: discord-credentials
                  key: application_id
            - name: DISCORD_PUBLIC_KEY
              valueFrom:
                secretKeyRef:
                  name: discord-credentials
                  key: public_key
          volumeMounts:
            - name: config
              mountPath: /app/.aof
      volumes:
        - name: config
          configMap:
            name: aof-config
Best Practices
1. Embed Design
- Use consistent color coding
- Keep embeds scannable
- Limit fields to essential info
- Use inline fields for related data
2. Button Organization
- Max 5 buttons per row
- Group related actions
- Use appropriate styles (colors)
- Clear, concise labels
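The five-per-row limit can be applied mechanically when a message has many buttons. A small sketch, assuming a flat list of button dicts as input; `to_action_rows` is a hypothetical helper, and the cap of five action rows per message (25 buttons total) is a Discord limit.

```python
MAX_PER_ROW = 5   # Discord: at most 5 buttons in one action row
MAX_ROWS = 5      # Discord: at most 5 action rows per message


def to_action_rows(buttons: list) -> list:
    """Group a flat list of button dicts into action rows (type 1)."""
    if len(buttons) > MAX_PER_ROW * MAX_ROWS:
        raise ValueError("a message can hold at most 25 buttons")
    return [
        {"type": 1, "components": buttons[i:i + MAX_PER_ROW]}
        for i in range(0, len(buttons), MAX_PER_ROW)
    ]


if __name__ == "__main__":
    buttons = [
        {"type": 2, "style": 2, "label": f"Action {n}", "custom_id": f"action_{n}"}
        for n in range(7)
    ]
    rows = to_action_rows(buttons)
    print([len(row["components"]) for row in rows])
```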
3. Error Handling
- Show user-friendly error messages
- Include troubleshooting hints
- Log errors for debugging
- Provide retry options
4. Performance
- Respond within 3 seconds
- Use deferred responses for long operations
- Cache frequently accessed data
- Batch API calls when possible
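The first two points can be illustrated with the shape of the initial interaction response. In the Discord API, callback type 4 sends a message immediately, type 5 acknowledges the interaction and lets the bot follow up later, and flag 64 marks a message as ephemeral. `initial_response` and its one-second safety margin are assumptions made for this sketch, not AOF behavior.

```python
# Interaction callback types and message flag from the Discord API.
CHANNEL_MESSAGE_WITH_SOURCE = 4
DEFERRED_CHANNEL_MESSAGE_WITH_SOURCE = 5
EPHEMERAL = 1 << 6  # 64

DEADLINE_SECONDS = 3.0   # Discord's window for the initial response
SAFETY_MARGIN = 1.0      # hypothetical buffer for network latency


def initial_response(expected_seconds: float, content: str = "",
                     ephemeral: bool = False) -> dict:
    """Pick an immediate or deferred response based on expected duration."""
    flags = EPHEMERAL if ephemeral else 0
    if expected_seconds > DEADLINE_SECONDS - SAFETY_MARGIN:
        # Acknowledge now; the real result is sent as a follow-up later.
        return {"type": DEFERRED_CHANNEL_MESSAGE_WITH_SOURCE,
                "data": {"flags": flags}}
    return {"type": CHANNEL_MESSAGE_WITH_SOURCE,
            "data": {"content": content, "flags": flags}}


if __name__ == "__main__":
    print(initial_response(0.5, "Cluster healthy"))
    print(initial_response(30, ephemeral=True))
```

A long-running command such as /diagnose would typically take the deferred path.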
Next Steps
- Discord Reference - Full API reference
- Fleet Configuration - Multi-agent coordination
- Custom Tools - Add your own tools
- Deployment Guide - Production deployment