Building a Discord Ops Bot

Build a complete DevOps bot for Discord with slash commands, embeds, buttons, and multi-agent coordination.

What We're Building

A Discord bot that:

Responds to slash commands for DevOps operations
Shows rich status embeds with color-coded information
Uses buttons for interactive workflows
Coordinates multi-agent tasks
Supports role-based access control

Prerequisites

Completed Discord Quickstart
Kubernetes cluster access (local minikube or remote)
Basic familiarity with Discord bot development

Architecture Overview

Discord Server
      ↓
Discord Gateway → AOF Webhook
                       ↓
                  Ed25519 Verify
                       ↓
                  Parse Interaction
                       ↓
             ┌────────┴────────┐
             ↓                 ↓
        Slash Command     Component
             ↓                 ↓
        Route to           Handle
        Agent/Fleet        Action
             ↓                 ↓
        Embed + Buttons ← Format Response
             ↓
        Discord Reply
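
The branching in the diagram can be sketched as a small dispatcher. This is a minimal illustration, not AOF's implementation: it assumes the Ed25519 check has already passed, and `handle_interaction`, `route_command`, and `handle_component` are hypothetical names. The numeric interaction and response types are the ones Discord defines.

```python
# Discord interaction types
PING = 1
APPLICATION_COMMAND = 2
MESSAGE_COMPONENT = 3

# Discord interaction response types
PONG = 1
CHANNEL_MESSAGE_WITH_SOURCE = 4
UPDATE_MESSAGE = 7


def route_command(name: str) -> str:
    """Hypothetical stand-in for handing a slash command to an agent or fleet."""
    return f"Running /{name}"


def handle_component(custom_id: str) -> str:
    """Hypothetical stand-in for handling a button or select menu action."""
    return f"Handled {custom_id}"


def handle_interaction(interaction: dict) -> dict:
    """Dispatch an already-verified interaction payload by its type."""
    kind = interaction.get("type")
    data = interaction.get("data", {})
    if kind == PING:
        # Discord pings the endpoint to confirm it is reachable.
        return {"type": PONG}
    if kind == APPLICATION_COMMAND:
        content = route_command(data["name"])
        return {"type": CHANNEL_MESSAGE_WITH_SOURCE, "data": {"content": content}}
    if kind == MESSAGE_COMPONENT:
        # Button clicks edit the message the button is attached to.
        content = handle_component(data["custom_id"])
        return {"type": UPDATE_MESSAGE, "data": {"content": content}}
    raise ValueError(f"unsupported interaction type: {kind!r}")
```

Slash commands produce a new message, while component interactions update the existing one, which matches the two branches of the diagram.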

Step 1: Enhanced Trigger Configuration

Create a trigger that maps slash commands to agents, fleets, and flows, and restricts access by server and role:

# triggers/discord-ops.yaml
apiVersion: aof.dev/v1
kind: Trigger
metadata:
  name: discord-ops
  labels:
    platform: discord
    environment: production
spec:
  type: Discord
  config:
    bot_token: ${DISCORD_BOT_TOKEN}
    application_id: ${DISCORD_APPLICATION_ID}
    public_key: ${DISCORD_PUBLIC_KEY}

    # Restrict to specific servers
    guild_ids:
      - ${DISCORD_GUILD_ID}

    # Role restrictions
    allowed_roles:
      - ${DISCORD_ADMIN_ROLE}
      - ${DISCORD_DEVOPS_ROLE}

  # Command definitions
  commands:
    /help:
      agent: devops
      description: "Show available commands"

    /status:
      agent: k8s-ops
      description: "Cluster status dashboard"

    /pods:
      agent: k8s-ops
      description: "List pods in namespace"

    /logs:
      agent: k8s-ops
      description: "View pod logs"

    /deploy:
      agent: deployer
      description: "Deploy application"

    /scale:
      agent: k8s-ops
      description: "Scale deployment"

    /diagnose:
      fleet: rca-fleet
      description: "Root cause analysis"

    /incident:
      flow: incident-flow
      description: "Start incident response"

  default_agent: devops
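
To make the routing concrete, here is a rough sketch of how a command could be resolved to its target. It assumes each entry names exactly one of agent, fleet, or flow, and that unknown commands fall back to default_agent; the `resolve_target` helper is hypothetical and AOF's actual resolution logic may differ.

```python
# Mirrors a subset of the `commands` and `default_agent` keys from the trigger.
COMMANDS = {
    "/help": {"agent": "devops"},
    "/status": {"agent": "k8s-ops"},
    "/diagnose": {"fleet": "rca-fleet"},
    "/incident": {"flow": "incident-flow"},
}
DEFAULT_AGENT = "devops"


def resolve_target(command: str) -> tuple:
    """Return (kind, name) for a command, falling back to the default agent."""
    entry = COMMANDS.get(command, {})
    for kind in ("agent", "fleet", "flow"):
        if kind in entry:
            return kind, entry[kind]
    # Assumption: anything not listed goes to the default agent.
    return "agent", DEFAULT_AGENT
```

For example, `resolve_target("/diagnose")` yields the fleet `rca-fleet`, and an unlisted command resolves to the `devops` agent.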

Step 2: Specialized Agents

DevOps Agent

# agents/devops.yaml
apiVersion: aof.dev/v1alpha1
kind: Agent
metadata:
  name: devops
  labels:
    platform: discord
spec:
  model: google:gemini-2.5-flash
  temperature: 0
  max_tokens: 2048

  description: "General DevOps assistant for Discord"

  tools:
    - kubectl
    - docker
    - helm
    - aws

  system_prompt: |
    You are a DevOps assistant in Discord.

    ## Response Format
    Use Discord embed formatting:
    - Title with emoji prefix
    - Color-coded status (green=success, yellow=warning, red=error)
    - Inline fields for metrics
    - Footer with timestamp

    ## Available Commands
    - /status - Cluster dashboard
    - /pods [namespace] - List pods
    - /logs <pod> - View logs
    - /deploy <app> <version> - Deploy
    - /scale <deployment> <replicas> - Scale
    - /diagnose - Root cause analysis
    - /incident - Start incident response

    ## Example Status Response
    Title: 📊 Cluster Status
    Color: 0x00FF00 (green for healthy)

    Fields:
    - Nodes: ✅ 3/3 Ready (inline)
    - Pods: ⚠️ 45/48 Running (inline)
    - Services: ✅ 12/12 Active (inline)

    Description:
    All core services healthy. 3 pods pending in staging namespace.

    Buttons: [Refresh] [View Pods] [View Logs]

Kubernetes Ops Agent

# agents/k8s-ops.yaml
apiVersion: aof.dev/v1alpha1
kind: Agent
metadata:
  name: k8s-ops
  labels:
    platform: discord
    specialty: kubernetes
spec:
  model: google:gemini-2.5-flash
  temperature: 0
  max_tokens: 2048

  description: "Kubernetes operations specialist"

  tools:
    - kubectl
    - helm

  system_prompt: |
    You are a Kubernetes specialist in Discord.

    ## Response Guidelines
    - Use embed format for structured data
    - Color code by severity:
      - 0x00FF00 (green) - All healthy
      - 0xFFFF00 (yellow) - Warnings present
      - 0xFF0000 (red) - Errors/failures
    - Show pod status in table format
    - Highlight issues prominently

    ## Pod Status Format
    Title: 📦 Pods in {namespace}

    Pod              Status       Restarts  Age
    ─────────────────────────────────────────────
    api-abc123       ✅ Running   0         2d
    web-xyz789       ⚠️ Pending   0         5m
    worker-def456    ❌ Error     5         1h

    Footer: {count} pods | {healthy} healthy | {issues} issues

    Buttons: [Describe] [Logs] [Events] [Refresh]

    ## Component Handling
    When receiving component:approve_restart:
    - Execute kubectl rollout restart
    - Update message with result
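
The pod table, footer, and severity color described in this prompt are produced by the model, but the same rules can be written down as plain code, which is handy for checking what a correct response should look like. The sketch below is illustrative only; `format_pods`, `footer`, and `severity_color` are hypothetical helpers and the sample pods are the ones from the prompt.

```python
STATUS_ICONS = {"Running": "✅", "Pending": "⚠️", "Error": "❌"}


def format_pods(pods: list) -> str:
    """Render pods as a fixed-width table (emoji may shift columns slightly)."""
    header = f"{'Pod':<16}{'Status':<12}{'Restarts':<10}Age"
    lines = [header, "─" * len(header)]
    for pod in pods:
        status = f"{STATUS_ICONS.get(pod['status'], '?')} {pod['status']}"
        lines.append(f"{pod['name']:<16}{status:<12}{pod['restarts']:<10}{pod['age']}")
    return "\n".join(lines)


def footer(pods: list) -> str:
    """Build the '{count} pods | {healthy} healthy | {issues} issues' footer."""
    healthy = sum(1 for pod in pods if pod["status"] == "Running")
    return f"{len(pods)} pods | {healthy} healthy | {len(pods) - healthy} issues"


def severity_color(pods: list) -> int:
    """Red if any pod errored, yellow if any pod is not running, else green."""
    statuses = {pod["status"] for pod in pods}
    if "Error" in statuses:
        return 0xFF0000
    if statuses - {"Running"}:
        return 0xFFFF00
    return 0x00FF00


pods = [
    {"name": "api-abc123", "status": "Running", "restarts": 0, "age": "2d"},
    {"name": "web-xyz789", "status": "Pending", "restarts": 0, "age": "5m"},
    {"name": "worker-def456", "status": "Error", "restarts": 5, "age": "1h"},
]
print(format_pods(pods))
print(footer(pods))
```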

Deployer Agent

# agents/deployer.yaml
apiVersion: aof.dev/v1alpha1
kind: Agent
metadata:
  name: deployer
  labels:
    platform: discord
    specialty: deployments
spec:
  model: google:gemini-2.5-flash
  temperature: 0
  max_tokens: 2048

  description: "Deployment manager"

  tools:
    - kubectl
    - helm
    - argocd

  system_prompt: |
    You are a deployment manager in Discord.

    ## Deployment Request Format
    Title: 🚀 Deployment Request

    Fields:
    - Application: {app} (inline)
    - Version: {current} → {new} (inline)
    - Environment: {env} (inline)

    Description:
    Changes:
    • Feature: New user dashboard
    • Fix: Login timeout issue

    Requested by: @{user}

    Color: 0x0099FF (blue for pending)

    Buttons:
    [✅ Approve] [❌ Reject] [📋 View Changes]

    ## Post-Deployment
    Title: ✅ Deployment Complete

    Fields:
    - Duration: 45s
    - Replicas: 3/3 Ready
    - Status: ✅ Healthy

    Color: 0x00FF00 (green)

    Buttons: [View Logs] [Rollback]

Step 3: RCA Fleet

# fleets/rca-fleet.yaml
apiVersion: aof.dev/v1alpha1
kind: Fleet
metadata:
  name: rca-fleet
  labels:
    platform: discord
spec:
  description: "Multi-agent root cause analysis"

  agents:
    - name: symptom-collector
      role: "Gather symptoms and current state"
      model: google:gemini-2.5-flash
      tools: [kubectl, prometheus]

    - name: log-analyzer
      role: "Analyze logs for errors"
      model: google:gemini-2.5-flash
      tools: [kubectl, loki]

    - name: metric-analyzer
      role: "Check metrics for anomalies"
      model: google:gemini-2.5-flash
      tools: [prometheus, grafana]

    - name: synthesizer
      role: "Synthesize findings into RCA"
      model: google:gemini-2.5-flash

  workflow:
    mode: parallel
    consensus: required
    timeout: 300

  output_format: |
    Title: 🔍 Root Cause Analysis

    Fields:
    - Severity: {severity} (inline)
    - Duration: {duration} (inline)
    - Services: {affected_count} affected (inline)

    Description:
    **Summary:** {summary}

    **Root Cause:**
    {root_cause}

    **Timeline:**
    {timeline}

    **Remediation:**
    {remediation_steps}

    Color: Based on severity

    Buttons: [Apply Fix] [View Details] [Dismiss]
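
One way to picture `mode: parallel` with a `timeout` is the following asyncio sketch. It assumes the three collecting agents run concurrently under one shared timeout and that the synthesizer runs afterwards on their findings; that ordering is an assumption, `consensus: required` is not modeled, and `run_agent` is a hypothetical stand-in for a real agent call.

```python
import asyncio


async def run_agent(name: str, delay: float = 0.0) -> dict:
    """Hypothetical stand-in for invoking one fleet agent."""
    await asyncio.sleep(delay)
    return {"agent": name, "finding": f"{name} finished"}


async def run_rca(timeout: float = 300) -> dict:
    collectors = ["symptom-collector", "log-analyzer", "metric-analyzer"]
    # Run the collecting agents concurrently, bounded by one shared timeout.
    findings = await asyncio.wait_for(
        asyncio.gather(*(run_agent(name) for name in collectors)),
        timeout=timeout,
    )
    # The synthesizer goes last because it needs every finding.
    return {"agent": "synthesizer", "inputs": [f["agent"] for f in findings]}


result = asyncio.run(run_rca(timeout=5))
```

If any collector exceeds the timeout, `asyncio.wait_for` raises `asyncio.TimeoutError` and cancels the remaining work, so no partial RCA is synthesized.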

Step 4: Custom Slash Commands

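Define commands with typed options, fixed choices, and value limits. The numeric type values are Discord's option types (3 = STRING, 4 = INTEGER):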

# Full command definitions
commands:
  - name: agent
    description: "Manage AOF agents"
    options:
      - name: action
        type: 3  # STRING
        description: "Action to perform"
        required: true
        choices:
          - name: run
            value: run
          - name: status
            value: status
          - name: stop
            value: stop
      - name: agent_id
        type: 3
        description: "Agent ID"
        required: true

  - name: deploy
    description: "Deploy application"
    options:
      - name: application
        type: 3
        description: "Application name"
        required: true
        autocomplete: true
      - name: version
        type: 3
        description: "Version to deploy"
        required: true
      - name: environment
        type: 3
        description: "Target environment"
        required: true
        choices:
          - name: development
            value: dev
          - name: staging
            value: staging
          - name: production
            value: prod

  - name: scale
    description: "Scale deployment"
    options:
      - name: deployment
        type: 3
        description: "Deployment name"
        required: true
      - name: replicas
        type: 4  # INTEGER
        description: "Number of replicas"
        required: true
        min_value: 0
        max_value: 100

Step 5: Component Handlers

Handle button clicks and select menus:

# In agent system prompt
system_prompt: |
  ## Component Handling
  When you receive a message with "component:" prefix,
  it's a button or select menu interaction.

  Handle these component IDs:
  - component:refresh_status → Re-run status check
  - component:view_pods → List all pods
  - component:view_logs → Show recent logs
  - component:approve_deploy → Execute deployment
  - component:reject_deploy → Cancel deployment
  - component:apply_fix → Apply recommended fix
  - component:rollback → Rollback last deployment

  Response format for component interactions:
  - Update the original message if possible
  - Use ephemeral message for confirmations
  - Add new buttons for next actions
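
Because the agent receives the component ID as text, it helps to be precise about how an ID splits into an action and an optional target. The sketch below assumes the message is the "component:" prefix followed by the button's custom_id, and that IDs may carry a suffix such as a deployment ID (for example approve_deploy_abc123); `parse_component` is a hypothetical helper.

```python
from typing import Optional, Tuple

KNOWN_ACTIONS = [
    "refresh_status", "view_pods", "view_logs",
    "approve_deploy", "reject_deploy", "apply_fix", "rollback",
]


def parse_component(message: str) -> Tuple[str, Optional[str]]:
    """Split 'component:approve_deploy_abc123' into ('approve_deploy', 'abc123')."""
    prefix = "component:"
    if not message.startswith(prefix):
        raise ValueError("not a component interaction")
    custom_id = message[len(prefix):]
    # Check longer action names first so a shorter name cannot shadow them.
    for action in sorted(KNOWN_ACTIONS, key=len, reverse=True):
        if custom_id == action:
            return action, None
        if custom_id.startswith(action + "_"):
            return action, custom_id[len(action) + 1:]
    raise ValueError(f"unknown component: {custom_id}")
```

Rejecting unknown IDs explicitly keeps a stale or tampered button from silently triggering the wrong action.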

Step 6: Embed Templates

Status Dashboard

{
  "embeds": [{
    "title": "📊 Cluster Dashboard",
    "description": "Real-time cluster health overview",
    "color": 65280,
    "fields": [
      { "name": "🖥️ Nodes", "value": "✅ 3/3 Ready", "inline": true },
      { "name": "📦 Pods", "value": "⚠️ 45/48 Running", "inline": true },
      { "name": "🔗 Services", "value": "✅ 12/12 Active", "inline": true },
      { "name": "💾 PVCs", "value": "✅ 8/8 Bound", "inline": true },
      { "name": "🔐 Secrets", "value": "✅ 15 Active", "inline": true },
      { "name": "⚙️ ConfigMaps", "value": "✅ 23 Active", "inline": true }
    ],
    "footer": { "text": "Last updated" },
    "timestamp": "2024-01-15T10:30:00.000Z"
  }],
  "components": [{
    "type": 1,
    "components": [
      { "type": 2, "style": 1, "label": "🔄 Refresh", "custom_id": "refresh_status" },
      { "type": 2, "style": 2, "label": "📦 Pods", "custom_id": "view_pods" },
      { "type": 2, "style": 2, "label": "📋 Logs", "custom_id": "view_logs" },
      { "type": 2, "style": 2, "label": "⚠️ Alerts", "custom_id": "view_alerts" }
    ]
  }]
}

Deployment Approval

{
  "embeds": [{
    "title": "🚀 Deployment Approval Required",
    "description": "A new deployment is waiting for approval",
    "color": 255,
    "fields": [
      { "name": "Application", "value": "api-server", "inline": true },
      { "name": "Version", "value": "v2.1.0 → v2.2.0", "inline": true },
      { "name": "Environment", "value": "production", "inline": true },
      { "name": "Requested By", "value": "<@123456789>", "inline": true },
      { "name": "Changes", "value": "• Bug fixes\n• Performance improvements", "inline": false }
    ],
    "footer": { "text": "Deployment ID: deploy-abc123" }
  }],
  "components": [
    {
      "type": 1,
      "components": [
        { "type": 2, "style": 3, "label": "✅ Approve", "custom_id": "approve_deploy_abc123" },
        { "type": 2, "style": 4, "label": "❌ Reject", "custom_id": "reject_deploy_abc123" },
        { "type": 2, "style": 2, "label": "📋 View Changes", "custom_id": "view_changes_abc123" }
      ]
    }
  ]
}
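
The color values in these templates are decimal forms of RGB hex codes (65280 is 0x00FF00, 255 is 0x0000FF). Discord also enforces size limits on embeds, so a quick check before sending can save a rejected request. The following is a minimal sketch; `check_embed` is a hypothetical helper, it covers only a few of the documented limits, and Python's `len` is used as an approximation of Discord's character counting.

```python
def check_embed(embed: dict) -> list:
    """Return a list of problems found in an embed; empty means it looks fine."""
    problems = []
    for key, limit in (("title", 256), ("description", 4096)):
        if len(embed.get(key, "")) > limit:
            problems.append(f"{key} longer than {limit} characters")
    fields = embed.get("fields", [])
    if len(fields) > 25:
        problems.append("more than 25 fields")
    for field in fields:
        if len(field["name"]) > 256 or len(field["value"]) > 1024:
            problems.append(f"field {field['name'][:20]!r} is too long")
    if not 0 <= embed.get("color", 0) <= 0xFFFFFF:
        problems.append("color outside 0x000000..0xFFFFFF")
    return problems


dashboard = {
    "title": "📊 Cluster Dashboard",
    "description": "Real-time cluster health overview",
    "color": 0x00FF00,  # same value as 65280 in the JSON template
    "fields": [
        {"name": "🖥️ Nodes", "value": "✅ 3/3 Ready", "inline": True},
        {"name": "📦 Pods", "value": "⚠️ 45/48 Running", "inline": True},
    ],
}
print(check_embed(dashboard))
```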

Step 7: Role-Based Access

Restrict the bot to specific roles with allowed_roles, then tighten individual commands with required_roles:

# triggers/discord-ops.yaml
spec:
  config:
    # Role IDs from Discord server
    allowed_roles:
      - "111111111111111111"  # Admin
      - "222222222222222222"  # DevOps
      - "333333333333333333"  # SRE

  # Command-specific role requirements
  commands:
    /deploy:
      agent: deployer
      description: "Deploy application"
      required_roles:
        - "111111111111111111"  # Admin
        - "222222222222222222"  # DevOps

    /status:
      agent: k8s-ops
      description: "View status"
      # No command-specific restriction: any role in allowed_roles can use it
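
The two levels of restriction can be read as a two-step check. The sketch below assumes a user needs at least one role from allowed_roles to use the bot at all, plus at least one role from required_roles when a command defines them; this is an interpretation of the configuration, and `can_run` is a hypothetical helper. Discord sends the caller's role IDs as strings in the interaction's member.roles list.

```python
ALLOWED_ROLES = {
    "111111111111111111",  # Admin
    "222222222222222222",  # DevOps
    "333333333333333333",  # SRE
}
COMMAND_ROLES = {
    "/deploy": {"111111111111111111", "222222222222222222"},
}


def can_run(command: str, user_roles: set) -> bool:
    """Check the bot-wide allow list first, then any command-specific roles."""
    if not user_roles & ALLOWED_ROLES:
        return False
    required = COMMAND_ROLES.get(command)
    return required is None or bool(user_roles & required)


sre = {"333333333333333333"}
print(can_run("/status", sre))  # SRE is on the allow list
print(can_run("/deploy", sre))  # but not among the /deploy roles
```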

Step 8: Testing

Test Slash Commands

/help
/status
/pods namespace:default
/deploy application:api-server version:v2.2.0 environment:staging

Test Button Interactions

Run /status to get a status embed
Click the "Refresh" button
Verify the embed updates

Test Deployment Flow

Run /deploy application:test version:v1.0.0 environment:dev
See approval embed appear
Click "Approve" button
Verify deployment proceeds

Step 9: Production Deployment

Docker Deployment

# Build stage: compile aofctl in release mode
FROM rust:1.75 AS builder
WORKDIR /app
COPY . .
RUN cargo build --release

# Runtime stage: ship only the binary and config in a slim image
FROM debian:bookworm-slim
COPY --from=builder /app/target/release/aofctl /usr/local/bin/
COPY config/ /app/config/
WORKDIR /app
CMD ["aofctl", "daemon", "start"]

Kubernetes Deployment

apiVersion: apps/v1
kind: Deployment
metadata:
  name: aof-discord-bot
spec:
  replicas: 2
  selector:
    matchLabels:
      app: aof-discord-bot
  template:
    metadata:
      labels:
        app: aof-discord-bot
    spec:
      containers:
        - name: aofctl
          image: aof/aofctl:latest
          ports:
            - containerPort: 8080
          env:
            - name: DISCORD_BOT_TOKEN
              valueFrom:
                secretKeyRef:
                  name: discord-credentials
                  key: bot_token
            - name: DISCORD_APPLICATION_ID
              valueFrom:
                secretKeyRef:
                  name: discord-credentials
                  key: application_id
            - name: DISCORD_PUBLIC_KEY
              valueFrom:
                secretKeyRef:
                  name: discord-credentials
                  key: public_key
          volumeMounts:
            - name: config
              mountPath: /app/.aof
      volumes:
        - name: config
          configMap:
            name: aof-config

Best Practices

1. Embed Design

Use consistent color coding
Keep embeds scannable
Limit fields to essential info
Use inline fields for related data

2. Button Organization

Use at most 5 buttons per action row and 5 rows per message (Discord limits)
Group related actions
Use appropriate styles (colors)
Clear, concise labels
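
As a small illustration of the row limit, the helper below splits a flat list of buttons into action rows of up to five. `to_action_rows` is a hypothetical name; the `type` values (1 for an action row, 2 for a button) match the embed templates in Step 6.

```python
MAX_BUTTONS_PER_ROW = 5
MAX_ROWS = 5


def to_action_rows(buttons: list) -> list:
    """Group buttons into Discord action rows of at most five buttons each."""
    if len(buttons) > MAX_BUTTONS_PER_ROW * MAX_ROWS:
        raise ValueError("a message can hold at most 25 buttons")
    return [
        {"type": 1, "components": buttons[i:i + MAX_BUTTONS_PER_ROW]}
        for i in range(0, len(buttons), MAX_BUTTONS_PER_ROW)
    ]


buttons = [
    {"type": 2, "style": 2, "label": f"Action {n}", "custom_id": f"action_{n}"}
    for n in range(7)
]
rows = to_action_rows(buttons)
```

With seven buttons this produces two rows, one with five buttons and one with two.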

3. Error Handling

Show user-friendly error messages
Include troubleshooting hints
Log errors for debugging
Provide retry options
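
These points can be combined into a single error response. The sketch below builds an ephemeral message (visible only to the user who triggered it) with a hint and a retry button. `error_response` and the `retry_status` ID are hypothetical; the flag value 64 and the response type 4 are Discord's.

```python
EPHEMERAL = 1 << 6  # Discord message flag: only the invoking user sees the message


def error_response(summary: str, hint: str, retry_id: str) -> dict:
    """Build an ephemeral error embed with a troubleshooting hint and a retry button."""
    return {
        "type": 4,  # CHANNEL_MESSAGE_WITH_SOURCE
        "data": {
            "flags": EPHEMERAL,
            "embeds": [{
                "title": f"❌ {summary}",
                "description": hint,
                "color": 0xFF0000,
            }],
            "components": [{
                "type": 1,
                "components": [
                    {"type": 2, "style": 2, "label": "🔄 Retry", "custom_id": retry_id},
                ],
            }],
        },
    }


response = error_response(
    "Could not reach the cluster",
    "Check that the kubeconfig is mounted and the API server is reachable.",
    "retry_status",
)
```

Keeping errors ephemeral avoids cluttering the channel while still giving the caller a next step.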

4. Performance

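Acknowledge every interaction within 3 seconds, Discord's deadline for the initial response
Use deferred responses for long operations, then send the result as a follow-up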
Cache frequently accessed data
Batch API calls when possible
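
A deferred response has two parts: an immediate acknowledgement, then an edit of that placeholder once the work is done. The sketch below shows both pieces at the Discord API level; the helper names are hypothetical, and whether AOF defers automatically is not covered here. Discord keeps the interaction token valid for 15 minutes, so the follow-up must arrive within that window.

```python
API_BASE = "https://discord.com/api/v10"
DEFERRED_CHANNEL_MESSAGE_WITH_SOURCE = 5


def deferred_ack() -> dict:
    """Immediate reply that shows a loading state while work continues."""
    return {"type": DEFERRED_CHANNEL_MESSAGE_WITH_SOURCE}


def followup_url(application_id: str, interaction_token: str) -> str:
    """URL to PATCH with the final message once the long operation finishes."""
    return (
        f"{API_BASE}/webhooks/{application_id}/"
        f"{interaction_token}/messages/@original"
    )


ack = deferred_ack()
url = followup_url("APP_ID", "INTERACTION_TOKEN")
final_message = {"content": "✅ Deployment Complete"}  # body for the PATCH request
```

Commands such as /deploy and /diagnose are natural candidates, since a rollout or a multi-agent analysis will rarely finish inside the 3-second window.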

Next Steps

Discord Reference - Full API reference
Fleet Configuration - Multi-agent coordination
Custom Tools - Add your own tools
Deployment Guide - Production deployment
