Is This Metric Normal? On-Call Baseline Answer Bot

An on-call engineer asks 'is this metric normal?' in Slack, and the bot answers with the live value compared against historical Datadog baselines.

Category: AI & RAG
Engine: sim
Difficulty: intermediate
Trigger: chat
Steps: 6
Setup: ~15 min

How it runs

The automated pipeline, trigger to output.

  • Trigger: On-call engineer mentions the bot in Slack with a metric question
  • Action: Parse question into a Datadog metric query and time window (OpenAI)
  • Action: Fetch current value and time-of-day historical baselines from Datadog
  • Action: Retrieve the runbook section defining the healthy range from Confluence
  • Logic: Classify the reading as normal, elevated, or anomalous vs baseline
  • Output: Reply in Slack thread with verdict, baseline comparison, and runbook citation
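
As an illustration of the output step, here is a minimal sketch of how a threaded reply could be assembled. It assumes the reply is sent through Slack's `chat.postMessage` method, which accepts `channel`, `thread_ts`, and `text`; the function name `build_thread_reply`, the message wording, and the example URL are hypothetical and not taken from the workflow itself.

```python
def build_thread_reply(channel, thread_ts, metric, verdict,
                       current, baseline, runbook_url):
    """Build a chat.postMessage payload that answers in the asker's thread.

    All names and the message wording are illustrative assumptions.
    """
    if baseline == 0:
        # Avoid dividing by zero when the baseline is flat at zero.
        delta = "n/a"
    else:
        delta = f"{(current - baseline) / baseline * 100:+.0f}%"
    text = (
        f"*{metric}* is *{verdict}*: current {current:g} "
        f"vs baseline {baseline:g} ({delta}). "
        f"Runbook: <{runbook_url}|healthy range>"  # Slack mrkdwn link syntax
    )
    # thread_ts keeps the answer inside the thread of the original question.
    return {"channel": channel, "thread_ts": thread_ts, "text": text}


payload = build_thread_reply(
    channel="C0123456789",                # hypothetical channel ID
    thread_ts="1700000000.000100",        # ts of the engineer's message
    metric="checkout p99 latency",
    verdict="elevated",
    current=250,
    baseline=200,
    runbook_url="https://example.com/runbook",  # placeholder URL
)
```

Passing the question's own timestamp as `thread_ts` is what keeps the verdict attached to the question rather than posting it to the channel at large.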

What it does

When an on-call engineer posts a question like "is checkout p99 latency normal right now?" in a Slack channel, this bot pulls the current metric value from Datadog, compares it against the same metric's historical baseline (last 7 and 28 days, same time-of-day window), and answers in plain English with a verdict: normal, elevated, or anomalous. Every answer cites the Confluence runbook passage that defines the expected range so the engineer can trust and trace the call.
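
The page does not say how the verdict is computed, so the following is only one possible sketch. It assumes a robust z-score (median and median absolute deviation) against the 7-day and 28-day same-time-of-day samples, that higher values are worse (as with latency), and hypothetical thresholds of 2.0 and 3.5. The function names are illustrative.

```python
from statistics import median


def robust_z(current, samples):
    """Modified z-score of `current` against `samples` using median and MAD."""
    med = median(samples)
    mad = median(abs(s - med) for s in samples)
    if mad == 0:
        # All samples identical: any upward difference counts as extreme.
        if current == med:
            return 0.0
        return float("inf") if current > med else float("-inf")
    return 0.6745 * (current - med) / mad


def classify(current, samples_7d, samples_28d,
             elevated_z=2.0, anomalous_z=3.5):
    """Return 'normal', 'elevated', or 'anomalous'.

    Assumption: only upward deviations matter, and the more alarming of
    the two baselines (7-day vs 28-day) decides the verdict.
    """
    z = max(robust_z(current, samples_7d), robust_z(current, samples_28d))
    if z >= anomalous_z:
        return "anomalous"
    if z >= elevated_z:
        return "elevated"
    return "normal"


# Hypothetical p99 latency samples in milliseconds.
week = [200, 205, 195]
month = [200, 210, 190, 205, 195, 200, 198, 202]
verdicts = [classify(v, week, month) for v in (202, 215, 400)]
```

Median and MAD are used here instead of mean and standard deviation so that a single past incident in the 28-day window does not inflate the baseline; other choices are equally plausible.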

When to use it

Use it during incidents or routine on-call shifts when someone sees a number on a dashboard and isn't sure whether it warrants action. It replaces the "let me scroll through three weeks of graphs" reflex with a cited, two-second answer.

How it works

  1. A Slack mention or slash command triggers the flow with the engineer's question.
  2. An OpenAI step parses the question into a Datadog metric query and time window.
  3. Datadog returns the current value plus historical series for matching time-of-day windows.
  4. Confluence is searched for the runbook section defining that metric's healthy range.
  5. OpenAI synthesizes a verdict from the live value, the computed baseline, and the runbook text.
  6. The bot replies in-thread with the verdict, the baseline comparison, and a runbook citation link.
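
To make step 3 concrete, here is a rough sketch of how the matching time-of-day windows might be computed. It assumes the series are fetched from Datadog's v1 timeseries query endpoint (`GET /api/v1/query`, which takes `from`, `to`, and `query` with epoch-second bounds); the helper names, the 30-minute window width, and the metric name in the query string are hypothetical.

```python
from datetime import datetime, timedelta, timezone


def same_time_windows(now, days_back, window_minutes=30):
    """Return (start, end) epoch-second pairs, one per previous day,
    each centred on the same time of day as `now`."""
    half = timedelta(minutes=window_minutes / 2)
    windows = []
    for d in range(1, days_back + 1):
        center = now - timedelta(days=d)
        windows.append((int((center - half).timestamp()),
                        int((center + half).timestamp())))
    return windows


def datadog_query_params(query, start, end):
    """Query-string parameters for one timeseries request (assumed shape)."""
    return {"query": query, "from": start, "to": end}


now = datetime(2024, 1, 8, 12, 0, tzinfo=timezone.utc)
windows = same_time_windows(now, days_back=7)
requests_to_send = [
    # Hypothetical metric name; substitute whatever the parsed question maps to.
    datadog_query_params("avg:checkout.latency.p99{env:prod}", start, end)
    for start, end in windows
]
```

Calling `same_time_windows(now, days_back=28)` yields the longer baseline in the same way. Working in UTC keeps each window exactly 24 hours apart; teams whose traffic follows local time may prefer to shift by local calendar days instead.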

Set it up

What you configure once, before turning it on.

  1. Connect Slack: Channels, DMs, threads, mentions.
  2. Connect Datadog: Metrics, traces, log search.
  3. Connect Confluence: Spaces, pages, blueprints.
  4. Connect OpenAI: Models, embeddings, files.
  5. Set each agent's model: Models are left unset so you can pick the tier, either fast and cheap or top-quality.
  6. Tune it to your data: Edit the prompts, filters, and field mappings so it matches how your team works.
  7. Test, then turn it on: Run once against a sample, confirm the output, then enable the trigger.