Weekly Service Baseline Drift Digest

On a weekly schedule, the bot compares each tracked service's key metrics against their established Datadog baselines and flags the ones that have drifted.

Category: AI & RAG
Engine: sim
Difficulty: intermediate
Trigger: schedule
Steps: 6
Setup: ~15 min

How it runs

The automated pipeline, trigger to output.

  • Trigger: Weekly schedule fires
  • Action: Pull recent-week and 90-day series for each watchlist metric from Datadog
  • Logic: Flag metrics whose recent baseline diverges from the long-term baseline
  • Action: Look up documented thresholds for flagged metrics in Confluence
  • Action: Compose the drift digest with stale-threshold callouts (OpenAI)
  • Output: Post the weekly baseline drift digest to Slack

What it does

Every week this workflow reviews a watchlist of service metrics, pulls each one's recent Datadog trend against its longer-term baseline, and identifies metrics whose normal range has quietly shifted. It cross-references the documented thresholds in Confluence runbooks, then posts a single Slack digest highlighting drift so the team can update alert thresholds and runbook expectations before the drift becomes an incident.

When to use it

Use it to keep alerting honest. Baselines decay as traffic patterns and deploys change; this catches the slow drift that makes monitors either too noisy or dangerously quiet.
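
One way to make "too noisy" and "dangerously quiet" concrete is sketched below. This is an illustration only, not the workflow's actual logic: the function name `threshold_status`, the assumption that the threshold is an upper bound, and both cutoff values are hypothetical and would need tuning per metric.

```python
def threshold_status(recent, threshold, noisy_fraction=0.05, quiet_ratio=3.0):
    """Classify an upper-bound alert threshold against recent metric values.

    All parameters are illustrative assumptions:
      noisy_fraction: share of recent points allowed above the threshold
      quiet_ratio:    how far above the recent peak the threshold may sit
    """
    if not recent:
        raise ValueError("recent must contain at least one value")

    # Too noisy: the metric's new normal already crosses the threshold often.
    breaches = sum(1 for v in recent if v > threshold)
    if breaches / len(recent) > noisy_fraction:
        return "too_noisy"

    # Too quiet: the threshold sits so far above the recent peak that a
    # large regression would still not trigger the monitor.
    peak = max(recent)
    if peak > 0 and threshold / peak > quiet_ratio:
        return "too_quiet"

    return "ok"


if __name__ == "__main__":
    print(threshold_status([80, 90, 110, 120], threshold=100))
    print(threshold_status([10, 12, 11, 9], threshold=100))
```

A lower-bound threshold (for example, a minimum request rate) would need the comparisons inverted.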

How it works

  1. A weekly schedule trigger starts the run.
  2. Datadog returns recent-week and trailing-90-day series for every metric on the watchlist.
  3. A logic step flags metrics whose recent baseline diverges materially from the long-term one.
  4. For each flagged metric, Confluence is searched for the runbook's documented threshold.
  5. OpenAI writes a digest noting drift direction, magnitude, and whether the runbook threshold is now stale.
  6. The digest posts to the team's Slack channel with per-metric runbook citations.
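
The drift check in step 3 is not specified in detail here. A minimal sketch of one plausible approach follows, assuming the baseline is the median of each series and that "materially" means a relative change of at least 20%. The function name `flag_drift`, the median choice, and the `rel_threshold` default are all hypothetical.

```python
import math
from statistics import median


def flag_drift(recent, long_term, rel_threshold=0.2):
    """Compare the recent-week median against the trailing 90-day median.

    Returns (drifted, direction, relative_change). The median baseline and
    the 20% cutoff are assumptions for illustration, not the workflow's
    documented behaviour.
    """
    recent_base = median(recent)
    long_base = median(long_term)

    if long_base == 0:
        # Relative change is undefined against a zero baseline; treat any
        # non-zero recent baseline as an unbounded shift in its own direction.
        change = 0.0 if recent_base == 0 else math.copysign(math.inf, recent_base)
    else:
        change = (recent_base - long_base) / abs(long_base)

    if change > 0:
        direction = "up"
    elif change < 0:
        direction = "down"
    else:
        direction = "flat"

    return abs(change) >= rel_threshold, direction, change


if __name__ == "__main__":
    # Recent week sits about 25% above the long-term baseline.
    print(flag_drift([120, 130, 125], [100, 100, 100, 100]))
```

The median is used here because it is less sensitive to short spikes than the mean, so a single incident in the recent week is less likely to register as a baseline shift.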

Set it up

What you configure once, before turning it on.

  1. Connect Datadog: metrics, traces, log search.
  2. Connect Confluence: spaces, pages, blueprints.
  3. Connect OpenAI: models, embeddings, files.
  4. Connect Slack: channels, DMs, threads, mentions.
  5. Set each agent's model. Models are left unset so you pick the tier: fast and cheap, or top-quality.
  6. Tune it to your data. Edit the prompts, filters, and field mappings so it matches how your team works.
  7. Test, then turn it on. Run once against a sample, confirm the output, then enable the trigger.
