PagerDuty Alert Enrichment with Baseline + Runbook Context

When a PagerDuty alert fires, the bot automatically enriches it by comparing the alerting metric to its Datadog historical baseline and attaching the matching runbook steps.

Category: AI & RAG
Engine: sim
Difficulty: advanced
Trigger: event
Steps: 6
Setup: ~25 min

How it runs

The automated pipeline, trigger to output.

  • Trigger: PagerDuty incident is created
  • Action: Pull the alerting metric's current value and 28-day baseline from Datadog
  • Logic: Compute deviation from baseline and derive a severity score
  • Action: Find the matching runbook by monitor/service tag in Confluence
  • Action: Draft an enriched incident context card (OpenAI)
  • Output: Post context card to the incident Slack channel and link it on PagerDuty
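
The logic step above can be sketched in a few lines of Python. This is a minimal illustration, not the workflow's actual scoring: the function names, the use of a z-score, and the 0-100 scale that saturates at five standard deviations are all assumptions made for the example.

```python
from statistics import mean, pstdev
from typing import Optional


def deviation_from_baseline(current: float, baseline: list[float]) -> dict:
    """Compare the current metric value with its baseline samples."""
    if not baseline:
        raise ValueError("baseline needs at least one sample")
    base_mean = mean(baseline)
    base_std = pstdev(baseline)  # population std dev of the baseline samples
    gap = current - base_mean
    return {
        "baseline_mean": base_mean,
        # Percent gap relative to the baseline mean; undefined when the mean is 0.
        "pct_deviation": None if base_mean == 0 else gap / abs(base_mean) * 100.0,
        # Standard deviations from the mean; undefined when the baseline is flat.
        "z_score": None if base_std == 0 else gap / base_std,
    }


def severity_score(z_score: Optional[float]) -> Optional[int]:
    """Map |z| onto a rough 0-100 score (hypothetical scale, capped at |z| >= 5)."""
    if z_score is None:
        return None
    return min(100, round(abs(z_score) / 5.0 * 100))
```

With baseline samples of 100, 110, 90, and 100 and a current value of 110, the z-score is about 1.41 and the score is 28. A flat baseline yields no score, which the card can report as "no spread in baseline" rather than a misleading number.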

What it does

This workflow fires on every new PagerDuty incident and does the legwork an engineer would do in the first five panicked minutes: it identifies the alerting metric, pulls its historical baseline from Datadog to show how far outside normal the current value is, locates the matching runbook in Confluence, and posts a single enriched context card to the incident's Slack channel. The responder opens the page already knowing the severity and the first remediation step.
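
To make the shape of that context card concrete, here is a small sketch of the facts it carries and one way to render them as Slack message text. The `ContextCard` fields and the `render_card` template are hypothetical; in the workflow the wording is drafted by the OpenAI step, so treat the template as a stand-in for that output.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ContextCard:
    """Facts gathered by the earlier steps (field names are illustrative)."""
    metric: str
    current: float
    baseline: float
    severity: int                      # rough 0-100 score from the logic step
    runbook_title: Optional[str] = None
    runbook_url: Optional[str] = None
    first_step: Optional[str] = None


def render_card(card: ContextCard) -> str:
    """Render the card as Slack mrkdwn text (*bold*, <url|label> links)."""
    lines = [
        f"*{card.metric}* is at {card.current:g} (baseline {card.baseline:g})",
        f"Severity score: {card.severity}/100",
    ]
    if card.runbook_url:
        label = card.runbook_title or "runbook"
        lines.append(f"Runbook: <{card.runbook_url}|{label}>")
        if card.first_step:
            lines.append(f"First step: {card.first_step}")
    else:
        # Say so explicitly when no runbook matched, rather than omitting the line.
        lines.append("Runbook: no match found")
    return "\n".join(lines)
```

Keeping the facts in a structure separate from the rendered text means the same data can feed the drafting prompt and a plain fallback message.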

When to use it

Use it to cut mean-time-to-context on noisy or unfamiliar alerts, especially for rotations where responders cover services they didn't build.

How it works

  1. A PagerDuty incident trigger delivers the alert payload and offending metric.
  2. Datadog is queried for that metric's current value and 28-day baseline at the same hour-of-week.
  3. A logic step computes the deviation and a rough severity score from the baseline gap.
  4. Confluence is searched for the runbook tied to the alert's monitor or service tag.
  5. OpenAI drafts a concise context card: what's abnormal, by how much, and the first runbook action.
  6. The card is posted to the incident's Slack channel and linked back onto the PagerDuty incident.
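
A 28-day baseline at the same hour-of-week amounts to four one-hour samples, taken 7, 14, 21, and 28 days before the alert. The sketch below computes those windows as epoch-second ranges, the form a metrics query typically accepts. The function name and the choice of whole-hour windows in UTC are assumptions for illustration; the workflow's actual query may be built differently.

```python
from datetime import datetime, timedelta, timezone


def hour_of_week_windows(alert_time: datetime, weeks: int = 4) -> list[tuple[int, int]]:
    """Return (start, end) epoch-second ranges for the same hour-of-week
    in each of the previous `weeks` weeks.

    `alert_time` should be timezone-aware; a naive datetime would be
    interpreted as local time by astimezone().
    """
    # Truncate to the top of the hour in UTC so every window lines up.
    hour_start = alert_time.astimezone(timezone.utc).replace(
        minute=0, second=0, microsecond=0
    )
    windows = []
    for k in range(1, weeks + 1):
        start = hour_start - timedelta(weeks=k)
        end = start + timedelta(hours=1)
        windows.append((int(start.timestamp()), int(end.timestamp())))
    return windows
```

Averaging the metric over each window gives four baseline samples, which is enough to compute a mean and a spread for the deviation step. Comparing like with like (Tuesday 14:00 against earlier Tuesdays at 14:00) keeps normal daily and weekly traffic cycles from registering as anomalies.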

Set it up

What you configure once, before turning it on.

  1. Connect PagerDuty: incidents, on-call, escalations.
  2. Connect Datadog: metrics, traces, log search.
  3. Connect Confluence: spaces, pages, blueprints.
  4. Connect OpenAI: models, embeddings, files.
  5. Connect Slack: channels, DMs, threads, mentions.
  6. Set each agent's model: we leave models unset so you pick the tier (fast and cheap, or top-quality).
  7. Tune it to your data: edit the prompts, filters, and field mappings so it matches how your team works.
  8. Test, then turn it on: run once against a sample, confirm the output, then enable the trigger.
