
Escalate Datadog cache-regression alerts to PagerDuty with deploy context

When a Datadog monitor on Cloudflare cache-hit ratio fires, this workflow enriches the alert with the current cache breakdown and the last deploy, then escalates it to PagerDuty.

Category: DevOps
Engine: sim
Difficulty: advanced
Trigger: webhook
Steps: 6
Setup: ~25 min

How it runs

The automated pipeline, trigger to output.

  • Trigger: Datadog cache-hit monitor alert (Datadog)
  • Action: Confirm drop via Cloudflare cache-status (Cloudflare)
  • Logic: Proceed only if regression is sustained
  • Action: Fetch most recent Vercel deploy (Vercel)
  • Action: Open PagerDuty incident with context (PagerDuty)
  • Output: Mirror incident summary to on-call Slack (Slack)

What it does

This workflow upgrades a raw Datadog monitor into an actionable page. When the cache-hit-ratio monitor trips, it pulls live Cloudflare cache-status detail and the most recent Vercel deploy, confirms the regression is sustained rather than a momentary blip, and opens a PagerDuty incident pre-loaded with the likely cause and the numbers on-call needs.
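The "sustained rather than a momentary blip" check can be sketched as a pair of small functions. This is a minimal illustration, not the workflow's actual logic: the function names, the choice of which cache statuses count as cached, the threshold, and the three-sample confirmation window are all assumptions to adapt.

```python
from typing import Iterable, Mapping, Sequence


def cache_hit_ratio(
    breakdown: Mapping[str, int],
    cached_statuses: Iterable[str] = ("hit",),  # assumption: only HIT counts as cached
) -> float:
    """Share of requests served from cache, given request counts per cache status."""
    total = sum(breakdown.values())
    if total == 0:
        # No traffic in the window: treat as healthy rather than as a regression.
        return 1.0
    wanted = {s.lower() for s in cached_statuses}
    cached = sum(n for status, n in breakdown.items() if status.lower() in wanted)
    return cached / total


def is_sustained(ratios: Sequence[float], threshold: float, min_samples: int = 3) -> bool:
    """True only if each of the last `min_samples` ratios is below `threshold`."""
    if len(ratios) < min_samples:
        return False
    return all(r < threshold for r in ratios[-min_samples:])
```

With these definitions, a single recovered sample inside the window is enough to suppress the page, which is the behaviour a noisy threshold monitor usually needs.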

When to use it

Use it when a single Datadog threshold breach is too noisy to page on directly, but a genuine, sustained cache-hit drop must wake someone. It suits teams who already alert on Cloudflare metrics in Datadog and want enriched, deduplicated escalation instead of a bare monitor notification.
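One way to get deduplicated escalation is to send PagerDuty a stable `dedup_key`, so repeated triggers for the same monitor update one incident instead of opening several. The sketch below builds a trigger event body in the shape of the PagerDuty Events API v2; the helper name, the key format, the severity, and the `custom_details` fields are illustrative assumptions, and nothing is sent over the network.

```python
from typing import Any, Dict, Optional


def build_pagerduty_event(
    routing_key: str,
    monitor_id: str,
    hit_ratio: float,
    threshold: float,
    deploy: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Build an Events API v2 'trigger' body with cache numbers and the suspect deploy."""
    return {
        "routing_key": routing_key,
        "event_action": "trigger",
        # Assumption: one open incident per monitor, so the key is derived from its ID.
        "dedup_key": f"cache-regression-{monitor_id}",
        "payload": {
            "summary": f"Cache-hit ratio {hit_ratio:.1%} below {threshold:.1%}",
            "source": "cloudflare",
            "severity": "critical",  # assumption: sustained regressions always page
            "custom_details": {
                "hit_ratio": hit_ratio,
                "threshold": threshold,
                "suspect_deploy": deploy,  # hypothetical shape; pass whatever Vercel returned
            },
        },
    }
```

The resulting dictionary would be serialized as JSON and posted to the Events API enqueue endpoint with the integration's routing key.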

How it works

  1. A Datadog monitor alert webhook for the cache-hit-ratio monitor fires.
  2. Re-query Cloudflare for the current cache-status breakdown to confirm the drop.
  3. Branch: proceed only if the regression has persisted across the confirmation window.
  4. Fetch the most recent Vercel deploy as the prime suspect.
  5. Open a PagerDuty incident with the cache numbers and suspect deploy attached.
  6. Mirror the incident summary to the on-call Slack channel for visibility.
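The six steps above could be wired together roughly as follows. This is a sketch under stated assumptions: each external call (Cloudflare, Vercel, PagerDuty, Slack) is injected as a plain callable so the control flow can be read and tested without real credentials, and the names `Steps` and `handle_alert`, the `monitor_id` field, and the three-sample window are hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence


@dataclass
class Steps:
    """Injected integrations; every field is a stand-in for a real API call."""
    fetch_ratios: Callable[[], Sequence[float]]         # step 2: recent cache-hit ratios
    fetch_latest_deploy: Callable[[], Optional[dict]]   # step 4: most recent deploy
    open_incident: Callable[[dict], str]                # step 5: returns an incident ID
    post_to_slack: Callable[[str], None]                # step 6: on-call channel message


def handle_alert(
    alert: dict, steps: Steps, threshold: float, min_samples: int = 3
) -> Optional[str]:
    """Run steps 2 to 6 for one incoming monitor alert (step 1 is the webhook itself)."""
    recent = list(steps.fetch_ratios())[-min_samples:]
    # Step 3: stop unless every sample in the confirmation window is below threshold.
    if len(recent) < min_samples or any(r >= threshold for r in recent):
        return None
    deploy = steps.fetch_latest_deploy()
    context = {
        "monitor": alert.get("monitor_id"),
        "ratios": recent,
        "suspect_deploy": deploy,
    }
    incident_id = steps.open_incident(context)
    steps.post_to_slack(
        f"Incident {incident_id}: cache-hit ratio below {threshold:.0%}; "
        f"suspect deploy: {deploy}"
    )
    return incident_id
```

Returning `None` on a blip keeps the branch explicit: nothing downstream runs, so no incident is opened and no Slack message is posted.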

Set it up

What you configure once, before turning it on.

  1. Connect Datadog: metrics, traces, log search.
  2. Connect Cloudflare: Workers, Pages, R2, KV (the edge stack).
  3. Connect Vercel: deploys, runtime logs, analytics.
  4. Connect PagerDuty: incidents, on-call, escalations.
  5. Connect Slack: channels, DMs, threads, mentions.
  6. Set each agent's model: models are left unset so you can pick the tier, fast and cheap or top-quality.
  7. Tune it to your data: edit the prompts, filters, and field mappings so it matches how your team works.
  8. Test, then turn it on: run once against a sample, confirm the output, then enable the trigger.
