
Escalate Datadog cache-regression alerts to PagerDuty with deploy context

When a Datadog monitor on Cloudflare cache-hit ratio fires, this workflow enriches the alert with the current cache breakdown and the last deploy, then escalates it to PagerDuty.

Category: DevOps

Engine: sim

Difficulty: advanced

Trigger: webhook

Steps: 6

Setup: ~25 min

How it runs

The automated pipeline, trigger to output.

Trigger: Datadog cache-hit monitor alert (Datadog)
Action: Confirm drop via Cloudflare cache-status (Cloudflare)
Logic: Proceed only if regression is sustained
Action: Fetch most recent Vercel deploy (Vercel)
Action: Open PagerDuty incident with context (PagerDuty)
Output: Mirror incident summary to on-call Slack (Slack)
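The "confirm drop" action depends on turning a cache-status breakdown into a single ratio. Below is a minimal sketch of that calculation, not the workflow's actual implementation. The function names, the 0.80 threshold, and the choice of which statuses count as hits or as uncacheable are all assumptions to adjust to your own definition.

```python
from typing import Dict, Mapping

# Assumption: statuses treated as "served from cache". Adjust as needed.
HIT_STATUSES = {"hit", "stale", "revalidated", "updating"}
# Assumption: statuses excluded because the response was never cacheable.
UNCACHEABLE_STATUSES = {"dynamic", "bypass"}


def cache_hit_ratio(breakdown: Mapping[str, int]) -> float:
    """Return hits / cacheable requests, or 1.0 when there is no cacheable traffic."""
    counts: Dict[str, int] = {}
    for status, n in breakdown.items():
        key = status.lower()  # normalize "HIT" and "hit" to the same bucket
        counts[key] = counts.get(key, 0) + n

    cacheable = sum(n for s, n in counts.items() if s not in UNCACHEABLE_STATUSES)
    if cacheable == 0:
        return 1.0  # nothing cacheable was requested, so nothing regressed
    hits = sum(n for s, n in counts.items() if s in HIT_STATUSES)
    return hits / cacheable


def drop_confirmed(breakdown: Mapping[str, int], threshold: float = 0.80) -> bool:
    """True when the live ratio is below the (hypothetical) alert threshold."""
    return cache_hit_ratio(breakdown) < threshold
```

Excluding uncacheable traffic from the denominator keeps a surge of dynamic requests from looking like a cache regression.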

What it does

This workflow upgrades a raw Datadog monitor into an actionable page. When the cache-hit-ratio monitor trips, it pulls live Cloudflare cache-status detail and the most recent Vercel deploy, confirms the regression is sustained rather than a momentary blip, and opens a PagerDuty incident pre-loaded with the likely cause and the numbers on-call needs.
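To illustrate what "pre-loaded with the likely cause and the numbers" could look like, here is a sketch that builds a PagerDuty Events API v2 trigger event as a plain dictionary. The function name, the dedup-key format, the severity, and the shape of `suspect_deploy` are assumptions for illustration. Only the top-level event fields (`routing_key`, `event_action`, `dedup_key`, `payload`) follow the Events API v2 format.

```python
import json
from typing import Any, Dict


def build_pagerduty_event(
    routing_key: str,
    monitor_id: str,
    hit_ratio: float,
    baseline_ratio: float,
    suspect_deploy: Dict[str, Any],
) -> Dict[str, Any]:
    """Build an Events API v2 trigger event carrying the cache numbers and suspect deploy."""
    summary = (
        f"Cache-hit ratio {hit_ratio:.0%} (baseline {baseline_ratio:.0%}); "
        f"suspect deploy {suspect_deploy['id']}"
    )
    return {
        "routing_key": routing_key,
        "event_action": "trigger",
        # Assumption: one open incident per monitor. Repeat alerts for the
        # same monitor reuse this key, so they update rather than re-page.
        "dedup_key": f"cache-regression-{monitor_id}",
        "payload": {
            "summary": summary,
            "source": "cloudflare",
            "severity": "critical",  # hypothetical choice; pick your own level
            "custom_details": {
                "hit_ratio": hit_ratio,
                "baseline_ratio": baseline_ratio,
                "suspect_deploy": suspect_deploy,
            },
        },
    }


if __name__ == "__main__":
    event = build_pagerduty_event(
        "YOUR_ROUTING_KEY", "12345", 0.62, 0.95,
        {"id": "dpl_example", "url": "example.vercel.app"},
    )
    # The event would be POSTed as JSON to https://events.pagerduty.com/v2/enqueue
    print(json.dumps(event, indent=2))
```

A stable `dedup_key` is what gives the deduplicated escalation: a monitor that keeps re-firing lands on the same incident.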

When to use it

Use it when a single Datadog threshold breach is too noisy to page on directly, but a genuine, sustained cache-hit drop must wake someone. It suits teams who already alert on Cloudflare metrics in Datadog and want enriched, deduplicated escalation instead of a bare monitor notification.

How it works

1. A Datadog monitor alert webhook for the cache-hit-ratio monitor fires.
2. Re-query Cloudflare for the current cache-status breakdown to confirm the drop.
3. Branch: proceed only if the regression has persisted across the confirmation window.
4. Fetch the most recent Vercel deploy as the prime suspect.
5. Open a PagerDuty incident with the cache numbers and suspect deploy attached.
6. Mirror the incident summary to the on-call Slack channel for visibility.
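Steps 3 and 4 can be sketched as two small pure functions. This is one possible reading, not the workflow's own logic: "sustained" is taken to mean every sample in the confirmation window is below the threshold, and the deploy records are assumed to be dictionaries with a `created` timestamp (epoch milliseconds) and a `state` field. Verify those field names against the deploy data you actually receive.

```python
from typing import Any, Dict, List, Optional, Sequence


def is_sustained(
    samples: Sequence[float], threshold: float, min_samples: int = 3
) -> bool:
    """True only if the window has enough samples and every one is below threshold.

    `min_samples` is a hypothetical guard so a single low reading cannot page.
    """
    return len(samples) >= min_samples and all(s < threshold for s in samples)


def latest_ready_deploy(deploys: List[Dict[str, Any]]) -> Optional[Dict[str, Any]]:
    """Pick the newest deploy that finished successfully, or None if there is none.

    Assumes each record has "created" (epoch ms) and "state" keys.
    """
    ready = [d for d in deploys if d.get("state") == "READY"]
    return max(ready, key=lambda d: d["created"], default=None)


if __name__ == "__main__":
    window = [0.61, 0.63, 0.60]  # hit ratios sampled across the confirmation window
    deploys = [
        {"id": "dpl_a", "created": 1_700_000_000_000, "state": "READY"},
        {"id": "dpl_b", "created": 1_700_000_600_000, "state": "READY"},
        {"id": "dpl_c", "created": 1_700_000_900_000, "state": "ERROR"},
    ]
    if is_sustained(window, threshold=0.80):
        print(latest_ready_deploy(deploys))
```

Filtering on a finished state keeps a failed or still-building deploy from being named as the suspect, since it never served traffic.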

Set it up

What you configure once, before turning it on.

1. Connect Datadog. Metrics, traces, log search.
2. Connect Cloudflare. Workers, Pages, R2, KV (the edge stack).
3. Connect Vercel. Deploys, runtime logs, analytics.
4. Connect PagerDuty. Incidents, on-call, escalations.
5. Connect Slack. Channels, DMs, threads, mentions.
6. Set each agent's model. We leave models unset so you pick the tier: fast and cheap, or top quality.
7. Tune it to your data. Edit the prompts, filters, and field mappings so it matches how your team works.
8. Test, then turn it on. Run once against a sample, confirm the output, then enable the trigger.
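For the "run once against a sample" step, a dry run might look like the sketch below. Datadog webhook bodies are defined by a payload template you write yourself, so every field name here (`alert_id`, `transition`, `title`, `value`) is an assumption standing in for whatever your template emits.

```python
import json
from typing import Any, Dict, Optional

# Hypothetical sample body; match the keys to your own webhook payload template.
SAMPLE_WEBHOOK_BODY = json.dumps({
    "alert_id": "12345",
    "transition": "Triggered",
    "title": "Cache-hit ratio below threshold",
    "value": 0.62,
})


def parse_alert(body: str) -> Optional[Dict[str, Any]]:
    """Map a webhook body to the fields later steps use.

    Returns None for anything other than a fresh trigger (for example a
    recovery notification), so the rest of the pipeline does not run.
    """
    event = json.loads(body)
    if event.get("transition") != "Triggered":
        return None
    return {
        "monitor_id": str(event["alert_id"]),
        "title": event["title"],
        "hit_ratio": float(event["value"]),
    }


if __name__ == "__main__":
    print(parse_alert(SAMPLE_WEBHOOK_BODY))
```

Running this against a saved sample before enabling the trigger confirms the field mappings line up with what the monitor actually sends.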


