AI AGENTS
Cache-hit-ratio collapse escalator
When Cloudflare's cache-hit ratio drops below a configured floor, an agent confirms the regression in real time, identifies the route that lost caching, and points to the deploy that likely caused it.
How it runs
The automated pipeline, trigger to output.
- Trigger: Fast-interval cache-ratio check
- Action: Read current Cloudflare cache-hit ratio (Cloudflare)
- Logic: Fire only on sustained sub-floor drop
- Action: Find route with largest HIT-to-MISS swing (Cloudflare)
- Action: Attribute to recent Vercel deploy + draft incident (OpenAI)
- Output: Trigger PagerDuty incident for on-call (PagerDuty)
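The "sustained sub-floor drop" gate is the step that keeps a single noisy sample from paging anyone. A minimal sketch of that logic follows, assuming the ratio arrives as a float between 0 and 1; the class name `SustainedDropGate`, the 0.80 floor, and the sample values are illustrative assumptions, not part of the workflow template.

```python
from collections import deque


class SustainedDropGate:
    """Fires only when the last N samples were all below the floor."""

    def __init__(self, floor: float, required_samples: int = 2) -> None:
        if not 0.0 <= floor <= 1.0:
            raise ValueError("floor must be a ratio between 0 and 1")
        if required_samples < 1:
            raise ValueError("required_samples must be at least 1")
        self.floor = floor
        self.required_samples = required_samples
        # Keeps only the most recent N below-floor flags.
        self._recent = deque(maxlen=required_samples)

    def observe(self, hit_ratio: float) -> bool:
        """Record one sample and report whether the gate should fire."""
        self._recent.append(hit_ratio < self.floor)
        return len(self._recent) == self.required_samples and all(self._recent)


# Hypothetical samples: two isolated dips, then a sustained drop.
gate = SustainedDropGate(floor=0.80, required_samples=2)
decisions = [gate.observe(r) for r in (0.93, 0.71, 0.90, 0.62, 0.58)]
print(decisions)  # [False, False, False, False, True]
```

The isolated dip to 0.71 does not fire because the next sample recovers; only the second consecutive sub-floor sample (0.58 after 0.62) opens the gate.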
What it does
Treats a sudden cache-hit-ratio collapse as the leading indicator of a runaway bill and pages on-call before the cost lands. It confirms the drop isn't noise, finds the specific route that stopped caching, and points at the deploy that changed its cache headers.
When to use it
When a cache misconfiguration can quietly 10x your origin and CDN cost within hours, and you want a real-time page with the root-cause route rather than a next-day cost surprise.
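To see where a figure like 10x can come from, note that origin traffic scales with the miss ratio, not the hit ratio. The sketch below works through that arithmetic under simplifying assumptions: total request volume stays constant and cost is proportional to origin requests. The function name and the 95% and 50% figures are illustrative only.

```python
def origin_load_multiplier(baseline_hit_ratio: float, current_hit_ratio: float) -> float:
    """Return how many times more requests reach the origin after a drop.

    Origin requests are proportional to the miss ratio (1 - hit ratio),
    so the multiplier is current_miss / baseline_miss.
    """
    for ratio in (baseline_hit_ratio, current_hit_ratio):
        if not 0.0 <= ratio <= 1.0:
            raise ValueError("hit ratios must be between 0 and 1")
    baseline_miss = 1.0 - baseline_hit_ratio
    current_miss = 1.0 - current_hit_ratio
    if baseline_miss <= 0.0:
        raise ValueError("baseline miss ratio is zero; multiplier is undefined")
    return current_miss / baseline_miss


# Hypothetical example: hit ratio falls from 95% to 50%.
multiplier = origin_load_multiplier(0.95, 0.50)
print(f"{multiplier:.1f}x")  # 10.0x
```

A healthy cache makes the effect larger, not smaller: the higher the baseline hit ratio, the more a given drop multiplies origin load.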
How it works
1. A scheduled fast-interval check reads the current Cloudflare cache-hit ratio.
2. A logic gate fires only when the ratio falls below the floor and stays there across two consecutive samples.
3. The agent queries Cloudflare for routes with the largest HIT-to-MISS swing in the window.
4. It cross-references the timing against the most recent Vercel deployment to attribute the regression.
5. An OpenAI step writes a tight incident summary with the offending path and a rollback or cache-header hint.
6. A PagerDuty incident is triggered for the on-call engineer.
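The route ranking and deploy attribution steps can be sketched as below. This is an illustration under stated assumptions, not the workflow's actual implementation: it assumes per-route HIT and MISS counts have already been fetched for a baseline window and the current window, it defines "swing" as the drop in hit ratio weighted by current request volume, and it attributes the drop to the latest deploy within a one-hour lookback. The types `RouteStats` and `Deploy`, the lookback, and all sample data are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional, Sequence, Tuple


@dataclass(frozen=True)
class RouteStats:
    route: str
    hits: int
    misses: int

    @property
    def hit_ratio(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0


@dataclass(frozen=True)
class Deploy:
    deploy_id: str
    created_at: datetime


def largest_swing(
    baseline: Sequence[RouteStats], current: Sequence[RouteStats]
) -> Optional[Tuple[str, float]]:
    """Return (route, swing) for the route whose caching regressed most."""
    base_by_route = {s.route: s for s in baseline}
    best: Optional[Tuple[str, float]] = None
    for cur in current:
        base = base_by_route.get(cur.route)
        if base is None:
            continue  # New route: no baseline to compare against.
        # Approximate number of requests that flipped from HIT to MISS.
        swing = (base.hit_ratio - cur.hit_ratio) * (cur.hits + cur.misses)
        if swing > 0 and (best is None or swing > best[1]):
            best = (cur.route, swing)
    return best


def attribute_deploy(
    drop_started_at: datetime,
    deploys: Sequence[Deploy],
    lookback: timedelta = timedelta(hours=1),
) -> Optional[Deploy]:
    """Return the latest deploy that landed shortly before the drop, if any."""
    candidates = [
        d for d in deploys
        if drop_started_at - lookback <= d.created_at <= drop_started_at
    ]
    return max(candidates, key=lambda d: d.created_at, default=None)


# Hypothetical sample data.
baseline = [RouteStats("/api/products", 9000, 1000), RouteStats("/assets/app.js", 4900, 100)]
current = [RouteStats("/api/products", 1000, 9000), RouteStats("/assets/app.js", 4850, 150)]
noon = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
deploys = [
    Deploy("dpl_a", noon - timedelta(minutes=50)),
    Deploy("dpl_b", noon - timedelta(minutes=8)),
    Deploy("dpl_c", noon + timedelta(minutes=20)),  # After the drop; ignored.
]

worst = largest_swing(baseline, current)
suspect = attribute_deploy(noon, deploys)
print(worst[0], suspect.deploy_id)  # /api/products dpl_b
```

Weighting by request volume keeps a low-traffic route with a dramatic ratio change from outranking a busy route that is driving most of the new origin load. Timing alone only suggests a culprit, which is why the summary step describes the deploy as likely rather than confirmed.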
Set it up
What you configure once, before turning it on.
1. Connect Cloudflare: Workers, Pages, R2, KV (the edge stack).
2. Connect Vercel: deploys, runtime logs, analytics.
3. Connect OpenAI: models, embeddings, files.
4. Connect PagerDuty: incidents, on-call, escalations.
5. Set each agent's model: models are left unset so you pick the tier, fast and cheap or top quality.
6. Tune it to your data: edit the prompts, filters, and field mappings so it matches how your team works.
7. Test, then turn it on: run once against a sample, confirm the output, then enable the trigger.
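For the test run, it can help to inspect the event you expect to reach PagerDuty before enabling the trigger. The sketch below builds a trigger event in the shape of the PagerDuty Events API v2 (`routing_key`, `event_action`, `dedup_key`, `payload`) and prints it without sending anything. The function name, the `source` string, the dedup key scheme, and the sample values are assumptions; the workflow's own output mapping may differ.

```python
import json


def build_pagerduty_event(
    routing_key: str, route: str, deploy_id: str, hit_ratio: float, floor: float
) -> dict:
    """Build an Events API v2 style trigger event (dry run, nothing is sent)."""
    summary = (
        f"Cache-hit ratio {hit_ratio:.0%} below floor {floor:.0%}; "
        f"{route} stopped caching (suspect deploy {deploy_id})"
    )
    return {
        "routing_key": routing_key,
        "event_action": "trigger",
        # Same route maps to the same incident, so repeat checks do not re-page.
        "dedup_key": f"cache-collapse:{route}",
        "payload": {
            "summary": summary[:1024],  # PagerDuty caps summary length.
            "source": "cloudflare-cache-monitor",  # Hypothetical source name.
            "severity": "critical",
            "custom_details": {
                "route": route,
                "suspect_deploy": deploy_id,
                "hit_ratio": hit_ratio,
                "floor": floor,
            },
        },
    }


# Hypothetical sample; replace the placeholder key with a real integration key.
event = build_pagerduty_event("YOUR_ROUTING_KEY", "/api/products", "dpl_b", 0.58, 0.80)
print(json.dumps(event, indent=2))
```

Sending the event for real would mean POSTing this JSON to the Events API v2 enqueue endpoint; keeping the builder separate from the send makes the sample run safe to repeat.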
