DEVOPS

Datadog Cache-Efficiency Alert to GitLab Rollback Triage

Receives a Datadog monitor alert when Cloudflare cache hit ratio breaches its SLO, then auto-triages the suspect config change and drafts a GitLab rollback MR for human approval.

Category: DevOps
Engine: sim
Difficulty: intermediate
Trigger: webhook
Steps: 6
Setup: ~15 min

How it runs

The automated pipeline, trigger to output.

  • Trigger: Datadog cache-SLO monitor webhook (Datadog)
  • Action: Confirm ratio + list ruleset changes in window (Cloudflare)
  • Logic: Config-driven breach (recent ruleset edit)?
  • Action: Map ruleset change to GitLab commit (GitLab)
  • Action: Open draft revert MR linking the alert (GitLab)
  • Output: Notify channel with alert + suspect commit + MR (Slack)
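
The Logic step in the pipeline above can be sketched as a simple window check. This is a minimal illustration, not the workflow's actual implementation: the function name `is_config_driven` and the idea of passing in a plain list of change timestamps are assumptions made for the example.

```python
from datetime import datetime, timezone

def is_config_driven(window_start, window_end, change_times):
    """Return True if at least one ruleset change falls inside the alert window.

    Hypothetical helper: change_times stands in for whatever timestamps the
    Cloudflare step returns for ruleset edits.
    """
    return any(window_start <= t <= window_end for t in change_times)

# Example: one ruleset edit ten minutes into a 30-minute alert window.
start = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
end = datetime(2024, 1, 1, 12, 30, tzinfo=timezone.utc)
changes = [datetime(2024, 1, 1, 12, 10, tzinfo=timezone.utc)]

config_driven = is_config_driven(start, end, changes)
traffic_driven = not is_config_driven(start, end, [])
```

If no change lands in the window, the breach is treated as traffic-driven and the GitLab steps would be skipped.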

What it does

Instead of polling, this workflow reacts to a Datadog monitor you already own. When Datadog reports a cache-hit-ratio SLO breach via webhook, the workflow pulls the alert window, asks Cloudflare which cache rules changed in that window, matches the change to a GitLab commit, and opens a draft rollback merge request for an engineer to review and merge.
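
As a rough sketch of the "pulls the alert window" step, the snippet below parses a webhook body and derives a lookback window. Datadog webhook payloads are defined by a template you write yourself, so the field names `alert_id`, `triggered_at_ms`, and `link`, the 30-minute lookback, and the example URL are all assumptions for illustration.

```python
import json
from datetime import datetime, timedelta, timezone

def alert_window(raw_body, lookback_minutes=30):
    """Parse a webhook body and return the time window to inspect.

    Field names are hypothetical; adjust them to match your own
    Datadog webhook payload template.
    """
    payload = json.loads(raw_body)
    fired_at = datetime.fromtimestamp(
        int(payload["triggered_at_ms"]) / 1000, tz=timezone.utc
    )
    return {
        "alert_id": str(payload["alert_id"]),
        "alert_url": payload.get("link", ""),
        "start": fired_at - timedelta(minutes=lookback_minutes),
        "end": fired_at,
    }

# Sample body with an epoch-millisecond trigger time (2024-01-01 12:00 UTC).
sample = json.dumps({
    "alert_id": "12345",
    "triggered_at_ms": 1704110400000,
    "link": "https://example.invalid/monitors/12345",
})
window = alert_window(sample)
```

The resulting `start` and `end` values are what the Cloudflare step would use to filter ruleset changes.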

When to use it

Use it when your cache SLOs already live in Datadog and you want the alert to do more than page someone: it should arrive with the likely root-cause commit and a pre-built rollback attached. It also keeps Datadog as the single source of alerting truth.
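
To make "arrives with the commit and rollback attached" concrete, here is one possible shape for the notification. It builds a body with a single `text` field using Slack's `<url|label>` link syntax; the function name, wording, and example URLs are assumptions, and the actual message is whatever you configure in the output step.

```python
def slack_message(alert_url, commit_sha, mr_url):
    """Build a Slack message body carrying the alert, suspect commit, and MR.

    Hypothetical helper; the layout is illustrative only.
    """
    text = (
        "Cache hit ratio SLO breach\n"
        f"Alert: <{alert_url}|Datadog monitor>\n"
        f"Suspect commit: `{commit_sha[:8]}`\n"
        f"Draft rollback MR: <{mr_url}|review and merge>"
    )
    return {"text": text}

msg = slack_message(
    "https://example.invalid/monitors/12345",
    "a1b2c3d4e5f6",
    "https://example.invalid/group/project/-/merge_requests/1",
)
```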

How it works

  1. Datadog monitor webhook triggers the workflow on a cache-hit-ratio breach.
  2. Cloudflare confirms the current ratio and lists ruleset changes inside the alert window.
  3. A logic step verifies the breach is config-driven (a recent ruleset edit exists) rather than traffic-driven.
  4. GitLab maps the ruleset change to its originating commit.
  5. GitLab opens a draft revert MR linking the Datadog alert.
  6. Slack notifies the channel with the alert, suspect commit, and draft MR.
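
Steps 4 and 5 above could look something like the following sketch. It assumes the Cloudflare rules are managed as code in the GitLab repository, so the suspect is the most recent commit touching the rules file at or before the ruleset change. The `Commit` type, the file path, and the branch naming are hypothetical. The returned dict uses field names from GitLab's create-merge-request API (`source_branch`, `target_branch`, `title`, `description`), and the `Draft:` title prefix is how GitLab marks a merge request as a draft; nothing is sent over the network here.

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass
class Commit:
    sha: str
    committed_at: datetime
    paths: tuple  # files touched by the commit

def suspect_commit(commits, change_time, rules_path):
    """Pick the latest commit touching rules_path at or before change_time."""
    candidates = [
        c for c in commits
        if rules_path in c.paths and c.committed_at <= change_time
    ]
    return max(candidates, key=lambda c: c.committed_at, default=None)

def draft_revert_mr(commit, alert_url, target_branch="main"):
    """Build the body for a draft revert MR that links the Datadog alert."""
    short = commit.sha[:8]
    return {
        "source_branch": f"revert-{short}",  # hypothetical naming scheme
        "target_branch": target_branch,
        "title": f"Draft: Revert {short} (cache hit ratio SLO breach)",
        "description": f"Reverts {commit.sha}.\n\nDatadog alert: {alert_url}",
    }

utc = timezone.utc
commits = [
    Commit("a1b2c3d4e5f6", datetime(2024, 1, 1, 11, 40, tzinfo=utc),
           ("cloudflare/cache_rules.tf",)),
    Commit("0f9e8d7c6b5a", datetime(2024, 1, 1, 11, 55, tzinfo=utc),
           ("README.md",)),
]
change_time = datetime(2024, 1, 1, 11, 45, tzinfo=utc)
suspect = suspect_commit(commits, change_time, "cloudflare/cache_rules.tf")
mr = draft_revert_mr(suspect, "https://example.invalid/monitors/12345")
```

Because the MR stays in draft state, merging remains a human decision; the workflow only prepares the rollback.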

Set it up

What you configure once, before turning it on.

  1. Connect Datadog: Metrics, traces, log search.
  2. Connect Cloudflare: Workers, Pages, R2, KV (the edge stack).
  3. Connect GitLab: Repos, MRs, pipelines, registry.
  4. Connect Slack: Channels, DMs, threads, mentions.
  5. Set each agent's model: We leave models unset so you pick the tier: fast + cheap, or top-quality.
  6. Tune it to your data: Edit the prompts, filters, and field mappings so it matches how your team works.
  7. Test, then turn it on: Run once against a sample, confirm the output, then enable the trigger.
