
Datadog team budget breach escalation

Tracks month-to-date Datadog spend per team against assigned budgets and escalates via PagerDuty plus Slack when a team is projected to overshoot its monthly cap.

Category: Other
Engine: sim
Difficulty: advanced
Trigger: schedule
Steps: 5
Setup: ~25 min

How it runs

The automated pipeline, trigger to output.

  • Trigger: Multiple-times-daily schedule
  • Action: Fetch month-to-date cost by team (Datadog)
  • Logic: Project month-end and classify vs budget
  • Action: Open PagerDuty incident for hard breaches (PagerDuty)
  • Output: Post tiered budget digest to Slack (Slack)

What it does

This workflow enforces per-team observability budgets. Several times a day it compares each team's month-to-date Datadog spend against its assigned budget, projects end-of-month spend from the current run rate, and escalates the teams on track to blow their cap — paging on-call for hard breaches and posting a softer heads-up for early warnings.

When to use it

Use it when teams have committed observability budgets and overruns need an owner and a response, not just a dashboard nobody checks. Best where finance has set real caps and wants enforcement with escalation tiers.

How it works

  1. A schedule runs the check a few times per day.
  2. The Datadog action fetches month-to-date cost grouped by team tag.
  3. A logic step projects month-end spend from the run rate and classifies each team as ok, warning, or breach against its budget.
  4. For hard breaches a PagerDuty action opens an incident routed to the team's service.
  5. The output step posts a tiered budget-status digest to Slack, highlighting warnings and confirming the paged breaches.
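
The projection and classification in step 3 could be sketched roughly as follows. This is a minimal illustration, not the workflow's actual logic: the linear run-rate formula, the 90% warning threshold (`warn_ratio`), and the names `project_month_end`, `classify`, and `evaluate` are all assumptions made for the example.

```python
import calendar
from dataclasses import dataclass
from datetime import date


@dataclass
class TeamStatus:
    team: str
    mtd_spend: float
    budget: float
    projected: float
    status: str  # "ok", "warning", or "breach"


def project_month_end(mtd_spend: float, today: date) -> float:
    """Linear run-rate projection (assumed): average daily spend so far,
    counting today as a full day, multiplied by the days in the month."""
    days_in_month = calendar.monthrange(today.year, today.month)[1]
    return mtd_spend / today.day * days_in_month


def classify(projected: float, budget: float, warn_ratio: float = 0.9) -> str:
    """Hypothetical tiers: breach if the projection exceeds the budget,
    warning if it reaches warn_ratio of the budget, otherwise ok."""
    if projected > budget:
        return "breach"
    if projected >= warn_ratio * budget:
        return "warning"
    return "ok"


def evaluate(mtd_by_team: dict, budgets: dict, today: date) -> list:
    """Project and classify every team that has month-to-date spend."""
    results = []
    for team, spend in sorted(mtd_by_team.items()):
        budget = budgets[team]
        projected = project_month_end(spend, today)
        status = classify(projected, budget)
        results.append(TeamStatus(team, spend, budget, projected, status))
    return results
```

In this sketch, only teams whose status is `"breach"` would go to the PagerDuty step, while `"warning"` teams would appear in the Slack digest alone. A linear run rate is the simplest choice; it overreacts early in the month, when a single large day dominates the average, so a real deployment might require a minimum number of elapsed days before paging.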

Set it up

What you configure once, before turning it on.

  1. Connect Datadog: Metrics, traces, log search.
  2. Connect PagerDuty: Incidents, on-call, escalations.
  3. Connect Slack: Channels, DMs, threads, mentions.
  4. Set each agent's model: We leave models unset so you pick the tier (fast + cheap, or top-quality).
  5. Tune it to your data: Edit the prompts, filters, and field mappings so it matches how your team works.
  6. Test, then turn it on: Run once against a sample, confirm the output, then enable the trigger.
