
Pre-Deploy Error-Rate Baseline Check With Datadog

Before a deploy job runs, this workflow queries Datadog for the target service's current error rate and latency, and blocks or warns if the service is already degraded.

Category: DevOps
Engine: sim
Difficulty: intermediate
Trigger: webhook
Steps: 5
Setup: ~15 min

How it runs

The automated pipeline, trigger to output.

  • Trigger (HTTP webhook): Pre-deploy check requested (incoming webhook)
  • Action (Datadog): Query current error rate and latency
  • Logic: Compare metrics to healthy baseline
  • Logic: Decide allow vs. block deploy
  • Output (Slack): Return decision to CI and notify Slack
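
As a rough illustration of the Datadog action, the sketch below builds (but does not send) a request to Datadog's v1 timeseries query endpoint for a trailing 15-minute window. The helper name `build_query_request`, the US1 host, and the metric query in the usage note are assumptions made for illustration; the workflow's built-in Datadog connector may work differently.

```python
import time
from typing import Optional
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: the US1 Datadog site. Other Datadog sites use a different host.
DATADOG_QUERY_URL = "https://api.datadoghq.com/api/v1/query"


def build_query_request(
    query: str,
    api_key: str,
    app_key: str,
    window_seconds: int = 900,
    now: Optional[float] = None,
) -> Request:
    """Build a GET request for a metric query over the trailing window.

    `from` and `to` are POSIX timestamps in seconds; 900 seconds is the
    15-minute lookback used by this workflow.
    """
    end = int(now if now is not None else time.time())
    start = end - window_seconds
    params = urlencode({"from": start, "to": end, "query": query})
    return Request(
        f"{DATADOG_QUERY_URL}?{params}",
        headers={"DD-API-KEY": api_key, "DD-APPLICATION-KEY": app_key},
    )
```

A call such as `build_query_request("sum:trace.http.request.errors{service:checkout,env:prod}.as_count()", api_key, app_key)` returns a request that `urllib.request.urlopen` could send. The metric and tag names in that query are placeholders and depend on how the service is instrumented.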

What it does

This workflow runs as a pre-deploy gate: it asks Datadog whether the target service is currently healthy and returns a clear allow or block decision to your CI pipeline so deploys onto an already-degraded service get stopped automatically.
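
On the CI side, acting on the decision can be as small as mapping the webhook response to an exit code. The sketch below assumes the response body is JSON with a `decision` field set to `"allow"` or `"block"`; that field name and the exit codes are illustrative assumptions, not a documented response format.

```python
import json


def exit_code_for(response_body: str) -> int:
    """Map the gate's response to a process exit code for the CI job.

    Assumed (hypothetical) response shape: {"decision": "allow" | "block", ...}
    Returns 0 to let the deploy proceed, 1 to stop it, and 2 when the
    response cannot be interpreted.
    """
    try:
        payload = json.loads(response_body)
    except json.JSONDecodeError:
        return 2  # unreadable response; treated as a failure in this sketch
    if not isinstance(payload, dict):
        return 2
    if payload.get("decision") == "allow":
        return 0
    return 1  # "block" or anything unexpected stops the deploy
```

Treating an unreadable response as a failure is one possible policy. A team that prefers the gate to warn rather than block could return 0 in that case and log the problem instead.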

When to use it

Use it when deploys sometimes land while a service is mid-incident, making it impossible to tell whether the new release or the existing problem caused the next page.

How it works

  1. The CI deploy job calls an incoming webhook with the service name and environment.
  2. A Datadog step queries the last 15 minutes of error rate, p95 latency, and saturation metrics.
  3. A logic step compares each metric against its healthy baseline.
  4. A branch returns allow when all metrics are nominal, or block when any breaches.
  5. The decision and the breaching metric are posted to Slack and returned in the webhook response for CI to act on.
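
The comparison and branching steps above can be sketched as a small function. The threshold values, metric keys, and the `Decision` shape are hypothetical; the workflow itself does not prescribe them, and real baselines should come from the service's own history.

```python
from dataclasses import dataclass

# Hypothetical baseline ceilings; tune these to the service's normal behaviour.
BASELINE = {
    "error_rate": 0.01,        # fraction of requests that fail
    "p95_latency_ms": 500.0,   # milliseconds
    "saturation": 0.80,        # fraction of capacity in use
}


@dataclass
class Decision:
    decision: str        # "allow" or "block"
    breaches: list[str]  # metrics that exceeded their baseline


def decide(metrics: dict[str, float], baseline: dict[str, float] = BASELINE) -> Decision:
    """Return allow when every metric is at or under its baseline, else block.

    A metric missing from `metrics` counts as a breach, so the gate fails
    closed when Datadog returns no data. That is a design choice of this
    sketch, not something the workflow mandates.
    """
    breaches = [
        name
        for name, limit in baseline.items()
        if metrics.get(name, float("inf")) > limit
    ]
    return Decision("block" if breaches else "allow", breaches)
```

For example, `decide({"error_rate": 0.05, "p95_latency_ms": 310.0, "saturation": 0.55})` yields a block decision with `error_rate` as the only breach, which is the kind of detail that would be posted to Slack and returned to CI.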

Set it up

What you configure once, before turning it on.

  1. Connect HTTP webhook: Trigger any URL on agent actions.
  2. Connect Datadog: Metrics, traces, log search.
  3. Connect Slack: Channels, DMs, threads, mentions.
  4. Set each agent's model: We leave models unset so you pick the tier: fast + cheap, or top-quality.
  5. Tune it to your data: Edit the prompts, filters, and field mappings so it matches how your team works.
  6. Test, then turn it on: Run once against a sample, confirm the output, then enable the trigger.
