Quarantine Tests Crossing a Datadog Flake-Rate Threshold

Runs on a schedule and queries Datadog CI Visibility for each test's flaky-rate metric.

Category: DevOps
Engine: sim
Difficulty: advanced
Trigger: schedule
Steps: 6
Setup: ~25 min

How it runs

The automated pipeline, trigger to output.

  • Trigger: Nightly schedule
  • Action: Query Datadog CI Visibility flake rates (Datadog)
  • Logic: Filter tests over threshold, not yet quarantined
  • Action: Open GitHub quarantine PR with skip annotation (GitHub)
  • Action: Page owning team via PagerDuty (PagerDuty)
  • Output: Deliver quarantine summary

What it does

This workflow uses real flake-rate telemetry rather than a single rerun. It polls Datadog CI Visibility for per-test flakiness over a rolling window and, for any test exceeding your tolerance, opens a quarantine PR that marks the test skipped and alerts the owning team.

When to use it

Use this when you already ship test results to Datadog and want a data-driven quarantine policy. For example, any test with a flake rate above 5% over 7 days gets pulled from the blocking path until it is fixed, and on-call is notified.
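
The example policy (a flake rate above 5% over 7 days) can be sketched as a small check. This is a minimal sketch, not the workflow's actual implementation: the `RunRecord` type and the `flake_rate` and `should_quarantine` helpers are hypothetical, and defining the flake rate as flaky runs divided by total runs in the window is an assumption made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class RunRecord:
    """Hypothetical record of one test execution (not a Datadog type)."""
    test_name: str
    finished_at: datetime
    flaky: bool  # True if this run was flagged as flaky


def flake_rate(runs, test_name, now, window_days=7):
    """Fraction of a test's runs in the trailing window that were flaky."""
    cutoff = now - timedelta(days=window_days)
    recent = [
        r for r in runs
        if r.test_name == test_name and r.finished_at >= cutoff
    ]
    if not recent:
        return 0.0  # assumption: no runs in the window means not flaky
    return sum(r.flaky for r in recent) / len(recent)


def should_quarantine(rate, threshold=0.05):
    """Strictly greater than the threshold, matching 'above 5%'."""
    return rate > threshold
```

A strict comparison means a test sitting exactly at the threshold stays on the blocking path; whether that boundary should be inclusive is a policy choice to settle when you tune the workflow.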

How it works

  1. A scheduled trigger fires nightly.
  2. The flow queries Datadog CI Visibility for each test's flaky-rate over the trailing window.
  3. A logic step filters to tests above the configured threshold that are not already quarantined.
  4. For each, it opens a GitHub PR adding a skip/quarantine annotation and a comment with the flake-rate evidence.
  5. It pages the owning team through PagerDuty with the test name, rate, and PR link.
  6. A summary of all quarantined tests is delivered as the final output.
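
Steps 3 through 6 can be sketched as pure functions, which keeps the selection logic testable without network access. This is an illustrative sketch under stated assumptions: `select_for_quarantine`, `skip_annotation`, `pagerduty_event`, and `summarize` are hypothetical names, the skip annotation assumes a pytest suite, and the event dictionary follows the general shape of a PagerDuty Events API v2 trigger event, which should be checked against PagerDuty's documentation before use. Fetching rates from Datadog and opening the PR are deliberately left out.

```python
def select_for_quarantine(rates, already_quarantined, threshold=0.05):
    """Step 3: tests above the threshold that are not yet quarantined."""
    return sorted(
        name for name, rate in rates.items()
        if rate > threshold and name not in already_quarantined
    )


def skip_annotation(rate, window_days=7):
    """Step 4: a pytest skip marker carrying the flake-rate evidence."""
    reason = f"quarantined: flake rate {rate:.1%} over {window_days} days"
    return f'@pytest.mark.skip(reason="{reason}")'


def pagerduty_event(routing_key, test_name, rate, pr_url):
    """Step 5: event with test name, rate, and PR link (assumed shape)."""
    return {
        "routing_key": routing_key,
        "event_action": "trigger",
        "payload": {
            "summary": f"Quarantined flaky test {test_name} ({rate:.1%})",
            "source": "flake-quarantine-workflow",  # hypothetical source name
            "severity": "warning",
        },
        "links": [{"href": pr_url, "text": "Quarantine PR"}],
    }


def summarize(selected, rates):
    """Step 6: one line per quarantined test."""
    if not selected:
        return "No tests quarantined."
    lines = [f"- {name}: {rates[name]:.1%}" for name in selected]
    return "Quarantined tests:\n" + "\n".join(lines)


# Example inputs (made up for illustration).
rates = {"test_login": 0.12, "test_cart": 0.02, "test_sync": 0.08}
selected = select_for_quarantine(rates, already_quarantined={"test_sync"})
print(summarize(selected, rates))
```

Excluding already-quarantined tests in the selection step is what keeps the nightly run from opening duplicate PRs or re-paging a team for a test it already knows about.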

Set it up

What you configure once, before turning it on.

  1. Connect Datadog: metrics, traces, log search.
  2. Connect GitHub: repos, issues, pull requests, actions.
  3. Connect PagerDuty: incidents, on-call, escalations.
  4. Set each agent's model: models are left unset so you pick the tier, either fast and cheap or top quality.
  5. Tune it to your data: edit the prompts, filters, and field mappings so it matches how your team works.
  6. Test, then turn it on: run once against a sample, confirm the output, then enable the trigger.
