Quarantine Tests Exceeding a Datadog Flake-Rate Threshold

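On a daily schedule, this workflow reads per-test pass/fail metrics from Datadog CI Visibility and, for any test whose intermittent failure rate crosses a threshold, opens a GitHub PR that adds it to the quarantine list and files a Linear deflake ticket.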

Category: Engineering
Engine: sim
Difficulty: advanced
Trigger: schedule
Steps: 6
Setup: ~25 min

How it runs

The automated pipeline, trigger to output.

  • Trigger: Daily schedule
  • Action: Query per-test flake rates from Datadog CI Visibility (Datadog)
  • Logic: Select tests in the intermittent-failure band
  • Action: Open GitHub PR adding tests to quarantine list (GitHub)
  • Action: File Linear deflake ticket per test (Linear)
  • Output: Report quarantine batch to Slack (Slack)

What it does

It turns raw CI telemetry into action. Each day it queries Datadog CI Visibility for per-test flake rates and identifies tests that fail intermittently within a band you set (for example, 2-15% of runs on the same branch). For each such test it opens a GitHub PR that adds the test to a quarantine skip list, so it stops blocking the main pipeline, and files a Linear ticket to deflake it.
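To make the band concrete, here is a minimal sketch of the selection arithmetic in Python. It is an illustration only, not the workflow's actual implementation: the `RunCounts` type, the `select_flaky` function, and the `min_runs` guard are hypothetical names and parameters, and the 2-15% defaults simply mirror the example above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunCounts:
    """Pass/fail counts for one test over the trailing window (hypothetical shape)."""
    name: str
    passed: int
    failed: int


def flake_rate(counts: RunCounts) -> float:
    """Fraction of runs that failed; 0.0 when the test never ran."""
    total = counts.passed + counts.failed
    return counts.failed / total if total else 0.0


def select_flaky(tests, low=0.02, high=0.15, min_runs=20):
    """Return (name, rate) pairs whose failure rate lies in [low, high].

    Tests above `high` are treated as consistently failing (broken, not flaky)
    and are left out. `min_runs` is an assumed guard against judging a test
    on too few runs; it is not part of the original description.
    """
    selected = []
    for t in tests:
        if t.passed + t.failed < min_runs:
            continue
        rate = flake_rate(t)
        if low <= rate <= high:
            selected.append((t.name, rate))
    # Worst offenders first.
    return sorted(selected, key=lambda pair: pair[1], reverse=True)


sample = [
    RunCounts("test_login", passed=95, failed=5),    # 5% -> in the band
    RunCounts("test_export", passed=0, failed=40),   # 100% -> broken, excluded
    RunCounts("test_search", passed=200, failed=1),  # ~0.5% -> below the band
    RunCounts("test_new", passed=3, failed=1),       # too few runs to judge
]
print(select_flaky(sample))  # [('test_login', 0.05)]
```

The upper bound matters as much as the lower one: a test that fails on every run needs a fix, not a quarantine.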

When to use it

Use it when you already ship test results to Datadog and want a data-driven, no-arguments policy for what gets quarantined. It removes the judgment call about whether a test is "flaky enough."

How it works

  1. A daily schedule triggers the run.
  2. A Datadog action queries CI Visibility for each test's pass and fail counts over the trailing window.
  3. A logic step computes flake rate and selects tests inside the intermittent band, excluding consistently-failing (genuinely broken) tests.
  4. A GitHub action opens a PR adding the selected tests to the quarantine list.
  5. A Linear action files a deflake ticket per test linking the Datadog metric and the quarantine PR.
  6. A Slack message reports the day's quarantine batch.
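The data handling behind steps 4 and 6 can be sketched in a few lines of Python. This is a hedged illustration under stated assumptions: the quarantine list is assumed to be a plain-text file with one test ID per line and `#` comments, and `add_to_quarantine` and `summarize` are hypothetical helpers. No GitHub, Linear, or Slack API calls are shown.

```python
from __future__ import annotations


def add_to_quarantine(existing_text: str, new_tests: list[str]) -> tuple[str, list[str]]:
    """Append tests not yet listed; return (updated file text, tests added).

    Assumes one test ID per line; blank lines and '#' comments are preserved.
    """
    lines = existing_text.splitlines()
    known = {
        ln.strip() for ln in lines
        if ln.strip() and not ln.strip().startswith("#")
    }
    added = sorted(set(new_tests) - known)
    merged = lines + added
    updated = "\n".join(merged) + "\n" if merged else ""
    return updated, added


def summarize(added: list[str], rates: dict[str, float]) -> str:
    """Build the plain-text body of the daily report."""
    if not added:
        return "No tests quarantined today."
    rows = [f"- {name}: {rates[name]:.1%} failure rate" for name in added]
    return f"Quarantined {len(added)} test(s) today:\n" + "\n".join(rows)


existing = "# quarantined tests\ntests/test_a.py::test_x\n"
selected = ["tests/test_b.py::test_y", "tests/test_a.py::test_x"]
updated, added = add_to_quarantine(existing, selected)
print(updated, end="")
print(summarize(added, {"tests/test_b.py::test_y": 0.05}))
```

Skipping tests that are already listed keeps the update idempotent: a second run with the same selection adds nothing, so the daily job would not need to open an empty PR.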

Set it up

What you configure once, before turning it on.

  1. Connect Datadog: metrics, traces, log search.
  2. Connect GitHub: repos, issues, pull requests, actions.
  3. Connect Linear: issues, projects, cycles, triage.
  4. Connect Slack: channels, DMs, threads, mentions.
  5. Set each agent's model: we leave models unset so you pick the tier (fast + cheap, or top-quality).
  6. Tune it to your data: edit the prompts, filters, and field mappings so it matches how your team works.
  7. Test, then turn it on: run once against a sample, confirm the output, then enable the trigger.
