Quarantine tests when Datadog flake rate crosses a threshold

Runs on a schedule, queries Datadog CI Test Visibility for tests whose flake rate exceeded a set threshold over the past week, and quarantines them in GitHub.

Category: DevOps
Engine: sim
Difficulty: intermediate
Trigger: schedule
Steps: 6
Setup: ~15 min

How it runs

The automated pipeline, trigger to output.

  • Trigger: Weekly schedule
  • Action: Query Datadog CI Test Visibility for 7-day flake rates
  • Logic: Keep tests above the flake-rate threshold and minimum run count
  • Action: Commit quarantine annotation in GitHub
  • Action: Create ClickUp fix task per offender
  • Output: Post weekly quarantine digest to Slack

What it does

It uses Datadog's CI Test Visibility data to find tests that have been statistically flaky over a rolling window, then isolates the worst offenders and creates a ClickUp task to fix each one.
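
To make "flaky over a rolling window" concrete, here is a minimal Python sketch. It assumes the test runs have already been exported as simple records and that a flake rate means flaky runs divided by total runs inside the window. How Datadog computes its own figure is not described here, and the `RunRecord` type, its fields, and `flake_rates` are hypothetical names for illustration, not part of any Datadog API.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class RunRecord:
    """Hypothetical shape of one exported test run (not a Datadog type)."""
    name: str
    finished_at: datetime
    flaky: bool  # assumed to be decided upstream, e.g. by the CI provider


def flake_rates(runs, now, window_days=7):
    """Return {test name: (flake rate, run count)} for runs inside the window."""
    cutoff = now - timedelta(days=window_days)
    totals = defaultdict(int)
    flaky = defaultdict(int)
    for run in runs:
        if run.finished_at < cutoff or run.finished_at > now:
            continue  # outside the trailing window
        totals[run.name] += 1
        if run.flaky:
            flaky[run.name] += 1
    return {name: (flaky[name] / count, count) for name, count in totals.items()}


now = datetime(2024, 1, 8, tzinfo=timezone.utc)
runs = [
    RunRecord("test_checkout", now - timedelta(days=1), True),
    RunRecord("test_checkout", now - timedelta(days=2), False),
    RunRecord("test_checkout", now - timedelta(days=3), True),
    RunRecord("test_checkout", now - timedelta(days=4), False),
    RunRecord("test_login", now - timedelta(days=1), False),
    RunRecord("test_login", now - timedelta(days=9), True),  # older than 7 days, ignored
]
rates = flake_rates(runs, now)
```

With this sample data, `test_checkout` comes out at a 50% flake rate over four runs, while the nine-day-old flaky run of `test_login` falls outside the window and does not count against it.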

When to use it

Use it when you already send test results to Datadog and want quarantine decisions driven by real flake-rate metrics over time rather than a single failing run. Good for teams that want a weekly cleanup pass instead of reacting to every red build.

How it works

  1. A weekly schedule starts the run.
  2. The flow queries Datadog CI Test Visibility for each test's flake rate over the trailing seven days.
  3. A logic step keeps only tests above the configured flake-rate threshold and minimum run count.
  4. For each qualifying test, it commits a quarantine annotation to the repo via GitHub.
  5. It creates a ClickUp task in the engineering backlog with the metric, owning team, and a link to the Datadog test page.
  6. It posts a digest of everything quarantined this week to the Slack channel.
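
The filter and commit steps might look roughly like the Python sketch below. It is an illustration under stated assumptions, not the template's actual implementation: the `Candidate` type, the default thresholds, and the idea of storing quarantined test names one per line in a text file are all hypothetical, since the format of the real quarantine annotation is not specified here. The payload fields (`message`, `content`, `branch`, `sha`) follow GitHub's REST endpoint for creating or updating file contents (`PUT /repos/{owner}/{repo}/contents/{path}`), which expects Base64-encoded content. The sketch only builds the request body and sends nothing.

```python
import base64
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """Hypothetical per-test summary produced by the Datadog query step."""
    name: str
    flake_rate: float  # fraction of runs that flaked, 0.0 to 1.0
    run_count: int


def select_offenders(candidates, min_flake_rate=0.05, min_runs=20):
    """Keep tests above the flake-rate threshold that also ran often enough."""
    keep = [c for c in candidates
            if c.flake_rate > min_flake_rate and c.run_count >= min_runs]
    # Worst offenders first.
    return sorted(keep, key=lambda c: c.flake_rate, reverse=True)


def build_commit_payload(offenders, branch="main", existing_sha=None):
    """Build the JSON body for GitHub's create-or-update-file-contents call."""
    text = "".join(f"{c.name}\n" for c in offenders)
    payload = {
        "message": f"Quarantine {len(offenders)} flaky test(s)",
        "content": base64.b64encode(text.encode("utf-8")).decode("ascii"),
        "branch": branch,
    }
    if existing_sha is not None:
        payload["sha"] = existing_sha  # GitHub requires the blob SHA when updating a file
    return payload


candidates = [
    Candidate("test_checkout", 0.12, 140),
    Candidate("test_login", 0.50, 2),     # high rate, but too few runs to trust
    Candidate("test_search", 0.01, 300),  # plenty of runs, rarely flaky
    Candidate("test_upload", 0.30, 45),
]
offenders = select_offenders(candidates)
payload = build_commit_payload(offenders)
```

The minimum run count matters because a test that flaked once in two runs shows a 50% rate on very little evidence. In this sample, `test_login` is dropped for exactly that reason even though its rate is the highest.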

Set it up

What you configure once, before turning it on.

  1. Connect Datadog: metrics, traces, log search.
  2. Connect GitHub: repos, issues, pull requests, actions.
  3. Connect ClickUp: docs + tasks + chats in one workspace.
  4. Connect Slack: channels, DMs, threads, mentions.
  5. Set each agent's model: we leave models unset so you pick the tier, fast + cheap or top-quality.
  6. Tune it to your data: edit the prompts, filters, and field mappings so it matches how your team works.
  7. Test, then turn it on: run once against a sample, confirm the output, then enable the trigger.
