Auto-Quarantine Flaky Tests from Reruns

When a CI test passes only after a rerun on the same commit, this workflow files a GitHub issue, tags the test with a quarantine label, and skips it in future runs so it stops blocking merges.

Category: Engineering
Engine: sim
Difficulty: intermediate
Trigger: webhook
Steps: 6
Setup: ~15 min

How it runs

The automated pipeline, trigger to output.

  • Trigger: GitHub workflow run completed (GitHub)
  • Logic: Rerun passed but original failed on same SHA?
  • Action: Parse failed test ID from original run logs (GitHub)
  • Action: Open or update quarantine issue with label (GitHub)
  • Action: Commit test to skip-list file (GitHub)
  • Output: Post quarantine summary to Slack (Slack)
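
The Logic step above can be sketched as a small predicate. This is a minimal illustration, not the template's actual filter: it assumes both arguments are workflow-run objects in the shape GitHub's REST API returns (with `run_attempt`, `conclusion`, and `head_sha` fields), and the function name `is_flaky_rerun` is hypothetical.

```python
from typing import Any, Mapping


def is_flaky_rerun(current: Mapping[str, Any], previous: Mapping[str, Any]) -> bool:
    """Return True when `current` is a passing rerun of a failed attempt on the same commit.

    Assumption: both mappings look like GitHub workflow-run objects.
    """
    return (
        current.get("run_attempt", 1) > 1              # this run is a rerun
        and current.get("conclusion") == "success"     # the rerun passed
        and previous.get("conclusion") == "failure"    # the earlier attempt failed
        and current.get("head_sha") == previous.get("head_sha")  # same commit
    )
```

Requiring a matching `head_sha` is what separates "flaky" from "fixed by a later commit": a pass on a different commit says nothing about flakiness.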

What it does

Detects tests that fail then pass on a rerun of the *same* commit — the clearest signal of flakiness — and quarantines them automatically. It opens a tracking GitHub issue, applies a `flaky-quarantine` label, and adds the test to a skip-list so the next CI run no longer fails on it.
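
As an illustration of the skip-list update, here is a minimal sketch that assumes the skip-list is a plain text file with one test ID per line. The page does not specify the file format, so both the format and the helper name `add_to_skip_list` are assumptions.

```python
def add_to_skip_list(skip_list_text: str, test_id: str) -> str:
    """Return the skip-list contents with `test_id` appended exactly once.

    Assumption: the skip-list is plain text, one test ID per line.
    """
    entries = [line.strip() for line in skip_list_text.splitlines()]
    if test_id in entries:
        # Already quarantined: return the text unchanged so no empty commit is needed.
        return skip_list_text
    prefix = skip_list_text
    if prefix and not prefix.endswith("\n"):
        prefix += "\n"  # keep one entry per line even if the file lacked a trailing newline
    return f"{prefix}{test_id}\n"
```

Making the update idempotent means a test that flakes twice does not produce duplicate entries or redundant commits.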

When to use it

Use it when intermittent failures are forcing engineers to mash the rerun button and red builds are eroding trust in CI. It removes the manual judgment call of "is this test actually broken or just flaky?" by keying off the rerun-passed signal.

How it works

  1. A GitHub Actions webhook fires when a workflow run completes.
  2. A filter checks whether the run was a rerun that passed while the original run on the same commit SHA failed.
  3. If so, it parses the failed test identifier from the original run's logs.
  4. It searches existing issues to avoid duplicates, then opens (or comments on) a quarantine issue with run links.
  5. It applies the `flaky-quarantine` label and appends the test to the repo's skip-list file via a commit.
  6. It posts a summary to the team's Slack CI channel.
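
Steps 3 and 4 above might look roughly like the sketch below. It assumes the CI logs contain pytest-style summary lines such as `FAILED path/to/test.py::test_name - reason`. Your test runner may log failures differently, so treat the pattern as a starting point. The issue-title convention and both function names are hypothetical.

```python
import re

# Assumption: failures appear as pytest-style "FAILED <file>::<test>" summary lines,
# possibly preceded by a timestamp added by the CI log.
FAILED_LINE = re.compile(r"\bFAILED\s+(\S+::\S+)")


def parse_failed_tests(log_text: str) -> list[str]:
    """Return the unique failed test IDs in the order they first appear."""
    seen: dict[str, None] = {}
    for match in FAILED_LINE.finditer(log_text):
        seen.setdefault(match.group(1), None)
    return list(seen)


def quarantine_issue_title(test_id: str) -> str:
    """Build a stable title so an issue search can detect an existing quarantine issue."""
    return f"[flaky-quarantine] {test_id}"
```

A deterministic title per test ID gives the duplicate check something exact to match on: if an open issue with that title exists, comment on it rather than opening a new one.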

Set it up

What you configure once, before turning it on.

  1. Connect GitHub: repos, issues, pull requests, actions.
  2. Connect Slack: channels, DMs, threads, mentions.
  3. Set each agent's model: we leave models unset so you pick the tier (fast + cheap, or top-quality).
  4. Tune it to your data: edit the prompts, filters, and field mappings so it matches how your team works.
  5. Test, then turn it on: run once against a sample, confirm the output, then enable the trigger.
