ENGINEERING

Auto-quarantine flaky tests from CI re-run signal

When a GitHub Actions test job fails then passes on automatic retry, this flags the test as intermittent, tags the source file.

CategoryEngineering
Enginesim
Difficultyintermediate
Triggerwebhook
Steps5
Setup~15 min

How it runs

The automated pipeline, trigger to output.

  • TriggerGitHub workflow_run completed webhookGitHubGitHub
  • LogicKeep tests that failed then passed on retry
  • ActionResolve test file CODEOWNERGitHubGitHub
  • ActionOpen owner-assigned flaky quarantine issueGitHubGitHub
  • OutputComment quarantined tests on the PRGitHubGitHub

What it does

It watches GitHub Actions for tests that fail on the first attempt but pass on a retry inside the same run — the classic flaky signature. Instead of letting the green re-run hide the problem, it adds a `flaky` label, comments the failing test name on the PR, and files a quarantine issue assigned to the file's CODEOWNER.

When to use it

Use it when your pipeline retries failed jobs automatically and "eventually green" builds are masking unstable tests. It turns a silent retry into a tracked, owned work item.

How it works

  1. 1A GitHub `workflow_run` webhook fires when a CI run completes.
  2. 2A logic step inspects the run's job attempts and keeps only tests that failed-then-passed across retries.
  3. 3For each flaky test, GitHub is queried for the test file's CODEOWNER.
  4. 4A GitHub issue is opened with the test name, run link, and failure log, assigned to that owner and labeled `flaky`.
  5. 5The originating PR gets a comment listing the quarantined tests so reviewers see the instability before merging.

Set it up

What you configure once, before turning it on.

  1. 1
    Connect GitHubRepos, issues, pull requests, actions.
  2. 2
    Set each agent's modelWe leave models unset so you pick the tier — fast + cheap, or top-quality.
  3. 3
    Tune it to your dataEdit the prompts, filters, and field mappings so it matches how your team works.
  4. 4
    Test, then turn it onRun once against a sample, confirm the output, then enable the trigger.

Run this workflow in your colony.

14-day trial. No DevOps. No Sales call. Provisioned in under a minute.