
Auto-rerun a single failed test to confirm flakiness

When a CI run fails on one test, dispatches a targeted rerun of just that test a few times; if it passes on retry it is confirmed flaky and quarantined, otherwise it is escalated…

Category: DevOps
Engine: sim
Difficulty: advanced
Trigger: event
Steps: 5
Setup: ~25 min

How it runs

The automated pipeline, trigger to output.

  • Trigger: GitHub workflow_run failed (GitHub)
  • Action: Dispatch targeted rerun of failing test (GitHub)
  • Logic: Tally reruns: flaky vs true failure
  • Action: Label and quarantine confirmed flakes (GitHub)
  • Output: Page on-call for confirmed regressions (PagerDuty)

What it does

This template checks whether a failure is flaky instead of guessing. When a CI run fails, it isolates the failing test and dispatches a targeted GitHub Actions rerun of only that test a configurable number of times. If any rerun passes, the test is confirmed flaky, because the original run already failed; if every rerun fails, the failure is treated as a real regression.
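As a rough illustration of the dispatch step, the sketch below builds (but does not send) one request per rerun attempt against GitHub's "create a workflow dispatch event" REST endpoint. It assumes the repository has a dedicated rerun workflow with a `workflow_dispatch` trigger that declares its own inputs; the workflow file name `rerun-single-test.yml` and the input names `test_id` and `attempt` are hypothetical and not part of the template.

```python
import json
import urllib.request

GITHUB_API = "https://api.github.com"


def build_rerun_requests(owner, repo, workflow_file, ref, test_id, attempts, token):
    """Build one workflow_dispatch request per rerun attempt, without sending it."""
    url = (
        f"{GITHUB_API}/repos/{owner}/{repo}"
        f"/actions/workflows/{workflow_file}/dispatches"
    )
    requests = []
    for attempt in range(1, attempts + 1):
        body = {
            "ref": ref,  # branch or tag the rerun workflow should run on
            # Hypothetical inputs: the rerun workflow must declare these itself.
            "inputs": {"test_id": test_id, "attempt": str(attempt)},
        }
        requests.append(
            urllib.request.Request(
                url,
                data=json.dumps(body).encode("utf-8"),
                method="POST",
                headers={
                    "Accept": "application/vnd.github+json",
                    "Authorization": f"Bearer {token}",
                    "Content-Type": "application/json",
                },
            )
        )
    return requests
```

Each request could then be sent with `urllib.request.urlopen(request)`. Building the requests separately from sending them keeps the dispatch logic easy to inspect and test without a live token.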

When to use it

Use it when full-suite reruns are too slow and you want fast, evidence-based confirmation before quarantining anything. It avoids quarantining tests that are actually broken.

How it works

  1. A GitHub workflow_run failure event fires the trigger.
  2. The flow extracts the single failing test and dispatches a targeted rerun workflow N times.
  3. A logic step tallies the rerun outcomes: because the original run already failed, any passing rerun means the test is flaky; if every rerun fails, it is a true regression.
  4. Confirmed flakes get the `flaky` label and are added to the quarantine list via GitHub.
  5. Confirmed regressions trigger a PagerDuty alert to the on-call owner so the breakage is handled immediately.
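The tally and routing steps above can be sketched as follows. This is a minimal illustration, not the template's actual logic: it assumes each rerun reports a GitHub-style conclusion string such as `success` or `failure`, and it adds an `inconclusive` outcome (an assumption, not in the template) for the case where no rerun finished with either result. The regression branch builds a body shaped like a PagerDuty Events API v2 trigger event; the `source` value and the `type` wrapper are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Verdict:
    kind: str  # "flaky", "regression", or "inconclusive"
    passes: int
    failures: int


def classify(rerun_conclusions: List[str]) -> Verdict:
    """Tally rerun conclusions for one test that already failed once in CI."""
    passes = sum(1 for c in rerun_conclusions if c == "success")
    failures = sum(1 for c in rerun_conclusions if c == "failure")
    if passes + failures == 0:
        kind = "inconclusive"  # assumption: e.g. every rerun was cancelled
    elif passes > 0:
        kind = "flaky"  # the original run failed, yet a rerun passed
    else:
        kind = "regression"  # every completed rerun failed
    return Verdict(kind, passes, failures)


def next_action(test_id: str, verdict: Verdict, routing_key: str) -> dict:
    """Map a verdict to the follow-up step: label, page, or do nothing."""
    if verdict.kind == "flaky":
        return {"type": "label", "label": "flaky", "test_id": test_id}
    if verdict.kind == "regression":
        return {
            "type": "page",
            "event": {
                "routing_key": routing_key,
                "event_action": "trigger",
                "payload": {
                    "summary": f"{test_id} failed all {verdict.failures} targeted reruns",
                    "source": "ci",  # hypothetical source name
                    "severity": "error",
                },
            },
        }
    return {"type": "none", "test_id": test_id}
```

Treating a run with no usable results as inconclusive, rather than as flaky or broken, avoids quarantining or paging on the basis of reruns that never actually executed the test.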

Set it up

What you configure once, before turning it on.

  1. Connect GitHub: repos, issues, pull requests, actions.
  2. Connect PagerDuty: incidents, on-call, escalations.
  3. Set each agent's model: models are left unset so you can pick the tier, either fast and cheap or top-quality.
  4. Tune it to your data: edit the prompts, filters, and field mappings so it matches how your team works.
  5. Test, then turn it on: run once against a sample, confirm the output, then enable the trigger.
