Flaky-Test Quarantine Orchestrator (GitLab CI)

Detects a test that passed on retry in GitLab CI, marks it as quarantined by opening a tracking issue, and pings the last engineer who touched the test file.

Category: Engineering
Engine: sim
Difficulty: intermediate
Trigger: webhook
Steps: 6
Setup: ~15 min

How it runs

The automated pipeline, trigger to output.

  • Trigger: GitLab pipeline completed webhook (GitLab)
  • Action: Fetch job traces and parse retry results (GitLab)
  • Logic: Keep only fail-then-pass-on-retry tests
  • Action: Resolve last author via commits API (GitLab)
  • Action: Create or update flaky-test tracking issue (GitLab)
  • Output: Notify last toucher in Slack (Slack)
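
As a rough illustration of the trigger step, the sketch below decides whether an incoming webhook event is a finished pipeline and, if so, which job IDs to inspect. The field names (`object_kind`, `object_attributes.status`, `builds`) follow GitLab's pipeline event payload as commonly documented, but treat them as assumptions and check them against your own instance. The function name `jobs_to_inspect` and the `FINISHED` set are hypothetical.

```python
# Pipeline statuses treated as "completed" (an assumption for this sketch).
FINISHED = {"success", "failed"}


def jobs_to_inspect(event: dict) -> list:
    """Return job IDs from a finished pipeline event, or [] if the event is not relevant."""
    # Ignore anything that is not a pipeline event (push, merge request, ...).
    if event.get("object_kind") != "pipeline":
        return []
    # Pipeline events also fire for intermediate states such as "running".
    status = event.get("object_attributes", {}).get("status")
    if status not in FINISHED:
        return []
    # Each entry in "builds" describes one job in the pipeline.
    return [build["id"] for build in event.get("builds", [])]
```

Filtering on status matters because the same webhook can fire several times per pipeline, and only the final event carries the complete retry picture.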

What it does

When a GitLab CI pipeline finishes, this workflow inspects the job logs for tests that failed on a first attempt but passed on retry, which is the signature of a flaky test. For each test it confirms as flaky, it opens (or updates) a GitLab quarantine tracking issue and notifies, in Slack, the engineer who last modified the test file, identified from the file's commit history.
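
The fail-then-pass check can be sketched as below. This is a minimal illustration, not the workflow's actual logic: it assumes the job traces have already been parsed into per-attempt records, and the `Attempt` type, its fields, and `find_flaky` are hypothetical names introduced here.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Attempt:
    test: str     # test identifier, e.g. a file path plus test name
    sha: str      # commit the attempt ran against
    number: int   # 1 for the first run, 2 or higher for retries
    passed: bool


def find_flaky(attempts: list) -> list:
    """Return tests that failed on the first attempt and passed on the last, on the same commit."""
    grouped = defaultdict(list)
    for attempt in attempts:
        # Group by (test, commit) so results from different commits are never mixed.
        grouped[(attempt.test, attempt.sha)].append(attempt)

    flaky = set()
    for (test, _sha), runs in grouped.items():
        runs.sort(key=lambda a: a.number)
        # Flaky: more than one attempt, first failed, last passed.
        # Hard failures (last attempt failed) and clean passes are dropped.
        if len(runs) > 1 and not runs[0].passed and runs[-1].passed:
            flaky.add(test)
    return sorted(flaky)
```

Grouping by commit as well as test name keeps a genuine regression, fixed by a later commit, from being mistaken for flakiness.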

When to use it

Run this on every merge-request and main-branch pipeline once you've enabled job retries. It keeps intermittently failing tests from eroding trust in CI without forcing anyone to babysit the pipeline page.

How it works

  1. A GitLab pipeline-completion webhook fires with the pipeline and job IDs.
  2. The flow pulls each job's trace and parses retry markers to find tests that flipped fail-then-pass.
  3. A logic step filters to genuinely flaky cases (passed on retry, same commit) and drops hard failures.
  4. For each flaky test it resolves the last author via the GitLab commits API on the test path.
  5. It creates or updates a labeled `flaky-test` issue in GitLab with the test name, failure rate, and trace excerpt.
  6. It posts a Slack message tagging the last toucher with a link to the issue.
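
Steps 4 and 5 might look roughly like the sketch below, which only builds the request URL and issue body and makes no network calls. It assumes GitLab's REST endpoints `GET /projects/:id/repository/commits` (with `path`, `ref_name`, `per_page`) and `POST /projects/:id/issues` (with `title`, `description`, `labels`). The host `gitlab.example.com`, the helper names, and the issue title format are placeholders, not what the workflow actually emits.

```python
from urllib.parse import quote, urlencode

API = "https://gitlab.example.com/api/v4"  # placeholder host, replace with your instance


def last_commit_url(project: str, path: str, ref: str = "main") -> str:
    """Build the commits API URL that returns the most recent commit touching `path`."""
    # per_page=1 keeps only the newest commit, whose author is the "last toucher".
    query = urlencode({"path": path, "ref_name": ref, "per_page": 1})
    # A project can be addressed by its URL-encoded full path instead of a numeric ID.
    return f"{API}/projects/{quote(project, safe='')}/repository/commits?{query}"


def issue_payload(test: str, failures: int, runs: int, excerpt: str) -> dict:
    """Build the form fields for creating the tracking issue."""
    rate = failures / runs if runs else 0.0
    return {
        "title": f"Flaky test: {test}",  # hypothetical title format
        "labels": "flaky-test",
        "description": (
            f"Failure rate: {rate:.0%} ({failures}/{runs} runs)\n\n"
            f"Trace excerpt:\n\n{excerpt}"
        ),
    }
```

Searching existing open issues by the `flaky-test` label and title before posting is one way to get the "create or update" behavior. That lookup is left out here.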

Set it up

What you configure once, before turning it on.

  1. Connect GitLab: repos, MRs, pipelines, registry.
  2. Connect Slack: channels, DMs, threads, mentions.
  3. Set each agent's model: we leave models unset so you pick the tier (fast and cheap, or top-quality).
  4. Tune it to your data: edit the prompts, filters, and field mappings so it matches how your team works.
  5. Test, then turn it on: run once against a sample, confirm the output, then enable the trigger.
