Mainline flaky-storm detector: page on-call when the same test flakes repeatedly

Watches main-branch CI failures and, when one test flakes more than N times in a rolling window, pages the on-call engineer via PagerDuty and files a high-priority Linear ticket…

Category: Engineering
Engine: sim
Difficulty: advanced
Trigger: webhook
Steps: 5
Setup: ~25 min

How it runs

The automated pipeline, trigger to output.

  • Trigger: Main-branch CI failure (GitHub)
  • Logic: Count flakes of this test in rolling window
  • Logic: Compare to storm threshold
  • Action: Trigger PagerDuty incident for on-call (PagerDuty)
  • Output: File high-priority linked Linear ticket (Linear)

What it does

Distinguishes a single annoying flake from a flaky storm that's blocking everyone's merges on the main branch. It counts repeated flaky failures of the same test in a short window, and when one crosses the storm threshold it treats the situation as an incident.
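
The counting logic described above can be sketched roughly as follows. This is a minimal illustration, not the workflow's actual implementation: the class name `FlakeStormDetector`, its parameters, and the in-memory storage are all hypothetical. It also assumes each incoming event is already known to be a flaky failure of a named test and that events arrive in time order.

```python
from collections import defaultdict, deque
from datetime import datetime, timedelta, timezone


class FlakeStormDetector:
    """Counts flaky failures per test inside a rolling time window (hypothetical helper)."""

    def __init__(self, threshold: int = 3, window: timedelta = timedelta(hours=2)):
        self.threshold = threshold          # flakes needed to call it a storm
        self.window = window                # how far back to look
        self._events = defaultdict(deque)   # test name -> failure timestamps

    def record(self, test_name: str, at: datetime) -> bool:
        """Record one flaky failure; return True if the test is at or above threshold."""
        events = self._events[test_name]
        events.append(at)
        # Drop timestamps that have aged out of the rolling window.
        cutoff = at - self.window
        while events and events[0] < cutoff:
            events.popleft()
        return len(events) >= self.threshold


detector = FlakeStormDetector(threshold=3, window=timedelta(hours=2))
start = datetime(2024, 1, 1, 9, 0, tzinfo=timezone.utc)
for minutes in (0, 30, 90):
    is_storm = detector.record("test_checkout", start + timedelta(minutes=minutes))
print(is_storm)  # True: three flakes of the same test within two hours
```

A deque per test keeps the check cheap, since only expired timestamps at the front are ever removed. A real deployment would likely need persistent storage so the count survives between webhook invocations.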

When to use it

Use when a flaky test on main can grind the whole team's deploys to a halt. Routine flakes get a normal ticket elsewhere; this workflow is the loud path that wakes someone up only when a test is actively causing a pile-up.

How it works

  1. A GitHub webhook fires on each main-branch CI failure.
  2. The flow records the failure and counts how many times that test flaked in the rolling window.
  3. A branch checks the count against the storm threshold (e.g. 3 flakes in 2 hours).
  4. Below threshold, it logs and exits quietly.
  5. At or above threshold, it triggers a PagerDuty incident for the on-call engineer.
  6. It files a high-priority Linear ticket with the storm timeline and links it to the incident.
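
As an illustration of the paging step, the sketch below builds a trigger event in the shape the PagerDuty Events API v2 expects (`routing_key`, `event_action`, `dedup_key`, `payload`). The function name `build_storm_event`, the `source` value, the `dedup_key` format, and the contents of `custom_details` are assumptions made for this example; the workflow's own connector may assemble the incident differently.

```python
from datetime import datetime, timezone


def build_storm_event(routing_key: str, test_name: str, failures: list[datetime]) -> dict:
    """Build a PagerDuty Events API v2 'trigger' event for a flaky storm (hypothetical helper)."""
    # Sorted ISO timestamps double as the storm timeline for the linked ticket.
    timeline = [ts.isoformat() for ts in sorted(failures)]
    return {
        "routing_key": routing_key,
        "event_action": "trigger",
        # Reusing one dedup_key per test groups repeat flakes under one open alert.
        "dedup_key": f"flaky-storm:{test_name}",
        "payload": {
            "summary": f"Flaky storm on main: {test_name} flaked {len(failures)} times",
            "source": "main-branch-ci",  # assumed label for the CI system
            "severity": "critical",
            "custom_details": {"test": test_name, "timeline": timeline},
        },
    }


event = build_storm_event(
    routing_key="YOUR_INTEGRATION_KEY",  # placeholder, not a real key
    test_name="test_checkout",
    failures=[
        datetime(2024, 1, 1, 9, 0, tzinfo=timezone.utc),
        datetime(2024, 1, 1, 9, 30, tzinfo=timezone.utc),
        datetime(2024, 1, 1, 10, 30, tzinfo=timezone.utc),
    ],
)
print(event["payload"]["summary"])
```

Sending the event would be a JSON POST to `https://events.pagerduty.com/v2/enqueue`. The same `timeline` list could be pasted into the Linear ticket description so the incident and the ticket tell the same story.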

Set it up

What you configure once, before turning it on.

  1. Connect GitHub: repos, issues, pull requests, actions.
  2. Connect PagerDuty: incidents, on-call, escalations.
  3. Connect Linear: issues, projects, cycles, triage.
  4. Set each agent's model: we leave models unset so you pick the tier, fast + cheap or top-quality.
  5. Tune it to your data: edit the prompts, filters, and field mappings so it matches how your team works.
  6. Test, then turn it on: run once against a sample, confirm the output, then enable the trigger.
