
Nightly Flaky-Test Scan from JUnit Reports with Linear Rollup

Each night, this workflow scans stored JUnit XML results, computes per-test flake rates over a rolling window, and auto-skips any test above the threshold for one cycle.

Category: DevOps
Engine: sim
Difficulty: advanced
Trigger: schedule
Steps: 5
Setup: ~25 min

How it runs

The automated pipeline, trigger to output.

  • Trigger: Nightly schedule fires
  • Action: Pull rolling-window JUnit reports from S3 (AWS S3)
  • Logic: Compute flake rate and select tests over threshold
  • Action: Commit one-cycle skip annotations to the repo (GitHub)
  • Output: Create or update a Linear issue per quarantined test (Linear)
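
As a rough illustration of the "rolling window" in the S3 pull step, the sketch below builds one key prefix per day of the window. The key layout (`junit/YYYY-MM-DD/`), the function name `window_prefixes`, and the choice to include the current day are all assumptions made for this example; the workflow's actual bucket layout is not specified here.

```python
from datetime import date, timedelta


def window_prefixes(today: date, window_days: int, root: str = "junit/") -> list[str]:
    """Return one key prefix per day in the trailing window, oldest first.

    Assumes (hypothetically) that reports are stored under <root><ISO date>/.
    The window ends on `today`, inclusive.
    """
    if window_days < 1:
        raise ValueError("window_days must be at least 1")
    days = [today - timedelta(days=offset) for offset in range(window_days - 1, -1, -1)]
    return [f"{root}{day.isoformat()}/" for day in days]
```

Each prefix could then be handed to whatever S3 listing call the connector exposes, so only objects inside the window are downloaded.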

What it does

Instead of reacting to a single failure, this scheduled job looks at the trailing pass/fail history of every test and flags the ones that are statistically unreliable. Tests crossing the flake-rate threshold are quarantined and tracked in Linear.

When to use it

Use it when single-run detection is too noisy and you want flakiness judged on a trend. It suits large suites where one bad run shouldn't quarantine a test but a 15 percent flake rate should.
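
To make the trend-versus-single-run distinction concrete, here is a minimal sketch of the quarantine decision. It assumes the flake rate is simply failed runs divided by total runs in the window, and it adds a `min_runs` guard so that a test with very little history is never quarantined. Both the definition and the guard are assumptions for illustration, as is the function name `should_quarantine`.

```python
def should_quarantine(failures: int, runs: int,
                      threshold: float = 0.15, min_runs: int = 10) -> bool:
    """Decide whether a test's trailing history justifies a quarantine.

    Assumption: flake rate = failures / runs over the rolling window.
    Assumption: tests with fewer than `min_runs` runs are left alone.
    """
    if runs < min_runs:
        return False  # not enough history to judge a trend
    return failures / runs > threshold  # strictly above the threshold
```

Under these assumptions, one failure in 20 runs (5 percent) leaves the test alone, while four failures in 20 runs (20 percent) quarantines it.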

How it works

  1. A nightly schedule triggers the scan.
  2. The workflow pulls the last N days of JUnit XML reports from the S3 results bucket.
  3. A logic step computes each test's flake rate and selects those above the configured threshold.
  4. It commits skip annotations to the repo for the selected tests, tagged for one cycle.
  5. For each quarantined test it creates or updates a Linear issue with the flake rate and history link, assigned to the owning team.
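
Steps 2 and 3 above can be sketched with the Python standard library. This is one possible implementation, not the workflow's actual logic step: it assumes each report uses the common JUnit XML layout (`<testcase>` elements with optional `<failure>`, `<error>`, or `<skipped>` children), counts both failures and errors as failed runs, and ignores skipped runs. The names `tally_report` and `flaky_tests` are hypothetical.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict


def tally_report(xml_text: str, counts: dict) -> None:
    """Add one report's results to counts[test_id] = [runs, failures]."""
    root = ET.fromstring(xml_text)
    # iter() finds <testcase> under either a <testsuite> or <testsuites> root.
    for case in root.iter("testcase"):
        if case.find("skipped") is not None:
            continue  # skipped runs say nothing about flakiness
        test_id = f"{case.get('classname', '')}::{case.get('name', '')}"
        entry = counts[test_id]
        entry[0] += 1
        if case.find("failure") is not None or case.find("error") is not None:
            entry[1] += 1


def flaky_tests(reports, threshold: float = 0.15) -> dict:
    """Return {test_id: flake_rate} for tests strictly above the threshold."""
    counts = defaultdict(lambda: [0, 0])
    for xml_text in reports:
        tally_report(xml_text, counts)
    return {
        test_id: failures / runs
        for test_id, (runs, failures) in counts.items()
        if failures / runs > threshold
    }
```

The returned mapping is the input the later steps would need: the test identifiers to annotate as skipped and the flake rate to record on each Linear issue.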

Set it up

What you configure once, before turning it on.

  1. Connect AWS S3: buckets, objects, signed URLs.
  2. Connect GitHub: repos, issues, pull requests, actions.
  3. Connect Linear: issues, projects, cycles, triage.
  4. Set each agent's model: we leave models unset so you pick the tier, either fast and cheap or top-quality.
  5. Tune it to your data: edit the prompts, filters, and field mappings so it matches how your team works.
  6. Test, then turn it on: run once against a sample, confirm the output, then enable the trigger.
