Flaky-Test Trend Report from CI Warehouse

Runs on a monthly schedule and queries historical CI test results in BigQuery to rank the flakiest and most frequently quarantined tests over the quarter.

Category: Engineering
Engine: sim
Difficulty: advanced
Trigger: schedule
Steps: 6
Setup: ~25 min

How it runs

The automated pipeline, trigger to output.

Trigger: Monthly schedule
Action: Query CI history flake rates in BigQuery
Logic: Identify chronic repeat-quarantined tests
Action: File Linear ticket per new chronic offender
Action: Publish ranked trend report to Confluence
Output: Share report link in Slack

What it does

Turns raw CI result history into a quarterly flakiness trend report. It queries a BigQuery table of every test run, computes flake rate and quarantine frequency per test over time, identifies tests that keep getting re-quarantined, and publishes a ranked report to Confluence. New chronic offenders get a Linear ticket.
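As a rough illustration of the per-test flake rate, here is a minimal Python sketch. The page does not define "flake rate" or the warehouse schema, so everything here is an assumption: the `CiRun` fields are hypothetical column names, and a test counts as flaky on a commit when it both passed and failed there. In the actual workflow this aggregation would happen inside the BigQuery query; the Python only mirrors the logic.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class CiRun:
    test_id: str     # hypothetical column: fully qualified test name
    commit_sha: str  # hypothetical column: commit the run executed against
    passed: bool     # hypothetical column: outcome of this single run


def flake_rates(runs):
    """Return {test_id: flaky_commits / total_commits}.

    Assumed definition: a (test, commit) pair is flaky when the same test
    both passed and failed on that commit.
    """
    outcomes = defaultdict(set)  # (test_id, commit_sha) -> set of outcomes seen
    for run in runs:
        outcomes[(run.test_id, run.commit_sha)].add(run.passed)

    total = defaultdict(int)
    flaky = defaultdict(int)
    for (test_id, _sha), seen in outcomes.items():
        total[test_id] += 1
        if len(seen) == 2:  # both True and False were observed
            flaky[test_id] += 1

    return {test_id: flaky[test_id] / total[test_id] for test_id in total}


runs = [
    CiRun("test_login", "a1", False),
    CiRun("test_login", "a1", True),   # retry passed on the same commit: flaky
    CiRun("test_login", "b2", True),
    CiRun("test_checkout", "a1", True),
    CiRun("test_checkout", "b2", True),
]
rates = flake_rates(runs)
print(rates)
```

With this sample, `test_login` is flaky on one of its two commits, so its rate is 0.5, while `test_checkout` stays at 0.0. If your CI records retries differently (for example, a single row with a retry count), the grouping key would need to change accordingly.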

When to use it

Use it when you have CI results landing in a data warehouse and engineering leadership wants visibility into where test reliability is trending. It surfaces the repeat offenders that point-in-time quarantine workflows keep parking but never fix.

How it works

1. A monthly schedule trigger starts the report.
2. It runs a BigQuery query over historical CI results to compute per-test flake rate and quarantine count.
3. A branch identifies tests quarantined more than N times this quarter (chronic offenders).
4. For each new chronic offender, it opens a Linear ticket labeled `flaky-chronic`.
5. It renders the ranked trend tables and publishes them to a Confluence page.
6. It posts the report link to the engineering leadership Slack channel.
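The branching in steps 3 and 4 can be sketched in Python as follows. This is an illustration under stated assumptions, not the workflow's actual implementation: "this quarter" is taken to mean the calendar quarter (your team may prefer a fiscal quarter or a trailing 90 days), quarantine events are assumed to arrive as `(test_id, date)` pairs, and `already_ticketed` stands in for whatever lookup tells you a test already has a `flaky-chronic` ticket.

```python
from collections import Counter
from datetime import date


def quarter_start(day):
    """First day of the calendar quarter containing `day` (assumed definition)."""
    first_month = 3 * ((day.month - 1) // 3) + 1
    return date(day.year, first_month, 1)


def new_chronic_offenders(quarantine_events, today, threshold, already_ticketed):
    """Tests quarantined more than `threshold` times this quarter with no ticket yet.

    quarantine_events: iterable of (test_id, date) pairs (hypothetical shape).
    already_ticketed: set of test ids that already have a ticket.
    Returns a list of (test_id, quarantine_count), worst first.
    """
    start = quarter_start(today)
    counts = Counter(
        test_id for test_id, day in quarantine_events if start <= day <= today
    )
    chronic = [(t, n) for t, n in counts.items() if n > threshold]
    # Rank worst first; break ties by name so the ordering is stable.
    chronic.sort(key=lambda item: (-item[1], item[0]))
    return [(t, n) for t, n in chronic if t not in already_ticketed]


events = [
    ("test_login", date(2024, 4, 2)),
    ("test_login", date(2024, 5, 9)),
    ("test_login", date(2024, 6, 1)),
    ("test_search", date(2024, 4, 20)),
    ("test_search", date(2024, 5, 21)),
    ("test_search", date(2024, 6, 3)),
    ("test_upload", date(2024, 3, 30)),  # previous quarter, so not counted
    ("test_upload", date(2024, 4, 15)),
]
offenders = new_chronic_offenders(
    events,
    today=date(2024, 6, 15),
    threshold=2,  # the "N" from step 3; pick a value that fits your data
    already_ticketed={"test_search"},
)
print(offenders)
```

Here both `test_login` and `test_search` exceed the threshold, but only `test_login` is returned because `test_search` already has a ticket. Keeping the "new" filter separate from the "chronic" filter lets the full chronic list still go into the report while only the new entries produce tickets.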

Set it up

What you configure once, before turning it on.

1. Connect BigQuery: datasets, queries, schemas.
2. Connect Linear: issues, projects, cycles, triage.
3. Connect Confluence: spaces, pages, blueprints.
4. Connect Slack: channels, DMs, threads, mentions.
5. Set each agent's model: we leave models unset so you pick the tier, either fast and cheap or top quality.
6. Tune it to your data: edit the prompts, filters, and field mappings so it matches how your team works.
7. Test, then turn it on: run once against a sample, confirm the output, then enable the trigger.

