ENGINEERING

Flaky-Test History Warehouse and Weekly Report

Captures every flaky-test detection into BigQuery and emails engineering leads a weekly report of the top offenders, trends, and quarantine throughput.

CategoryEngineering
Enginesim
Difficultyintermediate
Triggerwebhook
Steps6
Setup~15 min

How it runs

The automated pipeline, trigger to output.

  • TriggerFlaky-test detection webhookHTTP webhook
  • ActionNormalize detection payload
  • ActionInsert record into BigQuery history tableGoogle BigQueryBigQuery
  • ActionWeekly: query top offenders and trendsGoogle BigQueryBigQuery
  • LogicFormat ranked report with trend deltas
  • OutputEmail weekly report to eng leadsGmailGmail

What it does

This workflow turns scattered flaky-test events into a durable record. Each detection is written to a BigQuery table, and once a week it aggregates the data into a leadership report covering the worst-offending tests, week-over-week flake trends, and how fast tests move in and out of quarantine.

When to use it

Use it when you want to manage flakiness as a metric, not anecdotes — to justify test-platform investment, spot suites that are degrading, and track whether your quarantine process is actually shrinking the backlog.

How it works

  1. 1An incoming webhook receives a flaky-test detection event from your CI workflows.
  2. 2The flow normalizes the payload (test name, suite, flake rate, last author, commit).
  3. 3It inserts the record into a BigQuery flaky-test history table.
  4. 4On a weekly schedule, a separate path queries BigQuery for top offenders and trend deltas.
  5. 5A logic step formats the aggregates into a ranked report with sparklines.
  6. 6It emails the report to engineering leads via Gmail.

Set it up

What you configure once, before turning it on.

  1. 1
    Connect HTTP webhookTrigger any URL on agent actions.
  2. 2
    Connect BigQueryDatasets, queries, schemas.
  3. 3
    Connect GmailRead, draft, send, label.
  4. 4
    Set each agent's modelWe leave models unset so you pick the tier — fast + cheap, or top-quality.
  5. 5
    Tune it to your dataEdit the prompts, filters, and field mappings so it matches how your team works.
  6. 6
    Test, then turn it onRun once against a sample, confirm the output, then enable the trigger.

Run this workflow in your colony.

14-day trial. No DevOps. No Sales call. Provisioned in under a minute.