ENGINEERING
Flaky-Test History Warehouse and Weekly Report
Captures every flaky-test detection into BigQuery and emails engineering leads a weekly report of the top offenders, trends, and quarantine throughput.
How it runs
The automated pipeline, trigger to output.
- Trigger: Flaky-test detection webhook (HTTP webhook)
- Action: Normalize detection payload
- Action: Insert record into BigQuery history table (BigQuery)
- Action: Weekly query of top offenders and trends (BigQuery)
- Logic: Format ranked report with trend deltas
- Output: Email weekly report to eng leads (Gmail)
What it does
This workflow turns scattered flaky-test events into a durable record. Each detection is written to a BigQuery table. Once a week, the workflow aggregates that table into a leadership report covering the worst-offending tests, week-over-week flake trends, and how fast tests move in and out of quarantine.
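The weekly aggregation can be pictured with a small in-memory sketch. This is an illustration only, not the workflow's actual query: the row keys `test_name` and `detected_on` and the function name `weekly_top_offenders` are assumptions, and quarantine throughput is left out. In the real flow, the same grouping would run as a BigQuery query over the history table.

```python
from collections import Counter
from datetime import date, timedelta


def weekly_top_offenders(rows, week_start, top_n=5):
    """Rank tests by detections in the 7 days starting at week_start.

    rows: iterable of dicts with the hypothetical keys "test_name" and
    "detected_on" (a datetime.date).
    Returns (test_name, this_week_count, delta_vs_previous_week) tuples.
    """
    prev_start = week_start - timedelta(days=7)
    week_end = week_start + timedelta(days=7)
    this_week, last_week = Counter(), Counter()
    for row in rows:
        day = row["detected_on"]
        if week_start <= day < week_end:
            this_week[row["test_name"]] += 1
        elif prev_start <= day < week_start:
            last_week[row["test_name"]] += 1
    # Highest count first; ties are broken alphabetically for stable output.
    ranked = sorted(this_week.items(), key=lambda kv: (-kv[1], kv[0]))
    # A Counter returns 0 for tests that did not flake in the previous week.
    return [(name, n, n - last_week[name]) for name, n in ranked[:top_n]]


sample_rows = [
    {"test_name": "test_login", "detected_on": date(2024, 1, 8)},
    {"test_name": "test_login", "detected_on": date(2024, 1, 9)},
    {"test_name": "test_login", "detected_on": date(2024, 1, 2)},
    {"test_name": "test_cart", "detected_on": date(2024, 1, 10)},
]
report = weekly_top_offenders(sample_rows, date(2024, 1, 8))
```

Here `report` ranks `test_login` first with two detections, one more than in the previous week. A positive delta marks a test that is getting worse.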
When to use it
Use it when you want to manage flakiness as a metric rather than by anecdote: to justify test-platform investment, spot suites that are degrading, and track whether your quarantine process is actually shrinking the backlog.
How it works
1. An incoming webhook receives a flaky-test detection event from your CI workflows.
2. The flow normalizes the payload (test name, suite, flake rate, last author, commit).
3. It inserts the record into a BigQuery flaky-test history table.
4. On a weekly schedule, a separate path queries BigQuery for top offenders and trend deltas.
5. A logic step formats the aggregates into a ranked report with sparklines.
6. It emails the report to engineering leads via Gmail.
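The normalize and insert steps above might look roughly like the following if written by hand. This is a sketch under assumptions, not the workflow's internal code: the incoming key names, the `detected_at` column, and treating flake rate as a fraction between 0 and 1 are all hypothetical. The insert helper expects a `google.cloud.bigquery.Client` and uses its `insert_rows_json` method, which returns a list of per-row errors (empty on success).

```python
from datetime import datetime, timezone

# Fields named in step 2; the exact incoming key names are an assumption.
FIELDS = ("test_name", "suite", "flake_rate", "last_author", "commit")


def normalize_detection(payload):
    """Map a raw webhook payload onto a flat row for the history table."""
    missing = [f for f in FIELDS if f not in payload]
    if missing:
        raise ValueError(f"payload missing fields: {missing}")
    rate = float(payload["flake_rate"])
    # Assumption: flake rate arrives as a fraction in [0, 1].
    if not 0.0 <= rate <= 1.0:
        raise ValueError(f"flake_rate out of range: {rate}")
    return {
        "test_name": str(payload["test_name"]).strip(),
        "suite": str(payload["suite"]).strip(),
        "flake_rate": rate,
        "last_author": str(payload["last_author"]).strip(),
        "commit": str(payload["commit"]).strip(),
        # Hypothetical column recording when the event was received (UTC).
        "detected_at": datetime.now(timezone.utc).isoformat(),
    }


def insert_detection(client, table_id, row):
    """Stream one row into the table; raise if BigQuery reports row errors."""
    errors = client.insert_rows_json(table_id, [row])
    if errors:
        raise RuntimeError(f"BigQuery insert failed: {errors}")
```

Rejecting malformed payloads before the insert keeps the history table clean, so the weekly query does not have to filter out bad rows later.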
Set it up
What you configure once, before turning it on.
1. Connect HTTP webhook: Trigger any URL on agent actions.
2. Connect BigQuery: Datasets, queries, schemas.
3. Connect Gmail: Read, draft, send, label.
4. Set each agent's model: We leave models unset so you pick the tier, fast + cheap or top-quality.
5. Tune it to your data: Edit the prompts, filters, and field mappings so it matches how your team works.
6. Test, then turn it on: Run once against a sample, confirm the output, then enable the trigger.
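For the sample run in the last step, one option is to post a hand-built detection event to the webhook. The sketch below assumes the trigger accepts a JSON body over POST. The URL is a placeholder, and the event values and helper names are made up for illustration. Substitute the webhook URL from your own trigger and the fields your CI actually emits.

```python
import json
import urllib.request

# Placeholder; replace with the webhook URL shown for your trigger.
WEBHOOK_URL = "https://example.invalid/hooks/flaky-test"

# Made-up sample event using the fields the flow normalizes.
SAMPLE_EVENT = {
    "test_name": "test_checkout_retries",
    "suite": "payments",
    "flake_rate": 0.12,
    "last_author": "dev@example.com",
    "commit": "0000000",
}


def build_sample_request(url=WEBHOOK_URL, event=SAMPLE_EVENT):
    """Build (but do not send) a JSON POST carrying one sample detection."""
    body = json.dumps(event).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


def send_sample(request):
    """Send the request and return the HTTP status code."""
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status
```

Call `send_sample(build_sample_request())` once, then confirm that a matching row appears in the history table before enabling the trigger.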
