ENGINEERING
Auto-Quarantine Tests Crossing a Datadog Flakiness Threshold
On a daily schedule, queries Datadog CI Visibility for tests whose flakiness rate exceeds a threshold and quarantines each one in the repo.
How it runs
The automated pipeline, trigger to output.
- Trigger: Daily schedule
- Action: Query Datadog CI Visibility for flaky tests over threshold (Datadog)
- Logic: Filter out already-quarantined tests
- Action: Open quarantine PR for each offender (GitHub)
- Action: Create Linear issue with flake rate and deadline (Linear)
- Output: Post quarantine summary to Slack (Slack)
What it does
Uses Datadog CI Visibility flaky-test metrics, rather than a single failure, to decide what to quarantine. Any test whose flake rate over the trailing 7-day window crosses your threshold gets quarantined, and its owner gets a data-backed ticket.
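To make the threshold decision concrete, here is a minimal sketch in Python. It assumes the flake rate is simply flaky runs divided by total runs over the window; the `FlakeStats` type, the 5% default threshold, and the `min_runs` guard are all illustrative assumptions, not values taken from the workflow or from Datadog.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FlakeStats:
    """Hypothetical per-test counts over the trailing window."""
    name: str
    total_runs: int
    flaky_runs: int


def flake_rate(stats: FlakeStats) -> float:
    """Fraction of runs that flaked; 0.0 when the test never ran."""
    if stats.total_runs == 0:
        return 0.0
    return stats.flaky_runs / stats.total_runs


def should_quarantine(stats: FlakeStats,
                      threshold: float = 0.05,  # assumed default
                      min_runs: int = 20) -> bool:  # assumed guard
    """Quarantine only when there is enough data and the rate exceeds the threshold."""
    return stats.total_runs >= min_runs and flake_rate(stats) > threshold
```

The `min_runs` guard is a design choice rather than part of the described workflow: it keeps a test that ran four times and flaked twice from being quarantined on thin evidence.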
When to use it
Use this when single-failure triggers are too noisy and you want to quarantine only persistently unreliable tests, with the evidence (flake percentage, run count) attached so owners can't dispute it.
How it works
1. A scheduled trigger runs every morning.
2. The workflow queries Datadog CI Visibility for tests with flakiness rate above the configured threshold over the last 7 days.
3. A logic step filters out tests already quarantined to avoid duplicate tickets.
4. For each remaining offender it opens a GitHub PR adding the quarantine annotation.
5. It creates a Linear issue per test, embedding the flake rate, sample count, and a link to the Datadog flaky-test view, with a re-enable deadline.
6. A summary of all quarantined tests is posted to the team's engineering Slack channel.
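Steps 3, 5, and 6 above can be sketched as plain data transformations. The Python below is an illustration only: the `Offender` type, the issue dictionary's field names, the 14-day grace period, and the summary wording are assumptions, and the actual calls to Datadog, GitHub, Linear, and Slack are deliberately left out.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class Offender:
    """Hypothetical record for one test returned by the flaky-test query."""
    test_id: str
    flake_rate: float   # fraction between 0 and 1
    sample_count: int   # runs observed in the window
    datadog_url: str    # link to the flaky-test view


def filter_new_offenders(offenders, already_quarantined):
    """Step 3: drop tests that are already quarantined."""
    return [o for o in offenders if o.test_id not in already_quarantined]


def build_issue(offender, today, grace_days=14):
    """Step 5: assemble the ticket content (field names are assumptions)."""
    deadline = today + timedelta(days=grace_days)
    return {
        "title": f"Quarantined flaky test: {offender.test_id}",
        "description": (
            f"Flake rate: {offender.flake_rate:.1%} "
            f"over {offender.sample_count} runs\n"
            f"Datadog: {offender.datadog_url}"
        ),
        "due_date": deadline.isoformat(),
    }


def build_summary(offenders):
    """Step 6: one message listing everything quarantined in this run."""
    if not offenders:
        return "No new tests quarantined today."
    lines = [f"Quarantined {len(offenders)} flaky test(s):"]
    lines += [
        f"- {o.test_id}: {o.flake_rate:.1%} over {o.sample_count} runs"
        for o in offenders
    ]
    return "\n".join(lines)
```

Passing `today` in as an argument, rather than reading the clock inside `build_issue`, keeps the deadline calculation easy to check during the "run once against a sample" step.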
Set it up
What you configure once, before turning it on.
1. Connect Datadog: Metrics, traces, log search.
2. Connect GitHub: Repos, issues, pull requests, actions.
3. Connect Linear: Issues, projects, cycles, triage.
4. Connect Slack: Channels, DMs, threads, mentions.
5. Set each agent's model: We leave models unset so you can pick the tier (fast and cheap, or top-quality).
6. Tune it to your data: Edit the prompts, filters, and field mappings so it matches how your team works.
7. Test, then turn it on: Run once against a sample, confirm the output, then enable the trigger.
