ENGINEERING
Triage Flaky Tests by Datadog Flakiness Rate
On a daily schedule, pulls per-test flakiness rates from Datadog CI Visibility and quarantines any test above a threshold by filing a labeled Linear ticket.
How it runs
The automated pipeline, trigger to output.
- Trigger: Daily schedule
- Action: Query Datadog CI flakiness rates (Datadog)
- Logic: Filter tests above flake-rate threshold
- Action: Resolve owning team from CODEOWNERS (GitHub)
- Action: Create labeled Linear ticket per test (Linear)
- Output: Post quarantine digest to Slack (Slack)
What it does
Runs a daily sweep over Datadog CI Visibility flakiness metrics, ranks tests by their flake rate, and quarantines any that cross a configured threshold (e.g. >5% over 50 runs). Each quarantined test gets a Linear ticket labeled `flaky`, assigned to the owning team via CODEOWNERS.
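As a rough illustration of the threshold step, here is a minimal Python sketch. It assumes "over 50 runs" means a test must have at least 50 runs in the window before its rate counts, and that the per-test counts have already been fetched from Datadog. The names `FlakeStats`, `select_for_quarantine`, `max_flake_rate`, and `min_runs` are hypothetical and not part of the workflow itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FlakeStats:
    """Per-test counts over the trailing window (hypothetical shape)."""
    name: str
    total_runs: int
    flaky_runs: int

    @property
    def flake_rate(self) -> float:
        # Guard against tests that did not run in the window.
        return self.flaky_runs / self.total_runs if self.total_runs else 0.0


def select_for_quarantine(stats, max_flake_rate=0.05, min_runs=50):
    """Return tests whose flake rate exceeds the threshold, worst first.

    Tests with fewer than `min_runs` runs are skipped so that a single
    flaky run on a rarely executed test does not trigger quarantine.
    """
    eligible = [
        s for s in stats
        if s.total_runs >= min_runs and s.flake_rate > max_flake_rate
    ]
    return sorted(eligible, key=lambda s: s.flake_rate, reverse=True)
```

The minimum-run guard is a design choice: without it, a test that ran twice and flaked once would show a 50% rate and outrank tests with far more evidence.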
When to use it
Use it when you have Datadog Test Optimization wired into CI and want a metrics-driven, proactive quarantine policy instead of chasing individual red builds. It catches slow-burn flakiness that single-run detectors miss.
How it works
- 1. A daily schedule trigger kicks off the sweep.
- 2. It queries Datadog CI Visibility for per-test flake rates over the trailing window.
- 3. A filter keeps only tests above the configured flake-rate threshold.
- 4. For each, it resolves the owning team from the repo's CODEOWNERS via the GitHub API.
- 5. It creates a Linear ticket with the `flaky` label, flake-rate stats, and Datadog deep links, assigned to that team.
- 6. It posts the day's quarantine digest to the engineering Slack channel.
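The ownership lookup in step 4 can be sketched as follows. This is a simplified illustration, not the workflow's actual implementation: it assumes the CODEOWNERS file contents have already been fetched as text, and it handles only a subset of the pattern syntax (bare globs, root-anchored paths, and directory rules). Note that `fnmatchcase` lets `*` cross `/`, which is looser than real CODEOWNERS matching. The helper names `parse_codeowners` and `resolve_owners` are hypothetical.

```python
from fnmatch import fnmatchcase


def parse_codeowners(text):
    """Parse CODEOWNERS text into (pattern, owners) rules, in file order."""
    rules = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blanks
        if not line:
            continue
        pattern, *owners = line.split()
        rules.append((pattern, owners))
    return rules


def _matches(pattern, path):
    """Simplified matcher; covers only common CODEOWNERS pattern shapes."""
    if pattern.endswith("/"):
        directory = pattern.lstrip("/")
        if pattern.startswith("/"):
            # Root-anchored directory, e.g. /services/payments/
            return path.startswith(directory)
        # Unanchored directory, e.g. docs/ matches at any depth
        return f"/{directory}" in f"/{path}"
    if "/" not in pattern:
        # Bare glob such as *.py applies to the file name at any depth
        return fnmatchcase(path.rsplit("/", 1)[-1], pattern)
    return fnmatchcase(path, pattern.lstrip("/"))


def resolve_owners(rules, path):
    """Return owners for a repo-relative path; the last matching rule wins."""
    owners = []
    for pattern, rule_owners in rules:
        if _matches(pattern, path):
            owners = rule_owners
    return owners
```

For example, with a rule `/services/payments/ @acme/payments` placed after a catch-all `* @acme/platform`, a flaky test at `services/payments/tests/test_refund.py` resolves to `@acme/payments`, because later rules take precedence over earlier ones. The team handles here are made up for illustration.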
Set it up
What you configure once, before turning it on.
- 1. Connect Datadog: Metrics, traces, log search.
- 2. Connect Linear: Issues, projects, cycles, triage.
- 3. Connect GitHub: Repos, issues, pull requests, actions.
- 4. Connect Slack: Channels, DMs, threads, mentions.
- 5. Set each agent's model: We leave models unset so you pick the tier, either fast and cheap or top-quality.
- 6. Tune it to your data: Edit the prompts, filters, and field mappings so it matches how your team works.
- 7. Test, then turn it on: Run once against a sample, confirm the output, then enable the trigger.