ENGINEERING
Nightly flaky-test sweep from Datadog pass-rate metrics
Each night this queries Datadog CI Visibility for tests whose pass rate dipped into the flaky band over the last 7 days, tags them, and files a Linear issue assigned…
How it runs
The automated pipeline, trigger to output.
- Trigger: Nightly schedule
- Action: Query Datadog CI test pass rates (7d)
- Logic: Keep tests in the flaky pass-rate band
- Action: Tag flaky tests as quarantine in Datadog
- Output: File owner-assigned Linear quarantine issue
What it does
It runs a scheduled sweep against Datadog CI Visibility, pulling each test's 7-day pass rate. Tests in the flaky band (sometimes passing, sometimes failing, but not consistently broken) are auto-tagged and routed to Linear as owned quarantine tickets. Consistently failing and consistently passing tests are left alone.
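To make "7-day pass rate" concrete, here is a minimal sketch of the arithmetic: passed runs divided by total runs inside the trailing window. The `(test_name, finished_at, status)` record shape and the `pass_rates` helper are assumptions for illustration, not Datadog's response format; in the actual workflow the aggregation comes from the Datadog query.

```python
from collections import defaultdict
from datetime import datetime, timedelta, timezone


def pass_rates(runs, now, window_days=7):
    """Return {test_name: (pass_rate, run_count)} for runs inside the trailing window.

    `runs` is an iterable of (test_name, finished_at, status) tuples.
    This shape is hypothetical and stands in for whatever the CI data source returns.
    """
    cutoff = now - timedelta(days=window_days)
    passed = defaultdict(int)
    total = defaultdict(int)
    for name, finished_at, status in runs:
        if finished_at < cutoff:
            continue  # older than the window, ignore
        total[name] += 1
        if status == "pass":
            passed[name] += 1
    return {name: (passed[name] / total[name], total[name]) for name in total}


now = datetime(2024, 1, 8, tzinfo=timezone.utc)
sample = [
    ("test_checkout", now - timedelta(days=1), "pass"),
    ("test_checkout", now - timedelta(days=2), "fail"),
    ("test_checkout", now - timedelta(days=10), "fail"),  # outside the window
]
rates = pass_rates(sample, now)
```

Returning the run count alongside the rate matters because a rate computed from a handful of runs is too noisy to act on.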
When to use it
Use it when you already ship CI test data to Datadog and want a daily, metric-driven view of instability rather than reacting to individual run failures.
How it works
- 1. A nightly schedule triggers the sweep.
- 2. Datadog CI Visibility is queried for per-test pass rate, run count, and owning service over the trailing 7 days.
- 3. A logic step keeps tests in the flaky band (e.g. 5%-95% pass rate with enough runs) and drops always-pass and always-fail outliers.
- 4. Each surviving test is tagged `quarantine` via the Datadog API for dashboard tracking.
- 5. A Linear issue is created per flaky test, assigned to the owning team, with pass-rate trend and recent run links in the body.
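The filtering and ticket-drafting steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the workflow's actual implementation: the `FlakyCandidate` record, the `MIN_RUNS` value of 20, the inclusive band edges, and the service-to-team `owners` mapping are all hypothetical, and the calls to Datadog and Linear are deliberately left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FlakyCandidate:
    name: str
    service: str       # owning service, as reported by the CI data
    pass_rate: float   # 0.0 to 1.0 over the trailing 7 days
    run_count: int


# Band edges follow the 5%-95% example; MIN_RUNS is an assumed value.
FLAKY_LOW = 0.05
FLAKY_HIGH = 0.95
MIN_RUNS = 20


def is_flaky(t, low=FLAKY_LOW, high=FLAKY_HIGH, min_runs=MIN_RUNS):
    """True when the test has enough runs and its pass rate sits inside the band."""
    return t.run_count >= min_runs and low <= t.pass_rate <= high


def draft_issue(t, owners):
    """Build a plain dict describing the ticket; sending it to Linear is out of scope."""
    return {
        "title": f"Quarantine flaky test: {t.name}",
        "team": owners.get(t.service, "unassigned"),
        "description": (
            f"7-day pass rate: {t.pass_rate:.0%} over {t.run_count} runs "
            f"(service: {t.service})."
        ),
    }


def sweep(tests, owners):
    """Keep tests in the flaky band and draft one issue per surviving test."""
    return [draft_issue(t, owners) for t in tests if is_flaky(t)]
```

Keeping the filter and the issue draft as pure functions makes it easy to run the sweep once against a sample and inspect the drafted tickets before anything is tagged or filed.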
Set it up
What you configure once, before turning it on.
- 1. Connect Datadog: metrics, traces, log search.
- 2. Connect Linear: issues, projects, cycles, triage.
- 3. Set each agent's model: we leave models unset so you pick the tier, either fast and cheap or top quality.
- 4. Tune it to your data: edit the prompts, filters, and field mappings so it matches how your team works.
- 5. Test, then turn it on: run once against a sample, confirm the output, then enable the trigger.