DEVOPS
Auto-quarantine flaky tests on intermittent CI failures
When a GitHub Actions test run fails, this workflow checks whether the failing test passed on a retry; if so, it marks the test as flaky and applies a quarantine annotation.
How it runs
The automated pipeline, trigger to output.
- Trigger: GitHub `check_run` completed for a test job (GitHub)
- Logic: Failed once but passed on retry?
- Action: Parse test report artifact for failing test name (GitHub)
- Action: Append test to quarantine manifest in repo (GitHub)
- Output: Open GitHub issue labeled `flaky-test` with evidence (GitHub)
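The logic step ("failed once but passed on retry?") can be sketched as a small classifier. This is a minimal illustration, not the workflow's actual implementation: the `Attempt` record and the `find_flaky_tests` function are hypothetical names, and it assumes you can collect a pass/fail result for every attempt of every test in one workflow run.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Attempt:
    """One execution of a test within a single workflow run (hypothetical shape)."""
    test_id: str
    attempt: int   # 1 for the first run, 2 and up for retries
    passed: bool


def find_flaky_tests(attempts: list[Attempt]) -> list[str]:
    """Return ids of tests that failed on some attempt and passed on a later one."""
    by_test: dict[str, list[Attempt]] = {}
    for a in attempts:
        by_test.setdefault(a.test_id, []).append(a)

    flaky = []
    for test_id, runs in by_test.items():
        runs.sort(key=lambda r: r.attempt)
        first_failure = next((r.attempt for r in runs if not r.passed), None)
        if first_failure is None:
            continue  # never failed, so nothing to quarantine
        if any(r.passed and r.attempt > first_failure for r in runs):
            flaky.append(test_id)  # failed, then passed on a retry
        # A test that failed and never passed is treated as a genuine failure.
    return sorted(flaky)
```

A test that fails on every attempt is deliberately left out, so genuine failures still block the pipeline.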
What it does
Watches GitHub Actions test runs and distinguishes genuine failures from intermittent flakes. When a test fails but passes on an automatic retry within the same workflow, it tags the test as flaky, adds it to a quarantine list, and files a GitHub issue so the team can track and fix it later without blocking the pipeline.
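As a rough sketch of the issue-filing step, the function below builds a request body in the shape accepted by GitHub's "create an issue" REST endpoint (`POST /repos/{owner}/{repo}/issues`, with `title`, `body`, and `labels` fields). The function name, its parameters, the body layout, and the 2,000-character log limit are all assumptions made for illustration; only the `flaky-test` label comes from the description above.

```python
def build_flaky_issue(test_id: str, failure_log: str,
                      passing_attempt: int, owning_team: str) -> dict:
    """Build a JSON-serializable payload for a flaky-test issue (hypothetical layout)."""
    max_log_chars = 2000  # assumed limit so the issue body stays readable
    excerpt = failure_log[:max_log_chars]
    body = "\n".join([
        f"Test: {test_id}",
        f"Owning team: {owning_team}",
        f"Retry evidence: failed first, then passed on attempt {passing_attempt}.",
        "",
        "Failure log excerpt:",
        excerpt,
    ])
    return {
        "title": f"Flaky test quarantined: {test_id}",
        "body": body,
        "labels": ["flaky-test"],
    }
```

The payload is kept separate from the HTTP call so it can be reviewed or unit-tested before anything is posted to the repository.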
When to use it
Use this when intermittent test failures are forcing engineers to re-run CI by hand and eroding trust in the build. It keeps the main branch deployable while preserving a paper trail of every quarantined test.
How it works
- 1. A GitHub `check_run` completed event fires when a workflow test job finishes.
- 2. A filter checks the conclusion: only proceed if the job failed at least once but a retry of the same test passed.
- 3. The flow reads the JUnit/test report artifact to extract the exact failing test name and file.
- 4. It appends the test identifier to a quarantine manifest committed to the repo.
- 5. It opens a GitHub issue labeled `flaky-test` with the failure logs, retry evidence, and owning team.
- 6. The final step posts the quarantine summary so reviewers know the build was unblocked, not silently passed.
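Steps 3 and 4 above might look something like the following sketch. It assumes a JUnit-style XML report in which failing `<testcase>` elements contain a `<failure>` or `<error>` child, and a plain-text manifest with one test identifier per line. The function names, the `classname::name` identifier format, and the manifest format are assumptions for illustration; the `file` attribute is only emitted by some test runners, so it may come back as `None`.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def failing_tests(junit_xml: str) -> list[dict]:
    """Extract id and file for every failed or errored test case in a JUnit report."""
    root = ET.fromstring(junit_xml)
    failures = []
    for case in root.iter("testcase"):
        if case.find("failure") is not None or case.find("error") is not None:
            failures.append({
                "id": f'{case.get("classname", "")}::{case.get("name", "")}',
                "file": case.get("file"),  # not all runners emit this attribute
            })
    return failures


def quarantine(manifest: Path, test_id: str) -> bool:
    """Add test_id to the manifest; return False if it was already listed."""
    existing = manifest.read_text().splitlines() if manifest.exists() else []
    if test_id in existing:
        return False  # already quarantined, keep the manifest unchanged
    manifest.write_text("\n".join(existing + [test_id]) + "\n")
    return True
```

Making the append idempotent means a test that flakes in several runs produces a single manifest entry, which keeps the committed diff small.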
Set it up
What you configure once, before turning it on.
- 1. Connect GitHub: repos, issues, pull requests, actions.
- 2. Set each agent's model: models are left unset so you pick the tier, either fast and cheap or top quality.
- 3. Tune it to your data: edit the prompts, filters, and field mappings so it matches how your team works.
- 4. Test, then turn it on: run once against a sample, confirm the output, then enable the trigger.

