DEVOPS
Auto-rerun a single failed test to confirm flakiness
When a CI run fails on one test, this workflow dispatches a targeted rerun of just that test a few times; if the test passes on any retry it is confirmed flaky and quarantined, otherwise it is escalated…
How it runs
The automated pipeline, trigger to output.
- Trigger: GitHub workflow_run failed (GitHub)
- Action: Dispatch targeted rerun of failing test (GitHub)
- Logic: Tally reruns (flaky vs. true failure)
- Action: Label and quarantine confirmed flakes (GitHub)
- Output: Page on-call for confirmed regressions (PagerDuty)
What it does
This template proves whether a failure is flaky instead of guessing. When a CI run fails, it isolates the failing test and dispatches a targeted GitHub Actions rerun of only that test a configurable number of times. A mix of pass and fail confirms a flake; consistent failure confirms a real bug.
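The tally rule can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the template's actual implementation: the function name `classify_reruns` and the verdict strings `"flaky"` and `"regression"` are hypothetical.

```python
from typing import Sequence


def classify_reruns(rerun_passed: Sequence[bool]) -> str:
    """Classify a test that already failed once in CI, given its targeted reruns.

    rerun_passed holds one boolean per rerun (True means the rerun passed).
    The original CI failure already counts as one fail, so any passing rerun
    produces the pass/fail mix that marks the test as flaky.
    """
    if not rerun_passed:
        raise ValueError("at least one rerun is required to classify the failure")
    if any(rerun_passed):
        return "flaky"
    # Every rerun failed, matching the original failure: treat as a real bug.
    return "regression"


print(classify_reruns([False, True, False]))   # flaky
print(classify_reruns([False, False, False]))  # regression
```

More reruns lower the chance that an intermittent failure happens to fail every time and is misread as a regression, at the cost of more CI minutes.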
When to use it
Use it when full-suite reruns are too slow and you want fast, evidence-based confirmation before quarantining anything. It avoids quarantining tests that are actually broken.
How it works
1. A GitHub workflow_run failure event fires the trigger.
2. The flow extracts the single failing test and dispatches a targeted rerun workflow N times.
3. A logic step tallies the rerun outcomes: any passes among failures means flaky; all-fail means a true regression.
4. Confirmed flakes get the `flaky` label and are added to the quarantine list via GitHub.
5. Confirmed regressions trigger a PagerDuty alert to the on-call owner so the breakage is handled immediately.
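The dispatch and routing steps above might translate into HTTP requests along these lines. This is a hedged sketch, not the template's real wiring. It only builds request descriptions and sends nothing. The endpoints are the public GitHub REST routes for workflow dispatch and issue labels and the PagerDuty Events API v2 endpoint. Everything else is an assumption for illustration: the helper names, the `test_id` and `attempt` workflow inputs (the rerun workflow would have to declare them), the idea that the `flaky` label goes on a tracking issue or pull request, and the alert wording. How the quarantine list is stored is not specified in the source, so that update is left out.

```python
from typing import Any

GITHUB_API = "https://api.github.com"
PAGERDUTY_EVENTS_URL = "https://events.pagerduty.com/v2/enqueue"


def build_rerun_dispatches(
    owner: str, repo: str, workflow_file: str, ref: str, test_id: str, attempts: int
) -> list[dict[str, Any]]:
    """Build one workflow_dispatch request per targeted rerun attempt."""
    url = f"{GITHUB_API}/repos/{owner}/{repo}/actions/workflows/{workflow_file}/dispatches"
    return [
        {
            "method": "POST",
            "url": url,
            # "test_id" and "attempt" are hypothetical inputs of the rerun workflow.
            "json": {"ref": ref, "inputs": {"test_id": test_id, "attempt": str(i)}},
        }
        for i in range(1, attempts + 1)
    ]


def build_followup(
    verdict: str, owner: str, repo: str, issue_number: int, test_id: str, routing_key: str
) -> dict[str, Any]:
    """Route a verdict: label a confirmed flake, or page on-call for a regression."""
    if verdict == "flaky":
        return {
            "method": "POST",
            "url": f"{GITHUB_API}/repos/{owner}/{repo}/issues/{issue_number}/labels",
            "json": {"labels": ["flaky"]},
        }
    if verdict == "regression":
        return {
            "method": "POST",
            "url": PAGERDUTY_EVENTS_URL,
            "json": {
                "routing_key": routing_key,
                "event_action": "trigger",
                "payload": {
                    "summary": f"Confirmed regression: {test_id} failed every targeted rerun",
                    "source": f"{owner}/{repo}",
                    "severity": "critical",
                },
            },
        }
    raise ValueError(f"unknown verdict: {verdict!r}")


# Placeholder values throughout; none of these names come from the template.
requests_to_send = build_rerun_dispatches(
    "example-org", "example-repo", "rerun-single-test.yml", "main", "tests/test_api.py::test_timeout", 3
)
print(len(requests_to_send))  # 3
```

Keeping request construction separate from sending makes the routing logic easy to unit test without GitHub or PagerDuty credentials.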
Set it up
What you configure once, before turning it on.
1. Connect GitHub: Repos, issues, pull requests, actions.
2. Connect PagerDuty: Incidents, on-call, escalations.
3. Set each agent's model: We leave models unset so you pick the tier, either fast and cheap or top-quality.
4. Tune it to your data: Edit the prompts, filters, and field mappings so it matches how your team works.
5. Test, then turn it on: Run once against a sample, confirm the output, then enable the trigger.
Run it inside a business
This workflow drops into a full company template. Import the org, and this is one of the playbooks its agents run.
