
Auto-rerun a single failed test to confirm flakiness

When a CI run fails on one test, this workflow dispatches a targeted rerun of just that test a few times. If the test passes on retry, it is confirmed flaky and quarantined; otherwise it is escalated…

Category: DevOps

Engine: sim

Difficulty: advanced

Trigger: event

Steps: 5

Setup: ~25 min

How it runs

The automated pipeline, trigger to output.

Trigger: GitHub workflow_run failed (GitHub)
Action: Dispatch targeted rerun of failing test (GitHub)
Logic: Tally reruns: flaky vs true failure
Action: Label and quarantine confirmed flakes (GitHub)
Output: Page on-call for confirmed regressions (PagerDuty)
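
The trigger condition could be sketched roughly as follows. This is a minimal illustration, not the template's actual implementation: it assumes the flow receives a GitHub `workflow_run` webhook payload, where `action` is `completed` and `workflow_run.conclusion` is `failure` for a failed run. The function name and the sample payload are hypothetical.

```python
from typing import Any, Mapping


def is_failed_workflow_run(event: Mapping[str, Any]) -> bool:
    """Return True only for a completed workflow run that concluded in failure.

    Hypothetical helper: the field names follow GitHub's workflow_run webhook
    payload, but how the template itself filters events is an assumption.
    """
    run = event.get("workflow_run") or {}
    return event.get("action") == "completed" and run.get("conclusion") == "failure"


# Trimmed, made-up payload for illustration only.
sample_event = {
    "action": "completed",
    "workflow_run": {"id": 1, "name": "CI", "conclusion": "failure"},
}

if __name__ == "__main__":
    print(is_failed_workflow_run(sample_event))
```

Runs that are still in progress, or that completed successfully, fall through and never start the pipeline.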

What it does

This template proves whether a failure is flaky instead of guessing. When a CI run fails, it isolates the failing test and dispatches a targeted GitHub Actions rerun of only that test a configurable number of times. A mix of pass and fail confirms a flake; consistent failure confirms a real bug.
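
As a rough sketch of what "dispatching a targeted rerun a configurable number of times" might look like, the snippet below builds (but does not send) one request per attempt for GitHub's "create a workflow dispatch event" REST endpoint. The workflow file name `rerun-single-test.yml`, the input names `test_id` and `attempt`, and the helper itself are assumptions for illustration; the template's real rerun workflow may be wired differently.

```python
from dataclasses import dataclass
from typing import Dict, List

GITHUB_API = "https://api.github.com"


@dataclass(frozen=True)
class DispatchRequest:
    """One POST to the workflow dispatch endpoint (built here, not sent)."""
    url: str
    body: Dict[str, object]


def build_rerun_dispatches(
    owner: str,
    repo: str,
    workflow_file: str,  # hypothetical rerun workflow, e.g. "rerun-single-test.yml"
    ref: str,            # branch or tag name to run the workflow on
    test_id: str,        # identifier of the single failing test
    attempts: int,       # the configurable N
) -> List[DispatchRequest]:
    """Build one dispatch request per rerun attempt."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    url = f"{GITHUB_API}/repos/{owner}/{repo}/actions/workflows/{workflow_file}/dispatches"
    return [
        # Input values are passed as strings; "test_id" and "attempt" are
        # assumed input names that the rerun workflow would have to declare.
        DispatchRequest(url=url, body={"ref": ref, "inputs": {"test_id": test_id, "attempt": str(i)}})
        for i in range(1, attempts + 1)
    ]


if __name__ == "__main__":
    for req in build_rerun_dispatches(
        "acme", "widgets", "rerun-single-test.yml", "main",
        "tests/test_api.py::test_timeout", attempts=3,
    ):
        print(req.url, req.body)
```

Keeping request construction separate from sending makes the rerun count easy to test without touching the network.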

When to use it

Use it when full-suite reruns are too slow and you want fast, evidence-based confirmation before quarantining anything. It avoids quarantining tests that are actually broken.

How it works

1. A GitHub workflow_run failure event fires the trigger.
2. The flow extracts the single failing test and dispatches a targeted rerun workflow N times.
3. A logic step tallies the rerun outcomes: any passes among failures means flaky; all-fail means a true regression.
4. Confirmed flakes get the `flaky` label and are added to the quarantine list via GitHub.
5. Confirmed regressions trigger a PagerDuty alert to the on-call owner so the breakage is handled immediately.
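
The tally and routing in steps 3 to 5 can be sketched as below. This is an illustrative reading of the rule stated above, not the template's own logic step: the outcome strings `"pass"` and `"fail"`, the function names, and the shape of the returned action are all assumptions. Because the original CI run already failed, a single passing rerun is treated as enough to call the test flaky.

```python
from collections import Counter
from typing import Dict, Iterable


def classify_reruns(outcomes: Iterable[str]) -> str:
    """Classify rerun outcomes as 'flaky' or 'regression'.

    Any pass among the reruns means flaky; all failures mean a true regression.
    """
    tally = Counter(outcomes)
    if not tally:
        raise ValueError("no rerun outcomes to tally")
    unknown = set(tally) - {"pass", "fail"}
    if unknown:
        raise ValueError(f"unexpected outcomes: {sorted(unknown)}")
    return "flaky" if tally["pass"] > 0 else "regression"


def next_action(test_id: str, outcomes: Iterable[str]) -> Dict[str, str]:
    """Map the verdict to the follow-up step (hypothetical action shape)."""
    if classify_reruns(outcomes) == "flaky":
        # Step 4: label and quarantine via GitHub.
        return {"target": "github", "label": "flaky", "quarantine": test_id}
    # Step 5: page the on-call owner via PagerDuty.
    return {"target": "pagerduty", "summary": f"Confirmed regression: {test_id} failed all reruns"}


if __name__ == "__main__":
    print(next_action("tests/test_api.py::test_timeout", ["fail", "pass", "fail"]))
    print(next_action("tests/test_api.py::test_timeout", ["fail", "fail", "fail"]))
```

An empty result set raises an error instead of defaulting to either verdict, so a rerun that never reported back cannot silently quarantine a test or page anyone.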

Set it up

What you configure once, before turning it on.

1. Connect GitHub: Repos, issues, pull requests, actions.
2. Connect PagerDuty: Incidents, on-call, escalations.
3. Set each agent's model: We leave models unset so you pick the tier (fast + cheap, or top-quality).
4. Tune it to your data: Edit the prompts, filters, and field mappings so it matches how your team works.
5. Test, then turn it on: Run once against a sample, confirm the output, then enable the trigger.


