ENGINEERING
Triage Failing PR Checks and Auto-Label Suspected Flakes
When a pull request check fails, this workflow inspects the failure to decide whether it looks like a known flake or a real regression, then labels the PR accordingly.
How it runs
The automated pipeline, trigger to output.
- Trigger: GitHub check_run failed on PR (GitHub)
- Action: Fetch failing test names and error output (GitHub)
- Logic: Classify flake vs regression against known-flake registry
- Action: Label PR and post explanatory comment (GitHub)
- Logic: Proceed only if classified as flake
- Output: Create or link Linear deflake ticket (Linear)
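As a rough illustration of the trigger step, the sketch below filters an incoming GitHub `check_run` webhook payload down to "failed check attached to a PR". It relies on the documented payload fields `action`, `check_run.conclusion`, and `check_run.pull_requests`; the function name `should_triage` and the choice to treat only the `failure` conclusion as failing are assumptions, not part of the workflow as described.

```python
from typing import Any, Dict


def should_triage(event: Dict[str, Any]) -> bool:
    """Return True when a check_run webhook payload is a failed check on a PR.

    Hypothetical helper: the workflow's real trigger logic is not shown
    in the source, so this is only one plausible filter.
    """
    # Only finished check runs carry a conclusion.
    if event.get("action") != "completed":
        return False
    check_run = event.get("check_run") or {}
    # Assumption: only "failure" counts; "timed_out" etc. are ignored here.
    if check_run.get("conclusion") != "failure":
        return False
    # An empty pull_requests list means the check is not linked to a PR.
    return bool(check_run.get("pull_requests"))


sample = {
    "action": "completed",
    "check_run": {"conclusion": "failure", "pull_requests": [{"number": 42}]},
}
print(should_triage(sample))
```

Note that GitHub can leave `pull_requests` empty for checks on branches in forked repositories, so a real trigger may need a fallback lookup by commit SHA.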
What it does
It runs on every failing PR check and answers the question developers waste time on: is this my change, or a flaky test? It matches the failing tests against a known-flake registry and failure-signature heuristics, then labels the PR `suspected-flake` or `likely-real` and comments with the reasoning. Failures classified as flakes get a tracked Linear deflake ticket.
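The classification described above could look something like the following minimal sketch. The registry contents, the signature patterns, the `Verdict` type, and the rule that a single unmatched failure makes the whole PR `likely-real` are all hypothetical choices made for illustration; only the two label names come from the description.

```python
import re
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical registry: test names previously confirmed as flaky.
KNOWN_FLAKES = {"test_checkout_retries", "test_websocket_reconnect"}

# Hypothetical failure signatures that tend to indicate flakiness.
FLAKE_SIGNATURES = [
    re.compile(r"timed? ?out", re.IGNORECASE),
    re.compile(r"connection (reset|refused)", re.IGNORECASE),
]


@dataclass
class Verdict:
    label: str          # "suspected-flake" or "likely-real"
    reasons: List[str]  # one line per failing test, for the PR comment


def classify(failures: Dict[str, str]) -> Verdict:
    """Classify a failed check; `failures` maps test name to error output."""
    reasons: List[str] = []
    all_flaky = bool(failures)  # no failing tests found: do not call it a flake
    for test, output in failures.items():
        if test in KNOWN_FLAKES:
            reasons.append(f"{test}: listed in known-flake registry")
            continue
        match = next((s for s in FLAKE_SIGNATURES if s.search(output)), None)
        if match is not None:
            reasons.append(f"{test}: output matches signature {match.pattern!r}")
        else:
            reasons.append(f"{test}: no registry or signature match")
            all_flaky = False  # conservative: one unexplained failure wins
    return Verdict("suspected-flake" if all_flaky else "likely-real", reasons)


verdict = classify({"test_upload": "ReadTimeout: request timed out after 30s"})
print(verdict.label, verdict.reasons)
```

Erring toward `likely-real` whenever any failure is unexplained keeps a real regression from hiding behind a flaky neighbor, at the cost of some extra manual checks.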
When to use it
Use it on busy repos where contributors can't tell whether a red check should block them. It cuts down the reflexive "just re-run it" habit and gives reviewers a defensible signal.
How it works
- 1. A GitHub `check_run` failed event fires on a PR.
- 2. An action fetches the failing test names and error output from the check.
- 3. A logic step compares them against the known-flake registry and failure-signature rules to classify flake versus regression.
- 4. A GitHub action applies the matching label and posts an explanatory PR comment.
- 5. A logic gate proceeds only when the failure is classified as a flake.
- 6. A Linear action creates or links a deflake ticket capturing the PR, test, and signature.
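Steps 5 and 6 can be sketched as a gate followed by a create-or-link operation. The example below uses an in-memory `TicketStore` as a stand-in for Linear, so no real Linear API is called; the class names, the `after_labeling` function, and the idea of keying tickets on a hash of test name plus signature are assumptions made for illustration.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Ticket:
    key: str
    test: str
    signature: str
    prs: List[int] = field(default_factory=list)  # PRs that hit this flake


class TicketStore:
    """In-memory stand-in for Linear (hypothetical; no real API calls)."""

    def __init__(self) -> None:
        self._by_key: Dict[str, Ticket] = {}

    def create_or_link(self, pr: int, test: str, signature: str) -> Tuple[Ticket, bool]:
        # Same test + same signature maps to the same ticket.
        key = hashlib.sha256(f"{test}\n{signature}".encode()).hexdigest()[:12]
        ticket = self._by_key.get(key)
        created = ticket is None
        if ticket is None:
            ticket = Ticket(key, test, signature)
            self._by_key[key] = ticket
        if pr not in ticket.prs:
            ticket.prs.append(pr)  # link this PR to the existing ticket
        return ticket, created


def after_labeling(label: str, pr: int, test: str, signature: str,
                   store: TicketStore) -> Optional[Ticket]:
    """Gate: only failures labeled suspected-flake reach the ticket step."""
    if label != "suspected-flake":
        return None
    ticket, _created = store.create_or_link(pr, test, signature)
    return ticket


store = TicketStore()
first = after_labeling("suspected-flake", 101, "test_upload", "timeout", store)
second = after_labeling("suspected-flake", 102, "test_upload", "timeout", store)
print(first is second, first.prs)
```

Linking repeat occurrences to one ticket rather than opening a new one each time keeps the deflake backlog readable and gives a running count of how many PRs a given flake has disrupted.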
Set it up
What you configure once, before turning it on.
- 1. Connect GitHub: Repos, issues, pull requests, actions.
- 2. Connect Linear: Issues, projects, cycles, triage.
- 3. Set each agent's model: We leave models unset so you pick the tier, either fast and cheap or top-quality.
- 4. Tune it to your data: Edit the prompts, filters, and field mappings so it matches how your team works.
- 5. Test, then turn it on: Run once against a sample, confirm the output, then enable the trigger.
Run it inside a business
This workflow drops into a full company template. Import the org, and this is one of the playbooks its agents run.

