ENGINEERING

On-Demand Flake Classifier via PR Comment

Triggered by a slash-command comment on a pull request, this agent pulls the failing job's logs.

CategoryEngineering
Enginepaperclip
Difficultyintermediate
Triggerevent
Steps5
Setup~15 min

How it runs

The automated pipeline, trigger to output.

  • TriggerPR comment '/classify-flake'GitHubGitHub
  • ActionFetch failing logs and run historyGitHubGitHub
  • ActionClassify against flake patternsOpenAI
  • LogicDecide flaky / real / inconclusive
  • OutputReply on PR with verdict and suggestionGitHubGitHub

What it does

This workflow gives engineers an instant second opinion on a red check. When someone comments a command like /classify-flake on a PR, the agent fetches the failing job logs, matches them against known flake signatures (timeouts, race conditions, network blips) and the test's recent history, then replies in-thread with a confidence-scored verdict and, if flaky, a ready-to-apply quarantine suggestion.

When to use it

Use this when developers need a fast, evidence-based call on whether a failure is worth investigating or just noise, without leaving the pull request.

How it works

  1. 1A GitHub issue_comment containing the slash command triggers the flow.
  2. 2An action retrieves the failing check's logs and the test's recent run history.
  3. 3An agent classifies the failure against flake patterns and assigns a confidence score.
  4. 4A logic step decides flaky, real, or inconclusive.
  5. 5The output replies on the PR with the verdict and a suggested quarantine action.

Set it up

What you configure once, before turning it on.

  1. 1
    Connect GitHubRepos, issues, pull requests, actions.
  2. 2
    Connect OpenAIModels, embeddings, files.
  3. 3
    Set each agent's modelWe leave models unset so you pick the tier — fast + cheap, or top-quality.
  4. 4
    Tune it to your dataEdit the prompts, filters, and field mappings so it matches how your team works.
  5. 5
    Test, then turn it onRun once against a sample, confirm the output, then enable the trigger.

Run this workflow in your colony.

14-day trial. No DevOps. No Sales call. Provisioned in under a minute.