ENGINEERING

Flaky-Test Triage Agent with Root-Cause Draft

An agent inspects a flaky test's recent traces, drafts a likely root-cause hypothesis (timing, ordering, network), and posts it on the quarantine issue with a suggested fix.

Category: Engineering
Engine: paperclip
Difficulty: advanced
Trigger: event
Steps: 6
Setup: ~25 min

How it runs

The automated pipeline, trigger to output.

Trigger (GitLab): Test quarantined, needs-triage (GitLab webhook)
Action (GitLab): Fetch recent traces and test source
Action (OpenAI): Classify flakiness pattern with LLM
Logic: Route by confidence (auto vs human)
Action (Linear): Post root-cause hypothesis on Linear issue
Output (Slack): Notify assignee in Slack

What it does

This agent-driven workflow goes beyond detection: it reads the failing job traces for a quarantined test, classifies the likely cause (race condition, test ordering, network timeout, shared fixture), and writes a structured root-cause hypothesis with a suggested fix directly onto the Linear quarantine issue.
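A structured hypothesis like the one described above could be modeled roughly as follows. This is a minimal sketch, not the workflow's actual schema: the names `FlakinessPattern`, `RootCauseHypothesis`, and `render_comment`, the field layout, and the 0.0 to 1.0 confidence scale are all assumptions made for illustration. Only the four cause categories come from the description.

```python
from dataclasses import dataclass, field
from enum import Enum


class FlakinessPattern(Enum):
    """The four cause categories named in the workflow description."""
    RACE_CONDITION = "race condition"
    TEST_ORDERING = "test ordering"
    NETWORK_TIMEOUT = "network timeout"
    SHARED_FIXTURE = "shared fixture"


@dataclass
class RootCauseHypothesis:
    """Hypothetical shape for the agent's structured output."""
    test_name: str
    pattern: FlakinessPattern
    confidence: float  # assumed scale: 0.0 (no idea) to 1.0 (certain)
    summary: str
    suggested_fix: str
    suspect_lines: list[str] = field(default_factory=list)


def render_comment(h: RootCauseHypothesis) -> str:
    """Format the hypothesis as plain text suitable for an issue comment."""
    lines = [
        f"Root-cause hypothesis for {h.test_name}",
        f"Pattern: {h.pattern.value}",
        f"Confidence: {h.confidence:.0%}",
        f"Summary: {h.summary}",
    ]
    if h.suspect_lines:
        lines.append("Suspect lines:")
        lines.extend(f"- {s}" for s in h.suspect_lines)
    lines.append(f"Suggested fix: {h.suggested_fix}")
    return "\n".join(lines)
```

Keeping the pattern as an enum rather than free text makes it easier to filter or count quarantine issues by cause later on.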

When to use it

Use it to shorten the time a test sits in quarantine. Instead of starting from a blank trace, an engineer opens the issue and finds a reasoned first guess and a pointer to the suspicious code.

How it works

1. A GitLab webhook fires when a test is newly quarantined and labeled `needs-triage`.
2. The flow fetches the last several failing traces and the test source via the GitLab API.
3. The agent analyzes the traces with an LLM to classify the flakiness pattern and pinpoint suspect lines.
4. A logic step routes low-confidence results to a human and high-confidence ones to auto-comment.
5. It posts the root-cause hypothesis and suggested fix as a comment on the Linear issue.
6. It notifies the assignee in Slack that triage notes are ready.
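The routing step in the list above can be sketched as a simple threshold check. This is an illustration under stated assumptions, not the workflow's actual logic: the `0.75` cutoff, the `Route` enum, and the `route_by_confidence` function are hypothetical, and the confidence value is assumed to be a number between 0.0 and 1.0 produced by the classification step.

```python
from enum import Enum

# Hypothetical cutoff; the source does not state one. Tune it to your data.
AUTO_COMMENT_THRESHOLD = 0.75


class Route(Enum):
    AUTO_COMMENT = "auto_comment"  # post the hypothesis without review
    HUMAN_REVIEW = "human_review"  # hand the draft to a person first


def route_by_confidence(
    confidence: float, threshold: float = AUTO_COMMENT_THRESHOLD
) -> Route:
    """Send high-confidence results to auto-comment, the rest to a human."""
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence must be in [0, 1], got {confidence}")
    if confidence >= threshold:
        return Route.AUTO_COMMENT
    return Route.HUMAN_REVIEW
```

For example, `route_by_confidence(0.9)` yields `Route.AUTO_COMMENT` and `route_by_confidence(0.4)` yields `Route.HUMAN_REVIEW`. Rejecting out-of-range values keeps a malformed classifier response from being silently auto-posted.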

Set it up

What you configure once, before turning it on.

1. Connect GitLab: repos, MRs, pipelines, registry.
2. Connect OpenAI: models, embeddings, files.
3. Connect Linear: issues, projects, cycles, triage.
4. Connect Slack: channels, DMs, threads, mentions.
5. Set each agent's model. We leave models unset so you can pick the tier, whether fast and cheap or top-quality.
6. Tune it to your data. Edit the prompts, filters, and field mappings so the workflow matches how your team works.
7. Test, then turn it on. Run once against a sample, confirm the output, then enable the trigger.
