
Agentic Auto-Deflake Pull Request for Quarantined Tests

For a quarantined test, an agent reads the test and its recent failure logs, drafts a stabilization fix, and opens a draft GitHub pull request linked to the deflake issue.

Category: Engineering
Engine: paperclip
Difficulty: advanced
Trigger: event
Steps: 6
Setup: ~25 min

How it runs

The automated pipeline, from trigger to output.

  • Trigger: Linear issue moved to 'Deflake Ready' (Linear)
  • Action: Fetch test source and recent failure logs (GitHub)
  • Action: Diagnose flake cause and draft fix (agent)
  • Logic: Validate diff scope is test-only
  • Action: Open draft pull request with diagnosis (GitHub)
  • Output: Update Linear issue with PR link (Linear)

What it does

It takes a quarantined flaky test and attempts a first-pass fix automatically. An agent pulls the test source from GitHub and the test's recent failing-run logs from the BigQuery ledger, diagnoses the likely flake cause (commonly timing, shared state, test ordering, or network), and drafts a candidate change. It then opens a draft pull request that references the tracking deflake issue, ready for an engineer to review.
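To make the four flake categories concrete, here is a minimal keyword heuristic in Python that ranks them by how often they show up in failure logs. It is only an illustration of the categories, not the agent's actual diagnosis logic: the patterns, the `FLAKE_PATTERNS` table, and the `rank_flake_causes` helper are all hypothetical and would need tuning against your own logs.

```python
import re
from collections import Counter

# Hypothetical keyword patterns, one per flake category named above.
# These are illustrative assumptions, not patterns used by the workflow.
FLAKE_PATTERNS = {
    "timing": re.compile(r"timeout|timed out|deadline exceeded", re.I),
    "shared state": re.compile(r"already exists|duplicate key|leaked", re.I),
    "ordering": re.compile(r"order[- ]dependent|random seed", re.I),
    "network": re.compile(r"connection (refused|reset)|econnreset", re.I),
}


def rank_flake_causes(log_lines):
    """Return flake categories ordered by number of matching log lines."""
    hits = Counter()
    for line in log_lines:
        for cause, pattern in FLAKE_PATTERNS.items():
            if pattern.search(line):
                hits[cause] += 1
    # most_common() sorts by count, highest first; unmatched causes are omitted.
    return [cause for cause, _ in hits.most_common()]


logs = [
    "TimeoutError: timed out after 5s waiting for element",
    "retry 2: timed out",
    "ConnectionError: connection reset by peer",
]
print(rank_flake_causes(logs))  # → ['timing', 'network']
```

A ranking like this could be passed to the agent as a hint alongside the raw logs, so the prompt starts from the most likely category rather than from nothing.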

When to use it

Use it to get a head start on the deflake backlog: the agent does the tedious diagnosis and first edit, so engineers review a proposal instead of starting cold. It works best for suites with high flake volume and recurring failure patterns.

How it works

  1. A Linear issue moved to 'Deflake Ready' triggers the flow.
  2. The flow fetches the test file from GitHub and recent failure logs from the BigQuery ledger.
  3. An agent diagnoses the likely flake cause and drafts a targeted code change.
  4. Logic validates the diff touches only the test or its fixtures before proceeding.
  5. A draft GitHub pull request is opened with the change and a diagnosis summary.
  6. The Linear issue is updated with the PR link and moved to 'In Review'.
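Steps 4 and 5 above can be sketched in Python as a scope gate followed by a draft pull request payload. This is a rough sketch under stated assumptions: the path globs in `ALLOWED_GLOBS`, the helper names, the `ENG-123` style issue identifier, and the branch names are all hypothetical. The payload fields (`title`, `head`, `base`, `body`, `draft`) follow GitHub's REST endpoint for creating a pull request; no network call is made here.

```python
import fnmatch
import re

# Hypothetical repository conventions for "test or fixture" paths.
ALLOWED_GLOBS = ("tests/*", "*/tests/*", "fixtures/*", "*/fixtures/*", "*_test.py")

# Matches the per-file header lines of a git unified diff.
DIFF_HEADER = re.compile(r"^diff --git a/(.+) b/(.+)$", re.M)


def changed_paths(unified_diff):
    """Collect every old and new path named in the diff headers."""
    paths = set()
    for old, new in DIFF_HEADER.findall(unified_diff):
        paths.update((old, new))
    return paths


def is_test_only(unified_diff):
    """True only if the diff is non-empty and every path matches an allowed glob."""
    paths = changed_paths(unified_diff)
    return bool(paths) and all(
        any(fnmatch.fnmatchcase(path, glob) for glob in ALLOWED_GLOBS)
        for path in paths
    )


def draft_pr_payload(issue_id, test_name, diagnosis, head_branch, base_branch="main"):
    """Build the JSON body for a draft pull request (sent by the caller)."""
    return {
        "title": f"Deflake {test_name} ({issue_id})",
        "head": head_branch,
        "base": base_branch,
        "body": f"Automated first-pass deflake for {issue_id}.\n\nDiagnosis:\n{diagnosis}",
        "draft": True,  # keeps the PR out of the normal review queue until a human promotes it
    }


diff = (
    "diff --git a/tests/test_login.py b/tests/test_login.py\n"
    "--- a/tests/test_login.py\n"
    "+++ b/tests/test_login.py\n"
    "@@ -1 +1 @@\n-time.sleep(1)\n+wait_for(ready)\n"
)
if is_test_only(diff):
    payload = draft_pr_payload(
        "ENG-123", "test_login", "Fixed sleep replaced by explicit wait.", "deflake/eng-123"
    )
```

Rejecting an empty diff and checking both the old and new path of each file means a rename from a test directory into production code also fails the gate. That is a deliberately conservative choice for an automated change.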

Set it up

What you configure once, before turning it on.

  1. Connect Linear: issues, projects, cycles, triage.
  2. Connect GitHub: repos, issues, pull requests, actions.
  3. Connect BigQuery: datasets, queries, schemas.
  4. Set each agent's model: we leave models unset so you pick the tier (fast and cheap, or top-quality).
  5. Tune it to your data: edit the prompts, filters, and field mappings so it matches how your team works.
  6. Test, then turn it on: run once against a sample, confirm the output, then enable the trigger.
