DEVOPS

Correlate Flaky Tests with Sentry Errors Before Quarantine

Before quarantining, this workflow checks whether the failing test maps to a known Sentry issue.

CategoryDevOps
Enginesim
Difficultyadvanced
Triggerwebhook
Steps5
Setup~25 min

How it runs

The automated pipeline, trigger to output.

  • TriggerGitHub reports a failed-then-passed testGitHubGitHub
  • ActionQuery Sentry for a matching stack signatureSentrySentry
  • LogicBranch: real bug vs. nondeterministic flake
  • ActionOpen PagerDuty incident for real-bug matchesPagerDutyPagerDuty
  • OutputQuarantine and file a ClickUp ticket for flakesClickUpClickUp

What it does

Not every flaky failure is harmless timing noise; some are real intermittent bugs surfacing in production too. This workflow distinguishes the two by cross-referencing the failing test against Sentry, so true defects get escalated instead of hidden behind a skip.

When to use it

Use it when you suspect some of your "flaky" tests are actually catching real race conditions or backend errors, and you don't want quarantine to mask a production-impacting bug.

How it works

  1. 1A GitHub Actions webhook reports a test that failed then passed on retry.
  2. 2The workflow queries Sentry for issues matching the failing test's stack signature in the same window.
  3. 3A logic branch splits on the result: a matching unresolved Sentry issue means it's a real bug, otherwise it's nondeterministic.
  4. 4Real-bug matches trigger a PagerDuty incident routed to the on-call owner with the Sentry link.
  5. 5Nondeterministic cases get quarantined for one cycle and tracked with a ClickUp ticket as the final delivery.

Set it up

What you configure once, before turning it on.

  1. 1
    Connect GitHubRepos, issues, pull requests, actions.
  2. 2
    Connect SentryErrors, performance, releases.
  3. 3
    Connect PagerDutyIncidents, on-call, escalations.
  4. 4
    Connect ClickUpDocs + tasks + chats in one workspace.
  5. 5
    Set each agent's modelWe leave models unset so you pick the tier — fast + cheap, or top-quality.
  6. 6
    Tune it to your dataEdit the prompts, filters, and field mappings so it matches how your team works.
  7. 7
    Test, then turn it onRun once against a sample, confirm the output, then enable the trigger.

Run this workflow in your colony.

14-day trial. No DevOps. No Sales call. Provisioned in under a minute.