Flake vs. Real-Error Correlator: Cross-Check CI Failures Against Sentry

When a GitLab pipeline fails, this workflow checks Sentry for matching production errors to decide whether the failure reflects a real bug; if not, it treats the failure as flaky and opens a quarantine MR…

Category: Engineering
Engine: sim
Difficulty: advanced
Trigger: webhook
Steps: 6
Setup: ~25 min

How it runs

The automated pipeline, trigger to output.

  • Trigger: GitLab pipeline failure webhook (GitLab)
  • Action: Extract failing test error signature from logs (GitLab)
  • Action: Search Sentry for matching production errors (Sentry)
  • Logic: Branch on correlated prod error vs. clean
  • Action: Escalate as regression issue if correlated (GitLab)
  • Output: Open quarantine MR + Linear flake ticket if clean (Linear)
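As an illustration of the trigger step, here is a minimal Python sketch of filtering an incoming GitLab pipeline event down to its failed jobs. The field names (`object_kind`, `object_attributes.status`, `builds`) follow GitLab's pipeline webhook payload; the helper names and the sample event are hypothetical and are not taken from this workflow's implementation.

```python
from typing import Any


def is_failed_pipeline(payload: dict[str, Any]) -> bool:
    """True only for pipeline events whose overall status is 'failed'."""
    if payload.get("object_kind") != "pipeline":
        return False
    attrs = payload.get("object_attributes") or {}
    return attrs.get("status") == "failed"


def failed_jobs(payload: dict[str, Any]) -> list[dict[str, Any]]:
    """Return id, name and stage for each failed job listed in the event."""
    return [
        {"id": b.get("id"), "name": b.get("name"), "stage": b.get("stage")}
        for b in payload.get("builds", [])
        if b.get("status") == "failed"
    ]


# Hypothetical, trimmed-down event used only for illustration.
sample_event = {
    "object_kind": "pipeline",
    "object_attributes": {"id": 31, "ref": "main", "status": "failed"},
    "builds": [
        {"id": 380, "name": "unit-tests", "stage": "test", "status": "failed"},
        {"id": 381, "name": "lint", "stage": "test", "status": "success"},
    ],
}

if is_failed_pipeline(sample_event):
    print(failed_jobs(sample_event))
```

Only the failed jobs need their logs fetched for signature extraction, so filtering early keeps the later steps cheap.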

What it does

This agent avoids quarantining tests that are actually catching real bugs. On a GitLab pipeline failure, it extracts the error signature and queries Sentry for matching production events. If the same failure is hurting users in prod, it escalates instead of skipping the test; if Sentry is clean, it treats the failure as flaky and quarantines the test.
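The escalate-or-quarantine rule can be sketched as a small pure function. The `SentryMatch` shape, the `min_events` threshold, and the `"production"` environment name are assumptions made for illustration; the actual workflow may weigh matches differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SentryMatch:
    """One Sentry issue resembling the CI failure (hypothetical shape)."""
    issue_id: str
    event_count: int
    environment: str


def decide(matches: list[SentryMatch], min_events: int = 1) -> str:
    """Return 'escalate' if production shows the same error, else 'quarantine'."""
    prod_hits = [
        m for m in matches
        if m.environment == "production" and m.event_count >= min_events
    ]
    return "escalate" if prod_hits else "quarantine"


print(decide([]))                                     # no matches at all
print(decide([SentryMatch("1", 12, "production")]))   # same error seen in prod
print(decide([SentryMatch("2", 40, "staging")]))      # only non-prod noise
```

Keeping the decision free of I/O makes the safety gate easy to unit-test before the trigger is enabled.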

When to use it

Use it when your test failures sometimes mirror genuine production incidents and a naive skip would hide an active outage. The Sentry cross-check is the safety gate before any quarantine.

How it works

  1. A GitLab pipeline-failure webhook fires the trigger.
  2. The flow extracts the failing test's error signature from the job log.
  3. A Sentry query searches recent production events for a matching error fingerprint.
  4. A logic branch splits on whether a correlated prod error exists.
  5. If correlated, it opens a GitLab issue flagged as a real regression for immediate triage.
  6. If clean, it opens a GitLab quarantine MR and a Linear flake ticket noting the absence of prod impact.
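Steps 2 and 3 might look roughly like the following Python sketch. It assumes pytest-style log lines, a regex-based signature, and a lookup through Sentry's project issues endpoint using the `error.type` search key. The helper names and the `acme`/`web` slugs are hypothetical, and the sketch only builds the request URL rather than calling Sentry.

```python
import re
from typing import Optional
from urllib.parse import urlencode

# Matches lines such as "E   ValueError: bad input" or "app.TimeoutError: ...".
_ERROR_LINE = re.compile(r"^\s*(?:E\s+)?([A-Za-z_][\w.]*(?:Error|Exception)): (.+)$")


def extract_signature(job_log: str) -> Optional[str]:
    """Return 'ExceptionType: normalized message' for the last error line found."""
    signature = None
    for line in job_log.splitlines():
        m = _ERROR_LINE.match(line)
        if m:
            exc_type, message = m.groups()
            # Mask hex addresses and numbers so run-specific values do not matter.
            message = re.sub(r"0x[0-9a-fA-F]+|\d+", "<n>", message).strip()
            signature = f"{exc_type}: {message}"
    return signature


def sentry_search_url(org: str, project: str, signature: str, period: str = "24h") -> str:
    """Build a project issue search URL filtered by the exception type."""
    exc_type = signature.split(":", 1)[0]
    params = {"query": f"is:unresolved error.type:{exc_type}", "statsPeriod": period}
    return f"https://sentry.io/api/0/projects/{org}/{project}/issues/?{urlencode(params)}"


# Hypothetical job log excerpt used only for illustration.
log = (
    "tests/test_checkout.py::test_total FAILED\n"
    "E   ValueError: invalid literal for int() with base 10: 'abc'\n"
)
sig = extract_signature(log)
if sig is not None:
    print(sig)
    print(sentry_search_url("acme", "web", sig))
```

Matching on exception type alone is deliberately loose; a real setup would likely compare the normalized message or stack frames as well before treating a Sentry issue as correlated.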

Set it up

What you configure once, before turning it on.

  1. Connect GitLab: Repos, MRs, pipelines, registry.
  2. Connect Sentry: Errors, performance, releases.
  3. Connect Linear: Issues, projects, cycles, triage.
  4. Set each agent's model: We leave models unset so you pick the tier, either fast and cheap or top-quality.
  5. Tune it to your data: Edit the prompts, filters, and field mappings so it matches how your team works.
  6. Test, then turn it on: Run once against a sample, confirm the output, then enable the trigger.
