ENGINEERING

Live p95 outlier pager with bisect-to-commit escalation

When Honeycomb fires a p95-breach trigger, this workflow correlates the spike to the most recent deploy and opens a PagerDuty incident only if the regression survives a confirmation re-check.

Category: Engineering
Engine: sim
Difficulty: advanced
Trigger: event
Steps: 5
Setup: ~25 min

How it runs

The automated pipeline, trigger to output.

  • Trigger: Honeycomb p95-breach trigger (Honeycomb)
  • Action: Re-query Honeycomb to confirm sustained regression (Honeycomb)
  • Logic: If cleared, exit; else correlate onset to deploy timeline
  • Action: Look up suspect commit from GitLab deploy timeline (GitLab)
  • Output: Open PagerDuty incident with suspect commit and link (PagerDuty)

What it does

This workflow listens for a Honeycomb trigger that fires when a query's p95 crosses an alerting threshold in real time. Rather than paging on the first blip, it re-queries Honeycomb a short time later to confirm the regression is sustained, then correlates the spike's onset timestamp against the GitLab deploy timeline to identify the release that most likely introduced it. If confirmed, it opens a PagerDuty incident annotated with the suspect commit and a deep link to the Honeycomb view.
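
One way to picture the onset-to-deploy correlation is as a search over a sorted deploy timeline. The sketch below is only an illustration under stated assumptions: the `Deploy` record, the `suspect_deploy` helper, and the sample commit SHAs are hypothetical, and the workflow's actual GitLab lookup may work differently.

```python
from bisect import bisect_right
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional, Sequence


@dataclass(frozen=True)
class Deploy:
    """One entry in the deploy timeline (hypothetical shape, not a GitLab type)."""
    finished_at: datetime
    commit_sha: str


def suspect_deploy(onset: datetime, deploys: Sequence[Deploy]) -> Optional[Deploy]:
    """Return the latest deploy that finished at or before the spike onset.

    `deploys` must be sorted by `finished_at` ascending.
    Returns None when no deploy precedes the onset.
    """
    times = [d.finished_at for d in deploys]
    idx = bisect_right(times, onset)  # count of deploys finished at or before onset
    return deploys[idx - 1] if idx > 0 else None


# Made-up timeline for illustration only.
timeline = [
    Deploy(datetime(2024, 1, 1, 9, 0, tzinfo=timezone.utc), "a1b2c3d"),
    Deploy(datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc), "e4f5a6b"),
    Deploy(datetime(2024, 1, 1, 15, 0, tzinfo=timezone.utc), "c7d8e9f"),
]
onset = datetime(2024, 1, 1, 12, 20, tzinfo=timezone.utc)
suspect = suspect_deploy(onset, timeline)
print(suspect.commit_sha if suspect else "no deploy before onset")  # e4f5a6b
```

Picking the latest deploy before the onset is a heuristic: it names the most likely release, not a proven cause, which is why the incident carries it as a "suspect" commit.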

When to use it

Reach for this when a slow-query regression is severe enough to warrant a real page, not just a ticket — checkout, auth, or any latency-critical path. The confirmation re-check suppresses transient noise so on-call only wakes for sustained regressions.
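
To make the noise-suppression idea concrete, here is a minimal sketch of what a confirmation re-check could look like. It assumes the re-query returns a handful of p95 samples in milliseconds; the `is_sustained` name, the `min_breach_fraction` parameter, and its 0.8 default are all hypothetical choices, not settings taken from the workflow.

```python
from typing import Sequence


def is_sustained(
    p95_samples_ms: Sequence[float],
    threshold_ms: float,
    min_breach_fraction: float = 0.8,  # assumed default, tune to your data
) -> bool:
    """True when enough re-check samples still exceed the threshold."""
    if not p95_samples_ms:
        # No data: treat as not confirmed rather than page blindly (a design choice).
        return False
    breaches = sum(1 for value in p95_samples_ms if value > threshold_ms)
    return breaches / len(p95_samples_ms) >= min_breach_fraction


# A single spike followed by recovery does not confirm.
print(is_sustained([950, 40, 35, 38, 41], threshold_ms=500))      # False
# Every sample still above the threshold does.
print(is_sustained([950, 910, 880, 930, 905], threshold_ms=500))  # True
```

Requiring a fraction of samples rather than all of them tolerates one good reading in the middle of a real regression; a stricter team might set the fraction to 1.0.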

How it works

  1. Honeycomb p95-breach trigger fires.
  2. Re-query Honeycomb after a delay to confirm the regression is sustained.
  3. If it cleared, exit; otherwise correlate onset time to the GitLab deploy timeline for the suspect commit.
  4. Open a PagerDuty incident with the suspect commit and Honeycomb link.
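
The control flow of the steps above can be sketched as a single function with the three integrations passed in as callables. This is an illustrative outline only: `run_pipeline`, the alert field names (`query`, `onset`, `link`), and the incident dictionary shape are assumptions, and the stand-in lambdas replace real Honeycomb, GitLab, and PagerDuty calls.

```python
from typing import Callable, Optional


def run_pipeline(
    alert: dict,
    requery_breached: Callable[[dict], bool],
    find_suspect_commit: Callable[[str], Optional[str]],
    open_incident: Callable[[dict], None],
) -> Optional[dict]:
    """Walk the steps; return the incident details, or None if the spike cleared."""
    # Step 2: confirm the regression is still present.
    if not requery_breached(alert):
        return None  # Step 3, first branch: it cleared, so exit without paging.
    # Step 3, second branch: map the onset time to a suspect commit.
    commit = find_suspect_commit(alert["onset"])
    incident = {
        "summary": f"p95 regression on {alert['query']}",
        "suspect_commit": commit or "unknown",
        "honeycomb_link": alert["link"],
    }
    # Step 4: hand the annotated incident to the pager.
    open_incident(incident)
    return incident


# Stand-ins for the real connectors; all values are placeholders.
opened = []
alert = {
    "query": "checkout",
    "onset": "2024-01-01T12:20:00Z",
    "link": "https://example.invalid/honeycomb-view",
}
result = run_pipeline(alert, lambda a: True, lambda onset: "e4f5a6b", opened.append)
print(result["suspect_commit"], len(opened))  # e4f5a6b 1
```

Injecting the connectors keeps the decision logic testable against a sample alert before the trigger is enabled.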

Set it up

What you configure once, before turning it on.

  1. Connect Honeycomb: Distributed traces and queries.
  2. Connect GitLab: Repos, MRs, pipelines, registry.
  3. Connect PagerDuty: Incidents, on-call, escalations.
  4. Set each agent's model: We leave models unset so you pick the tier (fast and cheap, or top-quality).
  5. Tune it to your data: Edit the prompts, filters, and field mappings so it matches how your team works.
  6. Test, then turn it on: Run once against a sample, confirm the output, then enable the trigger.
