GitLab CI Severe Slowdown PagerDuty Escalation

Fires when a finished GitLab pipeline blows past a hard duration ceiling, confirms the slowdown is a sustained regression rather than a one-off, and pages the platform on-call through PagerDuty.

Category: Engineering
Engine: sim
Difficulty: intermediate
Trigger: webhook
Steps: 5
Setup: ~15 min

How it runs

The automated pipeline, trigger to output.

  • Trigger: GitLab pipeline finished (GitLab)
  • Logic: Check duration against hard ceiling; stop if under
  • Action: Fetch recent runs to confirm sustained regression (GitLab)
  • Action: Bisect worst stage from job timings (GitLab)
  • Output: Trigger PagerDuty incident with stage detail (PagerDuty)

What it does

This is the urgent-path counterpart to trend watching. When a pipeline exceeds an absolute duration ceiling, it verifies the slowdown is sustained across the last few runs (so a single flaky run does not page anyone), identifies the worst stage, and opens a PagerDuty incident routed to the platform on-call.
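
The ceiling check and the confirmation step can be sketched as two small functions. This is a minimal illustration, not the workflow's actual implementation: the function names, the 1800-second ceiling, and the rule that at least 3 of the previous runs must also breach the ceiling are all assumptions chosen for the example.

```python
def is_sustained_regression(recent_durations_s, ceiling_s, min_breaches=3):
    """Return True when enough recent runs also exceeded the ceiling.

    recent_durations_s: durations (seconds) of the previous few pipeline runs,
    not including the run that just finished. min_breaches is an assumed
    threshold; tune it to how noisy your pipelines are.
    """
    breaches = sum(1 for d in recent_durations_s if d > ceiling_s)
    return breaches >= min_breaches


def should_page(current_duration_s, recent_durations_s,
                ceiling_s=1800.0, min_breaches=3):
    """Decide whether the finished pipeline warrants a page."""
    # Runs at or under the ceiling stop here, before any further lookups.
    if current_duration_s <= ceiling_s:
        return False
    # Over the ceiling: page only if the slowdown repeats in recent history.
    return is_sustained_regression(recent_durations_s, ceiling_s, min_breaches)
```

With these assumed defaults, a 2000-second run following three other breaches in the last five runs would page, while the same run following a single earlier breach would not.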

When to use it

Use it when slow CI directly blocks releases and a multi-minute regression is an operational emergency, not a backlog item. The confirmation check guards the page, so on-call is woken only for real, repeating regressions.

How it works

  1. A GitLab pipeline webhook fires on completion.
  2. A logic step checks the total duration against the hard ceiling; runs under it stop.
  3. The flow fetches the last few runs to confirm the slowdown persists rather than being a single spike.
  4. It bisects the per-job timings to the stage carrying the largest share of the overage.
  5. A PagerDuty incident is triggered with the stage, duration, and pipeline link in the payload.
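
Steps 4 and 5 might look roughly like the sketch below. It assumes each job record carries `stage` and `duration` fields (as GitLab's job objects do) and that you keep a per-stage baseline to measure the overage against. The `baseline_by_stage` mapping, the `dedup_key` format, the `source` value, and the fixed `critical` severity are illustrative assumptions. The event dictionary follows the general shape of a PagerDuty Events API v2 trigger event; actually sending it is left out.

```python
from collections import defaultdict


def worst_stage(jobs, baseline_by_stage):
    """Return (stage, overage_seconds) for the stage with the largest overage.

    jobs: list of dicts with "stage" and "duration" (seconds, may be None).
    baseline_by_stage: assumed mapping of stage name to its usual duration.
    Returns None when there are no jobs to inspect.
    """
    totals = defaultdict(float)
    for job in jobs:
        # Jobs that never ran report no duration; count them as zero.
        totals[job["stage"]] += job.get("duration") or 0.0
    if not totals:
        return None
    overage = {
        stage: total - baseline_by_stage.get(stage, 0.0)
        for stage, total in totals.items()
    }
    stage = max(overage, key=overage.get)
    return stage, overage[stage]


def build_pagerduty_event(routing_key, pipeline_id, pipeline_url,
                          duration_s, stage, overage_s):
    """Build a trigger event carrying the stage, duration, and pipeline link."""
    return {
        "routing_key": routing_key,
        "event_action": "trigger",
        # Assumed dedup scheme: one open incident per slow stage.
        "dedup_key": f"ci-slowdown-{stage}",
        "payload": {
            "summary": (
                f"CI pipeline {pipeline_id} took {duration_s:.0f}s; "
                f"worst stage: {stage} (+{overage_s:.0f}s)"
            ),
            "source": "gitlab-ci",
            "severity": "critical",
            "custom_details": {
                "worst_stage": stage,
                "stage_overage_s": overage_s,
                "pipeline_duration_s": duration_s,
            },
        },
        "links": [{"href": pipeline_url, "text": "Pipeline"}],
    }
```

Summing job durations per stage overstates wall-clock time when jobs in a stage run in parallel; taking the longest job per stage is a reasonable alternative if that matches your pipelines better.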

Set it up

What you configure once, before turning it on.

  1. Connect GitLab: repos, MRs, pipelines, registry.
  2. Connect PagerDuty: incidents, on-call, escalations.
  3. Set each agent's model. We leave models unset so you pick the tier: fast and cheap, or top quality.
  4. Tune it to your data. Edit the prompts, filters, and field mappings so it matches how your team works.
  5. Test, then turn it on. Run once against a sample, confirm the output, then enable the trigger.
