Tier-1 Pipeline Freshness PagerDuty Escalation

Watches only your business-critical tables and, when one breaches SLA by more than a grace threshold, opens a PagerDuty incident so on-call is woken instead of a Slack message…

Category: Data Ops
Engine: sim
Difficulty: intermediate
Trigger: schedule
Steps: 5
Setup: ~15 min

How it runs

The automated pipeline, trigger to output.

  • Trigger: Schedule, every 5 minutes
  • Action: Query last-load times for tier-1 tables (Snowflake)
  • Logic: Keep breaches past SLA + grace threshold
  • Logic: Dedupe against open incidents
  • Output: Open or update PagerDuty incident (PagerDuty)
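
The query step could look something like the sketch below. It is an assumption, not the workflow's actual query: it reads `LAST_ALTERED` from Snowflake's `INFORMATION_SCHEMA.TABLES` view, which also changes on DDL, so a dedicated load-audit table may be a better source in practice. The function name `last_load_query` is hypothetical. It only builds the SQL and bind parameters; they could then be passed to a Snowflake connector cursor's `execute(sql, params)`, which uses `%s` placeholders by default.

```python
def last_load_query(tables):
    """Build a parameterized query for the last-altered time of each tier-1 table.

    `tables` is a list of (schema, table) pairs. Returns (sql, params).
    Assumption: LAST_ALTERED is an acceptable proxy for "last load time".
    """
    if not tables:
        raise ValueError("at least one tier-1 table is required")

    # One "(schema AND table)" condition per tier-1 table, joined with OR.
    clause = " OR ".join(
        "(TABLE_SCHEMA = %s AND TABLE_NAME = %s)" for _ in tables
    )
    sql = (
        "SELECT TABLE_SCHEMA, TABLE_NAME, LAST_ALTERED "
        "FROM INFORMATION_SCHEMA.TABLES "
        f"WHERE {clause}"
    )
    # Flatten [(schema, table), ...] into [schema, table, schema, table, ...].
    params = [part for pair in tables for part in pair]
    return sql, params
```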

What it does

This is the high-severity sibling of a Slack watchdog. It monitors a short list of tier-1 tables that feed billing, exec reporting, or customer-facing data. A breach only escalates if the table is past SLA by more than a configured grace period, which prevents flapping. When it does fire, it opens a PagerDuty incident with the table, lag, and downstream blast radius.
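
As a rough illustration of the grace-period rule, the sketch below keeps a table only when its lag exceeds its SLA by more than the grace period. The names `TableStatus` and `breaches`, and the per-table `sla` field, are assumptions made for this example rather than part of the workflow.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class TableStatus:
    name: str
    last_loaded_at: datetime  # timezone-aware timestamp of the last load
    sla: timedelta            # maximum acceptable age of the data


def breaches(statuses, grace, now):
    """Return (table name, time past SLA) for tables past SLA by more than `grace`."""
    result = []
    for status in statuses:
        lag = now - status.last_loaded_at
        overdue = lag - status.sla  # negative or zero means still within SLA
        if overdue > grace:
            result.append((status.name, overdue))
    return result


# Hypothetical example: one table 30 minutes past SLA, one only 5 minutes past.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
statuses = [
    TableStatus("billing.invoices", now - timedelta(minutes=60), timedelta(minutes=30)),
    TableStatus("exec.kpis", now - timedelta(minutes=35), timedelta(minutes=30)),
]
paging = breaches(statuses, grace=timedelta(minutes=10), now=now)
```

With a 10-minute grace period, only the first table is kept: the second is past SLA, but by less than the grace period, so it would not page.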

When to use it

Use it for the handful of tables where a missed load is a genuine incident, not a chore. Pair it with the broader Slack watchdog so noisy tables stay in Slack and only the critical ones page.

How it works

  1. A schedule runs frequently (for example every 5 minutes).
  2. It queries Snowflake for last-load times of the tier-1 table set.
  3. A logic gate keeps only tables past SLA beyond the grace threshold.
  4. It checks for an existing open incident to avoid duplicates.
  5. It opens or updates a PagerDuty incident with lag and blast radius.
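
One way to get the "open or update" behaviour in steps 4 and 5 is a stable deduplication key per table. The sketch below builds a trigger event in the shape of the PagerDuty Events API v2, where repeated triggers with the same `dedup_key` are grouped into the existing open alert instead of creating a new one. Whether this workflow uses that mechanism is an assumption; the function name `build_trigger_event`, the key format, and the `custom_details` field names are hypothetical.

```python
import json


def build_trigger_event(routing_key, table, lag_minutes, downstream):
    """Build an Events API v2 trigger event for one stale tier-1 table.

    The dedup_key is derived from the table name only, so every run that
    finds the same table stale maps onto the same open incident.
    """
    return {
        "routing_key": routing_key,      # integration key for the PagerDuty service
        "event_action": "trigger",
        "dedup_key": f"tier1-freshness:{table}",
        "payload": {
            "summary": f"{table} is {lag_minutes} min past SLA",
            "source": "snowflake",
            "severity": "critical",
            "custom_details": {
                "lag_minutes": lag_minutes,
                "downstream": downstream,  # blast radius: dependent assets
            },
        },
    }


# Placeholder routing key and hypothetical table and downstream names.
event = build_trigger_event(
    "ROUTING_KEY_PLACEHOLDER", "billing.invoices", 42, ["invoice_export"]
)
body = json.dumps(event)  # request body for POST https://events.pagerduty.com/v2/enqueue
```

Keying on the table name alone keeps the incident stable while the lag keeps growing; putting the lag into the key would open a new incident on every run.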

Set it up

What you configure once, before turning it on.

  1. Connect Snowflake: warehouses, queries, shares.
  2. Connect PagerDuty: incidents, on-call, escalations.
  3. Set each agent's model: we leave models unset so you pick the tier, either fast and cheap or top-quality.
  4. Tune it to your data: edit the prompts, filters, and field mappings so it matches how your team works.
  5. Test, then turn it on: run once against a sample, confirm the output, then enable the trigger.
