DATA OPS

Critical Source Freshness Breach to PagerDuty

Checks tier-1 BigQuery source tables every 15 minutes; if a revenue- or billing-critical source has stopped loading, it pages the on-call data engineer through PagerDuty…

Category: Data Ops
Engine: sim
Difficulty: advanced
Trigger: schedule
Steps: 6
Setup: ~25 min

How it runs

The automated pipeline, trigger to output.

  • Trigger: 15-minute critical-source check
  • Action: Query tier-1 source load times (Google BigQuery)
  • Logic: Branch on SLA breach
  • Action: Build downstream impact lineage (Google BigQuery)
  • Action: Raise PagerDuty incident (PagerDuty)
  • Output: Post incident + impact to Slack (Slack)

What it does

Monitors only the source tables you tag as tier-1 (billing, revenue, identity) for ingestion gaps. When one of these critical sources goes longer than its tight freshness threshold without receiving new rows, it raises a PagerDuty incident so someone is paged, rather than filing a ticket that waits until morning.
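The freshness test described above can be sketched in a few lines of Python. This is a minimal illustration, not the workflow's actual implementation: the source names, the threshold values, and the `breached_sources` helper are all hypothetical, and the last-load timestamps are hard-coded where the real workflow would read them from BigQuery.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical per-source freshness thresholds (names and values are illustrative).
THRESHOLDS = {
    "billing.invoices": timedelta(minutes=30),
    "identity.users": timedelta(hours=1),
}


def breached_sources(last_loaded, thresholds, now):
    """Return {source: lag} for every source whose lag exceeds its threshold.

    A source with no recorded load time is treated as breached (lag is None).
    """
    breaches = {}
    for source, threshold in thresholds.items():
        loaded_at = last_loaded.get(source)
        lag = None if loaded_at is None else now - loaded_at
        if lag is None or lag > threshold:
            breaches[source] = lag
    return breaches


# Fixed clock so the example is reproducible.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
last_loaded = {
    "billing.invoices": now - timedelta(minutes=45),  # past its 30-minute threshold
    "identity.users": now - timedelta(minutes=10),    # within its 1-hour threshold
}
result = breached_sources(last_loaded, THRESHOLDS, now)
```

Treating a missing timestamp as a breach is a deliberate choice in this sketch: for a tier-1 source, "never loaded" is at least as alarming as "loaded late".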

When to use it

Use it for the small set of sources where a stalled pipeline means broken invoices or wrong revenue numbers, and a delayed response is unacceptable. This is the loud, page-someone counterpart to the routine Linear-ticket sentinel.

How it works

  1. A 15-minute schedule triggers the check.
  2. It queries `INFORMATION_SCHEMA` for the max load timestamp of every source tagged tier-1 in the dbt manifest.
  3. A branch separates sources within SLA from those breached.
  4. For breached sources, it builds the downstream-impact lineage list of dependent models.
  5. It triggers a PagerDuty incident with severity scaled to how many downstream models are affected.
  6. It posts the incident link plus impacted-model list to the incident Slack channel.
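Steps 4 to 6 can be sketched as follows. This is an illustration under stated assumptions, not the workflow's own code: it assumes lineage comes from the `child_map` section of a dbt `manifest.json`, that the incident is raised as a PagerDuty Events API v2 trigger event, and that severity cut-offs of 3 and 10 downstream models are reasonable. The cut-offs, node ids, routing key, incident URL, and all function names are hypothetical. Nothing is sent over the network; the code only builds the payloads.

```python
from collections import deque


def downstream_models(child_map, source_id):
    """Breadth-first walk of a dbt-style child_map; returns sorted downstream model ids."""
    seen = set()
    queue = deque(child_map.get(source_id, []))
    while queue:
        node = queue.popleft()
        if node in seen:
            continue
        seen.add(node)
        queue.extend(child_map.get(node, []))
    # child_map also lists tests and other node types; keep models only.
    return sorted(n for n in seen if n.startswith("model."))


def severity_for(impacted_count):
    """Map impact size to an Events API v2 severity (hypothetical cut-offs)."""
    if impacted_count >= 10:
        return "critical"
    if impacted_count >= 3:
        return "error"
    return "warning"


def build_event(routing_key, source_id, impacted):
    """Build a PagerDuty Events API v2 trigger event as a plain dict."""
    return {
        "routing_key": routing_key,
        "event_action": "trigger",
        "dedup_key": f"freshness:{source_id}",  # one open incident per source
        "payload": {
            "summary": f"Tier-1 source stale: {source_id} ({len(impacted)} downstream models)",
            "source": source_id,
            "severity": severity_for(len(impacted)),
            "custom_details": {"impacted_models": impacted},
        },
    }


def slack_text(incident_url, source_id, impacted):
    """Format the Slack message: incident link plus impacted-model list."""
    lines = [f"Freshness breach on {source_id}: {incident_url}", "Impacted models:"]
    lines += [f"- {model}" for model in impacted]
    return "\n".join(lines)


# Illustrative lineage in the shape of manifest.json's child_map.
child_map = {
    "source.shop.billing.invoices": ["model.shop.stg_invoices"],
    "model.shop.stg_invoices": ["model.shop.fct_revenue", "model.shop.dim_customers"],
    "model.shop.fct_revenue": ["model.shop.rpt_mrr", "test.shop.not_null_fct_revenue_id"],
    "model.shop.dim_customers": ["model.shop.rpt_mrr"],
}
source_id = "source.shop.billing.invoices"
impacted = downstream_models(child_map, source_id)
event = build_event("ROUTING_KEY_PLACEHOLDER", source_id, impacted)
message = slack_text("https://example.invalid/incident", source_id, impacted)
```

Using the source id as the `dedup_key` means repeated 15-minute checks on the same stalled source update one incident instead of opening a new one each run.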

Set it up

What you configure once, before turning it on.

  1. Connect BigQuery: datasets, queries, schemas.
  2. Connect PagerDuty: incidents, on-call, escalations.
  3. Connect Slack: channels, DMs, threads, mentions.
  4. Set each agent's model: models are left unset so you pick the tier, either fast and cheap or top-quality.
  5. Tune it to your data: edit the prompts, filters, and field mappings so it matches how your team works.
  6. Test, then turn it on: run once against a sample, confirm the output, then enable the trigger.
