
Correlate Stale dbt Models with Honeycomb Upstream Lag

When a BigQuery model misses its freshness SLA, this workflow queries Honeycomb for upstream ingestion and pipeline latency in the same window and posts a root-cause-attributed alert to Slack.

Category: Data Ops
Engine: sim
Difficulty: intermediate
Trigger: schedule
Steps: 6
Setup: ~15 min

How it runs

The automated pipeline, trigger to output.

  • Trigger: Every 10 minutes (schedule)
  • Action: Get freshness age of watched models (Google BigQuery)
  • Logic: Keep only SLA breaches
  • Action: Query Honeycomb upstream latency for breach window (Honeycomb)
  • Logic: Classify breach as upstream-explained or unexplained
  • Output: Post root-cause-attributed alert to Slack (Slack)
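
The "keep only SLA breaches" step can be sketched in Python as below. This is a minimal illustration, not the workflow's actual implementation: the Freshness record, its field names, and the find_breaches helper are all hypothetical, and it assumes the BigQuery step has already returned one last-loaded timestamp per watched model.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Freshness:
    """One watched model as returned by the BigQuery step (hypothetical shape)."""
    model: str                # dbt model name
    last_loaded_at: datetime  # newest load timestamp seen in BigQuery
    sla: timedelta            # maximum allowed age for this model


def find_breaches(rows, now):
    """Return (row, age) pairs for models older than their SLA, stalest first."""
    breaches = [
        (row, now - row.last_loaded_at)
        for row in rows
        if now - row.last_loaded_at > row.sla
    ]
    return sorted(breaches, key=lambda pair: pair[1], reverse=True)
```

Models within their SLA drop out here, so the Honeycomb step runs only for real breaches.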

What it does

Turns a bare "model is stale" alert into a diagnosed one. When a watched BigQuery dbt model breaches its freshness SLA, the flow pulls Honeycomb traces for the upstream ingestion service over the matching time window and decides whether the staleness is explained by measured upstream lag or is unexplained (likely a dbt job failure).
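
As a rough sketch of the "matching time window", the function below builds a Honeycomb query body that spans from the model's last load to the moment the breach was detected. The field names (start_time, end_time, calculations, breakdowns) follow Honeycomb's query specification as commonly documented, so check them against the current API reference. The column names duration_ms and error, and the build_latency_query helper itself, are assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone


def build_latency_query(detected_at, age, latency_column="duration_ms"):
    """Build a query body covering the period in which the model went stale.

    detected_at: when the breach was observed (timezone-aware datetime).
    age: how stale the model is, so the window starts at its last load.
    latency_column: assumed name of the latency field in the upstream dataset.
    """
    window_start = detected_at - age
    return {
        "start_time": int(window_start.timestamp()),  # UNIX seconds
        "end_time": int(detected_at.timestamp()),
        "calculations": [
            {"op": "P95", "column": latency_column},
            {"op": "COUNT"},
        ],
        # Assumed boolean column; splitting counts by it yields an error rate.
        "breakdowns": ["error"],
    }
```

Tying the window to the model's own staleness keeps the Honeycomb evidence scoped to the period that could have caused the breach.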

When to use it

Use this when stale models are usually caused by a slow or backed-up ingestion pipeline, and you want every freshness alert to arrive with the upstream evidence already attached so on-call doesn't burn time guessing.

How it works

  1. A schedule fires every 10 minutes.
  2. BigQuery returns the freshness age of each watched model; the flow keeps only SLA breaches.
  3. For each breach, it queries Honeycomb for p95 ingestion latency and error rate on the upstream dataset during the lag window.
  4. A logic step labels each breach 'upstream-lag-explained' or 'unexplained-investigate'.
  5. It posts an enriched Slack alert with the model, age, and the Honeycomb verdict plus a deep link.
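
The labelling and alert steps above could look roughly like this. The page does not say how the verdict is decided, so both thresholds are hypothetical placeholders to tune, and classify and format_alert are illustrative helpers. The returned dictionary uses the plain "text" field that Slack incoming webhooks accept.

```python
from datetime import timedelta

# Hypothetical thresholds; the workflow's real decision rule is not documented.
P95_LATENCY_LIMIT_MS = 5_000
ERROR_RATE_LIMIT = 0.05


def classify(p95_latency_ms, error_rate):
    """Label a breach using the upstream numbers measured in Honeycomb."""
    if p95_latency_ms > P95_LATENCY_LIMIT_MS or error_rate > ERROR_RATE_LIMIT:
        return "upstream-lag-explained"
    return "unexplained-investigate"


def format_alert(model, age, verdict, honeycomb_url):
    """Build a Slack webhook payload with the model, its age, and the verdict."""
    minutes = int(age.total_seconds() // 60)
    text = (
        f"Stale model {model}: last loaded {minutes} min ago. "
        f"Verdict: {verdict}. Honeycomb query: {honeycomb_url}"
    )
    return {"text": text}
```

A breach with normal upstream latency and error rate falls through to 'unexplained-investigate', which points on-call at the dbt job rather than the ingestion pipeline.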

Set it up

What you configure once, before turning it on.

  1. Connect BigQuery: datasets, queries, schemas.
  2. Connect Honeycomb: distributed traces and queries.
  3. Connect Slack: channels, DMs, threads, mentions.
  4. Set each agent's model: we leave models unset so you pick the tier, either fast and cheap or top quality.
  5. Tune it to your data: edit the prompts, filters, and field mappings so it matches how your team works.
  6. Test, then turn it on: run once against a sample, confirm the output, then enable the trigger.
