BigQuery Runaway Scheduled-Query Cost Circuit Breaker

Every hour, it scans BigQuery job history for scheduled queries that billed far more bytes than their historical baseline, alerts the team, and can pause the offending transfer config.

Category: Data Ops
Engine: sim
Difficulty: advanced
Trigger: schedule
Steps: 6
Setup: ~25 min

How it runs

The automated pipeline, trigger to output.

  • Trigger: Hourly
  • Action: Read scheduled-query bytes billed (Google BigQuery)
  • Logic: Flag cost spikes vs baseline
  • Action: Disable runaway transfer config (Google BigQuery)
  • Output: Post cost alert to Slack (Slack)
  • Action: Page on overspend threshold (PagerDuty)

What it does

It detects scheduled queries whose bytes-billed suddenly spike well above their normal range, usually from an exploded join, a dropped partition filter, or a backfill gone wrong. It alerts on the anomaly and can pause the transfer config so the next scheduled run doesn't repeat the burn.
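The spike check can be sketched in a few lines of Python. This is a minimal illustration, not the workflow's actual logic step: the function name `flag_spikes`, the 5x multiple, the `min_runs` guard, and the sample query names and byte counts are all assumptions made up for the example.

```python
from statistics import median

GIB = 1024 ** 3


def flag_spikes(history, latest, multiple=5.0, min_runs=5):
    """Return (name, bytes_billed, baseline, ratio) for each runaway query.

    history: dict of query name -> list of past bytes-billed values
    latest:  dict of query name -> bytes billed by the most recent run
    """
    flagged = []
    for name, bytes_billed in latest.items():
        past = history.get(name, [])
        if len(past) < min_runs:
            continue  # too little history to trust a baseline
        baseline = median(past)
        if baseline > 0 and bytes_billed > multiple * baseline:
            flagged.append((name, bytes_billed, baseline, bytes_billed / baseline))
    return flagged


# Hypothetical sample data: one query explodes, the other stays in range.
history = {
    "daily_revenue_rollup": [10 * GIB, 11 * GIB, 9 * GIB, 10 * GIB, 12 * GIB],
    "hourly_sessions": [2 * GIB, 2 * GIB, 3 * GIB, 2 * GIB, 2 * GIB],
}
latest = {"daily_revenue_rollup": 900 * GIB, "hourly_sessions": 3 * GIB}

spikes = flag_spikes(history, latest)
for name, billed, baseline, ratio in spikes:
    print(f"{name}: {billed / GIB:.0f} GiB vs median {baseline / GIB:.0f} GiB ({ratio:.0f}x)")
```

A median is used rather than a mean so that one earlier spike in the history window does not inflate the baseline and mask the next one.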

When to use it

Use it on a project where a single misbehaving scheduled query can rack up thousands in on-demand cost overnight. Ideal for finance-sensitive analytics warehouses that lack hard slot reservations.
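To see how quickly on-demand cost adds up, here is a rough sketch that converts bytes billed into dollars. The price of 6.25 USD per TiB is an assumption for illustration only (on-demand rates vary by region and change over time), and the function names and sample numbers are hypothetical.

```python
TIB = 1024 ** 4


def on_demand_cost_usd(bytes_billed, usd_per_tib=6.25):
    """Estimate on-demand cost. usd_per_tib is an assumed rate, check your own pricing."""
    return bytes_billed / TIB * usd_per_tib


def projected_daily_overspend_usd(bytes_billed, baseline_bytes, runs_per_day, usd_per_tib=6.25):
    """Extra cost per day if every remaining run repeats the spike."""
    extra_bytes = max(bytes_billed - baseline_bytes, 0)
    return on_demand_cost_usd(extra_bytes, usd_per_tib) * runs_per_day


# Hypothetical case: an hourly query that normally bills 0.5 TiB suddenly bills 40 TiB.
spike_cost = on_demand_cost_usd(40 * TIB)
overspend = projected_daily_overspend_usd(40 * TIB, TIB // 2, runs_per_day=24)
print(f"one run: ${spike_cost:,.2f}, projected daily overspend: ${overspend:,.2f}")
```

With these made-up numbers a single run costs about $250, and leaving the schedule untouched for a day projects to $5,925 above baseline.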

How it works

  1. An hourly schedule fires.
  2. A BigQuery action reads `INFORMATION_SCHEMA.JOBS` for scheduled-query jobs in the last hour, pulling total bytes billed per query.
  3. A logic step compares each job's bytes billed against that query's trailing 30-day median and flags jobs exceeding a multiple of the baseline.
  4. For each runaway, a BigQuery action disables the underlying transfer config to stop the next run.
  5. A Slack output posts the query, cost delta, and the pause action taken.
  6. A PagerDuty action opens an incident when the projected daily overspend crosses a dollar threshold.
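The read and alert steps above could look roughly like this in Python. It is a sketch under stated assumptions, not the workflow's real implementation: it assumes scheduled-query jobs can be recognized by a `scheduled_query_` job ID prefix, that the project lives in the `region-us` location, and that alerts go out as a Slack incoming-webhook body and a PagerDuty Events API v2 trigger event. The helper names, the routing key, and the sample values are hypothetical, and the code only builds the query and payloads without sending anything.

```python
import json
import re


def build_jobs_query(region="region-us", lookback_hours=1):
    """Build the step-2 query. `region` should come from trusted config, not user input."""
    if not re.fullmatch(r"region-[a-z0-9-]+", region):
        raise ValueError(f"unexpected region qualifier: {region!r}")
    return (
        "SELECT job_id, creation_time, total_bytes_billed\n"
        f"FROM `{region}`.INFORMATION_SCHEMA.JOBS\n"
        "WHERE creation_time >= TIMESTAMP_SUB(CURRENT_TIMESTAMP(), "
        f"INTERVAL {int(lookback_hours)} HOUR)\n"
        "  AND job_type = 'QUERY'\n"
        "  AND state = 'DONE'\n"
        # Assumption: scheduled-query jobs carry this job_id prefix.
        "  AND STARTS_WITH(job_id, 'scheduled_query_')\n"
        "ORDER BY total_bytes_billed DESC"
    )


def slack_message(query_name, ratio, paused):
    """Body for a Slack incoming webhook (step 5)."""
    action = "transfer config paused" if paused else "no action taken"
    return {"text": f"{query_name} billed {ratio:.1f}x its 30-day median ({action})."}


def pagerduty_event(routing_key, query_name, overspend_usd, threshold_usd):
    """Return an Events API v2 trigger body (step 6), or None below the threshold."""
    if overspend_usd < threshold_usd:
        return None
    return {
        "routing_key": routing_key,
        "event_action": "trigger",
        "payload": {
            "summary": f"{query_name}: projected daily overspend ${overspend_usd:,.0f}",
            "source": "bigquery-cost-circuit-breaker",
            "severity": "critical",
        },
    }


sql = build_jobs_query()
msg = slack_message("daily_revenue_rollup", 90.0, paused=True)
event = pagerduty_event("HYPOTHETICAL_ROUTING_KEY", "daily_revenue_rollup", 5925.0, 1000.0)
print(sql)
print(json.dumps(msg))
print(json.dumps(event, indent=2))
```

Step 4 is left out because it needs live credentials. One plausible route is the BigQuery Data Transfer Service client, updating the transfer config with `disabled` set to true and an update mask limited to that field.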

Set it up

What you configure once, before turning it on.

  1. Connect BigQuery: datasets, queries, schemas.
  2. Connect Slack: channels, DMs, threads, mentions.
  3. Connect PagerDuty: incidents, on-call, escalations.
  4. Set each agent's model. We leave models unset so you pick the tier: fast and cheap, or top-quality.
  5. Tune it to your data. Edit the prompts, filters, and field mappings so it matches how your team works.
  6. Test, then turn it on. Run once against a sample, confirm the output, then enable the trigger.
