Daily BigQuery slow-query regression hunter to GitHub tuning issue

Each morning, this workflow scans yesterday's BigQuery jobs against a rolling baseline and, for any query whose runtime or bytes billed regressed past a threshold, opens a GitHub tuning issue.

Category: Engineering
Engine: sim
Difficulty: intermediate
Trigger: schedule
Steps: 6
Setup: ~15 min

How it runs

The automated pipeline, trigger to output.

  • Trigger: Daily schedule after jobs settle
  • Action: Query yesterday's jobs from INFORMATION_SCHEMA.JOBS (Google BigQuery)
  • Action: Pull 14-day baseline per query hash (Google BigQuery)
  • Logic: Keep only queries past runtime/cost threshold
  • Action: Fetch query plan and compute cost delta (Google BigQuery)
  • Output: Open GitHub tuning issue with plan and deltas (GitHub)
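As a rough illustration of the two BigQuery reads in this pipeline, here is a minimal sketch that builds the SQL for a window of past days. The helper name `jobs_sql`, the `region-us` default, and the day offsets are assumptions for illustration, not the workflow's actual query. It relies on the `query_info.query_hashes.normalized_literals` field of INFORMATION_SCHEMA.JOBS as the per-query hash and derives runtime from `start_time` and `end_time`.

```python
def jobs_sql(oldest_days_ago: int, newest_days_ago: int, region: str = "region-us") -> str:
    """Build a query over INFORMATION_SCHEMA.JOBS for a window of past days.

    Hypothetical helper: (1, 1) selects yesterday, (15, 2) selects the
    14 days before yesterday as a baseline window.
    """
    if not (oldest_days_ago >= newest_days_ago >= 0):
        raise ValueError("expected oldest_days_ago >= newest_days_ago >= 0")
    return f"""
SELECT
  query_info.query_hashes.normalized_literals AS query_hash,
  job_id,
  query,
  total_bytes_billed,
  TIMESTAMP_DIFF(end_time, start_time, MILLISECOND) / 1000 AS runtime_s
FROM `{region}`.INFORMATION_SCHEMA.JOBS
WHERE job_type = 'QUERY'
  AND state = 'DONE'
  AND error_result IS NULL
  AND DATE(creation_time) BETWEEN
      DATE_SUB(CURRENT_DATE(), INTERVAL {int(oldest_days_ago)} DAY)
      AND DATE_SUB(CURRENT_DATE(), INTERVAL {int(newest_days_ago)} DAY)
""".strip()


yesterday_sql = jobs_sql(1, 1)   # yesterday's completed query jobs
baseline_sql = jobs_sql(15, 2)   # the 14 days before yesterday
```

Filtering on `creation_time` keeps the scan of the jobs view bounded to the days of interest; the region qualifier must match where your jobs actually run.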

What it does

This workflow finds queries that got slower or more expensive overnight and files an actionable tuning ticket so the regression doesn't quietly drain your BigQuery budget. Instead of a generic alert, the issue carries the offending query text, its execution-stage plan, and a before/after cost delta so an engineer can start tuning immediately.
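To make the shape of such a ticket concrete, the sketch below assembles an issue payload from a query, its stage summary, and before/after numbers. The function names, the input dictionaries, and the title format are hypothetical; only the `title`, `body`, and `labels` keys correspond to fields accepted by GitHub's create-issue REST endpoint. The payload is built but not sent.

```python
FENCE = "`" * 3  # code-fence marker, built at runtime so it nests cleanly here


def pct_change(before: float, after: float):
    """Percent change from before to after, or None when before is zero."""
    if before == 0:
        return None
    return (after - before) / before * 100.0


def delta_row(label: str, before: float, after: float, unit: str) -> str:
    """One Markdown table row comparing a baseline value with yesterday's."""
    change = pct_change(before, after)
    change_txt = "n/a" if change is None else f"{change:+.0f}%"
    return f"| {label} | {before:,.2f} {unit} | {after:,.2f} {unit} | {change_txt} |"


def build_issue_payload(query_hash, query_text, stage_lines, baseline, current):
    """Assemble title/body/labels for a tuning issue.

    baseline and current are assumed to be dicts with 'runtime_s' and
    'gib_billed' keys; stage_lines is a list of one-line stage summaries.
    """
    body = "\n".join([
        "## Query",
        f"{FENCE}sql",
        query_text.strip(),
        FENCE,
        "## Execution stages",
        *[f"- {line}" for line in stage_lines],
        "## Before / after",
        "| Metric | Baseline | Yesterday | Change |",
        "| --- | --- | --- | --- |",
        delta_row("Runtime", baseline["runtime_s"], current["runtime_s"], "s"),
        delta_row("Billed", baseline["gib_billed"], current["gib_billed"], "GiB"),
    ])
    return {
        "title": f"Query regression: {query_hash[:12]}",
        "body": body,
        "labels": ["tuning"],
    }


payload = build_issue_payload(
    "abcdef0123456789",
    "SELECT 1",
    ["S00: Input", "S01: Join"],
    {"runtime_s": 10.0, "gib_billed": 2.0},
    {"runtime_s": 30.0, "gib_billed": 2.0},
)
```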

When to use it

Run it on data platforms where scheduled queries and dbt models change often and a single bad join or dropped partition filter can 10x a run's cost. Best for teams who want a paper trail of regressions in GitHub rather than a noisy Slack feed.

How it works

  1. A daily schedule fires after the prior day's jobs have settled.
  2. BigQuery's INFORMATION_SCHEMA.JOBS is queried for completed jobs and their total_bytes_billed and runtime.
  3. The same query hashes are pulled from a 14-day baseline to compute per-query deltas.
  4. A logic step keeps only queries that regressed beyond the runtime or cost threshold.
  5. For each survivor, the query plan and cost delta are formatted into a tuning report.
  6. A GitHub issue is opened with the plan, deltas, and a 'tuning' label as the final output.
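The comparison and filter in steps 3 and 4 could look something like the sketch below. It assumes the baseline is summarized by the per-hash median and that a regression means exceeding a 2x ratio on either runtime or bytes billed; the source does not state the actual statistic or thresholds, so treat `JobStats`, `find_regressions`, and both defaults as illustrative.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class JobStats:
    query_hash: str
    runtime_s: float
    bytes_billed: int


def ratio(current: float, base: float) -> float:
    """current / base, treating a zero baseline as no change or infinite growth."""
    if base <= 0:
        return float("inf") if current > 0 else 1.0
    return current / base


def find_regressions(yesterday, baseline, runtime_ratio=2.0, bytes_ratio=2.0):
    """Return (job, runtime_x, bytes_x) for jobs past either threshold.

    Thresholds are assumed defaults, not values taken from the workflow.
    """
    history_by_hash = {}
    for job in baseline:
        history_by_hash.setdefault(job.query_hash, []).append(job)

    regressions = []
    for job in yesterday:
        history = history_by_hash.get(job.query_hash)
        if not history:
            continue  # no baseline for this hash, so nothing to compare against
        runtime_x = ratio(job.runtime_s, median(j.runtime_s for j in history))
        bytes_x = ratio(job.bytes_billed, median(j.bytes_billed for j in history))
        if runtime_x >= runtime_ratio or bytes_x >= bytes_ratio:
            regressions.append((job, runtime_x, bytes_x))
    return regressions


baseline = [JobStats("a", 10, 100), JobStats("a", 12, 100), JobStats("b", 5, 50)]
yesterday = [JobStats("a", 30, 100), JobStats("b", 5, 55), JobStats("c", 99, 999)]
regressed = find_regressions(yesterday, baseline)
```

In this example only hash `a` survives: its runtime is well over twice its baseline median, `b` moved by about 10%, and `c` has no history. A median is used rather than a mean so one unusually slow day in the baseline does not mask a later regression.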

Set it up

What you configure once, before turning it on.

  1. Connect BigQuery: Datasets, queries, schemas.
  2. Connect GitHub: Repos, issues, pull requests, actions.
  3. Set each agent's model: We leave models unset so you pick the tier, either fast and cheap or top-quality.
  4. Tune it to your data: Edit the prompts, filters, and field mappings so it matches how your team works.
  5. Test, then turn it on: Run once against a sample, confirm the output, then enable the trigger.
