BigQuery Regression LLM Root-Cause Explainer

When a cost spike is detected, this workflow sends the previous and current query SQL plus job stats to an LLM, which explains in plain English why the query got more expensive and suggests a concrete fix.

Category: Data Ops
Engine: sim
Difficulty: advanced
Trigger: schedule
Steps: 5
Setup: ~25 min

How it runs

The automated pipeline, trigger to output.

Trigger: Daily schedule
Action: Find largest regressor + job stats (BigQuery)
Action: Pull current and previous SQL (GitHub)
Action: LLM explains root cause + suggests fix (OpenAI)
Output: Send diagnosis to Slack (Slack)

What it does

Turns a raw slot-hour spike into a human-readable root cause: it feeds the before/after SQL and the BigQuery job statistics to an LLM, which explains the regression (e.g. a dropped partition filter, a new cross join) and proposes a fix.
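As a rough illustration of what "feeding the before/after SQL and job statistics to an LLM" might look like, the sketch below assembles a single prompt string from both SQL versions, a unified diff between them, and a stats dictionary. The function name `build_prompt`, the prompt wording, and the stats keys are all hypothetical; the workflow's real prompt is something you edit during setup.

```python
import difflib
import json


def build_prompt(old_sql: str, new_sql: str, job_stats: dict) -> str:
    """Assemble a root-cause prompt (hypothetical layout, not the product's own)."""
    # A unified diff highlights what changed, e.g. a removed partition filter.
    diff = "\n".join(
        difflib.unified_diff(
            old_sql.splitlines(),
            new_sql.splitlines(),
            fromfile="previous.sql",
            tofile="current.sql",
            lineterm="",
        )
    )
    sections = [
        "A scheduled BigQuery query became more expensive. "
        "Explain the most likely root cause and suggest one concrete fix.",
        "Previous SQL:\n" + old_sql,
        "Current SQL:\n" + new_sql,
        "Diff:\n" + diff,
        # Stats keys are whatever the BigQuery step returned; none are assumed here.
        "Job stats:\n" + json.dumps(job_stats, indent=2, sort_keys=True),
    ]
    return "\n\n".join(sections)


if __name__ == "__main__":
    previous = "SELECT id\nFROM events\nWHERE event_date = CURRENT_DATE()"
    current = "SELECT id\nFROM events"
    print(build_prompt(previous, current, {"slot_hours": 12.5}))
```

Including the diff alongside the full SQL is a design choice: the diff points the model at the change, while the full text gives it the context needed to judge the impact.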

When to use it

When your team can detect cost spikes but loses time diagnosing *why* a query got more expensive. Use it to get a first-pass diagnosis attached to every regression alert.

How it works

1. A scheduled trigger fires daily.
2. A BigQuery query identifies the scheduled query with the largest slot-hour increase versus baseline, along with bytes scanned and stage timing.
3. A GitHub action pulls the current and previous SQL for that query.
4. An OpenAI step receives both SQL versions and the job stats and returns a root-cause explanation plus a suggested optimization.
5. A Slack message delivers the spike metrics, the LLM diagnosis, and the proposed fix.
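Steps 2 and 5 can be sketched in plain Python, assuming the BigQuery step has already produced one row per scheduled query. The `QueryUsage` shape, its field names, and the helper functions are hypothetical and not the workflow's actual schema; the only fixed fact is the unit conversion (one slot-hour is 3,600,000 slot-milliseconds).

```python
from dataclasses import dataclass
from typing import Iterable, Optional

MS_PER_HOUR = 3_600_000  # 1 slot-hour = 3,600,000 slot-milliseconds


@dataclass
class QueryUsage:
    """One scheduled query's usage (hypothetical row shape)."""
    name: str
    baseline_slot_ms: float  # assumed: some baseline, e.g. a trailing average
    current_slot_ms: float   # latest run
    bytes_scanned: int


def slot_hours(slot_ms: float) -> float:
    return slot_ms / MS_PER_HOUR


def largest_regressor(usages: Iterable[QueryUsage]) -> Optional[QueryUsage]:
    """Return the query with the biggest slot-hour increase, or None if none grew."""
    regressed = [u for u in usages if u.current_slot_ms > u.baseline_slot_ms]
    if not regressed:
        return None
    return max(regressed, key=lambda u: u.current_slot_ms - u.baseline_slot_ms)


def format_slack_message(u: QueryUsage, diagnosis: str, fix: str) -> str:
    """Combine spike metrics, LLM diagnosis, and proposed fix into one message."""
    return (
        f"Cost regression: {u.name}\n"
        f"Slot-hours: {slot_hours(u.baseline_slot_ms):.2f} -> "
        f"{slot_hours(u.current_slot_ms):.2f}\n"
        f"Bytes scanned: {u.bytes_scanned:,}\n"
        f"Diagnosis: {diagnosis}\n"
        f"Suggested fix: {fix}"
    )


if __name__ == "__main__":
    rows = [
        QueryUsage("daily_orders", 3_600_000, 7_200_000, 10_000),
        QueryUsage("events_rollup", 3_600_000, 18_000_000, 250_000),
    ]
    worst = largest_regressor(rows)
    if worst is not None:
        print(format_slack_message(worst, "partition filter removed", "restore the filter"))
```

This sketch ranks by absolute increase; ranking by percentage increase instead would favor small queries that doubled over large queries that grew modestly, so the choice depends on what your team wants alerted.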

Set it up

What you configure once, before turning it on.

1. Connect BigQuery: Datasets, queries, schemas.
2. Connect GitHub: Repos, issues, pull requests, actions.
3. Connect OpenAI: Models, embeddings, files.
4. Connect Slack: Channels, DMs, threads, mentions.
5. Set each agent's model: We leave models unset so you pick the tier: fast + cheap, or top-quality.
6. Tune it to your data: Edit the prompts, filters, and field mappings so it matches how your team works.
7. Test, then turn it on: Run once against a sample, confirm the output, then enable the trigger.
