DATA OPS
BigQuery Cost Anomaly Root-Cause Triage Agent
When a cost-anomaly webhook fires, an agent investigates the offending BigQuery query by inspecting its plan, partition usage, and recent edits.
How it runs
The automated pipeline, trigger to output.
- Trigger: Cost-anomaly webhook received (HTTP webhook)
- Action: Inspect query plan, partition + cluster usage (BigQuery)
- Action: Pull recent edit history for the query (BigQuery)
- Logic: Reason over evidence to root-cause + rank fix
- Action: Write root-cause analysis to Notion triage page (Notion)
- Output: Notify query owner in Slack (Slack)
What it does
Goes beyond detection to diagnosis. On a cost-anomaly signal, an agent pulls the offending query's execution plan, checks whether it scans unpartitioned data or misses clustering, and reviews who changed it recently and how. It then writes a human-readable root-cause analysis: what regressed, the likely cause, and a concrete fix (add a partition filter, materialize a CTE, narrow a SELECT *). The write-up lands in a Notion triage page and the owner gets pinged in Slack.
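To make the three example fixes concrete, here is a minimal sketch of text-level checks that could flag them. It is an illustration only, not how the agent works internally: the function name `suggest_fixes`, the regex heuristics, and the ordering of the results are all assumptions, and a real investigation would rely on the execution plan rather than on the SQL text alone.

```python
import re
from typing import List, Optional


def suggest_fixes(sql: str, partition_column: Optional[str]) -> List[str]:
    """Return candidate fixes for a query (hypothetical heuristics, text-only)."""
    fixes: List[str] = []
    # Collapse whitespace so the patterns below can ignore line breaks.
    normalized = re.sub(r"\s+", " ", sql).strip()

    # 1. Partition column never appears after a WHERE keyword.
    if partition_column is not None:
        pattern = r"\bwhere\b.*\b" + re.escape(partition_column) + r"\b"
        if not re.search(pattern, normalized, re.I):
            fixes.append(f"add a partition filter on {partition_column}")

    # 2. A CTE referenced more than once may be evaluated once per reference.
    for name in re.findall(r"\b(\w+)\s+AS\s*\(\s*SELECT\b", normalized, re.I):
        ref_pattern = r"\b(?:FROM|JOIN)\s+" + re.escape(name) + r"\b"
        count = len(re.findall(ref_pattern, normalized, re.I))
        if count >= 2:
            fixes.append(f"materialize CTE {name} (referenced {count} times)")

    # 3. SELECT * reads every column of the underlying table.
    if re.search(r"\bselect\s+\*", normalized, re.I):
        fixes.append("narrow SELECT * to the columns actually used")

    return fixes
```

Called with a query that has no WHERE clause, reuses a CTE twice, and selects every column, the sketch returns all three suggestions; a query that already filters on the partition column and names its columns returns an empty list.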
When to use it
Use this when raw alerts create triage toil and your data team spends mornings reverse-engineering why a query got expensive. The agent does the first-pass investigation so humans start from a hypothesis.
How it works
- 1. A cost-anomaly webhook triggers the agent.
- 2. The agent queries BigQuery for the job's plan, bytes scanned, and partition/cluster usage.
- 3. It pulls recent edit history to correlate the regression with a change.
- 4. It reasons over the evidence to produce a root cause and ranked fix recommendation.
- 5. The analysis is written to a Notion triage page and the owner is notified in Slack.
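The five steps above can be sketched as a small orchestration function. This is a hedged illustration, not the product's implementation: every name here (`run_triage`, `rule_based_reason`, the payload key `job_id`, and the job fields `owner`, `bytes_scanned`, `baseline_bytes`, `partition_filter_used`) is hypothetical, the BigQuery, Notion, and Slack calls are passed in as plain callables, and a simple rule stands in for the agent's reasoning step.

```python
from typing import Any, Callable, Dict, List

Job = Dict[str, Any]
Edit = Dict[str, str]
Analysis = Dict[str, Any]


def rule_based_reason(job: Job, edits: List[Edit]) -> Analysis:
    """Stand-in for step 4; the real agent reasons with a model, not rules."""
    growth = job["bytes_scanned"] / max(job["baseline_bytes"], 1)
    cause = f"bytes scanned grew {growth:.1f}x over baseline"
    if edits:
        # Assumes edits are ordered newest first.
        latest = edits[0]
        cause += f"; most recent edit by {latest['author']}: {latest['summary']}"
    fixes: List[str] = []
    if not job["partition_filter_used"]:
        fixes.append("add a partition filter")
    return {"job_id": job["job_id"], "root_cause": cause, "fixes": fixes}


def run_triage(
    payload: Dict[str, Any],
    fetch_job: Callable[[str], Job],
    fetch_edits: Callable[[str], List[Edit]],
    reason: Callable[[Job, List[Edit]], Analysis],
    write_page: Callable[[Analysis], str],
    notify_owner: Callable[[str, str], None],
) -> Analysis:
    job_id = payload["job_id"]             # 1. webhook payload names the job
    job = fetch_job(job_id)                # 2. plan, bytes scanned, partition use
    edits = fetch_edits(job_id)            # 3. recent edit history
    analysis = reason(job, edits)          # 4. root cause and ranked fixes
    page_url = write_page(analysis)        # 5. write the triage page...
    notify_owner(job["owner"], page_url)   #    ...and notify the owner
    return analysis
```

Passing the integrations in as callables keeps the flow testable with fakes, so the ordering of the steps can be checked without credentials for any of the three services.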
Set it up
What you configure once, before turning it on.
- 1. Connect HTTP webhook: Trigger any URL on agent actions.
- 2. Connect BigQuery: Datasets, queries, schemas.
- 3. Connect Notion: Pages, databases, comments.
- 4. Connect Slack: Channels, DMs, threads, mentions.
- 5. Set each agent's model: We leave models unset so you pick the tier, fast + cheap or top-quality.
- 6. Tune it to your data: Edit the prompts, filters, and field mappings so it matches how your team works.
- 7. Test, then turn it on: Run once against a sample, confirm the output, then enable the trigger.
More Data Ops workflows
Snowflake column type-drift sentinel with Linear fix ticket
Snapshots the data types of every column in your tracked Snowflake schemas on a schedule, diffs against the last snapshot.
Daily BigQuery Scheduled-Query Cost Attribution to Owners
Each morning, totals the prior day's on-demand bytes-billed per scheduled query, maps each query to its owner from a label, and posts a per-owner cost leaderboard to Slack.
BigQuery dropped/renamed column sentinel with PagerDuty incident
Detects when a column is dropped or renamed in your governed BigQuery datasets and, because that breaks downstream queries hard, pages the on-call via PagerDuty and posts…
PR-time Snowflake schema contract check on dbt model changes
When a pull request changes a dbt model, it compares the model's declared output columns against the live Snowflake table it will replace and blocks the merge with a GitHub check…
Agent-triaged warehouse drift with impact analysis and runbook update
On a webhook from your warehouse audit log, an agent investigates the changed column, traces which downstream models and dashboards depend on it.
Cross-warehouse replication schema mismatch reconciler
Compares the column shape of mirrored tables between BigQuery and Snowflake and, when a replicated table has drifted out of sync between the two, opens an Asana task for the data…
Run it inside a business
This workflow drops into a full company template. Import the org, and this is one of the playbooks its agents run.

