BigQuery Schema Drift Detector

Snapshots BigQuery dataset schemas daily and diffs them against the last known-good snapshot, opening a Linear issue for the owning team when columns are added, dropped…

Category: Data Ops
Engine: sim
Difficulty: advanced
Trigger: schedule
Steps: 6
Setup: ~25 min

How it runs

The automated pipeline, trigger to output.

  • Trigger: Daily after load window
  • Action: Read current column schema from BigQuery
  • Action: Load prior snapshot from Postgres
  • Logic: Diff and classify breaking vs additive
  • Action: Open Linear issue for breaking changes
  • Output: Write new snapshot back to Postgres

What it does

Captures the column list and types for every table in a watched BigQuery dataset, stores the snapshot, and compares it to yesterday's. When a column is dropped, renamed, retyped, or added, it classifies the change as breaking or additive and files a tracked issue so a producer change can't quietly break a downstream consumer.
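As a rough illustration of the capture step, the sketch below folds rows shaped like BigQuery's INFORMATION_SCHEMA.COLUMNS view into a nested snapshot. The project and dataset IDs in the query, the `build_snapshot` helper, and the snapshot layout are all assumptions made for this example, not the workflow's actual internals; the rows are hard-coded so the snippet runs without a BigQuery connection.

```python
from collections import defaultdict
from typing import Dict, Iterable, Mapping, Tuple

# Placeholder project and dataset IDs; substitute your own.
SNAPSHOT_QUERY = """
SELECT table_name, column_name, data_type, is_nullable
FROM `my-project.my_dataset`.INFORMATION_SCHEMA.COLUMNS
ORDER BY table_name, ordinal_position
"""

Column = Tuple[str, bool]                # (data_type, nullable)
Snapshot = Dict[str, Dict[str, Column]]  # table -> column -> Column


def build_snapshot(rows: Iterable[Mapping[str, str]]) -> Snapshot:
    """Fold INFORMATION_SCHEMA.COLUMNS-style rows into a nested snapshot."""
    snapshot: Snapshot = defaultdict(dict)
    for row in rows:
        # BigQuery reports is_nullable as the string 'YES' or 'NO'.
        nullable = row["is_nullable"] == "YES"
        snapshot[row["table_name"]][row["column_name"]] = (row["data_type"], nullable)
    return dict(snapshot)


# Hard-coded sample rows standing in for a real query result.
rows = [
    {"table_name": "orders", "column_name": "id", "data_type": "INT64", "is_nullable": "NO"},
    {"table_name": "orders", "column_name": "note", "data_type": "STRING", "is_nullable": "YES"},
]
snapshot = build_snapshot(rows)
```

Keeping nullability next to the type matters later, because whether a new column is additive depends on whether it is nullable.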

When to use it

Use it when upstream teams ship schema changes without warning the analytics or ML teams that consume those tables, causing pipeline failures or silently wrong joins. It turns schema drift into a reviewable ticket instead of a 2 a.m. incident.

How it works

  1. A daily schedule fires after the main load window.
  2. The flow pulls INFORMATION_SCHEMA columns for the watched dataset from BigQuery.
  3. It loads the prior snapshot from Postgres and diffs column names and types.
  4. A logic step splits changes into breaking (drop/retype) versus additive (new nullable column).
  5. Breaking changes open a Linear issue assigned to the table owner with the exact diff; the new snapshot is written back to Postgres for the next run.
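The diff and classification in steps 3 and 4 can be sketched for a single table as follows. This is a minimal illustration, not the workflow's real logic: the `Change` record and `diff_table` function are hypothetical names, and treating a new non-nullable column as breaking is an assumption, since the description only says that a new nullable column is additive. A rename surfaces here as a drop plus an add, because a name-and-type diff cannot tell the two apart.

```python
from typing import Dict, List, NamedTuple, Tuple

Column = Tuple[str, bool]        # (data_type, nullable)
TableSchema = Dict[str, Column]  # column name -> Column


class Change(NamedTuple):
    column: str
    kind: str      # "dropped", "retyped", or "added"
    severity: str  # "breaking" or "additive"
    detail: str


def diff_table(prior: TableSchema, current: TableSchema) -> List[Change]:
    """Compare two snapshots of one table and classify each change."""
    changes: List[Change] = []
    # Columns present yesterday but missing today.
    for name in sorted(prior.keys() - current.keys()):
        changes.append(Change(name, "dropped", "breaking", f"was {prior[name][0]}"))
    # Columns present in both snapshots whose type changed.
    for name in sorted(prior.keys() & current.keys()):
        old_type, new_type = prior[name][0], current[name][0]
        if old_type != new_type:
            changes.append(Change(name, "retyped", "breaking", f"{old_type} -> {new_type}"))
    # Columns that are new today.
    for name in sorted(current.keys() - prior.keys()):
        data_type, nullable = current[name]
        # Assumption: only a nullable new column counts as additive.
        severity = "additive" if nullable else "breaking"
        changes.append(Change(name, "added", severity, data_type))
    return changes


prior = {"id": ("INT64", False), "amount": ("FLOAT64", True), "legacy": ("STRING", True)}
current = {"id": ("INT64", False), "amount": ("NUMERIC", True), "note": ("STRING", True)}

changes = diff_table(prior, current)
breaking = [c for c in changes if c.severity == "breaking"]
```

In this example only the entries in `breaking` (the dropped `legacy` column and the retyped `amount` column) would go into an issue; the nullable `note` column is recorded in the new snapshot without raising one.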

Set it up

What you configure once, before turning it on.

  1. Connect BigQuery: Datasets, queries, schemas.
  2. Connect Postgres: Any Postgres URL (query, write, migrate).
  3. Connect Linear: Issues, projects, cycles, triage.
  4. Set each agent's model: We leave models unset so you pick the tier (fast and cheap, or top quality).
  5. Tune it to your data: Edit the prompts, filters, and field mappings so it matches how your team works.
  6. Test, then turn it on: Run once against a sample, confirm the output, then enable the trigger.
