BigQuery schema guard for dbt sources

Validates that every BigQuery source table still matches the columns your dbt sources expect, and opens a GitHub issue with a proposed fix when the schema has drifted.

Category: Data Ops
Engine: sim
Difficulty: advanced
Trigger: schedule
Steps: 5
Setup: ~25 min

How it runs

The automated pipeline, trigger to output.

  • Trigger: Pre-build schedule fires
  • Action: Read dbt `sources.yml` from GitHub repo (GitHub)
  • Action: Fetch actual columns from BigQuery `INFORMATION_SCHEMA` (Google BigQuery)
  • Logic: Compare declared contract to actual schema
  • Output: Open GitHub issue with broken sources and yml patch (GitHub)

What it does

Reads the column contract declared in your dbt `sources.yml`, then checks each referenced BigQuery table to confirm those columns still exist with compatible types. When a source has drifted out from under your models, it files a GitHub issue describing exactly which source and which columns broke.
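
As a rough illustration of the "read the contract, then look up each table" part, the sketch below pulls declared columns out of an already-parsed `sources.yml` and builds the matching `INFORMATION_SCHEMA.COLUMNS` query. It is a minimal sketch, not the workflow's actual implementation: the function names `extract_contract` and `columns_query`, the `raw_shop` source, and the `my-project` project ID are all hypothetical. It assumes the YAML has already been loaded into a Python dict and that columns declare their type through dbt's `data_type` property, which is optional.

```python
from typing import Dict, Optional, Tuple

# (dataset, table) -> {column_name: declared_type_or_None}
Contract = Dict[Tuple[str, str], Dict[str, Optional[str]]]


def extract_contract(parsed_sources: dict) -> Contract:
    """Collect declared columns per source table from a parsed sources.yml."""
    contract: Contract = {}
    for source in parsed_sources.get("sources", []):
        # In dbt, `schema` names the BigQuery dataset and defaults to the source name.
        dataset = source.get("schema", source["name"])
        for table in source.get("tables", []):
            # `identifier` overrides the physical table name when it is set.
            table_name = table.get("identifier", table["name"])
            columns: Dict[str, Optional[str]] = {}
            for col in table.get("columns", []):
                declared = col.get("data_type")  # optional in sources.yml
                columns[col["name"].lower()] = declared.upper() if declared else None
            contract[(dataset, table_name)] = columns
    return contract


def columns_query(project: str, dataset: str) -> str:
    """Build a dataset-scoped INFORMATION_SCHEMA.COLUMNS query.

    Assumption: project and dataset come from trusted config, since they are
    interpolated into the SQL text as identifiers.
    """
    return (
        "SELECT table_name, column_name, data_type\n"
        f"FROM `{project}.{dataset}.INFORMATION_SCHEMA.COLUMNS`\n"
        "ORDER BY table_name, ordinal_position"
    )


# Hypothetical sample input, shaped like a parsed sources.yml.
sample = {
    "version": 2,
    "sources": [
        {
            "name": "raw_shop",
            "tables": [
                {
                    "name": "orders",
                    "columns": [
                        {"name": "order_id", "data_type": "int64"},
                        {"name": "status"},  # no declared type: existence check only
                    ],
                }
            ],
        }
    ],
}

contract = extract_contract(sample)
query = columns_query("my-project", "raw_shop")
```

Columns without a declared `data_type` end up with `None`, so a later comparison can check that they exist without checking their type.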

When to use it

Your dbt project depends on raw tables loaded by Fivetran, Airbyte, or a custom pipeline. You want CI-style confidence that source schemas haven't changed before you run a build that would fail mid-DAG.

How it works

  1. A scheduled trigger runs ahead of your nightly dbt build.
  2. Read the project's `sources.yml` from GitHub to get the expected columns per source.
  3. Query BigQuery's `INFORMATION_SCHEMA.COLUMNS` for each declared source table.
  4. A logic step compares declared vs. actual columns and flags missing or retyped fields.
  5. If any contract is violated, open a GitHub issue listing the source, the broken columns, and a suggested yml patch.
  6. If everything matches, exit quietly so the build proceeds.
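
The comparison and issue-drafting steps above might look roughly like the sketch below. It makes several assumptions: `diff_columns` and `render_issue` are hypothetical names, "compatible" is simplified to an exact, case-insensitive type match (type aliases such as `INTEGER` for `INT64` are not handled), and the issue text is only built here, not sent to GitHub.

```python
from typing import Dict, Optional


def diff_columns(declared: Dict[str, Optional[str]], actual: Dict[str, str]) -> dict:
    """Compare declared {column: type_or_None} against actual {column: type}."""
    # Normalize case on both sides so name and type comparisons are case-insensitive.
    actual_norm = {name.lower(): dtype.upper() for name, dtype in actual.items()}
    missing, retyped = [], []
    for name, want in declared.items():
        got = actual_norm.get(name.lower())
        if got is None:
            missing.append(name)
        elif want is not None and want.upper() != got:
            # Simplifying assumption: any type difference counts as incompatible.
            retyped.append((name, want.upper(), got))
    return {"missing": sorted(missing), "retyped": sorted(retyped)}


def render_issue(source: str, table: str, drift: dict) -> Optional[str]:
    """Draft an issue body, or return None when the contract still holds."""
    if not drift["missing"] and not drift["retyped"]:
        return None  # nothing to report, so the build can proceed
    lines = [f"Schema drift in source `{source}.{table}`", ""]
    for name in drift["missing"]:
        lines.append(f"- missing column: `{name}`")
    for name, want, got in drift["retyped"]:
        lines.append(f"- retyped column: `{name}` (declared {want}, found {got})")
    if drift["retyped"]:
        # Only retyped columns get a patch; a missing column may have been
        # dropped or renamed, which needs a human decision.
        lines += ["", "Suggested sources.yml patch (retyped columns only):"]
        for name, _, got in drift["retyped"]:
            lines.append(f"    - name: {name}")
            lines.append(f"      data_type: {got}")
    return "\n".join(lines)


# Hypothetical example: one column retyped, one column gone.
declared = {"order_id": "INT64", "status": None, "total": "NUMERIC"}
actual = {"order_id": "STRING", "status": "STRING"}

drift = diff_columns(declared, actual)
body = render_issue("raw_shop", "orders", drift)
```

Returning `None` for a clean table mirrors the "exit quietly" branch: the caller only opens an issue when there is a body to post.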

Set it up

What you configure once, before turning it on.

  1. Connect BigQuery: datasets, queries, schemas.
  2. Connect GitHub: repos, issues, pull requests, actions.
  3. Set each agent's model. We leave models unset so you pick the tier: fast and cheap, or top quality.
  4. Tune it to your data. Edit the prompts, filters, and field mappings so it matches how your team works.
  5. Test, then turn it on. Run once against a sample, confirm the output, then enable the trigger.
