Diff Reverse-ETL Rejects in BigQuery and File Triage Tickets

Compares the rows BigQuery tried to sync against what landed in the destination Postgres database, isolates the rows that were rejected, and classifies the likely cause of each.

Category: Data Ops
Engine: sim
Difficulty: advanced
Trigger: schedule
Steps: 5
Setup: ~25 min

How it runs

The automated pipeline, trigger to output.

  • Trigger: Scheduled run after BigQuery-to-Postgres sync
  • Action: Query source row keys from BigQuery for the window (Google BigQuery)
  • Action: Query destination keys present in Postgres (PostgreSQL)
  • Logic: Diff sets, tag probable cause, group by signature
  • Output: Open one Linear ticket per failure pattern (Linear)

What it does

When a reverse-ETL push from BigQuery to an operational Postgres database silently drops rows, this workflow reconciles source against destination, pulls the set of rows that never arrived, and groups them by failure signature (type-mismatch, constraint violation, null key). It then files one Linear ticket per distinct pattern instead of one per row, so the backlog is actionable.
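
To make the idea of a failure signature concrete, here is a minimal Python sketch of tagging rejected rows and counting them per pattern. The schema, column names, and the `classify` helper are hypothetical and are not part of the workflow itself; the real logic step may use different heuristics.

```python
from collections import Counter

# Hypothetical destination schema: column -> (expected Python type, nullable).
DEST_SCHEMA = {
    "customer_id": (int, False),
    "email": (str, False),
    "plan": (str, True),
}
KEY_COLUMN = "customer_id"  # assumed primary key column


def classify(row: dict) -> str:
    """Return a probable-cause signature for one rejected source row."""
    if row.get(KEY_COLUMN) is None:
        return "null key"
    for column, (expected_type, nullable) in DEST_SCHEMA.items():
        value = row.get(column)
        if value is None:
            if not nullable:
                return f"constraint violation: {column} NOT NULL"
            continue
        if not isinstance(value, expected_type):
            return f"type-mismatch: {column}"
    return "unknown"


# Illustrative rejected rows (made-up data).
rejected = [
    {"customer_id": None, "email": "a@example.com", "plan": "pro"},
    {"customer_id": 7, "email": None, "plan": "free"},
    {"customer_id": "8", "email": "b@example.com", "plan": None},
    {"customer_id": 9, "email": None, "plan": None},
]

# One counter entry per distinct pattern, not one per row.
by_signature = Counter(classify(row) for row in rejected)
```

In this sketch four rejected rows collapse into three signatures, which is the reduction that lets the workflow open one ticket per pattern.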

When to use it

Use this when your application database is supposed to mirror a BigQuery model but you keep finding gaps and cannot tell which rows are missing or why. It produces a clean, deduplicated diff and turns it into triage work.

How it works

  1. A schedule runs after the BigQuery-to-Postgres sync completes.
  2. The flow queries the source row keys from BigQuery for the sync window.
  3. It queries the destination keys actually present in Postgres.
  4. A logic step computes the set difference to find rejected rows and tags each with a probable cause.
  5. Rows are grouped by failure signature and counted.
  6. One Linear issue is opened per signature with affected counts and example keys, labeled for the data platform team.
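
The diff and grouping steps above can be sketched in plain Python, assuming the two key lists have already been fetched from BigQuery and Postgres. The function names, the ticket fields, and the `data-platform` label are illustrative assumptions, not the workflow's actual payload or the Linear API.

```python
from collections import defaultdict


def diff_keys(source_keys, destination_keys):
    """Keys BigQuery tried to sync that are absent from Postgres."""
    return set(source_keys) - set(destination_keys)


def build_tickets(tagged_rows, max_examples=3):
    """Group (key, signature) pairs into one ticket draft per signature."""
    groups = defaultdict(list)
    for key, signature in tagged_rows:
        groups[signature].append(key)
    tickets = []
    for signature, keys in sorted(groups.items()):
        tickets.append({
            "title": f"Reverse-ETL rejects: {signature}",
            "affected_count": len(keys),
            "example_keys": sorted(keys)[:max_examples],
            "labels": ["data-platform"],  # assumed label name
        })
    return tickets


# Made-up keys for one sync window.
source_keys = [101, 102, 103, 104, 105]
destination_keys = [101, 103]
missing = diff_keys(source_keys, destination_keys)

# Hypothetical cause tags; in the workflow the logic step assigns these.
causes = {102: "constraint violation", 104: "type-mismatch", 105: "type-mismatch"}
tickets = build_tickets((key, causes[key]) for key in sorted(missing))
```

Here three missing keys produce two ticket drafts, each carrying its count and a capped list of example keys. Capping the examples keeps tickets readable when a single pattern affects thousands of rows.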

Set it up

What you configure once, before turning it on.

  1. Connect BigQuery: datasets, queries, schemas.
  2. Connect Postgres: any Postgres URL (query, write, migrate).
  3. Connect Linear: issues, projects, cycles, triage.
  4. Set each agent's model: models are left unset so you pick the tier, either fast and cheap or top-quality.
  5. Tune it to your data: edit the prompts, filters, and field mappings so it matches how your team works.
  6. Test, then turn it on: run once against a sample, confirm the output, then enable the trigger.
