DATA OPS
Daily Cross-Warehouse PII Inventory Snapshot to R2
Builds a unified daily inventory of every classified PII column across Snowflake and BigQuery and writes the versioned snapshot to R2.
How it runs
The automated pipeline, trigger to output.
- Trigger: Daily scheduled snapshot
- Action: Query Snowflake column catalog (Snowflake)
- Action: Query BigQuery column catalog (BigQuery)
- Action: Classify and merge into unified inventory (OpenAI)
- Action: Write versioned snapshot to R2 (Cloudflare R2)
- Logic: Diff against prior snapshot for delta
- Output: Post PII delta summary to Slack (Slack)
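One way to make the "versioned snapshot" and "prior snapshot" steps concrete is to key each object by its date, so that keys sort chronologically. The sketch below is an assumption, not the template's actual layout: the `pii-inventory/` prefix, the key format, and both function names are hypothetical.

```python
from datetime import date
from typing import List, Optional

PREFIX = "pii-inventory/"  # hypothetical key prefix inside the R2 bucket


def snapshot_key(day: date) -> str:
    """Build a date-versioned object key, e.g. pii-inventory/2024-05-02.json."""
    return f"{PREFIX}{day.isoformat()}.json"


def latest_prior_key(existing_keys: List[str], today: date) -> Optional[str]:
    """Return the newest snapshot key strictly older than today's, if any.

    ISO dates sort lexicographically in chronological order, so a plain
    string comparison is enough to find the most recent prior snapshot.
    """
    todays_key = snapshot_key(today)
    older = [k for k in existing_keys if k.startswith(PREFIX) and k < todays_key]
    return max(older, default=None)
```

Selecting the "most recent prior" key rather than strictly yesterday's means a skipped run does not break the next day's diff.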
What it does
Once a day it pulls the full column catalog from both Snowflake and BigQuery, classifies columns for PII, merges them into one canonical inventory, and stores a timestamped snapshot in R2. It then compares the new snapshot against the most recent prior one (normally yesterday's) and reports the net change (columns added, removed, or newly reclassified) to Slack.
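As a rough illustration of the "one canonical inventory" idea, the sketch below merges classified column records from both warehouses under a single normalized key. Everything here is an assumption made for illustration: the `ColumnRecord` fields, the lowercased dotted key, and the `merge_inventories` helper are hypothetical, and lowercasing may not suit warehouses configured with case-sensitive identifiers.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class ColumnRecord:
    """One classified column; the field names are illustrative only."""
    warehouse: str   # "snowflake" or "bigquery"
    database: str    # Snowflake database or BigQuery project
    schema: str      # Snowflake schema or BigQuery dataset
    table: str
    column: str
    pii_class: str   # label produced by the classification step


def column_key(rec: ColumnRecord) -> str:
    """Normalize identifiers to one lowercase dotted key (an assumption)."""
    parts = (rec.warehouse, rec.database, rec.schema, rec.table, rec.column)
    return ".".join(p.lower() for p in parts)


def merge_inventories(*sources: Iterable[ColumnRecord]) -> Dict[str, ColumnRecord]:
    """Combine per-warehouse records into one inventory sorted by key."""
    inventory: Dict[str, ColumnRecord] = {}
    for source in sources:
        for rec in source:
            inventory[column_key(rec)] = rec
    return dict(sorted(inventory.items()))
```

Sorting by key keeps the serialized snapshot stable from day to day, which makes later comparisons easier to read.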
When to use it
Use it when you need an auditable, point-in-time record of where PII lives across multiple warehouses for compliance reviews (SOC 2, GDPR data mapping) and want a single daily signal on how your sensitive-data footprint is drifting.
How it works
- 1. A daily scheduled trigger kicks off the snapshot.
- 2. Query Snowflake and BigQuery column catalogs in parallel.
- 3. An OpenAI step classifies each column and a logic step merges both sources into one normalized inventory.
- 4. Write the inventory as a versioned, timestamped object to R2.
- 5. Diff today's inventory against the most recent prior snapshot in R2 to compute added, removed, and reclassified columns.
- 6. Post the day-over-day PII delta summary to Slack.
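The diff and summary steps above can be sketched with plain set operations. This is a minimal sketch under stated assumptions: each snapshot is reduced to a mapping from a column key to its PII label, and the `diff_inventories` and `summarize` names and the summary wording are hypothetical rather than the workflow's actual output.

```python
from typing import Dict, List


def diff_inventories(prior: Dict[str, str], current: Dict[str, str]) -> Dict[str, List[str]]:
    """Compare two {column_key: pii_class} snapshots.

    added:        keys present only in the current snapshot
    removed:      keys present only in the prior snapshot
    reclassified: keys in both snapshots whose PII label changed
    """
    added = sorted(current.keys() - prior.keys())
    removed = sorted(prior.keys() - current.keys())
    reclassified = sorted(
        key for key in current.keys() & prior.keys() if current[key] != prior[key]
    )
    return {"added": added, "removed": removed, "reclassified": reclassified}


def summarize(delta: Dict[str, List[str]]) -> str:
    """Render a one-line, human-readable summary of the delta."""
    return "PII delta: +{} added, -{} removed, {} reclassified".format(
        len(delta["added"]), len(delta["removed"]), len(delta["reclassified"])
    )


prior = {"a.email": "email", "a.ssn": "ssn", "a.name": "none"}
current = {"a.email": "email", "a.name": "name", "a.phone": "phone"}
delta = diff_inventories(prior, current)
print(summarize(delta))  # prints "PII delta: +1 added, -1 removed, 1 reclassified"
```

On the first run there is no prior snapshot; passing an empty mapping as `prior` reports every column as added.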
Set it up
What you configure once, before turning it on.
- 1. Connect Snowflake: Warehouses, queries, shares.
- 2. Connect BigQuery: Datasets, queries, schemas.
- 3. Connect OpenAI: Models, embeddings, files.
- 4. Connect Cloudflare R2: Object storage, S3-compatible.
- 5. Connect Slack: Channels, DMs, threads, mentions.
- 6. Set each agent's model: We leave models unset so you pick the tier (fast + cheap, or top-quality).
- 7. Tune it to your data: Edit the prompts, filters, and field mappings so it matches how your team works.
- 8. Test, then turn it on: Run once against a sample, confirm the output, then enable the trigger.
More Data Ops workflows
Snowflake column type-drift sentinel with Linear fix ticket
Snapshots the data types of every column in your tracked Snowflake schemas on a schedule, diffs against the last snapshot.
Daily BigQuery Scheduled-Query Cost Attribution to Owners
Each morning, totals the prior day's on-demand bytes-billed per scheduled query, maps each query to its owner from a label, and posts a per-owner cost leaderboard to Slack.
BigQuery dropped/renamed column sentinel with PagerDuty incident
Detects when a column is dropped or renamed in your governed BigQuery datasets and, because that breaks downstream queries hard, pages the on-call via PagerDuty and posts…
PR-time Snowflake schema contract check on dbt model changes
When a pull request changes a dbt model, it compares the model's declared output columns against the live Snowflake table it will replace and blocks the merge with a GitHub check…
Agent-triaged warehouse drift with impact analysis and runbook update
On a webhook from your warehouse audit log, an agent investigates the changed column, traces which downstream models and dashboards depend on it.
Cross-warehouse replication schema mismatch reconciler
Compares the column shape of mirrored tables between BigQuery and Snowflake and, when a replicated table has drifted out of sync between the two, opens an Asana task for the data…
Run it inside a business
This workflow drops into a full company template. Import the org, and this is one of the playbooks its agents run.

