DATA OPS
Reverse-ETL Backfill Investigation Agent
On a manually triggered backfill, an agent reconciles warehouse rows against Salesforce and investigates why specific rows dropped by reading the sync logs.
How it runs
The automated pipeline, trigger to output.
- Trigger: Operator triggers backfill investigation with batch ID
- Action: Query Snowflake source and Salesforce landed rows (Salesforce)
- Action: Pull matching sync error logs from Axiom (Axiom)
- Logic: Cluster drops by probable root cause
- Action: Draft root-cause memo with requeue plan (OpenAI)
- Output: Post memo to Slack for stakeholders (Slack)
What it does
Handles the messy part of a backfill: not just finding which rows dropped, but explaining why. After comparing the Snowflake source set against landed Salesforce records, an agent pulls the relevant sync error logs from Axiom, groups the drops by likely cause (validation rule, missing required field, dedupe merge), and writes a plain-language root-cause memo with a concrete requeue plan.
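The grouping step can be pictured as a small rule-based classifier. The sketch below is an illustration only, not the workflow's actual logic: it assumes each sync log entry is a dict with hypothetical `record_id` and `message` fields, and that the messages contain Salesforce status codes such as `REQUIRED_FIELD_MISSING`. Mapping `DUPLICATES_DETECTED` to the "dedupe merge" cause is also an assumption, and the workflow's agent may rely on a model rather than fixed rules.

```python
from collections import defaultdict

# Hypothetical rules: substring found in a sync error message -> cause label.
# The labels follow the three causes named in the text; the mapping is assumed.
CAUSE_RULES = [
    ("FIELD_CUSTOM_VALIDATION_EXCEPTION", "validation rule"),
    ("REQUIRED_FIELD_MISSING", "missing required field"),
    ("DUPLICATES_DETECTED", "dedupe merge"),
]


def classify(message):
    """Return the first cause whose marker appears in the message."""
    for needle, cause in CAUSE_RULES:
        if needle in message:
            return cause
    return "unknown"


def cluster_drops(log_entries):
    """Group dropped record IDs by probable root cause."""
    clusters = defaultdict(list)
    for entry in log_entries:
        clusters[classify(entry["message"])].append(entry["record_id"])
    return dict(clusters)


# Made-up sample entries, for illustration only.
sample_logs = [
    {"record_id": "a2", "message": "REQUIRED_FIELD_MISSING: [Industry]"},
    {"record_id": "a3", "message": "REQUIRED_FIELD_MISSING: [OwnerId]"},
    {"record_id": "a4", "message": "FIELD_CUSTOM_VALIDATION_EXCEPTION: bad phone"},
    {"record_id": "a6", "message": "connection reset"},
]
clusters = cluster_drops(sample_logs)
```

Entries that match no rule land in an `unknown` cluster, so rows without a recognised cause are not silently dropped from the summary.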
When to use it
Use it after a large historical backfill or a botched sync run, when you need a human-readable explanation of the gaps rather than a raw ID list. It is for the moment a stakeholder asks "why did 4,000 accounts not make it" and you need an answer, not a spreadsheet.
How it works
- 1. An operator triggers the run manually, supplying the backfill batch identifier.
- 2. The agent queries Snowflake for the batch's source rows and Salesforce for what landed.
- 3. It computes the dropped set and pulls matching sync error logs from Axiom.
- 4. It clusters drops by probable root cause and reasons about each cluster.
- 5. It drafts a root-cause memo with per-cluster counts and a requeue recommendation.
- 6. It posts the memo to Slack for the data team and stakeholders.
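As a rough sketch of steps 3 and 5, the dropped set is a set difference between source IDs and landed IDs, and the memo starts from per-cluster counts. Everything here is hypothetical: the function names, the batch ID, and the assumption that both systems share a record ID. The actual memo is drafted by the agent and also carries a requeue recommendation, which this sketch leaves out.

```python
def dropped_ids(source_ids, landed_ids):
    """IDs present in the warehouse batch but missing from Salesforce."""
    return sorted(set(source_ids) - set(landed_ids))


def draft_memo(batch_id, source_ids, landed_ids, clusters):
    """Render a plain-text summary with one line per root-cause cluster."""
    dropped = dropped_ids(source_ids, landed_ids)
    total = len(set(source_ids))
    lines = [f"Backfill {batch_id}: {len(dropped)} of {total} source rows did not land."]

    # Largest clusters first, so the main cause leads the memo.
    for cause, ids in sorted(clusters.items(), key=lambda kv: -len(kv[1])):
        lines.append(f"- {cause}: {len(ids)} row(s)")

    # Dropped rows that no log entry explains are reported separately.
    explained = {i for ids in clusters.values() for i in ids}
    unexplained = [i for i in dropped if i not in explained]
    if unexplained:
        lines.append(f"- no matching log entry: {len(unexplained)} row(s)")
    return "\n".join(lines)


# Made-up IDs, for illustration only.
source = ["a1", "a2", "a3", "a4", "a5", "a6"]
landed = ["a1", "a5"]
example_clusters = {
    "validation rule": ["a4"],
    "missing required field": ["a2", "a3"],
}
memo = draft_memo("batch-001", source, landed, example_clusters)
```

Reporting unexplained rows on their own line keeps the counts honest when the logs do not cover every drop.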
Set it up
What you configure once, before turning it on.
- 1. Connect Snowflake: warehouses, queries, shares.
- 2. Connect Salesforce: accounts, opportunities, cases.
- 3. Connect Axiom: log streams, queries, dashboards.
- 4. Connect Slack: channels, DMs, threads, mentions.
- 5. Connect OpenAI: models, embeddings, files.
- 6. Set each agent's model: we leave models unset so you pick the tier (fast and cheap, or top-quality).
- 7. Tune it to your data: edit the prompts, filters, and field mappings so it matches how your team works.
- 8. Test, then turn it on: run once against a sample, confirm the output, then enable the trigger.
Run it inside a business
This workflow drops into a full company template. Import the org, and this is one of the playbooks its agents run.

