Auto-detect PII columns from a Postgres data catalog before BigQuery export

Looks up which BigQuery columns are classified as PII in a Postgres governance catalog, masks exactly those columns, and delivers the result to Dropbox.

Category: Data Ops
Engine: sim
Difficulty: advanced
Trigger: webhook
Steps: 6
Setup: ~25 min

How it runs

The automated pipeline, trigger to output.

  • Trigger: Export request received via webhook (HTTP webhook)
  • Action: Run export query in BigQuery (Google BigQuery)
  • Action: Look up column PII classifications in Postgres (PostgreSQL)
  • Logic: Mask columns flagged sensitive by catalog
  • Action: Upload masked file to Dropbox
  • Output: Write audit record to Postgres (PostgreSQL)

What it does

Instead of hardcoding which fields are sensitive, this workflow reads your column-level classifications from a Postgres governance catalog. It pulls the export from BigQuery, cross-references every returned column against the catalog, and masks only the columns marked sensitive. The masked file is delivered to Dropbox, and an audit log row is written back to Postgres.
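
The cross-reference-and-mask step can be sketched in a few lines of Python. This is a minimal illustration, not the workflow's actual implementation: the function name `mask_rows`, the `"pii"` classification label, and the `"***"` mask token are all assumptions made for the example.

```python
from typing import Any

MASK_TOKEN = "***"  # assumed placeholder; the real workflow may redact differently


def mask_rows(
    rows: list[dict[str, Any]],
    classifications: dict[str, str],
    sensitive_labels: frozenset[str] = frozenset({"pii"}),
) -> tuple[list[dict[str, Any]], list[str]]:
    """Return masked copies of rows plus the sorted list of masked columns."""
    # Columns actually present in the export.
    returned = {col for row in rows for col in row}
    # Keep only the columns the catalog labels as sensitive.
    masked_cols = sorted(
        col for col in returned if classifications.get(col) in sensitive_labels
    )
    # Build new row dicts so the original rows are left untouched.
    masked = [
        {col: (MASK_TOKEN if col in masked_cols else val) for col, val in row.items()}
        for row in rows
    ]
    return masked, masked_cols


rows = [
    {"id": 1, "email": "a@example.com", "country": "DE"},
    {"id": 2, "email": "b@example.com", "country": "FR"},
]
catalog = {"email": "pii", "country": "public"}  # hypothetical catalog contents
masked, masked_cols = mask_rows(rows, catalog)
print(masked_cols)
```

In this sketch a column that is missing from the catalog (here `id`) is left unmasked. A stricter policy would treat unclassified columns as sensitive by default; which behavior is right depends on how complete your catalog is.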

When to use it

Use it when your organization maintains a central data catalog of PII classifications and you want exports to enforce that catalog automatically, so masking stays correct even when table schemas change.

How it works

  1. A form webhook receives the export request with target table and filters.
  2. BigQuery runs the query and returns rows plus the column list.
  3. Postgres is queried for the PII classification of each returned column.
  4. A masking step redacts the columns the catalog flags as sensitive.
  5. The masked file is uploaded to Dropbox.
  6. An audit record (requester, columns masked, row count, timestamp) is inserted into Postgres for compliance.
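
Steps 3 and 6 both touch Postgres: one reads classifications, the other writes the audit row. The sketch below shows the rough shape of those two queries. It uses Python's built-in `sqlite3` as a stand-in so it runs without a server. The table names `column_classification` and `export_audit`, their columns, the `"pii"` label, and the helper names are all hypothetical; against a real Postgres catalog you would use a Postgres driver with its own placeholder style and your own schema.

```python
import json
import sqlite3
from datetime import datetime, timezone

# In-memory SQLite stands in for the Postgres catalog (assumption for runnability).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE column_classification (
        table_name TEXT, column_name TEXT, classification TEXT
    );
    CREATE TABLE export_audit (
        requester TEXT, columns_masked TEXT, row_count INTEGER, exported_at TEXT
    );
    INSERT INTO column_classification VALUES
        ('customers', 'email', 'pii'),
        ('customers', 'country', 'public');
    """
)


def lookup_classifications(conn, table, columns):
    """Step 3: fetch the classification of each returned column."""
    if not columns:
        return {}
    # One bound parameter per column; values are never interpolated into the SQL.
    placeholders = ", ".join("?" for _ in columns)
    cur = conn.execute(
        "SELECT column_name, classification FROM column_classification "
        f"WHERE table_name = ? AND column_name IN ({placeholders})",
        [table, *columns],
    )
    return dict(cur.fetchall())


def write_audit(conn, requester, columns_masked, row_count):
    """Step 6: record requester, masked columns, row count, and timestamp."""
    conn.execute(
        "INSERT INTO export_audit VALUES (?, ?, ?, ?)",
        (
            requester,
            json.dumps(columns_masked),
            row_count,
            datetime.now(timezone.utc).isoformat(),
        ),
    )
    conn.commit()


classes = lookup_classifications(conn, "customers", ["id", "email", "country"])
masked_cols = sorted(c for c, label in classes.items() if label == "pii")
write_audit(conn, "analyst@example.com", masked_cols, row_count=2)
print(classes)
print(masked_cols)
```

Because the lookup is keyed on the columns the query actually returned, a newly added column is picked up as soon as it is classified in the catalog, with no change to the export itself.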

Set it up

What you configure once, before turning it on.

  1. Connect BigQuery: Datasets, queries, schemas.
  2. Connect Postgres: Any Postgres URL (query, write, migrate).
  3. Connect Dropbox: Files and folders.
  4. Connect HTTP webhook: Trigger any URL on agent actions.
  5. Set each agent's model: We leave models unset so you pick the tier (fast and cheap, or top-quality).
  6. Tune it to your data: Edit the prompts, filters, and field mappings so it matches how your team works.
  7. Test, then turn it on: Run once against a sample, confirm the output, then enable the trigger.
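
One way to confirm the output of a sample run is to compare the masked file against the unmasked sample and flag any sensitive value that survived. The helper below is a hypothetical check written for this purpose, not part of the workflow; the name `find_leaks` and the row format (a list of dicts) are assumptions.

```python
def find_leaks(original_rows, masked_rows, sensitive_columns):
    """Return (row_index, column) pairs where a sensitive value survived masking."""
    leaks = []
    for i, (orig, masked) in enumerate(zip(original_rows, masked_rows)):
        for col in sensitive_columns:
            # A leak is a non-null original value that appears unchanged in the output.
            if col in orig and orig[col] is not None and masked.get(col) == orig[col]:
                leaks.append((i, col))
    return leaks


sample = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": "b@example.com"},
]
# Simulated output in which the second row was not masked.
output = [
    {"id": 1, "email": "***"},
    {"id": 2, "email": "b@example.com"},
]
leaks = find_leaks(sample, output, ["email"])
print(leaks)
```

An empty list means no sensitive value from the sample appears unchanged in the output; anything else is worth investigating before the trigger is enabled.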
