MARKETING

Auto-Correct Malformed UTMs in the BigQuery Click Tracker

Runs nightly over your BigQuery clickstream, finds rows with malformed UTM parameters, and normalizes them against the canonical taxonomy.

Category: Marketing
Engine: sim
Difficulty: intermediate
Trigger: schedule
Steps: 6
Setup: ~15 min

How it runs

The automated pipeline, trigger to output.

  • Trigger: Nightly schedule fires
  • Action: Query BigQuery for rows with non-canonical UTMs (Google BigQuery)
  • Logic: Normalize values via alias and casing ruleset
  • Logic: Branch: confident correction vs. ambiguous
  • Action: Write corrected rows to cleaned attribution table (Google BigQuery)
  • Output: Append per-correction audit log (Google BigQuery)
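As a rough illustration of the query step, the sketch below builds a BigQuery Standard SQL statement that selects rows whose UTM values fall outside an approved list. The table ID, the `event_id` column, the UTM column names, and the approved values are all placeholders chosen for this example, not part of the template. The approved lists are referenced as array query parameters (`@approved_<column>`) and would be supplied when the query is run.

```python
from typing import Dict, List

# Hypothetical names: the table ID, the columns, and the approved values
# below are placeholders for illustration only.
RAW_TABLE = "my-project.marketing.raw_events"

APPROVED: Dict[str, List[str]] = {
    "utm_source": ["facebook", "google", "newsletter"],
    "utm_medium": ["cpc", "email", "social"],
}


def build_detection_query(table: str, approved: Dict[str, List[str]]) -> str:
    """Return SQL that selects rows with at least one non-canonical UTM.

    Each column is compared against an array parameter named
    @approved_<column>, so the approved values are never inlined.
    Rows where a UTM column is NULL are not matched by NOT IN.
    """
    columns = sorted(approved)
    conditions = [f"{col} NOT IN UNNEST(@approved_{col})" for col in columns]
    where_clause = "\n   OR ".join(conditions)
    return (
        f"SELECT event_id, {', '.join(columns)}\n"
        f"FROM `{table}`\n"
        f"WHERE {where_clause}"
    )


query = build_detection_query(RAW_TABLE, APPROVED)
print(query)
```

Because the comparison is exact, values such as "FB" or "facebook " (trailing space) count as non-canonical and are returned for normalization.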

What it does

Fixes the data after the fact where it actually hurts: analytics. On a schedule it scans the raw click/event table in BigQuery, detects UTM values that don't match the approved taxonomy (wrong casing, trailing spaces, known aliases like "fb" for "facebook", deprecated source names), and applies deterministic correction rules. Corrected rows are written to a clean attribution table, and a small audit log records every change.
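A minimal sketch of what a deterministic correction rule could look like, assuming a small alias table and a set of canonical source names. The taxonomy, the alias entries, and the function name `normalize_source` are invented for this example; the real ruleset is whatever your team configures.

```python
from typing import Optional

# Hypothetical taxonomy and alias table, for illustration only.
CANONICAL_SOURCES = {"facebook", "google", "newsletter"}
SOURCE_ALIASES = {"fb": "facebook", "adwords": "google"}


def normalize_source(raw: Optional[str]) -> Optional[str]:
    """Return the canonical source, or None if no rule applies.

    Rules run in a fixed order: trim whitespace, lowercase,
    resolve known aliases, then check the canonical set.
    """
    if raw is None:
        return None
    value = raw.strip().lower()
    value = SOURCE_ALIASES.get(value, value)
    return value if value in CANONICAL_SOURCES else None


print(normalize_source(" FB "))     # trailing space + casing + alias
print(normalize_source("Google"))   # casing only
print(normalize_source("tiktok??")) # no rule applies
```

Returning `None` rather than guessing keeps the rules deterministic: a value is either mapped by an explicit rule or left for a human to look at.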

When to use it

Use it when historical and incoming clickstream data is fragmenting your channel reports because of inconsistent tagging you can't fix at the source. Best for teams whose marketing data already lands in BigQuery and who need a reliable normalization layer feeding dashboards.

How it works

  1. A nightly schedule fires the workflow.
  2. It queries BigQuery for rows in the raw events table whose UTM values fall outside the approved set.
  3. Logic maps each malformed value through a normalization ruleset (alias table, casing, whitelist) to its canonical form.
  4. Rows that can be confidently corrected are written to the cleaned attribution table; ambiguous ones are flagged for review.
  5. An audit row per correction is appended to BigQuery so the cleanup is fully traceable.
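Steps 4 and 5 can be sketched as a single pass that splits rows into a cleaned set and a review set, and records one audit entry per changed value. This is an illustrative outline only: the `AuditRecord` fields, the `event_id` key, and the tiny lookup used in the demo are assumptions, and writing the results back to BigQuery is left out.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple

Row = Dict[str, str]


@dataclass(frozen=True)
class AuditRecord:
    """One entry per corrected value (hypothetical schema)."""
    event_id: str
    field: str
    old_value: str
    new_value: str


def split_rows(
    rows: List[Row],
    field: str,
    normalize: Callable[[str], Optional[str]],
) -> Tuple[List[Row], List[Row], List[AuditRecord]]:
    """Branch rows into (cleaned, needs_review) and collect audit entries."""
    cleaned: List[Row] = []
    review: List[Row] = []
    audit: List[AuditRecord] = []
    for row in rows:
        old = row[field]
        new = normalize(old)
        if new is None:
            # No rule applies: flag for review, leave the value untouched.
            review.append(row)
            continue
        cleaned.append({**row, field: new})
        if new != old:
            audit.append(AuditRecord(row["event_id"], field, old, new))
    return cleaned, review, audit


# Demo with a stand-in normalizer: trim, lowercase, then look up.
lookup = {"fb": "facebook", "facebook": "facebook"}
rows = [
    {"event_id": "e1", "utm_source": "FB "},
    {"event_id": "e2", "utm_source": "facebook"},
    {"event_id": "e3", "utm_source": "frnd-ref"},
]
cleaned, review, audit = split_rows(
    rows, "utm_source", lambda v: lookup.get(v.strip().lower())
)
print(len(cleaned), len(review), len(audit))
```

Only rows whose value actually changed produce an audit entry, so the log stays proportional to the number of corrections rather than the number of rows scanned.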

Set it up

What you configure once, before turning it on.

  1. Connect BigQuery: datasets, queries, schemas.
  2. Set each agent's model: we leave models unset so you pick the tier, either fast and cheap or top-quality.
  3. Tune it to your data: edit the prompts, filters, and field mappings so it matches how your team works.
  4. Test, then turn it on: run once against a sample, confirm the output, then enable the trigger.

Run this workflow in your colony.

14-day trial. No DevOps. No Sales call. Provisioned in under a minute.