AI AGENTS

A/B Experiment Reader: Ship/Kill/Iterate from BigQuery

When an experiment hits its planned end date, an agent pulls the results from BigQuery, evaluates statistical significance and lift, checks guardrail metrics, and posts a ship, kill, or iterate verdict to Slack.

Category: AI Agents
Engine: sim
Difficulty: intermediate
Trigger: schedule
Steps: 5
Setup: ~15 min

How it runs

The automated pipeline, trigger to output.

  • Trigger: Experiment end date reached (schedule)
  • Action: Query results table in BigQuery
  • Logic: Check significance, lift, and guardrails
  • Logic: Classify ship / kill / iterate
  • Output: Post verdict + rationale to Slack
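
As a rough illustration of the output step, the sketch below assembles a verdict message as a simple JSON body with a `text` field, the shape Slack incoming webhooks accept. The function name `build_slack_payload`, the experiment name, and the metric values are all hypothetical placeholders, not taken from the workflow itself.

```python
import json


def build_slack_payload(experiment, verdict, metrics, rationale):
    """Build a JSON-serialisable message body with a single `text` field."""
    # Headline: experiment name in bold (Slack mrkdwn) plus the verdict.
    lines = [f"*{experiment}*: {verdict.upper()}"]
    # One line per decisive metric so readers can see the numbers used.
    for name, value in metrics.items():
        lines.append(f"- {name}: {value}")
    # Plain-English rationale goes last.
    lines.append(rationale)
    return {"text": "\n".join(lines)}


# Illustrative values only; real numbers would come from the BigQuery step.
payload = build_slack_payload(
    "checkout_button_v2",
    "ship",
    {"relative lift": "+4.2%", "p-value": 0.012},
    "Lift is significant and no guardrail regressed.",
)
print(json.dumps(payload, indent=2))
```

Keeping the message builder separate from the code that sends it makes the "run once against a sample" check easy: you can inspect the payload without posting anything to a channel.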

What it does

Reads a completed experiment's metrics straight from your BigQuery results table, judges whether the winning variant cleared significance and minimum-lift thresholds without tripping any guardrail metric, and delivers a one-word verdict — ship, kill, or iterate — backed by the numbers it used.
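
The workflow expects the results query to return a p-value. If your results table stores only conversion counts and sample sizes, one common way to derive lift and significance is a pooled two-proportion z-test. The sketch below is an assumption about how that could be done, not necessarily the method this workflow uses; the function name and the example counts are hypothetical.

```python
import math


def two_proportion_test(conv_a, n_a, conv_b, n_b):
    """Return (relative_lift, two_sided_p_value) for variant B versus control A."""
    rate_a = conv_a / n_a
    rate_b = conv_b / n_b
    # Pooled conversion rate under the null hypothesis of no difference.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_b - rate_a) / std_err
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    relative_lift = (rate_b - rate_a) / rate_a
    return relative_lift, p_value


# Illustrative counts: 10.0% control conversion versus 11.0% for the variant.
lift, p_value = two_proportion_test(1000, 10000, 1100, 10000)
print(f"relative lift: {lift:.1%}, p-value: {p_value:.4f}")
```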

When to use it

Use this when experiments run on a fixed schedule and you want a consistent, rule-based first read before the team debates. It replaces the manual "someone exports the dashboard and eyeballs p-values" ritual.

How it works

  1. A scheduled trigger fires on the experiment's configured end date.
  2. A BigQuery action runs the results query, returning per-variant conversion, sample size, p-value, and guardrail deltas.
  3. A logic step checks significance (p < 0.05), minimum lift, and that no guardrail regressed beyond tolerance.
  4. The agent classifies the outcome as ship (significant positive lift that clears the minimum with no guardrail regression), kill (significantly negative, or flat), or iterate (underpowered or mixed).
  5. A Slack message posts the verdict, the decisive metrics, and a plain-English rationale to the experiment channel.
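
The decision rules in steps 3 and 4 can be sketched as a small function. This is a minimal illustration under stated assumptions, not the agent's actual logic: the `VariantResult` fields, the `min_lift` and `guardrail_tolerance` defaults, the `adequately_powered` flag (used to tell "flat" from "underpowered"), and the convention that a negative guardrail delta means a regression are all hypothetical. Only the p < 0.05 threshold comes from the steps above.

```python
from dataclasses import dataclass, field


@dataclass
class VariantResult:
    relative_lift: float       # e.g. 0.04 means +4% versus control
    p_value: float
    adequately_powered: bool   # assumed flag: planned sample size was reached
    # Guardrail metric -> relative change; negative is assumed to mean worse.
    guardrail_deltas: dict = field(default_factory=dict)


def classify(result, alpha=0.05, min_lift=0.02, guardrail_tolerance=-0.01):
    """Map one variant's results to 'ship', 'kill', or 'iterate'."""
    significant = result.p_value < alpha
    breached = [
        name for name, delta in result.guardrail_deltas.items()
        if delta < guardrail_tolerance
    ]
    if significant:
        if result.relative_lift < 0:
            return "kill"      # significantly negative
        if result.relative_lift >= min_lift and not breached:
            return "ship"      # clears every threshold
        return "iterate"       # mixed: small lift or a guardrail breach
    # Not significant: flat if the test had enough data, else underpowered.
    return "kill" if result.adequately_powered else "iterate"


print(classify(VariantResult(0.05, 0.01, True, {"latency": 0.0})))
```

Returning the list of breached guardrails alongside the verdict would be a natural extension, since the Slack rationale needs to name the decisive metrics.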

Set it up

What you configure once, before turning it on.

  1. Connect BigQuery: Datasets, queries, schemas.
  2. Connect Slack: Channels, DMs, threads, mentions.
  3. Set each agent's model: We leave models unset so you pick the tier, fast + cheap or top-quality.
  4. Tune it to your data: Edit the prompts, filters, and field mappings so it matches how your team works.
  5. Test, then turn it on: Run once against a sample, confirm the output, then enable the trigger.
