AI AGENTS
A/B Experiment Reader: Ship/Kill/Iterate from BigQuery
When an experiment hits its planned end date, an agent pulls the results from BigQuery, evaluates statistical significance and lift against guardrails, and posts a ship, kill, or iterate verdict to Slack.
How it runs
The automated pipeline, trigger to output.
- Trigger: Experiment end date reached (schedule)
- Action: Query results table in BigQuery
- Logic: Check significance, lift, and guardrails
- Logic: Classify ship / kill / iterate
- Output: Post verdict + rationale to Slack
What it does
Reads a completed experiment's metrics straight from your BigQuery results table, judges whether the winning variant cleared significance and minimum-lift thresholds without tripping any guardrail metric, and delivers a one-word verdict — ship, kill, or iterate — backed by the numbers it used.
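The workflow reads the p-value and lift from the results table rather than computing them. For readers who want to sanity-check those columns, here is a minimal sketch of one common way to derive them, a pooled two-proportion z-test on conversion counts. The function names and the choice of test are assumptions for illustration; your results query may use a different method.

```python
import math


def conversion_rate(conversions: int, visitors: int) -> float:
    """Conversions divided by visitors, with a guard against empty variants."""
    if visitors <= 0:
        raise ValueError("visitors must be positive")
    return conversions / visitors


def two_sided_p_value(conv_c: int, n_c: int, conv_t: int, n_t: int) -> float:
    """Two-sided p-value from a pooled two-proportion z-test (assumed method)."""
    rate_c = conversion_rate(conv_c, n_c)
    rate_t = conversion_rate(conv_t, n_t)
    pooled = (conv_c + conv_t) / (n_c + n_t)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_c + 1 / n_t))
    if std_err == 0:
        # Both variants converted 0% or 100%: no evidence of a difference.
        return 1.0
    z = (rate_t - rate_c) / std_err
    # For a standard normal, P(|Z| >= |z|) equals erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2))


def relative_lift(conv_c: int, n_c: int, conv_t: int, n_t: int) -> float:
    """Treatment rate relative to control, e.g. 0.05 means +5%."""
    rate_c = conversion_rate(conv_c, n_c)
    if rate_c == 0:
        raise ValueError("control has no conversions; relative lift is undefined")
    return (conversion_rate(conv_t, n_t) - rate_c) / rate_c
```

With a control of 100 conversions out of 1,000 and a treatment of 150 out of 1,000, `two_sided_p_value(100, 1000, 150, 1000)` falls well below 0.05 and `relative_lift` comes out at roughly 0.5, that is, a 50% relative improvement.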
When to use it
Use this when experiments run on a fixed schedule and you want a consistent, rule-based first read before the team debates. It replaces the manual "someone exports the dashboard and eyeballs p-values" ritual.
How it works
1. A scheduled trigger fires on the experiment's configured end date.
2. A BigQuery action runs the results query, returning per-variant conversion, sample size, p-value, and guardrail deltas.
3. A logic step checks significance (p < 0.05), minimum lift, and that no guardrail regressed beyond tolerance.
4. The agent classifies the outcome as ship (significant positive lift), kill (significant negative or flat), or iterate (underpowered or mixed).
5. A Slack message posts the verdict, the decisive metrics, and a plain-English rationale to the experiment channel.
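The checks and classification in steps 3 and 4 can be sketched as a small rule set. Only the p < 0.05 cutoff comes from the description above. The `VariantResult` fields, the `MIN_LIFT`, `GUARDRAIL_TOLERANCE`, and `MIN_SAMPLE` values, and the reading of "flat", "underpowered", and "mixed" are assumptions you would replace with your own definitions.

```python
from dataclasses import dataclass, field

ALPHA = 0.05                # significance cutoff stated in step 3
MIN_LIFT = 0.02             # hypothetical: minimum relative lift to ship
GUARDRAIL_TOLERANCE = 0.01  # hypothetical: allowed relative regression
MIN_SAMPLE = 1000           # hypothetical: below this, treat as underpowered


@dataclass
class VariantResult:
    """One row of the results query (field names are illustrative)."""
    p_value: float
    relative_lift: float  # treatment vs. control, 0.04 means +4%
    sample_size: int
    # Guardrail metric -> relative change; negative is assumed to mean worse.
    guardrail_deltas: dict = field(default_factory=dict)


def classify(result: VariantResult) -> tuple:
    """Return (verdict, reason) where verdict is ship, kill, or iterate."""
    tripped = sorted(
        name for name, delta in result.guardrail_deltas.items()
        if delta < -GUARDRAIL_TOLERANCE
    )
    if result.sample_size < MIN_SAMPLE:
        return "iterate", "underpowered: sample size below the minimum"
    if result.p_value >= ALPHA:
        return "kill", "flat: no significant difference at adequate sample size"
    if result.relative_lift < 0:
        return "kill", "significant negative lift"
    if tripped:
        return "iterate", "mixed: guardrail regressed: " + ", ".join(tripped)
    if result.relative_lift < MIN_LIFT:
        return "iterate", "mixed: significant, but lift is below the minimum"
    return "ship", "significant positive lift with guardrails intact"
```

For example, `classify(VariantResult(0.01, 0.05, 5000, {"latency": -0.002}))` returns a ship verdict, while the same result with a latency delta of -0.05 returns iterate because the guardrail regressed past the assumed tolerance. Checking sample size first keeps an underpowered test from being killed as "flat".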
Set it up
What you configure once, before turning it on.
1. Connect BigQuery: Datasets, queries, schemas.
2. Connect Slack: Channels, DMs, threads, mentions.
3. Set each agent's model: We leave models unset so you pick the tier (fast + cheap, or top-quality).
4. Tune it to your data: Edit the prompts, filters, and field mappings so it matches how your team works.
5. Test, then turn it on: Run once against a sample, confirm the output, then enable the trigger.
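For the test run in the last setup step, it can help to render the Slack message locally before anything is posted. The sketch below builds a JSON body with a single `text` field, which is the basic shape Slack incoming webhooks accept. The function name, the sample experiment name, and the message layout are hypothetical, and nothing here sends a request.

```python
import json


def build_slack_payload(experiment: str, verdict: str, metrics: dict, rationale: str) -> dict:
    """Assemble a Slack message body: verdict first, then metrics, then rationale."""
    lines = [f"*{verdict.upper()}*: {experiment}"]
    for name, value in metrics.items():
        lines.append(f"- {name}: {value}")
    lines.append(rationale)
    return {"text": "\n".join(lines)}


if __name__ == "__main__":
    # Dry run against made-up sample values; inspect the output by eye.
    sample = build_slack_payload(
        experiment="checkout_button_v2",  # hypothetical experiment name
        verdict="ship",
        metrics={"p_value": 0.01, "relative_lift": 0.05},
        rationale="Significant positive lift and no guardrail regressed beyond tolerance.",
    )
    print(json.dumps(sample, indent=2))
```

Once the rendered message matches what the team expects to see in the experiment channel, enabling the trigger is the only remaining step.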
Run it inside a business
This workflow drops into a full company template. Import the org, and this is one of the playbooks its agents run.

