DEVOPS
Auto-rollback any live Worker version on an Axiom error spike
Continuously watches Axiom for a sudden error-rate spike on the live Worker version and, when one is confirmed against a rolling baseline, instantly reverts the traffic split…
How it runs
The automated pipeline, trigger to output.
- Trigger: Schedule, every 2 min
- Action: Query Axiom for live rate vs. baseline (Axiom)
- Logic: Confirm spike over two consecutive checks
- Action: Revert split to last-good version (Cloudflare)
- Output: Page team in Discord with details (Discord)
What it does
Acts as a safety net for whatever Worker version is currently live, canary or not. Every 2 minutes it pulls the live version's error rate from Axiom and compares it to a rolling baseline. A confirmed spike triggers an immediate revert of the Cloudflare traffic split to the recorded last-good version, so a bad deploy or upstream failure is rolled back within seconds of confirmation, with no human in the loop.
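As a rough illustration of the revert step, the sketch below builds (but does not send) a request that routes 100% of traffic to a single version. It assumes the revert goes through Cloudflare's Workers deployments endpoint with a percentage strategy; check the endpoint path and body shape against the current Cloudflare API reference. The function name `build_revert_request` and the values `ACCOUNT_ID`, `my-worker`, `VERSION_ID`, and `API_TOKEN` are hypothetical placeholders, not part of this workflow.

```python
import json
import urllib.request

# Assumed base URL for Cloudflare's v4 REST API.
API_BASE = "https://api.cloudflare.com/client/v4"


def build_revert_request(account_id, script_name, last_good_version_id, api_token):
    """Build a POST request that sends 100% of traffic to one Worker version.

    Hypothetical helper: the endpoint and payload reflect an assumed shape of
    the Workers deployments API and should be verified before use.
    """
    url = f"{API_BASE}/accounts/{account_id}/workers/scripts/{script_name}/deployments"
    body = {
        "strategy": "percentage",
        "versions": [{"version_id": last_good_version_id, "percentage": 100}],
    }
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Placeholder values only; nothing is sent over the network here.
req = build_revert_request("ACCOUNT_ID", "my-worker", "VERSION_ID", "API_TOKEN")
# Passing `req` to urllib.request.urlopen would actually perform the revert.
```

Building the request separately from sending it makes it easy to dry-run the revert against a sample before enabling the trigger.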
When to use it
Use it as always-on protection independent of any specific rollout. It catches regressions that slip past a canary gate, plus error storms caused by config drift or a failing dependency rather than the code itself.
How it works
- 1. A schedule fires every 2 minutes.
- 2. Axiom returns the live version's current error rate and the trailing-hour baseline.
- 3. A logic step flags a spike only if the rate exceeds the baseline by a configured multiplier on two consecutive checks, suppressing single-blip noise.
- 4. On a confirmed spike, Cloudflare reverts the split to the last-good version ID.
- 5. The team is paged in Discord with the before/after error rates and the reverted version.
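The confirmation logic in step 3 and the page in step 5 can be sketched as follows. This is a minimal illustration, not the workflow's actual implementation: the class `SpikeDetector`, the default multiplier of 3.0, the `min_baseline` floor, and the message wording are all assumptions. The "before/after" rates are interpreted here as the baseline and the spiking live rate. The payload uses the `content` field that Discord webhooks accept for plain-text messages.

```python
from dataclasses import dataclass, field


@dataclass
class SpikeDetector:
    """Confirms a spike only after N consecutive over-threshold checks."""

    multiplier: float = 3.0       # assumed: live rate must exceed baseline x this
    required_checks: int = 2      # two consecutive checks, per the steps above
    min_baseline: float = 0.001   # assumed floor so a near-zero baseline is not noisy
    _streak: int = field(default=0, init=False, repr=False)

    def observe(self, live_rate: float, baseline_rate: float) -> bool:
        """Record one check; return True once the spike is confirmed."""
        threshold = max(baseline_rate, self.min_baseline) * self.multiplier
        if live_rate > threshold:
            self._streak += 1
        else:
            self._streak = 0  # a single healthy check resets the count
        return self._streak >= self.required_checks


def discord_payload(baseline_rate: float, spike_rate: float, reverted_to: str) -> dict:
    """Build the JSON body for a Discord webhook message (hypothetical wording)."""
    return {
        "content": (
            f"Auto-rollback triggered: error rate {spike_rate:.2%} "
            f"vs. baseline {baseline_rate:.2%}. "
            f"Traffic reverted to version {reverted_to}."
        )
    }


detector = SpikeDetector()
checks = [(0.10, 0.01), (0.02, 0.01), (0.10, 0.01), (0.12, 0.01)]
results = [detector.observe(live, base) for live, base in checks]
# results == [False, False, False, True]: the blip at check 1 is suppressed
# because check 2 is healthy; checks 3 and 4 together confirm the spike.
message = discord_payload(0.01, 0.12, "VERSION_ID")
```

Because the schedule fires as separate runs, a real deployment would need to persist the streak counter between runs (for example in a key-value store) rather than keep it in memory as this sketch does.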
Set it up
What you configure once, before turning it on.
- 1. Connect Axiom: log streams, queries, dashboards.
- 2. Connect Cloudflare: Workers, Pages, R2, KV (the edge stack).
- 3. Connect Discord: community channels, voice, and bots.
- 4. Set each agent's model: We leave models unset so you pick the tier, fast and cheap or top-quality.
- 5. Tune it to your data: Edit the prompts, filters, and field mappings so it matches how your team works.
- 6. Test, then turn it on: Run once against a sample, confirm the output, then enable the trigger.
More DevOps workflows
Slack-approved pause for idle Hugging Face Spaces
On a daily scan it finds idle paid Spaces and posts an interactive Slack approval; on approve it pauses the Space and logs the decision to a GitHub issue audit trail.
Block costly Hugging Face Space hardware upgrades in PR review
When a pull request changes a Space's hardware config, it estimates the new monthly cost and posts a GitHub PR comment that flags upgrades crossing a budget ceiling.
Hugging Face Spaces idle-runtime sweep with auto-pause
On a schedule, scans all Hugging Face Spaces for ones running idle past a threshold, pauses them to stop billing, and posts a Slack summary with the estimated monthly savings.
Open a Zoom war-room from a Datadog multi-alert storm
When a Datadog monitor crosses a critical threshold, this workflow dedupes against active incidents, and only for a genuinely new outage it creates a Zoom bridge.
Auto-spin a Zoom war-room when PagerDuty hits SEV-1
When a PagerDuty incident escalates to a critical severity, this workflow creates a dedicated Zoom meeting and posts the bridge link to the incident's Slack channel so responders…
Spin up a war-room on demand from a Slack slash command
When an engineer runs a Slack command, this workflow creates a Zoom bridge, opens a Sentry-linked tracking incident, and files a Linear issue for follow-up.
Run it inside a business
This workflow drops into a full company template. Import the org, and this is one of the playbooks its agents run.

