AI AGENTS
PagerDuty Incident to Auto-Executed Runbook with Honeycomb Verification
On a PagerDuty incident, the agent matches it to a stored runbook and executes safe diagnostic and mitigation shell steps.
How it runs
The automated pipeline, trigger to output.
- Trigger: PagerDuty incident opened (PagerDuty)
- Action: Look up matching runbook
- Logic: Gate steps against auto-safe allowlist
- Action: Execute mitigation via sandboxed shell (Shell)
- Action: Verify recovery in Honeycomb (Honeycomb)
- Output: Post outcome and rollback note to incident (PagerDuty)
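The runbook lookup step can be sketched as a simple match on service plus title keywords. This is a minimal illustration, not the product's actual matching logic: the `Runbook` class, the `match_runbook` function, and the `service` and `title` keys on the incident dict are all hypothetical names chosen for the example, and a real PagerDuty payload is shaped differently.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Runbook:
    """Hypothetical stored runbook: one service, some title keywords, shell steps."""
    name: str
    service: str
    keywords: tuple[str, ...]
    steps: list[str] = field(default_factory=list)


def match_runbook(incident: dict, runbooks: list[Runbook]) -> Runbook | None:
    """Return the runbook for the incident's service with the most keyword hits.

    Returns None when no runbook for that service matches any keyword, so an
    unrecognized incident never triggers automated action.
    """
    service = incident.get("service", "")
    title = incident.get("title", "").lower()
    best, best_score = None, 0
    for runbook in runbooks:
        if runbook.service != service:
            continue
        score = sum(1 for kw in runbook.keywords if kw.lower() in title)
        if score > best_score:
            best, best_score = runbook, score
    return best
```

Returning `None` on a miss is a deliberate choice in this sketch: an incident with no confident match falls through to the human on call instead of running a loosely related runbook.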
What it does
Closes the loop between alert and action. When PagerDuty opens an incident, the agent finds the matching runbook, runs its low-risk mitigation steps via a controlled shell, and then confirms the fix actually worked by re-checking Honeycomb metrics rather than assuming the command succeeded.
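The difference between "the command exited cleanly" and "the service recovered" can be made explicit in a small check. The sketch below assumes the before and after metric values have already been fetched from Honeycomb by some other step; `Snapshot`, `has_recovered`, `verdict`, and the 10% tolerance are hypothetical names and values for illustration only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snapshot:
    """Hypothetical metric snapshot; values are assumed to come from a Honeycomb query."""
    error_rate: float       # fraction of failing requests, 0.0 to 1.0
    p95_latency_ms: float   # 95th percentile latency in milliseconds


def has_recovered(baseline: Snapshot, current: Snapshot, tolerance: float = 0.10) -> bool:
    """True when both metrics are within `tolerance` of the pre-incident baseline.

    Note: with a baseline error rate of exactly 0, any error at all fails the check.
    """
    error_ok = current.error_rate <= baseline.error_rate * (1 + tolerance)
    latency_ok = current.p95_latency_ms <= baseline.p95_latency_ms * (1 + tolerance)
    return error_ok and latency_ok


def verdict(exit_code: int, baseline: Snapshot, current: Snapshot) -> str:
    """Combine the shell exit code with the metric check into one outcome label."""
    if exit_code != 0:
        return "command_failed"
    if not has_recovered(baseline, current):
        return "not_recovered"  # command succeeded, but the metrics did not come back
    return "recovered"
```

The `not_recovered` branch is the case the verification gate exists for: the mitigation ran without error, yet the incident should stay open because the metrics say the problem persists.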
When to use it
Use it for well-understood, recurring incidents that already have a documented runbook (restart a stuck worker, drain a bad node, flush a cache). It removes the 3am toil while keeping a verification gate so a blind fix never masks the real problem.
How it works
1. PagerDuty triggers on a new incident matching a runbook-tagged service.
2. The agent looks up the corresponding runbook and parses its steps.
3. A logic step checks that each step is in the auto-safe allowlist; risky ones are skipped.
4. It executes the approved mitigation commands through a sandboxed shell.
5. It queries Honeycomb to confirm error rate and latency returned to baseline.
6. It posts the outcome and a rollback recommendation back to the PagerDuty incident timeline.
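The allowlist gate in the steps above can be sketched as follows. This is one possible approach under stated assumptions, not the workflow's actual implementation: the `AUTO_SAFE` table, its entries, and the `gate_steps` function are hypothetical, and a real allowlist would be defined by your team.

```python
from __future__ import annotations

import shlex
from dataclasses import dataclass

# Hypothetical allowlist: command name -> subcommands considered safe to auto-run.
AUTO_SAFE = {
    "systemctl": {"restart", "status"},
    "kubectl": {"drain", "get", "describe"},
    "redis-cli": {"FLUSHDB"},
}

# Characters that would let a single step chain, pipe, or redirect commands.
SHELL_META = set(";|&><`$")


@dataclass
class GateResult:
    approved: list[list[str]]  # argv lists that passed the gate
    skipped: list[str]         # original step text that was rejected


def gate_steps(steps: list[str]) -> GateResult:
    """Split runbook steps into approved argv lists and skipped raw strings."""
    approved: list[list[str]] = []
    skipped: list[str] = []
    for step in steps:
        if any(ch in SHELL_META for ch in step):
            skipped.append(step)  # reject anything with shell metacharacters
            continue
        try:
            argv = shlex.split(step)
        except ValueError:
            skipped.append(step)  # unbalanced quotes and similar parse errors
            continue
        if len(argv) >= 2 and argv[1] in AUTO_SAFE.get(argv[0], set()):
            approved.append(argv)
        else:
            skipped.append(step)
    return GateResult(approved, skipped)
```

Because approved steps come back as argv lists, they can be passed to `subprocess.run` without `shell=True`, which avoids re-interpreting the text through a shell. Skipped steps keep their original text so they can be listed in the note posted back to the incident.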
Set it up
What you configure once, before turning it on.
1. Connect PagerDuty: Incidents, on-call, escalations.
2. Connect Honeycomb: Distributed traces and queries.
3. Connect Shell: Run sandboxed commands inside the workspace.
4. Set each agent's model: We leave models unset so you pick the tier, either fast and cheap or top-quality.
5. Tune it to your data: Edit the prompts, filters, and field mappings so it matches how your team works.
6. Test, then turn it on: Run once against a sample, confirm the output, then enable the trigger.
More AI Agents workflows
Custom Metrics Cardinality Spike Pager
A webhook from a Datadog monitor fires when custom-metric cardinality jumps; an agent pinpoints the offending metric and tag and estimates the added cost.
Sentry-to-Confluence Runbook Updater
When a Sentry issue is resolved, the agent finds the matching Confluence runbook page and proposes an inline update with the verified fix.
Stale Doc-PR Chaser for Runbook Gaps
On a daily schedule the agent finds runbook doc PRs that were opened from resolved incidents but never reviewed and summarizes what each one fixes.
Resolved Incident to Public Troubleshooting Doc
For customer-facing errors resolved in Sentry, the agent drafts a sanitized troubleshooting entry and opens a PR to your ReadMe documentation.
On-Call Runbook Gap Closer: Resolved Sentry Issues to Doc PRs
An agent reads each newly resolved Sentry issue, compares the actual fix against your existing runbook, and opens a GitHub PR adding the missing remediation steps.
Weekly On-Call Doc-Gap Digest
Each week the agent reviews every Sentry issue resolved in the last 7 days and ranks the ones whose runbook coverage is missing or thin.
Run it inside a business
This workflow drops into a full company template. Import the org, and this is one of the playbooks its agents run.

