AI AGENTS
On-call agent: PagerDuty incident diagnosis and dry-run plan
On a new PagerDuty incident, an agent runs read-only shell diagnostics, builds a step-by-step remediation plan, and posts it for the responder to approve.
How it runs
The automated pipeline, trigger to output.
- Trigger: PagerDuty incident created (PagerDuty)
- Action: Run read-only shell diagnostics from allowlist (Shell)
- Logic: Correlate findings, rank likely causes
- Action: Build numbered remediation plan with dry-run
- Output: Post plan to PagerDuty notes and Slack (PagerDuty)
What it does
When PagerDuty pages, the agent does the first 10 minutes of triage for you. It runs only safe, read-only shell checks, correlates them against the linked runbook, and produces a numbered remediation plan with a clear dry-run of each step.
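As a rough illustration of the correlation step, here is a minimal sketch that scores diagnostic output against runbook entries. The `Cause` type, the `RUNBOOK` entries, and the substring-count scoring are all assumptions made for this example; the page does not describe how the agent actually ranks causes.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Cause:
    name: str
    signals: tuple[str, ...]  # lowercase substrings that point to this cause


# Hypothetical runbook entries, for illustration only.
RUNBOOK = (
    Cause("disk full", ("no space left on device", "100%")),
    Cause("out of memory", ("oom-killer", "out of memory")),
    Cause("dependency down", ("connection refused", "timed out")),
)


def rank_causes(diagnostics: dict[str, str], runbook=RUNBOOK) -> list[tuple[str, int]]:
    """Return (cause, score) pairs, highest score first, dropping zero scores.

    The score is simply how many of a cause's signals appear anywhere in the
    combined diagnostic output.
    """
    text = "\n".join(diagnostics.values()).lower()
    scored = [(c.name, sum(sig in text for sig in c.signals)) for c in runbook]
    matched = [item for item in scored if item[1] > 0]
    return sorted(matched, key=lambda item: item[1], reverse=True)
```

Passing a mapping of command to captured output returns the causes with at least one matching signal, strongest match first.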
When to use it
Use it for noisy services where responders waste time gathering the same diagnostics on every page. The agent never mutates state: it proposes, you decide.
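One way to enforce the "never mutates state" guarantee is to check every command against an allowlist before it runs. The sketch below is an assumption about how such a guard could look, not the product's implementation; the `READ_ONLY_ALLOWLIST` contents and the `is_allowed` helper are hypothetical.

```python
import shlex

# Hypothetical allowlist: command name -> permitted subcommands.
# None means any arguments are accepted for that command.
READ_ONLY_ALLOWLIST = {
    "df": None,
    "ps": None,
    "tail": None,
    "uptime": None,
    "systemctl": {"status", "is-active"},
}

# Shell metacharacters that could chain, pipe, or redirect into a mutation.
FORBIDDEN_CHARS = set(";|&><`$")


def is_allowed(command: str) -> bool:
    """Return True only if the command matches the read-only allowlist."""
    if any(ch in FORBIDDEN_CHARS for ch in command):
        return False
    try:
        argv = shlex.split(command)
    except ValueError:  # for example, an unbalanced quote
        return False
    if not argv or argv[0] not in READ_ONLY_ALLOWLIST:
        return False
    allowed_subcommands = READ_ONLY_ALLOWLIST[argv[0]]
    if allowed_subcommands is None:
        return True
    return len(argv) > 1 and argv[1] in allowed_subcommands
```

A guard like this works best when the approved command is then executed as an argument list without a shell (for example `subprocess.run(argv, shell=False)`), so nothing is reinterpreted after the check.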
How it works
1. A PagerDuty incident webhook delivers the alert, service, and severity.
2. The agent selects the matching runbook and runs a fixed allowlist of read-only shell commands (logs, disk, process, health endpoints).
3. It summarizes findings and ranks likely causes.
4. It assembles a numbered remediation plan, showing for each step the exact command that would run.
5. The plan and dry-run are posted to the incident's PagerDuty notes and the responder Slack channel.
6. The responder approves or edits the plan before any mutating action is taken elsewhere.
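The plan-building and posting steps can be pictured as rendering a list of steps into text, with each command shown but never executed. This is a minimal sketch under stated assumptions: the `Step` type, the `render_plan` function, and the output wording are invented for illustration and are not taken from the product.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    description: str
    command: str           # the exact command that would run; never executed here
    mutating: bool = True  # mutating steps wait for responder approval


def render_plan(incident_title: str, steps: list[Step]) -> str:
    """Render a numbered dry-run plan as plain text for a note or message."""
    lines = [
        f"Remediation plan for: {incident_title}",
        "DRY RUN: nothing below has been executed.",
    ]
    for number, step in enumerate(steps, start=1):
        tag = "mutating, needs approval" if step.mutating else "read-only"
        lines.append(f"{number}. {step.description} [{tag}]")
        lines.append(f"   would run: {step.command}")
    return "\n".join(lines)


if __name__ == "__main__":
    print(render_plan("API 5xx spike", [
        Step("Check service status", "systemctl status api", mutating=False),
        Step("Restart the API service", "systemctl restart api"),
    ]))
```

Keeping the renderer free of any execution logic means the same text can go to both the incident notes and the Slack channel, and the responder sees exactly what approval would trigger.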
Set it up
What you configure once, before turning it on.
1. Connect PagerDuty: incidents, on-call, escalations.
2. Connect Shell: run sandboxed commands inside the workspace.
3. Connect Slack: channels, DMs, threads, mentions.
4. Set each agent's model: we leave models unset so you pick the tier, either fast and cheap or top-quality.
5. Tune it to your data: edit the prompts, filters, and field mappings so it matches how your team works.
6. Test, then turn it on: run once against a sample, confirm the output, then enable the trigger.
More AI Agents workflows
Stale Doc-PR Chaser for Runbook Gaps
On a daily schedule the agent finds runbook doc PRs that were opened from resolved incidents but never reviewed, summarizes what each one fixes.
On-Call Runbook Gap Closer: Resolved Sentry Issues to Doc PRs
An agent reads each newly resolved Sentry issue, compares the actual fix against your existing runbook, and opens a GitHub PR adding the missing remediation steps.
Datadog Bill Spike Attribution Agent
When a daily Datadog cost check detects a spend jump, an agent attributes the increase to the specific services and metric types driving it and posts a ranked breakdown to Slack.
Sentry-to-Confluence Runbook Updater
When a Sentry issue is resolved, the agent finds the matching Confluence runbook page and proposes an inline update with the verified fix.
Custom Metrics Cardinality Spike Pager
A webhook from a Datadog monitor fires when custom-metric cardinality jumps; an agent pinpoints the offending metric and tag, estimates the added cost.
Resolved Incident to Public Troubleshooting Doc
For customer-facing errors resolved in Sentry, the agent drafts a sanitized troubleshooting entry and opens a PR to your ReadMe documentation.
Run it inside a business
This workflow drops into a full company template. Import the org, and this is one of the playbooks its agents run.

