AI AGENTS
Datadog Monitor Triage with Dry-Run First
When a Datadog monitor alerts, an agent reads the linked Confluence runbook and executes the read-only diagnostic steps automatically.
How it runs
The automated pipeline, trigger to output.
- Trigger: Datadog monitor alert (Datadog)
- Action: Load linked runbook from Confluence (Confluence)
- Logic: Classify steps as read-only vs mutating
- Action: Auto-run read-only diagnostics (Shell)
- Action: Post findings and dry-run preview to Slack (Slack)
- Logic: Wait for approval on mutating fix
- Output: Run approved fix and report result (Shell)
What it does
Separates safe diagnostics from risky fixes. On a Datadog alert the agent auto-runs the read-only investigation steps from the runbook, gathers the output, and then presents the proposed mutating fix as a preview that a human must approve before execution.
When to use it
Use it when most of your triage is harmless inspection (check disk, tail logs, query metrics) but the actual fix is dangerous (restart, scale, drain). It removes the busywork while keeping a human on the trigger for the one step that matters.
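The split between harmless inspection and dangerous fixes can be sketched as a small classifier. This is a minimal illustration, not the workflow's actual logic: the allowlists and the `classify_step` function are hypothetical names chosen for the example, and a real deployment would tune them to its own runbooks. The one design choice worth copying is that anything unrecognized is treated as mutating, so a classification mistake costs an extra approval rather than an unreviewed change.

```python
import shlex

# Hypothetical allowlist: commands treated as safe inspection.
READ_ONLY_COMMANDS = {"df", "du", "tail", "head", "cat", "grep", "ps", "uptime", "free"}

# Hypothetical (command, subcommand) pairs that only read state.
READ_ONLY_SUBCOMMANDS = {
    ("kubectl", "get"),
    ("kubectl", "describe"),
    ("kubectl", "logs"),
    ("systemctl", "status"),
}

# Shell syntax that can chain, redirect, or substitute, so the first
# word alone no longer describes what the line does.
SHELL_METACHARACTERS = set(";|&><`$")


def classify_step(command: str) -> str:
    """Return "read-only" or "mutating"; anything uncertain is "mutating"."""
    if any(ch in SHELL_METACHARACTERS for ch in command):
        return "mutating"
    try:
        tokens = shlex.split(command)
    except ValueError:
        # Unbalanced quotes or similar: refuse to guess.
        return "mutating"
    if not tokens:
        return "mutating"
    if tokens[0] in READ_ONLY_COMMANDS:
        return "read-only"
    if len(tokens) > 1 and (tokens[0], tokens[1]) in READ_ONLY_SUBCOMMANDS:
        return "read-only"
    return "mutating"
```

With these lists, `df -h` and `kubectl get pods` classify as read-only, while `systemctl restart nginx`, `kubectl drain node-1`, and any line containing a pipe or redirect fall through to mutating.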
How it works
1. A Datadog monitor alert triggers the workflow with the monitor ID and tags.
2. The agent loads the runbook page linked in the alert from Confluence.
3. Logic classifies each runbook step as read-only or mutating.
4. Read-only diagnostics run immediately via shell and the output is collected.
5. The agent posts findings plus a dry-run preview of the mutating fix to Slack and waits for approval.
6. On approval the mutating command runs; the final result is posted back to the thread.
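The control flow of these steps can be sketched in a few lines. This is an illustration under stated assumptions, not the product's implementation: `triage`, `TriageReport`, and the three injected callables are hypothetical names. Shell execution, the Slack post, and the approval wait are passed in as plain functions so the gating logic can be exercised without touching any real system.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class TriageReport:
    findings: Dict[str, str] = field(default_factory=dict)  # read-only command -> output
    preview: List[str] = field(default_factory=list)        # mutating commands, not yet run
    results: Dict[str, str] = field(default_factory=dict)   # approved command -> output


def triage(
    steps: List[str],
    is_read_only: Callable[[str], bool],
    run: Callable[[str], str],
    approve: Callable[[TriageReport], bool],
) -> TriageReport:
    """Run read-only steps immediately; hold mutating steps until approved."""
    report = TriageReport()
    for step in steps:
        if is_read_only(step):
            report.findings[step] = run(step)
        else:
            # Dry run: record the command for the preview, do not execute it.
            report.preview.append(step)
    # `approve` stands in for posting findings and the preview, then
    # blocking until a human answers. It is only called if there is a fix.
    if report.preview and approve(report):
        for step in report.preview:
            report.results[step] = run(step)
    return report
```

Called with a fake `run` that records its input and an `approve` that returns `False`, only the read-only steps execute and `results` stays empty. That is the property to confirm before enabling the trigger.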
Set it up
What you configure once, before turning it on.
1. Connect Datadog: metrics, traces, log search.
2. Connect Confluence: spaces, pages, blueprints.
3. Connect Shell: run sandboxed commands inside the workspace.
4. Connect Slack: channels, DMs, threads, mentions.
5. Set each agent's model: models are left unset so you can pick the tier, either fast and cheap or top-quality.
6. Tune it to your data: edit the prompts, filters, and field mappings so it matches how your team works.
7. Test, then turn it on: run once against a sample, confirm the output, then enable the trigger.
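For the sample run in the last step, a hand-written alert is enough to check that the runbook link is found. The sketch below assumes the alert arrives as a dictionary with a `message` field and that the runbook is a Confluence Cloud style URL containing `/wiki/`. Both the payload shape and the `runbook_url` helper are hypothetical, so adjust them to whatever your trigger actually delivers.

```python
import re
from typing import Optional

# Assumed link shape: an https URL with a /wiki/ path segment.
RUNBOOK_URL = re.compile(r"https://[\w.-]+/wiki/\S+")


def runbook_url(alert: dict) -> Optional[str]:
    """Return the first runbook link in the alert message, or None."""
    match = RUNBOOK_URL.search(alert.get("message", ""))
    if match is None:
        return None
    # Drop sentence punctuation that may trail the link in prose.
    return match.group(0).rstrip(".,)")


# Hypothetical sample payload for a first test run.
sample_alert = {
    "monitor_id": 12345,
    "tags": ["service:web", "env:prod"],
    "message": (
        "Disk usage above 90% on web-1. Runbook: "
        "https://example.atlassian.net/wiki/spaces/OPS/pages/1234/Disk-Full."
    ),
}
```

An alert with no link returns `None`. In that case it is safer for the workflow to stop and report the missing runbook than to guess at diagnostic steps.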
More AI Agents workflows
Custom Metrics Cardinality Spike Pager
A webhook from a Datadog monitor fires when custom-metric cardinality jumps; an agent pinpoints the offending metric and tag and estimates the added cost.
Sentry-to-Confluence Runbook Updater
When a Sentry issue is resolved, the agent finds the matching Confluence runbook page and proposes an inline update with the verified fix.
Stale Doc-PR Chaser for Runbook Gaps
On a daily schedule the agent finds runbook doc PRs that were opened from resolved incidents but never reviewed, and summarizes what each one fixes.
Resolved Incident to Public Troubleshooting Doc
For customer-facing errors resolved in Sentry, the agent drafts a sanitized troubleshooting entry and opens a PR to your ReadMe documentation.
On-Call Runbook Gap Closer: Resolved Sentry Issues to Doc PRs
An agent reads each newly resolved Sentry issue, compares the actual fix against your existing runbook, and opens a GitHub PR adding the missing remediation steps.
Weekly On-Call Doc-Gap Digest
Each week the agent reviews every Sentry issue resolved in the last 7 days and ranks the ones whose runbook coverage is missing or thin.
Run it inside a business
This workflow drops into a full company template. Import the org, and this is one of the playbooks its agents run.

