IT OPS
Draft an all-clear update when the underlying alert recovers
When the Datadog monitor that opened an incident recovers, this drafts a plain-language "resolved" update and routes it to Slack for a final human sign-off before closing…
How it runs
The automated pipeline, trigger to output.
- Trigger: Datadog monitor recovers to OK (Datadog)
- Logic: Match recovery to an open public incident
- Action: Draft plain-language all-clear message (OpenAI)
- Action: Post all-clear draft to Slack for sign-off (Slack)
- Output: Publish resolution + close incident on status page (HTTP webhook)
What it does
It handles the often-forgotten end of an incident: the all-clear. When the monitor that triggered an open incident returns to OK, the workflow drafts a calm, customer-readable resolution message ("This issue has been resolved and all systems are operating normally") and asks for one last human confirmation before marking the public incident resolved.
When to use it
Use it when incidents tend to linger as "open" on your status page long after the real problem is fixed, because nobody remembers to post the closing update. It pairs naturally with an alert-to-draft opener so the same incident is cleanly opened and closed.
How it works
1. A Datadog monitor transitions from Alert back to OK and fires the workflow.
2. A logic step matches the recovery to an open public incident; if none is open, it stops.
3. An LLM step drafts a short resolution update referencing the original impact and confirming normal operation.
4. The all-clear draft is posted to Slack for sign-off, with a button that publishes the resolution and closes the incident on the status page.
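The matching and sign-off steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the workflow's actual implementation: the `Incident` record, the event keys (`monitor_id`, `previous_status`, `status`), and the `publish_all_clear` action ID are all hypothetical, since Datadog webhook payloads are shaped by whatever template you configure. The Slack message uses standard Block Kit `section` and `actions` blocks.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Incident:
    # Hypothetical record for an incident shown on the status page.
    incident_id: str
    monitor_id: str   # assumed link back to the Datadog monitor that opened it
    status: str       # e.g. "investigating", "resolved"
    public: bool
    impact: str       # customer-facing impact summary


def is_recovery(event: dict) -> bool:
    """True when the monitor moved from Alert back to OK (assumed field names)."""
    return event.get("previous_status") == "Alert" and event.get("status") == "OK"


def match_open_incident(event: dict, incidents: List[Incident]) -> Optional[Incident]:
    """Return the open public incident tied to this monitor, or None to stop."""
    if not is_recovery(event):
        return None
    monitor_id = str(event.get("monitor_id"))
    for inc in incidents:
        if inc.public and inc.status != "resolved" and inc.monitor_id == monitor_id:
            return inc
    return None


def build_signoff_blocks(incident: Incident, draft: str) -> List[dict]:
    """Slack Block Kit payload: the draft text plus a single confirm button."""
    return [
        {
            "type": "section",
            "text": {
                "type": "mrkdwn",
                "text": f"*All-clear draft for {incident.incident_id}*\n{draft}",
            },
        },
        {
            "type": "actions",
            "elements": [
                {
                    "type": "button",
                    "text": {"type": "plain_text", "text": "Publish and close"},
                    "style": "primary",
                    "action_id": "publish_all_clear",  # hypothetical action ID
                    "value": incident.incident_id,
                }
            ],
        },
    ]
```

Returning `None` from the match step mirrors the "if none is open, it stops" rule, and carrying the incident ID in the button's `value` lets the handler that receives the click know which incident to resolve without re-running the match.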
Set it up
What you configure once, before turning it on.
1. Connect Datadog: metrics, traces, log search.
2. Connect OpenAI: models, embeddings, files.
3. Connect Slack: channels, DMs, threads, mentions.
4. Connect HTTP webhook: trigger any URL on agent actions.
5. Set each agent's model: models are left unset so you pick the tier, either fast and cheap or top-quality.
6. Tune it to your data: edit the prompts, filters, and field mappings so the workflow matches how your team works.
7. Test, then turn it on: run once against a sample, confirm the output, then enable the trigger.
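To make the "tune" and "test" steps concrete, here is a minimal sketch of a field mapping feeding a prompt template, followed by a dry run against a sample payload. The mapping keys, the dotted paths, and the prompt wording are assumptions for illustration only; your own payload shape and prompts will differ.

```python
from string import Template

# Hypothetical mapping from prompt variables to dotted paths in the payload.
FIELD_MAP = {
    "service": "tags.service",
    "impact": "incident.impact",
    "started_at": "incident.started_at",
}

# Hypothetical prompt; edit the wording to match your team's tone.
PROMPT = Template(
    "Write a short, calm status-page update saying the issue affecting "
    "$service is resolved. Original impact: $impact. Started: $started_at. "
    "Confirm normal operation. Do not speculate about root cause."
)


def lookup(payload: dict, dotted: str, default: str = "unknown") -> str:
    """Walk a dotted path through nested dicts, falling back to a default."""
    node = payload
    for key in dotted.split("."):
        if not isinstance(node, dict) or key not in node:
            return default
        node = node[key]
    return str(node)


def render_prompt(payload: dict) -> str:
    """Fill the prompt template from the payload using FIELD_MAP."""
    values = {name: lookup(payload, path) for name, path in FIELD_MAP.items()}
    return PROMPT.substitute(values)


if __name__ == "__main__":
    # Dry run against a sample before enabling the trigger.
    sample = {
        "tags": {"service": "checkout-api"},
        "incident": {"impact": "Elevated error rates"},
    }
    print(render_prompt(sample))
```

A missing field renders as "unknown" rather than raising an error, which makes gaps in the mapping easy to spot during the sample run.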
Run it inside a business
This workflow drops into a full company template. Import the org, and this is one of the playbooks its agents run.

