Slack /why-spiked Command Backed by Runbook RAG

Lets engineers type a Slack slash command naming a metric or dashboard and get an instant retrieval-augmented answer about likely causes and fixes, cited from indexed runbooks.

Category: AI & RAG
Engine: sim
Difficulty: intermediate
Trigger: chat
Steps: 5
Setup: ~15 min

How it runs

The automated pipeline, trigger to output.

  • Trigger: Engineer runs the /why-spiked slash command in Slack (Slack)
  • Action: Query Datadog for metric values and anomalies (Datadog)
  • Action: Vector-search runbook chunks for relevant steps (Postgres)
  • Action: Compose cited diagnosis from retrieved text (OpenAI)
  • Output: Reply in Slack thread with answer and citations (Slack)
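
To make the trigger step concrete, here is a minimal sketch of reading the form-encoded body that Slack sends when someone runs a slash command. The field names (command, text, channel_id, user_id, response_url) are the ones Slack documents for slash commands; the SpikeQuery type and parse_slash_command function are hypothetical names for illustration, not part of this workflow's actual implementation.

```python
from dataclasses import dataclass
from urllib.parse import parse_qs


@dataclass
class SpikeQuery:
    target: str        # metric name or dashboard link typed after the command
    channel_id: str    # channel the command was run in
    user_id: str       # engineer who ran it
    response_url: str  # URL Slack supplies for posting a delayed reply


def parse_slash_command(body: str) -> SpikeQuery:
    """Parse the form-encoded request body of a Slack slash command."""
    # parse_qs returns a list per key; slash-command fields are single-valued.
    fields = {key: values[0] for key, values in parse_qs(body).items()}
    if fields.get("command") != "/why-spiked":
        raise ValueError(f"unexpected command: {fields.get('command')!r}")
    target = fields.get("text", "").strip()
    if not target:
        raise ValueError("usage: /why-spiked <metric or dashboard link>")
    return SpikeQuery(
        target=target,
        channel_id=fields.get("channel_id", ""),
        user_id=fields.get("user_id", ""),
        response_url=fields.get("response_url", ""),
    )
```

Rejecting an empty argument early lets the flow answer with a usage hint instead of querying Datadog with nothing to look up.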

What it does

Gives anyone in Slack a self-serve way to ask why a metric is misbehaving. An engineer runs the command with a metric name or dashboard link; the flow gathers current Datadog readings, retrieves matching runbook guidance, and replies in-thread with a cited diagnosis — no alert required.

When to use it

Use it for the constant stream of 'is this normal?' questions in team channels, or when someone is investigating a slow dashboard outside of a formal incident. It puts the runbook knowledge base one command away instead of behind a wiki search.

How it works

  1. An engineer invokes the Slack slash command with a metric, monitor, or dashboard reference.
  2. The flow queries Datadog for that metric's recent values, anomalies, and related tags.
  3. It runs a vector search over runbook chunks in Postgres to find relevant troubleshooting steps.
  4. An LLM composes an answer that explains probable causes and quotes only the retrieved runbook text.
  5. The reply is posted back into the same Slack thread with citation links and the live metric snapshot.
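
The retrieval and composition steps can be sketched roughly as follows. This is an in-memory stand-in, not the workflow's real code: in production the ranking would be done by a vector query in Postgres and the embeddings would come from an embedding model, whereas here cosine similarity is computed in plain Python over hand-written vectors. The Chunk type, the function names, and the [id] citation format are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class Chunk:
    chunk_id: str           # hypothetical identifier used as the citation key
    url: str                # link back to the runbook page
    text: str               # the runbook excerpt itself
    embedding: list[float]  # vector produced by an embedding model


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_chunks(query_vec: list[float], chunks: list[Chunk], k: int = 3) -> list[Chunk]:
    """Return the k chunks most similar to the query vector, best first."""
    ranked = sorted(chunks, key=lambda c: cosine(query_vec, c.embedding), reverse=True)
    return ranked[:k]


def build_prompt(metric: str, snapshot: str, chunks: list[Chunk]) -> str:
    """Assemble a prompt that restricts the model to the retrieved excerpts."""
    sources = "\n".join(f"[{c.chunk_id}] {c.text}" for c in chunks)
    return (
        f"Metric: {metric}\n"
        f"Current readings: {snapshot}\n\n"
        "Runbook excerpts:\n"
        f"{sources}\n\n"
        "Explain the probable causes and fixes. Quote only from the excerpts "
        "above and cite each claim with its [id]. If the excerpts do not "
        "cover the question, say so."
    )
```

Passing only the top-ranked excerpts into the prompt, each tagged with an id, is what lets the reply carry citation links back to specific runbook pages.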

Set it up

What you configure once, before turning it on.

  1. Connect Slack: channels, DMs, threads, mentions.
  2. Connect Datadog: metrics, traces, log search.
  3. Connect Postgres: any Postgres URL (query, write, migrate).
  4. Connect OpenAI: models, embeddings, files.
  5. Set each agent's model: we leave models unset so you pick the tier, fast and cheap or top-quality.
  6. Tune it to your data: edit the prompts, filters, and field mappings so it matches how your team works.
  7. Test, then turn it on: run once against a sample, confirm the output, then enable the trigger.
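
When confirming the output of a sample run, one useful check is that every citation in the reply points at a runbook chunk that was actually retrieved. The sketch below assumes citations appear as bracketed ids such as [rb-12]; that format, the check_reply name, and the sample ids are illustrative assumptions, so adjust the pattern to whatever your prompt asks the model to emit.

```python
import re

# Assumed citation format: a chunk id in square brackets, e.g. [rb-12].
CITATION = re.compile(r"\[([A-Za-z0-9_.:-]+)\]")


def check_reply(answer: str, retrieved_ids: set[str]) -> list[str]:
    """Return a list of problems found in a sample reply (empty means OK)."""
    problems = []
    cited = set(CITATION.findall(answer))
    if not cited:
        problems.append("reply has no citations")
    unknown = sorted(cited - set(retrieved_ids))
    if unknown:
        problems.append("cites chunks that were not retrieved: " + ", ".join(unknown))
    return problems


if __name__ == "__main__":
    sample = "Disk usage is climbing; free space per [rb-12]."
    print(check_reply(sample, {"rb-12", "rb-40"}))
```

An empty list means the sample reply cites only retrieved material; anything else is a signal to tighten the prompt before enabling the trigger.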
