Axiom Log-Cost Spike Detector with AI Root-Cause to PagerDuty

When any service's hourly log ingest jumps above its baseline, this workflow summarizes the likely cause from recent log patterns and opens a PagerDuty incident with the explanation attached.

Category: Summarization
Engine: sim
Difficulty: intermediate
Trigger: schedule
Steps: 6
Setup: ~15 min

How it runs

The automated pipeline, trigger to output.

  • Trigger: Hourly schedule
  • Action: Query current vs 14-day baseline ingest by service (Axiom)
  • Logic: Filter to services over spike threshold
  • Action: Fetch sample of dominant log lines (Axiom)
  • Action: Summarize likely root cause + cost impact (OpenAI)
  • Output: Open PagerDuty incident with summary (PagerDuty)
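
The filter step in this pipeline comes down to comparing each service's current-hour ingest with its baseline. The Python sketch below is one way to express that comparison, assuming the Axiom query has already returned per-service byte counts. The `IngestSample` type, the `find_spikes` function, and the 3x default threshold are hypothetical names and values chosen for illustration, not part of the workflow itself.

```python
from dataclasses import dataclass


@dataclass
class IngestSample:
    service: str
    current_bytes: float   # ingest during the current hour
    baseline_bytes: float  # typical hourly ingest over the baseline window


def find_spikes(samples, ratio_threshold=3.0):
    """Return (service, ratio) pairs where current ingest >= baseline * threshold.

    Services with no baseline (for example, brand-new ones) are skipped here;
    that is an assumption of this sketch, not documented workflow behavior.
    """
    spikes = []
    for s in samples:
        if s.baseline_bytes <= 0:
            continue
        ratio = s.current_bytes / s.baseline_bytes
        if ratio >= ratio_threshold:
            spikes.append((s.service, ratio))
    # Largest spike first, so the worst offender is handled first.
    spikes.sort(key=lambda pair: pair[1], reverse=True)
    return spikes
```

An empty result corresponds to the case where no service breaches the threshold and the run ends without opening an incident.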

What it does

This workflow watches Axiom log-ingest rates per service on an hourly cadence. When a service's volume exceeds its rolling 14-day baseline by a configured threshold, it samples the offending logs, asks an LLM to characterize what changed (a new error loop, debug logging left on, a retry storm), and files a PagerDuty incident so cost runaways get caught the same hour they start.
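
How the "dominant" log lines are chosen is not specified here. One plausible approach is to mask the variable parts of each line (numbers, long hex IDs) and count the resulting templates, so that a retry storm or error loop shows up as a single pattern with a high share. The sketch below assumes the sampled lines are already available as plain strings; `template_of` and `dominant_patterns` are hypothetical helper names.

```python
import re
from collections import Counter

# Mask variable parts so lines differing only by IDs or counters group together.
_HEX = re.compile(r"\b[0-9a-fA-F]{8,}\b")
_NUM = re.compile(r"\d+")


def template_of(line):
    """Replace long hex-like tokens and digit runs with placeholders."""
    line = _HEX.sub("<hex>", line)
    return _NUM.sub("<n>", line)


def dominant_patterns(lines, top_n=3):
    """Return the top_n templates as (template, count, share_of_sample)."""
    counts = Counter(template_of(line) for line in lines)
    total = len(lines)
    return [(tpl, n, n / total) for tpl, n in counts.most_common(top_n)]
```

A single template that accounts for most of the sample is a strong hint for the summarization step; a flat distribution suggests a general traffic increase rather than one noisy code path.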

When to use it

Use it when a single misconfigured deploy can 10x your log bill overnight. This catches the spike while it's small and hands on-call a written hypothesis instead of a raw graph.

How it works

  1. An hourly schedule trigger runs the check.
  2. An Axiom query computes current-hour ingest by service against each service's 14-day baseline.
  3. A logic step filters to only services breaching the spike threshold; if none, the run ends quietly.
  4. For each breaching service, Axiom returns a sample of the dominant log lines.
  5. An OpenAI step writes a concise root-cause hypothesis and estimated incremental cost.
  6. A PagerDuty incident opens with the service, the delta, and the summary.
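
The last two steps can be sketched as two pure functions: one assembles the text sent to the model, the other builds the incident payload. This is a rough illustration under stated assumptions, not the workflow's actual implementation. The function names and prompt wording are hypothetical, and the payload shape follows the PagerDuty Events API v2 trigger event; the workflow's PagerDuty connector may use a different API.

```python
def build_prompt(service, current_gb, baseline_gb, patterns):
    """Assemble the summarization prompt from the spike numbers and log patterns.

    `patterns` is a list of (template, count) pairs from the sampling step.
    """
    delta_gb = current_gb - baseline_gb
    lines = [
        f"Service: {service}",
        f"Hourly ingest: {current_gb:.1f} GB "
        f"(baseline {baseline_gb:.1f} GB, +{delta_gb:.1f} GB)",
        "Most frequent log patterns this hour:",
    ]
    for template, count in patterns:
        lines.append(f"- {count}x {template}")
    lines.append(
        "In two or three sentences, state the most likely cause of the "
        "increase and estimate the incremental cost."
    )
    return "\n".join(lines)


def build_pagerduty_event(routing_key, service, current_gb, baseline_gb, summary_text):
    """Build an Events API v2 'trigger' body carrying the service, delta, and summary."""
    ratio = current_gb / baseline_gb if baseline_gb > 0 else float("inf")
    return {
        "routing_key": routing_key,
        "event_action": "trigger",
        # A stable key per service groups repeat hourly runs into one open alert.
        "dedup_key": f"log-cost-spike:{service}",
        "payload": {
            "summary": f"Log ingest spike on {service}: {ratio:.1f}x baseline",
            "source": service,
            "severity": "warning",
            "custom_details": {
                "current_gb": current_gb,
                "baseline_gb": baseline_gb,
                "delta_gb": current_gb - baseline_gb,
                "root_cause_hypothesis": summary_text,
            },
        },
    }
```

Keeping both functions free of network calls makes them easy to check against a sample before the trigger is enabled; sending the prompt to the model and posting the event are left to the connected integrations.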

Set it up

What you configure once, before turning it on.

  1. Connect Axiom: log streams, queries, dashboards.
  2. Connect OpenAI: models, embeddings, files.
  3. Connect PagerDuty: incidents, on-call, escalations.
  4. Set each agent's model. We leave models unset so you pick the tier: fast and cheap, or top-quality.
  5. Tune it to your data. Edit the prompts, filters, and field mappings so it matches how your team works.
  6. Test, then turn it on. Run once against a sample, confirm the output, then enable the trigger.
