Weekly Audit of Answer-Bot Grounding and Citations

Samples the past week of answer-bot responses and re-verifies each cited claim against the frozen corpus with an LLM judge.

Category: AI & RAG
Engine: sim
Difficulty: advanced
Trigger: schedule
Steps: 5
Setup: ~25 min

How it runs

The automated pipeline, trigger to output.

  • Trigger: Weekly scheduled audit run
  • Action: Read sampled answers and cited chunk IDs from Supabase log
  • Action: Score citation faithfulness with OpenAI judge
  • Logic: Collect answers below the faithfulness threshold
  • Output: Post flagged-answers report to Slack

What it does

Keeps your grounded answer bots honest. Each week it pulls a sample of logged answers, re-checks whether every cited passage truly supports the claim it backs, and scores each answer for faithfulness. Answers that drift from their sources are flagged so Compliance can review before users are misled.
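One way to picture the per-answer score: judge each cited passage against the claim it backs, then let the weakest citation set the answer's score. This is a minimal sketch under stated assumptions, not the workflow's actual scoring rule. The `Citation` type, the `judge` callable signature, the min-aggregation, and the toy `overlap_judge` are all hypothetical; in a real run `judge` would wrap a call to whichever OpenAI model you select.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Citation:
    claim: str     # sentence from the answer that the citation backs
    chunk_id: str  # ID of the cited chunk in the frozen corpus
    passage: str   # text of the cited chunk, re-fetched at audit time


# Hypothetical judge signature: (claim, passage) -> support score in [0, 1].
Judge = Callable[[str, str], float]


def answer_faithfulness(citations: List[Citation], judge: Judge) -> float:
    """Return the lowest support score across an answer's citations.

    An answer with no citations scores 0.0, on the assumption that an
    uncited claim cannot be verified.
    """
    if not citations:
        return 0.0
    return min(judge(c.claim, c.passage) for c in citations)


def overlap_judge(claim: str, passage: str) -> float:
    """Toy stand-in for the LLM judge: fraction of claim words found in the passage."""
    claim_words = set(claim.lower().split())
    passage_words = set(passage.lower().split())
    if not claim_words:
        return 0.0
    return len(claim_words & passage_words) / len(claim_words)
```

Taking the minimum means a single unsupported citation is enough to pull an answer down. A mean would be more forgiving but could hide one bad citation among several good ones.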

When to use it

Use it as an ongoing quality gate once an answer bot is live, to catch citation hallucinations, stale references, and over-confident answers that should have been refusals. It pairs well with the corpus-freeze indexer for a closed audit loop.

How it works

  1. A weekly scheduled run starts the audit.
  2. A sample of recent answers and their cited chunk IDs is read from the Supabase answer log.
  3. For each answer, the cited passages are re-fetched and an OpenAI judge scores whether they genuinely support the claim.
  4. A logic step collects answers scoring below the faithfulness bar.
  5. A Slack report posts the flagged answers with scores and links for human review.
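The last two steps (collecting low scorers and building the report) can be sketched in a few lines. This is an illustration only: the `0.7` threshold, the record fields `answer_id`, `score`, and `link`, and the report wording are assumptions, not values taken from the workflow. Posting the resulting text to Slack is left out.

```python
from typing import Dict, List

FAITHFULNESS_THRESHOLD = 0.7  # assumed bar; tune it to your data


def collect_flagged(scored: List[Dict], threshold: float = FAITHFULNESS_THRESHOLD) -> List[Dict]:
    """Keep answers scoring strictly below the threshold, worst first."""
    flagged = [a for a in scored if a["score"] < threshold]
    return sorted(flagged, key=lambda a: a["score"])


def format_report(flagged: List[Dict], total: int) -> str:
    """Build the plain-text body of the weekly report."""
    header = f"Weekly grounding audit: {len(flagged)} of {total} sampled answers flagged."
    lines = [header]
    for a in flagged:
        # Each line carries the score and a link so a reviewer can jump to the answer.
        lines.append(f"- {a['answer_id']} (score {a['score']:.2f}): {a['link']}")
    return "\n".join(lines)
```

Sorting worst-first puts the answers most likely to mislead users at the top of the report, which helps when reviewers only have time for the first few.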

Set it up

What you configure once, before turning it on.

  1. Connect Supabase: Tables, auth, storage, edge functions.
  2. Connect OpenAI: Models, embeddings, files.
  3. Connect Slack: Channels, DMs, threads, mentions.
  4. Set each agent's model: We leave models unset so you pick the tier (fast and cheap, or top-quality).
  5. Tune it to your data: Edit the prompts, filters, and field mappings so it matches how your team works.
  6. Test, then turn it on: Run once against a sample, confirm the output, then enable the trigger.
