Toxicity gate before auto-replying to social mentions

Every AI-drafted reply to an inbound social mention is scored for toxicity by a Hugging Face classifier before it can be published.

Category: Social Media
Engine: sim
Difficulty: intermediate
Trigger: webhook
Steps: 6
Setup: ~15 min

How it runs

The automated pipeline, trigger to output.

  • Trigger: New brand mention received (HTTP webhook)
  • Action: Draft reply with OpenAI (OpenAI)
  • Action: Score draft for toxicity (Hugging Face)
  • Logic: Toxicity score below ceiling?
  • Action: Publish clean reply (Social publishing)
  • Output: Route flagged draft to Slack for review (Slack)

What it does

This workflow puts a safety checkpoint between your AI reply-drafter and the public internet. Every draft reply to an incoming mention is run through a Hugging Face toxicity classifier. A draft that scores below your configured ceiling publishes automatically; a draft that scores at or above it is held and pushed to a Slack channel for a human to approve, edit, or kill.
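
The publish-or-hold decision described above can be sketched in a few lines of Python. This is a minimal illustration, not the workflow's actual logic step: the names `gate` and `GateDecision` and the default ceiling of 0.5 are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GateDecision:
    action: str      # "publish" or "review"
    score: float     # toxicity probability reported by the classifier
    ceiling: float   # configured ceiling the score was compared against


def gate(score: float, ceiling: float = 0.5) -> GateDecision:
    """Publish only when the toxicity score is strictly below the ceiling.

    The 0.5 default is a placeholder assumption; pick a ceiling that
    matches your own risk tolerance.
    """
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"toxicity score must be a probability, got {score}")
    action = "publish" if score < ceiling else "review"
    return GateDecision(action=action, score=score, ceiling=ceiling)
```

A score exactly equal to the ceiling goes to review in this sketch, so a borderline draft is held rather than published.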

When to use it

Use it when you let an assistant respond to mentions at volume but a single off-tone reply would damage the brand. It gives you autopilot speed on the safe majority while routing any draft the classifier flags to a human before it goes live.

How it works

  1. A new brand mention arrives via webhook from your social listening source.
  2. OpenAI drafts a contextual reply to the mention.
  3. Hugging Face scores the draft text for toxicity and returns a probability.
  4. A logic step compares the score to your configured ceiling.
  5. If clean, the reply is published to the originating platform.
  6. If flagged, the draft, score, and original mention are posted to Slack for human review instead of publishing.
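
Steps 3 and 6 can be sketched as two small helpers: one that reads a toxicity probability out of a classifier response, and one that builds the Slack review message. Both are illustrative assumptions. The response shape (a list of `{"label", "score"}` entries, optionally nested one level) and the `toxic` label name depend on the model you choose, and the function names are hypothetical. The message uses the plain `{"text": ...}` payload that Slack incoming webhooks accept.

```python
from typing import Any


def toxicity_probability(response: Any, toxic_label: str = "toxic") -> float:
    """Pull the probability for `toxic_label` out of a classifier response.

    Assumes a list of {"label", "score"} dicts, optionally wrapped in one
    outer list (one inner list per input text). Label names vary by model,
    so `toxic_label` is an assumption to adjust.
    """
    rows = response[0] if response and isinstance(response[0], list) else response
    for row in rows:
        if row["label"].lower() == toxic_label.lower():
            return float(row["score"])
    raise KeyError(f"label {toxic_label!r} not found in classifier response")


def slack_review_message(mention: str, draft: str, score: float, ceiling: float) -> dict:
    """Build a Slack payload carrying the draft, score, and original mention."""
    text = (
        ":warning: Draft reply held for review\n"
        f"*Toxicity score:* {score:.2f} (ceiling {ceiling:.2f})\n"
        f"*Original mention:* {mention}\n"
        f"*Draft reply:* {draft}"
    )
    return {"text": text}
```

Keeping the parsing separate from the comparison means a change of classifier model only touches `toxicity_probability`.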

Set it up

What you configure once, before turning it on.

  1. Connect HTTP webhook: Trigger any URL on agent actions.
  2. Connect OpenAI: Models, embeddings, files.
  3. Connect Hugging Face: Models, datasets, and spaces on the open-source hub.
  4. Connect Social publishing: Cross-post to X, LinkedIn, Instagram, TikTok, and 4 more in one call.
  5. Connect Slack: Channels, DMs, threads, mentions.
  6. Set each agent's model: We leave models unset so you pick the tier, fast and cheap or top-quality.
  7. Tune it to your data: Edit the prompts, filters, and field mappings so it matches how your team works.
  8. Test, then turn it on: Run once against a sample, confirm the output, then enable the trigger.
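
One way to rehearse the final test step offline is to run a sample mention through the same branching with every connector stubbed out. The sketch below is an assumption-laden illustration, not the platform's test runner: `run_once`, `dry_run`, the stubbed reply text, and the 0.5 default ceiling are all hypothetical.

```python
from typing import Callable


def run_once(
    mention: str,
    draft_reply: Callable[[str], str],
    score_toxicity: Callable[[str], float],
    publish: Callable[[str], None],
    send_to_review: Callable[[str, str, float], None],
    ceiling: float,
) -> str:
    """Run one mention through the gate and return the branch taken."""
    draft = draft_reply(mention)
    score = score_toxicity(draft)
    if score < ceiling:
        publish(draft)
        return "published"
    send_to_review(mention, draft, score)
    return "review"


def dry_run(sample_mention: str, stub_score: float, ceiling: float = 0.5) -> dict:
    """Exercise the gate without calling any real service.

    Every connector is replaced by a stub that records what it was given,
    so the result shows exactly which branch fired and with what data.
    """
    published: list[str] = []
    reviewed: list[tuple[str, str, float]] = []
    branch = run_once(
        sample_mention,
        draft_reply=lambda m: f"Thanks for reaching out about: {m}",
        score_toxicity=lambda _draft: stub_score,
        publish=published.append,
        send_to_review=lambda m, d, s: reviewed.append((m, d, s)),
        ceiling=ceiling,
    )
    return {"branch": branch, "published": published, "reviewed": reviewed}
```

Calling `dry_run("Love the new release!", stub_score=0.02)` takes the publish branch, while the same call with `stub_score=0.80` takes the review branch and leaves `published` empty.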
