AI AGENTS

Replicate Successor Auto-Bump Merge Request with Eval Gate

When a pinned Replicate version is deprecated and its successor passes the golden-prompt eval clean, the agent opens a GitLab merge request that bumps the pinned hash…

Category: AI Agents
Engine: paperclip
Difficulty: advanced
Trigger: schedule
Steps: 6
Setup: ~25 min

How it runs

The automated pipeline, trigger to output.

  • Trigger: Schedule detects a deprecated pinned version
  • Action: Dry-run successor version on golden prompts (Replicate)
  • Logic: Gate on whether all prompts pass drift tolerance
  • Action: Edit pinned hash and open GitLab merge request on pass (GitLab)
  • Action: Open GitLab review issue with failing diffs on fail (GitLab)
  • Output: Notify Slack with link and eval verdict (Slack)

What it does

This agent closes the loop from detection to code change. When a pinned version is deprecated, it dry-runs the successor version against your golden prompts. If every prompt passes the drift gate, it edits the config file to bump the pinned version hash and opens a ready-to-merge GitLab merge request. If any prompt regresses, it opens a GitLab review issue with the failing diffs instead of shipping a risky bump.
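
To make the hash bump concrete, here is a minimal sketch of the config edit, assuming the pin is stored as an `owner/model:version` string somewhere in a text config file. The function name `bump_pinned_version`, the model name `acme/upscaler`, and the config layout are hypothetical; your repo may pin the version differently.

```python
def bump_pinned_version(config_text: str, model: str, old_hash: str, new_hash: str) -> str:
    """Swap `model:old_hash` for `model:new_hash` in a config file's text.

    Hypothetical helper: assumes the pin appears exactly once in the file
    and refuses to edit otherwise, so an ambiguous file is never rewritten.
    """
    pinned = f"{model}:{old_hash}"
    occurrences = config_text.count(pinned)
    if occurrences != 1:
        raise ValueError(f"expected exactly one pin of {pinned!r}, found {occurrences}")
    return config_text.replace(pinned, f"{model}:{new_hash}")


if __name__ == "__main__":
    old = "a" * 64  # placeholder hashes, not real Replicate versions
    new = "b" * 64
    config = f'model: "acme/upscaler:{old}"\ntimeout: 30\n'
    print(bump_pinned_version(config, "acme/upscaler", old, new))
```

Failing loudly when the pin is missing or duplicated keeps the merge request diff to a single, reviewable line.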

When to use it

Use it when you trust a clean eval to auto-prepare the code change, so trivial successor bumps merge in minutes while only genuine regressions reach a human.

How it works

  1. A schedule detects a deprecated pinned version.
  2. The agent dry-runs the successor against the golden prompt set.
  3. A logic gate checks whether all prompts pass within drift tolerance.
  4. On pass, the agent edits the pinned hash in the repo file and opens a GitLab merge request.
  5. On fail, it opens a GitLab review issue with the failing diffs instead.
  6. It notifies Slack with the MR or issue link and the eval verdict.
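
The gate in steps 3 to 5 can be sketched as follows. This is an illustration under stated assumptions, not the workflow's actual implementation: it assumes text outputs, measures drift as one minus the `difflib.SequenceMatcher` similarity ratio, and uses a made-up tolerance of 0.1. The names `PromptResult`, `evaluate_gate`, and `next_action` are hypothetical; image or video outputs would need a different drift metric.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class PromptResult:
    prompt_id: str
    baseline: str   # output recorded from the currently pinned version
    candidate: str  # output from the successor's dry run


def drift(baseline: str, candidate: str) -> float:
    """Drift in [0, 1]: 0.0 for identical text, 1.0 for nothing in common."""
    return 1.0 - SequenceMatcher(None, baseline, candidate).ratio()


def evaluate_gate(results: list[PromptResult], tolerance: float = 0.1) -> dict:
    """Pass only if every prompt's drift is within tolerance (assumed value)."""
    failures = []
    for result in results:
        score = drift(result.baseline, result.candidate)
        if score > tolerance:
            failures.append((result.prompt_id, score))
    return {"verdict": "fail" if failures else "pass", "failures": failures}


def next_action(gate: dict) -> str:
    """Route to the merge request on pass, the review issue on fail."""
    return "open_merge_request" if gate["verdict"] == "pass" else "open_review_issue"


if __name__ == "__main__":
    results = [
        PromptResult("greeting", "Hello, world.", "Hello, world."),
        PromptResult("summary", "aaaa", "zzzz"),
    ]
    gate = evaluate_gate(results)
    print(gate["verdict"], next_action(gate), gate["failures"])
```

Because the gate requires every prompt to pass, a single regression is enough to route the run to human review, which matches the all-or-nothing behavior described above.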

Set it up

What you configure once, before turning it on.

  1. Connect Replicate: image, video, and model inference.
  2. Connect GitLab: repos, MRs, pipelines, registry.
  3. Connect Slack: channels, DMs, threads, mentions.
  4. Set each agent's model. We leave models unset so you pick the tier: fast and cheap, or top-quality.
  5. Tune it to your data. Edit the prompts, filters, and field mappings so it matches how your team works.
  6. Test, then turn it on. Run once against a sample, confirm the output, then enable the trigger.
