DEVOPS

AI visual judge that scores previews and pages on-call for breakage

Sends preview screenshots to a vision model that judges layout, broken images, and overflow.

CategoryDevOps
Enginesim
Difficultyadvanced
Triggerwebhook
Steps6
Setup~25 min

How it runs

The automated pipeline, trigger to output.

  • TriggerVercel preview deploy readyVercelVercel
  • ActionCapture route screenshotsBrowserbase
  • ActionVision model scores layout and defectsOpenAI
  • LogicBranch: score below pass threshold?
  • ActionOpen PagerDuty incident with findingsPagerDutyPagerDuty
  • OutputSet failing GitHub commit statusGitHubGitHub

What it does

Instead of relying only on pixel diffs, this workflow asks a vision model to act as a QA reviewer. It captures each critical route on the Vercel preview and prompts the model to flag broken layouts, missing or 404 images, text overflow, and obviously broken components, returning a structured score and an issue list. Builds below the quality bar are treated as production incidents.

When to use it

Use it when pixel diffing misses semantic breakage, like an image that loads as a gray box or a button that overflows its container on a fresh page that has no baseline. It suits teams who want a judgment call on net-new pages where a baseline comparison is impossible.

How it works

  1. 1A Vercel preview-ready webhook starts the run.
  2. 2A headless browser captures full-page screenshots of the configured routes.
  3. 3Each screenshot is sent to a vision model with a rubric to score quality and list defects.
  4. 4A branch checks whether the aggregate score is below the pass threshold.
  5. 5If it fails, a PagerDuty incident is opened with the offending pages and findings.
  6. 6A failing GitHub commit status is set so the deploy cannot be promoted.

Set it up

What you configure once, before turning it on.

  1. 1
    Connect VercelDeploys, runtime logs, analytics.
  2. 2
    Connect BrowserbaseHeadless browsers, sessions, replays.
  3. 3
    Connect OpenAIModels, embeddings, files.
  4. 4
    Connect PagerDutyIncidents, on-call, escalations.
  5. 5
    Connect GitHubRepos, issues, pull requests, actions.
  6. 6
    Set each agent's modelWe leave models unset so you pick the tier — fast + cheap, or top-quality.
  7. 7
    Tune it to your dataEdit the prompts, filters, and field mappings so it matches how your team works.
  8. 8
    Test, then turn it onRun once against a sample, confirm the output, then enable the trigger.

Run this workflow in your colony.

14-day trial. No DevOps. No Sales call. Provisioned in under a minute.