
Weekly flake-rate scorecard from Datadog CI Visibility

Pulls per-test flakiness metrics from Datadog CI Visibility each week and ranks the worst offenders by flake rate and merge-blocking impact.

Category: Engineering
Engine: sim
Difficulty: intermediate
Trigger: schedule
Steps: 5
Setup: ~15 min

How it runs

The automated pipeline, trigger to output.

  • Trigger: Weekly schedule fires
  • Action: Query Datadog CI Visibility flake metrics (Datadog)
  • Logic: Rank offenders and compute week-over-week trend
  • Action: Open Linear issue for tests over critical threshold (Linear)
  • Output: Post weekly scorecard to Slack

What it does

Gives engineering leadership a regular, data-backed view of which tests are wasting the most CI time. It queries Datadog CI Visibility for flaky-test and rerun metrics, ranks tests by flake rate and by the number of pipelines they blocked, and turns the results into a digestible weekly scorecard.
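
As a rough illustration of the ranking described above, here is a minimal Python sketch. It assumes the per-test counts have already been fetched; the `FlakyTestStats` record, its field names, and the choice to sort by flake rate first and blocked pipelines second are all hypothetical, since the workflow does not specify how the two signals are combined.

```python
from dataclasses import dataclass


@dataclass
class FlakyTestStats:
    """Hypothetical per-test record for one reporting window."""
    name: str
    runs: int               # total executions in the window
    flaky_runs: int         # executions flagged as flaky
    blocked_pipelines: int  # pipelines this test blocked

    @property
    def flake_rate(self) -> float:
        # Guard against tests that did not run in the window.
        return self.flaky_runs / self.runs if self.runs else 0.0


def rank_offenders(stats, top_n=10):
    """Return the worst offenders, highest flake rate first.

    Ties on flake rate are broken by blocked-pipeline count
    (an assumed ordering, not one the workflow prescribes).
    """
    ordered = sorted(
        stats,
        key=lambda s: (s.flake_rate, s.blocked_pipelines),
        reverse=True,
    )
    return ordered[:top_n]
```

A team that cares more about merge-blocking impact than raw flake rate could swap the two elements of the sort key.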

When to use it

Use it when CI test results already flow into Datadog and you want a recurring leadership-facing summary rather than per-incident alerts. Ideal for sprint reviews and reliability standups.

How it works

  1. A weekly schedule fires the workflow.
  2. It queries Datadog CI Visibility for flaky-test and rerun metrics over the trailing seven days.
  3. A logic step ranks tests by flake rate and blocked-pipeline count, and computes the change versus the prior week.
  4. It optionally opens a Linear issue for any test crossing a critical flake threshold.
  5. It posts a formatted scorecard with the top offenders and trend arrows to the team Slack channel.
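
The trend and threshold steps above might look something like the following Python sketch. It works on plain dictionaries mapping test names to flake rates, so it makes no calls to Datadog, Linear, or Slack. The 10% critical threshold, the tolerance for a "flat" trend, and the scorecard layout are assumptions chosen for illustration, not values the workflow defines.

```python
CRITICAL_FLAKE_RATE = 0.10  # assumed threshold; tune to your own data


def trend_arrow(current, previous, tolerance=0.005):
    """Map a week-over-week change in flake rate to an arrow."""
    delta = current - previous
    if delta > tolerance:
        return "↑"
    if delta < -tolerance:
        return "↓"
    return "→"


def build_scorecard(this_week, last_week, top_n=5):
    """Build scorecard text and list the tests over the threshold.

    Both arguments are dicts of test name -> flake rate (0.0 to 1.0).
    Tests with no data for the prior week are marked "new".
    """
    ranked = sorted(this_week.items(), key=lambda kv: kv[1], reverse=True)
    lines = ["Weekly flake-rate scorecard"]
    for name, rate in ranked[:top_n]:
        previous = last_week.get(name)
        arrow = "new" if previous is None else trend_arrow(rate, previous)
        lines.append(f"{name}: {rate:.1%} {arrow}")
    # Candidates for a tracking issue: every test at or over the threshold,
    # not only those that made the top-N list.
    critical = [name for name, rate in ranked if rate >= CRITICAL_FLAKE_RATE]
    return "\n".join(lines), critical
```

The returned text would be the body of the Slack post, and each name in the second return value would be a candidate for a Linear issue.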

Set it up

What you configure once, before turning it on.

  1. Connect Datadog: metrics, traces, log search.
  2. Connect Linear: issues, projects, cycles, triage.
  3. Connect Slack: channels, DMs, threads, mentions.
  4. Set each agent's model: models are left unset so you can pick the tier, fast and cheap or top quality.
  5. Tune it to your data: edit the prompts, filters, and field mappings so it matches how your team works.
  6. Test, then turn it on: run once against a sample, confirm the output, then enable the trigger.
