
DORA Metrics — A Practical Implementation Guide

DORA metrics are the gold standard for measuring software delivery performance. But most teams measure them wrong, report them without context, or use them punitively. Here's how to implement them properly.

What DORA Metrics Actually Measure

The DevOps Research and Assessment (DORA) programme, now part of Google Cloud, identified four key metrics that predict software delivery performance and organisational outcomes. The 2025 DORA report found that 90% of organisations now use AI in development, but warned that AI "amplifies what is already working well, as well as magnifies existing dysfunctions." The metrics remain the diagnostic tool for understanding whether your delivery system is healthy.

The Four Core Metrics

Deployment Frequency (DF): How often the team successfully releases to production.

Elite: multiple times per day
High: between once per day and once per week
Medium: between once per week and once per month
Low: less than once per month

Lead Time for Changes (LT): Time from code commit to code running in production.

Elite: less than one hour
High: between one day and one week
Medium: between one week and one month
Low: more than one month

Change Failure Rate (CFR): Percentage of deployments causing a degradation in service.

Elite: 0-5%
High: 5-10%
Medium: 10-15%
Low: 16-30%

Mean Time to Recovery (MTTR): How quickly the team restores service after an incident.

Elite: less than one hour
High: less than one day
Medium: between one day and one week
Low: more than one week
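The recovery tiers above translate directly into a lookup. This is a minimal sketch, assuming recovery time is available as a `timedelta`. The function name `recovery_tier` and the choice to treat exactly one week as Medium are illustrative assumptions, not part of the DORA definitions.

```python
from datetime import timedelta


def recovery_tier(time_to_restore: timedelta) -> str:
    """Map a recovery time onto the Elite/High/Medium/Low bands listed above."""
    if time_to_restore < timedelta(hours=1):
        return "Elite"
    if time_to_restore < timedelta(days=1):
        return "High"
    if time_to_restore <= timedelta(weeks=1):
        # Assumption: a recovery of exactly one week still counts as Medium.
        return "Medium"
    return "Low"


print(recovery_tier(timedelta(minutes=40)))  # Elite
print(recovery_tier(timedelta(hours=30)))    # Medium
```

The same pattern works for the other metrics, as long as the team agrees how boundary values are handled.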

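A fifth metric, Reliability, was added in recent reports; it measures whether the team meets its service level objectives. The tier thresholds above have shifted between annual reports, so treat the bands as indicative rather than fixed.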

The Key Insight

Speed and stability are not trade-offs. Elite teams deploy more frequently AND have lower failure rates AND recover faster. The research consistently shows that investing in deployment automation, testing, and observability improves all four metrics simultaneously.

Implementation Step by Step

Step 1: Define Your Measurement Points

Before collecting data, agree on precise definitions:

Deployment Frequency:

What counts as a "deployment"? (Production only, or staging too?)
Do hotfixes count? (Yes — they're deployments)
Do configuration changes count? (Depends on your risk profile)
Measurement: count of successful production deployments per time period
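The deployment frequency definition above can be sketched as a count per ISO week. The `Deployment` record and its field names are hypothetical; in practice the events would come from whatever your CI/CD tool exports.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Deployment:
    # Hypothetical event shape; adapt to your CI/CD tool's export.
    finished_at: datetime
    environment: str
    succeeded: bool


def deployments_per_week(deployments):
    """Count successful production deployments per ISO (year, week)."""
    counts = Counter()
    for d in deployments:
        if d.environment == "production" and d.succeeded:
            iso = d.finished_at.isocalendar()
            counts[(iso[0], iso[1])] += 1
    return dict(counts)


events = [
    Deployment(datetime(2025, 3, 3, 10), "production", True),
    Deployment(datetime(2025, 3, 4, 15), "production", True),
    Deployment(datetime(2025, 3, 5, 9), "staging", True),       # not production
    Deployment(datetime(2025, 3, 6, 11), "production", False),  # did not succeed
    Deployment(datetime(2025, 3, 11, 14), "production", True),
]
print(deployments_per_week(events))  # {(2025, 10): 2, (2025, 11): 1}
```

Staging and failed deployments are filtered out here because the definition counts only successful production deployments. If the team decides configuration changes count, they would simply be added as events.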

Lead Time for Changes:

Start point: first commit on the branch (not story creation — that's lead time for features, a different metric)
End point: code running in production (not merged to main, not deployed to staging)
Measurement: median time from first commit to production deployment
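As a rough illustration of the lead time measurement above, the sketch below takes pairs of timestamps (first commit on the branch, production deployment) and returns the median. The function name and the input shape are assumptions made for the example.

```python
from datetime import datetime, timedelta
from statistics import median


def median_lead_time(changes):
    """changes: iterable of (first_commit_at, deployed_to_production_at) pairs."""
    durations = [
        (deployed - committed).total_seconds() for committed, deployed in changes
    ]
    return timedelta(seconds=median(durations))


changes = [
    (datetime(2025, 3, 3, 9), datetime(2025, 3, 3, 17)),  # 8 hours
    (datetime(2025, 3, 4, 9), datetime(2025, 3, 6, 9)),   # 48 hours
    (datetime(2025, 3, 5, 9), datetime(2025, 3, 6, 9)),   # 24 hours
]
print(median_lead_time(changes))  # 1 day, 0:00:00
```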

Change Failure Rate:

What counts as a "failure"? (Rollback, hotfix, degraded service, customer-impacting bug)
Does a failed deployment that's caught by automated tests count? (No — it never reached production)
Measurement: (failed deployments ÷ total deployments) × 100
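A worked instance of the formula above, assuming each deployment record carries a boolean `caused_failure` flag (a hypothetical field that someone or something has to set according to the team's agreed definition of failure):

```python
def change_failure_rate(deployments):
    """Percentage of production deployments flagged as having caused a failure."""
    if not deployments:
        raise ValueError("no deployments in the period")
    failed = sum(1 for d in deployments if d["caused_failure"])
    return 100 * failed / len(deployments)


# 20 production deployments in the period, 3 of which needed a rollback or hotfix.
period = [{"caused_failure": i < 3} for i in range(20)]
print(change_failure_rate(period))  # 15.0
```

Deployments that were stopped by automated tests never appear in the list, which matches the definition above.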

Mean Time to Recovery:

Start point: incident detected (alert fired, not customer report)
End point: service restored to normal operation (not root cause identified)
Measurement: median time from detection to restoration (use the median despite the "mean" in the name, so that one long outage does not dominate)

Step 2: Instrument Your Pipeline

DORA metrics come from your CI/CD pipeline and incident management system, not from Jira or manual tracking.

Data sources:

Deployment Frequency: CI/CD tool (GitHub Actions, Jenkins, GitLab CI) — count production deployment events
Lead Time: Git + CI/CD — timestamp of first commit vs timestamp of production deployment
Change Failure Rate: Incident management tool (PagerDuty, incident.io, Opsgenie) linked to deployment events
MTTR: Incident management tool — time from alert to resolution
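Linking incidents to deployment events is the part that usually needs a rule of its own. One possible rule, sketched below, attributes each incident to the most recent production deployment that finished within a fixed window before it. The 24-hour window and the function name are assumptions for illustration; many teams link incidents by hand or by a tag in the incident tool instead.

```python
from bisect import bisect_right
from datetime import datetime, timedelta


def failed_deployment_indexes(deploy_times, incident_times,
                              window=timedelta(hours=24)):
    """Return indexes of deployments blamed for at least one incident.

    deploy_times must be sorted ascending. An incident with no deployment
    in the preceding window is left unattributed.
    """
    failed = set()
    for incident_at in incident_times:
        i = bisect_right(deploy_times, incident_at) - 1  # latest deploy <= incident
        if i >= 0 and incident_at - deploy_times[i] <= window:
            failed.add(i)
    return failed


deploys = [datetime(2025, 3, 3, 10), datetime(2025, 3, 4, 10), datetime(2025, 3, 6, 10)]
incidents = [datetime(2025, 3, 4, 10, 30), datetime(2025, 3, 8, 12)]
print(failed_deployment_indexes(deploys, incidents))  # {1}
```

The second incident falls more than 24 hours after the last deployment, so this rule treats it as unrelated to a change. Whether that is right depends on the system, which is why the window should be agreed with the team.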

Tools that calculate DORA automatically:

Sleuth, LinearB, Swarmia, Jellyfish, Faros AI, Datadog DORA, GitLab Value Stream Analytics
Or build your own dashboard from CI/CD event data + incident data

Step 3: Establish a Baseline

Measure for 4-6 weeks before setting targets. Your baseline tells you where you are today — without judgment. Many teams are surprised to find their lead time is measured in weeks, not days, once they measure it honestly.

Present the baseline to the team as data, not as a problem. "Here's where we are. Where would we like to be in 6 months?"

Step 4: Identify Bottlenecks

Use the metrics to diagnose where the delivery system is constrained:

High lead time but high deployment frequency: The pipeline is fast once code is ready, but code takes a long time to get ready. Look at: code review wait times, test suite duration, environment provisioning.
Low deployment frequency but low lead time: The pipeline is fast but deployments are batched. Look at: release approval processes, change advisory boards, manual testing gates.
High change failure rate: Deployments are risky. Look at: test coverage, deployment automation, feature flags, canary releases.
High MTTR: Recovery is slow. Look at: observability, runbooks, on-call processes, rollback automation.
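The diagnostic rules above can be written down as a small lookup so the team sees the same prompts each time the dashboard is reviewed. The cut-off values below are hypothetical placeholders; set them from your own baseline rather than from this sketch.

```python
# Hypothetical cut-offs for "high" and "low"; tune them to the team's baseline.
SLOW_LEAD_DAYS = 7
LOW_DEPLOYS_PER_WEEK = 1
HIGH_CFR_PERCENT = 15
SLOW_RECOVERY_HOURS = 24


def bottleneck_hints(lead_time_days, deploys_per_week, cfr_percent, recovery_hours):
    """Return the 'look at' lists from the rules above that apply to these numbers."""
    hints = []
    slow_lead = lead_time_days > SLOW_LEAD_DAYS
    infrequent = deploys_per_week < LOW_DEPLOYS_PER_WEEK
    if slow_lead and not infrequent:
        hints.append("code review wait times, test suite duration, environment provisioning")
    if infrequent and not slow_lead:
        hints.append("release approval processes, change advisory boards, manual testing gates")
    if cfr_percent > HIGH_CFR_PERCENT:
        hints.append("test coverage, deployment automation, feature flags, canary releases")
    if recovery_hours > SLOW_RECOVERY_HOURS:
        hints.append("observability, runbooks, on-call processes, rollback automation")
    return hints


for hint in bottleneck_hints(lead_time_days=10, deploys_per_week=5,
                             cfr_percent=20, recovery_hours=2):
    print("Look at:", hint)
```

The output is a conversation starter for the team, not a verdict; the numbers only say where to look first.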

Step 5: Improve Systematically

Pick one metric to improve at a time. The most common improvement sequence:

1. Reduce lead time first: this usually involves automating the pipeline, reducing code review wait times, and eliminating manual gates
2. Increase deployment frequency: once lead time is short, deploy more often with smaller batches
3. Reduce change failure rate: smaller, more frequent deployments are inherently less risky; add automated testing and progressive delivery
4. Reduce MTTR: invest in observability, automated rollback, and incident response processes

Common Implementation Mistakes

Measuring from the wrong point: Measuring lead time from story creation gives feature lead time, not DORA lead time. Measuring it from PR creation misses the development time before the PR was opened. Be precise about start and end points.

Using averages instead of medians: Averages are skewed by outliers. One 30-day change among twenty 1-day changes pulls the mean lead time to about 2.4 days, while the median stays at 1 day. Always use the median (p50) and track p90 for outlier awareness.
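To see the difference between mean, median and p90 on one data set, here is a small sketch using a made-up sample of ten lead times in days. The sample is hypothetical, and the "inclusive" percentile method is one of several reasonable conventions.

```python
from statistics import mean, median, quantiles

# Hypothetical lead times in days: mostly quick changes, two slow ones.
lead_times_days = [1, 1, 1, 2, 2, 2, 3, 3, 14, 30]

p50 = median(lead_times_days)
# quantiles(..., n=10) returns the nine decile cut points; the last one is p90.
p90 = quantiles(lead_times_days, n=10, method="inclusive")[-1]

print(mean(lead_times_days))  # 5.9
print(p50)                    # 2.0
print(p90)                    # 15.6
```

The mean suggests a typical change takes almost six days, the median shows it takes two, and p90 makes the slow tail visible without letting it distort the headline number.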

Comparing teams: DORA metrics are for team self-improvement, not cross-team comparison. Different teams have different contexts, different risk profiles, and different deployment targets.

Setting targets without understanding constraints: "Deploy daily" is meaningless if the team has a manual QA gate that takes 3 days. Fix the constraint first, then the metric improves naturally.

Ignoring the human system: The 2025 DORA report emphasises that AI and tooling amplify existing team dynamics. If trust is low, faster pipelines just surface dysfunction faster. Fix the team dynamics alongside the technical system.

Reporting DORA Metrics

For the team (weekly)

Show trends, not snapshots. A dashboard with:

4-week rolling median for each metric
Trend direction (improving, stable, declining)
Notable events annotated (new team member, major refactor, incident)
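The rolling median and trend direction on such a dashboard could be computed roughly as follows. This sketch assumes one value per week (for example the weekly median lead time in hours), so the rolling figure is a median of weekly values rather than of the raw events. The 10% tolerance for "stable" is an arbitrary assumption.

```python
from statistics import median


def rolling_median(weekly_values, window=4):
    """Median of each trailing window; the first window-1 weeks yield no value."""
    return [
        median(weekly_values[i - window + 1 : i + 1])
        for i in range(window - 1, len(weekly_values))
    ]


def trend(series, lower_is_better=True, tolerance=0.1):
    """Compare the latest rolling value with the previous one."""
    if len(series) < 2:
        return "stable"
    previous, latest = series[-2], series[-1]
    change = latest - previous
    if abs(change) <= tolerance * abs(previous):
        return "stable"
    got_better = change < 0 if lower_is_better else change > 0
    return "improving" if got_better else "declining"


weekly_lead_time_hours = [30, 28, 35, 26, 22, 20, 18]
rolled = rolling_median(weekly_lead_time_hours)
print(rolled)         # [29.0, 27.0, 24.0, 21.0]
print(trend(rolled))  # improving
```

For deployment frequency, where higher is better, pass `lower_is_better=False`.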

For leadership (monthly)

Show progress against targets with context:

Current performance tier (Elite/High/Medium/Low) for each metric
Quarter-over-quarter trend
Key investments made (pipeline automation, testing, observability)
Key blockers to further improvement (organisational, not just technical)

What NOT to report

Individual developer metrics (commits per day, PRs per week)
Team-vs-team comparisons
Metrics without context ("CFR went up" without explaining the new feature launch)
Targets without investment ("deploy daily" without funding pipeline automation)

The Delivery Manager's Role

The Delivery Manager owns the delivery system, not just the delivery output. DORA metrics are the diagnostic tool for that system:

Champion measurement: Ensure the team has the tooling and data to track DORA metrics accurately
Facilitate improvement: Help the team identify bottlenecks and prioritise investments
Protect the team: Prevent metrics from being weaponised by leadership
Connect to business outcomes: Show how DORA improvements translate to faster time-to-market, fewer incidents, and happier customers
Report honestly: Present metrics with context, including what's not yet measured

---

Download the [Delivery Dashboard template](/templates) for a ready-to-use DORA metrics reporting format.
