Example — MTTR Calculator (SRE)

MTTR Calculation for Q1 Production Incidents: Mean and Median Analysis

Calculate Mean Time to Recover (MTTR) from Q1 production incident data. Compare mean vs median MTTR, identify outliers, and benchmark against DORA elite performance tiers.

Worked example

Input
Q1 Production Incidents:
Incident 1: 15 minutes (cache restart)
Incident 2: 8 minutes (auto-rollback)
Incident 3: 180 minutes (database corruption)
Incident 4: 12 minutes (config change revert)
Incident 5: 25 minutes (dependency restart)
Incident 6: 45 minutes (third-party API issue)
Total: 6 incidents
Output
MTTR Analysis: Q1 Production Incidents

Incident durations (minutes): 15, 8, 180, 12, 25, 45
MTTR (mean): 47.5 minutes
MTTR (median): 20 minutes ← more representative
Total downtime: 285 minutes (4.75 hours)

Distribution:
< 30 min: 4 incidents (67%), fast recovery
30-60 min: 1 incident (17%), moderate
> 60 min: 1 incident (17%), slow recovery (outlier: 180 min)

Outlier analysis:
Incident 3 (180 min database corruption) contributed 63% of total downtime
Removing this outlier: MTTR (mean) = 21 minutes

DORA tier comparison:
Elite: < 1 hour MTTR ← you qualify (median: 20 min)
High: < 1 day MTTR
Medium: < 1 week MTTR
Low: > 1 week MTTR

Focus area: Database corruption recovery (incident 3)
180-minute MTTR suggests missing: automated backup restore, runbook, or point-in-time recovery
The mean MTTR (47.5 min) is heavily skewed by the 180-minute database incident: five of the six incidents recovered in 45 minutes or less. Median MTTR (20 min) better represents typical recovery performance. The DORA Elite threshold used here is MTTR under 1 hour, and this team qualifies on both the mean and the median. The priority action is improving recovery from the database corruption class of incidents, which alone consumed 63% of Q1 downtime.
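The figures above can be reproduced with a few lines of Python. This is a minimal sketch, not the calculator's actual implementation: the helper names `bucket` and `dora_tier` are hypothetical, and the handling of values that fall exactly on a boundary (for example, exactly 60 minutes) is an assumption, since the example does not specify it.

```python
from collections import Counter
from statistics import mean, median

# Incident durations in minutes, taken from the worked example.
durations = [15, 8, 180, 12, 25, 45]

total_downtime = sum(durations)
mttr_mean = mean(durations)
mttr_median = median(durations)

# Share of total downtime caused by the single longest incident.
longest = max(durations)
outlier_share = longest / total_downtime

# Mean recomputed with the longest incident excluded.
remaining = list(durations)
remaining.remove(longest)
mean_without_outlier = mean(remaining)


def bucket(minutes):
    """Assign a duration to a distribution bucket (boundary handling is assumed)."""
    if minutes < 30:
        return "< 30 min"
    if minutes <= 60:
        return "30-60 min"
    return "> 60 min"


def dora_tier(mttr_minutes):
    """Map an MTTR in minutes to the tier thresholds listed in the example."""
    if mttr_minutes < 60:
        return "Elite"
    if mttr_minutes < 24 * 60:
        return "High"
    if mttr_minutes < 7 * 24 * 60:
        return "Medium"
    return "Low"  # assumption: exactly one week is treated as Low


distribution = Counter(bucket(d) for d in durations)

print(f"Mean: {mttr_mean} min, median: {mttr_median} min")
print(f"Total downtime: {total_downtime} min")
print(f"Longest incident share: {outlier_share:.0%}")
print(f"Mean without longest incident: {mean_without_outlier} min")
print(f"Distribution: {dict(distribution)}")
print(f"Tier by mean: {dora_tier(mttr_mean)}, by median: {dora_tier(mttr_median)}")
```

Reporting the tier for both the mean and the median makes it visible when a single outlier would change the classification; here it does not.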
What to do next

Run a post-mortem on the 180-minute database incident specifically: what would have made recovery faster? Automated PITR (Point-In-Time Recovery), a tested database recovery runbook, and a recovery fire drill would likely cut that incident to 30-45 minutes. Set a goal: no single incident > 60 minutes in Q2.

Use the MTTR Calculator (SRE) to run this on your own input.

Frequently asked questions

Should MTTR include detection time or just recovery time?

MTTR (Mean Time to Recover/Resolve) traditionally measures from incident detection (alert firing) to resolution. Some organizations instead measure from incident start (when the issue actually began), which adds the detection gap to every incident. The second approach produces higher MTTR values but better reflects customer impact. Document your definition clearly, because inconsistent definitions between teams make MTTR comparisons meaningless.
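To make the difference between the two definitions concrete, here is a small sketch that computes both from the same records. The `Incident` type, its field names, and the timestamps are all hypothetical and invented for illustration; they do not come from the Q1 data above.

```python
from datetime import datetime
from statistics import mean
from typing import NamedTuple


class Incident(NamedTuple):
    started: datetime   # when the issue actually began
    detected: datetime  # when the alert fired
    resolved: datetime  # when service was restored


# Hypothetical incidents, invented purely for illustration.
incidents = [
    Incident(datetime(2024, 1, 10, 9, 0),
             datetime(2024, 1, 10, 9, 5),
             datetime(2024, 1, 10, 9, 20)),
    Incident(datetime(2024, 2, 3, 14, 0),
             datetime(2024, 2, 3, 14, 30),
             datetime(2024, 2, 3, 15, 0)),
]


def minutes_between(earlier, later):
    """Elapsed time between two timestamps, in minutes."""
    return (later - earlier).total_seconds() / 60


# Definition 1: detection (alert firing) to resolution.
mttr_from_detection = mean(
    minutes_between(i.detected, i.resolved) for i in incidents
)

# Definition 2: incident start to resolution, including the detection gap.
mttr_from_start = mean(
    minutes_between(i.started, i.resolved) for i in incidents
)

# The gap between the two definitions is the mean time to detect.
mean_detection_gap = mttr_from_start - mttr_from_detection

print(f"MTTR from detection: {mttr_from_detection} min")
print(f"MTTR from start:     {mttr_from_start} min")
print(f"Mean detection gap:  {mean_detection_gap} min")
```

With these invented timestamps the two definitions differ by the mean detection gap, which is why a team reporting one figure cannot be compared directly with a team reporting the other.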

What is the relationship between MTTR and deployment frequency?

DORA research shows that high-performing teams combine high deployment frequency with low MTTR; the two are correlated. Teams deploying frequently get better at incident response (more practice), have smaller change sets to diagnose, and have faster rollback paths. Teams that deploy infrequently ship larger changes with more complex failure modes, resulting in longer MTTR. Improving deployment frequency is therefore often one of the highest-leverage investments for reducing MTTR.

Calculated using the MTTR Calculator (SRE)