Example — Toil Calculator
SRE Toil Calculation: ROI of Automating Manual Deployments
Calculate the toil burden of manual deployments and the ROI of automating them. Applies the Google SRE toil definition to quantify time cost and payback period for deployment automation.
SRE Toil Analysis — Manual Deployments
Toil burden:
15 deployments/week × 45 min = 675 min/week = 11.25 hrs/week
Annual: 11.25 × 52 = 585 hours/year
Cost: 585 × $200/hr = $117,000/year
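The toil burden arithmetic above can be reproduced with a minimal sketch like the following. The function names are hypothetical, and the 52-week year and $200/hr rate are simply the figures used in this example; substitute your own.

```python
def weekly_toil_hours(tasks_per_week: float, minutes_per_task: float) -> float:
    """Hours of manual work per week for one recurring task."""
    return tasks_per_week * minutes_per_task / 60


def annual_toil(weekly_hours: float, hourly_rate: float, weeks_per_year: int = 52):
    """Return (annual hours, annual cost) for a weekly toil load."""
    annual_hours = weekly_hours * weeks_per_year
    return annual_hours, annual_hours * hourly_rate


# Figures from this example: 15 deployments/week, 45 minutes each, $200/hr.
weekly = weekly_toil_hours(15, 45)
annual_hours, annual_cost = annual_toil(weekly, 200)
print(f"{weekly} hrs/week, {annual_hours} hrs/year, ${annual_cost:,.0f}/year")
```

Keeping the rate and the weeks-per-year as parameters makes it easy to test how sensitive the annual cost is to either assumption.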
Toil characteristics (all met = qualifies as toil):
✓ Manual (requires human execution)
✓ Repetitive (same 12 steps every time)
✓ No enduring value (each deployment is ephemeral)
✓ Scales with volume (more services = more toil)
✓ Automatable
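The checklist above treats a task as toil only when every criterion is met. A small sketch of that rule, using hypothetical class and field names that mirror the five checks on this page:

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class ToilCriteria:
    """One boolean per criterion in the checklist (names are illustrative)."""
    manual: bool
    repetitive: bool
    no_enduring_value: bool
    scales_with_volume: bool
    automatable: bool

    def qualifies_as_toil(self) -> bool:
        # All criteria must hold, matching "all met = qualifies as toil".
        return all(getattr(self, f.name) for f in fields(self))

    def unmet(self) -> list[str]:
        # Names of the criteria that are not satisfied.
        return [f.name for f in fields(self) if not getattr(self, f.name)]


manual_deployment = ToilCriteria(True, True, True, True, True)
print(manual_deployment.qualifies_as_toil())
```

Listing the unmet criteria is useful for borderline tasks: work that fails only one check may still deserve attention even though it does not qualify under this rule.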
Automation ROI:
Build effort: 80 hours × $200/hr = $16,000
Annual savings: $117,000
Payback period: 80 hrs / 11.25 hrs/wk = 7.1 weeks
3-year net savings: $117,000 × 3 - $16,000 = $335,000 (about 21× the build cost)
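The ROI figures above follow from three inputs: build hours, weekly toil hours, and hourly rate. The sketch below is one way to compute them; the class and method names are hypothetical. Like the example, it assumes automation removes all 11.25 hours of weekly manual work, so any residual human time would lengthen the payback.

```python
from dataclasses import dataclass


@dataclass
class AutomationCase:
    build_hours: float        # one-time engineering effort
    weekly_toil_hours: float  # manual hours removed per week (assumed fully removed)
    hourly_rate: float
    weeks_per_year: int = 52

    @property
    def build_cost(self) -> float:
        return self.build_hours * self.hourly_rate

    @property
    def annual_savings(self) -> float:
        return self.weekly_toil_hours * self.weeks_per_year * self.hourly_rate

    @property
    def payback_weeks(self) -> float:
        # Weeks of avoided toil needed to repay the build effort.
        return self.build_hours / self.weekly_toil_hours

    def net_savings(self, years: float) -> float:
        return self.annual_savings * years - self.build_cost

    def roi_multiple(self, years: float) -> float:
        # Net savings expressed as a multiple of the build cost.
        return self.net_savings(years) / self.build_cost


case = AutomationCase(build_hours=80, weekly_toil_hours=11.25, hourly_rate=200)
print(f"Payback: {case.payback_weeks:.1f} weeks")
print(f"3-year net savings: ${case.net_savings(3):,.0f}")
```

Because the hourly rate appears in both cost and savings, the payback period depends only on the two hour figures; the rate matters only for the dollar totals.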
Bonus benefits (unquantified):
Reduced deployment errors (fewer manual steps where human error can occur)
Faster deployments (45 min → ~5 min automated)
DORA deployment frequency improvement
Google's SRE book defines toil as work that is manual, repetitive, automatable, scales with service load, and provides no enduring value (it also lists "tactical", meaning interrupt-driven, which is not scored here). Manual deployments meet all five criteria checked above. With a 7.1-week payback period, deployment automation is usually one of the higher-ROI reliability investments available to a team. In this example the 45-minute manual process is estimated to shrink to about 5 minutes once automated, which makes more frequent deployments practical and tends to lower the change failure rate. Both are DORA metrics.
What to do next
Automate the 12-step deployment with a GitHub Actions or ArgoCD workflow. Start with the most error-prone steps, such as log verification and smoke tests, because they improve reliability as well as saving time. Aim to finish within 6 weeks: the 80-hour build effort is recovered about 7.1 weeks after the automation replaces the manual process, and every week of delay costs another 11.25 hours of toil.
Use the Toil Calculator to run this on your own input.
Does the 50% toil cap apply to individual engineers or team average?
The Google SRE 50% toil ceiling applies to each individual engineer's time, not only to the team average. An average can hide cases where one engineer carries a disproportionate share of the toil (the "toil magnet" pattern). Track toil per person and redistribute work if any engineer exceeds 50%. Concentrated toil leads to burnout and makes that engineer a single point of failure for operational knowledge.
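To illustrate how a team average can hide a toil magnet, here is a minimal sketch. The engineer names and hours are invented for illustration, and the helper names are hypothetical; only the 50% ceiling comes from the text above.

```python
TOIL_CEILING = 0.50  # the 50% ceiling discussed above


def toil_fractions(hours: dict[str, tuple[float, float]]) -> dict[str, float]:
    """Map engineer -> toil hours divided by total hours worked."""
    return {name: toil / total for name, (toil, total) in hours.items()}


def over_ceiling(fractions: dict[str, float], ceiling: float = TOIL_CEILING) -> list[str]:
    """Engineers whose individual toil share exceeds the ceiling."""
    return sorted(name for name, frac in fractions.items() if frac > ceiling)


# Hypothetical week: (toil hours, total hours) per engineer.
week = {"alice": (30, 40), "bob": (8, 40), "carol": (10, 40)}

fractions = toil_fractions(week)
team_average = sum(t for t, _ in week.values()) / sum(h for _, h in week.values())

print(f"Team average: {team_average:.0%}")          # under the ceiling
print(f"Over ceiling: {over_ceiling(fractions)}")   # individual breach
```

In this made-up data the team average is below 50% while one engineer is well above it, which is the situation a per-person check is meant to catch.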
Is on-call work considered toil?
Responding to pages and investigating alerts is toil if the alerts are false positives or the response procedure is repetitive. Genuinely novel incident investigation is not toil — it's engineering work that produces learnings and improvements. The distinction matters because toil should be eliminated or automated, while novel incident work should be documented in post-mortems to prevent recurrence. Alert fatigue (paging for known non-issues) is one of the most impactful forms of toil to eliminate.