Explain MTBF and MTTR.

Updated Apr 28, 2026

Short answer

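MTBF (Mean Time Between Failures) measures reliability: the average operating time between one failure and the next. MTTR (Mean Time To Repair) measures maintainability: the average time needed to restore service after a failure. Together they determine availability.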

Deep explanation

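Availability and reliability are cornerstones of production-grade systems. MTBF is the average time a system runs between failures, so a higher MTBF means a more reliable system. MTTR is the average time it takes to restore service after a failure, so a lower MTTR means a more maintainable system. Steady-state availability combines the two: availability = MTBF / (MTBF + MTTR). High uptime therefore requires both preventing failures (raising MTBF) and shortening recovery (lowering MTTR).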
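
A minimal Python sketch of how these quantities relate, assuming MTBF is measured as operating time between failures and availability is the steady-state ratio MTBF / (MTBF + MTTR). The function names and the incident figures are hypothetical, chosen only for illustration.

```python
def mtbf(total_uptime_hours: float, failure_count: int) -> float:
    """Mean Time Between Failures: total operating time / number of failures."""
    if failure_count <= 0:
        raise ValueError("need at least one failure to estimate MTBF")
    return total_uptime_hours / failure_count


def mttr(total_repair_hours: float, failure_count: int) -> float:
    """Mean Time To Repair: total repair time / number of failures."""
    if failure_count <= 0:
        raise ValueError("need at least one failure to estimate MTTR")
    return total_repair_hours / failure_count


def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability as a fraction between 0 and 1."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


# Hypothetical period: 4 failures, 996 hours running, 4 hours spent on repairs.
between = mtbf(996.0, 4)   # 249 hours between failures on average
repair = mttr(4.0, 4)      # 1 hour to repair on average
print(f"MTBF={between:.1f}h MTTR={repair:.1f}h "
      f"availability={availability(between, repair):.3%}")
```

With these numbers, halving MTTR improves availability about as much as doubling MTBF, which is why recovery speed matters as much as failure prevention.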

Real-world example

A website stays up during a traffic spike because it runs on multiple servers: if one server fails, the others keep serving requests while it is repaired or replaced.
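
The effect of running multiple servers can be sketched with the standard parallel-redundancy estimate. This is a rough illustration under two loud assumptions: servers fail independently, and any single surviving server can carry the full load (which may not hold during a real traffic spike). The function name and the 99% figure are hypothetical.

```python
def parallel_availability(single: float, replicas: int) -> float:
    """Availability of a group that is up as long as at least one replica is up.

    Assumes independent failures; `single` is one replica's availability (0..1).
    """
    if not 0.0 <= single <= 1.0:
        raise ValueError("single must be between 0 and 1")
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    # The group is down only when every replica is down at the same time.
    return 1.0 - (1.0 - single) ** replicas


# Hypothetical servers that are each available 99% of the time.
for n in (1, 2, 3):
    print(n, f"{parallel_availability(0.99, n):.6f}")
```

Correlated failures (a shared database, a bad deploy, a common power feed) make real systems less available than this estimate suggests.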

Common mistakes

  • Assuming a system is reliable just because it is available (e.g., it's 'up' but returns errors).

Follow-up questions

  • How many minutes of downtime is 99.99% availability?
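
The downtime question is simple arithmetic: multiply the length of the period by the fraction of time the system is allowed to be down. A small sketch, assuming a 365-day year and a hypothetical helper name:

```python
def allowed_downtime_minutes(availability_pct: float, period_days: float = 365.0) -> float:
    """Minutes of downtime permitted over the period at the given availability."""
    if not 0.0 <= availability_pct <= 100.0:
        raise ValueError("availability_pct must be between 0 and 100")
    total_minutes = period_days * 24 * 60
    return total_minutes * (1.0 - availability_pct / 100.0)


# 99.99% over a 365-day year: 525,600 minutes * 0.0001, about 52.56 minutes.
print(f"{allowed_downtime_minutes(99.99):.2f} minutes per year")
# The same target over a 30-day month: about 4.32 minutes.
print(f"{allowed_downtime_minutes(99.99, period_days=30):.2f} minutes per month")
```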
