Explain Chaos Engineering basics.

Updated Apr 28, 2026

Short answer

Chaos Engineering is the discipline of experimenting on a software system in production in order to build confidence in its capability to withstand turbulent conditions.

Deep explanation

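At an intermediate level, reliability engineering involves handling failures in distributed systems and defining metrics that describe healthy behavior. Chaos Engineering applies both: a team defines a measurable steady state, deliberately injects a failure such as added latency or a terminated instance, and checks whether the steady state still holds. Experiments usually start with a small blast radius so that an unexpected result stays contained.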
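The experiment loop can be sketched in a few lines of Python. This is a minimal simulation under stated assumptions, not a real chaos tool: `FlakyService`, `success_rate`, and `run_experiment` are hypothetical names, the fault is a randomly raised `ConnectionError`, and the steady-state metric is assumed to be the request success rate with a 99% threshold.

```python
import random


class FlakyService:
    """Hypothetical stand-in for a real dependency (illustration only)."""

    def __init__(self, failure_rate=0.0, seed=0):
        self.failure_rate = failure_rate
        self._rng = random.Random(seed)  # seeded so runs are repeatable

    def call(self):
        # Fault injection: fail a fraction of calls on purpose.
        if self._rng.random() < self.failure_rate:
            raise ConnectionError("injected fault")
        return "ok"


def success_rate(service, requests=1000, retries=0):
    """Steady-state metric: fraction of requests that eventually succeed."""
    succeeded = 0
    for _ in range(requests):
        for _attempt in range(retries + 1):
            try:
                service.call()
                succeeded += 1
                break
            except ConnectionError:
                continue
    return succeeded / requests


def run_experiment(failure_rate, retries, threshold=0.99):
    """Measure the metric without faults, then with faults, and compare."""
    baseline = success_rate(FlakyService(0.0), retries=retries)
    under_fault = success_rate(FlakyService(failure_rate), retries=retries)
    return {
        "baseline": baseline,
        "under_fault": under_fault,
        "hypothesis_holds": under_fault >= threshold,
    }


if __name__ == "__main__":
    print(run_experiment(failure_rate=0.3, retries=0))
    print(run_experiment(failure_rate=0.3, retries=5))
```

In this sketch the hypothesis fails when the client does not retry and holds once it retries a few times, which is the kind of finding an experiment is meant to surface before a real outage does.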

Real-world example

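A mobile app retries its connection to a server when the signal is weak. A chaos experiment can simulate the weak signal by injecting latency or packet loss, then confirm that the app recovers as expected.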

Common mistakes

  • Retrying indefinitely without a cap or backoff, which can overwhelm the server with a flood of requests just as it comes back up.
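One common way to avoid this mistake is to cap the number of attempts and wait longer between each one, with random jitter so that many clients do not retry in lockstep. The sketch below is illustrative only: `retry_with_backoff` and its parameters are hypothetical names, and it assumes the operation signals a transient failure by raising `ConnectionError`.

```python
import random
import time


def retry_with_backoff(operation, max_attempts=5, base_delay=0.1,
                       max_delay=5.0, sleep=time.sleep, rng=random.random):
    """Call operation() up to max_attempts times with capped, jittered backoff.

    sleep and rng are injectable so the function can be tested without waiting.
    """
    for attempt in range(max_attempts):
        try:
            return operation()
        except ConnectionError:
            if attempt == max_attempts - 1:
                raise  # cap reached: give up instead of retrying forever
            # Exponential backoff, bounded by max_delay.
            cap = min(max_delay, base_delay * 2 ** attempt)
            # Full jitter: wait a random time between 0 and the cap.
            sleep(rng() * cap)
```

A caller would pass the connection attempt as `operation`, for example `retry_with_backoff(connect)`, where `connect` is whatever function the app uses to reach the server.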

Follow-up questions

  • What are the three states of a Circuit Breaker?
