What is batch normalization and why is it used?

Updated May 17, 2026

Short answer

Batch normalization stabilizes and accelerates training by normalizing layer inputs.

Deep explanation

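It normalizes each feature's activations using the mean and variance computed over the current mini-batch, then scales and shifts the result with learnable parameters (commonly written gamma and beta), so the layer can still represent whatever mean and scale the network needs. The original motivation was to reduce internal covariate shift, the change in a layer's input distribution as earlier layers update. Later work has questioned that explanation and attributes much of the benefit to a smoother optimization landscape. In practice, batch normalization allows higher learning rates and makes training less sensitive to weight initialization.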
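
The training-time computation described above can be sketched in plain Python. This is a minimal illustration, not how any framework implements it: the function name batch_norm_train, the list-of-lists layout, and the eps default of 1e-5 are assumptions made for the example, and real libraries vectorize this and learn gamma and beta by gradient descent.

```python
import math

def batch_norm_train(batch, gamma, beta, eps=1e-5):
    """Normalize each feature over the batch, then scale and shift.

    batch: list of samples, each a list of feature values.
    gamma, beta: per-feature scale and shift (learnable in a real network).
    eps: small constant that avoids division by zero (assumed default).
    """
    n = len(batch)
    num_features = len(batch[0])
    out = [[0.0] * num_features for _ in range(n)]
    for j in range(num_features):
        column = [sample[j] for sample in batch]
        mean = sum(column) / n
        # Biased variance (divide by n), computed over the mini-batch.
        var = sum((x - mean) ** 2 for x in column) / n
        for i, x in enumerate(column):
            x_hat = (x - mean) / math.sqrt(var + eps)  # zero mean, ~unit variance
            out[i][j] = gamma[j] * x_hat + beta[j]     # learnable scale and shift
    return out

# Two features on very different scales end up on a comparable scale.
batch = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
normalized = batch_norm_train(batch, gamma=[1.0, 1.0], beta=[0.0, 0.0])
```

With gamma set to 1 and beta set to 0, each output column has a mean of approximately zero and a variance of approximately one. Other values of gamma and beta move the output to a different scale and mean.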

Real-world example

ResNet applies batch normalization after each convolution and before the activation, which speeds up convergence when training deep image classifiers.

Common mistakes

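Leaving the model in training mode at inference time (forgetting to switch to evaluation mode), so the layer normalizes with the statistics of the current batch instead of the running averages collected during training. Predictions then depend on what else is in the batch and become unreliable for small batches.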
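
The effect of this mistake can be seen with a toy single-feature layer. This is a sketch under stated assumptions: the class ToyBatchNorm, its training flag, and the momentum value of 0.1 are hypothetical names and defaults chosen for illustration, and the running variance here uses the biased estimate for simplicity, which may differ from what a given framework tracks.

```python
import math

class ToyBatchNorm:
    """Toy batch norm for one scalar feature (hypothetical helper, no gamma/beta)."""

    def __init__(self, momentum=0.1, eps=1e-5):
        self.momentum = momentum
        self.eps = eps
        self.running_mean = 0.0
        self.running_var = 1.0
        self.training = True

    def __call__(self, values):
        if self.training:
            # Training mode: use the statistics of the current batch...
            mean = sum(values) / len(values)
            var = sum((v - mean) ** 2 for v in values) / len(values)
            # ...and update the running averages for use at inference time.
            self.running_mean += self.momentum * (mean - self.running_mean)
            self.running_var += self.momentum * (var - self.running_var)
        else:
            # Evaluation mode: use the running averages, ignore the batch.
            mean, var = self.running_mean, self.running_var
        return [(v - mean) / math.sqrt(var + self.eps) for v in values]

bn = ToyBatchNorm()
for _ in range(200):
    bn([4.0, 6.0, 8.0])      # training batches with mean 6

bn.training = False
right = bn([8.0])            # evaluation mode: normalized against running stats

bn.training = True
wrong = bn([8.0])            # training mode on a batch of one: mean equals the input
```

In evaluation mode the input 8.0 maps to a positive value, because it sits above the running mean of about 6. In training mode the same single-sample batch is normalized against itself and collapses to 0.0, and the call also pulls the running averages toward that one sample.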

Follow-up questions

What is internal covariate shift?
Why does batch norm act as a regularizer?
