What is batch normalization and why is it used?

Updated May 17, 2026

Short answer

Batch normalization stabilizes and accelerates training by normalizing each layer's inputs over the current mini-batch.

Deep explanation

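For each feature (or channel), it computes the mean and variance over the current mini-batch, normalizes the activations to roughly zero mean and unit variance, then scales and shifts the result with two learnable parameters (commonly written γ and β). The original paper attributed the benefit to reduced internal covariate shift; later work argues that the main effect is a smoother optimization landscape. Under either explanation, training tolerates higher learning rates and is less sensitive to initialization.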
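
The training-time computation can be sketched in a few lines of NumPy. This is a minimal illustration, not a framework implementation: the function name batch_norm_train, the (batch, features) input layout, and the eps value of 1e-5 are assumptions made for the example.

```python
import numpy as np

def batch_norm_train(x, gamma, beta, eps=1e-5):
    """Normalize a (batch, features) array per feature, then scale and shift.

    Hypothetical helper for illustration; eps guards against division by zero.
    """
    mean = x.mean(axis=0)                    # per-feature mean over the mini-batch
    var = x.var(axis=0)                      # per-feature (biased) variance over the mini-batch
    x_hat = (x - mean) / np.sqrt(var + eps)  # roughly zero mean, unit variance
    return gamma * x_hat + beta, mean, var   # learnable scale (gamma) and shift (beta)

# Toy input: 64 samples, 4 features, deliberately off-center and wide.
rng = np.random.default_rng(0)
x = rng.normal(loc=5.0, scale=3.0, size=(64, 4))
gamma = np.ones(4)   # identity scale, so the output is just the normalized input
beta = np.zeros(4)   # zero shift
y, batch_mean, batch_var = batch_norm_train(x, gamma, beta)
```

With gamma set to ones and beta to zeros, each column of y has mean close to 0 and standard deviation close to 1, whatever the scale of the input. During training, gamma and beta are updated by gradient descent like any other weights, so the network can undo the normalization where that helps.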

Real-world example

ResNet applies batch normalization after each convolution and before the activation, which speeds up convergence when training deep image classifiers.

Common mistakes

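  • Normalizing with batch statistics at inference time instead of the running averages collected during training, usually because the model was never switched to evaluation mode (for example, by calling model.eval() in PyTorch).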
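
The train/eval distinction behind this mistake can be sketched as follows. The class BatchNorm1D, its training flag, and the momentum value of 0.1 are assumptions made for illustration; real frameworks differ in details such as using the unbiased variance for the running estimate.

```python
import numpy as np

class BatchNorm1D:
    """Illustrative batch norm layer with separate training and inference paths."""

    def __init__(self, num_features, momentum=0.1, eps=1e-5):
        self.gamma = np.ones(num_features)
        self.beta = np.zeros(num_features)
        self.running_mean = np.zeros(num_features)
        self.running_var = np.ones(num_features)
        self.momentum = momentum
        self.eps = eps
        self.training = True

    def __call__(self, x):
        if self.training:
            # Training: use statistics of the current mini-batch and
            # update the running averages for later use at inference.
            mean, var = x.mean(axis=0), x.var(axis=0)
            self.running_mean = (1 - self.momentum) * self.running_mean + self.momentum * mean
            self.running_var = (1 - self.momentum) * self.running_var + self.momentum * var
        else:
            # Inference: use the stored running averages, so the output
            # does not depend on what else is in the batch.
            mean, var = self.running_mean, self.running_var
        x_hat = (x - mean) / np.sqrt(var + self.eps)
        return self.gamma * x_hat + self.beta

rng = np.random.default_rng(0)
bn = BatchNorm1D(num_features=3)
for _ in range(200):  # simulate training batches with mean 2 and std 4
    bn(rng.normal(loc=2.0, scale=4.0, size=(32, 3)))

single = np.full((1, 3), 10.0)  # one sample at inference time

bn.training = False
eval_out = bn(single)   # normalized against running stats: about (10 - 2) / 4

bn.training = True      # the mistake: layer left in training mode
train_out = bn(single)  # a batch of one has zero variance, so the output collapses to beta
```

In evaluation mode the single sample maps to a value near 2, reflecting how far it sits from the training distribution. Left in training mode, the same sample is normalized against itself and maps to 0, and the call also pollutes the running averages with that sample's statistics.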

Follow-up questions

  • What is internal covariate shift?
  • Why does batch norm act as a regularizer?
