What is the mini-batch noise-stability trade-off?

Updated May 16, 2026

Short answer

It is the trade-off, controlled by batch size, between noisy gradient estimates and stable convergence in mini-batch gradient descent.

Deep explanation

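Each mini-batch gradient is an estimate of the full-dataset gradient. Smaller batches make that estimate noisier: the noise can help the optimizer explore the loss surface, but it makes individual updates less stable. Larger batches average over more examples, so updates are steadier, but each step costs more computation and memory, and the resulting model may generalize worse.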
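To make the noise side of the trade-off concrete, here is a minimal sketch that measures how much a mini-batch gradient fluctuates at a fixed parameter value. The 1-D least-squares problem, the synthetic data, and the function name minibatch_gradient_std are all assumptions chosen for illustration, not part of any particular library.

```python
import numpy as np

def minibatch_gradient_std(batch_size, n_samples=10_000, n_trials=500, seed=0):
    """Empirical standard deviation of the mini-batch gradient.

    Hypothetical toy setup: loss = 0.5 * mean((w * x - y) ** 2) on
    synthetic data, with the gradient evaluated at a fixed w.
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(size=n_samples)
    y = 3.0 * x + rng.normal(scale=0.5, size=n_samples)  # true slope is 3
    w = 0.0  # fixed parameter value at which gradients are compared

    grads = np.empty(n_trials)
    for t in range(n_trials):
        # Draw one mini-batch without replacement.
        idx = rng.choice(n_samples, size=batch_size, replace=False)
        # Gradient of the least-squares loss with respect to w on this batch.
        grads[t] = np.mean((w * x[idx] - y[idx]) * x[idx])
    return grads.std()

# Spread of the gradient estimate for three common batch sizes.
stds = {b: minibatch_gradient_std(b) for b in (32, 64, 128)}
for b, s in stds.items():
    print(f"batch size {b:>3}: gradient std = {s:.3f}")
```

With independently sampled examples, the spread should shrink roughly in proportion to 1/sqrt(batch size), so quadrupling the batch size about halves the noise. That is why very large batches give diminishing returns in stability per unit of computation.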

Real-world example

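Deep learning models are commonly trained with batch sizes such as 32, 64, or 128. The value chosen determines how noisy each update is and how much computation and memory each step needs.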
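As a rough illustration of how batch sizes like these affect stability, the sketch below runs mini-batch SGD with a constant learning rate on a toy 1-D regression and reports how much the parameter keeps jittering once it has reached the optimum. The problem, the hyperparameters, and the helper name sgd_tail_spread are assumptions made for this example; real deep learning runs will behave less cleanly.

```python
import numpy as np

def sgd_tail_spread(batch_size, lr=0.05, n_steps=4000, seed=0):
    """Run mini-batch SGD and summarize the second half of the trajectory.

    Hypothetical toy setup: fit y = w * x by least squares on synthetic
    data. Returns (mean, std) of w over the last n_steps // 2 updates.
    """
    rng = np.random.default_rng(seed)
    x = rng.normal(size=5000)
    y = 3.0 * x + rng.normal(scale=0.5, size=5000)  # true slope is 3

    w = 0.0
    history = np.empty(n_steps)
    for step in range(n_steps):
        # Sample a mini-batch of indices (with replacement, for simplicity).
        idx = rng.integers(0, x.size, size=batch_size)
        grad = np.mean((w * x[idx] - y[idx]) * x[idx])
        w -= lr * grad
        history[step] = w

    tail = history[n_steps // 2:]  # discard the initial approach to the optimum
    return tail.mean(), tail.std()

results = {b: sgd_tail_spread(b) for b in (32, 64, 128)}
for b, (mean_w, std_w) in results.items():
    print(f"batch size {b:>3}: mean w = {mean_w:.3f}, jitter (std) = {std_w:.4f}")
```

In this sketch every batch size settles near the same slope, but the smaller batch keeps fluctuating more around it. Lowering the learning rate is the usual way to trade some of that jitter for slower progress.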

Common mistakes

  • Assuming that a larger batch size always improves performance.

Follow-up questions

  • What is the large-batch training issue?
  • Why does gradient noise help?
