What is Batch Size in Deep Learning and how does it affect training stability and generalization?

Updated May 16, 2026

Short answer

Batch size is the number of training samples processed before updating model weights, and it significantly impacts convergence speed, stability, and generalization.

Deep explanation

Batch size is a core hyperparameter in deep learning that controls how many samples are used to compute gradients before updating model weights.

Core idea: Instead of updating weights after every sample, gradients are averaged over a batch and the weights are updated once per batch.
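The core idea can be sketched in a few lines of plain Python. This is a minimal illustration, not a production training loop: the one-parameter model `y = w * x`, the toy dataset, and the names `batch_gradient` and `train` are all assumptions made for the example.

```python
import random

def batch_gradient(w, batch):
    """Mean gradient of the squared error 0.5 * (w*x - y)**2 over one batch."""
    grads = [(w * x - y) * x for x, y in batch]
    return sum(grads) / len(grads)

def train(data, batch_size, lr=0.5, epochs=50, seed=0):
    """Mini-batch SGD on a hypothetical one-parameter model y = w * x."""
    rng = random.Random(seed)
    data = list(data)  # copy so the caller's list is not shuffled
    w = 0.0
    updates = 0
    for _ in range(epochs):
        rng.shuffle(data)
        for start in range(0, len(data), batch_size):
            batch = data[start:start + batch_size]
            w -= lr * batch_gradient(w, batch)  # one weight update per batch
            updates += 1
    return w, updates

# Toy data generated from y = 3 * x (an assumption for illustration).
toy = [(i / 10, 3 * i / 10) for i in range(1, 11)]
w, updates = train(toy, batch_size=5)
```

With 10 samples and `batch_size=5`, each epoch performs two updates; with `batch_size=1` it would perform ten, each based on a single sample's gradient.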

Types:

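  1. Small batch size:
  • Noisy gradient estimates, since each update averages over few samples.
  • Often better generalization; the noise is commonly thought to act as a mild regularizer.
  • Slower training in wall-clock terms, because more updates are needed per epoch and hardware parallelism is underused.
  2. Large batch size:
  • More stable, lower-variance gradient estimates.
  • Faster computation on GPUs, as long as the batch fits in memory.
  • Risk of poorer generalization, especially if the learning rate is not retuned.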
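The "noisy" versus "stable" contrast above can be made concrete by measuring how much the mini-batch mean gradient varies from batch to batch. The sketch below uses synthetic per-sample gradients drawn from a Gaussian; the distribution, the helper name `gradient_spread`, and the batch sizes are assumptions chosen for illustration only.

```python
import random
import statistics

def gradient_spread(per_sample_grads, batch_size, trials=2000, seed=0):
    """Standard deviation of the mini-batch mean gradient across random batches."""
    rng = random.Random(seed)
    means = []
    for _ in range(trials):
        batch = rng.sample(per_sample_grads, batch_size)  # sample without replacement
        means.append(sum(batch) / batch_size)
    return statistics.pstdev(means)

# Synthetic per-sample gradients (hypothetical): mean 1.0, standard deviation 2.0.
rng = random.Random(42)
grads = [rng.gauss(1.0, 2.0) for _ in range(10_000)]

spread = {b: gradient_spread(grads, b) for b in (1, 16, 256)}
for b, s in spread.items():
    print(f"batch_size={b:>3}  spread of mean gradient ~ {s:.3f}")
```

Under these assumptions the spread shrinks roughly in proportion to one over the square root of the batch size, so a 16 times larger batch gives about a 4 times less noisy gradient estimate.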

Why batch size matters:

  • Affects gradient noise.
  • Impacts convergence behavior.
  • Influences memory usage.…
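Two of these effects reduce to simple arithmetic, sketched below. The helper names and the memory model are assumptions: the estimate covers only a fixed parameter cost plus activations that grow linearly with batch size, and it ignores gradients, optimizer state, and framework overhead.

```python
import math

def updates_per_epoch(num_samples, batch_size, drop_last=False):
    """Number of weight updates in one pass over the data."""
    if drop_last:
        return num_samples // batch_size  # discard the final partial batch
    return math.ceil(num_samples / batch_size)

def approx_training_memory_mb(param_mb, activation_mb_per_sample, batch_size):
    """Rough estimate: fixed parameter cost plus activations scaling with batch size."""
    return param_mb + activation_mb_per_sample * batch_size

# Hypothetical numbers for illustration only.
steps = updates_per_epoch(50_000, 128)
memory = approx_training_memory_mb(param_mb=100, activation_mb_per_sample=2, batch_size=64)
```

Doubling the batch size roughly halves the number of updates per epoch while roughly doubling the activation part of the memory estimate.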
