senior · Keras
Why does increasing batch size sometimes degrade Keras model accuracy?
Updated May 16, 2026
Short answer
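Large batches average the gradient over more examples, so each update is less noisy. That noise acts as an implicit regularizer, so reducing it can hurt generalization, especially when the learning rate and epoch count are left unchanged.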
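The noise claim can be checked numerically without any deep learning library. The sketch below is a toy illustration, not a Keras measurement: it assumes made-up per-example gradients drawn from a Gaussian, and the helper name `minibatch_gradient_std` is hypothetical. It estimates how much the minibatch-mean gradient fluctuates at two batch sizes.

```python
import random
import statistics


def minibatch_gradient_std(per_example_grads, batch_size, n_trials=2000, seed=0):
    """Empirical std of the minibatch-mean gradient over random batches."""
    rng = random.Random(seed)
    means = [
        statistics.fmean(rng.sample(per_example_grads, batch_size))
        for _ in range(n_trials)
    ]
    return statistics.stdev(means)


# Toy per-example gradients (assumed values): true mean 1.0, per-example std 2.0.
data_rng = random.Random(42)
grads = [data_rng.gauss(1.0, 2.0) for _ in range(10_000)]

noise_small = minibatch_gradient_std(grads, batch_size=16)
noise_large = minibatch_gradient_std(grads, batch_size=256)

# The spread of the estimate shrinks roughly like 1 / sqrt(batch_size).
print(f"batch 16:  gradient std ~ {noise_small:.3f}")
print(f"batch 256: gradient std ~ {noise_large:.3f}")
```

Going from 16 to 256 examples per batch is a 16x increase, so the noise should drop by roughly a factor of 4. The gradient direction becomes more accurate, but the random perturbation that small batches add to every step is largely gone.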
Deep explanation
Smaller batches inject stochasticity into every update, which is thought to help the optimizer escape sharp minima and settle in flatter ones that generalize better. Large batches tend to converge to sharper minima, which often lowers test accuracy even though each epoch runs faster.
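One way to see whether this effect shows up on a given problem is to train the same model at two batch sizes with everything else held fixed. The following is a minimal sketch under stated assumptions: the data is synthetic, the architecture and hyperparameters are arbitrary, and the names `updates_per_epoch`, `train_once`, and `compare_batch_sizes` are hypothetical helpers rather than Keras APIs. It assumes the standalone `keras` package and NumPy are installed.

```python
import math


def updates_per_epoch(n_samples, batch_size):
    """Number of optimizer steps taken in one pass over n_samples."""
    return math.ceil(n_samples / batch_size)


def train_once(batch_size, epochs=5, seed=0):
    """Train a small classifier on synthetic data; return final validation accuracy."""
    # Imported lazily so the helper above can be used without Keras installed.
    import numpy as np
    import keras

    keras.utils.set_random_seed(seed)
    rng = np.random.default_rng(seed)
    x = rng.normal(size=(4096, 20)).astype("float32")
    y = (x[:, :5].sum(axis=1) > 0).astype("float32")  # toy binary labels

    model = keras.Sequential([
        keras.Input(shape=(20,)),
        keras.layers.Dense(32, activation="relu"),
        keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(
        optimizer=keras.optimizers.SGD(learning_rate=0.05),  # same rate for every run
        loss="binary_crossentropy",
        metrics=["accuracy"],
    )
    history = model.fit(
        x, y,
        batch_size=batch_size,
        epochs=epochs,
        validation_split=0.2,
        verbose=0,
    )
    return history.history["val_accuracy"][-1]


def compare_batch_sizes(batch_sizes=(32, 1024)):
    """Print steps per epoch and validation accuracy for each batch size."""
    n_train = int(4096 * 0.8)  # samples left after validation_split=0.2
    for bs in batch_sizes:
        steps = updates_per_epoch(n_train, bs)
        print(f"batch_size={bs}: {steps} steps/epoch, val_acc={train_once(bs):.3f}")
```

Calling `compare_batch_sizes()` runs both configurations. Besides the sharp-versus-flat minima effect, the larger batch takes far fewer optimizer steps per epoch at an unchanged learning rate, which on its own can leave the model undertrained, so a fair comparison usually needs more than one seed and a retuned learning rate.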
Real-world example
No real-world example available yet.
Common mistakes
No common mistakes listed yet.
Follow-up questions
No follow-up questions available yet.