What is momentum in Gradient Descent?

Updated May 16, 2026

Short answer

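Momentum accelerates gradient descent by adding an exponentially decaying sum of past gradients (a "velocity") to each update, so steps grow along directions where successive gradients agree.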

Deep explanation

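Momentum smooths updates by combining the current gradient with the previous update. Gradient components that flip sign from step to step largely cancel, which reduces oscillation across the steep walls of a ravine. Components that keep pointing the same way accumulate, which speeds progress along the ravine floor. In one common form, with velocity v, momentum coefficient beta (between 0 and 1), and learning rate lr, each step computes v = beta * v + gradient and then w = w - lr * v.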
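The update described above can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: the function name `gradient_descent`, the toy ravine-shaped loss f(x, y) = 0.5 * (x^2 + 100 * y^2), the starting point, and the values lr = 0.015 and beta = 0.9 are all assumptions chosen for the example.

```python
import numpy as np

def gradient_descent(grad, x0, lr, beta=0.0, steps=100):
    """Gradient descent with optional momentum (beta=0.0 gives the plain method)."""
    x = np.asarray(x0, dtype=float)
    v = np.zeros_like(x)           # velocity: decaying sum of past gradients
    for _ in range(steps):
        v = beta * v + grad(x)     # accumulate the current gradient
        x = x - lr * v             # step along the velocity
    return x

# Hypothetical ravine: gentle slope along x, steep walls along y.
def ravine_grad(p):
    return np.array([1.0, 100.0]) * p

start = [10.0, 1.0]
plain = gradient_descent(ravine_grad, start, lr=0.015, beta=0.0)
with_momentum = gradient_descent(ravine_grad, start, lr=0.015, beta=0.9)

# Distance from the minimum at the origin after the same number of steps.
print("plain:   ", np.linalg.norm(plain))
print("momentum:", np.linalg.norm(with_momentum))
```

With the same learning rate and step count, the momentum run should end much closer to the minimum, because the plain run crawls along the shallow x direction.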

Real-world example

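Training deep neural networks: SGD with momentum often converges faster than plain SGD on the poorly conditioned loss surfaces of deep networks, and most deep learning frameworks expose momentum as an optimizer setting.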

Common mistakes

  • Setting momentum too high (beta very close to 1), so updates overshoot the minimum and keep oscillating, or even diverge.
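One way this mistake shows up can be sketched on a one-dimensional quadratic. The helper `tail_amplitude`, the loss 0.5 * x^2, and the values lr = 0.1 and beta = 0.9 versus 0.99 are assumptions made for illustration only. Real training losses behave less predictably.

```python
def tail_amplitude(beta, lr=0.1, steps=200, tail=50):
    """Largest |x| over the last `tail` steps when minimizing 0.5 * x**2."""
    x, v = 1.0, 0.0
    history = []
    for _ in range(steps):
        v = beta * v + x           # the gradient of 0.5 * x**2 is x
        x = x - lr * v
        history.append(abs(x))
    return max(history[-tail:])

moderate = tail_amplitude(beta=0.9)
too_high = tail_amplitude(beta=0.99)

# The higher setting is still swinging around the minimum late in the run.
print("beta=0.90:", moderate)
print("beta=0.99:", too_high)
```

In this sketch the higher beta does not diverge, but it keeps overshooting the minimum long after the moderate setting has settled. On noisy or non-quadratic losses the same effect can look like an unstable loss curve.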

Follow-up questions

  • What is beta in momentum?
  • How does momentum differ from plain SGD?
