What is Nesterov Accelerated Gradient?

Updated May 16, 2026

Short answer

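Nesterov momentum improves on standard momentum by evaluating the gradient at a look-ahead position (where the current velocity is about to carry the parameters) rather than at the current position.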

Deep explanation

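Nesterov Accelerated Gradient (NAG) first takes a provisional step along the accumulated velocity, computes the gradient at that predicted position, and then uses that gradient to update the velocity and the parameters. Because the gradient reflects where the parameters are about to be, the update can correct course sooner, which typically reduces overshoot and oscillation compared with standard momentum.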
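
The look-ahead update can be sketched in a few lines of plain Python. This is a minimal illustration, not a framework implementation: the toy objective f(x) = 0.5 * x**2, the function names grad and nag_minimize, and the values lr=0.1 and mu=0.9 are all assumptions chosen for the example.

```python
def grad(x):
    # Gradient of the toy objective f(x) = 0.5 * x**2 (assumed for illustration).
    return x


def nag_minimize(x0, lr=0.1, mu=0.9, steps=200):
    """Minimize the toy objective with Nesterov momentum (hypothetical helper)."""
    x, v = x0, 0.0
    for _ in range(steps):
        # Predicted position: where the current velocity alone would carry x.
        lookahead = x + mu * v
        # Update the velocity using the gradient at the look-ahead point.
        v = mu * v - lr * grad(lookahead)
        # Move the parameter by the updated velocity.
        x = x + v
    return x


x_final = nag_minimize(5.0)
```

With these assumed settings x_final ends up very close to the minimum at 0. The only line that distinguishes this from standard momentum is the one that computes lookahead.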

Real-world example

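Training deep networks with SGD plus Nesterov momentum, which can converge faster than standard momentum on some problems. In TensorFlow, for example, the Keras SGD optimizer exposes this through its nesterov=True argument.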

Common mistakes

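  • Confusing it with standard momentum. The velocity update has the same form in both; standard momentum evaluates the gradient at the current parameters, while NAG evaluates it at the look-ahead position.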
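
To make the distinction concrete, here is a side-by-side sketch of a single update step for each method. The objective f(x) = x**2, the helper names momentum_step and nesterov_step, and the starting values are assumptions for illustration only.

```python
def grad(x):
    # Gradient of the toy objective f(x) = x**2 (assumed for illustration).
    return 2.0 * x


def momentum_step(x, v, lr=0.1, mu=0.9):
    # Standard momentum: gradient evaluated at the current position x.
    v_new = mu * v - lr * grad(x)
    return x + v_new, v_new


def nesterov_step(x, v, lr=0.1, mu=0.9):
    # Nesterov momentum: gradient evaluated at the look-ahead position x + mu * v.
    v_new = mu * v - lr * grad(x + mu * v)
    return x + v_new, v_new


# Same starting point and velocity for both methods (values are arbitrary).
x0, v0 = 1.0, -0.5
x_mom, v_mom = momentum_step(x0, v0)
x_nag, v_nag = nesterov_step(x0, v0)
```

Starting from the same state, standard momentum moves to roughly 0.35 and NAG to roughly 0.44. The velocity is already heading toward the minimum at 0, so the look-ahead gradient is smaller and NAG takes the more cautious step.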

Follow-up questions

  • Why is NAG better?
  • Where is it commonly used?
