What is Nesterov Accelerated Gradient?
Updated May 16, 2026
Short answer
Nesterov momentum improves on standard momentum by looking ahead: it computes the gradient at the position the momentum step is about to reach, rather than at the current position.
Deep explanation
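Nesterov Accelerated Gradient (NAG) evaluates the gradient at a look-ahead point, the position the parameters would reach if the current velocity were applied, rather than at the current parameters. Because the gradient reflects where the momentum step is about to land, the update can correct course earlier, which tends to reduce overshooting and oscillation compared to standard momentum.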
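The look-ahead update can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: it assumes one common formulation of NAG (velocity v, momentum coefficient mu, learning rate lr), and the toy quadratic objective, the matrix A, and the names grad and nag_minimize are all made up for this example.

```python
import numpy as np

# Assumed toy objective: f(theta) = 0.5 * theta^T A theta, minimized at theta = 0.
A = np.diag([1.0, 10.0])

def grad(theta):
    # Gradient of the toy objective above.
    return A @ theta

def nag_minimize(theta0, lr=0.05, mu=0.9, steps=200):
    # Hypothetical helper: runs NAG for a fixed number of steps.
    theta = np.asarray(theta0, dtype=float)
    v = np.zeros_like(theta)
    for _ in range(steps):
        lookahead = theta + mu * v   # predicted future position
        g = grad(lookahead)          # gradient at the look-ahead point
        v = mu * v - lr * g          # update the velocity
        theta = theta + v            # apply the velocity
    return theta

theta_final = nag_minimize([5.0, -3.0])
print(np.linalg.norm(theta_final))   # distance from the minimum at the origin
```

With these assumed settings the iterate ends up very close to the origin. The only line that differs from standard momentum is the one that builds lookahead before calling grad.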
Real-world example
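Deep learning frameworks such as TensorFlow offer Nesterov momentum as an option on their SGD optimizer (for example, nesterov=True in tf.keras.optimizers.SGD), and enabling it can speed up training convergence on some models.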
Common mistakes
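- Confusing it with standard momentum. Standard momentum computes the gradient at the current parameters; NAG computes it at the look-ahead position.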
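To make the distinction concrete, the sketch below takes one step of each method from the same starting state. It assumes a toy objective f(theta) = theta^2 and one common formulation of both updates; the function names momentum_step and nesterov_step are hypothetical, chosen only for this illustration.

```python
def grad(theta):
    # Gradient of the assumed toy objective f(theta) = theta^2.
    return 2.0 * theta

def momentum_step(theta, v, lr=0.1, mu=0.9):
    # Standard momentum: gradient evaluated at the current position.
    v_new = mu * v - lr * grad(theta)
    return theta + v_new, v_new

def nesterov_step(theta, v, lr=0.1, mu=0.9):
    # NAG: gradient evaluated at the look-ahead position theta + mu * v.
    v_new = mu * v - lr * grad(theta + mu * v)
    return theta + v_new, v_new

theta, v = 1.0, -0.5
m_theta, m_v = momentum_step(theta, v)
n_theta, n_v = nesterov_step(theta, v)
print("momentum:", m_theta, m_v)
print("nesterov:", n_theta, n_v)
```

From this state, standard momentum moves theta to about 0.35 while NAG moves it to about 0.44. The velocity is already carrying theta toward the minimum, so the look-ahead gradient is smaller and NAG takes the more cautious step.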
Follow-up questions
- Why is NAG better?
- Where is it commonly used?