Senior: Gradient Descent
What is a curvature-adaptive learning rate?
Updated May 16, 2026
Short answer
It adjusts the learning rate, usually per parameter or per direction, based on the local curvature of the loss surface.
Deep explanation
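Curvature-adaptive methods scale the step size inversely with local curvature, which is second-derivative (Hessian) information about the loss. In high-curvature directions the gradient changes quickly, so smaller steps are taken to avoid overshooting; in flat, low-curvature directions larger steps are safe. Newton's method is the exact form of this idea: it multiplies the gradient by the inverse Hessian. Matching the step to the curvature in each direction improves stability and, on ill-conditioned problems, convergence speed.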
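To make the inverse scaling concrete, here is a minimal sketch on a toy quadratic loss. The loss, the curvature values in `h`, and the helper `run` are all assumptions chosen for illustration, not part of any library. It compares one global learning rate against step sizes set inversely proportional to each direction's curvature.

```python
import numpy as np

# Assumed toy loss: f(w) = 0.5 * (h[0] * w[0]**2 + h[1] * w[1]**2).
# Its Hessian is diagonal, so h holds the curvature of each direction.
h = np.array([100.0, 1.0])

def grad(w):
    """Gradient of the toy quadratic loss."""
    return h * w

def run(step_sizes, n_steps=50):
    """Plain gradient descent with a (possibly per-direction) step size."""
    w = np.array([1.0, 1.0])
    for _ in range(n_steps):
        w = w - step_sizes * grad(w)
    return w

# One global rate, kept small enough for the steep direction.
# The flat direction shrinks by only 0.985 per step, so it barely moves.
w_fixed = run(np.array([0.015, 0.015]))

# Curvature-scaled rates: each direction gets a step of 0.5 / curvature,
# so both directions halve their error on every step.
w_adaptive = run(0.5 / h)

print("fixed rate:    ", w_fixed)
print("curvature-aware:", w_adaptive)
```

With the global rate, the flat direction is still far from the minimum after 50 steps, while the curvature-scaled run converges in both directions at the same pace. Real losses rarely have a known diagonal Hessian, which is why practical methods estimate this scaling instead.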
Real-world example
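Adaptive optimizers such as RMSProp and Adam approximate this idea cheaply. They divide each parameter's update by a running root-mean-square of its recent gradients, which acts as a per-parameter (diagonal) stand-in for curvature without computing any second derivatives.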
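The following is a rough sketch of an RMSProp-style update written from scratch in NumPy, not the implementation from any particular framework. The function name `rmsprop`, its default hyperparameters, and the toy quadratic are assumptions for illustration.

```python
import numpy as np

def rmsprop(grad_fn, w0, lr=0.01, decay=0.9, eps=1e-8, n_steps=500):
    """Minimal RMSProp-style loop (illustrative sketch only)."""
    w = np.asarray(w0, dtype=float).copy()
    v = np.zeros_like(w)  # running average of squared gradients
    for _ in range(n_steps):
        g = grad_fn(w)
        v = decay * v + (1.0 - decay) * g**2
        # Dividing by sqrt(v) normalizes each parameter's step by the
        # recent magnitude of its own gradient.
        w = w - lr * g / (np.sqrt(v) + eps)
    return w

# Assumed toy problem: curvatures differ by a factor of 100.
h = np.array([100.0, 1.0])
w_final = rmsprop(lambda w: h * w, [1.0, 1.0])
print(w_final)
```

On this toy problem the two coordinates follow nearly the same trajectory despite the 100x difference in curvature, because the gradient scale cancels in `g / sqrt(v)`. The running average tracks gradient magnitude rather than true second derivatives, so it is a heuristic proxy for curvature, not an estimate of the Hessian.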
Common mistakes
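- Ignoring curvature: a single global learning rate tuned for flat directions can overshoot or diverge in high-curvature directions, while one tuned for the steepest direction makes progress in flat directions very slow.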
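The instability can be seen in a small sketch. It assumes a toy quadratic loss whose curvatures are the entries of `h`; the helper `gd` is hypothetical. For plain gradient descent on a quadratic, the learning rate must stay below 2 divided by the largest curvature.

```python
import numpy as np

h = np.array([100.0, 1.0])  # assumed per-direction curvatures

def gd(lr, n_steps=30):
    """Gradient descent on f(w) = 0.5 * sum(h * w**2) with one global rate."""
    w = np.array([1.0, 1.0])
    for _ in range(n_steps):
        w = w - lr * h * w
    return w

stable_limit = 2.0 / h.max()  # 0.02 for this toy problem

w_ok = gd(0.019)   # just under the limit: steep coordinate shrinks
w_bad = gd(0.021)  # just over the limit: steep coordinate grows each step

print("limit:", stable_limit)
print("lr=0.019:", w_ok)
print("lr=0.021:", w_bad)
```

The steep coordinate is multiplied by `1 - lr * 100` on every step, so its magnitude shrinks when `lr` is 0.019 and grows when `lr` is 0.021. The flat coordinate is stable in both runs, which shows that the steepest direction alone dictates the usable global rate.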
Follow-up questions
- What is a diagonal approximation of the Hessian?
- Why not use the full Hessian?