senior · Gradient Descent
What is the role of the spectral properties of the Hessian in Gradient Descent?
Updated May 16, 2026
Short answer
The eigenvalues of the Hessian determine how fast Gradient Descent converges and which learning rates keep it stable.
Deep explanation
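The spectrum of the Hessian matrix (its set of eigenvalues) describes the curvature of the loss along each eigenvector direction. Near a minimum where the loss is approximately quadratic and convex, a GD step with learning rate η multiplies the error along a direction with eigenvalue λ by (1 - ηλ). The largest eigenvalue therefore sets the stability limit: once η exceeds 2/λ_max, the error along that direction grows and GD diverges. The smallest eigenvalues set the speed: when λ is small, (1 - ηλ) is close to 1 and progress along that direction is slow. The ratio of largest to smallest eigenvalue, the condition number κ = λ_max/λ_min, measures conditioning and directly affects GD efficiency: the larger κ is, the more steps GD needs, because the learning rate must be small enough for the steepest direction while the flattest direction barely moves.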
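The effect can be seen on a toy problem. The sketch below is an illustration under stated assumptions, not a recipe for real networks: it assumes a two-dimensional quadratic loss f(x) = 0.5 * (λ_1 x_1² + λ_2 x_2²), whose Hessian is diagonal with eigenvalues 100 and 1. The function name `gradient_descent` and the chosen learning rates are hypothetical, picked to sit just below and just above 2/λ_max.

```python
def gradient_descent(eigenvalues, x0, lr, steps):
    """Run GD on f(x) = 0.5 * sum(lam_i * x_i**2); its Hessian is diag(eigenvalues)."""
    x = list(x0)
    for _ in range(steps):
        # The gradient of this quadratic is lam_i * x_i in each coordinate.
        x = [xi - lr * lam * xi for xi, lam in zip(x, eigenvalues)]
    return x


# Assumed toy spectrum: lambda_max = 100, lambda_min = 1, so kappa = 100.
eigenvalues = [100.0, 1.0]
x0 = [1.0, 1.0]

# Learning rate just below the stability limit 2 / lambda_max = 0.02.
stable = gradient_descent(eigenvalues, x0, lr=0.019, steps=100)
# Learning rate just above the stability limit.
unstable = gradient_descent(eigenvalues, x0, lr=0.021, steps=100)

print("lr=0.019:", stable)
print("lr=0.021:", unstable)
```

With the smaller learning rate, the steep coordinate has converged after 100 steps while the flat coordinate is still a noticeable fraction of its starting value. With the larger one, the steep coordinate has blown up. No single learning rate serves both directions well when κ is large.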
Real-world example
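Training deep networks on poorly conditioned feature spaces, for example inputs whose features sit on very different scales, so that the loss curvature differs by orders of magnitude across directions.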
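A deep network's Hessian is hard to inspect directly, so the sketch below uses a linear least-squares model as a stand-in. That substitution is an assumption made for illustration. For the loss 0.5/n * ||Xw - y||², the Hessian is XᵀX/n, so feature scaling shows up directly in the condition number. The data and the helper names (`hessian_2d`, `condition_number`, `standardize`) are hypothetical.

```python
import math


def hessian_2d(rows):
    """Entries (a, b, d) of X^T X / n = [[a, b], [b, d]] for a two-feature dataset."""
    n = len(rows)
    a = sum(r[0] * r[0] for r in rows) / n
    b = sum(r[0] * r[1] for r in rows) / n
    d = sum(r[1] * r[1] for r in rows) / n
    return a, b, d


def condition_number(a, b, d):
    """lambda_max / lambda_min of the symmetric 2x2 matrix [[a, b], [b, d]]."""
    mean = (a + d) / 2.0
    radius = math.sqrt(((a - d) / 2.0) ** 2 + b * b)
    return (mean + radius) / (mean - radius)


def standardize(rows):
    """Rescale each feature column to zero mean and unit variance."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / n) for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]


# Hypothetical data: feature 0 is on a scale about 1000x larger than feature 1.
raw = [[1000.0, 1.0], [2000.0, -1.0], [3000.0, 2.0], [4000.0, -2.0]]

kappa_raw = condition_number(*hessian_2d(raw))
kappa_std = condition_number(*hessian_2d(standardize(raw)))

print(f"condition number, raw features:          {kappa_raw:.3g}")
print(f"condition number, standardized features: {kappa_std:.3g}")
```

On this toy data, standardizing the inputs brings the condition number down by several orders of magnitude. This is one reason input normalization is commonly recommended before training with GD.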
Common mistakes
- Ignoring how curvature is distributed across directions and only tuning the learning rate.
Follow-up questions
- What is the spectral clustering intuition here?
- Why does GD zig-zag?