Senior · Gradient Descent
What is catastrophic curvature in deep learning optimization?
Updated May 16, 2026
Short answer
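Catastrophic curvature refers to regions of the loss landscape where the curvature (the size of the Hessian's eigenvalues) changes so sharply that a step size that was safe a moment earlier now overshoots and destabilizes training.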
Deep explanation
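In deep networks, a sudden spike in curvature means the gradient changes much faster along some direction than the current step size assumes. For plain gradient descent with learning rate η, a direction with curvature λ is stable only while η·λ < 2; past that point each update overshoots by more than the last, so gradients explode. An overshoot can also land in a flat or saturated region, where gradients vanish. These regions make optimization unpredictable and call for adaptive methods, normalization techniques, or gradient clipping to stabilize training.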
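The link between curvature and step size can be illustrated on a one-dimensional quadratic, where the curvature is a single number. This is a minimal sketch, not a model of a real network: the function name `run_gd` and the values of `curvature` and `lr` are assumptions chosen for illustration. On f(x) = 0.5 * curvature * x^2, each gradient step multiplies x by (1 - lr * curvature), so the iterates shrink only when lr * curvature < 2.

```python
def run_gd(curvature, lr, steps=20, x0=1.0):
    """Gradient descent on f(x) = 0.5 * curvature * x**2.

    Returns |x| after each step. All names here are illustrative.
    """
    x = x0
    history = []
    for _ in range(steps):
        grad = curvature * x   # f'(x) = curvature * x
        x = x - lr * grad      # equivalent to x <- (1 - lr * curvature) * x
        history.append(abs(x))
    return history


lr = 0.1  # with this step size the stability threshold is 2 / lr = 20

# Curvature below the threshold: per-step factor |1 - 0.5| = 0.5, iterates shrink.
flat = run_gd(curvature=5.0, lr=lr)

# Curvature above the threshold: per-step factor |1 - 2.5| = 1.5, iterates grow.
sharp = run_gd(curvature=25.0, lr=lr)

print(f"low curvature,  final |x| = {flat[-1]:.2e}")
print(f"high curvature, final |x| = {sharp[-1]:.2e}")
```

The same learning rate converges in the first run and diverges in the second; only the curvature changed. In a deep network the curvature differs per direction and moves during training, which is why a previously stable run can suddenly blow up.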
Real-world example
Loss spikes and divergence when training very deep transformer architectures, particularly early in training at a high learning rate.
Common mistakes
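- Ignoring second-order effects in optimization, for example tuning the learning rate from gradient norms alone without checking how large the curvature is.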
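One way to avoid that mistake is to measure the largest curvature directly. The sketch below estimates the top Hessian eigenvalue with power iteration, using Hessian-vector products approximated by finite differences of the gradient, so the full Hessian is never formed. It assumes a toy quadratic loss defined by a matrix `A`; the helper names `grad`, `hvp`, and `top_curvature` are hypothetical, and in a real model `grad` would be replaced by a framework's gradient computation.

```python
import numpy as np


def grad(w, A):
    """Gradient of the toy loss f(w) = 0.5 * w^T A w, with A symmetric (assumed)."""
    return A @ w


def hvp(w, v, A, eps=1e-4):
    """Hessian-vector product H @ v via a central difference of gradients."""
    return (grad(w + eps * v, A) - grad(w - eps * v, A)) / (2 * eps)


def top_curvature(w, A, iters=100, seed=0):
    """Estimate the largest Hessian eigenvalue at w by power iteration."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(w.shape)
    v /= np.linalg.norm(v)
    lam = 0.0
    for _ in range(iters):
        hv = hvp(w, v, A)
        lam = float(v @ hv)            # Rayleigh quotient for the current unit vector
        v = hv / np.linalg.norm(hv)    # re-normalize for the next iteration
    return lam


# Toy loss whose Hessian has eigenvalues 1, 4, and 50.
A = np.diag([1.0, 4.0, 50.0])
w = np.ones(3)

lam_max = top_curvature(w, A)
max_stable_lr = 2.0 / lam_max  # gradient descent stability bound for this direction

print(f"estimated top curvature: {lam_max:.3f}")
print(f"largest stable learning rate: {max_stable_lr:.4f}")
```

Tracking an estimate like `lam_max` during training gives an early warning: when it approaches 2 divided by the learning rate, the run is close to the unstable regime even if the gradient norm still looks normal.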
Follow-up questions
- What causes it?
- How to mitigate it?