Senior | Gradient Descent
What is the Polyak-Łojasiewicz (PL) condition in optimization?
Updated May 16, 2026
Short answer
The PL condition guarantees that gradient descent on a smooth function converges linearly to the global minimum value, without requiring convexity.
Deep explanation
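The PL condition states that the squared norm of the gradient is bounded below by a multiple of the function suboptimality: for some μ > 0, (1/2)·||∇f(x)||^2 ≥ μ·(f(x) − f*) for all x, where f* is the minimum value of f. One consequence is that every stationary point is a global minimizer. If f is also L-smooth, Gradient Descent with step size 1/L converges linearly in function value, f(x_k) − f* ≤ (1 − μ/L)^k · (f(x_0) − f*), even for certain non-convex functions. PL is weaker than strong convexity (every μ-strongly convex function satisfies PL with the same μ), but it neither implies nor is implied by plain convexity, and it is a separate assumption from smoothness rather than a strengthening of it. This combination of a strong guarantee with a weak assumption makes it powerful in modern ML theory.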
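The linear rate can be illustrated with a minimal sketch. It assumes the test function f(x) = x^2 + 3·sin^2(x), a standard non-convex example from the PL literature that is commonly cited as satisfying PL with μ = 1/32; that constant is taken on trust here, not derived. L = 8 follows from |f''(x)| = |2 + 6·cos(2x)| ≤ 8. The names `gradient_descent`, `L_SMOOTH`, and `MU` are illustrative choices.

```python
import math

def f(x):
    # Non-convex test function with global minimum f* = 0 at x = 0.
    return x**2 + 3.0 * math.sin(x)**2

def grad_f(x):
    # d/dx [x^2 + 3 sin^2(x)] = 2x + 3 sin(2x)
    return 2.0 * x + 3.0 * math.sin(2.0 * x)

L_SMOOTH = 8.0    # since |f''(x)| = |2 + 6 cos(2x)| <= 8
MU = 1.0 / 32.0   # PL constant as commonly cited for this function (assumption)
F_STAR = 0.0

def gradient_descent(x0, steps):
    """Run gradient descent with step size 1/L and record f(x_k) - f*."""
    x = x0
    gaps = [f(x) - F_STAR]
    for _ in range(steps):
        x -= grad_f(x) / L_SMOOTH
        gaps.append(f(x) - F_STAR)
    return x, gaps

x_final, gaps = gradient_descent(x0=3.0, steps=30)

# Compare the observed gap with the guaranteed bound (1 - mu/L)^k * gap_0.
rate = 1.0 - MU / L_SMOOTH
for k in (0, 5, 10, 30):
    print(k, gaps[k], rate**k * gaps[0])
```

The bound (1 − μ/L)^k is a worst-case guarantee, so the observed gap is expected to shrink much faster than the bound on this particular function.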
Real-world example
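Over-parameterized deep learning models often train to near-zero loss at a fast, roughly linear rate despite non-convex loss surfaces. One proposed explanation is that the loss satisfies a PL-type inequality in a region around the initialization.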
Common mistakes
- Assuming PL implies convexity (it does not). PL only forces every stationary point to be a global minimizer.
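A small numerical check can make this mistake concrete. The sketch below reuses the assumed example f(x) = x^2 + 3·sin^2(x): its second derivative 2 + 6·cos(2x) drops to −4, so the function is not convex, yet the ratio (1/2)·|f'(x)|^2 / (f(x) − f*) stays bounded away from zero on the sampled grid. A grid check like this is evidence, not a proof, and the grid range and spacing are arbitrary choices.

```python
import math

def f(x):
    return x**2 + 3.0 * math.sin(x)**2

def grad_f(x):
    return 2.0 * x + 3.0 * math.sin(2.0 * x)

def hess_f(x):
    # Second derivative; negative values mean the function is not convex.
    return 2.0 + 6.0 * math.cos(2.0 * x)

F_STAR = 0.0  # global minimum value, attained at x = 0

# Sample [-10, 10] with spacing 0.01, skipping x = 0 where the gap is zero.
grid = [i / 100.0 for i in range(-1000, 1001) if i != 0]

min_curvature = min(hess_f(x) for x in grid)
pl_ratio = min(0.5 * grad_f(x)**2 / (f(x) - F_STAR) for x in grid)

print("smallest second derivative on grid:", min_curvature)
print("smallest PL ratio on grid:", pl_ratio)
```

A negative `min_curvature` together with a strictly positive `pl_ratio` is the pattern to look for: non-convex, but still consistent with a PL inequality on the sampled region.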
Follow-up questions
- Why is PL important in deep learning?
- How is PL verified?