What is gradient descent in non-convex optimization?

Updated May 16, 2026

Short answer

It refers to optimizing functions with multiple local minima and saddle points.

Deep explanation

Non-convex optimization is challenging because Gradient Descent may converge to local minima or saddle points instead of global minima. Modern deep learning relies on heuristics like initialization, momentum, and stochasticity to handle this.

Real-world example

Training deep neural networks and transformer models.

Common mistakes

  • Assuming convergence guarantees to global optimum.

Follow-up questions

  • Why is deep learning non-convex?
  • How is it still effective?

More Gradient Descent interview questions

View all →