Senior · Gradient Descent
What is stochastic approximation theory in Gradient Descent?
Updated May 16, 2026
Short answer
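Stochastic approximation is the theory of iterative algorithms that find a root or a minimum using only noisy measurements. It gives the conditions under which such algorithms, SGD included, converge.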
Deep explanation
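Stochastic approximation provides mathematical guarantees of convergence when only noisy observations of the gradient are available (or, more generally, noisy observations of a function whose root is sought). The Robbins-Monro algorithm (1951) is the foundational result. It iterates theta_{n+1} = theta_n - a_n * Y_n, where Y_n is a noisy observation taken at theta_n. Under regularity conditions it converges when the step sizes a_n are positive, their sum diverges, and the sum of their squares is finite (for example a_n = 1/n). SGD is this scheme applied to the gradient of the objective, which is why SGD theory rests on it.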
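As a minimal sketch of the Robbins-Monro iteration theta_{n+1} = theta_n - a_n * Y_n, the following Python snippet looks for the root of a function that can only be observed with noise. The function name robbins_monro, the target h(theta) = theta - 2, the unit Gaussian noise, and the step size a_n = 1/n are all illustrative assumptions, not part of any library. Since h is the gradient of 0.5 * (theta - 2)^2, the same loop can also be read as SGD on that quadratic.

```python
import random


def robbins_monro(noisy_h, theta0, n_steps, step=lambda n: 1.0 / n):
    """Iterate theta_{n+1} = theta_n - a_n * Y_n, where Y_n is a noisy
    observation of h(theta_n) and a_n = step(n)."""
    theta = theta0
    for n in range(1, n_steps + 1):
        theta -= step(n) * noisy_h(theta)
    return theta


rng = random.Random(0)  # fixed seed so the run is repeatable


def noisy_h(theta):
    # Hypothetical target: h(theta) = theta - 2, observed with unit Gaussian noise.
    return (theta - 2.0) + rng.gauss(0.0, 1.0)


estimate = robbins_monro(noisy_h, theta0=10.0, n_steps=10_000)
print(f"estimate of the root: {estimate:.3f}")
```

The step size 1/n satisfies both Robbins-Monro conditions: the steps sum to infinity, so the iterate can travel any distance, while their squares have a finite sum, so the accumulated noise stays bounded.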
Real-world example
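Online learning systems that update a model one observation at a time as streaming data arrives, such as a click-through-rate predictor that adjusts its weights after each new event instead of retraining on the full history.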
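A rough sketch of what such an online learner can look like: the snippet below fits a one-feature linear model to a simulated stream, seeing each sample once and never storing it. The stream (y = 3x - 1 plus noise), the function names, and the step-size schedule are assumptions chosen for illustration. The schedule decays like n^(-0.7), which keeps the sum of steps infinite and the sum of squared steps finite.

```python
import random


def stream(rng, n):
    """Hypothetical data stream: y = 3x - 1 plus small Gaussian noise."""
    for _ in range(n):
        x = rng.uniform(-1.0, 1.0)
        yield x, 3.0 * x - 1.0 + rng.gauss(0.0, 0.1)


def online_sgd(samples):
    """Fit y ~ w * x + b with one SGD step per incoming sample."""
    w, b = 0.0, 0.0
    for n, (x, y) in enumerate(samples, start=1):
        lr = 0.5 / (1.0 + n / 100.0) ** 0.7  # decaying step size (assumed schedule)
        err = (w * x + b) - y                # residual on this single sample
        w -= lr * err * x                    # noisy gradient of 0.5 * err**2 w.r.t. w
        b -= lr * err                        # noisy gradient of 0.5 * err**2 w.r.t. b
    return w, b


w, b = online_sgd(stream(random.Random(1), 20_000))
print(f"w = {w:.3f}, b = {b:.3f}")
```

Each update uses the gradient from a single sample, which is a noisy but unbiased estimate of the gradient of the expected squared error. That is the setting stochastic approximation theory covers.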
Common mistakes
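- Assuming deterministic convergence in noisy settings. The guarantees hold almost surely or in expectation, individual steps can move away from the optimum, and with a constant step size the iterates keep fluctuating around the solution instead of settling on it.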
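To illustrate the mistake above, this small experiment runs noisy gradient descent on f(theta) = 0.5 * (theta - 2)^2 twice, once with a constant step size and once with the decaying step size 1/n, and measures how far the last iterates sit from the minimizer. The objective, the noise level, and the helper name run are assumptions made for this sketch.

```python
import random


def run(step, n_steps=20_000, tail=1_000, seed=0):
    """Noisy gradient descent on f(theta) = 0.5 * (theta - 2)**2.

    Returns the RMS distance from the minimizer over the last `tail` iterates.
    """
    rng = random.Random(seed)
    theta, sq_err = 10.0, 0.0
    for n in range(1, n_steps + 1):
        grad = (theta - 2.0) + rng.gauss(0.0, 1.0)  # true gradient plus unit noise
        theta -= step(n) * grad
        if n > n_steps - tail:
            sq_err += (theta - 2.0) ** 2
    return (sq_err / tail) ** 0.5


rms_constant = run(lambda n: 0.5)      # constant step: noise never averages out
rms_decaying = run(lambda n: 1.0 / n)  # Robbins-Monro step: noise is averaged away
print(f"constant step RMS error: {rms_constant:.3f}")
print(f"decaying step RMS error: {rms_decaying:.3f}")
```

With the constant step the error settles at a noise floor set by the step size and the noise variance, while the decaying schedule keeps shrinking it. Neither run decreases monotonically, so progress is better judged by averages over many steps than by any single update.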
Follow-up questions
- What is the Robbins-Monro algorithm?
- Why is it important?