What is SGD in Linear Regression?
Updated May 16, 2026
Short answer
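Stochastic Gradient Descent (SGD) fits a linear regression model by updating the weights after each training sample, using the gradient of that one sample's squared error instead of the gradient over the full dataset.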
Deep explanation
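Each update touches one sample rather than the whole dataset, so SGD scales to data too large for batch gradient descent or the closed-form normal equations. The single-sample gradient is a noisy estimate of the full gradient. In linear regression the squared-error loss is convex, so there are no spurious local minima to escape; the noise mainly makes the weights fluctuate around the minimum unless the learning rate is decayed. The often-cited benefit of noise helping to escape local minima applies to non-convex models such as neural networks.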
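As a minimal sketch of the per-sample update, the snippet below fits y ≈ w*x + b in plain Python. The function name `sgd_linear_regression`, the toy data, and the hyperparameter values are illustrative assumptions, not a reference implementation.

```python
import random

def sgd_linear_regression(xs, ys, lr=0.01, epochs=50, seed=0):
    """Fit y ~ w*x + b with per-sample SGD on squared error (illustrative)."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)  # visit the samples in a fresh random order
        for i in indices:
            error = (w * xs[i] + b) - ys[i]  # prediction minus target
            # Gradient of 0.5 * error**2 with respect to w and b
            w -= lr * error * xs[i]
            b -= lr * error
    return w, b

# Noise-free toy data generated from y = 2x + 1 (assumed for illustration)
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]
w, b = sgd_linear_regression(xs, ys, lr=0.05, epochs=500)
print(w, b)
```

Because the toy data contains no noise, a constant learning rate is enough here; on noisy data the weights would keep jittering around the optimum unless the rate is reduced over time.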
Real-world example
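Used in online learning systems such as ad ranking, where the model is updated as each new impression or click arrives instead of being retrained on the full history.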
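One way to picture the online setting is a model that takes a single SGD step per incoming event. The class `OnlineLinearRegressor`, its method names, and the simulated stream are hypothetical, chosen only to illustrate the idea; a production ad-ranking system would look quite different.

```python
class OnlineLinearRegressor:
    """Hypothetical streaming model: one SGD step per incoming example."""

    def __init__(self, n_features, lr=0.01):
        self.weights = [0.0] * n_features
        self.bias = 0.0
        self.lr = lr

    def predict(self, features):
        return sum(w * x for w, x in zip(self.weights, features)) + self.bias

    def update(self, features, target):
        # One SGD step on the squared error of this single example
        error = self.predict(features) - target
        for j, x in enumerate(features):
            self.weights[j] -= self.lr * error * x
        self.bias -= self.lr * error
        return error

# Simulated event stream; feature values and targets are made up and
# consistent with weights [3, -1] and bias 0
model = OnlineLinearRegressor(n_features=2, lr=0.2)
stream = [([1.0, 0.0], 3.0), ([0.0, 1.0], -1.0), ([1.0, 1.0], 2.0)] * 2000
for features, target in stream:
    model.update(features, target)
print(model.weights, model.bias)
```

The model never stores past events: each example is used once for an update and then discarded, which is what makes this style of training suitable for continuous data streams.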
Common mistakes
- Ignoring learning rate tuning: too large a rate makes the updates diverge, while too small a rate makes convergence very slow.
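The effect of the learning rate can be seen in a small experiment. The sketch below runs the same per-sample SGD with three rates on toy data; the helper `final_loss`, the data, and the specific rates are assumptions picked to show slow convergence, good convergence, and divergence.

```python
def final_loss(lr, epochs=20):
    """Run per-sample SGD on toy data and return the final mean squared error."""
    xs = [1.0, 2.0, 3.0, 4.0]
    ys = [2.0 * x for x in xs]  # illustrative data from y = 2x
    w = 0.0
    for _ in range(epochs):
        for x, y in zip(xs, ys):
            w -= lr * (w * x - y) * x  # SGD step for a single weight, no bias
    return sum((w * x - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# Too small, reasonable, and too large for this particular dataset
for lr in (0.001, 0.05, 0.5):
    print(f"lr={lr}: final MSE = {final_loss(lr):.3e}")
```

Which rates count as too small or too large depends on the scale of the features, so these thresholds apply only to this toy data.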
Follow-up questions
- Why is SGD noisy?
- What is mini-batch GD?