What is stochastic gradient descent?

Updated May 15, 2026

Short answer

Stochastic gradient descent (SGD) updates model parameters using one training example, or a small mini-batch of examples, at a time.

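It approximates full-batch gradient descent by estimating the gradient from a single example or a small mini-batch instead of the whole dataset. Each update is therefore much cheaper to compute, which makes training faster and lets it scale to large datasets.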
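As a sketch of the difference, the two update rules can be written side by side. The notation here is an assumption for illustration and is not taken from the answer above: \theta_t are the parameters at step t, \eta is the learning rate, \ell_i is the loss on example i, N is the dataset size, and B_t is the mini-batch drawn at step t.

```latex
% Full-batch gradient descent: average the gradient over all N examples
\theta_{t+1} = \theta_t - \eta \, \frac{1}{N} \sum_{i=1}^{N} \nabla_\theta \ell_i(\theta_t)

% Stochastic (mini-batch) gradient descent: average over a sampled batch B_t
\theta_{t+1} = \theta_t - \eta \, \frac{1}{|B_t|} \sum_{i \in B_t} \nabla_\theta \ell_i(\theta_t)
```

When B_t is sampled uniformly at random, the mini-batch average is an unbiased estimate of the full gradient, and setting |B_t| = 1 gives the one-example-at-a-time case.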

It is widely used to train large-scale deep learning models.
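The idea can be illustrated with a minimal sketch in plain Python that fits a line y = w*x + b by mini-batch SGD on squared error. The function name `sgd_linear_regression`, the toy data, and the hyperparameter values are all hypothetical choices made for this example. Real deep learning training would rely on a framework's optimizer rather than hand-written loops.

```python
import random


def sgd_linear_regression(data, lr=0.1, epochs=500, batch_size=2, seed=0):
    """Fit y ~ w*x + b with mini-batch SGD (illustrative sketch only)."""
    rng = random.Random(seed)
    data = list(data)  # copy so shuffling does not reorder the caller's list
    w, b = 0.0, 0.0
    for _ in range(epochs):
        rng.shuffle(data)  # visit the examples in a new random order each epoch
        for start in range(0, len(data), batch_size):
            batch = data[start:start + batch_size]
            # Gradient of the mean squared error over this mini-batch only
            grad_w = sum(2 * (w * x + b - y) * x for x, y in batch) / len(batch)
            grad_b = sum(2 * (w * x + b - y) for x, y in batch) / len(batch)
            # One SGD step: move against the estimated gradient
            w -= lr * grad_w
            b -= lr * grad_b
    return w, b


# Toy, noise-free data generated from y = 2x + 1 (an assumption for the demo)
points = [(i / 10, 2 * (i / 10) + 1) for i in range(10)]
w, b = sgd_linear_regression(points)
print(w, b)  # should be close to the true slope 2 and intercept 1
```

Setting `batch_size=1` gives the one-sample-at-a-time variant, while `batch_size=len(points)` recovers full-batch gradient descent.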