What is vanishing gradient problem?

Updated May 16, 2026

Short answer

Vanishing gradient occurs when gradients become extremely small during backpropagation.

Deep explanation

In deep networks, repeated multiplication of small derivatives causes gradients to shrink, slowing or stopping learning in early layers. This is common with sigmoid/tanh activations.

Real-world example

Deep neural networks failing to learn long-term dependencies.

Common mistakes

Using sigmoid in deep networks without mitigation.

Follow-up questions

How to fix it?
What is exploding gradient?

Short answer

Deep explanation

Real-world example

Common mistakes

Follow-up questions

More Gradient Descent interview questions