How does model initialization strategy in large neural networks affect bias and variance during training?

Updated May 15, 2026

Short answer

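Poor initialization can slow convergence, so the model underfits within a fixed training budget (which shows up as higher bias), and it can destabilize gradients, so results vary more from run to run (higher variance).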

Deep explanation

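Model initialization determines the starting point of optimization in a neural network. If the initial weights are too small, gradients shrink as they propagate back through the layers (vanishing gradients), early layers barely update, and convergence is slow; with a limited training budget the model underfits, which looks like high bias. If the weights are too large, gradients grow layer by layer (exploding gradients), updates become unstable, and the final model depends heavily on the random seed and data order, which looks like high variance.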
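
The effect of weight scale on gradients can be sketched with a small NumPy experiment. This is an illustrative setup, not a benchmark: the function name `first_layer_grad_norm`, the plain ReLU network, the layer sizes, and the toy squared-output loss are all assumptions chosen for the example.

```python
import numpy as np

def first_layer_grad_norm(weight_std, depth=20, width=128, seed=0):
    """Norm of d(loss)/d(W_1) for a ReLU MLP with weights drawn from N(0, weight_std^2)."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((64, width))
    weights = [rng.standard_normal((width, width)) * weight_std
               for _ in range(depth)]

    # Forward pass, caching each layer's input and pre-activation.
    inputs, pre_acts = [], []
    h = x
    for W in weights:
        inputs.append(h)
        z = h @ W
        pre_acts.append(z)
        h = np.maximum(z, 0.0)  # ReLU

    # Toy loss (an assumption): 0.5 * mean(h ** 2), so d(loss)/dh = h / h.size.
    grad_h = h / h.size

    # Backward pass; after the loop, grad_W is the gradient for the first layer.
    for i in reversed(range(depth)):
        grad_z = grad_h * (pre_acts[i] > 0)  # ReLU derivative
        grad_W = inputs[i].T @ grad_z
        grad_h = grad_z @ weights[i].T
    return float(np.linalg.norm(grad_W))

for name, std in [("small (0.01)", 0.01),
                  ("He", np.sqrt(2.0 / 128)),
                  ("large (1.0)", 1.0)]:
    print(f"{name:>12}: {first_layer_grad_norm(std):.3e}")
```

In this sketch the first-layer gradient is vanishingly small for the 0.01 scale, extremely large for the 1.0 scale, and moderate for the fan-in-based scale. Exact numbers depend on the seed, depth, and width.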

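Modern initialization strategies such as Xavier (Glorot) and He (Kaiming) initialization choose the weight variance so that the scale of activations and gradients stays roughly constant from layer to layer, which stabilizes gradient flow. Xavier initialization uses a weight variance of 2 / (fan_in + fan_out); He initialization uses 2 / fan_in, which compensates for ReLU zeroing roughly half of the activations. In large-scale architectures, initialization becomes even more critical due to depth and distributed training effects.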
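
As a rough illustration of why the variance rule matters, the sketch below pushes random inputs through a stack of ReLU layers and reports the root-mean-square activation at the output. The helper names (`xavier_std`, `he_std`, `final_activation_rms`) and the depth and width are assumptions made for this example; real frameworks ship their own initializers.

```python
import numpy as np

def xavier_std(fan_in, fan_out):
    # Xavier/Glorot: Var(W) = 2 / (fan_in + fan_out)
    return np.sqrt(2.0 / (fan_in + fan_out))

def he_std(fan_in, fan_out):
    # He/Kaiming: Var(W) = 2 / fan_in (fan_out is unused, kept for a uniform signature)
    return np.sqrt(2.0 / fan_in)

def final_activation_rms(std_fn, depth=20, width=512, seed=0):
    """RMS of the last layer's activations in a ReLU MLP initialized with std_fn."""
    rng = np.random.default_rng(seed)
    h = rng.standard_normal((256, width))
    for _ in range(depth):
        W = rng.standard_normal((width, width)) * std_fn(width, width)
        h = np.maximum(h @ W, 0.0)  # ReLU
    return float(np.sqrt(np.mean(h ** 2)))

print("He     + ReLU:", final_activation_rms(he_std))
print("Xavier + ReLU:", final_activation_rms(xavier_std))
```

With square layers, the Xavier scale is smaller than the He scale by a factor of 1/sqrt(2), so under ReLU the activations shrink by about that factor at every layer, while the He scale keeps them near their input magnitude. Xavier initialization was derived for approximately linear, zero-centered activations, so this is a mismatch between rule and activation rather than a flaw in the rule itself.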

Proper initialization does not guarantee a good model, but it tends to give faster convergence and more stable optimization trajectories.
