How does model initialization strategy in large neural networks affect bias and variance during training?
Updated May 15, 2026
Short answer
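Initialization does not change what the model can represent, but under a finite training budget a poor starting point can slow convergence and leave the network underfit (higher effective bias), while unstable gradients make the result more sensitive to the random seed and data order (higher variance across runs).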
Deep explanation
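Model initialization determines the starting point of optimization in neural networks. If the initial weights are too small, gradients shrink as they propagate back through the layers (vanishing gradients), so early layers learn slowly and the model may still be underfit when training stops, which shows up as high bias. If the weights are too large, gradients grow with depth (exploding gradients), updates become erratic, and the final model depends heavily on the seed and mini-batch order, which shows up as high variance.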
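A minimal NumPy sketch of this effect, assuming a plain stack of square ReLU layers with no biases or normalization. The function name `final_activation_std` and the depth, width, and standard deviations are illustrative choices, not values from any particular model. It tracks the forward signal only, as a proxy: the backward pass multiplies by the same weight matrices, so gradients shrink or grow in a similar way.

```python
import numpy as np

def final_activation_std(weight_std, depth=30, width=256, batch=512, seed=0):
    """Pass random inputs through `depth` ReLU layers whose weights are drawn
    from N(0, weight_std^2) and return the std of the last layer's output."""
    rng = np.random.default_rng(seed)
    h = rng.standard_normal((batch, width))
    for _ in range(depth):
        W = rng.standard_normal((width, width)) * weight_std
        h = np.maximum(h @ W, 0.0)  # linear layer followed by ReLU
    return float(h.std())

# Hypothetical "too small" and "too large" scales for width 256.
too_small = final_activation_std(0.01)
too_large = final_activation_std(0.5)

print(f"std=0.01 -> final activation std {too_small:.3e}")  # collapses toward 0
print(f"std=0.5  -> final activation std {too_large:.3e}")  # blows up
```

With the small scale the signal collapses toward zero after a few dozen layers, and with the large scale it grows by many orders of magnitude, which is the forward-pass counterpart of vanishing and exploding gradients.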
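Modern initialization strategies such as Xavier (Glorot) and He initialization scale the weights by layer width so that the variance of activations and gradients stays roughly constant from layer to layer, which stabilizes gradient flow. Xavier initialization sets the weight variance to 2 / (fan_in + fan_out) and was designed for tanh- and sigmoid-like activations; He initialization sets it to 2 / fan_in to compensate for ReLU zeroing about half of its inputs. In large-scale architectures, initialization becomes even more critical due to depth and distributed training effects.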
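The two schemes can be sketched in a few lines of NumPy. This is an illustration under simplifying assumptions (normal-distributed weights, square ReLU layers, no biases or normalization). The helper names `xavier_normal`, `he_normal`, and `activation_rms` are made up for this example and are not a library API.

```python
import numpy as np

def xavier_normal(fan_in, fan_out, rng):
    # Xavier/Glorot: Var(W) = 2 / (fan_in + fan_out)
    std = np.sqrt(2.0 / (fan_in + fan_out))
    return rng.standard_normal((fan_in, fan_out)) * std

def he_normal(fan_in, fan_out, rng):
    # He: Var(W) = 2 / fan_in, compensating for ReLU dropping half the signal
    std = np.sqrt(2.0 / fan_in)
    return rng.standard_normal((fan_in, fan_out)) * std

def activation_rms(init_fn, depth=30, width=256, batch=512, seed=0):
    """Root-mean-square activation after `depth` ReLU layers built with init_fn."""
    rng = np.random.default_rng(seed)
    h = rng.standard_normal((batch, width))
    for _ in range(depth):
        h = np.maximum(h @ init_fn(width, width, rng), 0.0)
    return float(np.sqrt(np.mean(h ** 2)))

he_rms = activation_rms(he_normal)
xavier_rms = activation_rms(xavier_normal)

print(f"He init     -> RMS activation {he_rms:.3e}")
print(f"Xavier init -> RMS activation {xavier_rms:.3e}")
```

In this ReLU stack, He initialization keeps the activation scale near its starting value, while Xavier initialization roughly halves the mean squared activation at each layer and the signal fades with depth. That is why the choice of scheme is usually matched to the activation function rather than treated as interchangeable.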
Proper initialization generally leads to faster convergence and more stable optimization trajectories.