senior | NLP
What is catastrophic scaling instability in LLM training?
Updated May 17, 2026
Short answer
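Scaling instability is the tendency of training to diverge (loss spikes or NaNs) or to end at degraded quality as model size or batch size increases, even when the same recipe trains stably at a smaller scale.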
Deep explanation
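As models scale, optimization becomes less stable for three related reasons: gradient noise, heightened sensitivity to the learning rate, and normalization issues. The standard mitigations are learning rate warmup (ramping the rate up gradually over the first steps), gradient clipping (capping the gradient norm before each update), and adaptive optimizers such as Adam.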
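As a rough illustration of the three techniques named above, here is a minimal pure-Python sketch on plain lists of floats. The function names (`warmup_lr`, `clip_by_global_norm`, `adam_step`) and the hyperparameter values are assumptions chosen for illustration, not taken from any particular framework; a real training run would use the equivalents built into its deep learning library.

```python
import math


def warmup_lr(step, peak_lr, warmup_steps):
    """Linear warmup: ramp the learning rate up to peak_lr, then hold it.

    `step` is 0-indexed. Hypothetical helper; real schedules usually add decay.
    """
    if warmup_steps <= 0:
        return peak_lr
    return peak_lr * min(1.0, (step + 1) / warmup_steps)


def clip_by_global_norm(grads, max_norm):
    """Rescale all gradients together so their L2 norm is at most max_norm.

    Returns the (possibly rescaled) gradients and the norm before clipping,
    which is worth logging because spikes in it often precede divergence.
    """
    total_norm = math.sqrt(sum(g * g for g in grads))
    if total_norm > max_norm:
        scale = max_norm / total_norm
        grads = [g * scale for g in grads]
    return grads, total_norm


def adam_step(params, grads, m, v, t, lr, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update with bias correction. `t` is the 1-indexed step count."""
    new_params, new_m, new_v = [], [], []
    for p, g, m_i, v_i in zip(params, grads, m, v):
        m_i = beta1 * m_i + (1 - beta1) * g          # first-moment estimate
        v_i = beta2 * v_i + (1 - beta2) * g * g      # second-moment estimate
        m_hat = m_i / (1 - beta1 ** t)               # bias-corrected moments
        v_hat = v_i / (1 - beta2 ** t)
        new_params.append(p - lr * m_hat / (math.sqrt(v_hat) + eps))
        new_m.append(m_i)
        new_v.append(v_i)
    return new_params, new_m, new_v


# Toy usage: minimize f(w) = sum(w_i^2), whose gradient is 2 * w_i.
params = [3.0, -4.0]
m = [0.0, 0.0]
v = [0.0, 0.0]
for step in range(200):
    grads = [2.0 * p for p in params]
    grads, grad_norm = clip_by_global_norm(grads, max_norm=1.0)
    lr = warmup_lr(step, peak_lr=0.05, warmup_steps=20)
    params, m, v = adam_step(params, grads, m, v, t=step + 1, lr=lr)
```

The three pieces act at different points of the update: warmup limits the step size while the optimizer's moment estimates are still unreliable, clipping bounds the effect of any single outlier batch, and the adaptive scaling in Adam reduces sensitivity to the raw gradient magnitude.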