What is variance in machine learning and why does it cause overfitting?

Updated May 17, 2026

Short answer

Variance is the sensitivity of a model to small fluctuations in training data.

Deep explanation

Variance measures how much a model’s predictions change when trained on different subsets of data. High variance models learn noise along with patterns, leading to overfitting. Complex models like deep decision trees or high-degree polynomial regression typically have high variance. Reducing variance often involves regularization, pruning, or using ensembles.

Real-world example

A deep decision tree memorizing training customer behavior but failing on new customers.

Common mistakes

  • Thinking variance only relates to dataset size.

Follow-up questions

  • How do ensembles reduce variance?
  • What is the effect of pruning?

More Supervised Learning interview questions

View all →