What is model correlation in ensemble learning and why does it matter?

Updated May 16, 2026

Short answer

Model correlation measures how similarly the models in an ensemble make errors. Lower error correlation generally improves ensemble performance, provided the individual models remain reasonably accurate.

Deep explanation

Ensemble learning relies on diversity among models. If all models make the same mistakes, combining them offers little or no benefit. Model correlation quantifies this dependency, usually as the correlation between the models' predictions or, more directly, between their errors. Lower correlation means the models tend to be wrong on different examples, so aggregation (averaging or voting) can cancel part of the error. Techniques such as bagging, feature subsampling, and mixing different algorithms are used to reduce correlation. Highly correlated models behave like a single model, which defeats the purpose of ensembling.
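The cancellation effect can be checked with a small simulation. The sketch below is illustrative only: it assumes every model has unit-variance Gaussian errors built from one shared component plus a private one, so that the pairwise error correlation is roughly rho. The function names (simulate_errors, mean_pairwise_correlation, averaged_error_variance) are hypothetical helpers, not part of any library.

```python
import numpy as np

def simulate_errors(rho, n_models, n_samples, seed=0):
    """Draw unit-variance errors whose pairwise correlation is approximately rho."""
    rng = np.random.default_rng(seed)
    shared = rng.standard_normal((n_samples, 1))          # component common to all models
    private = rng.standard_normal((n_samples, n_models))  # component unique to each model
    return np.sqrt(rho) * shared + np.sqrt(1.0 - rho) * private

def mean_pairwise_correlation(errors):
    """Average off-diagonal entry of the error correlation matrix."""
    corr = np.corrcoef(errors, rowvar=False)
    n = corr.shape[0]
    return (corr.sum() - n) / (n * (n - 1))

def averaged_error_variance(errors):
    """Variance of the error that remains after averaging the models."""
    return errors.mean(axis=1).var()

# Compare ensembles of 10 models at three assumed correlation levels.
for rho in (0.0, 0.5, 0.9):
    errs = simulate_errors(rho, n_models=10, n_samples=200_000)
    print(
        f"rho={rho:.1f}",
        f"measured correlation={mean_pairwise_correlation(errs):.2f}",
        f"variance after averaging={averaged_error_variance(errs):.2f}",
    )
```

Under these assumptions, the variance left after averaging should shrink sharply when the errors are uncorrelated and stay close to the single-model variance when the correlation is high.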

Real-world example

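In credit scoring, combining two identical logistic regression models trained on the same data adds no value: their errors are perfectly correlated, so the averaged score is the same as the score from either model alone.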
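A minimal sketch of this degenerate case follows. It uses synthetic data as a stand-in for applicant features and default labels (not real credit data), and a small hand-written gradient-descent logistic regression rather than any particular library's implementation. The helpers fit_logistic and predict_proba are hypothetical names introduced for illustration.

```python
import numpy as np

def fit_logistic(X, y, lr=0.1, n_iter=500):
    """Plain gradient-descent logistic regression (no regularization)."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
        grad_w = X.T @ (p - y) / len(y)
        grad_b = np.mean(p - y)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b

def predict_proba(model, X):
    """Predicted probability of the positive class."""
    w, b = model
    return 1.0 / (1.0 + np.exp(-(X @ w + b)))

# Synthetic stand-in for applicant features and default labels.
rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 3))
y = (X @ np.array([1.5, -1.0, 0.5]) + rng.standard_normal(1000) > 0).astype(float)

model_a = fit_logistic(X, y)
model_b = fit_logistic(X, y)  # same data, same settings: a copy of model_a

p_a = predict_proba(model_a, X)
p_b = predict_proba(model_b, X)
p_avg = (p_a + p_b) / 2.0     # the two-model "ensemble"

err_a, err_b = p_a - y, p_b - y
print("error correlation:", np.corrcoef(err_a, err_b)[0, 1])
print("ensemble equals single model:", np.allclose(p_avg, p_a))
```

Training the second model on a bootstrap sample or a different feature subset would be one way to make the two sets of errors differ, which is what bagging and feature subsampling aim for.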

Common mistakes

  • Assuming that adding more models always improves performance, regardless of how correlated they are.
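
A standard textbook result shows why this assumption fails. It assumes an idealized setting that the text above does not state: M models whose errors all have the same variance \sigma^2 and the same pairwise correlation \rho, combined by simple averaging.

```latex
\operatorname{Var}\!\left(\frac{1}{M}\sum_{i=1}^{M} e_i\right)
  = \rho\,\sigma^{2} + \frac{1-\rho}{M}\,\sigma^{2}
  \;\xrightarrow[\,M\to\infty\,]{}\; \rho\,\sigma^{2}
```

Only the second term shrinks as M grows. With \rho = 0.9, for instance, even an unlimited number of models cannot push the error variance below 0.9\,\sigma^2, so past a certain point reducing correlation helps more than adding models.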

Follow-up questions

  • How do you measure diversity in ensembles?
  • What increases model diversity?
