What is feature correlation and why can it harm supervised learning models?

Updated May 17, 2026

Short answer

Feature correlation occurs when two or more input variables are strongly (often linearly) related to each other, so they carry redundant information and can make model estimates unstable.

Deep explanation

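Highly correlated features inflate the variance of coefficient estimates in linear models. Small changes in the training data can shift weight from one feature to the other or even flip a coefficient's sign, so individual coefficients become hard to interpret. This condition is called multicollinearity. Predictions are often less affected than the coefficients themselves, but the estimates can no longer be trusted as measures of each feature's effect. Techniques such as PCA, feature selection, and regularization are used to mitigate the issue.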
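The instability described above can be illustrated with a small simulation. This is a minimal sketch using only the Python standard library, not a reference implementation. The synthetic data, the helper names (`fit_two_features`, `simulate`, `coefficient_spread`), and the penalty value `lam=10.0` are assumptions chosen for illustration. It fits a two-feature linear model without an intercept on many resampled datasets and compares how much the first coefficient varies with and without an L2 (ridge) penalty.

```python
import random
import statistics


def fit_two_features(x1, x2, y, lam=0.0):
    """Solve (X^T X + lam * I) b = X^T y for two features, no intercept.

    lam=0 gives ordinary least squares; lam>0 gives ridge (L2) regression.
    """
    s11 = sum(a * a for a in x1) + lam
    s22 = sum(b * b for b in x2) + lam
    s12 = sum(a * b for a, b in zip(x1, x2))
    t1 = sum(a * c for a, c in zip(x1, y))
    t2 = sum(b * c for b, c in zip(x2, y))
    det = s11 * s22 - s12 * s12  # close to zero when x1 and x2 are collinear
    b1 = (s22 * t1 - s12 * t2) / det
    b2 = (s11 * t2 - s12 * t1) / det
    return b1, b2


def simulate(seed, n=200, noise=0.05):
    """Synthetic data (assumed for illustration): x2 is x1 plus tiny noise."""
    rng = random.Random(seed)
    x1 = [rng.gauss(0, 1) for _ in range(n)]
    x2 = [a + rng.gauss(0, noise) for a in x1]
    y = [3.0 * a + rng.gauss(0, 1) for a in x1]  # true signal: y = 3 * x1
    return x1, x2, y


def coefficient_spread(lam, trials=50):
    """Standard deviation of the first coefficient across resampled datasets."""
    b1_values = [fit_two_features(*simulate(seed), lam=lam)[0]
                 for seed in range(trials)]
    return statistics.stdev(b1_values)


ols_spread = coefficient_spread(lam=0.0)
ridge_spread = coefficient_spread(lam=10.0)
print(f"OLS   spread of b1: {ols_spread:.3f}")
print(f"Ridge spread of b1: {ridge_spread:.3f}")

# The individual coefficients are unstable, but their sum stays near 3.
b1, b2 = fit_two_features(*simulate(seed=0))
print(f"OLS b1 + b2 on one dataset: {b1 + b2:.3f}")
```

The ridge penalty shrinks the estimate along the direction the data barely constrains (the difference between the two features), which is why the spread drops. The cost is some bias: the penalized model tends to split the weight between the two features rather than recover the original coefficients.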

Real-world example

In a financial dataset, features such as 'income' and 'salary' carry largely redundant information, since salary is usually a major component of income.
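A quick way to check a pair like 'income' and 'salary' is to compute their Pearson correlation and the corresponding variance inflation factor (VIF). The sketch below uses made-up numbers (in thousands) purely for illustration, and the `pearson` helper is a hypothetical name. With exactly two predictors, the VIF reduces to 1 / (1 - r²). A VIF above roughly 5 to 10 is a commonly cited rule of thumb for problematic multicollinearity, not a hard threshold.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values: income is salary plus a small amount of other earnings.
salary = [40.0, 52.0, 61.0, 75.0, 88.0, 95.0]
income = [41.0, 54.0, 61.5, 78.0, 90.0, 99.0]

r = pearson(salary, income)
vif = 1.0 / (1.0 - r ** 2)  # valid in this form for the two-predictor case

print(f"correlation: {r:.4f}")
print(f"VIF:         {vif:.1f}")
```

If the check flags the pair, a simple remedy is to keep only one of the two features, or to replace them with a single derived feature such as non-salary income.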

Common mistakes

  • Assuming that adding correlated features always improves predictive power.

Follow-up questions

  • What is multicollinearity?
  • How does L2 regularization help?
