Senior | Supervised Learning
What is feature correlation and why can it harm supervised learning models?
Updated May 17, 2026
Short answer
Feature correlation occurs when two or more input variables are strongly related, so they carry overlapping information. This redundancy can make a model's parameter estimates unstable.
Deep explanation
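Highly correlated features inflate the variance of coefficient estimates in linear models: small changes in the training data can shift weight from one correlated feature to another, so individual coefficients become hard to interpret. This condition is called multicollinearity, and under it the estimates for the affected features become unreliable even when overall predictions remain reasonable. Techniques like PCA, feature selection, or regularization are used to mitigate this issue.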
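The variance inflation described above can be checked with a small simulation. The sketch below is illustrative only: the helper names (`fit_linear`, `coef_std`), the synthetic two-feature data, and the penalty strength `l2=10.0` are assumptions chosen for the demo, not values from any real dataset. It refits a linear model on many freshly sampled datasets and measures how much the fitted coefficients vary.

```python
import numpy as np

def fit_linear(X, y, l2=0.0):
    """Least squares on centered data with an optional L2 penalty.

    l2=0.0 gives ordinary least squares; l2>0 gives ridge regression.
    """
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    p = X.shape[1]
    return np.linalg.solve(Xc.T @ Xc + l2 * np.eye(p), Xc.T @ yc)

def coef_std(rho, l2=0.0, n=200, trials=300, seed=0):
    """Std of fitted coefficients across many synthetic datasets.

    rho is the correlation between the two (hypothetical) features.
    """
    rng = np.random.default_rng(seed)
    cov = np.array([[1.0, rho], [rho, 1.0]])
    true_coef = np.array([1.0, 1.0])
    coefs = []
    for _ in range(trials):
        X = rng.multivariate_normal([0.0, 0.0], cov, size=n)
        y = X @ true_coef + rng.normal(0.0, 1.0, n)
        coefs.append(fit_linear(X, y, l2))
    return np.std(coefs, axis=0)

std_uncorrelated = coef_std(rho=0.0)           # independent features, OLS
std_correlated = coef_std(rho=0.99)            # highly correlated features, OLS
std_ridge = coef_std(rho=0.99, l2=10.0)        # same data, ridge penalty

print("OLS, rho=0.00:  ", std_uncorrelated)
print("OLS, rho=0.99:  ", std_correlated)
print("Ridge, rho=0.99:", std_ridge)
```

Under these assumptions, the coefficient spread should be several times larger for the correlated features than for the independent ones, and the ridge penalty should shrink it again. The penalty trades a small amount of bias for that stability, so its strength would normally be tuned by cross-validation rather than fixed by hand.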
Real-world example
In a financial dataset, two features like 'income' and 'salary' often carry nearly the same information, so including both adds redundancy rather than new signal.
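One common way to spot this kind of redundancy is the variance inflation factor (VIF), which regresses each feature on the others and reports 1 / (1 - R^2). The sketch below uses a hand-written `vif` helper and made-up synthetic 'income', 'salary', and 'age' columns; the numbers are assumptions for illustration, not real financial data.

```python
import numpy as np

def vif(X):
    """Variance inflation factor for each column of X (n_samples, n_features)."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    out = np.empty(p)
    for j in range(p):
        target = X[:, j]
        others = np.delete(X, j, axis=1)
        # Regress feature j on the remaining features plus an intercept.
        A = np.column_stack([np.ones(n), others])
        coef, *_ = np.linalg.lstsq(A, target, rcond=None)
        resid = target - A @ coef
        r2 = 1.0 - resid.var() / target.var()
        out[j] = 1.0 / (1.0 - r2)
    return out

# Hypothetical data: 'salary' is 'income' plus a little noise, 'age' is unrelated.
rng = np.random.default_rng(0)
n = 500
income = rng.normal(50.0, 10.0, n)
salary = income + rng.normal(0.0, 1.0, n)
age = rng.normal(40.0, 8.0, n)

X = np.column_stack([income, salary, age])
corr = np.corrcoef(X, rowvar=False)   # pairwise correlation matrix
vifs = vif(X)

print("corr(income, salary):", corr[0, 1])
print("VIF [income, salary, age]:", vifs)
```

A frequently cited rule of thumb treats a VIF above roughly 5 to 10 as a sign of problematic multicollinearity. Here 'income' and 'salary' should land far above that range while 'age' stays near 1, which suggests dropping or combining one of the redundant pair.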
Common mistakes
- Assuming that adding features correlated with existing ones always improves predictive power.
Follow-up questions
- What is multicollinearity?
- How does L2 regularization help?