Senior | Supervised Learning
What is entropy and information gain in decision tree learning?
Updated May 17, 2026
Short answer
Entropy measures how mixed the class labels in a node are, and information gain measures how much a split reduces that entropy.
Deep explanation
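Entropy quantifies the uncertainty of the class labels in a node. For class proportions p_1, ..., p_k it is H = -sum_i p_i * log2(p_i), measured in bits. A pure node (a single class) has entropy 0, while an evenly mixed node has the maximum entropy, log2(k), which is 1 bit for two classes. Information gain is the parent node's entropy minus the size-weighted average entropy of the child nodes produced by splitting on a feature. At each node, a decision tree greedily chooses the split with the highest information gain, which produces purer child nodes. Purer nodes usually improve classification performance, although a tree grown too deep can overfit the training data.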
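The definitions above can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the function names (`entropy`, `information_gain`) and the toy "yes"/"no" labels are illustrative assumptions, not part of any particular library.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy, in bits, of a sequence of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    # p * log2(1/p) summed over classes; equivalent to -sum(p * log2(p)).
    return sum((c / n) * log2(n / c) for c in Counter(labels).values())


def information_gain(parent, children):
    """Parent entropy minus the size-weighted average entropy of the children."""
    n = len(parent)
    weighted = sum(len(child) / n * entropy(child) for child in children)
    return entropy(parent) - weighted


# Hypothetical toy node: 5 "yes" and 5 "no" labels, split into two children.
parent = ["yes"] * 5 + ["no"] * 5
left = ["yes"] * 4 + ["no"] * 1
right = ["yes"] * 1 + ["no"] * 4

print(entropy(parent))                          # evenly mixed two-class node
print(entropy(["yes"] * 5))                     # pure node
print(information_gain(parent, [left, right]))  # gain from the split
```

The evenly mixed parent has entropy of 1 bit, the pure node has entropy 0, and the split yields a gain of roughly 0.278 bits because each child is purer than the parent.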
Real-world example
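In customer segmentation, a split is useful when the resulting customer groups are more homogeneous in the target label (for example, churned vs. retained) than the group before the split.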
Common mistakes
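- Assuming information gain always prefers features with many unique values. It is biased toward such features, because many small partitions tend to look pure by chance, but a high-cardinality feature does not win every split. Gain ratio, used in C4.5, normalizes the gain to reduce this bias.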
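To see how information gain treats a feature with many unique values, the sketch below compares a unique ID column with a two-valued feature on a toy dataset. The data, the feature names (`customer_id`, `plan`), and the helper functions are hypothetical and chosen only for illustration; gain ratio is computed here as information gain divided by the entropy of the feature's own values (the split information).

```python
from collections import Counter, defaultdict
from math import log2


def entropy(values):
    """Shannon entropy, in bits, of a sequence of discrete values."""
    n = len(values)
    return sum((c / n) * log2(n / c) for c in Counter(values).values())


def information_gain(feature, labels):
    """Entropy of labels minus the weighted entropy after grouping by feature."""
    groups = defaultdict(list)
    for value, label in zip(feature, labels):
        groups[value].append(label)
    n = len(labels)
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def gain_ratio(feature, labels):
    """Information gain divided by split information (entropy of the feature)."""
    split_info = entropy(feature)
    return information_gain(feature, labels) / split_info if split_info > 0 else 0.0


# Hypothetical toy data: 8 customers, 5 churned and 3 stayed.
labels = ["churn"] * 4 + ["churn", "stay", "stay", "stay"]
customer_id = list(range(8))                 # unique per row, no predictive value
plan = ["monthly"] * 4 + ["annual"] * 4      # two values, genuinely informative

for name, feature in [("customer_id", customer_id), ("plan", plan)]:
    print(name, information_gain(feature, labels), gain_ratio(feature, labels))
```

On this data, plain information gain ranks `customer_id` highest, since every single-row partition is trivially pure, while gain ratio ranks `plan` higher. The outcome depends on the data, so this shows a tendency rather than a guarantee.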
Follow-up questions
- What is Gini impurity vs entropy?
- Why can information gain be biased?