Junior | Decision Trees
What is information gain in Decision Trees?
Updated May 16, 2026
Short answer
Information gain measures how much a split reduces uncertainty (entropy) about the class label.
Deep explanation
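Information gain is the reduction in entropy obtained by splitting a dataset on an attribute: the entropy of the parent node minus the size-weighted average entropy of the child nodes, IG(S, A) = H(S) - sum over values v of A of (|S_v| / |S|) * H(S_v). At each node, algorithms such as ID3 choose the attribute with the highest information gain, because it produces the purest child nodes on the training data.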
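As a minimal sketch of that calculation, the snippet below computes entropy and information gain for a tiny hand-made dataset. The helper names (`entropy`, `information_gain`) and the toy columns (`bought`, `is_member`, `region`) are hypothetical, chosen only for illustration, and entropy is measured in bits (log base 2).

```python
import math
from collections import Counter, defaultdict

def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return sum((n / total) * math.log2(total / n) for n in Counter(labels).values())

def information_gain(labels, attribute_values):
    """Parent entropy minus the size-weighted entropy of the child subsets."""
    groups = defaultdict(list)
    for value, label in zip(attribute_values, labels):
        groups[value].append(label)
    weighted_child_entropy = sum(
        (len(subset) / len(labels)) * entropy(subset) for subset in groups.values()
    )
    return entropy(labels) - weighted_child_entropy

# Hypothetical toy data: the class label and two candidate attributes.
bought    = ["yes", "yes", "yes", "no", "no", "no"]
is_member = ["y", "y", "y", "n", "n", "n"]   # separates the classes perfectly
region    = ["a", "b", "a", "b", "a", "b"]   # barely related to the label

gain_member = information_gain(bought, is_member)
gain_region = information_gain(bought, region)
best = "is_member" if gain_member > gain_region else "region"
print(f"is_member: {gain_member:.3f}, region: {gain_region:.3f}, split on: {best}")
```

Here `is_member` yields pure child nodes, so its gain equals the full parent entropy, while `region` leaves both children almost as mixed as the parent and gains very little.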
Real-world example
In marketing segmentation, information gain can rank customer attributes by how well each one separates buyers from non-buyers.
Common mistakes
- Assuming higher information gain always means better generalization.
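One way to see this mistake is to score an attribute that is unique per row. The sketch below uses hypothetical toy data and hypothetical helper names; a made-up `customer_id` column gets the maximum possible gain because every child node contains a single example, yet it tells the tree nothing about customers it has not seen.

```python
import math
from collections import Counter, defaultdict

def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return sum((n / total) * math.log2(total / n) for n in Counter(labels).values())

def information_gain(labels, attribute_values):
    """Parent entropy minus the size-weighted entropy of the child subsets."""
    groups = defaultdict(list)
    for value, label in zip(attribute_values, labels):
        groups[value].append(label)
    weighted = sum((len(g) / len(labels)) * entropy(g) for g in groups.values())
    return entropy(labels) - weighted

# Hypothetical toy data: 4 buyers and 4 non-buyers.
bought      = ["yes", "no", "yes", "no", "yes", "no", "yes", "no"]
customer_id = list(range(8))                             # unique per row
is_member   = ["y", "n", "y", "y", "y", "n", "n", "n"]   # imperfect but meaningful

gain_id = information_gain(bought, customer_id)
gain_member = information_gain(bought, is_member)
print(f"customer_id: {gain_id:.3f}, is_member: {gain_member:.3f}")
```

The identifier wins on information gain even though a split on it only memorizes the training rows, which is why gain measured on training data should not be read as evidence of generalization.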
Follow-up questions
- Can information gain be negative?
- What biases affect information gain?