Junior · Decision Trees
What is entropy in Decision Trees?
Updated May 16, 2026
Short answer
Entropy measures the impurity (how mixed the class labels are) of a dataset. A Decision Tree uses it to pick the split that leaves the resulting subsets as pure as possible.
Deep explanation
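Entropy is a concept from information theory that quantifies uncertainty. In Decision Trees, it measures how mixed the class labels in a set of samples are. A set containing only one class has entropy 0, while a set split evenly across its classes has the maximum entropy (1 bit for two classes when the logarithm is base 2). At each node, the algorithm chooses the split that reduces entropy the most, that is, the split with the highest information gain: the parent's entropy minus the size-weighted average entropy of the child subsets.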
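For reference, the two quantities described above are usually written as follows. The notation is an assumption chosen for this note, not something fixed by the text: S is the set of samples at a node, p_i is the fraction of S belonging to class i out of k classes, A is a candidate feature, and S_v is the subset of S where A takes the value v. By convention, a term with p_i = 0 contributes 0.

```latex
H(S) = -\sum_{i=1}^{k} p_i \log_2 p_i
\qquad
\mathrm{IG}(S, A) = H(S) - \sum_{v \in \mathrm{values}(A)} \frac{|S_v|}{|S|}\, H(S_v)
```

With two classes, H(S) ranges from 0 (all samples in one class) to 1 (an even 50/50 mix).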
Real-world example
In spam detection, a tree can split emails on features such as whether a particular word appears, preferring the words whose presence best separates spam from non-spam.
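A minimal sketch of how such a split could be scored, assuming a made-up toy dataset of eight emails and a single hypothetical feature (whether the email contains the word "free"). The helper names `entropy` and `information_gain` are illustrative and do not come from any particular library.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(parent, children):
    """Parent entropy minus the size-weighted entropy of the child subsets."""
    total = len(parent)
    weighted = sum(len(child) / total * entropy(child) for child in children)
    return entropy(parent) - weighted


# Hypothetical toy data: (email contains the word "free", label).
emails = [
    (True, "spam"), (True, "spam"), (True, "spam"), (True, "ham"),
    (False, "ham"), (False, "ham"), (False, "ham"), (False, "spam"),
]

labels = [label for _, label in emails]
with_free = [label for has_word, label in emails if has_word]
without_free = [label for has_word, label in emails if not has_word]

parent_entropy = entropy(labels)
gain = information_gain(labels, [with_free, without_free])

print(f"entropy before split: {parent_entropy:.3f} bits")
print(f"information gain from splitting on 'free': {gain:.3f} bits")
```

In this toy data the parent set is an even mix of spam and ham, so its entropy is 1 bit. Each branch is still a 3-to-1 mix, so the split removes only part of the uncertainty. A tree builder would compute the same score for every candidate word and keep the one with the highest gain.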
Common mistakes
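- Assuming entropy changes linearly with the class proportions. It is a logarithmic, concave function: it rises steeply as a pure set starts to mix and flattens out near a balanced split.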
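The non-linearity can be checked numerically. The sketch below assumes a two-class problem and uses a hypothetical helper, `binary_entropy`, that takes the fraction p of samples in one class.

```python
import math


def binary_entropy(p):
    """Entropy, in bits, of a two-class set where one class has fraction p."""
    if p in (0.0, 1.0):
        return 0.0  # a pure set has no uncertainty
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


# Print entropy at several class proportions to show the curve's shape.
for p in (0.0, 0.1, 0.25, 0.5, 0.75, 0.9, 1.0):
    print(f"p = {p:.2f}  H = {binary_entropy(p):.3f}")
```

If entropy were linear in p, a 25/75 mix would score half of a 50/50 mix (0.5 bits). It actually scores about 0.811 bits, which is why even a modest amount of mixing makes a node look quite impure.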
Follow-up questions
- How is entropy calculated mathematically?
- What is zero entropy?