What is entropy in Decision Trees?

Updated May 16, 2026

Short answer

Entropy measures how mixed (impure) the class labels in a dataset are. A Decision Tree uses it to decide which split is best.

Deep explanation

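Entropy is a concept from information theory that quantifies uncertainty. In Decision Trees, it measures how mixed the class labels at a node are. A node containing only one class has entropy 0, while a node whose classes appear in equal proportions has the maximum entropy: 1 bit for two classes, or log2(k) for k classes. At each node, the algorithm chooses the split that reduces entropy the most, that is, the split with the highest information gain (the parent's entropy minus the size-weighted average entropy of the child nodes).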
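The definitions above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation of any particular library; the function names entropy and information_gain and the "spam"/"ham" labels are chosen here for the example.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy, in bits, of a sequence of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    # Sum p * log2(1/p) over the classes that actually occur.
    return sum((c / total) * math.log2(total / c) for c in Counter(labels).values())

def information_gain(parent, children):
    """Parent entropy minus the size-weighted average entropy of the children."""
    total = len(parent)
    weighted = sum(len(child) / total * entropy(child) for child in children)
    return entropy(parent) - weighted

pure = ["spam", "spam", "spam", "spam"]       # one class only
balanced = ["spam", "spam", "ham", "ham"]     # two classes, 50/50

print(entropy(pure))       # 0.0
print(entropy(balanced))   # 1.0

# A split that separates the two classes perfectly removes all uncertainty.
print(information_gain(balanced, [["spam", "spam"], ["ham", "ham"]]))  # 1.0
```

A tree builder would evaluate information_gain for every candidate split at a node and keep the one with the largest value.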

Real-world example

In spam detection, a Decision Tree can split emails into spam and not spam based on word distributions, preferring the words whose presence or absence leaves the least mixed groups.
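As a rough sketch of how this might look, the snippet below scores two candidate words on a tiny, made-up set of emails. The data, the word choices, and the helper name gain_for_word are all assumptions for illustration; a real spam filter would use far more data and features.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy, in bits, of a sequence of class labels."""
    total = len(labels)
    if total == 0:
        return 0.0
    return sum((c / total) * math.log2(total / c) for c in Counter(labels).values())

def gain_for_word(emails, word):
    """Information gain from splitting emails on whether `word` is present."""
    labels = [label for _, label in emails]
    present = [label for words, label in emails if word in words]
    absent = [label for words, label in emails if word not in words]
    weighted = sum(len(part) / len(labels) * entropy(part) for part in (present, absent))
    return entropy(labels) - weighted

# Invented toy data: (set of words in the email, label).
emails = [
    ({"free", "offer"}, "spam"),
    ({"free", "prize"}, "spam"),
    ({"free", "winner"}, "spam"),
    ({"meeting", "notes"}, "ham"),
    ({"lunch", "offer"}, "ham"),
    ({"project", "agenda"}, "ham"),
]

# "free" appears only in spam here, so splitting on it gives a high gain.
# "offer" appears once in each class, so splitting on it gives a gain near zero.
for word in ("free", "offer"):
    print(word, round(gain_for_word(emails, word), 3))
```

On this toy data the tree would split on "free" first, because that split leaves the least mixed groups.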

Common mistakes

  • Assuming entropy changes linearly with the class proportions. It is built from logarithms, so it changes nonlinearly.
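A quick numerical check makes the nonlinearity concrete. The sketch below assumes a two-class problem and uses a helper named binary_entropy, introduced here for illustration, where p is the fraction of examples in one class.

```python
import math

def binary_entropy(p):
    """Entropy in bits of a two-class node where one class has proportion p."""
    if p in (0.0, 1.0):
        return 0.0  # a pure node carries no uncertainty
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

for p in (0.0, 0.1, 0.25, 0.5):
    print(p, round(binary_entropy(p), 3))

# If entropy were linear in p, the value at p = 0.25 would be half the value
# at p = 0.5. It is noticeably larger (about 0.811 versus 1.0).
```

The curve rises steeply near p = 0 and flattens toward p = 0.5, so a small amount of mixing in a nearly pure node costs more entropy than the same change in an already mixed node.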

Follow-up questions

  • How is entropy calculated mathematically?
  • What is zero entropy?
