How do Decision Trees decide the best split among many candidate features?
Updated May 16, 2026
Short answer
At each node, a Decision Tree evaluates every candidate feature and threshold and selects the split that maximizes impurity reduction (information gain or Gini decrease for classification, variance reduction for regression).
Deep explanation
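At each node, the algorithm performs an exhaustive search over every candidate feature. For a continuous feature, it sorts the samples at that node by feature value and evaluates the midpoint between each pair of consecutive distinct values as a potential threshold. For each candidate split, it computes the impurity reduction: the parent node's impurity (Gini or entropy for classification, variance for regression) minus the sample-weighted average impurity of the two child nodes. The best split is the one with the largest reduction. The search is greedy: it picks the locally best split at each node and then repeats recursively on the children, with no guarantee that the resulting tree is globally optimal. With n samples and d features at a node, the sorting step makes each node's search cost roughly O(d · n log n), which is computationally expensive but effective in practice.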
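The search at a single node can be sketched in a few lines of Python. This is a minimal illustration, not how any particular library implements it: the function names `gini` and `best_split` are hypothetical, it assumes classification with Gini impurity and numeric features only, and it recomputes impurity from scratch at each threshold for readability, whereas practical implementations typically update class counts incrementally while scanning the sorted values.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels: 1 - sum of squared class shares."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_split(X, y):
    """Return (feature_index, threshold, gini_decrease) for the best split,
    or None if no feature has two distinct values to split between.

    X is a list of rows (lists of numbers); y is a list of class labels.
    Both names are illustrative assumptions, not a library API.
    """
    n = len(y)
    parent_impurity = gini(y)
    best = None
    for j in range(len(X[0])):
        # Sort the samples at this node by their value of feature j.
        order = sorted(range(n), key=lambda i: X[i][j])
        for k in range(1, n):
            low, high = X[order[k - 1]][j], X[order[k]][j]
            if low == high:
                continue  # no threshold fits between equal values
            threshold = (low + high) / 2.0  # midpoint of consecutive distinct values
            left = [y[i] for i in order[:k]]
            right = [y[i] for i in order[k:]]
            # Sample-weighted average impurity of the two children.
            children_impurity = (len(left) * gini(left) + len(right) * gini(right)) / n
            decrease = parent_impurity - children_impurity
            if best is None or decrease > best[2]:
                best = (j, threshold, decrease)
    return best


# Toy data: feature 0 separates the classes cleanly, feature 1 does not.
X = [[2.0, 7.0], [3.0, 1.0], [10.0, 6.0], [11.0, 2.0]]
y = ["a", "a", "b", "b"]
print(best_split(X, y))  # (0, 6.5, 0.5)
```

On the toy data, the parent node has Gini impurity 0.5, and splitting feature 0 at 6.5 yields two pure children, so the decrease is the full 0.5. A tree builder would apply the same function recursively to the left and right subsets until a stopping rule is met.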