Junior · Decision Trees
What is “splitting” in a Decision Tree?
Updated Feb 20, 2026
Short answer
Splitting is the process of dividing the data at a node into smaller groups based on a condition on a feature, such as “age > 30”.
Deep explanation
In a decision tree, a split happens at every internal (non-leaf) node. The algorithm picks a feature (such as age, income, or temperature) and a rule on it (such as “greater than 30”), then divides the data reaching that node into subsets that are more “pure”, meaning each subset is dominated by a single outcome. For classification, candidate splits are scored with a measure such as Gini impurity or entropy, and the split that reduces impurity the most is chosen. The same procedure is then repeated on each resulting subset.
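To make the scoring step concrete, here is a minimal sketch of choosing a threshold on one numeric feature by weighted Gini impurity. The function names (`gini`, `best_split`) and the toy age data are hypothetical, chosen for illustration; real libraries search over all features and use faster implementations.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 1 - sum(p_k^2) over the class proportions p_k."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_split(values, labels):
    """Try a threshold halfway between each pair of adjacent distinct values.
    Return (threshold, weighted_gini) for the lowest weighted impurity,
    or (None, parent_gini) if no split improves on the parent node."""
    pairs = list(zip(values, labels))
    distinct = sorted(set(values))
    best = (None, gini(labels))  # baseline: impurity without splitting
    for lo, hi in zip(distinct, distinct[1:]):
        threshold = (lo + hi) / 2
        left = [y for x, y in pairs if x <= threshold]
        right = [y for x, y in pairs if x > threshold]
        # Weight each child's impurity by its share of the samples.
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / len(pairs)
        if weighted < best[1]:
            best = (threshold, weighted)
    return best

# Hypothetical toy data: age, and whether the person bought a product.
ages = [22, 25, 28, 35, 40, 52]
bought = ["no", "no", "no", "yes", "yes", "yes"]

threshold, impurity = best_split(ages, bought)
print(threshold, impurity)  # 31.5 0.0
```

On this toy data the rule “age > 31.5” separates the two classes perfectly, so both child nodes have impurity 0. On realistic data the best split usually only lowers the impurity, and the tree keeps splitting the children.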
Real-world example
In weather prediction:
- Is humidity > 70%?
  - Yes → likely rain
  - No → check temperature
    - Hot → sunny
    - Cold → cloudy
Each question splits the weather data into more specific groups.
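The weather example above can be written as nested conditions, where each `if` is one split and each `return` is a leaf. This is a hand-written sketch, not a learned model; the function name `predict_weather` and the choice to pass temperature as the label "hot" or "cold" are assumptions, since the example does not say how temperature is measured.

```python
def predict_weather(humidity_pct, temperature):
    """Hand-coded version of the example tree.

    humidity_pct: relative humidity as a percentage (0-100).
    temperature:  "hot" or "cold" (assumed encoding, see note above).
    """
    # First split: humidity threshold at 70%.
    if humidity_pct > 70:
        return "rain"
    # Second split, reached only on the low-humidity branch.
    if temperature == "hot":
        return "sunny"
    return "cloudy"

print(predict_weather(85, "hot"))   # rain
print(predict_weather(40, "hot"))   # sunny
print(predict_weather(40, "cold"))  # cloudy
```

Note that temperature is never checked when humidity is above 70%: a split only applies to the data that reaches its node.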
Common mistakes
- Choosing random splits instead of optimal ones.
- Thinking more splits always improve accuracy (too many splits can overfit).
- Confusing splitting with final prediction (splitting is just a step).
Follow-up questions
- What is Gini impurity?
- What is entropy in decision trees?
- How does a tree decide the best feature to split on?