What is “splitting” in a Decision Tree?

Updated Feb 20, 2026

Short answer

Splitting is the process of dividing the data at a node into smaller groups based on a condition on a feature.

Deep explanation

In a decision tree, splitting happens at each internal (non-leaf) node. The algorithm selects a feature (like age, income, or temperature) and a rule on that feature (like “greater than 30”). It then divides the data at that node into subsets that are more “pure”, meaning each subset is dominated by a single outcome. Candidate splits are scored with a measure such as Gini impurity or entropy, and the split that reduces impurity the most is chosen. The same process repeats on each subset until a stopping condition is met, for example when a node is pure or a depth limit is reached.
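To make the scoring step concrete, here is a minimal sketch in plain Python of how a single numeric feature could be searched for its best threshold using Gini impurity. The function names (gini, best_threshold) and the small age/purchase dataset are made up for illustration; real libraries use more optimized routines and may choose candidate thresholds differently.

```python
from collections import Counter


def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def best_threshold(values, labels):
    """Try the midpoint between each pair of adjacent distinct values.

    Returns (threshold, weighted_impurity) for the rule `value <= threshold`.
    If no split beats the unsplit node, threshold is None.
    """
    n = len(labels)
    best = (None, gini(labels))  # baseline: impurity without splitting
    distinct = sorted(set(values))
    for low, high in zip(distinct, distinct[1:]):
        t = (low + high) / 2
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        # Weight each side's impurity by its share of the samples.
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (t, score)
    return best


# Hypothetical toy data: customer age and whether they bought a product.
ages = [22, 25, 28, 35, 40, 52]
bought = ["no", "no", "no", "yes", "yes", "yes"]

threshold, impurity = best_threshold(ages, bought)
print(threshold, impurity)
```

On this toy data the unsplit node has an impurity of 0.5, and splitting between ages 28 and 35 separates the two classes completely. A full tree builder would run this search over every feature and then recurse into each subset.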

Real-world example

In weather prediction:

Is humidity > 70%?
- Yes → likely rain
- No → check temperature
  - Hot → sunny
  - Cold → cloudy

Each question splits the weather data into more specific groups.
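Once a tree has been built, its splits behave like nested if-statements. The sketch below writes the weather tree above as a plain Python function. The function name, the parameter names, and the choice to pass temperature as the strings "hot" and "cold" are assumptions for illustration; a learned tree would typically split temperature on a numeric threshold.

```python
def predict_weather(humidity_pct, temperature):
    """Walk the example tree: the humidity split first, then the temperature split."""
    if humidity_pct > 70:       # root split: Is humidity > 70%?
        return "rain"           # Yes branch ends in a leaf
    if temperature == "hot":    # No branch: second split, on temperature
        return "sunny"
    return "cloudy"             # any non-hot value is treated as cold here


print(predict_weather(85, "hot"))   # high humidity, so the first split decides
print(predict_weather(40, "hot"))
print(predict_weather(40, "cold"))
```

Note that the temperature question is only asked for samples that reached the "No" branch, which is why each split works on a smaller, more specific group than the one before it.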

Common mistakes

- Choosing splits at random instead of scoring candidate splits and picking the best one.
- Thinking more splits always improve accuracy (too many splits can overfit).
- Confusing splitting with final prediction (splitting is just a step).

Follow-up questions

What is Gini impurity?
What is entropy in decision trees?
How does a tree decide the best feature to split on?
