How does decision tree splitting work in data mining?

Updated May 15, 2026

Short answer

Decision trees split data on feature conditions chosen to make the resulting child nodes as pure as possible, meaning each child contains mostly one class.

Deep explanation

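At each node, the algorithm evaluates candidate splits (a feature plus a threshold or a grouping of categories) and scores each with an impurity metric such as Gini impurity or entropy. It picks the split with the largest impurity reduction: the parent's impurity minus the size-weighted average impurity of the children. The choice is greedy, so it is the best split at that node and not necessarily part of the best overall tree. The procedure then recurses on each child until a stopping criterion is met, such as a maximum depth, a minimum number of samples per leaf, or a node that is already pure.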
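The split search described above can be sketched in a few lines. This is a minimal illustration for one numeric feature, not how any particular library implements it. The function names `gini` and `best_split`, the midpoint choice for candidate thresholds, and the toy income data are all assumptions made for the example.

```python
from collections import Counter

def gini(labels):
    """Gini impurity: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(values, labels):
    """Return (threshold, weighted_impurity) for the best 'value <= threshold' split.

    Candidate thresholds are midpoints between consecutive distinct values.
    Returns (None, parent_impurity) when no split improves on the parent.
    """
    n = len(labels)
    best_threshold, best_impurity = None, gini(labels)
    distinct = sorted(set(values))
    for lo, hi in zip(distinct, distinct[1:]):
        threshold = (lo + hi) / 2
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        # Size-weighted average impurity of the two children.
        weighted = (len(left) * gini(left) + len(right) * gini(right)) / n
        if weighted < best_impurity:
            best_threshold, best_impurity = threshold, weighted
    return best_threshold, best_impurity

# Hypothetical toy data: income in thousands and the loan outcome.
incomes = [20, 25, 30, 60, 70, 80]
outcomes = ["default", "default", "default", "repaid", "repaid", "repaid"]
threshold, impurity = best_split(incomes, outcomes)
print(threshold, impurity)  # prints "45.0 0.0"
```

Here the threshold 45.0 separates the two classes perfectly, so both children have impurity 0. A full tree builder would run this search over every feature and then recurse on the two children.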

Real-world example

A credit risk scoring system splits customers on features such as income and credit history to separate likely defaulters from likely repayers.

Common mistakes

  • Letting the tree grow without pruning or depth limits, so it memorizes noise in the training data and overfits.
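To make the overfitting risk concrete, the sketch below grows a tree on one numeric feature and counts its leaves, once with no limit and once with a depth limit. Note that a depth limit is a stopping criterion (sometimes called pre-pruning), not post-pruning of a grown tree. It is used here only because it is the simplest control to show. The helper `count_leaves` and the noisy toy data are hypothetical.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a non-empty list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def count_leaves(values, labels, max_depth=None, depth=0):
    """Grow a tree on one numeric feature and return its number of leaves."""
    # Stop when the node is pure or the depth limit is reached.
    if len(set(labels)) == 1 or (max_depth is not None and depth >= max_depth):
        return 1
    n = len(labels)
    best = None  # (weighted impurity, threshold)
    distinct = sorted(set(values))
    for lo, hi in zip(distinct, distinct[1:]):
        t = (lo + hi) / 2
        left = [y for x, y in zip(values, labels) if x <= t]
        right = [y for x, y in zip(values, labels) if x > t]
        w = (len(left) * gini(left) + len(right) * gini(right)) / n
        if best is None or w < best[0]:
            best = (w, t)
    if best is None:  # all feature values identical, so no split is possible
        return 1
    t = best[1]
    left = [(x, y) for x, y in zip(values, labels) if x <= t]
    right = [(x, y) for x, y in zip(values, labels) if x > t]
    return (
        count_leaves([x for x, _ in left], [y for _, y in left], max_depth, depth + 1)
        + count_leaves([x for x, _ in right], [y for _, y in right], max_depth, depth + 1)
    )

# Hypothetical noisy data: the labels do not follow income cleanly.
incomes = [20, 25, 30, 35, 60, 70, 80, 90]
outcomes = ["default", "default", "repaid", "default",
            "repaid", "repaid", "default", "repaid"]

full = count_leaves(incomes, outcomes)                  # grows until every leaf is pure
shallow = count_leaves(incomes, outcomes, max_depth=1)  # a single split, two leaves
print(full, shallow)
```

The unrestricted tree keeps splitting until it has carved out every isolated label, including the ones that are probably noise, while the depth-limited tree stops after one split. On real data the limit would be tuned on held-out data.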

Follow-up questions

  • What is Gini impurity?
  • What is pruning?
