How do Decision Trees behave with missing not at random (MNAR) data?
Updated May 16, 2026
Short answer
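Decision Trees can split on the missingness pattern itself. Under MNAR that pattern is tied to the unobserved value, so the resulting splits can reflect how the data was collected rather than the true feature relationship, which biases the model.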
Deep explanation
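When data is Missing Not At Random (MNAR), the probability that a value is missing depends on the unobserved value itself, even after accounting for the observed features. A Decision Tree sees missingness only through its encoding: a sentinel value, an indicator column, or a dedicated missing branch all let the tree split on whether a value is present. Under MNAR such a split partly stands in for the hidden value, so the tree exploits the missingness pattern rather than the true feature relationship. The resulting model is biased, and it can degrade if the missingness mechanism changes after training. Some implementations handle missing values via surrogate splits or dedicated missing branches, but these only route incomplete rows through the tree; they do not correct for the mechanism behind the missingness. MNAR therefore remains a fundamental challenge that requires domain understanding or explicit modeling of the missingness.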
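To make the effect concrete, here is a minimal sketch in plain Python. Everything in it is an illustrative assumption rather than a real dataset or a library API: the `income` feature, the 70 cutoff, the missingness rates, the `SENTINEL` encoding, and the hypothetical `best_split` helper, which searches for the single threshold with the lowest weighted Gini impurity (one decision-stump split).

```python
import random

SENTINEL = -1.0  # hypothetical encoding for "missing"; real values are >= 0


def best_split(xs, ys):
    """Return (threshold, weighted Gini) of the best split `x <= threshold`."""
    pairs = sorted(zip(xs, ys))
    n = len(pairs)
    total_pos = sum(y for _, y in pairs)
    best_t, best_score = None, float("inf")
    left_n = left_pos = 0
    for i in range(n - 1):
        left_n += 1
        left_pos += pairs[i][1]
        if pairs[i][0] == pairs[i + 1][0]:
            continue  # no threshold fits between two equal values
        right_n = n - left_n
        right_pos = total_pos - left_pos
        # Gini of a binary node is 2p(1-p); weight each child by its size.
        left_gini = 2 * (left_pos / left_n) * (1 - left_pos / left_n)
        right_gini = 2 * (right_pos / right_n) * (1 - right_pos / right_n)
        score = (left_n * left_gini + right_n * right_gini) / n
        if score < best_score:
            best_t = (pairs[i][0] + pairs[i + 1][0]) / 2
            best_score = score
    return best_t, best_score


rng = random.Random(0)
income = [rng.uniform(0, 100) for _ in range(2000)]
label = [1 if x > 70 else 0 for x in income]  # true rule: income > 70

# MNAR: the chance that a value is missing depends on the value itself.
observed = [
    SENTINEL if rng.random() < (0.8 if x > 70 else 0.05) else x
    for x in income
]

t_full, _ = best_split(income, label)
t_mnar, _ = best_split(observed, label)
min_observed = min(x for x in observed if x != SENTINEL)

print(f"complete data split: x <= {t_full:.2f}")
print(f"MNAR data split:     x <= {t_mnar:.2f}")
print(f"smallest observed value: {min_observed:.2f}")
```

With complete data, the best threshold should land close to the true cutoff of 70. With the MNAR-masked data, it is expected to fall between the sentinel and the smallest observed value, meaning the split separates missing rows from observed rows instead of recovering the income rule. That split predicts well only for as long as high incomes keep going missing at the same rate.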