What is data leakage in model evaluation?

Updated May 17, 2026

Short answer

Data leakage occurs when information from the test set, or any information that would not be available at prediction time, influences training.

Deep explanation

Leakage produces overly optimistic evaluation scores: the model has indirectly learned from the test data, so the reported performance does not carry over to genuinely unseen data.

Real-world example

Using sales figures recorded after the forecast date as features when predicting demand for an earlier period.
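
One way to guard against this kind of leakage is to split by time and to build features only from rows that precede the day being predicted. The sketch below is a minimal illustration under assumptions: the `records` data, the `time_split` helper, and the `trailing_mean` feature are all hypothetical names invented for this example, not part of any library.

```python
from datetime import date

# Hypothetical daily sales records: (day, units_sold).
records = [
    (date(2024, 1, 1), 120),
    (date(2024, 1, 2), 135),
    (date(2024, 1, 3), 128),
    (date(2024, 1, 4), 150),
    (date(2024, 1, 5), 160),
]


def time_split(rows, cutoff):
    """Put every row before `cutoff` in train and the rest in test."""
    train = [r for r in rows if r[0] < cutoff]
    test = [r for r in rows if r[0] >= cutoff]
    return train, test


def trailing_mean(rows, day, window=2):
    """Mean units sold over the last `window` rows strictly before `day`.

    Rows on or after `day` are never read, so the feature cannot
    carry information from the future.
    """
    past = sorted(r for r in rows if r[0] < day)[-window:]
    if not past:
        return None  # no history available yet
    return sum(units for _, units in past) / len(past)


train, test = time_split(records, cutoff=date(2024, 1, 4))

# Feature for 4 January uses only 2 and 3 January.
feature = trailing_mean(records, date(2024, 1, 4))
```

A random shuffle split on the same rows would mix later days into the training set, which is the situation described in the example above.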

Common mistakes

  • Fitting preprocessing (for example scaling or imputation) on the full dataset before splitting it into train and test sets.
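
The mistake above can be made concrete with a small standardization sketch. It is illustrative only: `fit_scaler`, `transform`, and the `data` values are hypothetical names and numbers chosen for this example, and the standard library stands in for a real preprocessing toolkit.

```python
from statistics import mean, pstdev


def fit_scaler(values):
    """Learn mean and standard deviation from the given values only."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        sigma = 1.0  # avoid division by zero for constant features
    return mu, sigma


def transform(values, scaler):
    """Standardize values with previously learned statistics."""
    mu, sigma = scaler
    return [(v - mu) / sigma for v in values]


# Hypothetical feature column; the last row is held out as the test set.
data = [1.0, 2.0, 3.0, 4.0, 100.0]
train, test = data[:4], data[4:]

# Leaky: the statistics include the test row, so the scaled training
# values already reflect the held-out outlier.
leaky_scaler = fit_scaler(data)

# Correct: fit on the training rows only, then reuse those statistics
# unchanged when transforming the test rows.
clean_scaler = fit_scaler(train)
train_scaled = transform(train, clean_scaler)
test_scaled = transform(test, clean_scaler)
```

Here the leaky mean is pulled toward the held-out value, while the clean mean depends on the training rows alone. With cross-validation the same rule applies per fold: the preprocessing step is refit on each fold's training portion, which is what wrapping the steps in a scikit-learn `Pipeline` is meant to achieve.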

Follow-up questions

  • How do you prevent leakage?
  • What is target leakage?
