What is the role of training, validation, and test sets?

Updated May 17, 2026

Short answer

Training data fits the model, validation data tunes it, and test data evaluates final performance.

Deep explanation

The training set is used to learn model parameters. The validation set is used for hyperparameter tuning and model selection. The test set is used only once to estimate real-world performance. Proper separation prevents data leakage and ensures unbiased evaluation.

Real-world example

In medical diagnosis models, validation helps tune thresholds while test data ensures safe deployment performance.

Common mistakes

Using test data repeatedly for tuning hyperparameters.

Follow-up questions

What is cross-validation?
Why is data leakage dangerous?

Short answer

Deep explanation

Real-world example

Common mistakes

Follow-up questions

More Supervised Learning interview questions