What is bootstrap sampling in bagging?

Updated May 16, 2026

Short answer

Bootstrap sampling creates multiple training sets, each typically the same size as the original, by drawing data points at random with replacement from the original dataset.

Deep explanation

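In bagging (bootstrap aggregating), each model is trained on its own bootstrap sample, generated by drawing data points uniformly at random with replacement. Because of the replacement, some points appear more than once in a given sample while others are left out entirely. For a dataset of n points, a bootstrap sample of size n contains on average about 63.2% of the distinct original points, since each point is left out with probability (1 - 1/n)^n, which approaches 1/e ≈ 0.368 as n grows. Each model therefore sees a slightly different dataset, which increases diversity among the models. Aggregating their predictions (averaging for regression, majority vote for classification) reduces variance and improves robustness.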
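A minimal sketch of drawing one bootstrap sample, using only the Python standard library. The function name bootstrap_sample, the toy dataset, and the seed are assumptions made for illustration and do not come from any particular library.

```python
import random

def bootstrap_sample(data, rng):
    """Draw len(data) items from data uniformly at random, with replacement."""
    n = len(data)
    # Each index is drawn independently, so the same index can appear many times.
    indices = [rng.randrange(n) for _ in range(n)]
    sample = [data[i] for i in indices]
    return sample, indices

rng = random.Random(0)          # fixed seed so the run is repeatable
data = list(range(1000))        # toy dataset (illustrative assumption)
sample, indices = bootstrap_sample(data, rng)

# Fraction of distinct original points that made it into the sample.
unique_fraction = len(set(indices)) / len(data)

print(len(sample))                  # same size as the original dataset
print(round(unique_fraction, 2))    # expected to be close to 0.63
```

The sample has the same length as the original data, yet only around 63% of the distinct points appear in it; the remaining points were never drawn.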

Real-world example

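Random Forest trains each decision tree on its own bootstrap sample of the training data, and additionally considers only a random subset of features at each split to make the trees even less correlated.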
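The sketch below shows the bagging half of that idea on a toy problem. It is not Random Forest: it swaps decision trees for one-feature threshold classifiers ("stumps") and leaves out random feature selection, to keep the example short. All function names and the toy data are hypothetical.

```python
import random
from collections import Counter

def fit_stump(points):
    """Fit a one-feature threshold classifier that predicts 1 when x >= threshold.

    points is a list of (x, label) pairs with labels 0 or 1.
    """
    best_threshold, best_errors = None, None
    for threshold, _ in points:
        # Count training points this candidate threshold would misclassify.
        errors = sum((x >= threshold) != bool(y) for x, y in points)
        if best_errors is None or errors < best_errors:
            best_threshold, best_errors = threshold, errors
    return best_threshold

def fit_bagged_stumps(points, n_models, rng):
    """Train n_models stumps, each on its own bootstrap sample of points."""
    n = len(points)
    thresholds = []
    for _ in range(n_models):
        # Bootstrap sample: n draws with replacement from the original data.
        sample = [points[rng.randrange(n)] for _ in range(n)]
        thresholds.append(fit_stump(sample))
    return thresholds

def predict(thresholds, x):
    """Aggregate the individual stump predictions by majority vote."""
    votes = Counter(int(x >= t) for t in thresholds)
    return votes.most_common(1)[0][0]

rng = random.Random(42)
# Toy data (illustrative assumption): label is 1 when x >= 5.
points = [(x, int(x >= 5)) for x in range(10)] * 3
thresholds = fit_bagged_stumps(points, n_models=25, rng=rng)

print(predict(thresholds, 1), predict(thresholds, 8))
```

On this toy data the ensemble should classify x = 1 as 0 and x = 8 as 1. An odd number of models avoids ties in the binary majority vote.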

Common mistakes

  • Thinking bootstrap samples are smaller than the original dataset. By default they are the same size as the original, but they contain duplicates and omit some points.

Follow-up questions

  • Why does replacement matter?
  • What is out-of-bag error?
