What is data augmentation in supervised learning?

Updated May 17, 2026

Short answer

Data augmentation artificially increases the size and diversity of a training set by applying transformations to existing examples while keeping their labels valid.

Deep explanation

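Data augmentation creates new training examples by applying transformations such as rotation, scaling, cropping, or noise injection to existing ones, usually keeping the original label. It improves generalization because the model sees more varied input conditions during training and is less able to memorize incidental details of individual examples. It is widely used in computer vision and increasingly in NLP and tabular learning.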
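
Two of the transformations named above, noise injection and cropping, can be sketched as follows. This is a minimal illustration, assuming a grayscale image stored as a 2-D NumPy array with values in [0, 1]; the helper names add_noise and random_crop are hypothetical and do not come from any particular library.

```python
import numpy as np

def add_noise(image, rng, sigma=0.05):
    """Return a copy of `image` with Gaussian noise added, clipped to [0, 1]."""
    noisy = image + rng.normal(0.0, sigma, size=image.shape)
    return np.clip(noisy, 0.0, 1.0)

def random_crop(image, rng, crop_h, crop_w):
    """Return a randomly positioned (crop_h, crop_w) window of a 2-D image."""
    h, w = image.shape
    top = rng.integers(0, h - crop_h + 1)    # upper bound is exclusive
    left = rng.integers(0, w - crop_w + 1)
    return image[top:top + crop_h, left:left + crop_w]

rng = np.random.default_rng(0)
image = rng.random((32, 32))                 # stand-in for a real image (assumption)
noisy = add_noise(image, rng)                # same shape, slightly perturbed pixels
cropped = random_crop(image, rng, 28, 28)    # smaller window of the same image
```

Each call produces a different variant of the same underlying example, and the label attached to the original image would be reused unchanged.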

Real-world example

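An image classification system trained on rotated and flipped copies of its training images in addition to the originals, so that it sees the same objects in several orientations.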
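
A rough sketch of how such a training set might be expanded is shown below. It assumes square grayscale images stacked in a NumPy array of shape (n, height, width) and a label array of length n; the function name augment_with_flips_and_rotations is hypothetical, and real pipelines often apply such transforms randomly during training instead of materializing copies.

```python
import numpy as np

def augment_with_flips_and_rotations(images, labels):
    """Return the originals plus a mirrored and a 90-degree rotated copy of
    every image, with labels repeated to match.

    Assumes square images so the rotated copies keep the same shape.
    """
    flipped = images[:, :, ::-1]                   # mirror each image left-right
    rotated = np.rot90(images, k=1, axes=(1, 2))   # rotate each image by 90 degrees
    aug_images = np.concatenate([images, flipped, rotated], axis=0)
    aug_labels = np.concatenate([labels, labels, labels], axis=0)
    return aug_images, aug_labels

# Toy data standing in for a real dataset (assumption).
images = np.arange(2 * 4 * 4, dtype=np.float32).reshape(2, 4, 4)
labels = np.array([0, 1])
aug_images, aug_labels = augment_with_flips_and_rotations(images, labels)
```

The augmented set is three times the size of the original, and every copy carries the label of the image it was derived from.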

Common mistakes

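  • Applying unrealistic transformations that change what the data means, so the original label no longer holds (for example, rotating a handwritten "6" by 180 degrees makes it look like a "9").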

Follow-up questions

  • Why does augmentation reduce overfitting?
  • Is augmentation used in tabular data?
