What is data augmentation in supervised learning?

Updated May 17, 2026

Short answer

Data augmentation artificially enlarges a training set by applying label-preserving transformations to existing examples.

Deep explanation

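Data augmentation creates new training examples by applying transformations such as rotation, scaling, noise injection, or cropping to existing ones while keeping their labels. Because the model sees the same labeled content under varied input conditions, it is less able to memorize incidental details of individual examples, which improves generalization. The technique is widely used in computer vision and increasingly in NLP and tabular learning.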
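As a rough illustration of the transformations listed above, here is a minimal NumPy sketch. The function name augment, the 8x8 random "image", the label value, and the noise level of 0.05 are all assumptions chosen for the example, not part of any particular library or pipeline.

```python
import numpy as np

def augment(image, rng):
    """Return a randomly transformed copy of a square float image in [0, 1].

    Hypothetical helper for illustration only.
    """
    out = image
    # Horizontal flip with probability 0.5.
    if rng.random() < 0.5:
        out = np.fliplr(out)
    # Rotate by a random multiple of 90 degrees (0, 90, 180, or 270).
    out = np.rot90(out, k=int(rng.integers(0, 4)))
    # Inject small Gaussian noise (standard deviation 0.05 is an assumed value).
    out = out + rng.normal(0.0, 0.05, size=out.shape)
    # Keep pixel values in the valid range.
    return np.clip(out, 0.0, 1.0)

rng = np.random.default_rng(0)
image = rng.random((8, 8))   # stand-in for a real grayscale training image
label = 3                    # the label is copied unchanged to every variant

# Four augmented (image, label) pairs derived from one original example.
augmented = [(augment(image, rng), label) for _ in range(4)]
```

Each augmented pair reuses the original label, which is only valid when the chosen transformations do not change what the image depicts.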

Real-world example

An image classification system trained on rotated and flipped copies of its training images, so that it still recognizes objects that appear in a different orientation.

Common mistakes

Applying unrealistic transformations that distort the meaning of the data, so that an augmented example no longer matches the label it inherits.
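A toy sketch of this mistake, assuming hand-drawn 5x3 binary glyphs for the letters "b" and "d" (the glyphs and variable names are made up for illustration): a horizontal flip, harmless for most natural photos, turns one class into the other.

```python
import numpy as np

# Hypothetical 5x3 binary glyph for the letter "b".
glyph_b = np.array([
    [1, 0, 0],
    [1, 0, 0],
    [1, 1, 1],
    [1, 0, 1],
    [1, 1, 1],
])

# Hypothetical 5x3 binary glyph for the letter "d".
glyph_d = np.array([
    [0, 0, 1],
    [0, 0, 1],
    [1, 1, 1],
    [1, 0, 1],
    [1, 1, 1],
])

# Flipping "b" horizontally yields exactly the "d" glyph, so keeping
# the label "b" on the flipped image would teach the model a wrong label.
flipped_b = np.fliplr(glyph_b)
flip_changes_class = np.array_equal(flipped_b, glyph_d)
```

In a character recognition task like this one, horizontal flips would have to be left out of the augmentation set.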

Follow-up questions

Why does augmentation reduce overfitting?
Is augmentation used in tabular data?
