Senior · Supervised Learning
What is data augmentation in supervised learning?
Updated May 17, 2026
Short answer
Data augmentation enlarges the effective training set by applying label-preserving transformations to existing examples.
Deep explanation
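Data augmentation creates new training examples by applying transformations such as rotation, scaling, cropping, or noise injection to existing ones, with each new example keeping the label of the example it came from. It improves generalization by exposing the model to varied input conditions, so the model learns to ignore changes that do not affect the label. It is widely used in computer vision and increasingly in NLP and tabular learning.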
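As a rough illustration of the transformations named above, here is a minimal NumPy sketch, assuming images are H x W x C float arrays with values in [0, 1]. The function names (`augment_image`, `augment_dataset`) and the parameters `noise_std`, `crop_fraction`, and `copies` are hypothetical choices made for this example, and scaling is left out because it needs interpolation. Production pipelines usually rely on a dedicated augmentation library instead.

```python
import numpy as np


def augment_image(image, rng, noise_std=0.05, crop_fraction=0.9):
    """Return one randomly transformed copy of an H x W x C image in [0, 1]."""
    out = image
    # Random horizontal flip (mirror left-right) with probability 0.5.
    if rng.random() < 0.5:
        out = out[:, ::-1, :]
    # Random rotation by 0, 90, 180, or 270 degrees in the H-W plane.
    out = np.rot90(out, k=int(rng.integers(0, 4)), axes=(0, 1))
    # Random crop that keeps crop_fraction of each side.
    h, w = out.shape[:2]
    ch = max(1, int(h * crop_fraction))
    cw = max(1, int(w * crop_fraction))
    top = int(rng.integers(0, h - ch + 1))
    left = int(rng.integers(0, w - cw + 1))
    out = out[top:top + ch, left:left + cw, :]
    # Additive Gaussian noise, clipped back to the valid pixel range.
    out = out + rng.normal(0.0, noise_std, size=out.shape)
    return np.clip(out, 0.0, 1.0)


def augment_dataset(images, labels, rng, copies=2):
    """Keep every original and add `copies` augmented versions with the same label."""
    aug_images, aug_labels = [], []
    for image, label in zip(images, labels):
        aug_images.append(image)
        aug_labels.append(label)
        for _ in range(copies):
            aug_images.append(augment_image(image, rng))
            aug_labels.append(label)
    return aug_images, aug_labels


# Usage on synthetic data (random arrays stand in for real images).
rng = np.random.default_rng(0)
images = [rng.random((32, 32, 3)) for _ in range(4)]
labels = [0, 1, 0, 1]
aug_images, aug_labels = augment_dataset(images, labels, rng, copies=2)
print(len(aug_images), aug_images[1].shape)
```

The sketch returns Python lists rather than one stacked array because the crop changes the image size. A real pipeline would typically resize back to a fixed shape and apply the transforms on the fly during training rather than materializing the copies.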
Real-world example
An image classification system is trained on rotated and flipped copies of its training images so that it recognizes objects regardless of their orientation.
Common mistakes
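- Applying unrealistic transformations that change what an example means, for instance rotating a handwritten 6 by 180 degrees so that it reads as a 9 while it keeps the label 6.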
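The mistake above can be shown with a toy case, assuming a hypothetical task where the label is the direction an arrow points. A horizontal flip is harmless for most object classes but silently invalidates this label unless the label is updated too. The glyph, the `SWAP` table, and `hflip_with_label` are illustrative names, not part of any library.

```python
import numpy as np

# Hypothetical 3 x 5 glyph of an arrow pointing left; its label is "left".
arrow_left = np.array([
    [0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1],
    [0, 1, 0, 0, 0],
])
# The mirror image is exactly what a "right" example looks like.
arrow_right = arrow_left[:, ::-1]

# Labels whose meaning changes under a horizontal flip.
SWAP = {"left": "right", "right": "left"}


def hflip_with_label(image, label):
    """Mirror the image and update a direction label to match."""
    return image[:, ::-1], SWAP.get(label, label)


flipped, new_label = hflip_with_label(arrow_left, "left")
# Keeping the old label "left" for `flipped` would add a mislabeled example.
print(new_label, np.array_equal(flipped, arrow_right))
```

When no sensible label update exists, as with rotating digits, the safer choice is to drop that transformation for the task.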
Follow-up questions
- Why does augmentation reduce overfitting?
- Is augmentation used with tabular data?