Data Augmentation
A regularization technique that artificially expands the training dataset by applying label-preserving transformations to existing examples, forcing the model to learn invariances.
Data augmentation creates modified versions of training examples by applying transformations that change the input but not the label. For images, this includes random cropping, flipping, rotation, color jittering, and scaling. For text, techniques include synonym replacement, back-translation, and random insertion/deletion. For audio, time stretching and pitch shifting are common.
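As a minimal sketch of label-preserving image transforms, the following uses only NumPy (the function names and sizes are illustrative, not from any particular library):

```python
import numpy as np

rng = np.random.default_rng(0)

def random_flip(img: np.ndarray) -> np.ndarray:
    """Horizontally flip an H x W x C image with probability 0.5."""
    return img[:, ::-1, :].copy() if rng.random() < 0.5 else img

def random_crop(img: np.ndarray, size: int) -> np.ndarray:
    """Cut a random size x size patch (assumes the image is at least that big)."""
    h, w, _ = img.shape
    top = rng.integers(0, h - size + 1)
    left = rng.integers(0, w - size + 1)
    return img[top:top + size, left:left + size, :]

# A toy 32 x 32 RGB "image": both transforms change the pixels but not the label.
img = rng.random((32, 32, 3))
aug = random_crop(random_flip(img), 28)
print(aug.shape)  # (28, 28, 3)
```

In practice these transforms are applied on the fly each epoch, so the model sees a different random variant of every example on every pass through the data.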
The technique works as a form of regularization by encoding prior knowledge about invariances. Flipping an image of a cat horizontally still shows a cat -- by training on both the original and flipped versions, the model learns horizontal flip invariance without needing twice as many unique examples. This is particularly valuable when labeled data is expensive or limited, which is common in medical imaging, satellite imagery, and specialized domains.
More advanced augmentation methods include mixup (blending two training images and their labels), cutout (masking random patches), and automated augmentation search (learning optimal augmentation policies). Data augmentation is one of the most reliably effective techniques for improving model generalization in computer vision, often providing larger gains than architectural changes. It directly addresses the fundamental problem that models need to see more variation than real datasets provide.
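Mixup can be sketched in a few lines: each new example is a convex combination of two training examples, with the same weight applied to their one-hot labels (the `alpha` value below is a common default, not prescribed by the source):

```python
import numpy as np

rng = np.random.default_rng(0)

def mixup(x1, y1, x2, y2, alpha: float = 0.2):
    """Blend two examples and their one-hot labels with a Beta-sampled weight."""
    lam = rng.beta(alpha, alpha)
    x = lam * x1 + (1 - lam) * x2
    y = lam * y1 + (1 - lam) * y2
    return x, y

# Two toy inputs with one-hot labels from a 3-class problem.
x1, y1 = np.ones((4, 4)), np.array([1.0, 0.0, 0.0])
x2, y2 = np.zeros((4, 4)), np.array([0.0, 1.0, 0.0])
x, y = mixup(x1, y1, x2, y2)
print(y)  # soft label, e.g. [0.73, 0.27, 0.0] depending on the sampled weight
```

Because the blended label still sums to one, the model is trained with a standard cross-entropy loss against a soft target rather than a hard class index.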
Last updated: February 22, 2026