Machine Learning

Adversarial Training

A defense technique in which adversarial examples (inputs deliberately perturbed to cause a wrong prediction) are added to the training data with their correct labels. The model learns to classify both normal and adversarial inputs correctly, which makes it more robust against attacks.

Why It Matters

Adversarial training is among the most effective known empirical defenses against adversarial attacks. It can make models substantially more robust to the kinds of perturbations they were trained against, which matters in safety-critical applications.

Example

Generating adversarial versions of training images and adding them to the training set with their original labels, so the classifier learns to identify objects correctly even when adversarial noise is present.
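The idea can be sketched in a few lines of code. The following is a minimal illustration rather than a reference implementation: it assumes a two-feature logistic-regression classifier in place of an image classifier, and it assumes the fast gradient sign method (FGSM) as the attack, neither of which is specified above. All function names, the synthetic data, and the values of `eps`, `lr`, and `epochs` are hypothetical choices made for the example.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(w, b, x):
    # Probability that x belongs to class 1.
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def loss(w, b, x, y):
    # Binary cross-entropy, clipped to avoid log(0).
    p = min(max(predict(w, b, x), 1e-12), 1.0 - 1e-12)
    return -(y * math.log(p) + (1 - y) * math.log(1.0 - p))


def sign(v):
    return (v > 0) - (v < 0)


def fgsm(w, b, x, y, eps):
    # For logistic regression, d(loss)/dx = (p - y) * w.
    # FGSM moves each feature by eps in the direction that increases the loss.
    p = predict(w, b, x)
    return [xi + eps * sign((p - y) * wi) for xi, wi in zip(x, w)]


def train(data, eps, adversarial, epochs=50, lr=0.1):
    # Plain SGD; with adversarial=True each clean example is paired with an
    # FGSM example crafted against the current weights, keeping the true label.
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        for x, y in data:
            batch = [(x, y)]
            if adversarial:
                batch.append((fgsm(w, b, x, y, eps), y))
            for xb, yb in batch:
                err = predict(w, b, xb) - yb
                w = [wi - lr * err * xi for wi, xi in zip(w, xb)]
                b -= lr * err
    return w, b


def accuracy(w, b, data, eps=0.0):
    # With eps > 0, each input is first attacked with FGSM against (w, b).
    correct = 0
    for x, y in data:
        x_eval = fgsm(w, b, x, y, eps) if eps > 0 else x
        correct += int((predict(w, b, x_eval) >= 0.5) == (y == 1))
    return correct / len(data)


# Synthetic two-class data (an assumption for the demo): two Gaussian blobs.
rng = random.Random(0)
data = [([rng.gauss(-2, 0.5), rng.gauss(-2, 0.5)], 0) for _ in range(50)]
data += [([rng.gauss(2, 0.5), rng.gauss(2, 0.5)], 1) for _ in range(50)]
rng.shuffle(data)

EPS = 0.5  # hypothetical perturbation budget (L-infinity norm)
w_std, b_std = train(data, EPS, adversarial=False)
w_adv, b_adv = train(data, EPS, adversarial=True)

print("standard model, accuracy under attack:", accuracy(w_std, b_std, data, EPS))
print("adversarially trained, accuracy under attack:", accuracy(w_adv, b_adv, data, EPS))
```

Because the adversarial examples are regenerated against the current weights at every step, the model keeps training on the perturbations that are hardest for it at that moment. On data this easy both models may score similarly; the gap usually becomes visible only on harder tasks and with stronger attacks.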

Think of it like...

Like a martial artist who spars against opponents with unconventional techniques: practicing against the unexpected leaves them better prepared for real fights.

Related Terms