Exploding Gradient
A training problem where gradients become extremely large during backpropagation, causing weight updates to be so drastic that the model becomes unstable and training diverges.
Why It Matters
Exploding gradients can completely derail model training. Gradient clipping is the standard mitigation; most modern frameworks provide it as a built-in utility, though it usually must be enabled explicitly rather than being on by default.
Example
During training, the loss suddenly spikes and then becomes NaN (not a number) because gradients grew so large that the weight updates overflowed the range of floating-point numbers.
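The mechanics behind this can be sketched in a few lines. This is an illustrative toy (the per-layer factor is a made-up stand-in, not real backpropagation): during the backward pass the gradient is multiplied by a factor at every layer, and if that factor is consistently greater than 1, the gradient grows exponentially with depth until it overflows.

```python
import math

# Toy illustration: a hypothetical per-layer factor > 1 applied repeatedly,
# mimicking how gradients can grow layer by layer in a deep network.
grad = 1.0
per_layer_factor = 2.5  # assumed value for illustration only

for layer in range(1000):
    grad *= per_layer_factor  # gradient grows exponentially with depth

print(math.isinf(grad))  # True: the value overflowed to infinity
```

The same loop with a factor below 1 shrinks the gradient toward zero instead, which is the vanishing-gradient side of the coin.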
Think of it like...
Like a microphone feedback loop — a small signal gets amplified over and over until it becomes an ear-splitting screech that destroys the signal entirely.
Related Terms
Vanishing Gradient Problem
A training difficulty in deep networks where gradients become exponentially smaller as they are propagated back through many layers, making it nearly impossible for early layers to learn.
Gradient Clipping
A technique that caps gradient values at a maximum threshold during training to prevent exploding gradients. If a gradient exceeds the threshold, it is scaled down.
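Clipping by global norm can be sketched as follows. This is a minimal, framework-free version (the function name and values here are assumptions for illustration): if the gradient vector's L2 norm exceeds the threshold, every component is scaled down so the norm equals the threshold, preserving the gradient's direction.

```python
import math

# Minimal sketch of gradient clipping by global L2 norm (illustrative only).
def clip_by_norm(grads, max_norm):
    total_norm = math.sqrt(sum(g * g for g in grads))
    if total_norm > max_norm:
        scale = max_norm / total_norm  # shrink uniformly, keep direction
        return [g * scale for g in grads]
    return grads

clipped = clip_by_norm([300.0, 400.0], max_norm=5.0)  # original norm: 500
print(clipped)  # [3.0, 4.0] -> new norm is exactly 5.0
```

Frameworks ship equivalents of this, e.g. PyTorch's torch.nn.utils.clip_grad_norm_.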
Backpropagation
The primary algorithm used to train neural networks. It calculates how much each weight in the network contributed to the error, then adjusts weights backward from the output layer to reduce future errors.
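The chain-rule step at the heart of backpropagation can be shown on a hypothetical one-weight "network" (the numbers and setup here are invented for illustration): the prediction is w * x, the loss is squared error, and the weight's gradient is obtained by multiplying derivatives backward from the loss.

```python
# Toy one-weight example (illustrative): prediction = w * x,
# loss = (prediction - target) ** 2.
x, target = 2.0, 10.0
w = 3.0

pred = w * x                       # forward pass: 6.0
loss = (pred - target) ** 2        # 16.0

# backward pass via the chain rule:
# dloss/dpred = 2 * (pred - target), dpred/dw = x
grad_w = 2 * (pred - target) * x   # -16.0

w -= 0.1 * grad_w                  # update the weight to reduce the error
print(w)  # 4.6
```

After this single update the new prediction is 9.2 and the loss drops from 16.0 to 0.64, which is exactly the behavior backpropagation-driven training relies on.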