The exploding gradient problem was first described in an academic paper written titled "The problem of learning long-term dependencies in recurrent networks". The exploding gradient problem is a difficulty which can occur when training artificial neural networks using gradient descent by back propagation.

When large error gradients accumulate the model may become unstable and impair effective learning. Change in model weights can create an unstable network. The values of weights can become so large and cause overflow. A gradient is the direction and magnitude calculated during the training of a neural network it is used to teach the network weights in the right direction by the right amount. When there is an error gradient, explosion of components may grow exponentially.

Exploding gradient problem can be addressed by redesigning the network model, using rectified linear activation, using long short term memory (LSTM) networks, gradient clipping and weight regularization.Another solution to the exploding gradient problem is to prevent gradients from becoming to0 big applying a process known as gradient clipping that places a predefined threshold on each gradient. Gradient clipping ensures the gradients stay heading towards the same direction but with shorter lengths.

## Timeline

## People

## Further reading

Learning Long-Term Dependancies with Gradient Descent is Difficult

Yoshua Bengio, Patrice Simard, Paolo Frasconi

Academic

On the difficulty of training recurrent neural networks

Razvan Pascanu, Tomas Mikolov and Yoshua Bengio

Academic paper

The curious case of the vanishing & exploding gradient

Eniola Alese

Web

June 5, 2018

Understanding the exploding gradient problem

Razvan Pascanu, Tomas Mikolov and Yoshua Bengio

Academic paper