A fundamental problem that affects gradients in the early layers of a neural network

Edits on 29 Aug 2019


The unstable gradient problem is a fundamental problem in deep neural networks: the gradient in the early layers tends to either explode or vanish.

The unstable gradient problem is not so much the vanishing gradient problem or the exploding gradient problem in particular. The underlying cause is that the gradient in an early layer is the product of terms from all subsequent layers. With many layers, such a product is intrinsically unstable. The only way all layers can learn at close to the same speed, avoiding both vanishing and exploding gradients, is for these products of terms to come close to balancing out. Such a balance becomes less and less likely to occur by chance as layers are added. Without some mechanism or underlying reason for balancing the learning speeds, the layers of a neural network therefore learn at different speeds.
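The product structure can be written out for a toy network. This is a sketch under assumptions that are not part of the text above: a chain of L layers with one sigmoid neuron each, where z_j = w_j a_{j-1} + b_j is the weighted input of layer j, a_j = σ(z_j) is its activation, and C is the cost. Under those assumptions the gradient with respect to the first bias is:

```latex
\frac{\partial C}{\partial b_1}
  = \sigma'(z_1)\, w_2 \sigma'(z_2)\, w_3 \sigma'(z_3) \cdots w_L \sigma'(z_L)\,
    \frac{\partial C}{\partial a_L}
```

Each later layer contributes one factor w_j σ'(z_j). If these factors are mostly smaller than 1 in magnitude, the product shrinks as layers are added. If they are mostly larger than 1, it grows.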

When gradient magnitudes accumulate across layers in this way, the network is more likely to become unstable, which is one cause of poor prediction results.
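The accumulation can be illustrated numerically. The following is a minimal sketch, assuming a hypothetical toy chain with one sigmoid neuron per layer, in which the gradient for the bias of layer j is proportional to sigmoid'(z_j) times the product of w_k * sigmoid'(z_k) over all later layers k. The function names and the chosen weights are illustrative assumptions, not taken from the source.

```python
import math


def sigmoid_prime(z):
    """Derivative of the logistic sigmoid at z."""
    s = 1.0 / (1.0 + math.exp(-z))
    return s * (1.0 - s)


def layer_gradient_scales(weights, pre_activations):
    """Relative size of dC/db_j for each layer j of a one-neuron-per-layer chain.

    weights[j] and pre_activations[j] belong to layer j (0-based). The scale
    for layer j is sigmoid'(z_j) times the product of w_k * sigmoid'(z_k)
    over all later layers k > j. The factor dC/da_L, shared by every layer,
    is left out. This is a toy model, not a general backpropagation routine.
    """
    scales = [0.0] * len(weights)
    later_product = 1.0  # product of terms from the layers after the current one
    for j in reversed(range(len(weights))):
        local = sigmoid_prime(pre_activations[j])
        scales[j] = local * later_product
        later_product *= weights[j] * local
    return scales


depth = 10
for w in (1.0, 4.0, 8.0):
    # Every weighted input is set to 0, where sigmoid'(0) = 0.25.
    scales = layer_gradient_scales([w] * depth, [0.0] * depth)
    print(f"w={w}: first layer {scales[0]:.3e}, last layer {scales[-1]:.3e}")
```

With every weighted input at 0, each later-layer term equals w * 0.25. The terms balance only when w is exactly 4. For w = 1 the first-layer scale is far smaller than the last-layer scale, and for w = 8 it is far larger, so the layers would learn at very different speeds.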

| Title | Author | Link | Type | Date |
| --- | --- | --- | --- | --- |
| Neural networks and deep learning | Michael Nielsen | | Web | |

Edits on 26 Aug 2019
