# Momentum (support vector machine)

A machine learning strategy that helps accelerate stochastic gradient descent in the relevant direction while dampening oscillations.

### All edits by Daniel Frumkin

Edits on 6 March, 2019

Article

SGD with momentum can be understood through an analogy from physics in which a ball gains and loses momentum as it rolls over hilly terrain. If a learning algorithm's loss is interpreted as the height of the terrain, the gradient of the loss function can be related to the force acting on the ball as it rolls up and down the hills. Specifically, the force equals the negative gradient of the loss function, where the loss plays the role of the ball's potential energy at any point on a hill. This means that Force = −∇U, where U = mgh is the potential energy of the ball. Setting the ball's initial velocity to zero at some location is analogous to initializing the parameters with random numbers.
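One common way to turn this picture into an update rule is sketched below. The symbols are assumptions introduced for illustration and do not appear in the article: θ is the parameter vector (the ball's position), v its velocity, L the loss (standing in for U), η a learning rate, and μ in [0, 1) a friction-like momentum coefficient. Treating the ball as having unit mass, the force changes the velocity and the velocity changes the position:

$$
F = -\nabla U
\quad\longrightarrow\quad
v_{t+1} = \mu\, v_t - \eta\, \nabla L(\theta_t),
\qquad
\theta_{t+1} = \theta_t + v_{t+1}
$$

Under these assumptions, setting μ = 0 removes the memory of past velocity and recovers plain gradient descent, while starting from v_0 = 0 corresponds to releasing the ball from rest.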

Minimizing the loss function can then be seen as trying to get the ball to reach the deepest valley in the terrain, where the loss is smallest. When a hill is very steep, the momentum the ball has built up by the bottom can carry it up and over shorter hills. As the slope decreases, the ball's momentum and velocity also decrease, and the ball eventually comes to rest in a valley. In other words, the momentum strategy simulates the parameter vector (i.e. the ball) rolling on the hilly terrain. The goal is to descend the slope faster than without momentum, while still controlling the velocity of the descent so the ball does not overshoot the valley altogether.
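The rolling-ball description can be tried out on a small example. The following is a minimal sketch, not a reference implementation: the toy loss, the function names `loss`, `grad`, and `descend`, and the values `lr=0.02` and `mu=0.9` are all hypothetical choices made for illustration. It also uses the exact gradient rather than a noisy minibatch gradient, so it shows the momentum mechanics without the stochastic part of SGD.

```python
def loss(theta):
    """Toy loss: a long, narrow valley, L(x, y) = 0.5 * (x**2 + 50 * y**2)."""
    x, y = theta
    return 0.5 * (x * x + 50.0 * y * y)


def grad(theta):
    """Exact gradient of the toy loss (no minibatch noise in this sketch)."""
    x, y = theta
    return (x, 50.0 * y)


def descend(theta, lr=0.02, mu=0.9, steps=200):
    """Run `steps` momentum updates; mu=0.0 reduces to plain gradient descent."""
    velocity = (0.0, 0.0)  # the ball starts at rest
    for _ in range(steps):
        g = grad(theta)
        # Friction-like decay of the old velocity plus a push downhill.
        velocity = tuple(mu * v - lr * gi for v, gi in zip(velocity, g))
        # Move the parameters (the ball) by the updated velocity.
        theta = tuple(t + v for t, v in zip(theta, velocity))
    return theta


start = (10.0, 1.0)
print("plain GD loss:     ", loss(descend(start, mu=0.0)))
print("with momentum loss:", loss(descend(start, mu=0.9)))
```

With these particular settings the momentum run should finish at a lower loss than the plain run, because velocity accumulates along the shallow direction of the valley. Other learning rates or momentum coefficients can behave differently, including overshooting.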

Further reading

| Title | Author | Type |
| --- | --- | --- |
| Momentum Acceleration of Least-Squares Support Vector Machines | Jorge López, Álvaro Barbero, José R. Dorronsoro | |
| Solving the model - SGD, Momentum and Adaptive Learning Rate | Paras Dahal | Web |

Article

Momentum is a machine learning strategy that helps accelerate stochastic gradient descent (SGD) in the relevant direction while dampening oscillations. This is also referred to as "SGD with momentum" and is useful for training deep neural networks.
