Gradient descent is a first-order iterative optimization algorithm for finding a local minimum of a differentiable function. The idea is to take repeated steps in the opposite direction of the gradient (or approximate gradient) of the function at the current point, because this is the direction of steepest descent. Conversely, stepping in the direction of the gradient will lead to a local maximum of that function; the procedure is then known as gradient ascent.
Convergence and efficiency of subgradient methods for quasiconvex minimization
Krzysztof C. Kiwiel
Momentum and Learning Rate Adaptation