Deep Learning from first principles in Python, R and Octave – Part 7

This article covers optimization methods used with Stochastic Gradient Descent (SGD). Specifically, it discusses and implements the following gradient descent optimization techniques: a) Vanilla SGD, b) Learning rate decay, c) Momentum method, d) RMSProp, e) Adaptive Moment Estimation (ADAM).
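To give a rough feel for how the five techniques differ, here is a minimal sketch of their update rules in Python with NumPy. It is not the article's implementation. Everything in it is an assumption made for illustration: the function names `grad` and `run`, the toy loss f(w) = 0.5 * ||w||^2 (whose exact gradient stands in for a mini-batch gradient), the hyperparameter values, the inverse-time form of learning rate decay, and the exponentially weighted form of momentum.

```python
import numpy as np

def grad(w):
    # Gradient of the toy loss f(w) = 0.5 * ||w||^2 (an assumption for this sketch).
    return w

def run(method, steps=200, lr=0.1, decay_rate=0.01,
        beta1=0.9, beta2=0.999, rms_beta=0.9, eps=1e-8):
    """Apply `steps` updates of the chosen rule and return the final weights."""
    w = np.array([1.0, -2.0])   # hypothetical starting point
    v = np.zeros_like(w)        # running average of gradients (momentum, Adam)
    s = np.zeros_like(w)        # running average of squared gradients (RMSProp, Adam)
    for t in range(1, steps + 1):
        g = grad(w)
        if method == "sgd":
            # Vanilla SGD: step against the gradient with a fixed learning rate.
            w = w - lr * g
        elif method == "decay":
            # Learning rate decay, here as inverse-time decay (one common schedule).
            w = w - (lr / (1.0 + decay_rate * t)) * g
        elif method == "momentum":
            # Momentum: step along an exponentially weighted average of gradients.
            v = beta1 * v + (1.0 - beta1) * g
            w = w - lr * v
        elif method == "rmsprop":
            # RMSProp: scale each step by the root of a running mean of g^2.
            s = rms_beta * s + (1.0 - rms_beta) * g ** 2
            w = w - lr * g / (np.sqrt(s) + eps)
        elif method == "adam":
            # Adam: momentum and RMSProp combined, with bias correction.
            v = beta1 * v + (1.0 - beta1) * g
            s = beta2 * s + (1.0 - beta2) * g ** 2
            v_hat = v / (1.0 - beta1 ** t)
            s_hat = s / (1.0 - beta2 ** t)
            w = w - lr * v_hat / (np.sqrt(s_hat) + eps)
        else:
            raise ValueError("unknown method: " + method)
    return w

if __name__ == "__main__":
    for name in ["sgd", "decay", "momentum", "rmsprop", "adam"]:
        print(name, run(name))
```

On this toy problem every rule should move the weights toward the minimum at the origin. How quickly each one gets there depends heavily on the hyperparameters chosen here, so the printed values are not a ranking of the methods.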