BGD (Batch Gradient Descent)
SGD (Stochastic Gradient Descent)
MBGD (Mini-batch Gradient Descent)
MGD (Momentum Gradient Descent)
NAG (Nesterov Accelerated Gradient)
AGD (Adaptive Gradient Descent, commonly called AdaGrad)
Adadelta (Adaptive Delta Gradient Descent)
RMSprop (Root Mean Square Propagation)
Adam (Adaptive Moment Estimation)
Related links
[1] https://drivingc.com/p/5bdfa51fd249870dca3afe22
[2] https://www.cnblogs.com/guoyaohua/p/8542554.html
[3] https://blog.csdn.net/wfei101/article/details/79938305?utm_medium=distribute.pc_relevant.none-task-blog-BlogCommendFromMachineLearnPai2-6.channel_param&depth_1-utm_source=distribute.pc_relevant.none-task-blog-BlogCommendFromMachineLearnPai2-6.channel_param
[4] https://www.cnblogs.com/tabatabaye/articles/1112475.html
[5] https://blog.csdn.net/u012328159/article/details/80311892
[6] http://www.atyun.com/2257.html
[7] https://blog.csdn.net/u011497262/article/details/88787905?utm_medium=distribute.pc_relevant.none-task-blog-title-1&spm=1001.2101.3001.4242
[8] https://www.cnblogs.com/yifdu25/p/8183587.html