Part 1: "Intro to optimization in deep learning: Gradient Descent" https://blog.paperspace.com/intro-to-optimization-in-deep-learning-gradient-descent/
Part 2: "Intro to optimization in deep learning: Momentum, RMSProp and Adam" https://blog.paperspace.com/intro-to-optimization-momentum-rmsprop-adam/
Part 3: "Intro to Optimization in Deep Learning: Vanishing Gradients and Choosing the Right Activation Function" https://blog.paperspace.com/vanishing-gradients-activation-function/