https://zhuanlan.zhihu.com/p/41379279
SGDR: stochastic gradient descent with restarts
https://gist.github.com/jeremyjordan/5a222e04bb78c242f5763ad40626c452
Finding the optimal learning rate
https://gist.github.com/jeremyjordan/ac0229abd4b2b7000aca1643e88e0f02
Visualizing the Loss Landscape of Neural Nets