http://blog.csdn.net/luo123n/article/details/48239963 (don't forget to read the comments)
http://sebastianruder.com/optimizing-gradient-descent/index.html#gradientdescentvariants
Adaptive Gradient (AdaGrad)
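As a quick reminder of what AdaGrad does, here is a minimal sketch of one update step in plain Python. It assumes the common formulation in which each parameter keeps a running sum of its squared gradients and the learning rate is divided by the square root of that sum. The function name `adagrad_step`, the argument names, and the defaults for `lr` and `eps` are illustrative choices for this note, not taken from either linked article, and some write-ups place `eps` outside the square root instead.

```python
import math


def adagrad_step(params, grads, accum, lr=0.1, eps=1e-8):
    """Apply one AdaGrad update to a list of float parameters.

    params: current parameter values
    grads:  gradient of the loss w.r.t. each parameter
    accum:  running sum of squared gradients, one entry per parameter
    Returns (new_params, new_accum); the inputs are not modified.
    """
    new_params, new_accum = [], []
    for p, g, a in zip(params, grads, accum):
        a = a + g * g  # accumulate the squared gradient for this parameter
        new_accum.append(a)
        # Per-parameter step size: lr shrinks as the accumulated sum grows.
        # eps inside the sqrt is an assumption; it only guards against division by zero.
        new_params.append(p - lr * g / math.sqrt(a + eps))
    return new_params, new_accum


# Usage: minimize f(x, y) = x^2 + 10*y^2, whose gradient is (2x, 20y).
params, accum = [1.0, 1.0], [0.0, 0.0]
for _ in range(100):
    grads = [2.0 * params[0], 20.0 * params[1]]
    params, accum = adagrad_step(params, grads, accum, lr=0.5)
```

Because each coordinate is normalized by its own gradient history, the steeper `y` direction does not need a smaller global learning rate than `x` in this toy problem.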