Paper explanation:
https://blog.csdn.net/u010352603/article/details/80590129#11-memorization-和-generalization
https://xueqiu.com/9217191040/110449554
Training method: mini-batch stochastic gradient descent
https://blog.csdn.net/xiang_freedom/article/details/78395145
AdaGrad:
https://blog.csdn.net/program_developer/article/details/80756008
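
The training method noted above combines mini-batch stochastic gradient descent with AdaGrad step sizes. As a rough illustration only, here is a minimal pure-Python sketch on a toy one-dimensional linear regression. The function name `adagrad_minibatch`, the toy data, and all hyperparameter values are assumptions made for this example; they are not taken from the paper or the linked posts.

```python
import random


def adagrad_minibatch(xs, ys, lr=0.5, batch_size=4, epochs=500, eps=1e-8, seed=0):
    """Fit y ~ w*x + b using mini-batch gradients and AdaGrad per-parameter step sizes."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    g2_w, g2_b = 0.0, 0.0  # running sums of squared gradients, one per parameter
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)  # new random mini-batches every epoch
        for start in range(0, len(indices), batch_size):
            batch = indices[start:start + batch_size]
            # Mean gradient of the squared error over the current mini-batch.
            gw = sum(2 * (w * xs[i] + b - ys[i]) * xs[i] for i in batch) / len(batch)
            gb = sum(2 * (w * xs[i] + b - ys[i]) for i in batch) / len(batch)
            # AdaGrad: accumulate squared gradients and scale each step by
            # the inverse square root of the accumulated sum.
            g2_w += gw * gw
            g2_b += gb * gb
            w -= lr * gw / (g2_w ** 0.5 + eps)
            b -= lr * gb / (g2_b ** 0.5 + eps)
    return w, b


# Toy, noise-free data generated from y = 3x + 1 (illustrative values only).
xs = [i / 10 for i in range(20)]
ys = [3.0 * x + 1.0 for x in xs]
w, b = adagrad_minibatch(xs, ys)
print(w, b)
```

Because the accumulated squared gradients only grow, the effective step size for each parameter shrinks over time. This suits parameters that receive frequent large gradients, but it can slow learning late in training.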