Based on the Gauss-Newton method for nonlinear least squares, the paper proposes an algorithm that updates neural-network parameters better than SGD: it converges at a second-order rate while its per-step cost is comparable to a first-order method. The algorithm can be extended to classification problems. Its idea is also related to the neural tangent kernel (NTK), and the paper describes the relationship between the two.
1. Background:
(1) Least-squares problems: https://blog.csdn.net/jing___yu/article/details/100063967
(2) Neural tangent kernel: https://blog.csdn.net/ddzr972435946/article/details/102163161
2. Original paper:
https://www.groundai.com/project/a-gram-gauss-newton-method-learning-overparameterized-deep-neural-networks-for-regression-problems/1
3. Slides:
http://tianle.mit.edu/sites/default/files/documents/Tianle_GGN_PKU.pdf
4. Algorithm:
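The notes stop before describing the algorithm itself. As a hedged illustration (not the paper's exact procedure), here is a minimal NumPy sketch of a damped Gauss-Newton step for regression, written with the n x n Gram matrix J J^T rather than the p x p matrix J^T J; this Gram-matrix form is what keeps the per-step cost close to a first-order method when the batch size n is much smaller than the parameter count p. The toy network, its sizes, the finite-difference Jacobian, and the damping value are all my own assumptions for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

M, D = 32, 5          # hidden width and input dimension (toy choices)
P = M * D + M         # total parameter count

def forward(w, X):
    """Scalar-output toy net f(x) = (1/sqrt(M)) v^T tanh(W x); w packs (W, v)."""
    W = w[:M * D].reshape(M, D)
    v = w[M * D:]
    return np.tanh(X @ W.T) @ v / np.sqrt(M)

def jacobian(w, X, eps=1e-6):
    """Finite-difference Jacobian of the batch outputs w.r.t. parameters (n x p).
    A real implementation would use backprop; finite differences keep the sketch short."""
    f0 = forward(w, X)
    J = np.empty((X.shape[0], w.size))
    for j in range(w.size):
        wp = w.copy()
        wp[j] += eps
        J[:, j] = (forward(wp, X) - f0) / eps
    return J

def ggn_step(w, X, y, damping=1e-3):
    """One Gram-matrix Gauss-Newton step: w <- w - J^T (J J^T + lam I)^{-1} r."""
    r = forward(w, X) - y              # residuals on the batch
    J = jacobian(w, X)
    G = J @ J.T                        # n x n Gram matrix (cheap for small batches)
    return w - J.T @ np.linalg.solve(G + damping * np.eye(len(y)), r)

# Toy regression problem: targets generated by a network from the same family.
X = rng.normal(size=(16, D))
w_true = rng.normal(size=P)
y = forward(w_true, X)

w = rng.normal(size=P)
init_loss = float(np.mean((forward(w, X) - y) ** 2))
for _ in range(20):
    w = ggn_step(w, X, y)
final_loss = float(np.mean((forward(w, X) - y) ** 2))
print(init_loss, final_loss)   # the training loss drops sharply
```

Solving the n x n system G = J J^T costs O(n^3 + n p) per step instead of the O(p^3) a naive Gauss-Newton step would need, which is why the method stays affordable for overparameterized networks.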