References:
https://oldpan.me/archives/careful-train-loss-nan-inf
https://blog.csdn.net/qq_38906523/article/details/81357895
https://blog.csdn.net/u013732444/article/details/73344628
https://blog.csdn.net/accumulate_zhang/article/details/79890624
https://blog.csdn.net/u012910595/article/details/78843031
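In my case the problem was that the gradients were too large; lowering the learning rate by one order of magnitude was enough to fix it. Gradient clipping is another option: when a gradient exceeds a set threshold, it is set equal to that threshold.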
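The clipping rule described above can be sketched in plain Python. This is only an illustration under assumptions: the note does not say which framework was used, so the function names (`clip_by_value`, `clip_by_norm`, `sgd_step`) and the threshold values are hypothetical. `clip_by_value` is the rule as stated (clamp each component to the threshold); `clip_by_norm` is a common variant that rescales the whole gradient instead, shown for comparison.

```python
import math


def clip_by_value(grads, clip_value):
    """Clamp each gradient component into [-clip_value, clip_value]."""
    return [max(-clip_value, min(clip_value, g)) for g in grads]


def clip_by_norm(grads, max_norm):
    """Rescale the gradient vector if its L2 norm exceeds max_norm.

    Unlike clip_by_value, this keeps the direction of the gradient.
    """
    total_norm = math.sqrt(sum(g * g for g in grads))
    if total_norm <= max_norm:
        return list(grads)
    scale = max_norm / total_norm
    return [g * scale for g in grads]


def sgd_step(params, grads, lr, clip_value=None):
    """One plain SGD update, optionally clipping the gradients first."""
    if clip_value is not None:
        grads = clip_by_value(grads, clip_value)
    return [p - lr * g for p, g in zip(params, grads)]


# Hypothetical values: one exploding component among normal ones.
grads = [0.5, -3.0, 10.0]
clipped = clip_by_value(grads, clip_value=1.0)
```

If the training code is in PyTorch (an assumption), `torch.nn.utils.clip_grad_value_` and `torch.nn.utils.clip_grad_norm_` provide the same two behaviors; they are called after `loss.backward()` and before `optimizer.step()`.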