PyTorch Notes: container and autograd

_森罗万象

Last modified: 2024-03-29 11:01:44

Category: Study Notes. Tags: pytorch, notes, deep learning

First published: 2023-06-08 22:49:24

Article link: https://blog.csdn.net/weixin_52812620/article/details/131117221

Sources: Bilibili videos, the official tutorials, and the API reference.

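The container source code mainly consists of Sequential, ModuleList, ModuleDict, ParameterList, and ParameterDict, all of which inherit from Module.
Sequential defines a forward function, so it both stores submodules and runs them in order. ModuleList only stores submodules and has no forward of its own. If the submodules are kept in a plain Python list instead, they are not registered, so nn.Module methods such as parameters() cannot reach them.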
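
A minimal sketch of the difference, assuming two small Linear layers. The class names WithModuleList and WithPlainList are made up for this note; counting parameters is just one convenient way to see what got registered.

```python
import torch
from torch import nn


class WithModuleList(nn.Module):
    def __init__(self):
        super().__init__()
        # Registered: nn.Module methods such as parameters() see these layers.
        self.layers = nn.ModuleList([nn.Linear(4, 4), nn.Linear(4, 2)])

    def forward(self, x):
        # ModuleList has no forward of its own, so we iterate explicitly.
        for layer in self.layers:
            x = layer(x)
        return x


class WithPlainList(nn.Module):
    def __init__(self):
        super().__init__()
        # Not registered: a plain Python list is invisible to nn.Module.
        self.layers = [nn.Linear(4, 4), nn.Linear(4, 2)]

    def forward(self, x):
        for layer in self.layers:
            x = layer(x)
        return x


# Sequential stores the layers and can be called directly.
seq = nn.Sequential(nn.Linear(4, 4), nn.Linear(4, 2))
seq_out = seq(torch.randn(3, 4))

n_seq = len(list(seq.parameters()))
n_module_list = len(list(WithModuleList().parameters()))
n_plain_list = len(list(WithPlainList().parameters()))
print(n_seq, n_module_list, n_plain_list)
```

In the plain-list version the forward pass still runs, but an optimizer built from parameters() would have nothing to update.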

We can only obtain the grad properties for the leaf nodes of the computational graph, which have requires_grad property set to True. For all other nodes in our graph, gradients will not be available.
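
A small sketch of which tensors end up with a grad, assuming a toy loss built from a weight vector w and an input x (both names are chosen for illustration):

```python
import torch

x = torch.ones(3)                       # leaf, but requires_grad is False
w = torch.randn(3, requires_grad=True)  # leaf with requires_grad=True
z = (w * x).sum()                       # non-leaf: produced by an operation
loss = z ** 2
loss.backward()

print(w.is_leaf, z.is_leaf)
print(w.grad)  # populated: d(loss)/dw = 2 * z * x
print(x.grad)  # None, because x does not require grad
```

If the gradient of an intermediate node such as z is really needed, calling z.retain_grad() before backward is one option.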

We can only perform gradient calculations using backward once on a given graph, for performance reasons. If we need to do several backward calls on the same graph, we need to pass retain_graph=True to the backward call.
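
A sketch of what retain_graph changes, assuming a tiny quadratic loss. The third call is wrapped in try/except only to show that the graph is gone once a backward call runs without retain_graph=True.

```python
import torch

w = torch.tensor([1.0, 2.0], requires_grad=True)
loss = (w ** 2).sum()

loss.backward(retain_graph=True)  # keep the graph for another pass
first = w.grad.clone()            # 2 * w

loss.backward()                   # works, and frees the graph afterwards
second = w.grad.clone()           # gradients accumulate, so this is 4 * w

try:
    loss.backward()               # the graph was freed by the previous call
    third_call_failed = False
except RuntimeError:
    third_call_failed = True

print(first, second, third_call_failed)
```

Note that gradients accumulate in w.grad across calls, so w.grad usually needs to be zeroed between independent backward passes.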

In many cases, we have a scalar loss function, and we need to compute the gradient with respect to some parameters. However, there are cases when the output function is an arbitrary tensor. In this case, PyTorch allows you to compute a so-called Jacobian product, and not the actual gradient.

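When differentiating a vector (or matrix) $y$ with respect to $x$, PyTorch actually computes $\frac{\partial l}{\partial y}\frac{\partial y}{\partial x}$. The first argument passed to tensor.backward is $\frac{\partial l}{\partial y}$, a vector with the same shape as $y$; $\frac{\partial y}{\partial x}$ is the Jacobian matrix, and the final result is a vector (or matrix) with the same shape as $x$.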
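
A sketch of this vector-Jacobian product, assuming the elementwise function y = x ** 2 and an arbitrary vector v that plays the role of the upstream gradient. The explicit Jacobian is built with torch.autograd.functional.jacobian only to check the result; backward itself never materializes it.

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)
y = x ** 2                           # vector output; Jacobian is diag(2 * x)
v = torch.tensor([1.0, 0.1, 0.01])   # stands in for dl/dy

y.backward(v)                        # computes v^T J and stores it in x.grad

# Build the full Jacobian explicitly and compare.
J = torch.autograd.functional.jacobian(lambda t: t ** 2, x.detach())
vjp_explicit = v @ J

print(x.grad)
print(vjp_explicit)
```
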
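Figure 2 of Automatic Differentiation in Machine Learning: a Survey covers symbolic differentiation, numerical differentiation, and automatic differentiation (forward mode). Table 2 covers the forward evaluation and forward-mode automatic differentiation with respect to $x_1$; Table 3 covers forward-mode and reverse-mode automatic differentiation.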
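
To make forward-mode automatic differentiation concrete, here is a rough pure-Python sketch using dual numbers. The Dual class, the helper functions, the test function f, and the evaluation point are all chosen for illustration. Seeding x1 with derivative 1 and x2 with derivative 0 yields the derivative with respect to $x_1$, and a central finite difference (numerical differentiation) is used as a sanity check.

```python
import math


class Dual:
    """A value together with its derivative w.r.t. one chosen input."""

    def __init__(self, value, deriv=0.0):
        self.value = value
        self.deriv = deriv

    def __add__(self, other):
        return Dual(self.value + other.value, self.deriv + other.deriv)

    def __sub__(self, other):
        return Dual(self.value - other.value, self.deriv - other.deriv)

    def __mul__(self, other):
        # Product rule.
        return Dual(self.value * other.value,
                    self.deriv * other.value + self.value * other.deriv)


def log(a):
    return Dual(math.log(a.value), a.deriv / a.value)


def sin(a):
    return Dual(math.sin(a.value), math.cos(a.value) * a.deriv)


def f(x1, x2):
    return log(x1) + x1 * x2 - sin(x2)


# Seed: derivative 1 for x1, 0 for x2, so y.deriv is df/dx1.
y = f(Dual(2.0, 1.0), Dual(5.0, 0.0))

# Numerical differentiation (central difference) for comparison.
h = 1e-6
numeric = (f(Dual(2.0 + h), Dual(5.0)).value
           - f(Dual(2.0 - h), Dual(5.0)).value) / (2 * h)

print(y.value, y.deriv, numeric)
```

One forward pass gives the derivative with respect to a single input, which is why reverse mode is preferred when there are many inputs and one scalar output, as in loss.backward().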