The official definition of autograd
Let's look at how the official documentation explains autograd:
Conceptually, autograd keeps a record of data (tensors) and all executed operations (along with the resulting new tensors) in a directed acyclic graph (DAG) consisting of Function objects. In this DAG, leaves are the input tensors, roots are the output tensors. By tracing this graph from roots to leaves, you can automatically compute the gradients using the chain rule.
In a forward pass, autograd does two things simultaneously:
run the requested operation to compute a resulting tensor
maintain the operation’s gradient function in the DAG.
The backward pass kicks off when .backward() is called on the DAG root. autograd then:
computes the gradients from each .grad_fn,
accumulates them in the respective tensor’s .grad attribute
using the chain rule, propagates all the way to the leaf tensors.
from: https://pytorch.org/tutorials/beginner/basics/autogradqs_tutorial.html#more-on-computational-graphs
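The forward and backward passes described in the quote can be seen in a small sketch. The tensor names and values below (x, w, y, z) are made up for illustration and do not come from the official tutorial.

```python
import torch

# Leaf tensors: created directly by the user, with gradient tracking enabled.
x = torch.tensor(2.0, requires_grad=True)
w = torch.tensor(3.0, requires_grad=True)

# Forward pass: each operation computes its result and records a grad_fn in the DAG.
y = w * x      # y = 6
z = y ** 2     # z = 36, the root of this small graph

# Backward pass: starts at the root and applies the chain rule down to the leaves.
z.backward()

# dz/dx = 2 * y * w = 36 and dz/dw = 2 * y * x = 24 are accumulated in .grad.
print(x.grad, w.grad)
```

Calling backward() a second time on a freshly built graph would add to the existing .grad values rather than overwrite them, which is what "accumulates" refers to in the quote.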
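Key points:
The automatic differentiation mechanism is implemented with a directed acyclic graph (DAG).
The DAG records both the data (corresponding to tensor.data) and the operations (corresponding to tensor.grad_fn).
In PyTorch, operations are collectively called Functions: addition, subtraction, multiplication, ReLU, conv, pooling and so on are all Functions.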
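As a rough illustration of these points, the sketch below inspects grad_fn on the results of a few operations. The tensor names are hypothetical, and the exact class names printed (such as AddBackward0) are what recent PyTorch versions typically show; they may differ between releases.

```python
import torch

# A leaf tensor is created by the user, not by an operation, so it has no grad_fn.
a = torch.tensor([1.0, -2.0], requires_grad=True)

b = a + 1            # result of an addition
c = a * 2            # result of a multiplication
d = torch.relu(a)    # result of ReLU

print(a.grad_fn)                  # None for a leaf tensor
print(type(b.grad_fn).__name__)   # typically a name such as AddBackward0
print(type(c.grad_fn).__name__)   # typically a name such as MulBackward0
print(type(d.grad_fn).__name__)   # typically a name such as ReluBackward0

# .data exposes the underlying values without gradient tracking.
print(a.data, a.data.requires_grad)
```

Each non-leaf tensor carries the backward Function of the operation that produced it, which is how the graph knows how to propagate gradients.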