Getting Backward Gradients from PyTorch: a Depthwise Example
Preface
When developing your own training framework or writing operators by hand, one thing you constantly need to confirm is whether a hand-written operator is correct. The natural baseline is an open-source framework (PyTorch, Caffe, TensorFlow, etc.), which raises the question: how do you extract both the forward results and the backward gradients from such a framework? This article takes the backward pass of depthwise convolution (Depthwise_backward) as an example.
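The recipe is the same regardless of the operator: run the forward pass on saved inputs, seed backward() with a known upstream gradient, and read the gradients back from the .grad attributes. A minimal sketch, using nn.ReLU as a stand-in for whatever operator is under test:

import torch
import torch.nn as nn

op = nn.ReLU()                          # stand-in for the op under test
x = torch.randn(4, requires_grad=True)
out = op(x)                             # forward result: the baseline
out.backward(torch.ones_like(out))      # backward with a known seed
print(out.detach().numpy())             # reference forward output
print(x.grad.numpy())                   # reference input gradient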
Depthwise_backward
Instantiate the operator. The detail that deserves special attention is the upstream gradient passed to out.backward(gradient=outgrad): since out is not a scalar, backward() requires an explicit gradient, and seeding it with an all-ones tensor keeps the comparison well defined, because your hand-written backward can be seeded with the same all-ones gradient and its outputs checked element by element against PyTorch's.
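As a toy illustration (separate from the depthwise code below): for a non-scalar output, backward() refuses to run without an explicit gradient argument, and an all-ones seed simply weights every output element by 1.

import torch

x = torch.randn(2, 3, requires_grad=True)
y = x * 2                                # non-scalar output
# y.backward() alone raises "grad can be implicitly created only
# for scalar outputs"; an explicit upstream gradient is required.
y.backward(gradient=torch.ones_like(y))
print(x.grad)                            # every element is 2.0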
Full code
import numpy as np
import torch
import torch.nn as nn

def tensor_hook(grad):
    # Tensor hook: fires when the gradient w.r.t. x is computed.
    print('tensor hook')
    print('grad:', grad)
    return grad

class MyNet(nn.Module):
    def __init__(self):
        super(MyNet, self).__init__()
        # Depthwise convolution: 3 in/out channels, 4x4 kernel, stride 1,
        # no padding, groups == in_channels, no bias.
        self.f1 = nn.Conv2d(3, 3, 4, 1, 0, groups=3, bias=False)

    def forward(self, input):
        self.input = input
        output = self.f1(input)
        return output

    def my_hook(self, module, grad_input, grad_output):
        # Module backward hook: prints the gradients flowing into
        # and out of the module.
        print('original grad:', grad_input)
        print('original outgrad:', grad_output)

if __name__ == '__main__':
    net = MyNet()
    # Register the module hook so my_hook actually fires on backward().
    net.register_full_backward_hook(net.my_hook)

    # Dump the weights; with bias=False there is a single parameter.
    params = {}
    for k, v in net.state_dict().items():
        params["weight"] = v.detach().numpy()

    x = torch.randn((1, 3, 322, 322))
    # .numpy() must be called before requires_grad is set
    # (afterwards it would need .detach() first).
    params["data"] = x.numpy()
    x.requires_grad = True
    x.register_hook(tensor_hook)

    out = net(x)
    out.retain_grad()

    # Seed the backward pass with an all-ones upstream gradient.
    outgrad = torch.from_numpy(np.ones(out.shape, dtype="float32"))
    out.backward(gradient=outgrad)

    # Collect the gradients needed to check a hand-written backward.
    for name, param in net.named_parameters():
        params["dw"] = param.grad.numpy()
    params["dx"] = x.grad.numpy()
    params["dout"] = outgrad.numpy()