Defining New Autograd Functions in PyTorch
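The official PyTorch site describes how to define a new autograd function (the link is at the end of this post). Maybe I'm just slow, but it took me a good while to understand the official example. This post covers how PyTorch computes gradients, and how to define and use a new autograd function.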
PyTorch computes gradients with the chain rule. For example, take $y = k * fun1(c * x) + b$, where $fun1(p) = p^{2}$ (in the expression for $y$, $p = c * x$).
For this expression, let's compute the derivatives of $y$ with respect to $k$, $c$, and $b$:
- $\mathrm{d}k = \frac{\partial y}{\partial k} = fun1(c*x) = (cx)^{2}$
- $\mathrm{d}b = \frac{\partial y}{\partial b} = 1$
- $\mathrm{d}c = \frac{\partial y}{\partial fun1} * \frac{\partial fun1}{\partial p} * \frac{\partial p}{\partial c} = k * 2p * x = k * 2cx * x$
Suppose $x=12, k=0.5, b=0.7, c=0.2$. Then the gradients work out to:
- $\mathrm{d}k = fun1(c*x) = fun1(2.4) = 5.76$
- $\mathrm{d}b = 1$
- $\mathrm{d}c = k * (2 * (c*x)) * x = 28.8$
Let's verify this with PyTorch:
import torch

def fun1(par):
    out = par ** 2
    return out

x = torch.tensor([12])
# Trainable parameters k, b, c with initial values 0.5, 0.7, and 0.2
k = torch.full((), 0.5, requires_grad=True)
b = torch.full((), 0.7, requires_grad=True)
c = torch.full((), 0.2, requires_grad=True)
y = k * fun1(c * x) + b
y.backward()
print(k.grad)
print(b.grad)
print(c.grad)
The printed gradients match the values computed by hand, so the derivation is correct.
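As an independent cross-check that does not rely on autograd, the same gradients can be approximated with central finite differences. This is a minimal plain-Python sketch; the helper name `central_diff` and the step size `h` are my own choices for illustration, not part of PyTorch.

```python
def fun1(p):
    return p ** 2

def y(k, c, b, x=12.0):
    # Same expression as in the post: y = k * fun1(c * x) + b
    return k * fun1(c * x) + b

def central_diff(f, args, index, h=1e-6):
    """Approximate the partial derivative of f with respect to args[index]."""
    up = list(args)
    down = list(args)
    up[index] += h
    down[index] -= h
    return (f(*up) - f(*down)) / (2 * h)

params = (0.5, 0.2, 0.7)  # k, c, b, in the argument order of y
dk = central_diff(y, params, 0)
dc = central_diff(y, params, 1)
db = central_diff(y, params, 2)
print(round(dk, 4), round(dc, 4), round(db, 4))
```

The three values should agree with the hand-derived 5.76, 28.8, and 1 up to rounding.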
Now suppose we want to customize $\frac{\partial fun1}{\partial p}$, that is, define the backward pass of fun1 ourselves.
First, define $\frac{\partial fun1}{\partial p} = p$. Then:
$$\mathrm{d}c = \frac{\partial y}{\partial fun1} * \frac{\partial fun1}{\partial p} * \frac{\partial p}{\partial c} = k * p * x = k * cx * x = 14.4$$
The verification code:
import torch

class fun1(torch.autograd.Function):
    @staticmethod
    def forward(ctx, input):
        ctx.save_for_backward(input)
        out = input ** 2  # forward pass: fun1 = input**2
        return out

    @staticmethod
    def backward(ctx, grad_output):
        input, = ctx.saved_tensors
        return grad_output * input  # custom rule: dfun1/dinput = input

fun = fun1.apply
x = torch.tensor([12])
# Trainable parameters k, b, c with initial values 0.5, 0.7, and 0.2
k = torch.full((), 0.5, requires_grad=True)
b = torch.full((), 0.7, requires_grad=True)
c = torch.full((), 0.2, requires_grad=True)
y = k * fun(c * x) + b
y.backward()
print(c.grad)
The printed value of c.grad should be 14.4, in agreement with the calculation above.
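For comparison, here is a sketch of the same function with a `backward` that returns the true derivative $2p$. The class name `Square` is hypothetical, chosen only for this illustration. With this version the custom function should reproduce the 28.8 from the first example, and `torch.autograd.gradcheck`, which compares `backward` against numerical differences and expects double-precision inputs, should accept it.

```python
import torch

class Square(torch.autograd.Function):  # hypothetical name for this sketch
    @staticmethod
    def forward(ctx, input):
        ctx.save_for_backward(input)
        return input ** 2

    @staticmethod
    def backward(ctx, grad_output):
        input, = ctx.saved_tensors
        # True derivative of input**2 is 2 * input, scaled by the upstream gradient
        return grad_output * 2 * input

x = torch.tensor([12.0])
k = torch.full((), 0.5, requires_grad=True)
b = torch.full((), 0.7, requires_grad=True)
c = torch.full((), 0.2, requires_grad=True)

y = k * Square.apply(c * x) + b
y.backward()
print(c.grad)  # expected to be close to 28.8, as in the first example

# gradcheck compares backward() with finite differences; it needs float64 inputs
p = torch.tensor([2.4], dtype=torch.double, requires_grad=True)
check_ok = torch.autograd.gradcheck(Square.apply, (p,))
print(check_ok)
```

By contrast, the deliberately altered backward rules in this post do not match numerical differences, so `gradcheck` would be expected to reject them.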
Next, define $\frac{\partial fun1}{\partial p} = 1$. Then:
$$\mathrm{d}c = \frac{\partial y}{\partial fun1} * \frac{\partial fun1}{\partial p} * \frac{\partial p}{\partial c} = k * 1 * x = 6$$
The verification code:
import torch

class fun1(torch.autograd.Function):
    @staticmethod
    def forward(ctx, input):
        ctx.save_for_backward(input)
        out = input ** 2  # forward pass: fun1 = input**2
        return out

    @staticmethod
    def backward(ctx, grad_output):
        input, = ctx.saved_tensors
        return grad_output * 1  # custom rule: dfun1/dinput = 1

fun = fun1.apply
x = torch.tensor([12])
# Trainable parameters k, b, c with initial values 0.5, 0.7, and 0.2
k = torch.full((), 0.5, requires_grad=True)
b = torch.full((), 0.7, requires_grad=True)
c = torch.full((), 0.2, requires_grad=True)
y = k * fun(c * x) + b
y.backward()
print(c.grad)
The printed value of c.grad should be 6, in agreement with the calculation above.
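The three results in this post (28.8, 14.4, and 6) differ only in the local derivative $\frac{\partial fun1}{\partial p}$ that gets multiplied into the chain rule. A minimal plain-Python sketch of that product follows; `grad_c` and its `local_derivative` parameter are hypothetical names used only for this illustration.

```python
def grad_c(local_derivative, x=12.0, k=0.5, c=0.2):
    """dc = (dy/dfun1) * (dfun1/dp) * (dp/dc) for y = k * fun1(c * x) + b."""
    p = c * x                      # input to fun1
    upstream = k                   # dy/dfun1, what backward() receives as grad_output
    local = local_derivative(p)    # dfun1/dp, the factor backward() multiplies in
    return upstream * local * x    # dp/dc = x

true_case = grad_c(lambda p: 2 * p)  # real derivative of p**2
p_case = grad_c(lambda p: p)         # custom rule dfun1/dp = p
one_case = grad_c(lambda p: 1.0)     # custom rule dfun1/dp = 1
print(true_case, p_case, one_case)
```

The upstream factor and the factor for $\frac{\partial p}{\partial c}$ stay the same in all three calls; only the middle factor changes, which is the part a custom `backward` controls.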
Summary
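In short, this post worked through how PyTorch's built-in autograd computes gradients, with the calculation written out step by step, and showed which part of the chain rule a custom autograd function defines: the derivative of the function's output with respect to its input.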
Official link: PyTorch: Defining New autograd Functions