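--------------------------Preface-----------------------------------------------
This post looks at the "Jacobian-vector product" trick described in [1].
The core claim of this post is:
code2.py runs faster than code1.py.
Environment:
Ubuntu 18.10
Jupyter Notebook (Python 3.6)
Note: do not run the code below from a terminal, because the computation-graph images will not be displayed there.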
-----------------code1.py---------------------------------------------------------
import theano
import numpy as np
import theano.tensor as T
import time
from IPython.display import display, Image  # needed by the display(Image(...)) call at the end
start= time.time()
# x_val = np.random.randn(3)
# u_val = np.random.randn(3)
x_val=[2,-0.4,1]
u_val=[-1.5,0.5,-0.5]
x = T.vector('x')  # symbolic vector (tensor) variable
# f = T.sin(T.sin(T.sin(x)))
f=T.sin(x)
u = T.vector('u')
#-----------------------------------------------------------------------
jvp = T.Rop(f, x, u)
jvp_compiled = theano.function([x, u], jvp)  # this is where code1.py differs from code2.py: the compiled graph comes from T.Rop
result=jvp_compiled(x_val, u_val)
print("result=",result)
end=time.time()
print("elapsed time=",end-start)
display(Image(theano.printing.pydotprint(jvp, return_image=True, var_with_name_simple=True)))
Output: [figure: computation graph of jvp rendered by pydotprint]
-----------------code2.py---------------------------------------------------------
import theano
import numpy as np
import theano.tensor as T
import time
from IPython.display import display,Image
start= time.time()
# x_val = np.random.randn(3)
# u_val = np.random.randn(3)
x_val=[2,-0.4,1]
u_val=[-1.5,0.5,-0.5]
# print("x_val=",x_val)
# print("u_val=",u_val)
x = T.vector('x')
# f = T.sin(T.sin(T.sin(x)))
f=T.sin(x)
u = T.vector('u')
def alternative_Rop(f, x, u):
    v = f.type('v')     # dummy variable v of the same type as f
    print("v=",v)
    g = T.Lop(f, x, v)  # Jacobian of f, transposed, multiplied by v
    return T.Lop(g, v, u)  # differentiating g with respect to v recovers the Jacobian times u
#-----------------------------------------------------------------------
alternative_jvp = alternative_Rop(f, x, u)
alternative_jvp_compiled = theano.function([x, u], alternative_jvp)
result=alternative_jvp_compiled(x_val, u_val)
print("result=",result)
# with the commented-out f = T.sin(T.sin(T.sin(x))): result= [0.27014816 0.39571335 -0.13225414]
end=time.time()
print("elapsed time=",end-start)
display(Image(theano.printing.pydotprint(alternative_jvp, return_image=True, var_with_name_simple=True)))
Output: [figure: computation graph of alternative_jvp rendered by pydotprint]
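As a sanity check on the numbers both scripts print, the expected Jacobian-vector product can be worked out by hand for the active definition f = sin(x): the Jacobian is diagonal with entries cos(x), so the product with u is simply cos(x) * u. The sketch below uses only the Python standard library; the helper name sin_jvp is made up for this illustration and is not part of Theano.

```python
import math

# Inputs copied from code1.py / code2.py
x_val = [2.0, -0.4, 1.0]
u_val = [-1.5, 0.5, -0.5]

def sin_jvp(x, u):
    """JVP of elementwise sin: the Jacobian is diag(cos(x)), so J u = cos(x) * u."""
    return [math.cos(xi) * ui for xi, ui in zip(x, u)]

expected = sin_jvp(x_val, u_val)
print("expected =", [round(e, 4) for e in expected])  # roughly [0.6242, 0.4605, -0.2702]
```

Both jvp_compiled and alternative_jvp_compiled should agree with these values up to floating-point precision.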
----------------------------------------------What Lop and Rop do in Theano-------------------------------------------------------
Setup:
$f = x \cdot W$
Jacobian:
$\frac{\partial f}{\partial W}$,
then the two operators compute:
T.Lop(f, W, V) | T.Rop(f, W, V) |
---|---|
$(\frac{\partial f}{\partial W})^T V$ | $\frac{\partial f}{\partial W} V$ |
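To make the difference between the two products concrete, here is a minimal sketch that treats the Jacobian as an explicit matrix J. The matrix values and the helper names (matvec, transpose, rop, lop) are assumptions chosen for illustration; Theano itself never builds J explicitly.

```python
def matvec(J, v):
    """Return J v for a matrix J given as a list of rows."""
    return [sum(Jij * vj for Jij, vj in zip(row, v)) for row in J]

def transpose(J):
    """Return the transpose of J as a list of rows."""
    return [list(col) for col in zip(*J)]

def rop(J, v):
    """Jacobian-vector product J v, the quantity T.Rop represents."""
    return matvec(J, v)

def lop(J, v):
    """Transposed-Jacobian-vector product J^T v, the quantity T.Lop represents."""
    return matvec(transpose(J), v)

# Hypothetical 2x3 Jacobian of some f: R^3 -> R^2
J = [[1.0, 2.0, 0.0],
     [0.0, -1.0, 3.0]]

rop_result = rop(J, [1.0, 1.0, 1.0])  # input-sized vector in, output-sized vector out
lop_result = lop(J, [1.0, 2.0])       # output-sized vector in, input-sized vector out
print(rop_result)  # [3.0, 2.0]
print(lop_result)  # [1.0, 0.0, 6.0]
```

Note the shapes: Rop takes a vector shaped like the variable being differentiated against, while Lop takes a vector shaped like f.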
-------------------------------------How the code in alternative_Rop maps to the math-------------------------------
The core idea of alternative_Rop:
introduce a dummy variable v, then apply T.Lop twice in a row to obtain the effect of T.Rop,
hence the name alternative_Rop.
Code | Math | Note |
---|---|---|
g = T.Lop(f, x, v) | $g = (\frac{\partial f}{\partial x})^T v$ | first Lop |
T.Lop(g, v, u) | $(\frac{\partial g}{\partial v})^T u = \frac{\partial f}{\partial x} u$ | second Lop; g is linear in v, so $\frac{\partial g}{\partial v} = (\frac{\partial f}{\partial x})^T$ |
The end goal of the code is to compute:
$\frac{\partial f}{\partial x} u$
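The double-Lop identity can also be checked numerically without Theano. The sketch below assumes a made-up function f(x) = [x0*x1, sin(x0), x1**2] with a hand-written Jacobian, and uses finite differences for the derivative with respect to v (accurate up to rounding here, since g is linear in v). All names are hypothetical.

```python
import math

def f_jacobian(x):
    """Jacobian of the hypothetical f(x) = [x0*x1, sin(x0), x1**2] (3 outputs, 2 inputs)."""
    x0, x1 = x
    return [[x1, x0],
            [math.cos(x0), 0.0],
            [0.0, 2.0 * x1]]

def lop_f(x, v):
    """First Lop: g(v) = J(x)^T v, one entry per input of f."""
    J = f_jacobian(x)
    return [sum(J[i][j] * v[i] for i in range(len(J))) for j in range(len(x))]

def jacobian_wrt_v_transposed(g, v, eps=1e-6):
    """Finite-difference (dg/dv)^T: row k holds d g_j / d v_k for every j."""
    base = g(v)
    rows = []
    for k in range(len(v)):
        bumped = list(v)
        bumped[k] += eps
        rows.append([(b - a) / eps for a, b in zip(base, g(bumped))])
    return rows

x = [0.5, -1.2]
u = [2.0, 3.0]
v0 = [0.0, 0.0, 0.0]  # any value works because g is linear in v

# Second Lop: (dg/dv)^T u
dg_dv_T = jacobian_wrt_v_transposed(lambda v: lop_f(x, v), v0)
second_lop = [sum(row[j] * u[j] for j in range(len(u))) for row in dg_dv_T]

# Direct Jacobian-vector product J u for comparison
J = f_jacobian(x)
direct_jvp = [sum(J[i][j] * u[j] for j in range(len(u))) for i in range(len(J))]

print("two Lops:", second_lop)
print("direct  :", direct_jvp)
```

The two printed vectors agree up to floating-point error, which is the numerical counterpart of the second row of the table.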
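-----------------------------------------------------Conclusion----------------------------------------------------------------------------------------
The derivation above shows:
applying Lop twice does reproduce the effect of Rop.
The runs of the code above show:
the graph of alternative_Rop (the colored diagram produced by pydotprint) is simpler than that of T.Rop.
alternative_Rop was noticeably faster than T.Rop; note that the measured time covers graph construction, compilation by theano.function, and a single evaluation.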
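Because the elapsed time printed by both scripts spans graph construction, compilation, and one evaluation, it may be useful to time repeated calls of an already compiled function on their own. The helper below is a minimal sketch using only the standard library; time_calls and stand_in are hypothetical names, and stand_in merely takes the place of a compiled Theano function so the snippet runs anywhere.

```python
import time

def time_calls(fn, *args, repeats=1000):
    """Average wall-clock seconds per call of fn(*args), excluding earlier setup."""
    fn(*args)  # warm-up call, not timed
    start = time.perf_counter()
    for _ in range(repeats):
        fn(*args)
    return (time.perf_counter() - start) / repeats

# Stand-in workload; in the scripts above one would pass jvp_compiled or
# alternative_jvp_compiled together with x_val and u_val instead.
def stand_in(x, u):
    return [xi * ui for xi, ui in zip(x, u)]

per_call = time_calls(stand_in, [2.0, -0.4, 1.0], [-1.5, 0.5, -0.5])
print("seconds per call =", per_call)
```

Comparing time_calls(jvp_compiled, x_val, u_val) with time_calls(alternative_jvp_compiled, x_val, u_val) would separate evaluation speed from compilation cost.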
References:
[1] A new trick for calculating Jacobian vector products
[2] A detailed explanation of Rop and Lop in Theano