Theano tutorial (5): computing partial derivatives

pmt123456

Article link: https://blog.csdn.net/pmt123456/article/details/51182463

# Gradient computation
import numpy
import theano
import theano.tensor as T
from theano import pp

x=T.scalar('x')
y=x**2
gy=T.grad(y,x)
print(pp(gy))  # pp() pretty-prints the symbolic expression of the gradient
#((fill((x ** TensorConstant{2}), TensorConstant{1.0}) * TensorConstant{2}) * (x ** (TensorConstant{2} - TensorConstant{1})))
# fill((x ** 2), 1.0) creates a tensor with the same shape as x**2 and fills it with 1.0
f=theano.function([x],gy)
print(f(4))

#numpy.allclose:Returns True if two arrays are element-wise equal within a tolerance.
print(numpy.allclose(f(94.2), 188.4))

# The optimized symbolic gradient expression: (TensorConstant{2.0} * x)
print(pp(f.maker.fgraph.outputs[0]))

After optimization, only one Apply node is left in the graph, and it simply multiplies the input by 2.
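As a sanity check on the values above, here is a minimal NumPy-only sketch that compares a central finite difference of x**2 with the analytic gradient 2*x. It does not use Theano; the helper names `square` and `central_difference` are my own and exist only for this illustration.

```python
import numpy as np

def square(x):
    # The same function as y = x**2 in the Theano example.
    return x ** 2

def central_difference(fn, x, eps=1e-6):
    # Numerical approximation of d fn / d x at the point x.
    return (fn(x + eps) - fn(x - eps)) / (2.0 * eps)

for x0 in (4.0, 94.2):
    numeric = central_difference(square, x0)
    analytic = 2.0 * x0  # what the optimized graph (2.0 * x) computes
    print(x0, np.allclose(numeric, analytic))
```

Both points should agree within the default tolerance of numpy.allclose, which matches the check against 188.4 in the Theano code.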

A more complex gradient computation

The logistic function

x=T.dmatrix('x')
s=T.sum(1/(1+T.exp(-x)))
gs=T.grad(s,x)
dlogistic=theano.function([x],gs)
print(dlogistic([[0,1],[-1,-2]]))
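To see what dlogistic should return, recall that the derivative of the logistic function s(x) = 1/(1+exp(-x)) is s(x)*(1-s(x)), and that the gradient of the sum is this derivative applied element by element. The sketch below evaluates it in plain NumPy; `sigmoid` and `logistic_sum_grad` are hypothetical helper names, not part of Theano.

```python
import numpy as np

def sigmoid(x):
    # Element-wise logistic function.
    return 1.0 / (1.0 + np.exp(-x))

def logistic_sum_grad(x):
    # Gradient of sum(sigmoid(x)) with respect to x, element by element.
    s = sigmoid(x)
    return s * (1.0 - s)

x = np.array([[0.0, 1.0], [-1.0, -2.0]])
grad = logistic_sum_grad(x)
print(grad)
# approximately [[0.25, 0.1966], [0.1966, 0.1050]]
```

The result is symmetric in the sign of x, which is why the entries for 1 and -1 are equal.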

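Computing the Jacobian matrix

The Jacobian matrix collects the first-order partial derivatives of a vector-valued function: entry (i, j) is the partial derivative of y[i] with respect to x[j].

Here it is built by computing, in order, the gradient of each element of y with respect to x.

scan is a generic op in Theano that lets all kinds of recurrent (loop) equations be written symbolically.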

# Jacobian
import theano
import theano.tensor as T
x=T.dvector('x')
y=x**2


# def scan(fn,
#          sequences=None,
#          outputs_info=None,
#          non_sequences=None,
#          n_steps=None,
#          truncate_gradient=-1,
#          go_backwards=False,
#          mode=None,
#          name=None,
#          profile=False,
#          allow_gc=None,
#          strict=False):
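# fn: the operation performed at each step of the scan
# sequences: the values that scan iterates over
# non_sequences: list of values passed to fn unchanged at every step
# lambda is an anonymous function, somewhat like a function object
# shape[i] is the length of dimension i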
J,updates=theano.scan(lambda i,y,x:T.grad(y[i],x),sequences=T.arange(y.shape[0]), non_sequences=[y,x])
f=theano.function([x],J,updates=updates)
print(f([4,4]))
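For y = x**2 the Jacobian is diagonal with entries 2*x[i], so f([4,4]) should give a matrix with 8 on the diagonal and 0 elsewhere. Below is a rough NumPy sketch that approximates the same matrix with finite differences. Note that it fills the matrix one column at a time (perturbing one x[j] per step), whereas the scan above fills it one row at a time; `numerical_jacobian` is a name I made up for the illustration.

```python
import numpy as np

def numerical_jacobian(fn, x, eps=1e-6):
    # Approximate J[i, j] = d fn(x)[i] / d x[j] with central differences.
    x = np.asarray(x, dtype=float)
    y0 = np.asarray(fn(x))
    jac = np.zeros((y0.size, x.size))
    for j in range(x.size):
        step = np.zeros_like(x)
        step[j] = eps  # perturb only the j-th input
        jac[:, j] = (fn(x + step) - fn(x - step)) / (2.0 * eps)
    return jac

x0 = np.array([4.0, 4.0])
J = numerical_jacobian(lambda v: v ** 2, x0)
print(np.round(J, 4))
```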

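Hessian matrix:

The square matrix formed by the second-order partial derivatives of a scalar-valued function of a vector.

In Theano this means the output (the cost) is a scalar and the input is a vector.

The difference from the Jacobian: in T.grad(cost, x), cost is a scalar. The Hessian is then obtained by differentiating each element of that gradient with respect to x.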

#coding=utf-8

# Hessian
import theano
import theano.tensor as T
x=T.dvector('x')
y=x**2
cost=y.sum()
gy=T.grad(cost,x)
H,updates=theano.scan(lambda i,gy,x:T.grad(gy[i],x),sequences=T.arange(gy.shape[0]), non_sequences=[gy,x])
f=theano.function([x],H,updates=updates)
print(f([4,4]))
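Since cost = sum(x**2), its gradient is 2*x and the Hessian is 2 times the identity matrix, whatever x is. The sketch below mirrors the structure of the Theano code in NumPy: start from the gradient, then differentiate each of its components numerically. `cost_grad` and `numerical_hessian` are hypothetical names used only here.

```python
import numpy as np

def cost_grad(x):
    # Analytic gradient of cost = sum(x**2).
    return 2.0 * x

def numerical_hessian(grad_fn, x, eps=1e-6):
    # The Hessian is the Jacobian of the gradient; approximate it column by column.
    x = np.asarray(x, dtype=float)
    n = x.size
    hess = np.zeros((n, n))
    for j in range(n):
        step = np.zeros(n)
        step[j] = eps
        hess[:, j] = (grad_fn(x + step) - grad_fn(x - step)) / (2.0 * eps)
    return hess

H = numerical_hessian(cost_grad, [4.0, 4.0])
print(np.round(H, 4))
```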

I did not understand the Jacobian R-operator and L-operator...
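For what it is worth, my tentative understanding is that the R-operator multiplies the Jacobian by a vector on the right (J times v) and the L-operator multiplies it by a row vector on the left (u times J), in both cases without building the full Jacobian. The NumPy sketch below only illustrates the two products for the linear map y = W x, whose Jacobian is W itself. The matrix `W` and the vectors `v` and `u` are made-up values, and this is not how Theano implements the operators.

```python
import numpy as np

# y = W @ x is linear, so its Jacobian with respect to x is W.
W = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])

v = np.array([1.0, 0.0, 1.0])  # same shape as x
u = np.array([1.0, 1.0])       # same shape as y

right_product = W @ v  # what an R-operator evaluates: J times v
left_product = u @ W   # what an L-operator evaluates: u times J

print(right_product)  # [ 4. 10.]
print(left_product)   # [5. 7. 9.]
```

The right product has the shape of y and the left product has the shape of x, which is an easy way to tell the two apart.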

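Summary

1. The grad function works symbolically: it takes and returns symbolic expressions (formulas).

2. grad can be thought of as a macro, since it can be applied repeatedly.

3. Only scalar costs can be handled directly by grad; arrays are handled by applying grad repeatedly, once per element.

4. The left-multiplication and right-multiplication question (L-operator and R-operator) remains open for now.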