Theano tutorial (6): Loop

pmt123456

Article link: https://blog.csdn.net/pmt123456/article/details/51188970

Scan

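1. Reduction and map are special cases of scan.

2. scan walks over an input sequence and produces one output at every step.

3. sum() can be expressed as a scan that applies z + x(i) over the whole sequence, with z initialized to 0.

4. scan is Theano's wrapper for looping.

5. Advantages of using scan:

The number of iterations can be part of the symbolic graph.

It reduces the number of transfers to and from the GPU.

Gradients can be computed through the sequential steps.

It can lower total memory consumption by detecting how much memory is actually needed.

While computing, scan can access the outputs of the previous n steps.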
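
Points 1 to 3 can be sketched in plain Python, without Theano. The helper `scan_sketch` below is hypothetical (it is not part of Theano's API); it only mimics the idea of walking a sequence, optionally carrying the previous output along, and emitting one output per step.

```python
def scan_sketch(fn, sequence, initial=None):
    """Hypothetical stand-in for the idea behind scan (not Theano's API).

    With initial=None, fn only sees the current element (a map).
    Otherwise fn receives (element, previous_output), as in a reduction.
    """
    outputs = []
    prev = initial
    for x in sequence:
        prev = fn(x) if initial is None else fn(x, prev)
        outputs.append(prev)  # one output per step
    return outputs


xs = [1, 2, 3, 4]

# map: each output depends only on the current element
squares = scan_sketch(lambda x: x * x, xs)

# reduction: z + x(i) with z starting at 0; the sum is the last output
running = scan_sketch(lambda x, z: z + x, xs, initial=0)
total = running[-1]

print(squares, running, total)
```

The reduction keeps every intermediate value, which is why the final sum is read from the last element of the output.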

Definition of scan

# def scan(fn,
#          sequences=None,
#          outputs_info=None,
#          non_sequences=None,
#          n_steps=None,
#          truncate_gradient=-1,
#          go_backwards=False,
#          mode=None,
#          name=None,
#          profile=False,
#          allow_gc=None,
#          strict=False):
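# fn: the function object that defines one step of the scan. It can be defined outside the call or inline. An inline fn is usually a lambda that names the arguments it needs; for a function already defined with def, pass the function name as fn.
# sequences: the sequences to iterate over (the input sequences). Their values are passed to fn as its leading arguments. If outputs_info supplies an initial value, this argument can be omitted (n_steps then gives the number of iterations).
# outputs_info: describes the initial values and whether outputs of earlier iterations are needed. dict(initial=X, taps=[-2, -1]) uses X as the initial values, and taps says that the outputs of the previous step and the step before it are used: if the current output is x(t), the computation uses x(t-1) and x(t-2).
# non_sequences: the non-sequence inputs (parameters). Their values are passed to fn as its trailing arguments and stay the same at every iteration.
# lambda: an anonymous function, somewhat like a function object.
# shape[i]: the length of the i-th dimension.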
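
To make the argument order and the meaning of taps concrete, here is a plain-Python sketch. `scan_taps_sketch` is a hypothetical helper written for illustration, not Theano's implementation; it assumes a single input sequence and a single recurrent output, and passes arguments to fn in the order described above: sequence element first, then the tapped past outputs, then the non-sequence values.

```python
def scan_taps_sketch(fn, sequence, initial, taps, non_sequences=()):
    """Hypothetical illustration of fn(seq_elem, *past_outputs, *non_sequences)."""
    history = list(initial)      # must hold at least max(|tap|) values
    n_init = len(history)
    for x in sequence:
        # Negative taps index backwards: -1 is the previous output, -2 the one before
        past = [history[t] for t in taps]
        history.append(fn(x, *past, *non_sequences))
    return history[n_init:]      # drop the initial values, keep the new outputs


# x(t) = a * (x(t-2) + x(t-1)) + s(t), seeded with x(-2) = x(-1) = 1
outputs = scan_taps_sketch(
    lambda s, x_tm2, x_tm1, a: a * (x_tm2 + x_tm1) + s,
    sequence=[0, 0, 0, 0, 0],
    initial=[1, 1],
    taps=[-2, -1],
    non_sequences=(1,),
)
print(outputs)
```

With a zero sequence and a = 1 this is the Fibonacci recurrence, so each output is the sum of the two before it.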

# Scan
import theano
import theano.tensor as T
import numpy as np
X=T.matrix('X')
W=T.matrix('W')
b_sym=T.vector("b_sym")

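# scan returns (outputs_list, update_dictionary)
# outputs_list: the outputs, in the same order as outputs_info (the initial states of the outputs)
# update_dictionary: how shared variables are updated after each iteration step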
results,updates=theano.scan(lambda v:T.tanh(T.dot(v,W)+b_sym),sequences=X)
compute_elementwise=theano.function(inputs=[W,X,b_sym],outputs=results)

x=np.eye(2,dtype=theano.config.floatX)
w=np.ones((2,2),dtype=theano.config.floatX)
b = np.ones((2), dtype=theano.config.floatX)
b[1] = 2 

print(compute_elementwise(x, w, b))

# Same result as the NumPy computation
print(np.tanh(x.dot(w) + b))

#coding=utf-8

import theano
import theano.tensor as T
import numpy as np

# define tensor variables
X = T.vector("X")
W = T.matrix("W")
b_sym = T.vector("b_sym")
Y = T.matrix("Y")


results, updates = theano.scan(lambda y,x_tm1: T.dot(x_tm1, W),
          sequences=[Y], outputs_info=[X])
compute_seq = theano.function(inputs=[X, W, Y], outputs=results)

# test values
x = np.zeros((2), dtype=theano.config.floatX)
x[1] = 1
w = np.ones((2, 2), dtype=theano.config.floatX)
y = np.ones((5, 2), dtype=theano.config.floatX)
y[0, :] = -3


print(compute_seq(x, w, y))

# comparison with numpy
x_res = np.zeros((5, 2), dtype=theano.config.floatX)
x_res[0] = x.dot(w)
for i in range(1, 5):
    x_res[i] = x_res[i - 1].dot(w)
print(x_res)

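I modified the example from the official tutorial so that the results are easier to compare. At each step, scan takes one slice with the same shape as outputs_info and uses it as the input for that step's computation.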

[[  1.   1.]
 [  2.   2.]
 [  4.   4.]
 [  8.   8.]
 [ 16.  16.]]
[[  1.   1.]
 [  2.   2.]
 [  4.   4.]
 [  8.   8.]
 [ 16.  16.]]
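
One detail of the modified example is worth spelling out: the lambda ignores y, so Y only determines how many steps run (five rows, five steps). The plain-Python sketch below reproduces the recurrence under that reading; `vec_mat` and `step` are hypothetical helper names, and lists stand in for the Theano tensors.

```python
def vec_mat(vec, mat):
    """Row vector times matrix, using plain lists."""
    return [sum(vec[k] * mat[k][j] for k in range(len(vec)))
            for j in range(len(mat[0]))]


def step(y_row, x_prev, w):
    # y_row is deliberately unused, mirroring lambda y, x_tm1: T.dot(x_tm1, W)
    return vec_mat(x_prev, w)


x0 = [0.0, 1.0]                      # plays the role of outputs_info=[X]
w = [[1.0, 1.0], [1.0, 1.0]]
y = [[-3.0, -3.0]] + [[1.0, 1.0]] * 4

trace = []
x_prev = x0
for y_row in y:                      # one step per row of Y
    x_prev = step(y_row, x_prev, w)
    trace.append(x_prev)
print(trace)
```

Because W is all ones, each step doubles the previous output, which matches the powers of two printed above.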

Understanding sequences and outputs_info better

#coding=utf-8

import theano
import theano.tensor as T
import numpy as np

# define tensor variables
X = T.vector("X")
W = T.matrix("W")
b_sym = T.vector("b_sym")
U = T.matrix("U")
Y = T.matrix("Y")
V = T.matrix("V")
P = T.matrix("P")

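# results, updates = theano.scan(lambda y, p, x_tm1: T.tanh(T.dot(x_tm1, W) + T.dot(y, U) + T.dot(p, V)),
#           sequences=[Y, P[::-1]], outputs_info=[X])  # outputs_info=[X] stands for x(t-1) in the formula
# Note the use of P[::-1]: P is traversed in reverse order.
# Below, sequences and outputs_info are rewritten with explicit taps; the result is the same.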
results, updates = theano.scan(lambda y, p, x_tm1: T.tanh(T.dot(x_tm1, W) + T.dot(y, U) + T.dot(p, V)),
           sequences=[dict(input=Y,taps=[0]), P[::-1]], outputs_info=[dict(initial=X,taps=[-1])])
compute_seq = theano.function(inputs=[X, W, Y, U, P, V], outputs=results)

# test values
x = np.zeros((2), dtype=theano.config.floatX)
x[1] = 1
w = np.ones((2, 2), dtype=theano.config.floatX)
y = np.ones((5, 2), dtype=theano.config.floatX)
y[0, :] = -3
u = np.ones((2, 2), dtype=theano.config.floatX)
p = np.ones((5, 2), dtype=theano.config.floatX)
p[0, :] = 3
v = np.ones((2, 2), dtype=theano.config.floatX)

print(compute_seq(x, w, y, u, p, v))

# comparison with numpy
x_res = np.zeros((5, 2), dtype=theano.config.floatX)
x_res[0] = np.tanh(x.dot(w) + y[0].dot(u) + p[4].dot(v))
for i in range(1, 5):
    x_res[i] = np.tanh(x_res[i - 1].dot(w) + y[i].dot(u) + p[4-i].dot(v))
print(x_res)
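
The NumPy comparison indexes p with p[4 - i] because the scan iterates over P[::-1] in lockstep with Y. A minimal plain-Python sketch of that pairing, with placeholder strings standing in for the rows of Y and P:

```python
ys = ["y0", "y1", "y2", "y3", "y4"]   # placeholder rows of Y
ps = ["p0", "p1", "p2", "p3", "p4"]   # placeholder rows of P

# Sequences are consumed in lockstep; the second one is reversed first
pairs = list(zip(ys, ps[::-1]))
for i, (y_i, p_i) in enumerate(pairs):
    print(i, y_i, p_i)                # step i sees y[i] together with p[4 - i]
```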

Computing norms of lines of X

#coding=utf-8

import theano
import theano.tensor as T
import numpy as np

X=T.matrix("X")
# Note: scan iterates over the rows of X here; to iterate over columns, use sequences=[X.T]
results,updates=theano.scan(lambda x_i:T.sqrt((x_i**2).sum()),sequences=[X])
compute_norm_lines=theano.function(inputs=[X],outputs=results)

# np.diag(v, k): k > 0 places v above the main diagonal, k < 0 places it below
x=np.diag(np.arange(1,6,dtype=theano.config.floatX),1)
print(compute_norm_lines(x))

# Sum the squares along each row first, then take the square root
print(np.sqrt((x ** 2).sum(1)))

[ 1.  2.  3.  4.  5.  0.]
[ 1.  2.  3.  4.  5.  0.]
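
The row-versus-column remark in the code comment can be checked on a small matrix in plain Python. This is only a sketch with nested lists instead of Theano tensors; `norm` is a hypothetical helper, and `zip(*X)` plays the role of X.T.

```python
import math


def norm(vec):
    """Euclidean norm: square, sum, then take the square root."""
    return math.sqrt(sum(v * v for v in vec))


X = [[3.0, 4.0, 0.0],
     [0.0, 0.0, 5.0]]

row_norms = [norm(row) for row in X]        # iterating X yields rows
col_norms = [norm(col) for col in zip(*X)]  # iterating the transpose yields columns
print(row_norms, col_norms)
```

The two rows give two norms and the three columns give three, so the length of the result shows which axis was iterated.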
