The Theano tutorial has documentation on scan: http://deeplearning.net/software/theano/library/scan.html
First, two simple examples.
Example 1: loop for a given number of steps n_steps
Compute A**k.
import theano
import theano.tensor as T

k = T.iscalar("k")
A = T.vector("A")

# Symbolic description of the result
result, updates = theano.scan(fn=lambda prior_result, A: prior_result * A,
                              outputs_info=T.ones_like(A),
                              non_sequences=A,
                              n_steps=k)

# We only care about A**k, but scan has provided us with A**1 through A**k.
# Discard the values that we don't care about. Scan is smart enough to
# notice this and not waste memory saving them.
final_result = result[-1]

# compiled function that returns A**k
power = theano.function(inputs=[A, k], outputs=final_result, updates=updates)

print(power(range(10), 2))
print(power(range(10), 4))
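As a mental model, here is a plain NumPy sketch of the loop that scan performs above. It only illustrates the computation, not how Theano actually executes it, and the helper name power_loop is mine:

import numpy

def power_loop(A, k):
    # prior_result starts as ones_like(A), exactly like outputs_info above
    prior_result = numpy.ones_like(A)
    for _ in range(k):
        prior_result = prior_result * A  # one scan step
    return prior_result                  # corresponds to result[-1]

print(power_loop(numpy.arange(10, dtype='float32'), 2))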
Example 2: summing up the terms of a polynomial
import numpy
import theano
import theano.tensor as T

coefficients = theano.tensor.vector("coefficients")
x = T.scalar("x")
max_coefficients_supported = 10000

# Generate the components of the polynomial
components, updates = theano.scan(fn=lambda coefficient, power, free_variable: coefficient * (free_variable ** power),
                                  outputs_info=None,
                                  sequences=[coefficients, theano.tensor.arange(max_coefficients_supported)],
                                  non_sequences=x)
# Sum them up
polynomial = components.sum()

# Compile a function
calculate_polynomial = theano.function(inputs=[coefficients, x], outputs=polynomial)

# Test
test_coefficients = numpy.asarray([1, 0, 2], dtype=numpy.float32)
test_value = 3
print(calculate_polynomial(test_coefficients, test_value))
print(1.0 * (3 ** 0) + 0.0 * (3 ** 1) + 2.0 * (3 ** 2))
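Again, a plain-Python sketch of the loop scan runs here (illustration only, the helper name polynomial_loop is mine): each step receives one coefficient and one power zipped from the two sequences, plus the unchanged non_sequence x.

import numpy

def polynomial_loop(coefficients, x):
    components = []
    # scan zips the two sequences element-wise and stops at the shorter one,
    # so in effect it runs len(coefficients) steps
    for coefficient, power in zip(coefficients, range(len(coefficients))):
        components.append(coefficient * (x ** power))  # one call to fn
    return numpy.sum(components)                       # components.sum()

print(polynomial_loop([1.0, 0.0, 2.0], 3.0))  # 1*3**0 + 0*3**1 + 2*3**2 = 19.0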
Then, since scan is how you write loops in Theano, keep three things in mind:
1. what changes at each step
2. what stays fixed
3. what each iteration returns
The tutorial sums it up in one sentence:
The general order of function parameters to fn is: sequences (if any), prior result(s) (if needed), non-sequences (if any).
Correspondingly, fn is called once per loop step, and its parameters arrive in exactly that order: the things that change, then the previous result(s), then the things that stay fixed.
And what does "the previous result" look like? That is what outputs_info specifies.
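A minimal sketch of that parameter order (the names seq, const, and step are mine, just for illustration): fn receives the current sequence element first, then the previous output, then the non_sequence.

import theano
import theano.tensor as T

seq = T.vector("seq")      # what changes: one element per step
const = T.scalar("const")  # what stays fixed

def step(s_t, prev, c):
    # argument order: sequence element, previous result, non_sequence
    return prev + s_t * c

out, _ = theano.scan(fn=step,
                     sequences=seq,
                     outputs_info=T.constant(0.0),  # shape/start of the "previous result"
                     non_sequences=const)

f = theano.function([seq, const], out[-1], allow_input_downcast=True)
print(f([1, 2, 3], 2.0))  # 2 * (1 + 2 + 3) = 12.0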
2016.12.17
Today I got stumped by a question from Xiaohang.
The question:
import os, time
import sys
import timeit

import numpy
import theano
import theano.typed_list
import theano.tensor as T

X = T.fmatrix("X")

emb = theano.shared(name='embeddings',
                    value=0.2 * numpy.random.uniform(-1.0, 1.0, (50 + 1, 10)).astype('float32'))
wx = theano.shared(name='wx',
                   value=0.2 * numpy.random.uniform(-1.0, 1.0, (10, 5)).astype('float32'))
wh = theano.shared(name='wh',
                   value=0.2 * numpy.random.uniform(-1.0, 1.0, (5, 5)).astype('float32'))
w = theano.shared(name='w',
                  value=0.2 * numpy.random.uniform(-1.0, 1.0, (5, 15)).astype('float32'))
bh = theano.shared(name='bh', value=numpy.zeros(5, dtype='float32'))
b = theano.shared(name='b', value=numpy.zeros(15, dtype='float32'))
h0 = theano.shared(name='h0', value=numpy.zeros(5, dtype='float32'))

def recurrence(x_t, h_tm1):
    h_t = T.nnet.sigmoid(T.dot(x_t, wx) + T.dot(h_tm1, wh) + bh)
    s_t = T.nnet.softmax(T.dot(h_t, w) + b)
    return [h_t, s_t]

[h, s], _ = theano.scan(fn=recurrence, sequences=X, outputs_info=[h0, None], n_steps=X.shape[0])

fn = theano.function(inputs=[X], outputs=[s], allow_input_downcast=True, on_unused_input='ignore')
The input X is a 2-D matrix, numpy.random.randint(1., 20., (3, 10)).astype('float32'). What is the shape of the output s?
At first glance it looks simple: the input is a (3, 10) matrix, each scan step produces a hidden representation and a softmax output, so after 3 steps the output should be s --> (3, 15).
But the actual output is s --> (3, 1, 15). His question: what is that middle dimension, and how did it get there?
Where does the extra dimension come from?
1. I remembered that in earlier experiments no such extra dimension appeared, so I was curious whether the hidden representation also picked up an extra dimension.
(The code is the same as above; the only change is that the compiled function now also returns the hidden states:)

fn = theano.function(inputs=[X], outputs=[h, s], allow_input_downcast=True, on_unused_input='ignore')
The result: h --> (3, 5), s --> (3, 1, 15).
The hidden states did not gain the extra dimension, so scan itself is not adding it; my understanding of scan is fine.
2. Is softmax the problem?
(Again the same code, with a single change in recurrence: the output softmax is replaced by a sigmoid:)

def recurrence(x_t, h_tm1):
    h_t = T.nnet.sigmoid(T.dot(x_t, wx) + T.dot(h_tm1, wh) + bh)
    s_t = T.nnet.sigmoid(T.dot(h_t, w) + b)   # was T.nnet.softmax(T.dot(h_t, w) + b)
    return [h_t, s_t]

[h, s], _ = theano.scan(fn=recurrence, sequences=X, outputs_info=[h0, None], n_steps=X.shape[0])
fn = theano.function(inputs=[X], outputs=[h, s], allow_input_downcast=True, on_unused_input='ignore')
3. Keep digging into Theano's softmax function: http://deeplearning.net/software/theano/library/tensor/nnet/nnet.html
The documentation states that the return value of softmax is:
Returns: a symbolic 2D tensor
Experiment 1:

X = T.fvector("X")
out = T.nnet.softmax(X)
fn = theano.function(inputs=[X], outputs=[out], allow_input_downcast=True, on_unused_input='ignore')

b = numpy.zeros(15, dtype='float32')
c = fn(b)[0]
print(numpy.shape(b))
print(numpy.shape(c))

Result: b --> (15,), c --> (1, 15).
Experiment 2:

X = T.fmatrix("X")
out = T.nnet.softmax(X)
fn = theano.function(inputs=[X], outputs=[out], allow_input_downcast=True, on_unused_input='ignore')

b = numpy.zeros([2, 5], dtype='float32')
c = fn(b)[0]
print(numpy.shape(b))
print(numpy.shape(c))

Result: b --> (2, 5), c --> (2, 5).
Sure enough, softmax is the culprit.
The softmax documentation says: "The softmax function will, when applied to a matrix, compute the softmax values row-wise."
That is, softmax operates on each row.
If the input to softmax is a vector of shape (n,), as in Experiment 1, softmax treats it as a single row and returns a (1, n) matrix. In recurrence, h_t has shape (5,), so T.dot(h_t, w) + b has shape (15,), and softmax turns it into (1, 15); scan then stacks one such result per step along a new leading axis, which is exactly how s ends up with shape (3, 1, 15).
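If the goal is to get s --> (3, 15) directly, one possible fix (my own sketch, not part of the original question) is to drop the singleton row inside recurrence:

def recurrence(x_t, h_tm1):
    h_t = T.nnet.sigmoid(T.dot(x_t, wx) + T.dot(h_tm1, wh) + bh)
    # softmax returns a (1, 15) matrix here; take row 0 so each step yields a (15,) vector
    s_t = T.nnet.softmax(T.dot(h_t, w) + b)[0]
    return [h_t, s_t]

[h, s], _ = theano.scan(fn=recurrence, sequences=X, outputs_info=[h0, None], n_steps=X.shape[0])
# now s has shape (3, 15): scan stacks one (15,) vector per step

Alternatively, you could keep recurrence unchanged and reshape s after scan, but indexing inside the step keeps each per-step output a plain vector.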