Theano scan notes

The Theano tutorial has a section on scan: http://deeplearning.net/software/theano/library/scan.html


First, two simple examples.


Example 1: a fixed number of iterations given by n_steps

Compute A**k (element-wise).

import theano
import theano.tensor as T

k = T.iscalar("k")
A = T.vector("A")

# Symbolic description of the result
result, updates = theano.scan(fn=lambda prior_result, A: prior_result * A,
                              outputs_info=T.ones_like(A),
                              non_sequences=A,
                              n_steps=k)

# We only care about A**k, but scan has provided us with A**1 through A**k.
# Discard the values that we don't care about. Scan is smart enough to
# notice this and not waste memory saving them.
final_result = result[-1]

# compiled function that returns A**k
power = theano.function(inputs=[A,k], outputs=final_result, updates=updates)

print(power(range(10), 2))
print(power(range(10), 4))
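To make the loop structure concrete, here is a plain numpy sketch of what scan computes for this example: start from the initial value given by outputs_info, apply fn once per step, and collect every intermediate result (the function name `power_scan` is my own, not a Theano API).

```python
import numpy as np

def power_scan(A, k):
    # outputs_info=T.ones_like(A): the initial "prior_result"
    result = np.ones_like(A, dtype=float)
    outputs = []
    for _ in range(k):
        # fn=lambda prior_result, A: prior_result * A
        result = result * A
        outputs.append(result)
    # scan returns all k intermediate results; we keep only the last
    return outputs[-1]

print(power_scan(np.arange(10, dtype=float), 2))  # squares 0, 1, 4, ..., 81
```

The final `outputs[-1]` corresponds to `result[-1]` in the Theano version; as the tutorial notes, scan's optimizer avoids storing the intermediates you never use.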



Example 2: evaluating a polynomial

import numpy
import theano
import theano.tensor as T

coefficients = theano.tensor.vector("coefficients")
x = T.scalar("x")

max_coefficients_supported = 10000

# Generate the components of the polynomial
components, updates = theano.scan(fn=lambda coefficient, power, free_variable: coefficient * (free_variable ** power),
                                  outputs_info=None,
                                  sequences=[coefficients, theano.tensor.arange(max_coefficients_supported)],
                                  non_sequences=x)
# Sum them up
polynomial = components.sum()

# Compile a function
calculate_polynomial = theano.function(inputs=[coefficients, x], outputs=polynomial)

# Test
test_coefficients = numpy.asarray([1, 0, 2], dtype=numpy.float32)
test_value = 3
print(calculate_polynomial(test_coefficients, test_value))
print(1.0 * (3 ** 0) + 0.0 * (3 ** 1) + 2.0 * (3 ** 2))
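Again, a numpy sketch of what this scan does: each step pairs one element from each of the two sequences (a coefficient and its power from arange) with the fixed non-sequence x (`polynomial_scan` is my own illustrative name).

```python
import numpy as np

def polynomial_scan(coefficients, x):
    # sequences=[coefficients, arange(...)]: zipped element-wise per step;
    # non_sequences=x: passed unchanged to every step.
    components = [c * (x ** p) for p, c in enumerate(coefficients)]
    # polynomial = components.sum()
    return sum(components)

print(polynomial_scan([1.0, 0.0, 2.0], 3.0))  # 19.0
```

Note that scan stops at the shorter sequence, which is why an oversized arange (max_coefficients_supported) is harmless.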


Since scan is how Theano expresses loops, keep three things in mind:

1.  what varies on each step

2.  what stays fixed

3.  what each step produces


The tutorial sums this up in one sentence:

The general order of function parameters to fn is:

sequences (if any), prior result(s) (if needed), non-sequences (if any)


Correspondingly, fn is called once per iteration, and its parameters come in exactly that order:

the things that vary, the previous result, the things that stay fixed


And what does "the previous result" look like? That is what outputs_info specifies.
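The calling convention can be sketched in plain Python (the names `step` and `scan_like` are hypothetical, just to illustrate the argument order: sequence element first, then the prior result, then non-sequences):

```python
# fn receives: current sequence element, previous output, non-sequences
def step(x_t, prior_result, decay):
    return decay * prior_result + x_t

def scan_like(fn, sequence, outputs_info, non_sequence):
    acc = outputs_info           # initial "previous result"
    outputs = []
    for x_t in sequence:         # the thing that varies
        acc = fn(x_t, acc, non_sequence)
        outputs.append(acc)      # scan collects every step's result
    return outputs

print(scan_like(step, [1, 2, 3], 0.0, 0.5))  # [1.0, 2.5, 4.25]
```

Here `outputs_info` plays exactly the role of the initial accumulator, and the returned list is the stacked per-step output that scan gives you.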





2016.12.17

Today a question from Xiaohang stumped me.


The question:

import os, time
import sys
import timeit
import numpy
import theano
import theano.typed_list
import theano.tensor as T

X = T.fmatrix("X")
# emb is defined but unused here (hence on_unused_input='ignore' below)
emb = theano.shared(name='embeddings',
                         value=0.2 * numpy.random.uniform(-1.0, 1.0, (50 + 1, 10)).astype('float32'))
wx = theano.shared(name='wx', value=0.2 * numpy.random.uniform(-1.0, 1.0, (10, 5)).astype('float32'))
wh = theano.shared(name='wh',
                        value=0.2 * numpy.random.uniform(-1.0, 1.0, (5, 5)).astype('float32'))
w = theano.shared(name='w',
                       value=0.2 * numpy.random.uniform(-1.0, 1.0, (5, 15)).astype('float32'))
bh = theano.shared(name='bh', value=numpy.zeros(5, dtype='float32'))
b = theano.shared(name='b', value=numpy.zeros(15, dtype='float32'))
h0 = theano.shared(name='h0', value=numpy.zeros(5, dtype='float32'))

def recurrence(x_t, h_tm1):
    h_t = T.nnet.sigmoid(T.dot(x_t, wx) + T.dot(h_tm1, wh) + bh)
    s_t = T.nnet.softmax(T.dot(h_t, w) + b)
    return [h_t, s_t]


[h, s], _ = theano.scan(fn=recurrence, sequences=X, outputs_info=[h0, None], n_steps=X.shape[0])
fn = theano.function(inputs=[X], outputs=[s], allow_input_downcast=True, on_unused_input='ignore')

The input X is a 2-D matrix, numpy.random.randint(1, 20, (3, 10)).astype('float32'). What is the shape of the output s?


At first glance it looks simple: the input is a (3, 10) matrix, each scan step computes a hidden representation and a softmax output, so after 3 steps the output should be s --> (3, 15).


But the actual output is s --> (3, 1, 15). His question: what is that middle dimension, and how did it get there?


Where does the extra dimension come from?


1. I remembered that in earlier experiments no such middle dimension appeared, so I wondered whether the hidden representation also gains an extra dimension.

Same script as above, except the compiled function now also returns h:

fn = theano.function(inputs=[X], outputs=[h, s], allow_input_downcast=True, on_unused_input='ignore')

The result: h --> (3,5), s --> (3,1,15).

The hidden state gains no extra dimension, so scan itself is not adding it; my understanding of scan was fine.
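This is consistent with how scan assembles its output: it simply stacks whatever shape each step returns along a new leading time axis. A numpy sketch of that stacking:

```python
import numpy as np

# If each call to recurrence returns s_t of shape (1, 15), stacking 3
# steps along a new leading axis gives (3, 1, 15).
per_step = [np.zeros((1, 15)) for _ in range(3)]
print(np.stack(per_step).shape)   # (3, 1, 15)

# A step returning shape (15,), like h_t, stacks to (3, 15) instead.
per_step_vec = [np.zeros(15) for _ in range(3)]
print(np.stack(per_step_vec).shape)  # (3, 15)
```

So the question reduces to: why does each step's s_t have shape (1, 15) rather than (15,)?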


2. Is softmax the culprit?

Same script again, but with s_t computed by sigmoid instead of softmax:

def recurrence(x_t, h_tm1):
    h_t = T.nnet.sigmoid(T.dot(x_t, wx) + T.dot(h_tm1, wh) + bh)
    s_t = T.nnet.sigmoid(T.dot(h_t, w) + b)
    return [h_t, s_t]


[h, s], _ = theano.scan(fn=recurrence, sequences=X, outputs_info=[h0, None], n_steps=X.shape[0])
fn = theano.function(inputs=[X], outputs=[h, s], allow_input_downcast=True, on_unused_input='ignore')


I changed the output step so that s_t also just applies a sigmoid:

T.nnet.sigmoid(T.dot(h_t, w) + b)


The result: h --> (3,5), s --> (3,15). So it really is softmax's doing.


3. Next I looked at Theano's softmax function: http://deeplearning.net/software/theano/library/tensor/nnet/nnet.html

Its documented return value is:

Returns: a symbolic 2D tensor

So I ran an experiment:

exp1

X = T.fvector("X")
out=T.nnet.softmax(X)
fn = theano.function(inputs=[X], outputs=[out], allow_input_downcast=True, on_unused_input='ignore')

b = numpy.zeros(15, dtype='float32')

c = fn(b)[0]
print(numpy.shape(b))
print(numpy.shape(c))

Result: b --> (15,), c --> (1,15)


exp2

X = T.fmatrix("X")
out=T.nnet.softmax(X)
fn = theano.function(inputs=[X], outputs=[out], allow_input_downcast=True, on_unused_input='ignore')

b = numpy.zeros([2,5], dtype='float32')

c = fn(b)[0]
print(numpy.shape(b))
print(numpy.shape(c))

Result: b --> (2,5), c --> (2,5)


Sure enough, it is softmax.

The softmax documentation says: "The softmax function will, when applied to a matrix, compute the softmax values row-wise."

That is, softmax operates on each row. If the input is a vector of shape (n,), as in exp1, softmax treats it as a single row and returns a (1, n) matrix.
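To see that promotion in isolation, here is a numpy sketch mirroring that behaviour (`softmax_rowwise` is my own name, not a Theano API):

```python
import numpy as np

def softmax_rowwise(x):
    # Promote a vector to a 1-row matrix, then softmax each row
    # (subtracting the row max for numerical stability).
    x = np.atleast_2d(x)
    e = np.exp(x - x.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

print(softmax_rowwise(np.zeros(15, dtype='float32')).shape)  # (1, 15)  -- vector promoted
print(softmax_rowwise(np.zeros((2, 5))).shape)               # (2, 5)   -- matrix unchanged
```

One common workaround in the recurrence above is to return s_t[0] (or reshape s afterwards) to drop the singleton axis when a (3, 15) result is what you want.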










