Updating a Subset of Weights in Theano

I'm new to Theano. The last section of the official tutorial covers updating only part of a weight matrix (a weight subset): "How to update a subset of weights?". Following that tutorial I wrote my own example, but f = theano.function(…, updates=updates) failed. The error message showed that updates = inc_subtensor(subset, g*lr) produces a single tensor, while the updates argument expects 2-element pairs: "The updates parameter must be an OrderedDict/dict or a list of lists/tuples with 2 elements".
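
A minimal sketch of the failing pattern, assuming the tutorial's setup where subset is a slice of a shared variable, g its gradient, and lr the learning rate:

updates = T.inc_subtensor(subset, g * lr)       # this is a single tensor variable...
f = theano.function([], cost, updates=updates)  # ...so this raises the error quoted above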

After several failed attempts to fix it, I finally found the right way to use inc_subtensor inside updates on the ever-reliable Stack Overflow. I then tried the approach on a Logistic Regression example of my own and verified that it works:

import theano
import theano.tensor as T
import numpy

x = theano.shared(numpy.random.rand(100,50).astype(theano.config.floatX), name='x')
w = theano.shared(numpy.random.rand(50,1).astype(theano.config.floatX), name='w')
y = theano.shared(numpy.random.randint(size=(100,1), low=0, high=2).astype(theano.config.floatX), name='y')

part = 40               # only the first `part` weights get updated
wsubset = w[0:part,:]   # the trainable subset
wrest = w[part:,:]      # the frozen remainder
print("w before train:")
print(w.get_value().T)

p_1 = 1 / (1 + T.exp(-T.dot(x[:,0:part], wsubset) - T.dot(x[:,part:], wrest)))  # 1/(1+exp(-w'x))
predict = p_1 > 0.5

crossEnt = -y * T.log(p_1) - (1-y) * T.log(1-p_1) # cross entropy function
cost = crossEnt.mean() + 0.01 * (wsubset ** 2).sum() # cost function with regularization term
#cost = T.sum((y - 1/(1 + T.exp(-T.dot(x[:,0:part], wsubset) - T.dot(x[:,part:], wrest))))**2)

wgrad = T.grad(cost, wsubset) # gradient w.r.t. wsubset
update = (w, T.inc_subtensor(wsubset, -0.01 * wgrad)) # (shared var, new value) pair: inc_subtensor returns the full w with the slice incremented

trainFn = theano.function(inputs=[], outputs=[predict], updates=[update])
predFn = theano.function(inputs=[], outputs=[predict])  # prediction function

for i in range(5000):
    trainFn()

print "w after train:"
print w.get_value().T
print "y is :"
print y.get_value().T
print "predict is:"
print (predFn()[0] * 1.0).T

In the code above, gradient descent only updates the first 40 elements of the weight vector w while the remaining 10 elements stay fixed, and the bias term b of 1/(1+exp(-(w'x+b))) is omitted. It's a naive example with no practical use, but it works fine as an exercise...
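
For completeness, a hedged sketch of how the omitted bias b could be added back; b is a new shared variable introduced here, not part of the original code, and it gets an ordinary full update alongside the subset update:

b = theano.shared(numpy.zeros((), dtype=theano.config.floatX), name='b')  # hypothetical bias, not in the original code
p_1 = 1 / (1 + T.exp(-T.dot(x[:,0:part], wsubset) - T.dot(x[:,part:], wrest) - b))
bgrad = T.grad(cost, b)  # cost rebuilt from the new p_1, as before
updates = [(w, T.inc_subtensor(wsubset, -0.01 * wgrad)),  # subset update for w
           (b, b - 0.01 * bgrad)]                         # ordinary full update for b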

Key code: the weight matrix has to be split in advance, and both the gradient and the update build the computation graph from the weight subset wsubset that needs updating.

w = theano.shared(numpy.random.rand(50,1).astype(theano.config.floatX), name='w')
wsubset = w[0:part,:]
wrest = w[part:,:]

p_1 = 1 / (1 + T.exp(-T.dot(x[:,0:part], wsubset) - T.dot(x[:,part:], wrest)))  # 1/(1+exp(-w'x))
cost = crossEnt.mean() + 0.01 * (wsubset ** 2).sum()  # regularize only the trained subset
wgrad = T.grad(cost, wsubset)  # gradient w.r.t. wsubset
update = (w, T.inc_subtensor(wsubset, -0.01 * wgrad))

trainFn = theano.function(inputs=[], outputs=[predict], updates=[update])
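
Note that T.inc_subtensor(wsubset, delta) does not return the slice; it returns a copy of the whole w with the slice incremented, which is why the pair (w, ...) satisfies the (shared variable, new value) form that updates requires. An equivalent formulation uses T.set_subtensor, since inc_subtensor(wsubset, delta) is the same as set_subtensor(wsubset, wsubset + delta):

update = (w, T.set_subtensor(wsubset, wsubset - 0.01 * wgrad))  # same effect as the inc_subtensor version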

Experimental results: the last 10 elements of w are identical before and after training, which is exactly what we wanted.

w before train:
[[ 0.34600419  0.67398912  0.09942167  0.65765017  0.44213673  0.06654485
   0.39846805  0.3888059   0.83535087  0.87614214  0.1428479   0.69523871
   0.59748024  0.89421201  0.16198015  0.90665674  0.66680759  0.29132733
   0.97294956  0.34204745  0.28578022  0.005306    0.82625932  0.36869088
   0.61629105  0.58408296  0.54571205  0.83845872  0.38558939  0.66588008
   0.70807606  0.58614755  0.44821101  0.11765263  0.6195485   0.81328052
   0.74707526  0.84718859  0.10713185  0.16338864  0.39414939  0.39094746
   0.97880673  0.35624492  0.13801318  0.93115759  0.97082269  0.14509809
   0.96431786  0.16936433]]
w after train:
[[-0.09856597 -0.3590492   0.13191809 -0.39871272 -0.1845082  -0.20230994
   0.11841918  0.38574994 -1.26536679 -0.1392861  -0.00187506  0.12097881
  -0.14895041 -0.35272926  0.4578245  -0.85317516  0.09256358 -0.19773743
  -0.07583583 -0.21877731 -0.84497571 -0.63426024 -0.44498774  0.03201531
   0.00287166  0.03242523 -0.92445505 -0.12279754 -0.08953576  0.38422242
   0.29207328 -0.12609322  0.27217883  0.21954003  0.18007286  0.0674418
  -0.76156878 -0.10139606  0.04785168 -0.46169016  0.39414939  0.39094746
   0.97880673  0.35624492  0.13801318  0.93115759  0.97082269  0.14509809
   0.96431786  0.16936433]]
y is :
[[ 0.  1.  1.  1.  0.  1.  0.  1.  0.  1.  0.  1.  0.  0.  1.  0.  1.  0.
   0.  0.  1.  0.  1.  0.  1.  1.  0.  0.  1.  0.  0.  1.  1.  0.  0.  1.
   1.  0.  1.  1.  0.  1.  1.  1.  1.  0.  1.  0.  0.  0.  1.  1.  1.  1.
   0.  0.  0.  0.  0.  0.  0.  1.  1.  0.  1.  1.  1.  1.  1.  1.  0.  1.
   1.  0.  0.  0.  0.  0.  0.  0.  0.  1.  1.  0.  1.  0.  0.  0.  0.  1.
   0.  0.  0.  0.  1.  1.  0.  1.  0.  0.]]
predict is:
[[ 0.  0.  1.  1.  1.  0.  0.  1.  0.  1.  1.  1.  0.  1.  1.  0.  1.  0.
   0.  0.  0.  0.  0.  1.  1.  0.  0.  0.  1.  0.  0.  0.  0.  0.  0.  0.
   1.  0.  1.  1.  1.  1.  1.  1.  1.  1.  0.  0.  0.  0.  1.  1.  1.  0.
   1.  1.  0.  0.  1.  0.  0.  0.  1.  0.  1.  1.  1.  0.  1.  1.  0.  1.
   1.  0.  0.  0.  0.  1.  1.  1.  0.  0.  0.  0.  1.  0.  0.  0.  0.  1.
   0.  0.  0.  0.  0.  0.  0.  1.  0.  1.]]

Finally, why does this LR model fit so poorly?... Don't worry, the code is correct; try raising the effective dimensionality of w (the part variable in the code) to 400, which also means enlarging x and w accordingly, and watch the fit improve.
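
A hedged sketch of that suggested change (the 500-feature size is illustrative; anything ≥ part works):

x = theano.shared(numpy.random.rand(100,500).astype(theano.config.floatX), name='x')
w = theano.shared(numpy.random.rand(500,1).astype(theano.config.floatX), name='w')
part = 400  # 400 trainable weights for only 100 samples: enough capacity to fit even random labels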

References:
Theano tutorial: http://deeplearning.net/software/theano/tutorial/faq_tutorial.html
Stack Overflow: http://stackoverflow.com/questions/15917849/how-can-i-assign-update-subset-of-tensor-shared-variable-in-theano
