Updating a Subset of Weights in Theano

I'm new to Theano. The last section of the official tutorial covers updating only part of a weight matrix (a weight subset): "How to update a subset of weights?". Following that tutorial I wrote my own example, but f = theano.function(…, updates=updates) failed. The error message showed that updates = inc_subtensor(subset, g*lr) produces a single tensor, while the updates argument expects 2-element pairs: "The updates parameter must be an OrderedDict/dict or a list of lists/tuples with 2 elements".
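
A minimal sketch of the failing pattern, assuming the tutorial's setup where subset is a slice of a shared variable, g its gradient, and lr the learning rate:

updates = T.inc_subtensor(subset, g * lr)       # this is a single tensor variable...
f = theano.function([], cost, updates=updates)  # ...so this raises the error quoted above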

After several failed attempts to fix it, I finally found the right way to use inc_subtensor inside updates on the ever-reliable Stack Overflow. I then tried the approach on a Logistic Regression example of my own and verified that it works:

import theano
import theano.tensor as T
import numpy

x = theano.shared(numpy.random.rand(100,50).astype(theano.config.floatX), name='x')
w = theano.shared(numpy.random.rand(50,1).astype(theano.config.floatX), name='w')
y = theano.shared(numpy.random.randint(size=(100,1), low=0, high=2).astype(theano.config.floatX), name='y')

part = 40               # only the first `part` weights get updated
wsubset = w[0:part,:]   # the trainable subset
wrest = w[part:,:]      # the frozen remainder
print("w before train:")
print(w.get_value().T)

p_1 = 1 / (1 + T.exp(-T.dot(x[:,0:part], wsubset) - T.dot(x[:,part:], wrest)))  # 1/(1+exp(-w'x))
predict = p_1 > 0.5

crossEnt = -y * T.log(p_1) - (1-y) * T.log(1-p_1) # cross entropy function
cost = crossEnt.mean() + 0.01 * (wsubset ** 2).sum() # cost function with regularization term
#cost = T.sum((y - 1/(1 + T.exp(-T.dot(x[:,0:part], wsubset) - T.dot(x[:,part:], wrest))))**2)

wgrad = T.grad(cost, wsubset) # gradient w.r.t. wsubset
update = (w, T.inc_subtensor(wsubset, -0.01 * wgrad)) # (shared var, new value) pair: inc_subtensor returns the full w with the slice incremented

trainFn = theano.function(inputs=[], outputs=[predict], updates=[update])
predFn = theano.function(inputs=[], outputs=[predict])  # prediction function

for i in range(5000):
    trainFn()

print "w after train:"
print w.get_value().T
print "y is :"
print y.get_value().T
print "predict is:"
print (predFn()[0] * 1.0).T

In the code above, gradient descent only updates the first 40 elements of the weight vector w while the remaining 10 elements stay fixed, and the bias term b of 1/(1+exp(-(w'x+b))) is omitted. It's a naive example with no practical use, but it works fine as an exercise...
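
For completeness, a hedged sketch of how the omitted bias b could be added back; b is a new shared variable introduced here, not part of the original code, and it gets an ordinary full update alongside the subset update:

b = theano.shared(numpy.zeros((), dtype=theano.config.floatX), name='b')  # hypothetical bias, not in the original code
p_1 = 1 / (1 + T.exp(-T.dot(x[:,0:part], wsubset) - T.dot(x[:,part:], wrest) - b))
bgrad = T.grad(cost, b)  # cost rebuilt from the new p_1, as before
updates = [(w, T.inc_subtensor(wsubset, -0.01 * wgrad)),  # subset update for w
           (b, b - 0.01 * bgrad)]                         # ordinary full update for b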

Key code: the weight matrix has to be split in advance, and both the gradient and the update build the computation graph from the weight subset wsubset that needs updating.

w = theano.shared(numpy.random.rand(50,1).astype(theano.config.floatX), name='w')
wsubset = w[0:part,:]
wrest = w[part:,:]

p_1 = 1 / (1 + T.exp(-T.dot(x[:,0:part], wsubset) - T.dot(x[:,part:], wrest)))  # 1/(1+exp(-w'x))
cost = crossEnt.mean() + 0.01 * (wsubset ** 2).sum()  # regularize only the trained subset
wgrad = T.grad(cost, wsubset)  # gradient w.r.t. wsubset
update = (w, T.inc_subtensor(wsubset, -0.01 * wgrad))

trainFn = theano.function(inputs=[], outputs=[predict], updates=[update])
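
Note that T.inc_subtensor(wsubset, delta) does not return the slice; it returns a copy of the whole w with the slice incremented, which is why the pair (w, ...) satisfies the (shared variable, new value) form that updates requires. An equivalent formulation uses T.set_subtensor, since inc_subtensor(wsubset, delta) is the same as set_subtensor(wsubset, wsubset + delta):

update = (w, T.set_subtensor(wsubset, wsubset - 0.01 * wgrad))  # same effect as the inc_subtensor version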

Experimental results: the last 10 elements of w are identical before and after training, which is exactly what we wanted.

w before train:
[[ 0.34600419  0.67398912  0.09942167  0.65765017  0.44213673  0.06654485
   0.39846805  0.3888059   0.83535087  0.87614214  0.1428479   0.69523871
   0.59748024  0.89421201  0.16198015  0.90665674  0.66680759  0.29132733
   0.97294956  0.34204745  0.28578022  0.005306    0.82625932  0.36869088
   0.61629105  0.58408296  0.54571205  0.83845872  0.38558939  0.66588008
   0.70807606  0.58614755  0.44821101  0.11765263  0.6195485   0.81328052
   0.74707526  0.84718859  0.10713185  0.16338864  0.39414939  0.39094746
   0.97880673  0.35624492  0.13801318  0.93115759  0.97082269  0.14509809
   0.96431786  0.16936433]]
w after train:
[[-0.09856597 -0.3590492   0.13191809 -0.39871272 -0.1845082  -0.20230994
   0.11841918  0.38574994 -1.26536679 -0.1392861  -0.00187506  0.12097881
  -0.14895041 -0.35272926  0.4578245  -0.85317516  0.09256358 -0.19773743
  -0.07583583 -0.21877731 -0.84497571 -0.63426024 -0.44498774  0.03201531
   0.00287166  0.03242523 -0.92445505 -0.12279754 -0.08953576  0.38422242
   0.29207328 -0.12609322  0.27217883  0.21954003  0.18007286  0.0674418
  -0.76156878 -0.10139606  0.04785168 -0.46169016  0.39414939  0.39094746
   0.97880673  0.35624492  0.13801318  0.93115759  0.97082269  0.14509809
   0.96431786  0.16936433]]
y is :
[[ 0.  1.  1.  1.  0.  1.  0.  1.  0.  1.  0.  1.  0.  0.  1.  0.  1.  0.
   0.  0.  1.  0.  1.  0.  1.  1.  0.  0.  1.  0.  0.  1.  1.  0.  0.  1.
   1.  0.  1.  1.  0.  1.  1.  1.  1.  0.  1.  0.  0.  0.  1.  1.  1.  1.
   0.  0.  0.  0.  0.  0.  0.  1.  1.  0.  1.  1.  1.  1.  1.  1.  0.  1.
   1.  0.  0.  0.  0.  0.  0.  0.  0.  1.  1.  0.  1.  0.  0.  0.  0.  1.
   0.  0.  0.  0.  1.  1.  0.  1.  0.  0.]]
predict is:
[[ 0.  0.  1.  1.  1.  0.  0.  1.  0.  1.  1.  1.  0.  1.  1.  0.  1.  0.
   0.  0.  0.  0.  0.  1.  1.  0.  0.  0.  1.  0.  0.  0.  0.  0.  0.  0.
   1.  0.  1.  1.  1.  1.  1.  1.  1.  1.  0.  0.  0.  0.  1.  1.  1.  0.
   1.  1.  0.  0.  1.  0.  0.  0.  1.  0.  1.  1.  1.  0.  1.  1.  0.  1.
   1.  0.  0.  0.  0.  1.  1.  1.  0.  0.  0.  0.  1.  0.  0.  0.  0.  1.
   0.  0.  0.  0.  0.  0.  0.  1.  0.  1.]]

Finally, why does this LR model fit so poorly?... Don't worry, the code is correct; try raising the effective dimensionality of w (the part variable in the code) to 400, which also means enlarging x and w accordingly, and watch the fit improve.
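
A hedged sketch of that suggested change (the 500-feature size is illustrative; anything ≥ part works):

x = theano.shared(numpy.random.rand(100,500).astype(theano.config.floatX), name='x')
w = theano.shared(numpy.random.rand(500,1).astype(theano.config.floatX), name='w')
part = 400  # 400 trainable weights for only 100 samples: enough capacity to fit even random labels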

References:
Theano tutorial: http://deeplearning.net/software/theano/tutorial/faq_tutorial.html
Stack Overflow: http://stackoverflow.com/questions/15917849/how-can-i-assign-update-subset-of-tensor-shared-variable-in-theano
