Theano linear regression exercise

Latest recommended article published on 2018-05-21 15:14:46

程序探索队

Category: python. Tags: theano, deep learning

Article link: https://blog.csdn.net/vins_napoleon/article/details/38057927

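I have recently been learning the Theano toolkit for deep learning. The most exciting thing about this package is automatic differentiation: once you give it a symbolic expression, it generates the derivative for you, and that is one of the features that attracts me most. Most importantly, it can use CPU or GPU acceleration transparently to the user. Using the CPU alone (6 cores) gave me a speedup of at least 10x; my graphics card is too weak, so Theano fell back to the CPU by default. Compared with doing deep learning in MATLAB, I think this library has a lot going for it.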

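A note on the environment: Theano needs g++ as the underlying compiler for its GPU/CPU-accelerated code, so setting everything up by hand is quite troublesome. I use Enthought Canopy with an academic licence, which brings in almost all the dependencies by itself. The Theano version it ships may be fairly old and cause various problems, though. You can install the latest one yourself with pip install theano, which removes many of those problems (Python 2.7).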

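Linear regression is probably the most important first step for using this toolkit and getting a feel for how it debugs and executes. As far as I know there is no such exercise available, so I designed this example myself. It is very simple, yet it is the complete skeleton of a machine learning program. Most of the ideas come from the theano tutorial (link: theano tutorial, as of 2014/7/23).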

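The core of the Theano toolkit is the tensor library. As a symbolic library it has many interesting mechanisms of its own, for example the computation graph, which you need to study if you want to understand how differentiation works. Then there is the great theano.function. Having a derivative only means having a symbolic derivative. To actually compute values and run gradient descent, you need theano.function, the function that connects the ideal to reality.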
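
As a concrete case of what such a symbolic derivative looks like, here is a hand derivation for the per-sample squared error of a line, the cost that the program later in this post minimizes. The symbol $C$ is my own shorthand for that cost. This is a worked check done by hand, not output produced by Theano.

```latex
\begin{aligned}
C(a, b) &= \bigl(y - (a x + b)\bigr)^2 \\
\frac{\partial C}{\partial a} &= -2x \,\bigl(y - (a x + b)\bigr) \\
\frac{\partial C}{\partial b} &= -2 \,\bigl(y - (a x + b)\bigr)
\end{aligned}
```

Gradient descent with learning rate rate then replaces a by a - rate * ∂C/∂a and b by b - rate * ∂C/∂b, which is the shape of the updates list in the program.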

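A few important things are worth introducing here. One is theano.shared(). What makes it powerful is, first, that it returns a tensor-type variable, and second, that the variable it creates behaves like a global variable shared between several functions. To use theano.function well you need a solid understanding of both points. Below is the signature of theano.function: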

function.function(inputs, outputs, mode=None, updates=None, givens=None, no_default_updates=False, accept_inplace=False,

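The targets of updates must be shared variables, and if the data source in givens is accessed by index, it can also only be a shared variable (this is my own conclusion; it works, but it may be wrong). This matters a great deal when the samples used to update the parameters change from call to call.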
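
To make the roles of shared state, updates and givens more tangible, here is a plain-Python analogy. It is not how Theano is implemented: Shared, make_function, cost and new_a are hypothetical names invented for this sketch. It only mimics one behaviour, namely that the output is computed from the old state and the new values are assigned afterwards.

```python
class Shared(object):
    """Hypothetical stand-in for a shared variable: state that outlives a call."""
    def __init__(self, value):
        self.value = value

    def get_value(self):
        return self.value


def make_function(output, updates, givens):
    """Build a callable that takes an index, mimicking inputs=[index].

    output:  callable(env) -> value, where env maps input names to numbers
    updates: list of (Shared, callable(env) -> new value)
    givens:  dict mapping an input name to callable(index) -> value
    """
    def call(index):
        # givens: fill in the inputs with data selected by index
        env = dict((name, pick(index)) for name, pick in givens.items())
        result = output(env)
        # compute every new value from the OLD state, then assign them all
        new_values = [(var, expr(env)) for var, expr in updates]
        for var, value in new_values:
            var.value = value
        return result
    return call


X = [1.0, 2.0, 3.0]
Y = [4.0, 7.0, 10.0]
a = Shared(0.0)
rate = 0.1

def cost(env):
    return (env['y'] - a.value * env['x']) ** 2

def new_a(env):
    grad = -2.0 * env['x'] * (env['y'] - a.value * env['x'])
    return a.value - rate * grad

train = make_function(cost, [(a, new_a)],
                      {'x': lambda i: X[i], 'y': lambda i: Y[i]})
first_cost = train(0)  # cost uses a == 0.0; a is updated only afterwards
```

After the call, first_cost holds the cost for the old value of a, while a.get_value() already returns the stepped value. A training function that returns its cost behaves the same way: the reported error always lags one update behind.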

The full code is below:

# -*- coding: utf-8 -*-
import numpy as np
import theano.tensor as T
import theano
import time

class Linear_Reg(object):
    def __init__(self,x):
        self.a = theano.shared(value = np.zeros((1,),
             dtype=theano.config.floatX),name = 'a')
        self.b = theano.shared(value = np.zeros((1,),
             dtype=theano.config.floatX),name = 'b')
        self.result = self.a * x + self.b
        self.params = [self.a,self.b]
    def msl(self,y):
        return T.mean((y - self.result)**2)

def run():
        rate = 0.01
        data = np.linspace(1,10,10)
        # y = 3 * x + 1; the trailing random term adds some noise, otherwise the regression would be meaningless
        labels = data * 3 + np.ones(data.shape[0],dtype=np.float64) + np.random.rand(data.shape[0])
        print labels
        X = theano.shared(np.asarray(data,
                                         dtype=theano.config.floatX),borrow = True)
        Y = theano.shared(np.asarray(labels,
                                          dtype=theano.config.floatX),borrow = True)

        index = T.lscalar()
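        # T.scalar defaults to floatX, so x and y have the same type as
        # X[index] and Y[index] in givens even when floatX is float32
        x = T.scalar('x')
        y = T.scalar('y')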
        
        reg = Linear_Reg(x = x)
        cost = reg.msl(y)

        
        a_g = T.grad(cost = cost,wrt = reg.a)
        b_g = T.grad(cost = cost, wrt = reg.b)

        updates=[(reg.a,reg.a - rate * a_g),(reg.b,reg.b - rate * b_g)]
        train_model = theano.function(inputs=[index],
                                   outputs = reg.msl(y),
                                    updates = updates,
                                    givens = {
                                        x:X[index],
                                        y:Y[index]
                                       }
                                      )
            
        done = True
        err = 0.0
        count = 0
        last = 0.0
        start_time = time.clock()
        while done:
            err_s = [train_model(i) for i in xrange(data.shape[0])]
            err = np.mean(err_s)
            
            #print err
            count = count + 1
            if count > 10000 or err <0.1:
                done = False
            last = err
        end_time = time.clock()
        print 'Total time is:',end_time -start_time,' s' # 5.12s
        print 'last error :',err
        print 'a value : ',reg.a.get_value() #  [ 2.92394467] 
        print 'b value : ',reg.b.get_value() # [ 1.81334458]
       
run()

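The basic idea is to take one point at a time and do a gradient descent step on it. Strictly speaking the sample points should be drawn at random, but for the convenience of the experiment I take them in order. The large number of iterations is only there to make sure it converges. The main purpose of this program is to serve as an example showing that this kind of framework works. If you handle theano.function in a similar way elsewhere, you can use it to work out where a problem lies. I also posted the code as a template for my own future use, which saves a lot of effort. I have only just started learning Theano, so these are rough opinions, and corrections are welcome!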
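
The remark about drawing samples at random can be sketched without Theano. The function below is a plain-Python illustration under my own assumptions: the name sgd_line_fit, the fixed seed, the noise-free data and the smaller learning rate of 0.005 are all choices made for this sketch, not taken from the program above. It reshuffles the visiting order on every pass and otherwise follows the same one-point-at-a-time scheme.

```python
import random


def sgd_line_fit(xs, ys, rate=0.005, max_epochs=10000, tol=0.1, seed=0):
    """Fit y ~ a*x + b one sample at a time, visiting samples in random order."""
    rng = random.Random(seed)
    a, b = 0.0, 0.0
    order = list(range(len(xs)))
    err = float('inf')
    for epoch in range(max_epochs):
        rng.shuffle(order)  # a new random order on every pass
        total = 0.0
        for i in order:
            residual = ys[i] - (a * xs[i] + b)
            total += residual ** 2  # cost measured before this sample's update
            # gradient step on the squared error of sample i
            a += rate * 2.0 * xs[i] * residual
            b += rate * 2.0 * residual
        err = total / len(xs)
        if err < tol:
            break
    return a, b, err


xs = [float(i) for i in range(1, 11)]
ys = [3.0 * x + 1.0 for x in xs]  # noise-free line, so the target is known
a, b, err = sgd_line_fit(xs, ys, tol=1e-6)
```

With noise-free data and a tight tolerance, the fitted a and b should end up close to 3 and 1. With noisy labels like those in the program above, the mean squared error cannot drop below the noise level, so a looser tolerance such as 0.1 is the practical stopping rule.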