MNIST Study Notes

pqpq777

Article link: https://blog.csdn.net/qq_32555995/article/details/79587678

English tutorial: http://deeplearning.net/tutorial/logreg.html#logreg

Main code:

n_train_batches = train_set_x.get_value(borrow=True).shape[0] // batch_size
n_valid_batches = valid_set_x.get_value(borrow=True).shape[0] // batch_size
n_test_batches = test_set_x.get_value(borrow=True).shape[0] // batch_size

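Take n_train_batches as an example. train_set_x.get_value(borrow=True).shape[0] is the number of samples in the training set (shape[0] is the number of rows of the matrix, that is, the number of training samples, while shape[1] is the number of columns, that is, the number of features per sample). batch_size is the number of samples in one mini-batch. Dividing the two with // gives the number of mini-batches trained in one epoch; because // is floor division, any samples left over after the last full mini-batch are not used.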
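The arithmetic can be checked with a small sketch. The sample count and batch size below are hypothetical values chosen only for illustration; they are not taken from the original code.

```python
# Hypothetical sizes, chosen only for illustration.
n_train_samples = 50000
batch_size = 600

# Floor division: only full mini-batches are counted.
n_train_batches = n_train_samples // batch_size

# Samples that do not fit into a full mini-batch are left out of the epoch.
n_leftover = n_train_samples % batch_size

print(n_train_batches)  # 83
print(n_leftover)       # 200
```

If every sample must be used, one option is to pick a batch_size that divides the number of samples evenly.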

index = T.lscalar()
x = T.matrix('x')
y = T.ivector('y')
classifier = LogisticRegression(input=x, n_in=28 * 28, n_out=10)
cost = classifier.negative_log_likelihood(y)

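index is the mini-batch index: it selects which mini-batch of the epoch to use. x is a matrix in which each row holds one image flattened to 28*28=784 pixels, and the number of rows is the number of samples in the mini-batch. y is the label vector. Because it is declared as T.ivector, it holds one integer class label per sample (for example, 0 for the digit 0 and 2 for the digit 2), not a one-hot vector such as [1,0,0,0,0,0,0,0,0,0]. classifier is an instance of LogisticRegression, and cost is its loss function, the negative log-likelihood. Training minimizes cost with mini-batch stochastic gradient descent.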
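To make the role of the integer labels concrete, here is a minimal NumPy sketch of a mean negative log-likelihood. It is an illustration under assumptions, not the tutorial's Theano code: the function name mirrors the method used above, and the probabilities and labels are made up.

```python
import numpy as np

def negative_log_likelihood(p_y_given_x, y):
    """Mean negative log-probability assigned to the correct class.

    p_y_given_x: (n_samples, n_classes) array of class probabilities.
    y: (n_samples,) array of integer class labels.
    """
    rows = np.arange(y.shape[0])
    # Pick, for each row, the probability of that row's true class.
    return -np.mean(np.log(p_y_given_x[rows, y]))

# Made-up example: two samples, three classes; each row sums to 1.
p = np.array([[0.7, 0.2, 0.1],
              [0.1, 0.1, 0.8]])
y = np.array([0, 2], dtype=np.int32)

cost = negative_log_likelihood(p, y)
print(cost)
```

The labels are used directly as column indices, which is why a vector of integers is enough and no one-hot encoding is needed.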

test_model = theano.function(
    inputs=[index],
    outputs=classifier.errors(y),
    givens={
        x: test_set_x[index * batch_size: (index + 1) * batch_size],
        y: test_set_y[index * batch_size: (index + 1) * batch_size]
    }
)

validate_model = theano.function(
    inputs=[index],
    outputs=classifier.errors(y),
    givens={
        x: valid_set_x[index * batch_size: (index + 1) * batch_size],
        y: valid_set_y[index * batch_size: (index + 1) * batch_size]
    }
)

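The input is index, and the output is the return value of the classifier's errors method, which takes y as its argument; the classifier itself takes x as input. The givens keyword replaces the variable before each colon with the expression after it; here x and y are replaced by the index-th batch of the test data (batch_size samples per batch). In words, test_model takes the index of a test batch, feeds that batch's images x and expected labels y to the classifier, and returns the error. Inside errors, theano.tensor.neq(self.y_pred, y) marks each sample with 1 where self.y_pred and y differ and 0 otherwise, and the mean of those values is returned, so the result is the fraction of misclassified samples in the batch. validate_model does the same on the validation set.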
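The batch slicing and the error computation can be mimicked in plain NumPy. This is only a sketch: batch_slice and error_rate are hypothetical helper names, and the labels and predictions are made up.

```python
import numpy as np

def batch_slice(data, index, batch_size):
    """Return the index-th mini-batch, as the givens slices do."""
    return data[index * batch_size:(index + 1) * batch_size]

def error_rate(y_pred, y):
    """Fraction of samples whose prediction differs from the label."""
    return np.mean(y_pred != y)

# Made-up labels and predictions for eight samples.
labels = np.array([3, 1, 4, 1, 5, 9, 2, 6])
preds = np.array([3, 1, 4, 0, 5, 9, 2, 6])
batch_size = 4

# Batch 0 has one wrong prediction out of four, batch 1 has none.
err0 = error_rate(batch_slice(preds, 0, batch_size),
                  batch_slice(labels, 0, batch_size))
err1 = error_rate(batch_slice(preds, 1, batch_size),
                  batch_slice(labels, 1, batch_size))
print(err0, err1)  # 0.25 0.0
```

Averaging the per-batch errors over all batches gives the error on the whole set when the batches have equal size.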

g_W = T.grad(cost=cost, wrt=classifier.W)
g_b = T.grad(cost=cost, wrt=classifier.b)

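These lines compute the gradients used by the learning algorithm. T.grad(cost, wrt) returns the gradient of the scalar cost with respect to wrt, so g_W and g_b are the gradients of the loss with respect to W and b.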
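For reference, when the cost is the mean negative log-likelihood of a softmax classifier, these gradients have a well-known closed form. The notation is introduced here for illustration and does not appear in the original code: N is the number of samples in the mini-batch, X is the N x 784 input matrix, P is the N x 10 matrix of predicted class probabilities, and Y is the one-hot encoding of the labels y. Theano derives the gradient symbolically from the cost expression rather than from this formula.

```latex
\[
\frac{\partial\,\mathrm{cost}}{\partial W} = \frac{1}{N}\, X^{\top} (P - Y),
\qquad
\frac{\partial\,\mathrm{cost}}{\partial b} = \frac{1}{N} \sum_{i=1}^{N} (P_i - Y_i)
\]
```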

updates = [(classifier.W, classifier.W - learning_rate * g_W),
           (classifier.b, classifier.b - learning_rate * g_b)]

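updates is a list of length 2 whose elements are tuples of the form (shared variable, new-value expression). When updates is passed to theano.function, every call of the compiled function replaces the value of the first element of each tuple with the value of the second.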
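The effect of one such update can be sketched in NumPy. This is an illustration under assumptions, not the tutorial's implementation: the gradients are written out by hand for a softmax classifier, the data is random, and the sizes and learning rate are arbitrary.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)  # subtract row max for stability
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def cost_and_grads(W, b, x, y):
    """Mean negative log-likelihood and its gradients w.r.t. W and b."""
    n = x.shape[0]
    p = softmax(x @ W + b)
    cost = -np.mean(np.log(p[np.arange(n), y]))
    d = p.copy()
    d[np.arange(n), y] -= 1.0  # p minus the one-hot encoding of y
    d /= n
    return cost, x.T @ d, d.sum(axis=0)

# Random stand-in data: 20 samples, 5 features, 3 classes.
rng = np.random.default_rng(0)
x = rng.normal(size=(20, 5))
y = rng.integers(0, 3, size=20)
W = np.zeros((5, 3))
b = np.zeros(3)
learning_rate = 0.1

cost_before, g_W, g_b = cost_and_grads(W, b, x, y)

# Same (variable, new value) pairing as the Theano updates list.
updates = [(W, W - learning_rate * g_W),
           (b, b - learning_rate * g_b)]
W, b = [new_value for _, new_value in updates]

cost_after, _, _ = cost_and_grads(W, b, x, y)
print(cost_before, cost_after)
```

With zero-initialized weights the cost starts at log(3) for three classes, and a single small step lowers it.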

train_model = theano.function(
    inputs=[index],
    outputs=cost,
    updates=updates,
    givens={
        x: train_set_x[index * batch_size: (index + 1) * batch_size],
        y: train_set_y[index * batch_size: (index + 1) * batch_size]
    }
)

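train_model is similar to test_model and validate_model, with two differences. It adds the updates parameter, which specifies how W and b are modified each time train_model is called, and its outputs is now cost. Each call therefore performs one gradient-descent step on one mini-batch, and repeated calls drive the loss down.

Explanation of some statements:

1. theano.shared: shared variables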

self.W = theano.shared(
            value=numpy.zeros(
                (n_in, n_out),
                dtype=theano.config.floatX
            ),
            name='W',
            borrow=True
        )

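theano.shared creates a shared variable: its value persists between function calls and can be read and updated by several compiled functions;

numpy.zeros returns a two-dimensional zero matrix of shape (n_in, n_out), with n_in rows and n_out columns;

dtype should be set to theano.config.floatX so that the variable can be used on the GPU;

The name parameter is a string that identifies the variable: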

import numpy, theano
np_array = numpy.zeros(2, dtype='float32')
s_default = theano.shared(np_array, name='s_default')

print('s_default.name:', s_default.name)

Output:

s_default.name: s_default

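The borrow parameter (True/False) controls whether the shared variable may reuse the buffer of the original NumPy array. With borrow=False (the default) the data is copied, so later changes to the original array do not affect the shared variable. With borrow=True the two may share memory, so on the CPU in-place changes to the original array show up in the shared variable; this aliasing should not be relied on, for example when the data lives on the GPU: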

import numpy, theano
np_array = numpy.zeros(2, dtype='float32')

s_default = theano.shared(np_array)
s_false   = theano.shared(np_array, borrow=False)
s_true    = theano.shared(np_array, borrow=True)

np_array += 1

print('s_default:', s_default.get_value())
print('s_false:', s_false.get_value())
print('s_true:', s_true.get_value())

Output:

s_default: [ 0.  0.]
s_false: [ 0.  0.]
s_true: [ 1.  1.]
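The same copy-versus-alias distinction can be reproduced with NumPy alone. This is only an analogy, assuming that borrow=False behaves like an explicit copy and borrow=True on the CPU behaves like a second reference to the same buffer; the names copied and aliased are hypothetical.

```python
import numpy as np

np_array = np.zeros(2, dtype='float32')

copied = np_array.copy()  # independent buffer, like borrow=False
aliased = np_array        # same buffer, like borrow=True on the CPU

np_array += 1             # in-place change to the original array

print('copied:', copied)    # copied: [0. 0.]
print('aliased:', aliased)  # aliased: [1. 1.]
print(np.shares_memory(np_array, copied))   # False
print(np.shares_memory(np_array, aliased))  # True
```

Copying costs memory and time but keeps the shared variable isolated from later changes to the array it was created from.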
