Theano Deep Learning Tutorials Notes: Getting Started

Tutorial URL: http://www.deeplearning.net/tutorial/gettingstarted.html

 

Datasets

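(1) MNIST handwritten digit dataset: each image is a 784-dimensional vector (28*28) of floats with pixel values between 0 and 1, and each image represents a digit from 0 to 9. There are 50,000 images in the training set, 10,000 in the validation set (used to choose hyperparameters such as the learning rate and model size), and 10,000 in the testing set.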

For convenience we pickled the dataset to make it easier to use in python.

import cPickle, gzip, numpy

# Load the dataset
f = gzip.open('mnist.pkl.gz', 'rb')
train_set, valid_set, test_set = cPickle.load(f)
f.close()


Note: the cPickle package is almost identical to the pickle package in functionality and usage; cPickle is written in C and performs much better.
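The loading snippet above targets Python 2. On Python 3 there is no separate cPickle module (pickle uses its C implementation automatically), and pickles written by Python 2 that contain NumPy arrays generally need an explicit encoding when loaded. A minimal sketch, assuming Python 3; the helper name load_pickled_dataset is hypothetical and not part of the tutorial:

```python
import gzip
import pickle


def load_pickled_dataset(path):
    """Load a gzip-compressed pickle (hypothetical helper, not from the tutorial)."""
    with gzip.open(path, 'rb') as f:
        # encoding='latin1' lets Python 3 read pickles written by Python 2.
        return pickle.load(f, encoding='latin1')

# Usage, assuming mnist.pkl.gz is in the working directory:
# train_set, valid_set, test_set = load_pickled_dataset('mnist.pkl.gz')
```

The with block closes the file even if unpickling fails, which the explicit f.close() call does not guarantee.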

 

(2) We encourage you to store the dataset into shared variables and access it based on the minibatch index, given a fixed and known batch size (batch_size = 500 in the code below).

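The reason: when using a GPU, repeatedly copying data to the GPU is inefficient, so use Theano shared variables wherever possible to improve performance. The suggestion is to create six separate shared variables: three for the data (training set, validation set, testing set) and three for the corresponding labels.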

import numpy
import theano
import theano.tensor as T

def shared_dataset(data_xy):
    """Load a (data, labels) pair into Theano shared variables."""
    data_x, data_y = data_xy
    shared_x = theano.shared(numpy.asarray(data_x, dtype=theano.config.floatX))
    shared_y = theano.shared(numpy.asarray(data_y, dtype=theano.config.floatX))
    # Data on the GPU is stored as floats, but the labels y should be ints,
    # so cast y to int on return.
    return shared_x, T.cast(shared_y, 'int32')

test_set_x, test_set_y = shared_dataset(test_set)
valid_set_x, valid_set_y = shared_dataset(valid_set)
train_set_x, train_set_y = shared_dataset(train_set)

batch_size = 500    # size of the minibatch

# accessing the third minibatch of the training set

data  = train_set_x[2 * batch_size: 3 * batch_size]
label = train_set_y[2 * batch_size: 3 * batch_size]

If you run out of memory:

you can store a sufficiently small chunk of your data (several minibatches) in a shared variable and use that during training. Once you got through the chunk, update the values it stores.
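The chunking idea can be sketched as follows. This is an illustration under stated assumptions, not the tutorial's code: it uses only NumPy so it runs without Theano, the function name iter_minibatches_by_chunk and its parameters are hypothetical, and the local variable chunk stands in for the contents of a shared variable. With Theano, the refresh step would typically go through the shared variable's set_value method.

```python
import numpy


def iter_minibatches_by_chunk(data, batch_size, batches_per_chunk):
    """Yield minibatches while holding only one chunk at a time.

    `chunk` stands in for the contents of a shared variable; with Theano it
    would be refreshed via set_value(...) instead of rebinding a local name.
    """
    chunk_size = batch_size * batches_per_chunk
    for start in range(0, len(data), chunk_size):
        # Load the next chunk (several minibatches) into the buffer.
        chunk = numpy.asarray(data[start:start + chunk_size])
        # Walk through the minibatches stored in the current chunk.
        for i in range(0, len(chunk), batch_size):
            yield chunk[i:i + batch_size]


# Toy data: 20 examples with 3 features each.
toy = numpy.arange(60).reshape(20, 3)
batches = list(iter_minibatches_by_chunk(toy, batch_size=5, batches_per_chunk=2))
```

Only one chunk of batch_size * batches_per_chunk examples is resident at a time, so peak memory is bounded by the chunk size rather than by the full dataset.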

Learning a Classifier

Zero-One Loss

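A correctly predicted example has a loss of 0 and an incorrectly predicted one has a loss of 1; the losses are summed over all examples.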

If f: R^D \rightarrow\{0,...,L\} is the prediction function, then this loss can be written as:

\ell_{0,1} = \sum_{i=0}^{|\mathcal{D}|} I_{f(x^{(i)}) \neq y^{(i)}}

where either \mathcal{D} is the training set (during training) or \mathcal{D} \cap \mathcal{D}_{train} = \emptyset (to avoid biasing the evaluation of validation or test error). I is the indicator function defined as:

I_x = \begin{cases} 1 & \text{if } x \text{ is True} \\ 0 & \text{otherwise} \end{cases}
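As a concrete illustration of the zero-one loss, each misclassified example contributes 1 to the sum and each correct one contributes 0. A minimal sketch in NumPy rather than Theano; the function name zero_one_loss and the toy labels are assumptions made for this example:

```python
import numpy


def zero_one_loss(predictions, targets):
    """Count the examples whose predicted label differs from the target."""
    predictions = numpy.asarray(predictions)
    targets = numpy.asarray(targets)
    # (predictions != targets) is the indicator I_{f(x) != y} per example.
    return int(numpy.sum(predictions != targets))


# Toy labels: positions 1 and 3 are misclassified.
y_pred = [3, 1, 4, 1, 5]
y_true = [3, 7, 4, 0, 5]
loss = zero_one_loss(y_pred, y_true)
```

Dividing the result by the number of examples would give the error rate instead of the raw count.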
