The MNIST Dataset
MNIST is a handwritten-digit dataset containing 60,000 training samples and 10,000 test samples (download link). The 60,000 training samples are conventionally split into 50,000 actual training samples and 10,000 cross-validation samples. Each sample is a 28*28 black-and-white image; some typical examples are shown below.
For convenience of processing, pixel values are normalized to the range [0, 1] (black: 0, white: 1), each sample is flattened into a 784-dimensional vector, and each sample carries a label (0~9). The following code reads in the dataset:
import cPickle, gzip, numpy

# Load the dataset (Python 2; under Python 3, use
# pickle.load(f, encoding='latin1') instead of cPickle.load)
f = gzip.open('mnist.pkl.gz', 'rb')
# each of the three sets is an (images, labels) pair
train_set, valid_set, test_set = cPickle.load(f)
f.close()
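As a quick sanity check, the shapes of the loaded arrays can be inspected. This is a minimal sketch, assuming mnist.pkl.gz is the version distributed with the deeplearning.net tutorial and that the loading code above has just been run:
# each set is an (images, labels) pair of numpy arrays
train_x, train_y = train_set
print train_x.shape        # (50000, 784): one flattened 28*28 image per row
print train_y.shape        # (50000,): one 0~9 label per sample
print valid_set[0].shape   # (10000, 784)
print test_set[0].shape    # (10000, 784)
print train_x.min(), train_x.max()   # pixel values lie in [0, 1]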
When the data is actually used, it is usually split into many small blocks, or minibatches (used by stochastic gradient descent, discussed later). To avoid fetching the data from CPU memory and copying it to the GPU for every computation (which would be slow), we put the data into shared variables, so that the GPU can read it directly from GPU memory each time it runs a computation. The following code shares the dataset and then extracts one minibatch.
import theano
import theano.tensor as T

def shared_dataset(data_xy):
    """ Function that loads the dataset into shared variables

    The reason we store our dataset in shared variables is to allow
    Theano to copy it into the GPU memory (when code is run on GPU).
    Since copying data into the GPU is slow, copying a minibatch every time
    it is needed (the default behaviour if the data is not in a shared
    variable) would lead to a large decrease in performance.
    """
    data_x, data_y = data_xy
    shared_x = theano.shared(numpy.asarray(data_x, dtype=theano.config.floatX))
    shared_y = theano.shared(numpy.asarray(data_y, dtype=theano.config.floatX))
    # When storing data on the GPU it has to be stored as floats,
    # therefore we will store the labels as ``floatX`` as well
    # (``shared_y`` does exactly that). But during our computations
    # we need them as ints (we use labels as indices, and if they are
    # floats it doesn't make sense), therefore instead of returning
    # ``shared_y`` we will have to cast it to int. This little hack
    # lets us get around this issue.
    return shared_x, T.cast(shared_y, 'int32')

test_set_x, test_set_y = shared_dataset(test_set)
valid_set_x, valid_set_y = shared_dataset(valid_set)
train_set_x, train_set_y = shared_dataset(train_set)

batch_size = 500  # size of the minibatch

# accessing the third minibatch of the training set
data = train_set_x[2 * batch_size: 3 * batch_size]
label = train_set_y[2 * batch_size: 3 * batch_size]
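In practice, the minibatch is usually not sliced by hand as above but selected symbolically inside a compiled Theano function via givens, so the slice is taken where the data lives (on the GPU when one is used). The following is a minimal sketch of that pattern; toy_cost is a placeholder expression standing in for a real model's cost, not something defined here:
index = T.lscalar('index')  # symbolic minibatch index
x = T.matrix('x')           # symbolic minibatch of images
y = T.ivector('y')          # symbolic minibatch of labels (int32, matching the cast above)

# placeholder cost: any Theano expression built from x and y would do
toy_cost = T.mean(x) + 0.0 * T.sum(y)

# ``givens`` substitutes a slice of the shared data for x and y at each call,
# so no host-to-GPU copy of the minibatch is needed
get_cost = theano.function(
    inputs=[index],
    outputs=toy_cost,
    givens={
        x: train_set_x[index * batch_size: (index + 1) * batch_size],
        y: train_set_y[index * batch_size: (index + 1) * batch_size],
    },
)

print get_cost(2)  # evaluates the placeholder cost on the third minibatch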
Common Notation
Training data