CNN models: a detailed step-by-step study of LeNet-5 code, with links to related material

Updated: 2018-09-20

I only really started working with deep learning in the past month. It has been good; I now see how little I understood before.

Since I have a concrete goal, I will stick with a single framework to work and learn in, to keep things tidy. The code in this post uses Theano and TensorFlow.

Although in practice this could just as well be done in PyTorch.

 

Here I studied two versions of the code and am saving them for reference; the underlying theory is left for self-study:

https://blog.csdn.net/hjimce/article/details/47323463

https://blog.csdn.net/tina_ttl/article/details/51034849

https://blog.csdn.net/enchanted_zhouh/article/details/76855108

Version 1 (Theano):

"""This tutorial introduces the LeNet5 neural network architecture
using Theano.  LeNet5 is a convolutional neural network, good for
classifying images. This tutorial shows how to build the architecture,
and comes with all the hyper-parameters you need to reproduce the
paper's MNIST results.


This implementation simplifies the model in the following ways:

 - LeNetConvPool doesn't implement location-specific gain and bias parameters
 - LeNetConvPool doesn't implement pooling by average, it implements pooling
   by max.
 - Digit classification is implemented with a logistic regression rather than
   an RBF network
 - the original LeNet5 did not use fully-connected convolutions at the second
   layer (this implementation does)

References:
 - Y. LeCun, L. Bottou, Y. Bengio and P. Haffner:
   Gradient-Based Learning Applied to Document
   Recognition, Proceedings of the IEEE, 86(11):2278-2324, November 1998.
   http://yann.lecun.com/exdb/publis/pdf/lecun-98.pdf

"""


'''
Porting notes (changes needed for newer Theano/Python versions):
1. `from theano.tensor.signal import downsample` becomes
   `from theano.tensor.signal.pool import pool_2d`;
2. `pooled_out = downsample.max_pool_2d()` becomes `pooled_out = pool_2d()`;
   call it as pool_2d(input=conv_out, ws=self.poolsize, ignore_border=True);
   the keyword is `ws`, not `ds`, otherwise Theano emits a warning.
3. xrange() becomes range() under Python 3, and its argument must be an
   integer, hence range(int(...)).
4. The logistic_sgd and mlp modules cannot be imported, so the functions and
   classes they provide are inlined below.
'''
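
'''For reference, the API change in notes 1 and 2 looks like this (a minimal
sketch; pool_2d is imported for real just below):

    import theano.tensor as T
    from theano.tensor.signal.pool import pool_2d
    conv_out = T.tensor4('conv_out')   # stand-in for a convolution output
    # old style (older Theano): from theano.tensor.signal import downsample
    # pooled = downsample.max_pool_2d(input=conv_out, ds=(2, 2), ignore_border=True)
    pooled = pool_2d(input=conv_out, ws=(2, 2), ignore_border=True)
'''
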
import os
import sys
import timeit
import numpy
import theano
import theano.tensor as T
#from theano.tensor.signal import downsample
import pickle
import gzip
from theano.tensor.signal.pool import pool_2d
from theano.tensor.nnet import conv

#from logistic_sgd import LogisticRegression, load_data  # this import fails, so both are inlined below
#from mlp import HiddenLayer  # likewise inlined below

'''def load_data():
    with gzip.open('./mnist.pkl.gz') as fp:
        training_data, valid_data, test_data = pickle.load(fp,encoding="bytes")
    return training_data, valid_data, test_data
training_data, valid_data, test_data = load_data()'''

class LogisticRegression(object):
    """Multi-class Logistic Regression Class

    The logistic regression is fully described by a weight matrix :math:`W`
    and bias vector :math:`b`. Classification is done by projecting data
    points onto a set of hyperplanes, the distance to which is used to
    determine a class membership probability.
    """

    def __init__(self, input, n_in, n_out):
        """ Initialize the parameters of the logistic regression

        :type input: theano.tensor.TensorType
        :param input: symbolic variable that describes the input of the
                      architecture (one minibatch)

        :type n_in: int
        :param n_in: number of input units, the dimension of the space in
                     which the datapoints lie

        :type n_out: int
        :param n_out: number of output units, the dimension of the space in
                      which the labels lie

        """
        # start-snippet-1
        # initialize with 0 the weights W as a matrix of shape (n_in, n_out)
        self.W = theano.shared(
            value=numpy.zeros(
                (n_in, n_out),
                dtype=theano.config.floatX
            ),
            name='W',
            borrow=True
        )
        # initialize the biases b as a vector of n_out 0s
        self.b = theano.shared(
            value=numpy.zeros(
                (n_out,),
                dtype=theano.config.floatX
            ),
            name='b',
            borrow=True
        )

        # symbolic expression for computing the matrix of class-membership
        # probabilities
        # Where:
        # W is a matrix where column-k represent the separation hyperplane for
        # class-k
        # x is a matrix where row-j  represents input training sample-j
        # b is a vector where element-k represent the free parameter of
        # hyperplane-k
        self.p_y_given_x = T.nnet.softmax(T.dot(input, self.W) + self.b)

        # symbolic description of how to compute prediction as class whose
        # probability is maximal
        self.y_pred = T.argmax(self.p_y_given_x, axis=1)
        # end-snippet-1

        # parameters of the model
        self.params = [self.W, self.b]

        # keep track of model input
        self.input = input

    def negative_log_likelihood(self, y):
        """Return the mean of the negative log-likelihood of the prediction
        of this model under a given target distribution.

        .. math::

            \ell(\theta=\{W,b\}, \mathcal{D}) =
            -\frac{1}{|\mathcal{D}|} \sum_{i=0}^{|\mathcal{D}|-1}
                \log(P(Y=y^{(i)}|x^{(i)}, W, b))

        :type y: theano.tensor.TensorType
        :param y: corresponds to a vector that gives for each example the
                  correct label

        Note: we use the mean instead of the sum so that
              the learning rate is less dependent on the batch size
        """
        # start-snippet-2
        # y.shape[0] is (symbolically) the number of rows in y, i.e.,
        # number of examples (call it n) in the minibatch
        # T.arange(y.shape[0]) is a symbolic vector which will contain
        # [0,1,2,... n-1] T.log(self.p_y_given_x) is a matrix of
        # Log-Probabilities (call it LP) with one row per example and
        # one column per class LP[T.arange(y.shape[0]),y] is a vector
        # v containing [LP[0,y[0]], LP[1,y[1]], LP[2,y[2]], ...,
        # LP[n-1,y[n-1]]] and T.mean(LP[T.arange(y.shape[0]),y]) is
        # the mean (across minibatch examples) of the elements in v,
        # i.e., the mean log-likelihood across the minibatch.
        return -T.mean(T.log(self.p_y_given_x)[T.arange(y.shape[0]), y])
        # end-snippet-2
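
    '''The indexing trick above, replayed in plain numpy (a hedged sketch with
    a made-up 3-sample, 4-class probability matrix):

        import numpy as np
        LP = np.log(np.array([[0.1, 0.2, 0.3, 0.4],
                              [0.7, 0.1, 0.1, 0.1],
                              [0.25, 0.25, 0.25, 0.25]]))
        y = np.array([3, 0, 1])              # correct class of each example
        v = LP[np.arange(y.shape[0]), y]     # [LP[0,3], LP[1,0], LP[2,1]]
        nll = -v.mean()                      # what this method returns
    '''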

    def errors(self, y):
        """Return a float representing the number of errors in the minibatch
        over the total number of examples of the minibatch ; zero one
        loss over the size of the minibatch

        :type y: theano.tensor.TensorType
        :param y: corresponds to a vector that gives for each example the
                  correct label
        """

        # check if y has same dimension of y_pred
        if y.ndim != self.y_pred.ndim:
            raise TypeError(
                'y should have the same shape as self.y_pred',
                ('y', y.type, 'y_pred', self.y_pred.type)
            )
        # check if y is of the correct datatype
        if y.dtype.startswith('int'):
            # the T.neq operator returns a vector of 0s and 1s, where 1
            # represents a mistake in prediction
            return T.mean(T.neq(self.y_pred, y))
        else:
            raise NotImplementedError()


def load_data(dataset):
    ''' Loads the dataset

    :type dataset: string
    :param dataset: the path to the dataset (here MNIST)
    '''

    #############
    # LOAD DATA #
    #############

    # Download the MNIST dataset if it is not present
    data_dir, data_file = os.path.split(dataset)
    if data_dir == "" and not os.path.isfile(dataset):
        # Check if dataset is in the data directory.
        new_path = os.path.join(
            os.path.split(__file__)[0],
            "..",
            "data",
            dataset
        )
        if os.path.isfile(new_path) or data_file == 'mnist.pkl.gz':
            dataset = new_path

    if (not os.path.isfile(dataset)) and data_file == 'mnist.pkl.gz':
        from six.moves import urllib
        origin = (
            'http://www.iro.umontreal.ca/~lisa/deep/data/mnist/mnist.pkl.gz'
        )
        print('Downloading data from %s' % origin)
        urllib.request.urlretrieve(origin, dataset)

    print('... loading data')

    # Load the dataset
    with gzip.open(dataset, 'rb') as f:
        try:
            train_set, valid_set, test_set = pickle.load(f, encoding='latin1')
        except:
            train_set, valid_set, test_set = pickle.load(f)

    # train_set, valid_set, test_set format: tuple(input, target)
    # input is a numpy.ndarray of 2 dimensions (a matrix)
    # where each row corresponds to an example. target is a
    # numpy.ndarray of 1 dimension (vector) that has the same length as
    # the number of rows in the input. It should give the target
    # to the example with the same index in the input.

    def shared_dataset(data_xy, borrow=True):
        """ Function that loads the dataset into shared variables

        The reason we store our dataset in shared variables is to allow
        Theano to copy it into the GPU memory (when code is run on GPU).
        Since copying data into the GPU is slow, copying a minibatch every
        time it is needed (the default behaviour if the data is not in a
        shared variable) would lead to a large decrease in performance.
        """
        data_x, data_y = data_xy
        shared_x = theano.shared(numpy.asarray(data_x,
                                               dtype=theano.config.floatX),
                                 borrow=borrow)
        shared_y = theano.shared(numpy.asarray(data_y,
                                               dtype=theano.config.floatX),
                                 borrow=borrow)
        # When storing data on the GPU it has to be stored as floats
        # therefore we will store the labels as ``floatX`` as well
        # (``shared_y`` does exactly that). But during our computations
        # we need them as ints (we use labels as index, and if they are
        # floats it doesn't make sense) therefore instead of returning
        # ``shared_y`` we will have to cast it to int. This little hack
        # lets us get around this issue
        return shared_x, T.cast(shared_y, 'int32')

    test_set_x, test_set_y = shared_dataset(test_set)
    valid_set_x, valid_set_y = shared_dataset(valid_set)
    train_set_x, train_set_y = shared_dataset(train_set)

    rval = [(train_set_x, train_set_y), (valid_set_x, valid_set_y),
            (test_set_x, test_set_y)]
    return rval
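
'''Typical usage of load_data (a hedged sketch; it assumes mnist.pkl.gz is
present or can be downloaded):

    datasets = load_data('mnist.pkl.gz')
    train_set_x, train_set_y = datasets[0]
    print(train_set_x.get_value(borrow=True).shape)   # (50000, 784)
    print(train_set_y.dtype)                          # 'int32': labels are cast for indexing
'''
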
class HiddenLayer(object):
    def __init__(self, rng, input, n_in, n_out, W=None, b=None,
                 activation=T.tanh):
        """
        Typical hidden layer of a MLP: units are fully-connected and have
        sigmoidal activation function. Weight matrix W is of shape (n_in,n_out)
        and the bias vector b is of shape (n_out,).

        NOTE : The nonlinearity used here is tanh

        Hidden unit activation is given by: tanh(dot(input,W) + b)

        :type rng: numpy.random.RandomState
        :param rng: a random number generator used to initialize weights

        :type input: theano.tensor.dmatrix
        :param input: a symbolic tensor of shape (n_examples, n_in)

        :type n_in: int
        :param n_in: dimensionality of input

        :type n_out: int
        :param n_out: number of hidden units

        :type activation: theano.Op or function
        :param activation: Non linearity to be applied in the hidden
                           layer
        """
        self.input = input
        # end-snippet-1

        # `W` is initialized with `W_values` which is uniformly sampled
        # from -sqrt(6./(n_in+n_hidden)) to sqrt(6./(n_in+n_hidden))
        # for tanh activation function
        # the output of uniform is converted using asarray to dtype
        # theano.config.floatX so that the code is runnable on GPU
        # Note : optimal initialization of weights is dependent on the
        #        activation function used (among other things).
        #        For example, results presented in [Xavier10] suggest that you
        #        should use 4 times larger initial weights for sigmoid
        #        compared to tanh
        #        We have no info for other function, so we use the same as
        #        tanh.
        if W is None:
            W_values = numpy.asarray(
                rng.uniform(
                    low=-numpy.sqrt(6. / (n_in + n_out)),
                    high=numpy.sqrt(6. / (n_in + n_out)),
                    size=(n_in, n_out)
                ),
                dtype=theano.config.floatX
            )
            if activation == theano.tensor.nnet.sigmoid:
                W_values *= 4

            W = theano.shared(value=W_values, name='W', borrow=True)

        if b is None:
            b_values = numpy.zeros((n_out,), dtype=theano.config.floatX)
            b = theano.shared(value=b_values, name='b', borrow=True)

        self.W = W
        self.b = b

        lin_output = T.dot(input, self.W) + self.b
        self.output = (
            lin_output if activation is None
            else activation(lin_output)
        )
        # parameters of the model
        self.params = [self.W, self.b]
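
'''The tanh initialization above, in isolation (a hedged numpy sketch of the
[Xavier10] heuristic): sample W uniformly from
[-sqrt(6/(n_in+n_out)), +sqrt(6/(n_in+n_out))].

    import numpy as np
    rng = np.random.RandomState(1234)
    n_in, n_out = 800, 500                    # the sizes used by layer2 below
    bound = np.sqrt(6. / (n_in + n_out))
    W_values = np.asarray(rng.uniform(low=-bound, high=bound, size=(n_in, n_out)),
                          dtype='float32')
    # W_values *= 4                           # suggested above for sigmoid activations
'''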

# One layer of a convolutional neural network: convolution plus downsampling.
# The computation order is: convolution -> pooling -> activation function.

class LeNetConvPoolLayer(object):
    # image_shape describes the input data; filter_shape describes this layer's filters
    def __init__(self, rng, input, filter_shape, image_shape, poolsize=(2, 2)):
        """
        1.type rng: numpy.random.RandomState
        :param rng: a random number generator used to initialize weights
        3、input: 输入特征图数据,也就是n幅特征图片 shape是[batch, height, width, channels]
        4、参数 filter_shape: (number of filters, num input feature maps,
                              filter height, filter width)
        filter_shape是列表,因此我们可以用filter_shape[0]获取卷积核个数
        num of filters:是卷积核的个数,有多少个卷积核,那么本层的out feature maps的个数
        也将生成多少个。num input feature maps:输入特征图的个数。
        然后接着filter height, filter width是卷积核的宽高,比如5*5,9*9……
        5、参数 image_shape: (batch size, num input feature maps,
                             image height, image width),
         batch size:批量训练样本个数 ,num input feature maps:输入特征图的个数
         image height, image width分别是输入的feature map图片的大小。
         image_shape是一个列表类型,所以可以直接用索引,访问上面的4个参数,索引下标从
         0~3。比如image_shape[2]=image_heigth  image_shape[3]=num input feature maps
        6、参数 poolsize: 池化下采样的的块大小,一般为(2,2)
        """

        assert image_shape[1] == filter_shape[1]  # the number of input feature maps must agree between the two shapes
        self.input = input

        # fan_in = num input feature maps * filter height * filter width
        # numpy.prod(x) computes the product of the elements of x,
        # so fan_in is the number of incoming connections feeding each
        # output feature map, e.g. for C3: 6*(5*5)
        fan_in = numpy.prod(filter_shape[1:])
        # fan_out = num output feature maps * filter height * filter width / pooling size
        fan_out = (filter_shape[0] * numpy.prod(filter_shape[2:]) /
                   numpy.prod(poolsize))
        # Initialize the weights uniformly in [-a, a], with a = sqrt(6./(fan_in + fan_out)).
        # How many weights are needed? number of filters * num input feature maps *
        # filter height * filter width; this count ignores the pooling stage,
        # which has no weights of its own.
        '''
        1. theano.shared(value, name=..., borrow=...) behaves like a global
           variable whose value is shared across compiled functions;
           get_value()/set_value() read and write it. With borrow=True, changes
           to the underlying data are visible to the original variable (no
           defensive copy). On numpy.asarray():
           https://blog.csdn.net/mango_badnot/article/details/53637368
        2. Both numpy.array and numpy.asarray convert nested sequences (lists
           of lists, tuples of tuples, ...) into an ndarray, but when the
           source is already an ndarray, array still copies it into new memory
           while asarray does not. Here W gets shape filter_shape: one 3-D
           block of weights per filter.
        3. rng = numpy.random.RandomState(23455) seeds a pseudo-random number
           generator; the same seed always produces the same sequence.
           rng.uniform(0, 1, (2, 3)) then draws a 2-row, 3-column array whose
           elements are uniformly distributed random numbers in [0, 1].'''
        W_bound = numpy.sqrt(6. / (fan_in + fan_out))
        self.W = theano.shared(
            numpy.asarray(
                rng.uniform(low=-W_bound, high=W_bound, size=filter_shape),
                dtype=theano.config.floatX  # use the precision given by theano.config.floatX
            ),
            borrow=True
        )

        # b is the bias, a 1-D vector: each output feature map i gets one bias b[i],
        # so the number of biases initialized below is filter_shape[0], the number of feature maps
        b_values = numpy.zeros((filter_shape[0],), dtype=theano.config.floatX)
        self.b = theano.shared(value=b_values, borrow=True)

        # Convolution: the first argument of conv.conv2d is the input feature
        # maps, the second the randomly initialized filters; filter_shape and
        # image_shape describe the layouts of the filters and the input.
        '''For conv2d, the first-layer input is typically
        (number of images, RGB channels, image height, image width), and the
        weights have shape (number of filters, num input feature maps,
        filter height, filter width).
        '''

        ## Question (note to self): why does this call take so many parameters?
        conv_out = conv.conv2d(
            input=input,
            filters=self.W,
            filter_shape=filter_shape,
            image_shape=image_shape
        )
        '''filter_shape: (number of filters, num input feature maps,filter height, filter width)
            image_shape: (batch size, num input feature maps,image height, image width),'''

        # Pooling, here max pooling; padding is only usable with ignore_border=True.
        # https://www.cnblogs.com/qw12/p/6231110.html  The mode argument selects the
        # pooling type (default 'max'). This replaces the old downsample.max_pool_2d.
        pooled_out =pool_2d(
            input=conv_out,
            ws=poolsize,
            ignore_border=True
        )
        # Activation: convolve, then pool, then apply the nonlinear mapping.
        # add the bias term. Since the bias is a vector (1D array), we first
        # reshape it to a tensor of shape (1, n_filters, 1, 1). Each bias will
        # thus be broadcasted across mini-batches and feature map
        # width & height
        '''dimshuffle reorders the dimensions of its input and returns a view of
        the original variable: 'x' inserts a broadcastable dimension of size 1,
        and 0 refers to b's original (only) axis.
        https://www.cnblogs.com/ZJUT-jiangnan/p/6023589.html'''
        self.output = T.tanh(pooled_out + self.b.dimshuffle('x', 0, 'x', 'x'))

        # store the parameters of this layer
        self.params = [self.W, self.b]
        self.input = input
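
'''What b.dimshuffle('x', 0, 'x', 'x') achieves, mimicked in numpy (a hedged
sketch): reshape the length-n_filters bias to (1, n_filters, 1, 1) so it
broadcasts across the batch and the spatial dimensions.

    import numpy as np
    pooled = np.zeros((500, 20, 12, 12))   # (batch, n_filters, height, width)
    b = np.arange(20, dtype=float)         # one bias per output feature map
    out = pooled + b.reshape(1, -1, 1, 1)  # out[i, k, :, :] == pooled[i, k] + b[k]
'''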


# Demo / evaluation function
''' Demonstrates LeNet on the MNIST dataset
    :learning_rate: learning rate used for gradient descent
    :n_epochs: maximal number of epochs to run
    :type dataset: string
    :param dataset: path to the dataset used for training/testing (MNIST here)
    :nkerns: number of kernels in each convolutional layer: nkerns[0]=20 for
    the first layer and nkerns[1]=50 for the second
    '''
def evaluate_lenet5(learning_rate=0.1, n_epochs=200,
                    dataset='mnist.pkl.gz',
                    nkerns=[20, 50], batch_size=500):
    rng = numpy.random.RandomState(23455)
## Load the data
    datasets = load_data(dataset)  # loads train_set, valid_set, test_set

    train_set_x, train_set_y = datasets[0]  # training data
    valid_set_x, valid_set_y = datasets[1]  # validation data
    test_set_x, test_set_y = datasets[2]  # test data

    # Work out how many minibatches each dataset splits into.
    '''For example, with 1050 samples and batch_size=100 (the minibatch size),
    training first takes samples [0,100), then [100,200), and so on. In general
    the leftover 50 samples would form an extra batch, but here the int()
    truncation below simply drops the incomplete final batch.'''
    n_train_batches = train_set_x.get_value(borrow=True).shape[0]  # number of training samples
    n_valid_batches = valid_set_x.get_value(borrow=True).shape[0]
    n_test_batches = test_set_x.get_value(borrow=True).shape[0]
    n_train_batches /= batch_size  # number of batches (a float under Python 3, hence the int() casts later)
    n_valid_batches /= batch_size
    n_test_batches /= batch_size


## Build the model
    # allocate symbolic variables for the data
    '''floats because GPUs generally work in float32, so double is rarely used.
    Common Theano types:
    scalars: iscalar (int), fscalar (float)
    1-D vectors: ivector (int), fvector (float)
    2-D matrices: fmatrix (float), imatrix (int)
    3-D float tensor: ftensor3; 4-D float tensor: ftensor4'''
    index = T.lscalar()  # index to a [mini]batch (a 0-dim int64 scalar)

    # start-snippet-1
    x = T.matrix('x')  # the data is presented as rasterized images
    y = T.ivector('y')  # the labels are presented as 1D vector of
    # [int] labels


    # Reshape matrix of rasterized images of shape (batch_size, 28 * 28)
    # to a 4D tensor, compatible with our LeNetConvPoolLayer
    # (28, 28) is the size of MNIST images.
    layer0_input = x.reshape((batch_size, 1, 28, 28))

    '''First layer:
    image_shape: 28*28 input feature maps, batch_size training samples, each with 1 feature map
    filter_shape: nkerns[0]=20 kernels, so each training sample yields 20 feature maps here
    after convolution the image size becomes (28-5+1, 28-5+1) = (24, 24)
    after pooling (stride = pool size) it becomes (24/2, 24/2) = (12, 12)
    so the output shape of this layer is (batch_size, nkerns[0], 12, 12)'''
    layer0 = LeNetConvPoolLayer(
        rng,
        input=layer0_input,
        image_shape=(batch_size, 1, 28, 28),
        filter_shape=(nkerns[0], 1, 5, 5),
        poolsize=(2, 2)
    )

    '''Second layer: takes the batch_size samples from the first layer; each
    now has nkerns[0] feature maps of size 12*12.
    After convolution the size becomes (12-5+1, 12-5+1) = (8, 8);
    after pooling it becomes (8/2, 8/2) = (4, 4).
    The output shape of this layer is (batch_size, nkerns[1], 4, 4).'''
    layer1 = LeNetConvPoolLayer(
        rng,
        input=layer0.output,
        image_shape=(batch_size, nkerns[0], 12, 12),
        filter_shape=(nkerns[1], nkerns[0], 5, 5),
        poolsize=(2, 2)
    )
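
    '''A quick sanity check of the size arithmetic above (a hedged sketch; the
    formula assumes a "valid" convolution with stride 1 followed by
    non-overlapping pooling, which is what this network uses):

        def conv_pool_out(size, filt=5, pool=2):
            return (size - filt + 1) // pool

        conv_pool_out(28)   # layer0: 28 -> 24 -> 12
        conv_pool_out(12)   # layer1: 12 -> 8 -> 4
    '''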

    # the HiddenLayer being fully-connected, it operates on 2D matrices of
    # shape (batch_size, num_pixels) (i.e matrix of rasterized images).
    # This will generate a matrix of shape (batch_size, nkerns[1] * 4 * 4),
    # or (500, 50 * 4 * 4) = (500, 800) with the default values (500 samples per batch).
    '''Theano's .flatten(2) keeps the first dimension and collapses the
    remaining ones (row-major), so each row holds one sample's rasterized feature maps.'''
    layer2_input = layer1.output.flatten(2)

    '''Fully connected layer: layer2_input is a 2-D matrix whose first dimension
    indexes samples and whose second dimension holds the features each sample
    accumulated through convolution and pooling. HiddenLayer is a single-layer
    network; layer2 below maps the 800 units down to 500.'''
    layer2 = HiddenLayer(
        rng,
        input=layer2_input,
        n_in=nkerns[1] * 4 * 4,
        n_out=500,
        activation=T.tanh
    )

    # Final layer: logistic regression maps the 500 units to 10 outputs, one per digit 0~9
    layer3 = LogisticRegression(input=layer2.output, n_in=500, n_out=10)

    # the cost we minimize during training is the NLL of the model
    cost = layer3.negative_log_likelihood(y)

    # create a function to compute the mistakes that are made by the model;
    # the givens slice the dataset into ranges such as [0,500), [500,1000), ...
    test_model = theano.function(
        [index],
        layer3.errors(y),
        givens={
            x: test_set_x[index * batch_size: (index + 1) * batch_size],
            y: test_set_y[index * batch_size: (index + 1) * batch_size],
        }
    )

    validate_model = theano.function(
        [index],
        layer3.errors(y),
        givens={
            x: valid_set_x[index * batch_size: (index + 1) * batch_size],
            y: valid_set_y[index * batch_size: (index + 1) * batch_size]
        }
    )

    # gather all parameters in one list (lists concatenate with +); each layer computes WX+b
    params = layer3.params + layer2.params + layer1.params + layer0.params

    # gradients of the cost with respect to every parameter
    grads = T.grad(cost, params)

    # train_model is a function that updates the model parameters by
    # SGD Since this model has many parameters, it would be tedious to
    # manually create an update rule for each model parameter. We thus
    # create the updates list by automatically looping over all
    # (params[i], grads[i]) pairs.
    '''SGD, stochastic gradient descent: https://blog.csdn.net/u010248552/article/details/79764340
    https://blog.csdn.net/tsyccnh/article/details/76064087
    For a linear model y_i = a*x_i + b, classic gradient descent computes the
    loss over every sample; with two parameters a and b we take the partial
    derivatives ∇a and ∇b (the gradients of the loss along a and b) and
    update, e.g. a = a - α∇a.
    '''
    updates = [
        (param_i, param_i - learning_rate * grad_i)
        for param_i, grad_i in zip(params, grads)
    ]
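
    '''A one-parameter illustration of the update rule built above (a hedged
    sketch in plain numpy style, with a made-up quadratic loss w**2):

        w, lr = 5.0, 0.1
        for _ in range(3):
            grad = 2 * w          # d(w**2)/dw
            w = w - lr * grad     # same form as (param_i - learning_rate * grad_i)
    '''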

    train_model = theano.function(
        [index],
        cost,
        updates=updates,
        givens={
            x: train_set_x[index * batch_size: (index + 1) * batch_size],
            y: train_set_y[index * batch_size: (index + 1) * batch_size]
        }
    )
    # end-snippet-1

    ###############
    # TRAIN MODEL #
    ###############
    print('... training')
    # early-stopping parameters
    patience = 10000  # look at this many examples regardless
    patience_increase = 2  # wait this much longer when a new best is found
    improvement_threshold = 0.995
    # a relative improvement of this much is considered significant
    validation_frequency = min(n_train_batches, patience / 2)
    ''' go through this many minibatches before checking the network
       on the validation set; in this case we check every epoch'''
    best_validation_loss = numpy.inf  # positive infinity
    # best_iter = 0
    # test_score = 0.
    start_time = timeit.default_timer()  # high-resolution timer for short code segments

    epoch = 0
    done_looping = False
    while (epoch < n_epochs) and (not done_looping):
        epoch = epoch + 1
        # loop over the minibatch indices
        for minibatch_index in range(int(n_train_batches)):
            cost_ij = train_model(minibatch_index)
            iter = (epoch - 1) * n_train_batches + minibatch_index
            if (iter + 1) % validation_frequency == 0:

                # compute zero-one loss on validation set
                validation_losses = [validate_model(i) for i
                                     in range(int(n_valid_batches))]
                this_validation_loss = numpy.mean(validation_losses)
                print('epoch %i, minibatch %i/%i, validation error %f %%' %
                      (epoch, minibatch_index + 1, n_train_batches,
                       this_validation_loss * 100.))

                # if we got the best validation score until now
                if this_validation_loss < best_validation_loss:

                    # improve patience if loss improvement is good enough
                    if this_validation_loss < best_validation_loss * \
                            improvement_threshold:
                        patience = max(patience, iter * patience_increase)

                    # save best validation score and iteration number
                    best_validation_loss = this_validation_loss
                    best_iter = iter

                    # test it on the test set
                    test_losses = [
                        test_model(i)
                        for i in range(int(n_test_batches))
                    ]
                    test_score = numpy.mean(test_losses)
                    print(('     epoch %i, minibatch %i/%i, test error of '
                           'best model %f %%') %
                          (epoch, minibatch_index + 1, n_train_batches,
                           test_score * 100.))

            if patience <= iter:
                done_looping = True
                break

    end_time = timeit.default_timer()
    print('Optimization complete.')
    print('Best validation score of %f %% obtained at iteration %i, '
          'with test performance %f %%' %
          (best_validation_loss * 100., best_iter + 1, test_score * 100.))
    print(('The code for file ' +
           os.path.split(__file__)[1] +
           ' ran for %.2fm' % ((end_time - start_time) / 60.)), file=sys.stderr)
    # In Python 2 this redirection was written `print >> sys.stderr, ...`;
    # in Python 3 the file= keyword argument of print() does the same job.
if __name__ == '__main__':
    evaluate_lenet5()


def experiment(state, channel):
    evaluate_lenet5(state.learning_rate, dataset=state.dataset)

## Version 2 (TensorFlow)

from skimage import io,transform
import os
import glob
import numpy as np
import tensorflow as tf


#resize every image to 32*32 with a single channel
w = 32
h = 32
c = 1

#where the MNIST training and test images are stored
train_path = "E:/data/datasets/mnist/train/"
test_path = "E:/data/datasets/mnist/test/"

#Read the images and their labels.
'''os.listdir() returns the names of the entries in a directory as a list;
os.path.isdir() tests whether a path is a directory; enumerate() pairs each
element of an iterable (list, tuple, string, ...) with its index.'''
def read_image(path):
    label_dir = [path+x for x in os.listdir(path) if os.path.isdir(path+x)]
    images = []
    labels = []
    for index,folder in enumerate(label_dir):
        for img in glob.glob(folder+'/*.png'):
            print("reading the image:%s"%img)
            image = io.imread(img)
            image = transform.resize(image,(w,h,c))
            images.append(image)
            labels.append(index)
    return np.asarray(images,dtype=np.float32),np.asarray(labels,dtype=np.int32)
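
'''read_image assumes one subdirectory per class, with the label taken from
the enumeration index of the folder (a hedged note; os.listdir() order is not
guaranteed, so sorting label_dir would make the folder-to-label mapping stable):

    E:/data/datasets/mnist/train/0/*.png   -> label 0
    E:/data/datasets/mnist/train/1/*.png   -> label 1
    ...
    E:/data/datasets/mnist/train/9/*.png   -> label 9
'''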

#read the training and test data
train_data,train_label = read_image(train_path)
test_data,test_label = read_image(test_path)

#Shuffle the training and test data. np.arange() returns evenly spaced values
#over a fixed-step range.
train_image_num = len(train_data)
train_image_index = np.arange(train_image_num) ## start 0, stop train_image_num, step 1; a 1-D array
np.random.shuffle(train_image_index)
train_data = train_data[train_image_index]
train_label = train_label[train_image_index]

test_image_num = len(test_data)
test_image_index = np.arange(test_image_num)
np.random.shuffle(test_image_index)
test_data = test_data[test_image_index]
test_label = test_label[test_image_index]
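
'''The shuffle pattern above keeps images and labels aligned because both
arrays are indexed by the same permutation (a hedged numpy sketch):

    import numpy as np
    data = np.array([10, 20, 30, 40])
    label = np.array([0, 1, 2, 3])
    idx = np.arange(len(data))
    np.random.shuffle(idx)
    data, label = data[idx], label[idx]   # still pairwise aligned
'''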

#Build the CNN. Placeholders act like formal parameters: they define the graph
#inputs (here named x and y_) and are fed concrete values at run time.
x = tf.placeholder(tf.float32,[None,w,h,c],name='x')
y_ = tf.placeholder(tf.int32,[None],name='y_')

def inference(input_tensor,train,regularizer):

    #Layer 1: convolution, 5×5 filters, depth 6, no zero padding, stride 1.
    #Size: 32×32×1 -> 28×28×6
    '''Weight initialization: tf.truncated_normal_initializer() (short forms:
    tf.TruncatedNormal(), tf.RandomNormal(); drop the _initializer suffix and
    capitalize) draws from a truncated normal distribution, by default
    mean=0.0, stddev=1.0; it is probably the most common initializer in TF code.
    http://www.mamicode.com/info-detail-1835147.html'''
    with tf.variable_scope('layer1-conv1'):
        conv1_weights = tf.get_variable('weight',[5,5,c,6],initializer=tf.truncated_normal_initializer(stddev=0.1))
        conv1_biases = tf.get_variable('bias',[6],initializer=tf.constant_initializer(0.0))
        conv1 = tf.nn.conv2d(input_tensor,conv1_weights,strides=[1,1,1,1],padding='VALID')
        relu1 = tf.nn.relu(tf.nn.bias_add(conv1,conv1_biases))

    #Layer 2: max pooling, 2×2 window, zero padding (SAME), stride 2.
    #Size: 28×28×6 -> 14×14×6
    with tf.name_scope('layer2-pool1'):
        pool1 = tf.nn.max_pool(relu1,ksize=[1,2,2,1],strides=[1,2,2,1],padding='SAME')

    #Layer 3: convolution, 5×5 filters, depth 16, no zero padding, stride 1.
    #Size: 14×14×6 -> 10×10×16
    with tf.variable_scope('layer3-conv2'):
        conv2_weights = tf.get_variable('weight',[5,5,6,16],initializer=tf.truncated_normal_initializer(stddev=0.1))
        conv2_biases = tf.get_variable('bias',[16],initializer=tf.constant_initializer(0.0))
        conv2 = tf.nn.conv2d(pool1,conv2_weights,strides=[1,1,1,1],padding='VALID')
        relu2 = tf.nn.relu(tf.nn.bias_add(conv2,conv2_biases))

    #Layer 4: max pooling, 2×2 window, zero padding (SAME), stride 2.
    #Size: 10×10×16 -> 5×5×16
    with tf.variable_scope('layer4-pool2'):
        pool2 = tf.nn.max_pool(relu2,ksize=[1,2,2,1],strides=[1,2,2,1],padding='SAME')

    #Reshape the layer-4 pooling output into the input format of the fully
    #connected layer 5: each image's 5×5×16 output matrix is flattened into a
    #vector of length 5×5×16 = 400. E.g. with 64 images per batch, the pooling
    #output of size (64,5,5,16) is reshaped to (64,400).
    pool_shape = pool2.get_shape().as_list()
    nodes = pool_shape[1]*pool_shape[2]*pool_shape[3]
    reshaped = tf.reshape(pool2,[-1,nodes])

    #Layer 5: fully connected, nodes = 5×5×16 = 400, mapping 400 -> 120.
    #Size: with a batch of 64 samples, 64×400 -> 64×120.
    #During training, dropout randomly sets some activations to 0, which helps
    #avoid overfitting, in the same spirit as "simpler models overfit less" and
    #as regularization, which limits weight magnitudes so the model cannot fit
    #arbitrary noise in the training data.
    #This script builds the graph with train=False (training and testing share
    #one graph here), so dropout is effectively disabled; feel free to try it.
    '''tf.matmul() is matrix/tensor multiplication, not element-wise
    multiplication (that is tf.multiply()).
    tf.nn.dropout(x, keep_prob) is TensorFlow's overfitting countermeasure,
    usually applied to fully connected layers. keep_prob is the probability
    that each element is kept; dropped elements become 0 and kept ones are
    scaled to 1/keep_prob of their original value.
    keep_prob is often a placeholder (keep_prob = tf.placeholder(tf.float32))
    given a concrete value at run time, e.g. 0.5 while training; dropout only
    takes effect during training.
    keep_prob: A scalar Tensor with the same type as x. The probability that each element is kept.'''
    with tf.variable_scope('layer5-fc1'):
        fc1_weights = tf.get_variable('weight',[nodes,120],initializer=tf.truncated_normal_initializer(stddev=0.1))
        if regularizer is not None:
            tf.add_to_collection('losses',regularizer(fc1_weights))
        fc1_biases = tf.get_variable('bias',[120],initializer=tf.constant_initializer(0.1))
        fc1 = tf.nn.relu(tf.matmul(reshaped,fc1_weights) + fc1_biases)
        if train:
            fc1 = tf.nn.dropout(fc1,0.5)

    #Layer 6: fully connected, 120 -> 84.
    #Size: with a batch of 64 samples, 64×120 -> 64×84.
    '''tf.add_to_collection adds a variable to a named collection (a list of variables);
    tf.get_collection retrieves all the variables in a collection, as a list;
    tf.add_n sums all the entries of a list element-wise.'''
    with tf.variable_scope('layer6-fc2'):
        fc2_weights = tf.get_variable('weight',[120,84],initializer=tf.truncated_normal_initializer(stddev=0.1))
        if regularizer is not None:
            tf.add_to_collection('losses',regularizer(fc2_weights))
        fc2_biases = tf.get_variable('bias',[84],initializer=tf.truncated_normal_initializer(stddev=0.1))
        fc2 = tf.nn.relu(tf.matmul(fc1,fc2_weights) + fc2_biases)
        if train:
            fc2 = tf.nn.dropout(fc2,0.5)

    #Layer 7: fully connected (approximating the original RBF output layer), 84 -> 10.
    #Size: with a batch of 64 samples, 64×84 -> 64×10. The 64×10 matrix, after
    #softmax, gives each image's probability of being each digit, i.e. the
    #final classification result.
    with tf.variable_scope('layer7-fc3'):
        fc3_weights = tf.get_variable('weight',[84,10],initializer=tf.truncated_normal_initializer(stddev=0.1))
        if regularizer is not None:
            tf.add_to_collection('losses',regularizer(fc3_weights))
        fc3_biases = tf.get_variable('bias',[10],initializer=tf.truncated_normal_initializer(stddev=0.1))
        logit = tf.matmul(fc2,fc3_weights) + fc3_biases
    return logit
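
'''A quick check of the size arithmetic in inference (a hedged sketch; VALID
5×5 convolutions with stride 1, SAME 2×2 pooling with stride 2):

    def valid_conv(size, filt=5): return size - filt + 1
    def pool(size): return size // 2

    s = pool(valid_conv(32))   # 32 -> 28 -> 14
    s = pool(valid_conv(s))    # 14 -> 10 -> 5
    nodes = s * s * 16         # 5*5*16 = 400, the fc1 input size
'''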

#Regularization, cross-entropy, mean cross-entropy, loss, loss minimization,
#and comparison of predictions against labels: tf.equal yields True or False
#per sample; accuracy casts those booleans to float (True -> 1., False -> 0.)
#and averages them, giving the fraction of correct predictions in a batch.
#E.g. for 5 samples, a tf.equal result of [True False True False False]
#becomes [1. 0. 1. 0. 0.], so the accuracy is 2./5 = 40%.
'''Regularization helps prevent overfitting and improves generalization: it
keeps the model from perfectly matching every training example by adding
constraints (rules) to the objective so the weights cannot grow without bound.
TensorFlow divides the L2 regularization loss by 2 so its derivative is cleaner.
See tf.contrib.layers.apply_regularization/l1_regularizer/l2_regularizer/sum_regularizer
https://blog.csdn.net/liushui94/article/details/73481112
sparse_softmax_cross_entropy_with_logits() computes softmax and cross-entropy in one step
https://blog.csdn.net/ZJRN1027/article/details/80199248'''
regularizer = tf.contrib.layers.l2_regularizer(0.001)
y = inference(x,False,regularizer)
cross_entropy = tf.nn.sparse_softmax_cross_entropy_with_logits(logits=y,labels=y_)
cross_entropy_mean = tf.reduce_mean(cross_entropy)
loss = cross_entropy_mean + tf.add_n(tf.get_collection('losses'))
train_op = tf.train.AdamOptimizer(0.001).minimize(loss)
correct_prediction = tf.equal(tf.cast(tf.argmax(y,1),tf.int32),y_)
accuracy = tf.reduce_mean(tf.cast(correct_prediction,tf.float32))
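
'''The accuracy computation above, replayed in plain numpy (a hedged sketch
matching the 5-sample example in the comment):

    import numpy as np
    pred = np.array([3, 1, 4, 1, 5])
    labels = np.array([3, 0, 4, 0, 0])
    acc = np.mean((pred == labels).astype(np.float32))   # 2./5 = 0.4
'''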

#yield batch_size samples at a time for training or testing
def get_batch(data,label,batch_size):
    for start_index in range(0,len(data)-batch_size+1,batch_size):
        slice_index = slice(start_index,start_index+batch_size)
        yield data[slice_index],label[slice_index]
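
'''get_batch silently drops the tail samples that do not fill a complete
batch, e.g. with 10 samples and batch_size=4 it yields slices [0:4) and [4:8)
only (a hedged illustration):

    for d, l in get_batch(list(range(10)), list(range(10)), 4):
        print(d, l)   # [0,1,2,3] then [4,5,6,7]; samples 8 and 9 are skipped
'''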

#create the Session
with tf.Session() as sess:
    #initialize all variables (weights, biases, ...)
    sess.run(tf.global_variables_initializer())

    #Train over all samples 10 times, each pass working through them in
    #minibatches of 64. train_num can be set higher.
    train_num = 10
    batch_size = 64


    for i in range(train_num):

        train_loss,train_acc,batch_num = 0, 0, 0
        for train_data_batch,train_label_batch in get_batch(train_data,train_label,batch_size):
            _,err,acc = sess.run([train_op,loss,accuracy],feed_dict={x:train_data_batch,y_:train_label_batch})
            train_loss += err
            train_acc += acc
            batch_num += 1
        print("train loss:",train_loss/batch_num)
        print("train acc:",train_acc/batch_num)

        test_loss,test_acc,batch_num = 0, 0, 0
        for test_data_batch,test_label_batch in get_batch(test_data,test_label,batch_size):
            err,acc = sess.run([loss,accuracy],feed_dict={x:test_data_batch,y_:test_label_batch})
            test_loss += err
            test_acc += acc
            batch_num += 1
        print("test loss:",test_loss/batch_num)
        print("test acc:",test_acc/batch_num)

 
