Implementing a neural network for handwritten digit recognition on the MNIST dataset with TensorFlow, step by step

MNIST from scratch

 

This notebook walks through an example of training a TensorFlow model to do digit classification using the MNIST dataset. MNIST is a labeled set of images of handwritten digits.

 

An example follows.

We're going to be building a model that recognizes these digits as 5, 0, and 4.

Imports and input data

 

We'll proceed in steps, beginning with importing and inspecting the MNIST data. This doesn't have anything to do with TensorFlow in particular -- we're just downloading the data archive. The code below can be run directly in Python.

 

import os
from six.moves.urllib.request import urlretrieve

# The URL to download the data archives from.
SOURCE_URL = 'https://storage.googleapis.com/cvdf-datasets/mnist/'
# Our local working directory.
WORK_DIRECTORY = "/tmp/mnist-data"

 

def maybe_download(filename):
    """A helper to download the data files if not present."""
    if not os.path.exists(WORK_DIRECTORY):
        os.mkdir(WORK_DIRECTORY)
    filepath = os.path.join(WORK_DIRECTORY, filename)
    if not os.path.exists(filepath):
        filepath, _ = urlretrieve(SOURCE_URL + filename, filepath)
        statinfo = os.stat(filepath)
        print('Successfully downloaded', filename, statinfo.st_size, 'bytes.')
    else:
        print('Already downloaded', filename)
    return filepath

The function above takes a filename, creates the working directory if it does not exist, and downloads the file via urlretrieve into that directory if it is not already present.

 

train_data_filename = maybe_download('train-images-idx3-ubyte.gz')
train_labels_filename = maybe_download('train-labels-idx1-ubyte.gz')
test_data_filename = maybe_download('t10k-images-idx3-ubyte.gz')
test_labels_filename = maybe_download('t10k-labels-idx1-ubyte.gz')

The four calls above download the four data files into the "/tmp/mnist-data" directory.

 

Working with the images

Now we have the files, but the format requires a bit of pre-processing before we can work with them. The data is gzipped, requiring us to decompress it. And each of the images is grayscale-encoded with values from [0, 255]; we'll normalize these to [-0.5, 0.5].
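As a quick arithmetic sanity check (a hypothetical helper, not part of the original notebook): the rescaling maps 0 to -0.5, 255 to 0.5, and mid-gray values to roughly 0.

def rescale(pixel):
    # Map a [0, 255] grayscale value into [-0.5, 0.5], centered near zero.
    return (pixel - 255 / 2.0) / 255

print(rescale(0), rescale(255), rescale(128))  # -0.5 0.5 0.00196...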

Let's try to unpack the data using the documented format:

[offset] [type]          [value]          [description]
0000     32 bit integer  0x00000803(2051) magic number
0004     32 bit integer  60000            number of images
0008     32 bit integer  28               number of rows
0012     32 bit integer  28               number of columns
0016     unsigned byte   ??               pixel
0017     unsigned byte   ??               pixel
........
xxxx     unsigned byte   ??               pixel

 

Pixels are organized row-wise. Pixel values are 0 to 255. 0 means background (white), 255 means foreground (black).

 

We'll start by reading the first image from the test data as a sanity check. The code below can be run and tested in a Python environment.

 

import gzip, binascii, struct, numpy
import matplotlib.pyplot as plt

with gzip.open(test_data_filename) as f:
    # Print the header fields.
    for field in ['magic number', 'image count', 'rows', 'columns']:
        # struct.unpack reads the binary data provided by f.read.
        # The format string '>i' decodes a big-endian integer, which
        # is the encoding of the data.
        print(field, struct.unpack('>i', f.read(4))[0])

    # Read the first 28x28 set of pixel values.
    # Each pixel is one byte, [0, 255], a uint8.
    # Since we know each image is 28x28, one image is 28 * 28 bytes.
    buf = f.read(28 * 28)
    image = numpy.frombuffer(buf, dtype=numpy.uint8)

    # Print the first few values of image.
    print('First 10 pixels:', image[:10])

 

The first 10 pixels are all 0 values. Not very interesting, but also unsurprising. We'd expect most of the pixel values to be the background color, 0.

We could print all 28 * 28 values, but what we really need to do to make sure we're reading our data properly is look at an image.

 

 

%matplotlib inline

# We'll show the image and its pixel value histogram side-by-side.
_, (ax1, ax2) = plt.subplots(1, 2)

# To interpret the values as a 28x28 image, we need to reshape
# the numpy array, which is one dimensional.
# cmap selects the colormap; the default renders in RGB(A) color space,
# so here we choose a grayscale colormap.
ax1.imshow(image.reshape(28, 28), cmap=plt.cm.Greys);
ax2.hist(image, bins=20, range=[0, 255]);

 

 

The large number of 0 values corresponds to the background of the image, another large mass of values is at 255 (black), and a mix of grayscale transition values lies in between.

Both the image and histogram look sensible. But it's good practice when training image models to normalize values to be centered around 0.

 

We'll do that next. The normalization code is fairly short, and it may be tempting to assume we haven't made mistakes, but we'll double-check by looking at the rendered input and histogram again. Malformed inputs are a surprisingly common source of errors when developing new models.

 

# Let's convert the uint8 image to 32 bit floats and rescale
# the values to be centered around 0, between [-0.5, 0.5].
#
# We again plot the image and histogram to check that we
# haven't mangled the data.
scaled = image.astype(numpy.float32)
scaled = (scaled - (255 / 2.0)) / 255
_, (ax1, ax2) = plt.subplots(1, 2)
ax1.imshow(scaled.reshape(28, 28), cmap=plt.cm.Greys);
ax2.hist(scaled, bins=20, range=[-0.5, 0.5]);

 

The code above converts the uint8 values to 32-bit floats and rescales them into the range [-0.5, 0.5], so that the values are centered around 0.

 

Great -- we've retained the correct image data while properly rescaling to the range [-0.5, 0.5].

 

Reading the labels

 

Let's next unpack the test label data. The format here is similar: a magic number followed by a count, followed by the labels as uint8 values. In more detail:

 

[offset] [type]          [value]          [description]
0000     32 bit integer  0x00000801(2049) magic number (MSB first)
0004     32 bit integer  10000            number of items
0008     unsigned byte   ??               label
0009     unsigned byte   ??               label
........
xxxx     unsigned byte   ??               label

 

As with the image data, let's read the first test set value to sanity check our input path. From the image we rendered earlier, we know the first test image is a 7, so that's the label we expect.

 

with gzip.open(test_labels_filename) as f:
    # Print the header fields: the magic number and the label count.
    for field in ['magic number', 'label count']:
        print(field, struct.unpack('>i', f.read(4))[0])

    # Print the first label.
    print('First label:', struct.unpack('B', f.read(1))[0])

magic number 2049
label count 10000
First label: 7

 

Indeed, the first label of the test set is 7.

Forming the training, testing, and validation data sets

 

Now that we understand how to read a single element, we can read a much larger set that we'll use for training, testing, and validation.

Image data

The code below is a generalization of our prototyping above that reads the entire test and training data sets. Feel free to try it out in Python.

 

IMAGE_SIZE = 28
PIXEL_DEPTH = 255

 

def extract_data(filename, num_images):
    """Extract the images into a 4D tensor [image index, y, x, channels].

    For MNIST data, the number of channels is always 1.
    Values are rescaled from [0, 255] down to [-0.5, 0.5].
    """
    print('Extracting', filename)
    with gzip.open(filename) as bytestream:
        # Skip the magic number and dimensions; we know these values.
        bytestream.read(16)

        buf = bytestream.read(IMAGE_SIZE * IMAGE_SIZE * num_images)
        data = numpy.frombuffer(buf, dtype=numpy.uint8).astype(numpy.float32)
        data = (data - (PIXEL_DEPTH / 2.0)) / PIXEL_DEPTH
        data = data.reshape(num_images, IMAGE_SIZE, IMAGE_SIZE, 1)
        return data

 

train_data = extract_data(train_data_filename, 60000)
test_data = extract_data(test_data_filename, 10000)

The two calls above store the extracted images in train_data and test_data.

 

Extracting /tmp/mnist-data/train-images-idx3-ubyte.gz
Extracting /tmp/mnist-data/t10k-images-idx3-ubyte.gz

 

A crucial difference here is how we reshape the array of pixel values. Instead of one image that's 28x28, we now have a set of 60,000 images, each one being 28x28. We also include a number of channels, which for grayscale images as we have here is 1.

Let's make sure we've got the reshaping parameters right by inspecting the dimensions and the first two images. (Again, mangled input is a very common source of errors.)

 

print('Training data shape', train_data.shape)
_, (ax1, ax2) = plt.subplots(1, 2)
ax1.imshow(train_data[0].reshape(28, 28), cmap=plt.cm.Greys);
ax2.imshow(train_data[1].reshape(28, 28), cmap=plt.cm.Greys);

Training data shape (60000, 28, 28, 1)

 

Looks good. Now we know how to index our full set of training and test images.

 

Label data

 

Let's move on to loading the full set of labels. As is typical in classification problems, we'll convert our input labels into a 1-hot encoding over a length-10 vector corresponding to the 10 digits: exactly one position holds a 1 and the rest are 0. The vector [0, 1, 0, 0, 0, 0, 0, 0, 0, 0], for example, has a 1 in the second position and so corresponds to the digit 1, while [1, 0, 0, 0, 0, 0, 0, 0, 0, 0] has a 1 in the first position and corresponds to the digit 0.
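The conversion below relies on a numpy broadcasting trick; here is a minimal standalone illustration of it (the sample labels are arbitrary):

import numpy
labels = numpy.array([5, 0, 4])
# Comparing a row vector of all classes against a column of labels yields
# a boolean matrix with True exactly where the class matches the label.
one_hot = (numpy.arange(10) == labels[:, None]).astype(numpy.float32)
print(one_hot[0])  # [ 0.  0.  0.  0.  0.  1.  0.  0.  0.  0.]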

 

# The number of labels, i.e. the number of classes.
NUM_LABELS = 10

# The function below unpacks a label file and converts the labels
# into dense 1-hot vectors.
def extract_labels(filename, num_images):
    """Extract the labels into a 1-hot matrix [image index, label index]."""
    print('Extracting', filename)
    with gzip.open(filename) as bytestream:
        # Skip the magic number and count; we know these values.
        bytestream.read(8)
        buf = bytestream.read(1 * num_images)
        labels = numpy.frombuffer(buf, dtype=numpy.uint8)
    # Convert to dense 1-hot representation.
    return (numpy.arange(NUM_LABELS) == labels[:, None]).astype(numpy.float32)

 

train_labels = extract_labels(train_labels_filename, 60000)
test_labels = extract_labels(test_labels_filename, 10000)

 

Extracting /tmp/mnist-data/train-labels-idx1-ubyte.gz
Extracting /tmp/mnist-data/t10k-labels-idx1-ubyte.gz

 

As with our image data, we'll double-check that our 1-hot encoding of the first few values matches our expectations.

 

print('Training labels shape', train_labels.shape)
print('First label vector', train_labels[0])
print('Second label vector', train_labels[1])

Running the three lines above prints:

Training labels shape (60000, 10)
First label vector [ 0.  0.  0.  0.  0.  1.  0.  0.  0.  0.]
Second label vector [ 1.  0.  0.  0.  0.  0.  0.  0.  0.  0.]

 

The 1-hot encoding looks reasonable.

Segmenting data into training, test, and validation

The final step in preparing our data is to split it into three sets: training, test, and validation. This isn't the format of the original data set (which has only training and test splits), so we'll take a small slice of the training data and treat that as our validation set.

 

# The number of validation examples.
VALIDATION_SIZE = 5000

validation_data = train_data[:VALIDATION_SIZE, :, :, :]
validation_labels = train_labels[:VALIDATION_SIZE]
train_data = train_data[VALIDATION_SIZE:, :, :, :]
train_labels = train_labels[VALIDATION_SIZE:]

train_size = train_labels.shape[0]

print('Validation shape', validation_data.shape)
print('Train size', train_size)

Validation shape (5000, 28, 28, 1)
Train size 55000

 

 

Defining the model

Now that we've prepared our data, we're ready to define our model. This is where we really start building a neural network with TensorFlow, so let's focus.

 

The comments describe the architecture, which is fairly typical of models that process image data. The raw input passes through several convolution and max-pooling layers with rectified linear activations before several fully connected layers and a softmax loss for predicting the output class. During training, we use dropout to help prevent overfitting.
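Before looking at the code, it may help to trace the tensor shapes through the network. The trace below is a sketch assuming BATCH_SIZE = 60 and 28x28 single-channel inputs, matching the constants defined later:

# Shape of one training batch as it flows through the model:
# input:          (60, 28, 28, 1)
# conv1 + relu:   (60, 28, 28, 32)   5x5 filters, depth 32, SAME padding
# 2x2 max pool:   (60, 14, 14, 32)
# conv2 + relu:   (60, 14, 14, 64)
# 2x2 max pool:   (60, 7, 7, 64)
# reshape:        (60, 3136)         7 * 7 * 64 = 3136
# fc1 + relu:     (60, 512)
# fc2 (logits):   (60, 10)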

 

 

We'll separate our model definition into three steps:

    Defining the variables that will hold the trainable weights.
    Defining the basic model graph structure described above. And,
    Stamping out several copies of the model graph for training, testing, and validation.

 

We'll start with the variables.

import tensorflow as tf

 

# We'll bundle groups of examples during training for efficiency;
# batching also reduces memory requirements.
# This defines the size of the batch.
BATCH_SIZE = 60
# We have only one channel in our grayscale images.
NUM_CHANNELS = 1
# The random seed that defines initialization.
SEED = 42

 

# This is where training samples and labels are fed to the graph.
# These placeholder nodes will be fed a batch of training data at each
# training step, which we'll write once we define the graph structure.

 

train_data_node = tf.placeholder(tf.float32,
    shape=(BATCH_SIZE, IMAGE_SIZE, IMAGE_SIZE, NUM_CHANNELS))
train_labels_node = tf.placeholder(tf.float32,
    shape=(BATCH_SIZE, NUM_LABELS))

 

# For the validation and test data, we'll just hold the entire dataset in
# one constant node.
validation_data_node = tf.constant(validation_data)
test_data_node = tf.constant(test_data)

 

# The variables below hold all the trainable weights. For each, the
# parameter defines how the variables will be initialized.

conv1_weights = tf.Variable(
    tf.truncated_normal([5, 5, NUM_CHANNELS, 32],  # 5x5 filter, depth 32.
                        stddev=0.1, seed=SEED))
conv1_biases = tf.Variable(tf.zeros([32]))
conv2_weights = tf.Variable(
    tf.truncated_normal([5, 5, 32, 64], stddev=0.1, seed=SEED))
conv2_biases = tf.Variable(tf.constant(0.1, shape=[64]))
fc1_weights = tf.Variable(  # fully connected, depth 512.
    tf.truncated_normal([IMAGE_SIZE // 4 * IMAGE_SIZE // 4 * 64, 512],
                        stddev=0.1, seed=SEED))
fc1_biases = tf.Variable(tf.constant(0.1, shape=[512]))
fc2_weights = tf.Variable(
    tf.truncated_normal([512, NUM_LABELS], stddev=0.1, seed=SEED))
fc2_biases = tf.Variable(tf.constant(0.1, shape=[NUM_LABELS]))

print('Done')

Done

 

Now that we've defined the variables to be trained, we're ready to wire them together into a TensorFlow graph.

We'll define a helper to do this, model, which will return copies of the graph suitable for training and testing. Note the train argument, which controls whether or not dropout is used in the hidden layer. (We want to use dropout only during training, so we pass train=True only then.)

def model(data, train=False):
    """The Model definition."""
    # 2D convolution, with 'SAME' padding (i.e. the output feature map has
    # the same size as the input). Note that {strides} is a 4D array whose
    # shape matches the data layout: [image index, y, x, depth].
    conv = tf.nn.conv2d(data,
                        conv1_weights,
                        strides=[1, 1, 1, 1],
                        padding='SAME')

    # Bias and rectified linear non-linearity. ReLU maps values below 0
    # to 0 (the non-linear part) and leaves values above 0 unchanged
    # (the linear part).
    relu = tf.nn.relu(tf.nn.bias_add(conv, conv1_biases))

    # Max pooling. The kernel size spec ksize also follows the layout of
    # the data. Here we have a pooling window of 2, and a stride of 2.
    pool = tf.nn.max_pool(relu, ksize=[1, 2, 2, 1],
                          strides=[1, 2, 2, 1],
                          padding='SAME')
    conv = tf.nn.conv2d(pool,
                        conv2_weights,
                        strides=[1, 1, 1, 1],
                        padding='SAME')
    relu = tf.nn.relu(tf.nn.bias_add(conv, conv2_biases))
    pool = tf.nn.max_pool(relu,
                          ksize=[1, 2, 2, 1],
                          strides=[1, 2, 2, 1],
                          padding='SAME')

    # Reshape the feature map cuboid into a 2D matrix to feed it to the
    # fully connected layers.
    pool_shape = pool.get_shape().as_list()
    reshape = tf.reshape(pool,
        [pool_shape[0], pool_shape[1] * pool_shape[2] * pool_shape[3]])

    # Fully connected layer. Note that the '+' operation automatically
    # broadcasts the biases.
    hidden = tf.nn.relu(tf.matmul(reshape, fc1_weights) + fc1_biases)

    # Add a 50% dropout during training only. Dropout also scales
    # activations such that no rescaling is needed at evaluation time.
    if train:
        hidden = tf.nn.dropout(hidden, 0.5, seed=SEED)
    return tf.matmul(hidden, fc2_weights) + fc2_biases

print('Done')

Done

 

Having defined the basic structure of the graph, we're ready to stamp out multiple copies for training, testing, and validation.

 

Here, we'll do some customizations depending on which graph we're constructing. train_prediction holds the training graph, for which we use cross-entropy loss and weight regularization. We'll adjust the learning rate during training -- that's handled by the exponential_decay operation, which is itself an argument to the MomentumOptimizer that performs the actual training.

The validation and prediction graphs are much simpler to generate -- we need only create copies of the model with the validation and test inputs and a softmax classifier as the output.

# Training computation: logits + cross-entropy loss.
logits = model(train_data_node, True)
loss = tf.reduce_mean(tf.nn.softmax_cross_entropy_with_logits(
    labels=train_labels_node, logits=logits))

# L2 regularization for the fully connected parameters.
regularizers = (tf.nn.l2_loss(fc1_weights) + tf.nn.l2_loss(fc1_biases) +
                tf.nn.l2_loss(fc2_weights) + tf.nn.l2_loss(fc2_biases))
# Add the regularization term to the loss.
loss += 5e-4 * regularizers

 

# Optimizer: set up a variable that's incremented once per batch and
# controls the learning rate decay.
batch = tf.Variable(0)
# Decay once per epoch, using an exponential schedule starting at 0.01.
learning_rate = tf.train.exponential_decay(
    0.01,                # Base learning rate.
    batch * BATCH_SIZE,  # Current index into the dataset.
    train_size,          # Decay step.
    0.95,                # Decay rate.
    staircase=True)
# Use simple momentum for the optimization.
optimizer = tf.train.MomentumOptimizer(learning_rate,
                                       0.9).minimize(loss, global_step=batch)
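To make the schedule concrete, here is a small hypothetical helper (not part of the notebook) that reproduces what exponential_decay computes with staircase=True: the rate stays at 0.01 for the first epoch, then is multiplied by 0.95 once per completed epoch.

def decayed_learning_rate(step, base_rate=0.01, decay_rate=0.95,
                          batch_size=60, train_size=55000):
    # staircase=True truncates the exponent to whole epochs.
    epochs_completed = (step * batch_size) // train_size
    return base_rate * decay_rate ** epochs_completed

print(decayed_learning_rate(0))     # 0.01   (first epoch)
print(decayed_learning_rate(1000))  # 0.0095 (second epoch)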

 

# Predictions for the minibatch, validation set and test set.
train_prediction = tf.nn.softmax(logits)
# We'll compute the latter two only once in a while by calling their
# {eval()} method.
validation_prediction = tf.nn.softmax(model(validation_data_node))
test_prediction = tf.nn.softmax(model(test_data_node))

print('Done')

Done

 

Training and visualizing results

Now that we have the training, test, and validation graphs, we're ready to actually go through the training loop and periodically evaluate loss and error.

All of these operations take place in the context of a session. In Python, we'd write something like:

 

with tf.Session() as s:
    ...training / test / evaluation loop...

 

But, here, we'll want to keep the session open so we can poke at values as we work out the details of training. The TensorFlow API includes a function for this, InteractiveSession.

 

We'll start by creating a session and initializing the variables we defined above.

# Create a new interactive session that we'll use in
# subsequent code cells.
s = tf.InteractiveSession()

# Use our newly created session as the default for
# subsequent operations.
s.as_default()

# Initialize all the variables we defined above.
tf.global_variables_initializer().run()

 

Now we're ready to perform operations on the graph. Let's start with one round of training. We're going to organize our training steps into batches for efficiency; i.e., training using a small set of examples at each step rather than a single example.

 

BATCH_SIZE = 60

# Grab the first BATCH_SIZE examples and labels.
batch_data = train_data[:BATCH_SIZE, :, :, :]
batch_labels = train_labels[:BATCH_SIZE]

# This dictionary maps the batch data (as a numpy array) to the
# node in the graph it should be fed to.
feed_dict = {train_data_node: batch_data,
             train_labels_node: batch_labels}

 

# Run the graph and fetch some of the nodes.
_, l, lr, predictions = s.run(
    [optimizer, loss, learning_rate, train_prediction],
    feed_dict=feed_dict)

print('Done')

Done

Let's take a look at the predictions. How did we do? Recall that the output will be probabilities over the possible classes, so let's look at those probabilities.

print(predictions[0])

[  2.25393116e-04   4.76219611e-05   1.66867452e-03   5.67827519e-05

   6.03432178e-01   4.34969068e-02   2.19316553e-05   1.41286102e-04

   1.54903100e-05   3.50893795e-01]

As expected without training, the predictions are all noise. Let's write a scoring function that picks the class with the maximum probability and compares it with the example's label. We'll start by converting the probability vectors returned by the softmax into predictions we can match against the labels.

 

# The highest probability in the first entry.
print('First prediction', numpy.argmax(predictions[0]))

# But, predictions is actually a list of BATCH_SIZE probability vectors.
print(predictions.shape)

# So, we'll take the highest probability for each vector.
print('All predictions', numpy.argmax(predictions, 1))

First prediction 4
(60, 10)
All predictions [4 4 2 7 7 7 7 7 7 7 7 7 0 8 9 0 7 7 0 7 4 0 5 0 9 9 7 0 7 4 7 7 7 0 7 7 9
 7 9 9 0 7 7 7 2 7 0 7 2 9 9 9 9 9 0 7 9 4 8 7]

 

Next, we can do the same thing for our labels -- using argmax to convert our 1-hot encoding into a digit class.

print('Batch labels', numpy.argmax(batch_labels, 1))

Batch labels [7 3 4 6 1 8 1 0 9 8 0 3 1 2 7 0 2 9 6 0 1 6 7 1 9 7 6 5 5 8 8 3 4 4 8 7 3
 6 4 6 6 3 8 8 9 9 4 4 0 7 8 1 0 0 1 8 5 7 1 7]

Now we can compare the predicted and label classes to compute the error rate and confusion matrix for this batch.

correct = numpy.sum(numpy.argmax(predictions, 1) == numpy.argmax(batch_labels, 1))
total = predictions.shape[0]

print(float(correct) / float(total))

confusions = numpy.zeros([10, 10], numpy.float32)
bundled = zip(numpy.argmax(predictions, 1), numpy.argmax(batch_labels, 1))
for predicted, actual in bundled:
    confusions[predicted, actual] += 1

plt.grid(False)
plt.xticks(numpy.arange(NUM_LABELS))
plt.yticks(numpy.arange(NUM_LABELS))
plt.imshow(confusions, cmap=plt.cm.jet, interpolation='nearest');

0.06666666666666667

Now let's wrap this up into our scoring function.

 

def error_rate(predictions, labels):
    """Return the error rate and confusions."""
    correct = numpy.sum(numpy.argmax(predictions, 1) == numpy.argmax(labels, 1))
    total = predictions.shape[0]
    error = 100.0 - (100 * float(correct) / float(total))
    confusions = numpy.zeros([10, 10], numpy.float32)
    bundled = zip(numpy.argmax(predictions, 1), numpy.argmax(labels, 1))
    for predicted, actual in bundled:
        confusions[predicted, actual] += 1
    # Note: the return must sit outside the loop so every pair is counted.
    return error, confusions

print('Done')

Done

 

We'll need to train for some time to actually see useful predicted values. Let's define a loop that will go through our data. We'll print the loss and error periodically.

Here, we want to iterate over the entire data set rather than just the first batch, so we'll need to slice the data accordingly.

(One pass through our training set will take some time on a CPU, so be patient if you are executing this notebook.)

 

# Train for one pass (epoch) over the training set.
steps = train_size // BATCH_SIZE
for step in range(steps):
    # Compute the offset of the current minibatch in the data.
    # Note that we could use better randomization across epochs.
    offset = (step * BATCH_SIZE) % (train_size - BATCH_SIZE)
    batch_data = train_data[offset:(offset + BATCH_SIZE), :, :, :]
    batch_labels = train_labels[offset:(offset + BATCH_SIZE)]
    # This dictionary maps the batch data (as a numpy array) to the
    # node in the graph it should be fed to.
    feed_dict = {train_data_node: batch_data,
                 train_labels_node: batch_labels}
    # Run the graph and fetch some of the nodes.
    _, l, lr, predictions = s.run(
        [optimizer, loss, learning_rate, train_prediction],
        feed_dict=feed_dict)
    # Print out the loss periodically.
    if step % 100 == 0:
        error, _ = error_rate(predictions, batch_labels)
        print('Step %d of %d' % (step, steps))
        print('Mini-batch loss: %.5f Error: %.5f Learning rate: %.5f' %
              (l, error, lr))
        print('Validation error: %.1f%%' %
              error_rate(validation_prediction.eval(), validation_labels)[0])

 

Step 0 of 916
Mini-batch loss: 7.71249 Error: 91.66667 Learning rate: 0.01000
Validation error: 88.9%
Step 100 of 916
Mini-batch loss: 3.28715 Error: 8.33333 Learning rate: 0.01000
Validation error: 5.8%
Step 200 of 916
...

 

The error seems to have gone down. Let's evaluate the results using the test set.

To help identify rare mispredictions, we'll include the raw count of each (prediction, label) pair in the confusion matrix.

test_error, confusions = error_rate(test_prediction.eval(), test_labels)
print('Test error: %.1f%%' % test_error)

plt.xlabel('Actual')
plt.ylabel('Predicted')
plt.grid(False)
plt.xticks(numpy.arange(NUM_LABELS))
plt.yticks(numpy.arange(NUM_LABELS))
plt.imshow(confusions, cmap=plt.cm.jet, interpolation='nearest');

 

for i, cas in enumerate(confusions):
    for j, count in enumerate(cas):
        if count > 0:
            xoff = .07 * len(str(count))
            plt.text(j - xoff, i + .2, int(count), fontsize=9, color='white')

Test error: 2.0%

 

We can see here that we're mostly accurate, with some errors you might expect, e.g., '9' is often confused as '4'.

 

Let's do another sanity check to make sure this matches roughly the distribution of our test set, e.g., it seems like we have fewer '5' values.

 

plt.xticks(numpy.arange(NUM_LABELS))
plt.hist(numpy.argmax(test_labels, 1));

 

Indeed, we appear to have fewer 5 labels in the test set. So, on the whole, it seems like our model is learning and our early results are sensible.

But, we've only done one round of training. We can greatly improve accuracy by training for longer. To try this out, just re-execute the training cell above.
