Implementing a More Advanced Convolutional Network with TensorFlow

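This section uses the CIFAR-10 dataset, which contains 60,000 32x32 color images: 50,000 training images and 10,000 test images. There are 10 class labels: airplane, automobile, bird, cat, deer, dog, frog, horse, ship and truck. The first job is to get the dataset ready, which took me a little while.
CIFAR-10 dataset page: http://www.cs.toronto.edu/~kriz/cifar.html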
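For this project we need the third download on that page, "CIFAR-10 binary version" (162 MB), which stores the data in binary form.
I did not download the dataset from code the way the example does; I simply downloaded it from the website, which was quick.
I had used this dataset before to run a VGG network. Back then I downloaded the Python version (the first download) and thought I could save effort by just renaming the files to .bin after reading the code. That did not work: the loss became nan and training blew up. Please do not make the same mistake. Then came the next error.
import cifar10 works together with the later call cifar10.maybe_download_and_extract() to download the dataset. Since the data is already downloaded, comment both out. Next, import cifar10_input fails because the module is missing. It reads the data and performs data augmentation, so it cannot simply be deleted; we have to find it. I found it online: create a new .py file named cifar10_input.py, put it in any folder, and mark that folder as a sources root so the import succeeds. The code of cifar10_input follows.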

#
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import os

from six.moves import xrange  # pylint: disable=redefined-builtin
import tensorflow as tf

# Process images of this size. Note that this differs from the original CIFAR
# image size of 32 x 32. If one alters this number, then the entire model
# architecture will change and any model would need to be retrained.
IMAGE_SIZE = 24

# Global constants describing the CIFAR-10 data set.
NUM_CLASSES = 10
NUM_EXAMPLES_PER_EPOCH_FOR_TRAIN = 50000
NUM_EXAMPLES_PER_EPOCH_FOR_EVAL = 10000


def read_cifar10(filename_queue):
    """Reads and parses examples from CIFAR10 data files.
       Reads CIFAR10 binary data from filename_queue and builds example records.

    Recommendation: if you want N-way read parallelism, call this function
    N times.  This will give you N independent Readers reading different
    files & positions within those files, which will give better mixing of
    examples.
    Args:
      filename_queue: A queue of strings with the filenames to read from.
    Returns:
      An object representing a single example, with the following fields:
        height: number of rows in the result (32)
        width: number of columns in the result (32)
        depth: number of color channels in the result (3)
        key: a scalar string Tensor describing the filename & record number
          for this example.
        label: an int32 Tensor with the label in the range 0..9.
        uint8image: a [height, width, depth] uint8 Tensor with the image data
    """

    class CIFAR10Record(object):
        pass

    result = CIFAR10Record()

    # Dimensions of the images in the CIFAR-10 dataset.
    # See http://www.cs.toronto.edu/~kriz/cifar.html for a description of the
    # input format.
    # The CIFAR-10 dataset has 60,000 images of size 32 x 32 in 10 classes,
    # 6,000 per class: 50,000 for training and 10,000 for testing.
    # It is split into 5 training batches (data_batch_1.bin ~ data_batch_5.bin)
    # and 1 test batch (test_batch.bin). Images within each batch are in
    # random order.
    # Each .bin file is laid out as:
    #
    # <1 x label><3072 x pixel>
    # ...
    # <1 x label><3072 x pixel>
    #
    # There are 10,000 records of 3073 bytes each. The first byte is the label;
    # the remaining 3072 bytes are the R, G and B channels, 1024 (= 32 * 32)
    # bytes per channel.
    # Records are not separated by any delimiter, so each .bin file is exactly
    # 30,730,000 bytes long.
    label_bytes = 1  # 2 for CIFAR-100
    result.height = 32
    result.width = 32
    result.depth = 3
    image_bytes = result.height * result.width * result.depth
    # Every record consists of a label followed by the image, with a
    # fixed number of bytes for each (3073 = 1 + 3072).
    record_bytes = label_bytes + image_bytes

    # Read a record, getting filenames from the filename_queue.  No
    # header or footer in the CIFAR-10 format, so we leave header_bytes
    # and footer_bytes at their default of 0.
    # tf.FixedLengthRecordReader reads fixed-length records and is used
    # together with tf.decode_raw.
    reader = tf.FixedLengthRecordReader(record_bytes=record_bytes)
    result.key, value = reader.read(filename_queue)

    # Convert from a string to a vector of uint8 that is record_bytes long.
    record_bytes = tf.decode_raw(value, tf.uint8)

    # The first bytes represent the label, which we convert from uint8->int32.
    # tf.strided_slice(record_bytes, begin, end) extracts the first
    # label_bytes bytes as the label.
    result.label = tf.cast(
        tf.strided_slice(record_bytes, [0], [label_bytes]), tf.int32)

    # The remaining bytes after the label represent the image, which we reshape
    # from [depth * height * width] to [depth, height, width]: the
    # image_bytes = 3072 bytes after the label become a 3 x 32 x 32 tensor.
    depth_major = tf.reshape(
        tf.strided_slice(record_bytes, [label_bytes],
                         [label_bytes + image_bytes]),
        [result.depth, result.height, result.width])
    # Convert from [depth, height, width] to [height, width, depth],
    # i.e. 32 x 32 x 3.
    result.uint8image = tf.transpose(depth_major, [1, 2, 0])

    return result


def _generate_image_and_label_batch(image, label, min_queue_examples,
                                    batch_size, shuffle):
    """Construct a queued batch of images and labels.
       Builds batches of batch_size examples.
    Args:
      image: 3-D Tensor of [height, width, 3] of type.float32.
      label: 1-D Tensor of type.int32
      min_queue_examples: int32, minimum number of samples to retain
        in the queue that provides batches of examples.
      batch_size: Number of images per batch.
      shuffle: boolean indicating whether to use a shuffling queue.
        Shuffling randomizes the order of the examples; it is normally
        used during training to improve robustness.

    Returns:
      images: Images. 4D tensor of [batch_size, height, width, 3] size.
      labels: Labels. 1D tensor of [batch_size] size.
    """
    # Create a queue that shuffles the examples, and then
    # read 'batch_size' images + labels from the example queue.
    num_preprocess_threads = 16
    if shuffle:
        # With shuffle=True, each dequeue picks examples at random rather than
        # in order, which scrambles the original ordering.
        # Shuffling only mixes well together with min_after_dequeue: after
        # each dequeue, the queue runner threads keep at least
        # min_after_dequeue examples in the queue.
        # If min_after_dequeue is too small, mixing is poor even when
        # shuffle is True.
        images, label_batch = tf.train.shuffle_batch(
            [image, label],
            batch_size=batch_size,
            num_threads=num_preprocess_threads,
            capacity=min_queue_examples + 3 * batch_size,
            min_after_dequeue=min_queue_examples)
    else:
        # With shuffle=False, examples are dequeued in order (first in,
        # first out).
        images, label_batch = tf.train.batch(
            [image, label],
            batch_size=batch_size,
            num_threads=num_preprocess_threads,
            capacity=min_queue_examples + 3 * batch_size)

    # Display the training images in the visualizer.
    tf.summary.image('images', images)

    return images, tf.reshape(label_batch, [batch_size])


"""
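  Raw images go through some preprocessing before they are fed to the model
  for training or evaluation.
  The raw images are 32 x 32 pixels. The two main preprocessing steps are:
  1. Crop to 24 x 24 pixels: a random crop for the training set, a central
     crop for the test set.
  2. Standardize each image to zero mean and unit variance.
  To enlarge the effective sample size, the training set also gets:
  1. A random left-right flip.
  2. A random change of brightness.
  3. A random change of contrast.
  4. Finally, whitening (per-image standardization).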
"""


def distorted_inputs(data_dir, batch_size):
    """Construct distorted input for CIFAR training using the Reader ops.
       Preprocesses examples with the Reader ops to build CIFAR training data.

    Args:
      data_dir: Path to the CIFAR-10 data directory.
      batch_size: Number of images per batch.
    Returns:
      images: Images. 4D tensor of [batch_size, IMAGE_SIZE, IMAGE_SIZE, 3] size.
      labels: Labels. 1D tensor of [batch_size] size.
    """
    filenames = [os.path.join(data_dir, 'data_batch_%d.bin' % i)
                 for i in xrange(1, 6)]
    for f in filenames:
        if not tf.gfile.Exists(f):
            raise ValueError('Failed to find file: ' + f)

    # Create a queue that produces the filenames to read.
    filename_queue = tf.train.string_input_producer(filenames)

    # Read examples from files in the filename queue.
    read_input = read_cifar10(filename_queue)
    reshaped_image = tf.cast(read_input.uint8image, tf.float32)

    height = IMAGE_SIZE
    width = IMAGE_SIZE

    # Image processing for training the network. Note the many random
    # distortions applied to the image.

    # Randomly crop a [height, width] section of the image.
    distorted_image = tf.random_crop(reshaped_image, [height, width, 3])

    # Randomly flip the image horizontally.
    distorted_image = tf.image.random_flip_left_right(distorted_image)

    # Because these operations are not commutative, consider randomizing
    # the order of their operation.
    # Randomly change the brightness of the image.
    distorted_image = tf.image.random_brightness(distorted_image,
                                                 max_delta=63)
    # Randomly change the contrast of the image.
    distorted_image = tf.image.random_contrast(distorted_image,
                                               lower=0.2, upper=1.8)

    # Whitening: subtract off the mean and divide by the standard deviation of
    # the pixels, which reduces the effect of brightness and lighting
    # differences between images.
    float_image = tf.image.per_image_standardization(distorted_image)

    # Set the shapes of tensors.
    float_image.set_shape([height, width, 3])
    read_input.label.set_shape([1])

    # Ensure that the random shuffling has good mixing properties.
    min_fraction_of_examples_in_queue = 0.4
    min_queue_examples = int(NUM_EXAMPLES_PER_EPOCH_FOR_TRAIN *
                             min_fraction_of_examples_in_queue)
    print('Filling queue with %d CIFAR images before starting to train. '
          'This will take a few minutes.' % min_queue_examples)

    # Generate a batch of images and labels by building up a queue of examples.
    return _generate_image_and_label_batch(float_image, read_input.label,
                                           min_queue_examples, batch_size,
                                           shuffle=True)


def inputs(eval_data, data_dir, batch_size):
    """Construct input for CIFAR evaluation using the Reader ops.
       Preprocesses examples with the Reader ops to build CIFAR evaluation data.
    Args:
      eval_data: bool, indicating if one should use the train or eval data set.
      data_dir: Path to the CIFAR-10 data directory.
      batch_size: Number of images per batch.
    Returns:
      images: Images. 4D tensor of [batch_size, IMAGE_SIZE, IMAGE_SIZE, 3] size.
      labels: Labels. 1D tensor of [batch_size] size.
    """
    if not eval_data:
        filenames = [os.path.join(data_dir, 'data_batch_%d.bin' % i)
                     for i in xrange(1, 6)]
        num_examples_per_epoch = NUM_EXAMPLES_PER_EPOCH_FOR_TRAIN
    else:
        filenames = [os.path.join(data_dir, 'test_batch.bin')]
        num_examples_per_epoch = NUM_EXAMPLES_PER_EPOCH_FOR_EVAL

    for f in filenames:
        if not tf.gfile.Exists(f):
            raise ValueError('Failed to find file: ' + f)

    # Create a queue that produces the filenames to read.
    filename_queue = tf.train.string_input_producer(filenames)

    # Read examples from files in the filename queue.
    read_input = read_cifar10(filename_queue)
    reshaped_image = tf.cast(read_input.uint8image, tf.float32)

    height = IMAGE_SIZE
    width = IMAGE_SIZE

    # Image processing for evaluation.
    # Crop the central [height, width] of the image.
    resized_image = tf.image.resize_image_with_crop_or_pad(reshaped_image,
                                                           height, width)

    # Subtract off the mean and divide by the standard deviation of the pixels.
    float_image = tf.image.per_image_standardization(resized_image)

    # Set the shapes of tensors.
    float_image.set_shape([height, width, 3])
    read_input.label.set_shape([1])

    # Ensure that the random shuffling has good mixing properties.
    min_fraction_of_examples_in_queue = 0.4
    min_queue_examples = int(num_examples_per_epoch *
                             min_fraction_of_examples_in_queue)

    # Generate a batch of images and labels by building up a queue of examples.
    return _generate_image_and_label_batch(float_image, read_input.label,
                                           min_queue_examples, batch_size,
                                           shuffle=False)

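Whew, that is a bit long; just copy it. The code in this file is fairly simple. I can read it, but I could not write it from scratch myself, so I plan to take the slow route and spend time memorizing it.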
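To make the record layout described in read_cifar10 concrete, here is a minimal NumPy sketch that decodes the same 3073-byte records without TensorFlow. The helper name parse_cifar10_records and the synthetic bytes are my own illustrative assumptions; they are not part of cifar10_input.py.

```python
import numpy as np

LABEL_BYTES = 1
HEIGHT, WIDTH, DEPTH = 32, 32, 3
RECORD_BYTES = LABEL_BYTES + HEIGHT * WIDTH * DEPTH  # 3073


def parse_cifar10_records(raw):
    """Decode raw bytes into (labels, images); hypothetical helper."""
    data = np.frombuffer(raw, dtype=np.uint8)
    if data.size % RECORD_BYTES != 0:
        raise ValueError('length is not a multiple of %d bytes' % RECORD_BYTES)
    records = data.reshape(-1, RECORD_BYTES)
    # The first byte of every record is the label.
    labels = records[:, 0].astype(np.int32)
    # Pixels are stored channel-major: [depth, height, width].
    depth_major = records[:, LABEL_BYTES:].reshape(-1, DEPTH, HEIGHT, WIDTH)
    # Move channels last, like tf.transpose(depth_major, [1, 2, 0]) per image.
    images = depth_major.transpose(0, 2, 3, 1)
    return labels, images


# Two synthetic records stand in for a real .bin file.
fake = np.zeros((2, RECORD_BYTES), dtype=np.uint8)
fake[0, 0], fake[1, 0] = 3, 7   # labels of the two records
fake[1, 1] = 255                # first red-channel byte of the second record
labels, images = parse_cifar10_records(fake.tobytes())
print(labels, images.shape)
```

With a real file, the bytes read from something like data_batch_1.bin would be passed in instead of the synthetic array.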
Now to the main topic: recognizing images on the CIFAR-10 dataset, step by step.
First, import the required packages.

import cifar10_input
import tensorflow as tf
import numpy as np
import time

Next, define batch_size, the number of training steps max_steps, and the location of the CIFAR-10 data.

max_steps = 3000
batch_size = 128
data_dir = './cifar-10-binary/cifar-10-batches-bin'  # adjust to your own path

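As before, we first define a function that initializes weights, again using a truncated normal distribution. This time, however, the function also attaches an L2 loss to the weights, which amounts to L2 regularization. Overfitting is usually eased by reducing the number of features or by penalizing the weights of unimportant features, but we generally do not know which weights to penalize. Regularization does this for us: the feature weights become part of the model's loss function. Put another way, using a feature costs some loss, so unless the feature is really useful, the added loss outweighs its benefit and its weight shrinks. This lets us roughly pick out the most effective features and reduces overfitting. This is the well-known Occam's razor: the simpler something is, the more effective it tends to be. L2 regularization keeps weights from growing too large and makes them more evenly distributed, whereas L1 regularization produces sparsity, driving most useless weights to 0. For the details, see a blog post I reposted, which explains it very clearly.
We use wl to control the size of the L2 loss. tf.nn.l2_loss computes the L2 loss of the weights, and tf.multiply scales it by wl to give the final weight loss. tf.add_to_collection then stores the weight loss in a collection named 'losses', which is used later when computing the network's total loss.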

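def variable_with_weight_loss(shape, stddev, wl):
    # Truncated-normal initialization with the given standard deviation.
    var = tf.Variable(tf.truncated_normal(shape, stddev=stddev))
    if wl is not None:
        # Scale the L2 loss of the weights by wl and register it in 'losses'.
        weight_loss = tf.multiply(tf.nn.l2_loss(var), wl, name='weight_loss')
        tf.add_to_collection('losses', weight_loss)
    return var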
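To see what this weight loss amounts to numerically, here is a small NumPy stand-in. It relies on tf.nn.l2_loss being half the sum of squares, sum(t ** 2) / 2; the function name l2_weight_loss and the sample matrix are assumptions made only for this illustration.

```python
import numpy as np


def l2_weight_loss(weights, wl):
    """NumPy stand-in for tf.multiply(tf.nn.l2_loss(var), wl)."""
    # tf.nn.l2_loss is half the sum of squared entries.
    return wl * np.sum(np.square(weights)) / 2.0


w = np.array([[1.0, -2.0], [0.5, 0.0]])   # sum of squares is 5.25
print(l2_weight_loss(w, 0.004))           # penalty used by the dense layers
print(l2_weight_loss(w, 0.0))             # wl = 0.0 adds a zero term
```

Note that with wl = 0.0 the term is still added to the collection (only wl = None skips it), but it contributes nothing to the total.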

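We use the distorted_inputs function from the cifar10_input module to produce the training data and labels. What it returns are already-packaged tensors; each execution yields one batch of batch_size samples. The data is augmented here, which gives us more noisy samples: one original image becomes many, effectively enlarging the dataset, and this helps accuracy considerably.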

images_train, labels_train = cifar10_input.distorted_inputs(data_dir = data_dir, batch_size = batch_size)
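For intuition about what the augmentation does to a single image, here is a rough NumPy sketch of the random crop, random flip and per-image standardization steps. It is not the TensorFlow implementation: the function names are hypothetical, brightness and contrast jitter are left out, and the floor on the standard deviation (1 / sqrt of the element count) follows my understanding of tf.image.per_image_standardization.

```python
import numpy as np


def random_crop_flip(image, size, rng):
    """Randomly crop to size x size, then flip left-right half the time."""
    h, w, _ = image.shape
    top = rng.integers(0, h - size + 1)
    left = rng.integers(0, w - size + 1)
    crop = image[top:top + size, left:left + size, :]
    if rng.random() < 0.5:
        crop = crop[:, ::-1, :]   # reverse the width axis
    return crop


def standardize(image):
    """Zero mean and unit variance per image, with a floor on the stddev."""
    image = image.astype(np.float32)
    min_std = 1.0 / np.sqrt(image.size)   # avoids dividing by zero
    return (image - image.mean()) / max(image.std(), min_std)


rng = np.random.default_rng(0)
raw = rng.integers(0, 256, size=(32, 32, 3)).astype(np.float32)
out = standardize(random_crop_flip(raw, 24, rng))
print(out.shape, float(out.mean()), float(out.std()))
```

Each call produces a different 24x24 view of the same 32x32 image, which is why one original picture effectively turns into many training samples.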

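Then we use cifar10_input.inputs to produce the test data. No augmentation is needed here because the data is only used for evaluation, but the images still have to be cropped to 24x24 to match the training data.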

images_test,labels_test = cifar10_input.inputs(eval_data = True,data_dir = data_dir,batch_size = batch_size)

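Here we create the placeholders for the input data, covering both the features and the labels. Unlike before, the first dimension of the placeholders is no longer None. It has to be fixed because the network definition uses it later, so it is set to the number of samples per batch, batch_size. Each image is 24x24 with 3 color channels.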

image_holder = tf.placeholder(tf.float32, [batch_size, 24,24,3])
label_holder = tf.placeholder(tf.int32,[batch_size])

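With the data ready, we start defining the network structure.
The first layer comes first.
We use the variable_with_weight_loss function defined above to create and initialize the convolution kernel parameters. The first kernel is 5x5 with 3 input channels and 64 filters, the standard deviation is 0.05 (an empirical value), and no L2 regularization is applied (wl = 0.0). The bias is a 64-dimensional constant 0. tf.nn.conv2d performs the convolution, followed by a ReLU activation and max pooling for downsampling. Finally tf.nn.lrn applies local response normalization (LRN) to the result, which makes relatively large responses stand out more and suppresses neurons with smaller responses.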

weight1 = variable_with_weight_loss(shape=[5, 5, 3, 64], stddev=5e-2, wl=0.0)
bias1 = tf.Variable(tf.constant(0.0, shape=[64]))
kernel1 = tf.nn.conv2d(image_holder, weight1, [1, 1, 1, 1], padding='SAME')
conv1 = tf.nn.relu(tf.nn.bias_add(kernel1, bias1))
pool1 = tf.nn.max_pool(conv1, ksize=[1, 3, 3, 1], strides=[1, 2, 2, 1], padding='SAME')
norm1 = tf.nn.lrn(pool1, 4, bias=1.0, alpha=0.001 / 9.0, beta=0.75)
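To show what the LRN step computes, here is a plain NumPy sketch of the formula as I understand it from the tf.nn.lrn documentation: each value is divided by (bias + alpha * sum of squares over neighboring channels) ** beta. The function below is an illustration with a simple loop, not the library implementation.

```python
import numpy as np


def lrn(x, depth_radius=4, bias=1.0, alpha=0.001 / 9.0, beta=0.75):
    """Local response normalization over the last (channel) axis."""
    channels = x.shape[-1]
    squared = np.square(x)
    out = np.empty_like(x, dtype=np.float64)
    for d in range(channels):
        # Window of neighboring channels, clipped at the edges.
        lo = max(0, d - depth_radius)
        hi = min(channels, d + depth_radius + 1)
        sqr_sum = squared[..., lo:hi].sum(axis=-1)
        out[..., d] = x[..., d] / (bias + alpha * sqr_sum) ** beta
    return out


# One pixel with three channels; exaggerated parameters make the effect visible.
x = np.array([[[[1.0, 2.0, 3.0]]]])
print(lrn(x, depth_radius=1, bias=1.0, alpha=1.0, beta=1.0))
```

With the small alpha used in the network (0.001 / 9.0) the rescaling is much gentler than in this exaggerated example.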

Next we build the second layer.
The difference from the first layer is that we swap the order of the LRN layer and the pooling layer.

weight2 = variable_with_weight_loss(shape=[5, 5, 64, 64], stddev=5e-2, wl=0.0)
bias2 = tf.Variable(tf.constant(0.1, shape=[64]))
kernel2 = tf.nn.conv2d(norm1, weight2, [1, 1, 1, 1], padding='SAME')
conv2 = tf.nn.relu(tf.nn.bias_add(kernel2, bias2))
norm2 = tf.nn.lrn(conv2, 4, bias=1.0, alpha=0.001 / 9.0, beta=0.75)
pool2 = tf.nn.max_pool(norm2, ksize=[1, 3, 3, 1], strides=[1, 2, 2, 1], padding='SAME')
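It can help to trace the spatial size through these two layers. With padding='SAME', the output length along each axis is ceil(input / stride), independent of the kernel size. The helper same_out below is a hypothetical name used only for this arithmetic.

```python
import math


def same_out(size, stride):
    """Output length along one axis for padding='SAME'."""
    return int(math.ceil(size / float(stride)))


size = 24                   # input images are 24 x 24
size = same_out(size, 1)    # conv1, stride 1
size = same_out(size, 2)    # pool1, stride 2
size = same_out(size, 1)    # conv2, stride 1
size = same_out(size, 2)    # pool2, stride 2
dim = size * size * 64      # 64 channels come out of conv2
print(size, dim)
```

This gives a 6 x 6 feature map and a flattened length of 2304 per sample, which is the first dimension of the weight matrix in the first fully connected layer.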

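Now we reach the fully connected layers, which require flattening each sample into a one-dimensional vector. We use tf.reshape to flatten each sample and get_shape to read the flattened length. I ran a small experiment to understand this:
I took a tensor of shape [3,3,2,4] with a batch_size of 6 and reshaped it so that each sample became a one-dimensional vector.
dim = reshape.get_shape()[1].value then reads the size of the second dimension, that is, the length of each row.
The result is 12 (the tensor has 3*3*2*4 = 72 elements, and 72 / 6 = 12).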
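The screenshots of that experiment are not available, so here is a NumPy reconstruction under the assumption that a 72-element tensor of shape [3, 3, 2, 4] was reshaped into 6 rows. The second part applies the same reshape to an array with the shape pool2 should have in this network (128 x 6 x 6 x 64).

```python
import numpy as np

batch_size = 6
t = np.arange(3 * 3 * 2 * 4).reshape(3, 3, 2, 4)
flat = t.reshape(batch_size, -1)     # -1 lets NumPy infer the row length
dim = flat.shape[1]
print(flat.shape, dim)

# Same idea for the real network: one row per image in the batch.
pool2_like = np.zeros((128, 6, 6, 64), dtype=np.float32)
print(pool2_like.reshape(128, -1).shape)
```

tf.reshape(pool2, [batch_size, -1]) follows the same rule: the -1 dimension is inferred from the total number of elements.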

==================

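Back to the main topic:
Next we initialize the weights of the fully connected layer, with 384 hidden units, a standard deviation of 0.04 and a bias of 0.1. We do not want this layer to overfit, so we give it a non-zero weight loss of 0.004. ReLU is again used as the activation function.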

# reshape --> [batch_size, dim]
reshape = tf.reshape(pool2, [batch_size, -1])
dim = reshape.get_shape()[1].value
# weight3 --> [dim, 384]
weight3 = variable_with_weight_loss(shape=[dim, 384], stddev=0.04, wl=0.004)
bias3 = tf.Variable(tf.constant(0.1, shape=[384]))
# local3 --> [batch_size, 384]
local3 = tf.nn.relu(tf.matmul(reshape, weight3) + bias3)

Then comes another fully connected layer, with half as many units: 192.

# weight4 --> [384, 192]
weight4 = variable_with_weight_loss(shape=[384, 192], stddev=0.04, wl=0.004)
bias4 = tf.Variable(tf.constant(0.1, shape=[192]))
# local4 --> [batch_size, 192]
local4 = tf.nn.relu(tf.matmul(local3, weight4) + bias4)

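The last layer follows. We again create its weights first, with the standard deviation of the normal distribution set to the reciprocal of the previous hidden layer's unit count (1/192), and without L2 regularization. Unlike before, we do not output softmax here, because the softmax is computed as part of the loss. For prediction it is enough to compare the raw scores that inference outputs for each class. Softmax is mainly needed to compute the loss, so it fits better there.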

weight5 = variable_with_weight_loss(shape=[192, 10], stddev=1/192.0, wl=0.0)
bias5 = tf.Variable(tf.constant(0.0, shape=[10]))
logits = tf.add(tf.matmul(local4, weight5), bias5)

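With the model built, we move on to step two: defining the loss function.
We still use cross entropy, but note that the softmax and the cross-entropy loss are computed together by tf.nn.sparse_softmax_cross_entropy_with_logits. tf.reduce_mean averages the cross entropy over the batch, tf.add_to_collection adds that mean to the 'losses' collection, and tf.add_n sums everything in the collection to give the final loss. The total includes the cross-entropy loss and the L2 weight losses of the two fully connected hidden layers (the layers created with wl = 0.0 contribute zero).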

def loss(logits, labels):
    labels = tf.cast(labels, tf.int64)
    # Softmax and cross entropy are computed in one op, per example.
    cross_entropy = tf.nn.sparse_softmax_cross_entropy_with_logits(
        logits=logits, labels=labels, name='cross_entropy_per_example')
    cross_entropy_mean = tf.reduce_mean(cross_entropy, name='cross_entropy')
    tf.add_to_collection('losses', cross_entropy_mean)
    # Total loss = mean cross entropy + all registered L2 weight losses.
    return tf.add_n(tf.get_collection('losses'), name='total_loss')
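As a rough picture of what the fused op computes for each example, here is a NumPy sketch: a numerically stable log-softmax followed by picking out the negative log-probability of the true class. The function name and the sample logits are assumptions for illustration only.

```python
import numpy as np


def sparse_softmax_cross_entropy(logits, labels):
    """Per-example cross entropy from raw logits and integer labels."""
    # Subtract the row maximum first for numerical stability.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Select the log-probability of the true class in each row.
    return -log_probs[np.arange(len(labels)), labels]


logits = np.array([[2.0, 1.0, 0.1], [0.0, 0.0, 0.0]])
labels = np.array([0, 2])
per_example = sparse_softmax_cross_entropy(logits, labels)
print(per_example, per_example.mean())
```

The second row has identical scores for all three classes, so its loss is log(3), the loss of a uniform guess. The mean over the batch corresponds to cross_entropy_mean in the function above.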

Then pass logits and label_holder to the loss function to get the loss.

loss = loss(logits, label_holder)

For the optimizer we choose Adam with a learning rate of 1e-3.

train_op = tf.train.AdamOptimizer(1e-3).minimize(loss)

tf.nn.in_top_k computes the top-k accuracy of the output. We use top 1 here, that is, whether the class with the highest score is the correct one.

top_k_op = tf.nn.in_top_k(logits, label_holder, 1)
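For a feel of what this op returns, here is a NumPy sketch of the same idea: one boolean per example, True when the true class is among the k highest scores. How ties are handled here (a tie at the boundary counts as a hit) is my assumption about the behavior, and the function is only an illustration.

```python
import numpy as np


def in_top_k(logits, labels, k=1):
    """True where the label's score is among the k highest in its row."""
    target = logits[np.arange(len(labels)), labels][:, None]
    higher = (logits > target).sum(axis=1)   # classes scoring strictly higher
    return higher < k


logits = np.array([[0.1, 0.7, 0.2], [0.5, 0.3, 0.2], [0.2, 0.3, 0.5]])
labels = np.array([1, 1, 2])
hits = in_top_k(logits, labels, k=1)
print(hits, hits.sum())
```

Summing the booleans gives the number of correct predictions in the batch, which is how the evaluation loop later uses np.sum(predictions).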

Create the default session and initialize all variables.

sess = tf.InteractiveSession()
tf.global_variables_initializer().run()

This step starts the thread queues for the image augmentation mentioned earlier. A total of 16 threads are used to speed things up.

tf.train.start_queue_runners()

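Next comes step three: training.
In each step we first call the session's run method on images_train and labels_train to get one batch of training data, then feed that batch to the computation of train_op and loss. We record the time each step takes, and every 10 steps we compute and print the current loss, the number of samples trained per second, and the time spent on one batch. A note on _, loss_value = sess.run([train_op, loss], feed_dict={image_holder: image_batch, label_holder: label_batch}): the _ receives the output of train_op. That output is not used, but train_op still has to be run, so the underscore is just a throwaway variable name.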

for step in range(max_steps):
    start_time = time.time()
    # Fetch one batch of augmented training data from the input queue.
    image_batch, label_batch = sess.run([images_train, labels_train])
    # Run one optimization step; the output of train_op is discarded.
    _, loss_value = sess.run([train_op, loss],
                             feed_dict={image_holder: image_batch,
                                        label_holder: label_batch})
    duration = time.time() - start_time
    if step % 10 == 0:
        examples_per_sec = batch_size / duration
        sec_per_batch = float(duration)
        format_str = ('step %d, loss = %.2f (%.1f examples/sec; %.3f sec/batch)')
        print(format_str % (step, loss_value, examples_per_sec, sec_per_batch))

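The last step is evaluation. The test set has 10,000 samples, but as in training we must use the fixed batch_size. We first compute how many batches are needed, then use the session's run method to fetch a batch of images_test and labels_test, and run top_k_op to count the samples in that batch whose top-1 prediction is correct. Finally we sum the correct predictions over all batches, compute the fraction of test samples predicted correctly, and print it.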

num_examples = 10000
import math
# Round up so that every test sample is covered, then convert to int.
num_iter = int(math.ceil(num_examples / batch_size))
true_count = 0
total_sample_count = num_iter * batch_size
step = 0
while step < num_iter:
    image_batch, label_batch = sess.run([images_test, labels_test])
    predictions = sess.run([top_k_op], feed_dict={image_holder: image_batch,
                                                  label_holder: label_batch})
    true_count += np.sum(predictions)
    step += 1
precision = true_count / total_sample_count
print('precision @ 1 = %.3f' % precision)
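One detail worth checking by hand is how many samples the loop actually evaluates, since the batch size is fixed. The short calculation below only repeats the arithmetic from the code above; the float() call is my addition so the division also behaves under Python 2.

```python
import math

num_examples = 10000
batch_size = 128
num_iter = int(math.ceil(num_examples / float(batch_size)))
total_sample_count = num_iter * batch_size
extra = total_sample_count - num_examples
print(num_iter, total_sample_count, extra)
```

So the loop runs 79 batches and scores 10,112 predictions, 112 more than the test set holds. Assuming the filename queue cycles back to the start of test_batch.bin (which I believe is the default behavior of tf.train.string_input_producer when num_epochs is not set), a small number of test images are counted twice, and the printed precision is an average over all 10,112 predictions.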

That is everything. Here is the complete code (cifar10_input.py is given above).

===============

import cifar10_input
import tensorflow as tf
import numpy as np
import time

max_steps = 3000
batch_size = 128
data_dir = './cifar-10-binary/cifar-10-batches-bin'


def variable_with_weight_loss(shape, stddev, wl):
    var = tf.Variable(tf.truncated_normal(shape, stddev=stddev))
    if wl is not None:
        weight_loss = tf.multiply(tf.nn.l2_loss(var), wl, name='weight_loss')
        tf.add_to_collection('losses', weight_loss)
    return var


def loss(logits, labels):
#      """Add L2Loss to all the trainable variables.
#      Add summary for "Loss" and "Loss/avg".
#      Args:
#        logits: Logits from inference().
#        labels: Labels from distorted_inputs or inputs(). 1-D tensor
#                of shape [batch_size]
#      Returns:
#        Loss tensor of type float.
#      """
#      # Calculate the average cross entropy loss across the batch.
    labels = tf.cast(labels, tf.int64)
    cross_entropy = tf.nn.sparse_softmax_cross_entropy_with_logits(
        logits=logits, labels=labels, name='cross_entropy_per_example')
    cross_entropy_mean = tf.reduce_mean(cross_entropy, name='cross_entropy')
    tf.add_to_collection('losses', cross_entropy_mean)

  # The total loss is defined as the cross entropy loss plus all of the weight
  # decay terms (L2 loss).
    return tf.add_n(tf.get_collection('losses'), name='total_loss')
  
###

# cifar10.maybe_download_and_extract()


images_train, labels_train = cifar10_input.distorted_inputs(data_dir=data_dir,
                                                            batch_size=batch_size)

images_test, labels_test = cifar10_input.inputs(eval_data=True,
                                                data_dir=data_dir,
                                                batch_size=batch_size)                                                  
#images_train, labels_train = cifar10.distorted_inputs()
#images_test, labels_test = cifar10.inputs(eval_data=True)

image_holder = tf.placeholder(tf.float32, [batch_size, 24, 24, 3])
label_holder = tf.placeholder(tf.int32, [batch_size])

#logits = inference(image_holder)

weight1 = variable_with_weight_loss(shape=[5, 5, 3, 64], stddev=5e-2, wl=0.0)
kernel1 = tf.nn.conv2d(image_holder, weight1, [1, 1, 1, 1], padding='SAME')
bias1 = tf.Variable(tf.constant(0.0, shape=[64]))
conv1 = tf.nn.relu(tf.nn.bias_add(kernel1, bias1))
pool1 = tf.nn.max_pool(conv1, ksize=[1, 3, 3, 1], strides=[1, 2, 2, 1],
                       padding='SAME')
norm1 = tf.nn.lrn(pool1, 4, bias=1.0, alpha=0.001 / 9.0, beta=0.75)


weight2 = variable_with_weight_loss(shape=[5, 5, 64, 64], stddev=5e-2, wl=0.0)
kernel2 = tf.nn.conv2d(norm1, weight2, [1, 1, 1, 1], padding='SAME')
bias2 = tf.Variable(tf.constant(0.1, shape=[64]))
conv2 = tf.nn.relu(tf.nn.bias_add(kernel2, bias2))
norm2 = tf.nn.lrn(conv2, 4, bias=1.0, alpha=0.001 / 9.0, beta=0.75)
pool2 = tf.nn.max_pool(norm2, ksize=[1, 3, 3, 1], strides=[1, 2, 2, 1],
                       padding='SAME')

reshape = tf.reshape(pool2, [batch_size, -1])
dim = reshape.get_shape()[1].value
weight3 = variable_with_weight_loss(shape=[dim, 384], stddev=0.04, wl=0.004)
bias3 = tf.Variable(tf.constant(0.1, shape=[384]))
local3 = tf.nn.relu(tf.matmul(reshape, weight3) + bias3)

weight4 = variable_with_weight_loss(shape=[384, 192], stddev=0.04, wl=0.004)
bias4 = tf.Variable(tf.constant(0.1, shape=[192]))                                      
local4 = tf.nn.relu(tf.matmul(local3, weight4) + bias4)

weight5 = variable_with_weight_loss(shape=[192, 10], stddev=1/192.0, wl=0.0)
bias5 = tf.Variable(tf.constant(0.0, shape=[10]))
logits = tf.add(tf.matmul(local4, weight5), bias5)

loss = loss(logits, label_holder)


train_op = tf.train.AdamOptimizer(1e-3).minimize(loss) #0.72

top_k_op = tf.nn.in_top_k(logits, label_holder, 1)

sess = tf.InteractiveSession()
tf.global_variables_initializer().run()

tf.train.start_queue_runners()
###
for step in range(max_steps):
    start_time = time.time()
    image_batch,label_batch = sess.run([images_train,labels_train])
    _, loss_value = sess.run([train_op, loss],feed_dict={image_holder: image_batch,
                                                         label_holder:label_batch})
    duration = time.time() - start_time

    if step % 10 == 0:
        examples_per_sec = batch_size / duration
        sec_per_batch = float(duration)
    
        format_str = ('step %d, loss = %.2f (%.1f examples/sec; %.3f sec/batch)')
        print(format_str % (step, loss_value, examples_per_sec, sec_per_batch))

    
###
num_examples = 10000
import math
num_iter = int(math.ceil(num_examples / batch_size))
true_count = 0  
total_sample_count = num_iter * batch_size
step = 0
while step < num_iter:
    image_batch,label_batch = sess.run([images_test,labels_test])
    predictions = sess.run([top_k_op],feed_dict={image_holder: image_batch,
                                                 label_holder:label_batch})
    print(predictions)
    true_count += np.sum(predictions)
    step += 1

precision = true_count / total_sample_count
print('precision @ 1 = %.3f' % precision)

Final result:
[Figure: screenshot of the program output]
