The Road to TensorFlow Mastery (6): CIFAR-10 Image Recognition (Part 2)

8. Source code analysis


1. The entry function

To train the official TensorFlow cifar10 model you simply run python cifar10_train.py, so the entry point must be in cifar10_train.py. Find:

def main(argv=None):  # pylint: disable=unused-argument
  cifar10.maybe_download_and_extract()
  if tf.gfile.Exists(FLAGS.train_dir):
    tf.gfile.DeleteRecursively(FLAGS.train_dir)
  tf.gfile.MakeDirs(FLAGS.train_dir)
  train()
 
 
if __name__ == '__main__':
  tf.app.run()


 

The first part downloads and extracts the CIFAR-10 dataset, which is not the focus here; let's move on to the train() function.

 

 

2. The train() function

def train():
  """Train CIFAR-10 for a number of steps."""
  with tf.Graph().as_default():
    global_step = tf.train.get_or_create_global_step()
 
    with tf.device('/cpu:0'):
      images, labels = cifar10.distorted_inputs()
...


The function cifar10.distorted_inputs() fetches the CIFAR-10 images and their corresponding labels. Let's follow it into cifar10.py and see what distorted_inputs() does.

3. The cifar10.distorted_inputs() function

def distorted_inputs():
  if not FLAGS.data_dir:
    raise ValueError('Please supply a data_dir')
  data_dir = os.path.join(FLAGS.data_dir, 'cifar-10-batches-bin')
  images, labels = cifar10_input.distorted_inputs(data_dir=data_dir,
                                                  batch_size=FLAGS.batch_size)
  if FLAGS.use_fp16:
    images = tf.cast(images, tf.float16)
    labels = tf.cast(labels, tf.float16)
  return images, labels


As you can see, it passes the CIFAR-10 dataset path and the batch size on to cifar10_input.distorted_inputs, which returns the image and label tensors. Next, let's look at cifar10_input.distorted_inputs.

4. The cifar10_input.distorted_inputs(data_dir, batch_size) function

 

def distorted_inputs(data_dir, batch_size):
  filenames = [os.path.join(data_dir, 'data_batch_%d.bin' % i)
               for i in xrange(1, 6)]
  for f in filenames:
    if not tf.gfile.Exists(f):
      raise ValueError('Failed to find file: ' + f)
 
  # Create a queue that produces the filenames to read.
  filename_queue = tf.train.string_input_producer(filenames)


Because the image and label data actually live in data_batch_1.bin through data_batch_5.bin, those file names are first collected into the list filenames and then passed to tf.train.string_input_producer, which creates a filename queue. Continuing after the short sketch below:
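A minimal standalone sketch of driving such a filename queue (the file names are placeholders and never actually opened; queue runners must be started before anything can be dequeued):

import tensorflow as tf

# Minimal sketch: a filename queue simply cycles through a list of names.
# The names here are placeholders and are never actually opened.
filenames = ['data_batch_%d.bin' % i for i in range(1, 6)]
filename_queue = tf.train.string_input_producer(filenames, shuffle=False)
dequeued_name = filename_queue.dequeue()

with tf.Session() as sess:
  coord = tf.train.Coordinator()
  threads = tf.train.start_queue_runners(sess=sess, coord=coord)
  for _ in range(3):
    print(sess.run(dequeued_name))  # data_batch_1.bin, data_batch_2.bin, ...
  coord.request_stop()
  coord.join(threads)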

 

  # Read examples from files in the filename queue.
  read_input = read_cifar10(filename_queue)
  reshaped_image = tf.cast(read_input.uint8image, tf.float32)


read_cifar10 plays much the same role as the get_record function from the CIFAR-10 Image Recognition (Part 1) post: it returns an object whose label field holds the label data and whose uint8image field holds the image data, which tf.cast then converts to float32. Continuing after the reader sketch below:
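For reference, a hedged sketch of what such a reader does (not the original read_cifar10 code): each record in the CIFAR-10 binary files is 1 label byte followed by 32*32*3 image bytes stored depth-major, so tf.FixedLengthRecordReader plus tf.decode_raw recovers both.

import tensorflow as tf

def read_cifar10_sketch(filename_queue):
  """Hedged sketch of a CIFAR-10 binary record reader (not the original code)."""
  label_bytes, height, width, depth = 1, 32, 32, 3
  image_bytes = height * width * depth            # 3072
  record_bytes = label_bytes + image_bytes        # 3073 bytes per record

  reader = tf.FixedLengthRecordReader(record_bytes=record_bytes)
  _, value = reader.read(filename_queue)          # one raw record as a string
  record = tf.decode_raw(value, tf.uint8)         # string -> uint8 vector

  label = tf.cast(record[:label_bytes], tf.int32)
  # The image bytes are stored depth-major [depth, height, width]; move to HWC.
  depth_major = tf.reshape(record[label_bytes:record_bytes],
                           [depth, height, width])
  uint8image = tf.transpose(depth_major, [1, 2, 0])
  return uint8image, label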

height = IMAGE_SIZE #24
width = IMAGE_SIZE #24
 
distorted_image = tf.random_crop(reshaped_image, [height, width, 3])
distorted_image = tf.image.random_flip_left_right(distorted_image)
distorted_image = tf.image.random_brightness(distorted_image,
                                             max_delta=63)
distorted_image = tf.image.random_contrast(distorted_image,
                                           lower=0.2, upper=1.8)
float_image = tf.image.per_image_standardization(distorted_image)
# Set the shapes of tensors.
float_image.set_shape([height, width, 3])
read_input.label.set_shape([1])
min_fraction_of_examples_in_queue = 0.4
min_queue_examples = int(NUM_EXAMPLES_PER_EPOCH_FOR_TRAIN *
                         min_fraction_of_examples_in_queue)
 
return _generate_image_and_label_batch(float_image, read_input.label,
                                       min_queue_examples, batch_size,
                                       shuffle=True)


The operations above are all data augmentation: the source image is randomly cropped to 24x24, randomly flipped left to right, has its brightness and contrast randomly perturbed, and is finally standardized, giving a [24, 24, 3] float tensor. After the sanity-check sketch below, let's see what _generate_image_and_label_batch does.
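A standalone sanity check, with a dummy random image standing in for a decoded CIFAR-10 record, running the same augmentation ops and confirming the output shape:

import tensorflow as tf

# Standalone sanity check: run the same augmentation ops on a dummy 32x32x3
# image and confirm the result is a [24, 24, 3] float32 tensor.
dummy = tf.random_uniform([32, 32, 3], maxval=255.0)
crop = tf.random_crop(dummy, [24, 24, 3])
flip = tf.image.random_flip_left_right(crop)
bright = tf.image.random_brightness(flip, max_delta=63)
contrast = tf.image.random_contrast(bright, lower=0.2, upper=1.8)
standardized = tf.image.per_image_standardization(contrast)

with tf.Session() as sess:
  out = sess.run(standardized)
  print(out.shape, out.dtype)   # (24, 24, 3) float32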

def _generate_image_and_label_batch(image, label, min_queue_examples,
                                    batch_size, shuffle):
  num_preprocess_threads = 16
  if shuffle:
    images, label_batch = tf.train.shuffle_batch(
        [image, label],
        batch_size=batch_size,
        num_threads=num_preprocess_threads,
        capacity=min_queue_examples + 3 * batch_size,
        min_after_dequeue=min_queue_examples)
  else:
    images, label_batch = tf.train.batch(
        [image, label],
        batch_size=batch_size,
        num_threads=num_preprocess_threads,
        capacity=min_queue_examples + 3 * batch_size)
 
  # Display the training images in the visualizer.
  tf.summary.image('images', images)
 
  return images, tf.reshape(label_batch, [batch_size])


The key call is tf.train.shuffle_batch, which assembles shuffled batches of examples: [image, label] is a single example and its label, batch_size is the batch size, capacity is the capacity of the underlying queue, num_threads is the number of enqueueing threads, and min_after_dequeue is the minimum number of elements that must remain in the queue after a dequeue (larger values give better shuffling). After all of this, the image data is a four-dimensional tensor [batch_size, height, width, 3] and the labels form a one-dimensional tensor [batch_size]. Following the toy sketch below, we return to train() and continue:
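A toy sketch of tf.train.shuffle_batch in isolation, using fake scalar examples; the capacity and min_after_dequeue values are arbitrary:

import tensorflow as tf

# Toy sketch: batch single scalar "examples" into shuffled batches of 4.
example = tf.random_uniform([], maxval=100, dtype=tf.int32)   # one fake sample
label = example % 10                                          # its fake label
examples, labels = tf.train.shuffle_batch(
    [example, label],
    batch_size=4,
    num_threads=2,
    capacity=50 + 3 * 4,        # queue capacity
    min_after_dequeue=50)       # keep >= 50 queued so shuffling mixes well

with tf.Session() as sess:
  coord = tf.train.Coordinator()
  threads = tf.train.start_queue_runners(sess=sess, coord=coord)
  print(sess.run([examples, labels]))   # two arrays of shape (4,)
  coord.request_stop()
  coord.join(threads)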

# Build a Graph that computes the logits predictions from the
# inference model.
logits = cifar10.inference(images)


The cifar10.inference function here is the heart of the convolutional model, so let's step inside:

def inference(images):
  # conv1
  with tf.variable_scope('conv1') as scope:
    kernel = _variable_with_weight_decay('weights',
                                         shape=[5, 5, 3, 64],
                                         stddev=5e-2,
                                         wd=None)
    conv = tf.nn.conv2d(images, kernel, [1, 1, 1, 1], padding='SAME')
    biases = _variable_on_cpu('biases', [64], tf.constant_initializer(0.0))
    pre_activation = tf.nn.bias_add(conv, biases)
    conv1 = tf.nn.relu(pre_activation, name=scope.name)
    _activation_summary(conv1)
 
  # pool1
  pool1 = tf.nn.max_pool(conv1, ksize=[1, 3, 3, 1], strides=[1, 2, 2, 1],
                         padding='SAME', name='pool1')
  # norm1
  norm1 = tf.nn.lrn(pool1, 4, bias=1.0, alpha=0.001 / 9.0, beta=0.75,
                    name='norm1')
 
  # conv2
  with tf.variable_scope('conv2') as scope:
    kernel = _variable_with_weight_decay('weights',
                                         shape=[5, 5, 64, 64],
                                         stddev=5e-2,
                                         wd=None)
    conv = tf.nn.conv2d(norm1, kernel, [1, 1, 1, 1], padding='SAME')
    biases = _variable_on_cpu('biases', [64], tf.constant_initializer(0.1))
    pre_activation = tf.nn.bias_add(conv, biases)
    conv2 = tf.nn.relu(pre_activation, name=scope.name)
    _activation_summary(conv2)
 
  # norm2
  norm2 = tf.nn.lrn(conv2, 4, bias=1.0, alpha=0.001 / 9.0, beta=0.75,
                    name='norm2')
  # pool2
  pool2 = tf.nn.max_pool(norm2, ksize=[1, 3, 3, 1],
                         strides=[1, 2, 2, 1], padding='SAME', name='pool2')
 
  # local3
  with tf.variable_scope('local3') as scope:
    # Move everything into depth so we can perform a single matrix multiply.
    reshape = tf.reshape(pool2, [images.get_shape().as_list()[0], -1])
    dim = reshape.get_shape()[1].value
    weights = _variable_with_weight_decay('weights', shape=[dim, 384],
                                          stddev=0.04, wd=0.004)
    biases = _variable_on_cpu('biases', [384], tf.constant_initializer(0.1))
    local3 = tf.nn.relu(tf.matmul(reshape, weights) + biases, name=scope.name)
    _activation_summary(local3)
 
  # local4
  with tf.variable_scope('local4') as scope:
    weights = _variable_with_weight_decay('weights', shape=[384, 192],
                                          stddev=0.04, wd=0.004)
    biases = _variable_on_cpu('biases', [192], tf.constant_initializer(0.1))
    local4 = tf.nn.relu(tf.matmul(local3, weights) + biases, name=scope.name)
    _activation_summary(local4)
 
  # linear layer(WX + b),
  with tf.variable_scope('softmax_linear') as scope:
    weights = _variable_with_weight_decay('weights', [192, NUM_CLASSES],
                                          stddev=1/192.0, wd=None)
    biases = _variable_on_cpu('biases', [NUM_CLASSES],
                              tf.constant_initializer(0.0))
    softmax_linear = tf.add(tf.matmul(local4, weights), biases, name=scope.name)
    _activation_summary(softmax_linear)
 
  return softmax_linear


As you can see, this model is similar to the two-layer convolutional network we used for MNIST: a first convolution plus pooling layer, a second convolution plus pooling layer (with local response normalization around them), and then three fully connected layers. After the shape sketch below, let's keep going:
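Purely for illustration, a hedged sketch of the same topology written with tf.layers, which makes the tensor shape at each stage explicit (the original builds its variables by hand and also adds LRN, summaries and weight decay; 128 is just an example batch size):

import tensorflow as tf

# Hedged sketch of the same topology with tf.layers, to make the shapes
# explicit (approximation, not the original code).
images = tf.placeholder(tf.float32, [128, 24, 24, 3])
conv1 = tf.layers.conv2d(images, 64, 5, padding='same', activation=tf.nn.relu)
pool1 = tf.layers.max_pooling2d(conv1, 3, 2, padding='same')  # [128, 12, 12, 64]
conv2 = tf.layers.conv2d(pool1, 64, 5, padding='same', activation=tf.nn.relu)
pool2 = tf.layers.max_pooling2d(conv2, 3, 2, padding='same')  # [128, 6, 6, 64]
flat = tf.reshape(pool2, [128, -1])                           # [128, 2304]
local3 = tf.layers.dense(flat, 384, activation=tf.nn.relu)
local4 = tf.layers.dense(local3, 192, activation=tf.nn.relu)
logits = tf.layers.dense(local4, 10)                          # NUM_CLASSES = 10
print(logits.shape)                                           # (128, 10)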

# Calculate loss.
loss = cifar10.loss(logits, labels)


This is where the loss is computed; let's look inside:

def loss(logits, labels):
  labels = tf.cast(labels, tf.int64)
  cross_entropy = tf.nn.sparse_softmax_cross_entropy_with_logits(
      labels=labels, logits=logits, name='cross_entropy_per_example')
  cross_entropy_mean = tf.reduce_mean(cross_entropy, name='cross_entropy')
  tf.add_to_collection('losses', cross_entropy_mean)
 
  # The total loss is defined as the cross entropy loss plus all of the weight
  # decay terms (L2 loss).
  return tf.add_n(tf.get_collection('losses'), name='total_loss')


Here tf.nn.sparse_softmax_cross_entropy_with_logits computes the softmax cross entropy between logits and labels, tf.reduce_mean averages it over the batch, and tf.add_n then sums this mean with all the weight-decay (L2) terms collected in the 'losses' collection to give the total loss. Following the toy sketch below, we return to cifar10_train and continue:
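A toy numeric sketch of these pieces, with two examples, ten classes, and a constant standing in for the weight-decay terms:

import tensorflow as tf

# Toy sketch: cross entropy for a batch of 2 examples over 10 classes, plus a
# fake constant standing in for the weight-decay terms in the 'losses' collection.
logits = tf.constant([[2.0, 1.0, 0.1, 0., 0., 0., 0., 0., 0., 0.],
                      [0.1, 0.2, 3.0, 0., 0., 0., 0., 0., 0., 0.]])
labels = tf.constant([0, 2], dtype=tf.int64)   # integer class ids, not one-hot

xent = tf.nn.sparse_softmax_cross_entropy_with_logits(labels=labels, logits=logits)
xent_mean = tf.reduce_mean(xent)               # average over the batch
weight_decay = tf.constant(0.01)               # stand-in for an L2 term
total_loss = tf.add_n([xent_mean, weight_decay])

with tf.Session() as sess:
  print(sess.run([xent, xent_mean, total_loss]))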

# Build a Graph that trains the model with one batch of examples and
# updates the model parameters.
train_op = cifar10.train(loss, global_step)


This builds the training op; let's look inside:

def train(total_loss, global_step):
  # Variables that affect learning rate.
  num_batches_per_epoch = NUM_EXAMPLES_PER_EPOCH_FOR_TRAIN / FLAGS.batch_size
  decay_steps = int(num_batches_per_epoch * NUM_EPOCHS_PER_DECAY)
 
  # Decay the learning rate exponentially based on the number of steps.
  lr = tf.train.exponential_decay(INITIAL_LEARNING_RATE,
                                  global_step,
                                  decay_steps,
                                  LEARNING_RATE_DECAY_FACTOR,
                                  staircase=True)
  tf.summary.scalar('learning_rate', lr)
 
  # Generate moving averages of all losses and associated summaries.
  loss_averages_op = _add_loss_summaries(total_loss)
 
  # Compute gradients.
  with tf.control_dependencies([loss_averages_op]):
    opt = tf.train.GradientDescentOptimizer(lr)
    grads = opt.compute_gradients(total_loss)
 
  # Apply gradients.
  apply_gradient_op = opt.apply_gradients(grads, global_step=global_step)
 
  # Add histograms for trainable variables.
  for var in tf.trainable_variables():
    tf.summary.histogram(var.op.name, var)
 
  # Add histograms for gradients.
  for grad, var in grads:
    if grad is not None:
      tf.summary.histogram(var.op.name + '/gradients', grad)
 
  # Track the moving averages of all trainable variables.
  variable_averages = tf.train.ExponentialMovingAverage(
      MOVING_AVERAGE_DECAY, global_step)
  with tf.control_dependencies([apply_gradient_op]):
    variables_averages_op = variable_averages.apply(tf.trainable_variables())
 
  return variables_averages_op


Here tf.train.exponential_decay implements the exponentially decaying learning rate described in the previous post. With that learning rate, tf.train.GradientDescentOptimizer performs gradient descent on the loss; the pair grads = opt.compute_gradients(total_loss) and opt.apply_gradients(grads, global_step=global_step) does exactly what the earlier tf.train.Optimizer.minimize call did, since minimize simply combines these two steps (they are split here so the gradients can also be written to summaries). Finally, tf.train.ExponentialMovingAverage maintains moving averages of all trainable variables. Following the toy sketch below, we go back to cifar10_train and continue:
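A toy sketch of the decaying learning rate and the compute_gradients/apply_gradients pair, using a single scalar variable in place of the model:

import tensorflow as tf

# Toy sketch: compute_gradients() + apply_gradients() builds the same training
# op as minimize(); a single scalar variable stands in for the model.
x = tf.Variable(5.0)
loss = tf.square(x)
global_step = tf.Variable(0, trainable=False)

lr = tf.train.exponential_decay(0.1, global_step,
                                decay_steps=100, decay_rate=0.9, staircase=True)
opt = tf.train.GradientDescentOptimizer(lr)

grads_and_vars = opt.compute_gradients(loss)    # [(dloss/dx, x)]
train_op = opt.apply_gradients(grads_and_vars, global_step=global_step)
# The one-liner below would build the equivalent op:
# train_op = opt.minimize(loss, global_step=global_step)

with tf.Session() as sess:
  sess.run(tf.global_variables_initializer())
  for _ in range(3):
    print(sess.run([train_op, x, global_step]))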

class _LoggerHook(tf.train.SessionRunHook):
  """Logs loss and runtime."""
 
  def begin(self):
    self._step = -1
    self._start_time = time.time()
 
  def before_run(self, run_context):
    self._step += 1
    return tf.train.SessionRunArgs(loss)  # Asks for loss value.
 
  def after_run(self, run_context, run_values):
    if self._step % FLAGS.log_frequency == 0:
      current_time = time.time()
      duration = current_time - self._start_time
      self._start_time = current_time
 
      loss_value = run_values.results
      examples_per_sec = FLAGS.log_frequency * FLAGS.batch_size / duration
      sec_per_batch = float(duration / FLAGS.log_frequency)
 
      format_str = ('%s: step %d, loss = %.2f (%.1f examples/sec; %.3f '
                    'sec/batch)')
      print (format_str % (datetime.now(), self._step, loss_value,
                           examples_per_sec, sec_per_batch))
 
with tf.train.MonitoredTrainingSession(
    checkpoint_dir=FLAGS.train_dir,
    hooks=[tf.train.StopAtStepHook(last_step=FLAGS.max_steps),
           tf.train.NanTensorHook(loss),
           _LoggerHook()],
    config=tf.ConfigProto(
        log_device_placement=FLAGS.log_device_placement)) as mon_sess:
  while not mon_sess.should_stop():
    mon_sess.run(train_op)


Here the real computation starts. Instead of the plain tf.Session() used before, the code uses tf.train.MonitoredTrainingSession, whose advantage is that it saves and restores checkpoint files automatically (every 10 minutes by default), so we don't have to write that code ourselves. checkpoint_dir is the directory to save to, tf.train.StopAtStepHook stops training after the given number of steps, tf.train.NanTensorHook monitors the loss and stops training if it becomes NaN, and _LoggerHook prints the time, step, loss and throughput, in the following format:

2018-05-22 15:57:15.755749: step 3370, loss = 1.22 (475.8 examples/sec; 0.269 sec/batch)
2018-05-22 15:57:18.434435: step 3380, loss = 1.12 (477.8 examples/sec; 0.268 sec/batch)
2018-05-22 15:57:21.054679: step 3390, loss = 1.30 (488.5 examples/sec; 0.262 sec/batch)
2018-05-22 15:57:23.721501: step 3400, loss = 1.22 (480.0 examples/sec; 0.267 sec/batch)
2018-05-22 15:57:26.337015: step 3410, loss = 1.21 (489.4 examples/sec; 0.262 sec/batch)
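For completeness, a toy sketch of MonitoredTrainingSession with a stand-in loss and train op; the checkpoint directory is made up, and save_checkpoint_secs is the knob that changes the default 10-minute checkpoint interval:

import tensorflow as tf

# Toy sketch: MonitoredTrainingSession saves and restores checkpoints on its
# own; save_checkpoint_secs overrides the default 600-second (10-minute) interval.
global_step = tf.train.get_or_create_global_step()
loss = tf.random_uniform([])                    # stand-in for a real loss
tf.summary.scalar('loss', loss)
train_op = tf.assign_add(global_step, 1)        # stand-in for a real train op

with tf.train.MonitoredTrainingSession(
    checkpoint_dir='/tmp/toy_train',            # hypothetical directory
    save_checkpoint_secs=60,
    hooks=[tf.train.StopAtStepHook(last_step=100),
           tf.train.NanTensorHook(loss)]) as mon_sess:
  while not mon_sess.should_stop():
    mon_sess.run(train_op)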

 


Summary:

This model is a bit more involved than the two MNIST models trained earlier, but the overall process is much the same (a minimal skeleton follows the list):

1. Define the network structure and the forward-pass output

2. Define the loss function and choose a backpropagation optimization algorithm

3. Create a session and repeatedly run the optimization algorithm on the training data
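As promised, a minimal skeleton of these three steps (a one-parameter toy model, not CIFAR-10):

import tensorflow as tf

# Minimal skeleton of the three steps, fitting one parameter to one toy point.
x = tf.placeholder(tf.float32, [None, 1])
y = tf.placeholder(tf.float32, [None, 1])

w = tf.Variable(tf.zeros([1, 1]))                  # 1. network / forward pass
pred = tf.matmul(x, w)

loss = tf.reduce_mean(tf.square(pred - y))         # 2. loss + optimizer
train_op = tf.train.GradientDescentOptimizer(0.1).minimize(loss)

with tf.Session() as sess:                         # 3. session + training loop
  sess.run(tf.global_variables_initializer())
  for _ in range(100):
    sess.run(train_op, feed_dict={x: [[1.0]], y: [[2.0]]})
  print(sess.run(w))                               # ~[[2.0]]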
