Learning TensorFlow: Handwritten Digit Recognition with a CNN



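I. Network Architecture

The network consists of 2 convolutional layers, 2 pooling layers, and 2 fully connected layers:

Input → Conv → ReLU → max pooling → Conv → ReLU → max pooling → FC → Output

  1. Input
    A 4-D tensor: [batch_size, image_width, image_height, channels], giving the number of examples in each gradient-descent mini-batch, the image width, the image height, and the number of channels (3 for color images [Red, Green, Blue], 1 for monochrome images).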

    import tensorflow as tf  # this walkthrough targets the TensorFlow 1.x API (r1.1)

    # Input Layer
    # Reshape X to 4-D tensor: [batch_size, width, height, channels]
    # MNIST images are 28x28 pixels, and have one color channel
    input_layer = tf.reshape(features, [-1, 28, 28, 1])
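
The `-1` in the target shape tells `tf.reshape` to infer that dimension from the total number of elements, which is how the batch size stays flexible. The following is a minimal pure-Python sketch of that inference; the helper `infer_shape` is hypothetical, written only for illustration, and is not part of TensorFlow.

```python
def infer_shape(total_elements, shape):
    """Resolve a single -1 entry in `shape` from the total element count."""
    if shape.count(-1) > 1:
        raise ValueError("only one dimension may be -1")
    known = 1
    for dim in shape:
        if dim != -1:
            known *= dim
    if -1 not in shape:
        if known != total_elements:
            raise ValueError("shape does not match the element count")
        return list(shape)
    if known == 0 or total_elements % known != 0:
        raise ValueError("the -1 dimension cannot be inferred")
    return [total_elements // known if dim == -1 else dim for dim in shape]


# 100 flattened MNIST images hold 100 * 784 pixel values.
batch_shape = infer_shape(100 * 784, [-1, 28, 28, 1])
print(batch_shape)  # [100, 28, 28, 1]
```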


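  2. Convolutional layer #1:
    Applies 32 filters (kernels) of size 5x5 to the input image, so the output has 32 channels; each filter only sees a local 5x5 patch at a time (local receptive field). The input is zero-padded so that the output width and height match the input, and ReLU is used as the activation function to introduce non-linearity.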

    # Convolutional Layer #1
    # Computes 32 features using a 5x5 filter with ReLU activation.
    # Padding is added to preserve width and height.
    # Input Tensor Shape: [batch_size, 28, 28, 1]
    # Output Tensor Shape: [batch_size, 28, 28, 32]
    conv1 = tf.layers.conv2d(
        inputs=input_layer,
        filters=32,
        kernel_size=[5, 5],
        padding="same",
        activation=tf.nn.relu)
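
To see why `padding="same"` keeps the 28x28 size, it can help to compute the output size by hand. The sketch below follows TensorFlow's documented rules for `"same"` and `"valid"` padding; the function name `conv_output_size` is an illustrative assumption, not a TensorFlow API.

```python
import math


def conv_output_size(input_size, kernel_size, stride=1, padding="same"):
    """Output width or height of a convolution under TensorFlow's padding rules."""
    if padding == "same":
        # Zero padding is added so only the stride shrinks the output.
        return math.ceil(input_size / stride)
    if padding == "valid":
        # No padding: the kernel must fit entirely inside the input.
        return math.ceil((input_size - kernel_size + 1) / stride)
    raise ValueError("padding must be 'same' or 'valid'")


same_size = conv_output_size(28, 5, padding="same")
valid_size = conv_output_size(28, 5, padding="valid")
print(same_size, valid_size)  # 28 24
```

Without padding, each 5x5 convolution would trim 4 pixels from the width and height, which is why the tutorial pads the input.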


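  3. Pooling layer #1
    Downsamples the output of convolutional layer #1 by max pooling with a 2x2 filter (stride=2). This halves the width and height, which cuts the amount of data passed to later layers and helps reduce overfitting.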

    # Pooling Layer #1
    # First max pooling layer with a 2x2 filter and stride of 2
    # Input Tensor Shape: [batch_size, 28, 28, 32]
    # Output Tensor Shape: [batch_size, 14, 14, 32]
    pool1 = tf.layers.max_pooling2d(inputs=conv1, pool_size=[2, 2], strides=2)
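
As a rough illustration of what 2x2 max pooling with stride 2 does to a single channel, here is a small pure-Python sketch. The function `max_pool_2x2` and the 4x4 sample values are made up for the example; the real layer does the same thing per channel on tensors.

```python
def max_pool_2x2(image):
    """2x2 max pooling with stride 2 on a 2-D list with even height and width."""
    rows, cols = len(image), len(image[0])
    return [
        [
            max(image[r][c], image[r][c + 1], image[r + 1][c], image[r + 1][c + 1])
            for c in range(0, cols, 2)
        ]
        for r in range(0, rows, 2)
    ]


image = [
    [1, 3, 2, 0],
    [4, 2, 1, 5],
    [0, 1, 9, 8],
    [2, 6, 7, 3],
]
pooled = max_pool_2x2(image)
print(pooled)  # [[4, 5], [6, 9]]
```

Each non-overlapping 2x2 window is replaced by its largest value, so a 4x4 input becomes 2x2, just as 28x28 becomes 14x14 in the layer above.
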


  4. Convolutional layer #2
    Applies 64 filters (kernels) of size 5x5 to the output of pooling layer #1, again with ReLU as the activation function.

    # Convolutional Layer #2
    # Computes 64 features using a 5x5 filter.
    # Padding is added to preserve width and height.
    # Input Tensor Shape: [batch_size, 14, 14, 32]
    # Output Tensor Shape: [batch_size, 14, 14, 64]
    conv2 = tf.layers.conv2d(
        inputs=pool1,
        filters=64,
        kernel_size=[5, 5],
        padding="same",
        activation=tf.nn.relu)


  5. Pooling layer #2
    Downsamples the output of convolutional layer #2 by max pooling with a 2x2 filter (stride=2).

    # Pooling Layer #2
    # Second max pooling layer with a 2x2 filter and stride of 2
    # Input Tensor Shape: [batch_size, 14, 14, 64]
    # Output Tensor Shape: [batch_size, 7, 7, 64]
    pool2 = tf.layers.max_pooling2d(inputs=conv2, pool_size=[2, 2], strides=2)
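
The shape comments in the convolution and pooling layers can be checked with a short trace. This is only a sketch in plain Python; `trace_shapes` is a hypothetical helper that hard-codes the two assumptions stated in the text: "same" padding keeps width and height, and 2x2 pooling with stride 2 halves them.

```python
def trace_shapes(height=28, width=28, channels=1):
    """Per-example shapes after each conv/pool layer of the network described above."""
    shapes = [("input", (height, width, channels))]
    for name, filters in (("1", 32), ("2", 64)):
        channels = filters  # "same" padding with stride 1 keeps height and width
        shapes.append(("conv" + name, (height, width, channels)))
        height, width = height // 2, width // 2  # 2x2 pooling with stride 2
        shapes.append(("pool" + name, (height, width, channels)))
    return shapes


shapes = trace_shapes()
for layer_name, shape in shapes:
    print(layer_name, shape)
```

The last entry, (7, 7, 64), is where the 7 * 7 * 64 = 3136 features fed to the fully connected layer come from.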


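  6. Fully connected layer #1
    First flatten the output of pooling layer #2 into a 2-D matrix of shape [batch_size, 7*7*64], then connect it to a dense layer of 1024 neurons. Dropout is applied with rate=0.4: during training each output element is dropped at random with probability 0.4 (kept with probability 0.6), which helps reduce overfitting.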

    # Flatten tensor into a batch of vectors
    # Input Tensor Shape: [batch_size, 7, 7, 64]
    # Output Tensor Shape: [batch_size, 7 * 7 * 64]
    pool2_flat = tf.reshape(pool2, [-1, 7 * 7 * 64])

    # Dense Layer
    # Densely connected layer with 1024 neurons
    # Input Tensor Shape: [batch_size, 7 * 7 * 64]
    # Output Tensor Shape: [batch_size, 1024]
    dense = tf.layers.dense(inputs=pool2_flat, units=1024, activation=tf.nn.relu)

    # Add dropout operation; 0.6 probability that element will be kept
    # (learn is tensorflow.contrib.learn; dropout is only active in TRAIN mode)
    dropout = tf.layers.dropout(
        inputs=dense, rate=0.4, training=mode == learn.ModeKeys.TRAIN)
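
The `rate` and `training` arguments are easier to reason about with a tiny model of the operation. The sketch below is a plain-Python approximation of inverted dropout, which is the scheme `tf.layers.dropout` uses: dropped elements become 0 and kept elements are scaled by 1 / (1 - rate). The function name and the `rng` parameter are assumptions made for this example.

```python
import random


def dropout(values, rate, training, rng=random):
    """Inverted dropout: zero each value with probability `rate`, rescale the rest."""
    if not training or rate == 0.0:
        return list(values)  # evaluation and prediction leave values untouched
    keep_prob = 1.0 - rate
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]


rng = random.Random(0)  # seeded only to make the demo repeatable
activations = [1.0] * 10
print(dropout(activations, rate=0.4, training=True, rng=rng))
print(dropout(activations, rate=0.4, training=False))
```

The rescaling keeps the expected value of each activation the same in training and evaluation, so no extra correction is needed at prediction time.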


  7. Output
    A dense layer of 10 neurons, one for each digit 0-9; it outputs the raw logits.

    # Logits layer
    # Input Tensor Shape: [batch_size, 1024]
    # Output Tensor Shape: [batch_size, 10]
    logits = tf.layers.dense(inputs=dropout, units=10)
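
With all layers defined, the number of trainable parameters can be worked out from the shapes given above. This is a back-of-the-envelope sketch, assuming every layer has one bias per output channel or unit (the default for `tf.layers.conv2d` and `tf.layers.dense`); the helper names are illustrative.

```python
def conv_params(kernel, in_channels, filters):
    """Weights plus biases of a conv layer with a square kernel."""
    return kernel * kernel * in_channels * filters + filters


def dense_params(inputs, units):
    """Weights plus biases of a fully connected layer."""
    return inputs * units + units


layer_params = {
    "conv1": conv_params(5, 1, 32),
    "conv2": conv_params(5, 32, 64),
    "dense": dense_params(7 * 7 * 64, 1024),
    "logits": dense_params(1024, 10),
}
total_params = sum(layer_params.values())
for layer_name, count in layer_params.items():
    print(layer_name, count)
print("total", total_params)
```

Almost all of the parameters sit in the first fully connected layer, which is one reason dropout is applied there.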


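II. Model Training

  1. One-hot encode the labels

    # The two tf.one_hot arguments used here:
    # indices: the positions that are set to 1 after one-hot encoding (all others are 0)
    # depth: the number of classes (for handwritten digits the targets are 0-9, so depth=10)
    onehot_labels = tf.one_hot(indices=tf.cast(labels, tf.int32), depth=10)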
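
For readers new to one-hot encoding, here is what the result looks like for a few labels. This is a plain-Python sketch that mirrors the default behavior of `tf.one_hot` (1.0 at the index, 0.0 elsewhere, all zeros for an out-of-range index); the function below is written for illustration only.

```python
def one_hot(indices, depth):
    """Return one row of length `depth` per index, with 1.0 at that index."""
    rows = []
    for index in indices:
        row = [0.0] * depth
        if 0 <= index < depth:  # out-of-range indices stay all zeros
            row[index] = 1.0
        rows.append(row)
    return rows


encoded = one_hot([3, 0, 9], depth=10)
for row in encoded:
    print(row)
```
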

  2. Compute the cross-entropy loss:

    loss = tf.losses.softmax_cross_entropy(onehot_labels=onehot_labels, logits=logits)
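
The loss combines a softmax over the logits with the cross-entropy against the one-hot labels. The following is a simplified pure-Python sketch of that computation, assuming default weights so that the result is the mean over the batch; it is not the TensorFlow implementation, and the function names are chosen for the example.

```python
import math


def softmax(logits):
    """Numerically stable softmax of one row of logits."""
    shift = max(logits)
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def softmax_cross_entropy(onehot_labels, logits):
    """Mean over the batch of -sum(label * log(softmax(logits)))."""
    losses = []
    for label_row, logit_row in zip(onehot_labels, logits):
        probs = softmax(logit_row)
        losses.append(
            -sum(y * math.log(p) for y, p in zip(label_row, probs) if y > 0)
        )
    return sum(losses) / len(losses)


# An untrained model that gives every digit the same logit is maximally unsure,
# so its loss for 10 classes is log(10).
uniform_loss = softmax_cross_entropy([[1.0] + [0.0] * 9], [[0.0] * 10])
print(uniform_loss)
```

A loss near log(10), roughly 2.30, at the start of training therefore means the model is doing no better than guessing.
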

  3. Configure the training op, with learning rate 0.001 and stochastic gradient descent (SGD) as the optimizer:

    train_op = tf.contrib.layers.optimize_loss(
        loss=loss,
        global_step=tf.contrib.framework.get_global_step(),
        learning_rate=0.001,
        optimizer="SGD")
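
Each training step moves every parameter a small distance against its gradient; "stochastic" refers to the gradient being estimated from one mini-batch rather than the full dataset. Below is a toy sketch of the update rule with the same learning rate, applied to a made-up one-parameter objective f(w) = (w - 3)^2 instead of the network loss.

```python
def sgd_step(params, grads, learning_rate=0.001):
    """One plain gradient-descent update: p <- p - learning_rate * g."""
    return [p - learning_rate * g for p, g in zip(params, grads)]


# Toy objective f(w) = (w - 3)^2 with gradient 2 * (w - 3); its minimum is at w = 3.
w = [0.0]
for _ in range(5000):
    grad = [2.0 * (w[0] - 3.0)]
    w = sgd_step(w, grad)
print(w[0])
```

With a learning rate this small, many steps are needed to get close to the minimum, which is consistent with the large step count used for training in this tutorial.
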

  4. Model predictions

    # Generate Predictions
    # classes: the predicted class, a value in 0-9
    # probabilities: the probability of each class, obtained by applying softmax to the logits
    predictions = {
        "classes": tf.argmax(input=logits, axis=1),
        "probabilities": tf.nn.softmax(logits, name="softmax_tensor")
    }
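
The two prediction outputs are closely related: softmax preserves the ordering of the logits, so the class with the largest logit is also the class with the highest probability. A small plain-Python sketch with made-up three-class logits (the helpers are illustrative, not TensorFlow calls):

```python
import math


def softmax(row):
    """Numerically stable softmax of one row of logits."""
    shift = max(row)
    exps = [math.exp(v - shift) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def argmax(row):
    """Index of the largest value in a row."""
    return max(range(len(row)), key=lambda i: row[i])


logits = [[2.0, 0.5, -1.0], [0.1, 0.2, 3.0]]
predictions = {
    "classes": [argmax(row) for row in logits],
    "probabilities": [softmax(row) for row in logits],
}
print(predictions["classes"])  # [0, 2]
```
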

  5. Create the Estimator, which returns a classifier that can be trained and evaluated

    # Create the Estimator
    # cnn_model_fn here is essentially a wrapper around all of the code above; see: https://www.tensorflow.org/tutorials/layers#building_the_cnn_mnist_classifier
    mnist_classifier = learn.Estimator(
        model_fn=cnn_model_fn, model_dir="/tmp/mnist_convnet_model")

  6. Train:

    # Train the model
    # (logging_hook is defined in the full source linked in section IV)
    mnist_classifier.fit(
        x=train_data,
        y=train_labels,
        batch_size=100,
        steps=20000,
        monitors=[logging_hook])
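
To get a feel for how much data `batch_size=100` and `steps=20000` cover, the number of passes over the training set can be estimated. The sketch below assumes a training set of 55,000 images; that figure is an assumption for illustration and is not stated in this article.

```python
def epochs_covered(steps, batch_size, num_examples):
    """How many full passes over the data a given number of steps amounts to."""
    return steps * batch_size / num_examples


# 55,000 is an assumed training-set size, not a figure from the text above.
epochs = epochs_covered(steps=20000, batch_size=100, num_examples=55000)
print(round(epochs, 1))  # 36.4
```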

III. Model Evaluation

  1. Configure the evaluation metric and run the evaluation

    # Configure the accuracy metric for evaluation
    metrics = {
        "accuracy":
            learn.MetricSpec(
                metric_fn=tf.metrics.accuracy, prediction_key="classes"),
    }

    # Evaluate the model and print results
    eval_results = mnist_classifier.evaluate(
        x=eval_data, y=eval_labels, metrics=metrics)
    print(eval_results)
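
The accuracy metric is the fraction of evaluation examples whose predicted class equals the true label. A minimal plain-Python sketch with made-up labels and predictions (not the streaming implementation behind `tf.metrics.accuracy`):

```python
def accuracy(labels, predicted_classes):
    """Fraction of predictions that match the labels."""
    if len(labels) != len(predicted_classes):
        raise ValueError("labels and predictions must have the same length")
    correct = sum(1 for y, p in zip(labels, predicted_classes) if y == p)
    return correct / len(labels)


# Hypothetical labels and predictions for four images.
score = accuracy([7, 2, 1, 0], [7, 2, 1, 9])
print(score)  # 0.75
```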

IV. Source Code

Full code: https://www.github.com/tensorflow/tensorflow/blob/r1.1/tensorflow/examples/tutorials/layers/cnn_mnist.py

Original article: https://www.tensorflow.org/tutorials/layers

