Implementing a Residual Network in TensorFlow (MNIST Dataset)

Tom Hardy

Article link: https://blog.csdn.net/qq_29462849/article/details/80744522

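This article shows how to implement a residual network with the TensorFlow framework, using the MNIST dataset as a worked example. Residual networks can reach depths of about 1000 layers. Solid and dashed shortcut connections handle the cases of matching and differing channel counts respectively. The code distinguishes identity_block from conv_block, for the cases where input and output sizes are the same or different, and uses the three-convolution structure for speed.

Summary generated automatically by CSDN.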

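Introduction

Residual networks (ResNet), the landmark work of Kaiming He, perform very well and can be as deep as 1000 layers. They are nevertheless not that hard to implement. Here a residual network is implemented on the MNIST dataset with TensorFlow as the framework, though only a fairly shallow one.

As shown in the figure below: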

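The solid-line connections indicate that the channel counts match. For example, the first and third pink rectangles in the figure above are both 3x3x64 (3x3 convolutions with 64 channels). Because the channels match, the output is computed as H(x)=F(x)+x.
The dashed-line connections indicate that the channel counts differ. For example, the first and third green rectangles in the figure above are 3x3x64 and 3x3x128 respectively. Because the channels differ, the output is computed as H(x)=F(x)+Wx, where W is a convolution that adjusts the dimensions of x to match F(x).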
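The two shortcut formulas can be illustrated at a single pixel, where a 1x1 convolution reduces to a matrix-vector product over the channel axis. The following is a minimal sketch in plain Python, not the article's TensorFlow code: the helper names (`matvec`, `residual_branch`, `identity_shortcut`, `projection_shortcut`), the toy weights, and the choice of F as one linear map followed by ReLU are all assumptions made for illustration.

```python
# Illustrative sketch only: plain-Python stand-ins, not the article's TensorFlow code.

def matvec(W, x):
    """A 1x1 convolution at one pixel is a matrix-vector product over channels."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def relu(v):
    return [max(0.0, a) for a in v]

def residual_branch(x, W_f):
    """Hypothetical F(x): a single linear map followed by ReLU."""
    return relu(matvec(W_f, x))

def identity_shortcut(x, W_f):
    """H(x) = F(x) + x. Only valid when F(x) and x have the same channel count."""
    fx = residual_branch(x, W_f)
    assert len(fx) == len(x), "identity shortcut needs matching channels"
    return [a + b for a, b in zip(fx, x)]

def projection_shortcut(x, W_f, W_s):
    """H(x) = F(x) + W x. W_s projects x to the channel count of F(x)."""
    fx = residual_branch(x, W_f)
    sx = matvec(W_s, x)
    assert len(fx) == len(sx), "projection must match the branch output"
    return [a + b for a, b in zip(fx, sx)]

x = [1.0, -2.0]                                  # 2 input channels at one pixel

W_same = [[1.0, 0.0], [0.0, 1.0]]                # F keeps 2 channels
h_same = identity_shortcut(x, W_same)

W_wide = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]    # F widens to 3 channels
W_proj = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]    # W maps x from 2 to 3 channels
h_wide = projection_shortcut(x, W_wide, W_proj)

print(h_same, h_wide)
```

When the branch changes the channel count, adding x directly is not defined, which is why the dashed connection needs the extra convolution W.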

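Depending on whether the input and output sizes are the same, blocks are further divided into identity_block and conv_block. Each kind of block comes in the two forms shown in the figure above, with three convolutions or with two. The three-convolution form is somewhat faster, so it is the one used here. The implementation is in the code below: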
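As a side calculation before the listing, counting weights gives a rough sense of why the three-convolution form can be cheaper. The sketch below assumes a 1x1, 3x3, 1x1 layout for the three-convolution block and two 3x3 layers for the two-convolution block. The channel sizes (256 wide, reduced to 64 inside the block) are illustrative assumptions, not values from this article's code, and the function names are hypothetical.

```python
# Rough weight count for the two block styles (biases ignored).
# Channel sizes are illustrative assumptions, not taken from the article's code.

def conv_params(kernel, c_in, c_out):
    """Number of weights in a kernel x kernel convolution."""
    return kernel * kernel * c_in * c_out

def two_conv_block(channels):
    """Two 3x3 convolutions at full width."""
    return 2 * conv_params(3, channels, channels)

def three_conv_block(channels, reduced):
    """1x1 reduce, 3x3 at reduced width, 1x1 restore."""
    return (conv_params(1, channels, reduced)
            + conv_params(3, reduced, reduced)
            + conv_params(1, reduced, channels))

basic = two_conv_block(256)
bottleneck = three_conv_block(256, 64)
print(basic, bottleneck, round(basic / bottleneck, 1))
```

Fewer weights generally means fewer multiply-adds per pixel, which is consistent with the speed claim above, though actual runtime also depends on hardware and layer sizes. The listing that follows uses the three-convolution form.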

# Residual network on the MNIST dataset; runs directly under TensorFlow 1.x
from tensorflow.examples.tutorials.mnist import input_data
import tensorflow as tf
# Load MNIST with one-hot encoded labels
mnist = input_data.read_data_sets('MNIST_data', one_hot=True)

#x=mnist.train.images
#y=mnist.train.labels
#X=mnist.test.images
#Y=mnist.test.labels
x = tf.placeholder(tf.float32, [None,784])
y = tf.placeholder(tf.float32, [None, 10])
sess = tf.InteractiveSession()

def weight_variable(shape):
  # Draw the initial values from a truncated normal distribution
  initial = tf.truncated_normal(shape, mean=0, stddev=0.1)
  # Wrap the initial values in a trainable variable
  return tf.Variable(initial)

def bias_variable(shape):
  initial = tf.constant(0.1, shape=shape)
  return tf.Variable(initial)

# Define the residual network's identity block, where the input and output dimensions are the same
def identity_block(X_input, kernel_size