##### LeNet: all convolutions are 5x5, padding = VALID
x = tf.placeholder(tf.float32, (None, 32, 32, 1))
Layer 1: Convolutional. The output shape should be 28x28x6.
mu = 0       # weight-init mean (hyperparameter)
sigma = 0.1  # weight-init stddev (hyperparameter)
conv1_W = tf.Variable(tf.truncated_normal(shape=(5, 5, 1, 6), mean = mu, stddev = sigma))
conv1_b = tf.Variable(tf.zeros(6))
conv1 = tf.nn.conv2d(x, conv1_W, strides=[1, 1, 1, 1], padding='VALID') + conv1_b
Activation. Your choice of activation function.
Pooling. The output shape should be 14x14x6.
conv1 = tf.nn.max_pool(conv1, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='VALID')
Layer 2: Convolutional. The output shape should be 10x10x16.
conv2_W = tf.Variable(tf.truncated_normal(shape=(5, 5, 6, 16), mean = mu, stddev = sigma))
Activation. Your choice of activation function.
Pooling. The output shape should be 5x5x16.
Flatten. Flatten the output of the final pooling layer so that it is 1D instead of 3D. The easiest way to do this is with tf.contrib.layers.flatten, which is already imported for you.
Input = 5x5x16. Output = 400
Layer 3: Fully Connected. This should have 120 outputs.
fc1_W = tf.Variable(tf.truncated_normal(shape=(400, 120), mean = mu, stddev = sigma))
Activation. Your choice of activation function.
Layer 4: Fully Connected. This should have 84 outputs.
Activation. Your choice of activation function.
Layer 5: Fully Connected (Logits). This should have 10 outputs.
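Putting the fragments above together, here is a minimal sketch of the remaining LeNet forward pass (variable names such as conv2_b, fc0, fc2_W and fc3_b are mine, following the same pattern as conv1_W and fc1_W above; ReLU is used as the activation):
conv1 = tf.nn.relu(conv1)                                   # activation after layer 1
conv2_b = tf.Variable(tf.zeros(16))
conv2 = tf.nn.conv2d(conv1, conv2_W, strides=[1, 1, 1, 1], padding='VALID') + conv2_b  # 10x10x16
conv2 = tf.nn.relu(conv2)
conv2 = tf.nn.max_pool(conv2, ksize=[1, 2, 2, 1], strides=[1, 2, 2, 1], padding='VALID')  # 5x5x16
fc0 = tf.contrib.layers.flatten(conv2)                      # 400
fc1_b = tf.Variable(tf.zeros(120))
fc1 = tf.nn.relu(tf.matmul(fc0, fc1_W) + fc1_b)             # 120
fc2_W = tf.Variable(tf.truncated_normal(shape=(120, 84), mean = mu, stddev = sigma))
fc2_b = tf.Variable(tf.zeros(84))
fc2 = tf.nn.relu(tf.matmul(fc1, fc2_W) + fc2_b)             # 84
fc3_W = tf.Variable(tf.truncated_normal(shape=(84, 10), mean = mu, stddev = sigma))
fc3_b = tf.Variable(tf.zeros(10))
logits = tf.matmul(fc2, fc3_W) + fc3_b                      # 10 logits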
###### AlexNet: 11x11 convolution with 96 filters combined with LRN and max pooling; the FC layers are followed by dropout
x = tf.placeholder(tf.float32, [1, 227, 227, 3])
"""build model"""
conv1 = convLayer(self.X, 11, 11, 4, 4, 96, "conv1", "VALID")   # 55x55x96
lrn1 = LRN(conv1, 2, 2e-05, 0.75, "norm1")
pool1 = maxPoolLayer(lrn1, 3, 3, 2, 2, "pool1", "VALID")        # 27x27x96
conv2 = convLayer(pool1, 5, 5, 1, 1, 256, "conv2", groups = 2)  # 27x27x256 (grouped conv)
lrn2 = LRN(conv2, 2, 2e-05, 0.75, "lrn2")
pool2 = maxPoolLayer(lrn2, 3, 3, 2, 2, "pool2", "VALID")        # 13x13x256
conv3 = convLayer(pool2, 3, 3, 1, 1, 384, "conv3")              # 13x13x384
conv4 = convLayer(conv3, 3, 3, 1, 1, 384, "conv4", groups = 2)  # 13x13x384
conv5 = convLayer(conv4, 3, 3, 1, 1, 256, "conv5", groups = 2)  # 13x13x256
pool5 = maxPoolLayer(conv5, 3, 3, 2, 2, "pool5", "VALID")       # 6x6x256
fcIn = tf.reshape(pool5, [-1, 256 * 6 * 6])                     # flatten to 9216
fc1 = fcLayer(fcIn, 256 * 6 * 6, 4096, True, "fc6")
dropout1 = dropout(fc1, self.KEEPPRO)
fc2 = fcLayer(dropout1, 4096, 4096, True, "fc7")
dropout2 = dropout(fc2, self.KEEPPRO)
self.fc3 = fcLayer(dropout2, 4096, self.CLASSNUM, True, "fc8")  # logits for CLASSNUM classes
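The helpers convLayer, LRN, maxPoolLayer, fcLayer and dropout are assumed to be defined elsewhere in the class. As a hedged sketch (not the original implementation), the two less obvious ones could look roughly like this, with groups = 2 reproducing AlexNet's original two-GPU channel split:
def LRN(x, R, alpha, beta, name=None, bias=1.0):
    # Thin wrapper over TF's built-in local response normalization.
    return tf.nn.local_response_normalization(x, depth_radius=R, alpha=alpha, beta=beta, bias=bias, name=name)

def convLayer(x, kHeight, kWidth, strideX, strideY, featureNum, name, padding="SAME", groups=1):
    # Grouped convolution: split input channels and filters into `groups` parts, convolve each pair, concatenate.
    channel = int(x.get_shape()[-1])
    conv = lambda a, w: tf.nn.conv2d(a, w, strides=[1, strideY, strideX, 1], padding=padding)
    with tf.variable_scope(name) as scope:
        w = tf.get_variable("w", shape=[kHeight, kWidth, channel // groups, featureNum])
        b = tf.get_variable("b", shape=[featureNum])
        xSplit = tf.split(x, num_or_size_splits=groups, axis=3)
        wSplit = tf.split(w, num_or_size_splits=groups, axis=3)
        featureMaps = [conv(t, k) for t, k in zip(xSplit, wSplit)]
        merged = tf.concat(featureMaps, axis=3)
        return tf.nn.relu(tf.nn.bias_add(merged, b), name=scope.name)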
### VGG16: 3x3 convolution kernels with padding 1 throughout; softmax at the end.
images = tf.placeholder("float", [None, 224, 224, 3])
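Only the input placeholder is kept here. As a rough sketch (assuming slim = tf.contrib.slim, as used in the GoogLeNet code below; the conv_block helper and block names are mine), each VGG16 stage is a stack of 3x3/stride-1 SAME convolutions followed by a 2x2 max pool, ending in three fully connected layers and a softmax:
def conv_block(x, num_convs, out_channels, name):
    # One VGG stage: num_convs 3x3 stride-1 SAME convolutions (ReLU), then 2x2 max pooling.
    with tf.variable_scope(name):
        for i in range(num_convs):
            x = slim.conv2d(x, out_channels, [3, 3], scope='conv%d' % (i + 1))
        return slim.max_pool2d(x, [2, 2], stride=2, scope='pool')

net = conv_block(images, 2, 64, 'block1')   # 224 -> 112
net = conv_block(net, 2, 128, 'block2')     # 112 -> 56
net = conv_block(net, 3, 256, 'block3')     # 56 -> 28
net = conv_block(net, 3, 512, 'block4')     # 28 -> 14
net = conv_block(net, 3, 512, 'block5')     # 14 -> 7
net = slim.flatten(net)                     # 7*7*512 = 25088
net = slim.fully_connected(net, 4096, scope='fc6')
net = slim.fully_connected(net, 4096, scope='fc7')
logits = slim.fully_connected(net, 1000, activation_fn=None, scope='fc8')
probs = tf.nn.softmax(logits)               # class probabilities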
### GoogLeNet: this structure stacks the convolutions commonly used in CNNs (1x1, 3x3, 5x5) together with 3x3 pooling (the spatial size after each convolution/pooling is kept the same, and the resulting channels are concatenated). On one hand this widens the network; on the other hand it improves its adaptability to scale. Near the input, similar to AlexNet, there is a large-kernel convolution (7x7) combined with LRN; at the very end, average pooling replaces the fully connected layers before the classifier.
Input size is 224.
In this original Inception version, however, every kernel operates on all outputs of the previous layer, and the computation needed by the 5x5 kernels becomes too large, producing very thick feature maps. To avoid this, 1x1 convolutions are added before the 3x3, before the 5x5, and after the max pooling to reduce the feature-map depth; this gives the Inception v1 module (a minimal sketch of it follows the two blocks below).
The blocks below, taken from an Inception-ResNet implementation, are similar in spirit:
def block17(net, scale=1.0, activation_fn=tf.nn.relu, scope=None, reuse=None):
    """Builds the 17x17 resnet block."""
    with tf.variable_scope(scope, 'Block17', [net], reuse=reuse):
        with tf.variable_scope('Branch_0'):
            tower_conv = slim.conv2d(net, 128, 1, scope='Conv2d_1x1')
        with tf.variable_scope('Branch_1'):
            tower_conv1_0 = slim.conv2d(net, 128, 1, scope='Conv2d_0a_1x1')
            tower_conv1_1 = slim.conv2d(tower_conv1_0, 128, [1, 7],
                                        scope='Conv2d_0b_1x7')
            tower_conv1_2 = slim.conv2d(tower_conv1_1, 128, [7, 1],
                                        scope='Conv2d_0c_7x1')
        mixed = tf.concat([tower_conv, tower_conv1_2], 3)
        up = slim.conv2d(mixed, net.get_shape()[3], 1, normalizer_fn=None,
                         activation_fn=None, scope='Conv2d_1x1')
        net += scale * up
        if activation_fn:
            net = activation_fn(net)
    return net
def reduction_a(net, k, l, m, n):
    """Reduction block: stride-2 conv/pool branches halve the spatial size, outputs are concatenated."""
    with tf.variable_scope('Branch_0'):
        tower_conv = slim.conv2d(net, n, 3, stride=2, padding='VALID',
                                 scope='Conv2d_1a_3x3')
    with tf.variable_scope('Branch_1'):
        tower_conv1_0 = slim.conv2d(net, k, 1, scope='Conv2d_0a_1x1')
        tower_conv1_1 = slim.conv2d(tower_conv1_0, l, 3,
                                    scope='Conv2d_0b_3x3')
        tower_conv1_2 = slim.conv2d(tower_conv1_1, m, 3,
                                    stride=2, padding='VALID',
                                    scope='Conv2d_1a_3x3')
    with tf.variable_scope('Branch_2'):
        tower_pool = slim.max_pool2d(net, 3, stride=2, padding='VALID',
                                     scope='MaxPool_1a_3x3')
    net = tf.concat([tower_conv, tower_conv1_2, tower_pool], 3)
    return net
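For comparison, here is a minimal sketch of the plain Inception v1 module described above (assuming slim is available; the function name and parameters are mine, not from a specific library). The 1x1 reductions before the 3x3/5x5 branches and the 1x1 projection after pooling are what keep the concatenated output thin:
def inception_v1_module(net, ch1x1, ch3x3_reduce, ch3x3, ch5x5_reduce, ch5x5, pool_proj, scope):
    # All branches use stride 1 and SAME padding, so only the channel dimension grows on concat.
    with tf.variable_scope(scope):
        with tf.variable_scope('Branch_0'):
            branch_0 = slim.conv2d(net, ch1x1, 1, scope='Conv2d_1x1')
        with tf.variable_scope('Branch_1'):
            branch_1 = slim.conv2d(net, ch3x3_reduce, 1, scope='Conv2d_0a_1x1')  # 1x1 reduction
            branch_1 = slim.conv2d(branch_1, ch3x3, 3, scope='Conv2d_0b_3x3')
        with tf.variable_scope('Branch_2'):
            branch_2 = slim.conv2d(net, ch5x5_reduce, 1, scope='Conv2d_0a_1x1')  # 1x1 reduction
            branch_2 = slim.conv2d(branch_2, ch5x5, 5, scope='Conv2d_0b_5x5')
        with tf.variable_scope('Branch_3'):
            branch_3 = slim.max_pool2d(net, 3, stride=1, padding='SAME', scope='MaxPool_0a_3x3')
            branch_3 = slim.conv2d(branch_3, pool_proj, 1, scope='Conv2d_0b_1x1')  # 1x1 projection
    return tf.concat([branch_0, branch_1, branch_2, branch_3], 3)

# e.g. the first inception block (3a) of GoogLeNet would be roughly:
# net = inception_v1_module(net, 64, 96, 128, 16, 32, 32, 'Mixed_3a')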
#### ResNet: bottleneck-style residual blocks, input size 224
def res_block_3_layers(self, bottom, channel_list, name, change_dimension = False, block_stride = 1):
    """
    bottom: input tensor (X)
    channel_list: number of output channels for each of the 3 conv layers
    name: block name
    """
    if (change_dimension):
        # Project the shortcut with a 1x1 conv so its shape matches the block output.
        short_cut_conv = self.conv_layer(bottom, 1, bottom.get_shape().as_list()[-1], channel_list[2], block_stride, name + "_ShortcutConv")
        block_conv_input = self.batch_norm(short_cut_conv)
    else:
        block_conv_input = bottom
    # Bottleneck: 1x1 reduce -> 3x3 -> 1x1 expand, each followed by batch norm.
    block_conv_1 = self.conv_layer(bottom, 1, bottom.get_shape().as_list()[-1], channel_list[0], block_stride, name + "_localConv1")
    block_norm_1 = self.batch_norm(block_conv_1)
    block_relu_1 = tf.nn.relu(block_norm_1)
    block_conv_2 = self.conv_layer(block_relu_1, 3, channel_list[0], channel_list[1], 1, name + "_localConv2")
    block_norm_2 = self.batch_norm(block_conv_2)
    block_relu_2 = tf.nn.relu(block_norm_2)
    block_conv_3 = self.conv_layer(block_relu_2, 1, channel_list[1], channel_list[2], 1, name + "_localConv3")
    block_norm_3 = self.batch_norm(block_conv_3)
    # Residual connection: add shortcut and main path, then apply the final ReLU.
    block_res = tf.add(block_conv_input, block_norm_3)
    relu = tf.nn.relu(block_res)
    return relu
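conv_layer and batch_norm are assumed to be helper methods of the same class. A hedged usage sketch of how such blocks are typically stacked (pool1 stands for the output of a hypothetical 7x7/stride-2 stem plus 3x3 max pool; the [64, 64, 256] / 3-block layout matches the first stage of ResNet-50):
# First residual stage (56x56 input), ResNet-50-style:
block = self.res_block_3_layers(pool1, [64, 64, 256], "res2a", change_dimension = True, block_stride = 1)
block = self.res_block_3_layers(block, [64, 64, 256], "res2b")
block = self.res_block_3_layers(block, [64, 64, 256], "res2c")
# Later stages downsample by passing change_dimension = True and block_stride = 2:
block = self.res_block_3_layers(block, [128, 128, 512], "res3a", change_dimension = True, block_stride = 2)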