Classic Neural Networks: AlexNet

weixin_43350614

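Classic networks
AlexNet [2], named after its first author Alex Krizhevsky, is the model that won the 2012 ImageNet competition. The Caffe model files are at (2). Frankly, this model matters far more than the models that followed it. First, it demonstrated that CNNs remain effective as large, complex models. Second, its GPU implementation brought training time down to an acceptable range. Together these made both CNNs and GPUs hugely popular and gave supervised deep learning a strong push.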

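Key features of the AlexNet model:
It uses the ReLU nonlinearity as the activation function, which greatly shortens training time and helped make ReLU popular.
Its pooling layers use overlapping pooling (the window is larger than the stride), which helps reduce overfitting.
It adopts the then-new Dropout method: dropped neurons take part in neither the forward pass nor backpropagation, which reduces overfitting in the fully connected layers. The results on each dataset are presented in the translation of the paper.
It is trained on multiple GPUs, taking advantage of how well the computation parallelizes across GPUs.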

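The model structure is shown in the figure below:
[Figure: AlexNet architecture]
One peculiarity of this figure is that the convolutional part is drawn as an upper and a lower block, meaning that the feature maps computed at each layer are split into two halves. Which data from the previous layer each half uses is shown by the dashed connecting lines. For example, the dashed lines between the first and second layers after the input stay separate: the upper 128 maps of layer 2 are computed only from the upper 48 maps of layer 1, and likewise for the lower half. The dashed lines in front of layer 3 are fully crossed: each group of 192 maps is computed from all 128+128=256 maps of the previous layer.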

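AlexNet has a special layer, the LRN (local response normalization) layer, which smooths the output of the current layer. Below is a diagram I drew:
[Figure: LRN diagram]
The adjacent feature maps before and after (at the same spatial position) apply a smoothing constraint to the map in the middle. As defined in the AlexNet paper, the computation is:
$$ b^{i}_{x,y} = \frac{a^{i}_{x,y}}{\left(k + \alpha \sum_{j=\max(0,\, i-n/2)}^{\min(N-1,\, i+n/2)} \left(a^{j}_{x,y}\right)^{2}\right)^{\beta}} $$
where a^{i}_{x,y} is the activation of kernel i at position (x, y), N is the number of kernels in the layer, n is the number of adjacent maps summed over, and k, α, β are hyperparameters.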
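As a rough illustration of the normalization just described, here is a minimal pure-Python sketch that normalizes the activations at a single spatial position across channels. The function name `local_response_norm` is made up for this example, and the defaults (n=5, k=2, alpha=1e-4, beta=0.75) are the values reported in the AlexNet paper; the TensorFlow code later in this post passes different values to `tf.nn.lrn`.

```python
def local_response_norm(channels, n=5, k=2.0, alpha=1e-4, beta=0.75):
    """Normalize the activations at one spatial position across channels.

    channels: list of activations, one per feature map (hypothetical input).
    n: number of adjacent feature maps included in each sum.
    """
    num = len(channels)
    half = n // 2
    out = []
    for i, a in enumerate(channels):
        lo = max(0, i - half)            # clip the window at the first map
        hi = min(num - 1, i + half)      # clip the window at the last map
        sq_sum = sum(channels[j] ** 2 for j in range(lo, hi + 1))
        out.append(a / (k + alpha * sq_sum) ** beta)
    return out


activations = [1.0, 2.0, 3.0, 4.0]
normalized = local_response_norm(activations)
print(normalized)
```

Each output is the input divided by a factor that grows with the squared activations of its neighbors, so a strong response damps the maps next to it.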

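Note: the depth of a convolution kernel = the number of channels of the image being convolved
The number of convolution kernels = the number of output channels after the convolution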

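The detailed calculations are all written in the figure. Note that the input layer is 227×227, not the 224×224 stated in the paper. You can check the arithmetic: with 227 the conv1 computation (11×11 kernel, stride 4) divides evenly, (227 − 11) / 4 + 1 = 55, whereas with 224 it does not. If you insist on 224 you can get there with automatic padding, but padding at the input seems pointless, since the padding is all zeros anyway.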
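The divisibility argument can be checked with a few lines of Python. This is only a sketch: `conv_output_size` is a helper name invented here, and it assumes an unpadded convolution unless `padding` is given.

```python
def conv_output_size(input_size, kernel_size, stride, padding=0):
    """Return (output_size, remainder) along one spatial dimension.

    A non-zero remainder means the kernel does not tile the input evenly,
    so the last few rows/columns are never covered.
    """
    span = input_size + 2 * padding - kernel_size
    return span // stride + 1, span % stride


for size in (227, 224):
    out, rem = conv_output_size(size, kernel_size=11, stride=4)
    print(size, "->", out, "remainder", rem)
```

With 227 the remainder is 0 and the output is 55; with 224 one pixel is left over.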
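[Figure]
This is basically the same as above. The only thing to note is group=2: this attribute forcibly splits the feature maps from the previous layer into two groups, so the convolution is carried out in two separate parts.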
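One visible effect of group=2 is on the number of weights. The sketch below counts convolution weights with and without grouping, using the 5×5, 96-to-256-channel shape that conv2 has in the code later in this post. The helper `conv_weight_count` is hypothetical; note that the TensorFlow code in this post uses an ordinary ungrouped convolution.

```python
def conv_weight_count(kernel_h, kernel_w, in_channels, out_channels, groups=1):
    """Count the weights (biases excluded) of a possibly grouped convolution.

    With groups=g, each output channel only sees in_channels / g inputs.
    """
    if in_channels % groups or out_channels % groups:
        raise ValueError("channel counts must be divisible by groups")
    return kernel_h * kernel_w * (in_channels // groups) * out_channels


full = conv_weight_count(5, 5, 96, 256, groups=1)
grouped = conv_weight_count(5, 5, 96, 256, groups=2)
print(full, grouped)
```

Splitting into two groups halves the weight count, because each output map is connected to only 48 of the 96 input maps.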

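There is also a special dropout layer. In AlexNet, during training the output of each hidden-layer neuron is set to 0 with probability 1/2. This discards the outputs of about half of the nodes, and those nodes are not updated during backpropagation either.
[Figure: dropout]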
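A minimal sketch of the idea in pure Python follows. It uses the "inverted" variant, which scales the surviving outputs by 1/keep_prob at training time (this is what `tf.nn.dropout` does); the AlexNet paper instead multiplies the outputs by 0.5 at test time. The function name and the sample values are made up for illustration.

```python
import random


def dropout(values, keep_prob=0.5, rng=random):
    """Zero each value with probability 1 - keep_prob; scale the survivors."""
    scale = 1.0 / keep_prob
    return [v * scale if rng.random() < keep_prob else 0.0 for v in values]


rng = random.Random(0)                      # fixed seed so runs are repeatable
hidden = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0]     # hypothetical hidden-layer outputs
dropped = dropout(hidden, keep_prob=0.5, rng=rng)
print(dropped)
```

Every entry of the result is either 0 or twice the original value, so the expected output of each unit stays the same as without dropout.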

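The role of convolutional layers in a CNN
Convolutional layers, written as conv (short for convolution) in many network diagrams, play a key role in a CNN: abstracting and extracting features. This is an important difference between CNNs and traditional ANNs or SVMs. Traditional machine learning algorithms need the features to be specified by hand. In the classic HOG+SVM pedestrian detection pipeline, for example, HOG is the feature extraction method, so what is fed into the SVM classifier is the HOG features rather than the image itself. In a convolutional neural network, most feature extraction is done automatically by the convolutional layers, and deeper and wider convolutional layers generally have more representational power.
[Figure]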
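Pooling layers
The pooling operation is applied after convolution. Its purpose is feature aggregation and dimensionality reduction. It resembles a convolution, except that all of a pooling layer's parameters are hyperparameters; none of them are learned.

Max pooling example:
[Figure: max pooling example]
Here the kernel size is 2×2 and the stride is 2. Max pooling takes the maximum of all pixel values inside each 2×2 window and uses it as the pixel value of the output channel.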
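The 2×2, stride-2 case can be sketched in plain Python as below. The function `max_pool2d` and the sample feature map are invented for this example; it handles a single channel and no padding.

```python
def max_pool2d(image, size=2, stride=2):
    """Max-pool a single-channel 2-D list of numbers without padding."""
    rows = (len(image) - size) // stride + 1
    cols = (len(image[0]) - size) // stride + 1
    return [
        [
            max(image[r * stride + i][c * stride + j]
                for i in range(size) for j in range(size))
            for c in range(cols)
        ]
        for r in range(rows)
    ]


feature_map = [
    [1, 3, 2, 4],
    [5, 6, 7, 8],
    [3, 2, 1, 0],
    [1, 2, 3, 4],
]
pooled = max_pool2d(feature_map)
print(pooled)  # [[6, 8], [3, 4]]
```

Calling the same function with size=3 and stride=2 gives the overlapping pooling that AlexNet uses, where neighboring windows share a row or column.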

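Activation layers
Pooling is used only in the convolutional part, whereas activation is used in both the convolutional and the fully connected layers. Deep networks generally use ReLU, a piecewise linear function, as the activation function, as shown in the figure below. Its role is to add nonlinearity.
[Figure: ReLU activation function]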
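For reference, ReLU itself is a one-liner. The sketch below applies it to a handful of made-up inputs; the names are illustrative only.

```python
def relu(x):
    """Rectified linear unit: pass positive values through, clamp the rest to 0."""
    return max(0.0, x)


inputs = [-2.0, -0.5, 0.0, 0.5, 2.0]
outputs = [relu(x) for x in inputs]
print(outputs)  # [0.0, 0.0, 0.0, 0.5, 2.0]
```

The function is linear on each side of zero but not overall: relu(-1 + 2) is 1, while relu(-1) + relu(2) is 2, which is the nonlinearity the network needs.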
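Fully connected layers
The role of fully connected layers
Fully connected layers in a CNN play the same role as in a shallow neural network: they handle the logical inference, and all of their parameters are learned. One difference is that the first fully connected layer connects to the output of the convolutional layers, so it also removes the spatial structure (including the channel dimension), turning a three-dimensional array into a vector. This can be viewed as a convolution whose kernel covers the entire input. The operation is as follows:
[Figure]
If the input is W×H×C, each convolution kernel is also W×H×C, so the whole input collapses to a single number. To produce K numbers (K being the number of neurons after the first fully connected layer) there are K such W×H×C kernels.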
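To make the "kernel as large as the input" view concrete, here is a small pure-Python sketch. The names `fully_connected`, `volume`, `ones` and `first_only` are invented for this example, and a tiny 2×2×2 input stands in for the real feature maps.

```python
def fully_connected(volume, kernels, biases):
    """Apply K full-size kernels to an H x W x C volume (nested lists).

    Each kernel has the same shape as the volume, so each one reduces the
    whole input to a single number: a dot product plus a bias.
    """
    flat = [v for row in volume for pixel in row for v in pixel]
    outputs = []
    for kernel, bias in zip(kernels, biases):
        flat_kernel = [w for row in kernel for pixel in row for w in pixel]
        outputs.append(sum(x * w for x, w in zip(flat, flat_kernel)) + bias)
    return outputs


volume = [[[1.0, 2.0], [3.0, 4.0]],
          [[5.0, 6.0], [7.0, 8.0]]]                  # H=2, W=2, C=2
ones = [[[1.0, 1.0], [1.0, 1.0]],
        [[1.0, 1.0], [1.0, 1.0]]]                    # sums every input value
first_only = [[[1.0, 0.0], [0.0, 0.0]],
              [[0.0, 0.0], [0.0, 0.0]]]              # picks the first value
result = fully_connected(volume, [ones, first_only], [0.0, 0.5])
print(result)  # [36.0, 1.5]
```

Flattening the volume and multiplying by a weight matrix gives the same numbers, which is why the TensorFlow code reshapes pool5 before the fc6 matmul.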

Here is the architecture code (I have not figured out how to import code yet, so it is pasted in directly; please bear with me):
# Written against the TensorFlow 1.x graph API. Under TensorFlow 2.x, use
# `import tensorflow.compat.v1 as tf` followed by `tf.disable_v2_behavior()`.
import tensorflow as tf


# Print the name and shape of a layer's output
def shape(value):
    print(value.op.name, value.get_shape().as_list())


def inference(images, batch_size, n_classes):
    # conv1
    with tf.variable_scope('conv1') as scope:
        weights = tf.get_variable('weights',
                                  shape=[11, 11, 3, 96],  # 96 randomly initialized 11x11x3 kernels
                                  dtype=tf.float32,
                                  initializer=tf.truncated_normal_initializer(stddev=0.1, dtype=tf.float32))
        # The truncated normal initializer draws values from a normal distribution
        # with the given mean and standard deviation; any value more than
        # 2 standard deviations from the mean is discarded and drawn again.
        # stddev: standard deviation of the normal distribution.
        biases = tf.get_variable('biases',
                                 shape=[96],
                                 dtype=tf.float32,
                                 initializer=tf.constant_initializer(0.1))
        conv = tf.nn.conv2d(images, weights, strides=[1, 4, 4, 1], padding='SAME')
        # strides = [1, stride_height, stride_width, 1] for NHWC input; the first
        # and last entries (batch and channel) are normally fixed at 1.
        # VALID: no padding; trailing rows and columns that do not fill a
        #        complete window are dropped.
        # SAME:  zero-pads the borders as needed so that the output size is
        #        ceil(input / stride).
        # Note: with SAME, a 227x227 input gives a 57x57 conv1 output; the 55x55
        # result of (227 - 11) / 4 + 1 corresponds to VALID.
        pre_activation = tf.nn.bias_add(conv, biases)
        # tf.nn.bias_add adds the 1-D biases to conv along the last (channel)
        # dimension; the result has the same shape as conv.
        conv1 = tf.nn.relu(pre_activation, name=scope.name)
        # Apply the ReLU activation

    with tf.variable_scope('pooling1_lrn') as scope:
        # conv1 is the feature map produced by the convolution, with shape
        # [batch, height, width, channels]
        shape(conv1)
        # Pooling layer
        pool1 = tf.nn.max_pool(conv1, ksize=[1, 3, 3, 1], strides=[1, 2, 2, 1],
                               padding='VALID', name='pooling1')
        # Local response normalization layer
        norm1 = tf.nn.lrn(pool1, depth_radius=4, bias=1.0, alpha=0.001 / 9.0,
                          beta=0.75, name='norm1')
        shape(norm1)

    # conv2
    with tf.variable_scope('conv2') as scope:
        weights = tf.get_variable('weights',
                                  shape=[5, 5, 96, 256],
                                  dtype=tf.float32,
                                  initializer=tf.truncated_normal_initializer(stddev=0.1, dtype=tf.float32))
        biases = tf.get_variable('biases',
                                 shape=[256],
                                 dtype=tf.float32,
                                 initializer=tf.constant_initializer(0.1))
        conv = tf.nn.conv2d(norm1, weights, strides=[1, 1, 1, 1], padding='SAME')
        pre_activation = tf.nn.bias_add(conv, biases)
        conv2 = tf.nn.relu(pre_activation, name='conv2')

    with tf.variable_scope('pooling2_lrn') as scope:
        norm2 = tf.nn.lrn(conv2, depth_radius=4, bias=1.0, alpha=0.001 / 9.0,
                          beta=0.75, name='norm2')
        pool2 = tf.nn.max_pool(norm2, ksize=[1, 3, 3, 1], strides=[1, 2, 2, 1],
                               padding='VALID', name='pooling2')

    # conv3
    with tf.variable_scope('conv3') as scope:
        weights = tf.get_variable('weights',
                                  shape=[3, 3, 256, 384],
                                  dtype=tf.float32,
                                  initializer=tf.truncated_normal_initializer(stddev=0.1, dtype=tf.float32))
        biases = tf.get_variable('biases',
                                 shape=[384],
                                 dtype=tf.float32,
                                 initializer=tf.constant_initializer(0.1))
        conv = tf.nn.conv2d(pool2, weights, strides=[1, 1, 1, 1], padding='SAME')
        pre_activation = tf.nn.bias_add(conv, biases)
        conv3 = tf.nn.relu(pre_activation, name='conv3')

    # conv4
    with tf.variable_scope('conv4') as scope:
        weights = tf.get_variable('weights',
                                  shape=[3, 3, 384, 384],
                                  dtype=tf.float32,
                                  initializer=tf.truncated_normal_initializer(stddev=0.1, dtype=tf.float32))
        biases = tf.get_variable('biases',
                                 shape=[384],
                                 dtype=tf.float32,
                                 initializer=tf.constant_initializer(0.1))
        conv = tf.nn.conv2d(conv3, weights, strides=[1, 1, 1, 1], padding='SAME')
        pre_activation = tf.nn.bias_add(conv, biases)
        conv4 = tf.nn.relu(pre_activation, name='conv4')

    # conv5
    with tf.variable_scope('conv5') as scope:
        weights = tf.get_variable('weights',
                                  shape=[3, 3, 384, 256],
                                  dtype=tf.float32,
                                  initializer=tf.truncated_normal_initializer(stddev=0.1, dtype=tf.float32))
        biases = tf.get_variable('biases',
                                 shape=[256],
                                 dtype=tf.float32,
                                 initializer=tf.constant_initializer(0.1))
        conv = tf.nn.conv2d(conv4, weights, strides=[1, 1, 1, 1], padding='SAME')
        pre_activation = tf.nn.bias_add(conv, biases)
        conv5 = tf.nn.relu(pre_activation, name='conv5')

    # The original reused the scope name 'pooling2_lrn' here
    with tf.variable_scope('pooling5_lrn') as scope:
        norm5 = tf.nn.lrn(conv5, depth_radius=4, bias=1.0, alpha=0.001 / 9.0,
                          beta=0.75, name='norm5')
        pool5 = tf.nn.max_pool(norm5, ksize=[1, 3, 3, 1], strides=[1, 2, 2, 1],
                               padding='VALID', name='pooling5')

    print("pool5.shape = ", pool5.shape)
    # fc6
    with tf.variable_scope('fc6') as scope:
        # Flatten pool5: merge the last 3 dimensions into one and keep the
        # leading batch_size dimension
        reshape = tf.reshape(pool5, shape=[batch_size, -1])
        print("reshape.shape = ", reshape.shape)
        dim = reshape.get_shape()[1].value
        print("dim", dim)
        weights = tf.get_variable('weights',
                                  shape=[dim, 4096],
                                  dtype=tf.float32,
                                  initializer=tf.truncated_normal_initializer(stddev=0.005, dtype=tf.float32))

        biases = tf.get_variable('biases',
                                 shape=[4096],
                                 dtype=tf.float32,
                                 initializer=tf.constant_initializer(0.1))
        fc6 = tf.nn.relu(tf.matmul(reshape, weights) + biases, name=scope.name)

        # dropout6
        with tf.name_scope('dropout6') as scope:
            # keep_prob=0.5: each neuron is dropped with probability 0.5.
            # Dropout is hard-coded here, so it is also active at evaluation time.
            dropout6 = tf.nn.dropout(fc6, 0.5)

    # fc7
    with tf.variable_scope('fc7') as scope:
        weights = tf.get_variable('weights',
                                  shape=[4096, 4096],
                                  dtype=tf.float32,
                                  initializer=tf.truncated_normal_initializer(stddev=0.005, dtype=tf.float32))
        biases = tf.get_variable('biases',
                                 shape=[4096],
                                 dtype=tf.float32,
                                 initializer=tf.constant_initializer(0.1))
        fc7 = tf.nn.relu(tf.matmul(dropout6, weights) + biases, name='fc7')
        # dropout7 (the original reused the name 'dropout6' here)
        with tf.name_scope('dropout7') as scope:
            dropout7 = tf.nn.dropout(fc7, 0.5)

    # fc8
    with tf.variable_scope('fc8') as scope:
        # The original named this variable 'fc8'; 'weights' matches the other layers
        weights = tf.get_variable('weights',
                                  shape=[4096, n_classes],
                                  dtype=tf.float32,
                                  initializer=tf.truncated_normal_initializer(stddev=0.005, dtype=tf.float32))
        biases = tf.get_variable('biases',
                                 shape=[n_classes],
                                 dtype=tf.float32,
                                 initializer=tf.constant_initializer(0.1))
        fc8 = tf.add(tf.matmul(dropout7, weights), biases, name='fc8')

    return fc8

def losses(logits, labels):
    with tf.variable_scope('loss') as scope:
        # Per-example cross entropy between the logits and the integer class labels
        cross_entropy = tf.nn.sparse_softmax_cross_entropy_with_logits(
            logits=logits, labels=labels, name='xentropy_per_example')
        loss = tf.reduce_mean(cross_entropy, name='loss')
        tf.summary.scalar(scope.name + '/loss', loss)
    return loss

def training(loss, learning_rate):
    with tf.name_scope('optimizer'):
        optimizer = tf.train.AdadeltaOptimizer(learning_rate=learning_rate)
        # global_step is a scalar that records the number of training steps so far
        global_step = tf.Variable(0, name='global_step', trainable=False)
        train_op = optimizer.minimize(loss, global_step=global_step)
    return train_op

# Evaluation function
def evaluation(logits, labels):
    with tf.variable_scope('accuracy') as scope:
        # Check whether each prediction matches the true label; returns a bool tensor.
        # Arguments: predictions, true class labels, and k (usually 1).
        correct = tf.nn.in_top_k(logits, labels, 1)
        correct = tf.cast(correct, tf.float16)
        accuracy = tf.reduce_mean(correct)
        tf.summary.scalar(scope.name + '/accuracy', accuracy)  # scalar summary for accuracy
    return accuracy
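As a sanity check on the shapes that `inference` prints, the spatial sizes can be traced without TensorFlow. The sketch below is an assumption-laden helper, not part of the original code: `out_size`, `LAYERS` and `trace_shapes` are names invented here, the input is assumed to be a square 227×227 image, and the kernel sizes, strides and padding modes are copied from the code above.

```python
import math


def out_size(size, kernel, stride, padding):
    """Output size along one dimension for TensorFlow's SAME / VALID padding."""
    if padding == 'SAME':
        return math.ceil(size / stride)
    return (size - kernel) // stride + 1  # 'VALID'


# (name, kernel, stride, padding, output channels), as in inference() above
LAYERS = [
    ('conv1', 11, 4, 'SAME', 96),
    ('pool1', 3, 2, 'VALID', 96),
    ('conv2', 5, 1, 'SAME', 256),
    ('pool2', 3, 2, 'VALID', 256),
    ('conv3', 3, 1, 'SAME', 384),
    ('conv4', 3, 1, 'SAME', 384),
    ('conv5', 3, 1, 'SAME', 256),
    ('pool5', 3, 2, 'VALID', 256),
]


def trace_shapes(input_size=227):
    """Return (name, height, width, channels) for every layer in LAYERS."""
    size = input_size
    shapes = []
    for name, kernel, stride, padding, channels in LAYERS:
        size = out_size(size, kernel, stride, padding)
        shapes.append((name, size, size, channels))
    return shapes


shapes = trace_shapes(227)
for entry in shapes:
    print(entry)
fc6_dim = shapes[-1][1] * shapes[-1][2] * shapes[-1][3]
print("fc6 input dim:", fc6_dim)
```

Under these assumptions pool5 comes out as 6×6×256, so the `dim` printed in fc6 should be 9216.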
