AlexNet Revisited

youmy1111

Category: Deep Learning. Tags: TensorFlow, Deep Learning

Article link: https://blog.csdn.net/youmy1111/article/details/59704841

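AlexNet (the paper never gives the network a name, so who coined this one?) is the network created by Alex Krizhevsky et al. that took first place in the 2012 ImageNet classification task. More than four years have passed, plenty of analyses have been written, and newer, better networks keep appearing, but as a classic this model still has a lot worth discussing.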

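I. Network Architecture
AlexNet has 8 layers with trainable parameters: 5 convolutional layers and 3 fully connected layers. It was an early adopter of ReLU, an activation function that converges faster, and it uses LRN and dropout to curb overfitting. The simple structure combined with a rich set of tricks makes it well suited to study and experimentation, and many papers use AlexNet as an experimental subject or as a baseline for comparison. The overall structure is as follows: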

op       size/stride  output      padding  value
Input                 227*227*3
conv     11*11/4      55*55*96    VALID
relu
lrn
pool     3*3/2        27*27*96    VALID
conv     5*5/1        27*27*256   SAME
relu
lrn
pool     3*3/2        13*13*256   VALID
conv     3*3/1        13*13*384   SAME
relu
conv     3*3/1        13*13*384   SAME
relu
conv     3*3/1        13*13*256   SAME
relu
pool     3*3/2        6*6*256     VALID
reshape
fc                    1*1*4096
relu
dropout                                    0.5
fc                    1*1*4096
relu
dropout                                    0.5
fc                    1*1*1000

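This table does not follow the conventional way of drawing an architecture diagram. It lists every op in execution order so that readers can implement the network directly.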
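The output column of the table can be re-derived from the kernel size, stride and padding of each op. The following is a minimal sketch, assuming TensorFlow's SAME/VALID size conventions; `out_size`, `LAYERS` and `trace_shapes` are hypothetical names introduced only for this illustration.

```python
import math

def out_size(size, kernel, stride, padding):
    """Spatial output size under TensorFlow's SAME / VALID conventions."""
    if padding == "SAME":
        return math.ceil(size / stride)
    if padding == "VALID":
        return math.ceil((size - kernel + 1) / stride)
    raise ValueError("unknown padding: " + padding)

# (name, kernel, stride, padding, output channels) for every op in the table
# that changes the spatial size or the channel count.
LAYERS = [
    ("conv1", 11, 4, "VALID", 96),
    ("pool1", 3, 2, "VALID", 96),
    ("conv2", 5, 1, "SAME", 256),
    ("pool2", 3, 2, "VALID", 256),
    ("conv3", 3, 1, "SAME", 384),
    ("conv4", 3, 1, "SAME", 384),
    ("conv5", 3, 1, "SAME", 256),
    ("pool3", 3, 2, "VALID", 256),
]

def trace_shapes(input_size=227):
    """Return (name, height, width, channels) after each op in LAYERS."""
    shapes = []
    size = input_size
    for name, kernel, stride, padding, channels in LAYERS:
        size = out_size(size, kernel, stride, padding)
        shapes.append((name, size, size, channels))
    return shapes

for name, height, width, channels in trace_shapes():
    print(f"{name}: {height}*{width}*{channels}")
```

The last entry is 6*6*256, so the reshape op hands 9216 values per image to the first fully connected layer.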

Beyond that, three things about AlexNet deserve attention:

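1. Caffe's model zoo contains CaffeNet, whose structure is essentially the same as AlexNet except that the order of LRN and pooling is swapped. According to a discussion in the caffe issues, this was a slip by the CaffeNet author when reproducing the network. He meant to implement an identical network, and I don't blame him: I read the paper very carefully and it really is not explicit on this point. Even so, CaffeNet is still widely used. Why? From an implementation standpoint, swapping LRN and pooling does not change the overall structure of the network. On the contrary, the swap saves computation, because AlexNet computes LRN first and pooling then discards many of the normalized values. GoogLeNet likewise pools first and applies LRN afterwards.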

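2. In the paper the model input is 224*224*3, while the feature map after the first convolutional layer is 55*55*96 = 290400. These two figures do not fit together, and after checking I believe the authors made a mistake in writing the input size;
3. The first convolutional layer has an 11*11 kernel with stride=4. Under TensorFlow's convolution arithmetic, padding=SAME gives an output of ceil(224/4)=56 and padding=VALID gives ceil((224-11+1)/4)=54, so 55 is impossible either way. After looking into it, I find the explanation in CS231n the most convincing, namely that the input should really be 227*227*3.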
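The arithmetic behind points 1 and 3 is easy to check by hand or in a few lines of code. This is a rough sketch, assuming TensorFlow's SAME/VALID size formulas and assuming that the cost of LRN grows with the number of activations it normalizes; `conv_out_size` is a hypothetical helper, not a TensorFlow function.

```python
import math

def conv_out_size(size, kernel, stride, padding):
    """Output size of a convolution under TensorFlow's SAME / VALID rules."""
    if padding == "SAME":
        return math.ceil(size / stride)
    if padding == "VALID":
        return math.ceil((size - kernel + 1) / stride)
    raise ValueError("unknown padding: " + padding)

# Point 3: first layer (11*11 kernel, stride 4) for both candidate input sizes.
for size in (224, 227):
    for padding in ("SAME", "VALID"):
        print(size, padding, conv_out_size(size, 11, 4, padding))

# Input sizes for which a VALID 11*11/4 convolution yields exactly 55.
sizes_giving_55 = [n for n in range(200, 260)
                   if conv_out_size(n, 11, 4, "VALID") == 55]
print(sizes_giving_55)

# Point 1: activations normalized by the first LRN, before vs. after pooling.
lrn_before_pool = 55 * 55 * 96   # AlexNet order: LRN, then pool
lrn_after_pool = 27 * 27 * 96    # CaffeNet order: pool, then LRN
print(lrn_before_pool, lrn_after_pool, lrn_before_pool / lrn_after_pool)
```

Under these assumptions an input of 224 never produces 55, while 227 with VALID padding does, and normalizing after pooling touches roughly a quarter as many activations in the first stage.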

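It was exactly these puzzles, "an inexplicable urge that makes me keep searching", that drove me on. Hunting for the answers gave me a deeper understanding of both the theory and the network structure, and I think this is what study and research should feel like.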

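II. Model Implementation
I went through a lot of material, and many TensorFlow implementations of AlexNet do not fully follow the original paper. Perhaps because of the confusing points mentioned above, they all deviate from it to some degree. So I wrote my own version, trying to follow the paper strictly and stay consistent with it in the details. The code is as follows: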

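import numpy as np
import tensorflow as tf  # written against the TensorFlow 1.x API

# Number of output classes; 1000 matches the last fc row of the table above.
NUM_CLASSES = 1000

def inference(images):
    """A reimplementation of AlexNet."""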
    # conv1
    with tf.name_scope('conv1') as scope:
        kernel = tf.Variable(tf.truncated_normal([11,11,3,96], dtype=tf.float32, stddev=1e-2), name='weights')
        conv = tf.nn.conv2d(images, kernel, [1,4,4,1], padding='VALID')
        # The paper initializes biases to 1 only in conv2, conv4, conv5 and the hidden
        # fully connected layers (to speed up early learning); conv1 biases start at 0.
        biases = tf.Variable(tf.constant(0.0, shape=[96], dtype=tf.float32), name='biases')
        bias = tf.nn.bias_add(conv, biases)
        conv1 = tf.nn.relu(bias, name=scope)
    # lrn1
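    # The paper uses k=2, n=5, alpha=1e-4, beta=0.75. depth_radius is the half-width of
    # the window, so a window of n=5 adjacent maps corresponds to depth_radius=2.
    lrn1 = tf.nn.local_response_normalization(conv1, depth_radius=2, bias=2, alpha=1e-4, beta=0.75, name='lrn1')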
    # max_pool
    pool1 = tf.nn.max_pool(lrn1, [1,3,3,1], [1,2,2,1], padding='VALID', name='pool1')
    
    # conv2
    with tf.name_scope('conv2') as scope:
        kernel = tf.Variable(tf.truncated_normal([5,5,96,256], dtype=tf.float32, stddev=1e-2), name='weights')
        conv = tf.nn.conv2d(pool1, kernel, [1,1,1,1], padding='SAME')
        biases = tf.Variable(tf.constant(1.0, shape=[256], dtype=tf.float32), name='biases')
        bias = tf.nn.bias_add(conv, biases)
        conv2 = tf.nn.relu(bias, name=scope)
    # lrn2 (depth_radius=2 is the half-width of the paper's n=5 window)
    lrn2 = tf.nn.local_response_normalization(conv2, depth_radius=2, bias=2, alpha=1e-4, beta=0.75, name='lrn2')
    # max_pool
    pool2 = tf.nn.max_pool(lrn2, [1,3,3,1], [1,2,2,1], padding='VALID', name='pool2')

    # conv3
    with tf.name_scope('conv3') as scope:
        kernel = tf.Variable(tf.truncated_normal([3,3,256,384], dtype=tf.float32, stddev=1e-2), name='weights')
        conv = tf.nn.conv2d(pool2, kernel, [1,1,1,1], padding='SAME')
        # conv3 is not among the layers whose biases the paper initializes to 1
        biases = tf.Variable(tf.constant(0.0, shape=[384], dtype=tf.float32), name='biases')
        bias = tf.nn.bias_add(conv, biases)
        conv3 = tf.nn.relu(bias, name=scope)

    # conv4
    with tf.name_scope('conv4') as scope:
        kernel = tf.Variable(tf.truncated_normal([3,3,384,384], dtype=tf.float32, stddev=1e-2), name='weights')
        conv = tf.nn.conv2d(conv3, kernel, [1,1,1,1], padding='SAME')
        biases = tf.Variable(tf.constant(1.0, shape=[384], dtype=tf.float32), name='biases')
        bias = tf.nn.bias_add(conv, biases)
        conv4 = tf.nn.relu(bias, name=scope)
    
    # conv5
    with tf.name_scope('conv5') as scope:
        kernel = tf.Variable(tf.truncated_normal([3,3,384,256], dtype=tf.float32, stddev=1e-2), name='weights')
        conv = tf.nn.conv2d(conv4, kernel, [1,1,1,1], padding='SAME')
        biases = tf.Variable(tf.constant(1.0, shape=[256], dtype=tf.float32), name='biases')
        bias = tf.nn.bias_add(conv, biases)
        conv5 = tf.nn.relu(bias, name=scope)
    # max_pool
    pool3 = tf.nn.max_pool(conv5, [1,3,3,1], [1,2,2,1], padding='VALID', name='pool3')

    # fully connected layer
    # fc1
    with tf.name_scope('fc1') as scope:
        # reshape = tf.reshape(pool3, [pool3.get_shape()[0].value,-1])
        a = pool3.get_shape().as_list()
        dim = np.prod(a[1:])
        pool3 = tf.reshape(pool3, [-1,dim])
        weight = tf.Variable(tf.truncated_normal([dim,4096], dtype=tf.float32, stddev=1e-2), name='weights')
        fc = tf.matmul(pool3, weight)
        biases = tf.Variable(tf.constant(1.0, shape=[4096], dtype=tf.float32), name='biases')
        bias = fc + biases
        fc1 = tf.nn.relu(bias, name=scope)
    # dropout (TensorFlow 1.x: the second argument is keep_prob); it should be active only during training
    dropout1 = tf.nn.dropout(fc1, 0.5)

    # fc2
    with tf.name_scope('fc2') as scope:
        weight = tf.Variable(tf.truncated_normal([4096,4096], dtype=tf.float32, stddev=1e-2), name='weights')
        fc = tf.matmul(dropout1, weight)
        biases = tf.Variable(tf.constant(1.0, shape=[4096], dtype=tf.float32), name='biases')
        bias = fc + biases
        fc2 = tf.nn.relu(bias, name=scope)
    # dropout
    dropout2 = tf.nn.dropout(fc2, 0.5)

    # fc3
    # from 4096 to num of classes matmul
    with tf.name_scope('fc3') as scope:
        weight = tf.Variable(tf.truncated_normal([4096,NUM_CLASSES], dtype=tf.float32, stddev=1e-2), name='weights')
        fc = tf.matmul(dropout2, weight)
        # the output layer is not among the layers whose biases the paper initializes to 1
        biases = tf.Variable(tf.constant(0.0, shape=[NUM_CLASSES], dtype=tf.float32), name='biases')
        bias = fc + biases
        # fc3 = tf.nn.relu(bias, name=scope)
    
    return bias
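
The `depth_radius` argument of `tf.nn.local_response_normalization` is the half-width of the normalization window, whereas the paper's n counts all adjacent kernel maps in the window. The following pure-Python sketch normalizes the channel vector of a single spatial position, assuming the formula output = input / (bias + alpha * sqr_sum) ** beta; `lrn_1d` is a hypothetical helper written for illustration, not part of TensorFlow.

```python
def lrn_1d(channels, depth_radius=2, bias=2.0, alpha=1e-4, beta=0.75):
    """Local response normalization across the channels of one pixel."""
    last = len(channels) - 1
    out = []
    for i, value in enumerate(channels):
        # The window covers up to depth_radius neighbours on each side,
        # clipped at the first and last channel.
        lo = max(0, i - depth_radius)
        hi = min(last, i + depth_radius)
        sqr_sum = sum(x * x for x in channels[lo:hi + 1])
        out.append(value / (bias + alpha * sqr_sum) ** beta)
    return out

# With bias=0, alpha=1, beta=1 each output is 1 / (number of channels in the
# window), which makes the window size visible directly.
print(lrn_1d([1.0] * 7, depth_radius=2, bias=0.0, alpha=1.0, beta=1.0))
```

With `depth_radius=2` a channel in the middle of the vector is normalized over 5 channels (itself plus two on each side), which is the window size the paper calls n=5.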

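Notes:

1. The paper splits the network into two branches because GPU memory was insufficient; current implementations merge them into a single network;

2. As discussed in the previous section, to make the feature map after the first convolutional layer 55*55*96, the input here is changed to 227*227*3.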
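
Merging the two branches (note 1) also changes the parameter count slightly, because in the paper the kernels of conv2, conv4 and conv5 only see the feature maps on their own GPU. Below is a rough count under that assumption, modelling the split as a grouped convolution with 2 groups; `conv_params`, `fc_params` and `alexnet_params` are hypothetical helpers for this illustration.

```python
def conv_params(kernel, in_channels, out_channels, groups=1):
    """Weights plus biases of a convolution whose input channels are split into groups."""
    return kernel * kernel * (in_channels // groups) * out_channels + out_channels

def fc_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out

def alexnet_params(groups):
    """groups=1: merged single network; groups=2: conv2, conv4 and conv5 split in two."""
    return sum([
        conv_params(11, 3, 96),
        conv_params(5, 96, 256, groups),
        conv_params(3, 256, 384),          # conv3 sees all maps in both variants
        conv_params(3, 384, 384, groups),
        conv_params(3, 384, 256, groups),
        fc_params(6 * 6 * 256, 4096),
        fc_params(4096, 4096),
        fc_params(4096, 1000),
    ])

merged = alexnet_params(groups=1)
two_branch = alexnet_params(groups=2)
print(merged, two_branch, merged - two_branch)
```

Under these assumptions the merged network has about 62.4 million parameters and the two-branch layout about 61 million, so the single-network code above is slightly larger than the network described in the paper.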
