[TensorFlow] Deep Learning in Practice 03: Implementing AlexNet in TensorFlow

不用先生

Categories: TensorFlow, Deep Learning. Tags: TensorFlow, Deep Learning, AlexNet

Article link: https://blog.csdn.net/u013921430/article/details/80219422

[fishing-pan: https://blog.csdn.net/u013921430 Please credit the source when reposting]

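Preface

In the previous two posts I used convolutional neural networks to recognize handwritten digits and to classify the CIFAR-10 dataset. Along the way I covered the basic building blocks of a neural network and some tricks for preventing overfitting and improving generalization, and got a first look at how network models are written in TensorFlow.

This post covers AlexNet, probably the most widely known convolutional neural network today. It won the 2012 ILSVRC competition by a large margin, which brought deep learning back into the spotlight and established its dominance in computer vision.

Many tricks in common use today appear in AlexNet, such as Dropout, max pooling (average pooling was the norm before), data augmentation, and ReLU as the activation function. Many of these were not proposed by the AlexNet authors, but AlexNet's success drew attention to them and brought them into wide use.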

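AlexNet Architecture

AlexNet has 5 convolutional layers and 3 fully connected layers. The third fully connected layer (the last layer) feeds a 1000-way Softmax that performs the classification.

Figure 1: The AlexNet network architecture

The hyperparameters of each layer are shown in Figure 1. The input image is 224x224 pixels. The first convolutional layer uses 96 kernels of size 11x11 with stride 4, followed by an LRN layer and a 3x3 max-pooling layer with stride 2. The second convolutional layer uses 5x5 kernels and the last three use 3x3 kernels, all with stride 1.

Figure 2: Computation and parameter count of each AlexNet layer

Figure 2 shows the parameter count and the amount of computation for each AlexNet layer. The five convolutional layers account for most of the computation but very few of the parameters, and this is exactly the strength of convolutional layers. Readers familiar with image processing know that convolution is often used to extract a particular feature from an image; the Sobel operator, for example, extracts edge information. The kernels in a convolutional neural network are used in the same way: by extracting features layer after layer, the network captures the characteristics of an object and can then recognize it. Two key properties of convolutional layers, local connectivity and weight sharing, greatly reduce the number of parameters. In other words, whatever the image size, the number of trainable weights in a convolutional layer depends only on the kernel size and the number of kernels, so a small set of parameters can be trained on any number of images of any size.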
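
The point about parameter counts can be checked with a few lines of arithmetic. The sketch below is only an illustration: it uses the kernel shapes from this post's own code (which is slightly narrower than Figure 1, for example 64 kernels in conv1 instead of 96), and it assumes that the last pooling layer outputs a 6 x 6 x 256 tensor for a 224 x 224 input. The helper names conv_params and fc_params are made up for this example.

```python
def conv_params(kh, kw, c_in, c_out):
    """Weights plus biases of a conv layer; independent of the image size."""
    return kh * kw * c_in * c_out + c_out


def fc_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out


# Kernel shapes used in this post's code:
# (height, width, in_channels, out_channels).
conv_layers = {
    "conv1": (11, 11, 3, 64),
    "conv2": (5, 5, 64, 192),
    "conv3": (3, 3, 192, 384),
    "conv4": (3, 3, 384, 256),
    "conv5": (3, 3, 256, 256),
}

# Fully connected sizes, assuming pool5 is 6 x 6 x 256 for a 224 x 224 input.
fc_layers = {
    "FC1": (6 * 6 * 256, 4096),
    "FC2": (4096, 4096),
    "FC3": (4096, 1000),
}

conv_total = sum(conv_params(*shape) for shape in conv_layers.values())
fc_total = sum(fc_params(*shape) for shape in fc_layers.values())

for name, shape in conv_layers.items():
    print(name, conv_params(*shape))
for name, shape in fc_layers.items():
    print(name, fc_params(*shape))
print("conv total:", conv_total, "fc total:", fc_total)
```

Under these assumptions the five convolutional layers hold roughly 2.5 million parameters, while the three fully connected layers hold roughly 58.6 million.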

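Implementing AlexNet in TensorFlow

Training a complete AlexNet on ImageNet takes a very long time, so this post does not train the full model. Instead it builds the AlexNet network (including the fully connected layers, which the book's code omits) and measures the speed of the forward pass and the backward pass.

The previous two posts dissected network construction in detail, and the five convolutional layers of AlexNet introduce nothing new, so I will not go through the implementation line by line. For benchmarking purposes, however, the code sets a few quantities that are not part of AlexNet itself, and I will explain those briefly.

1. Building AlexNet

Define a function inference(images) that builds the five convolutional layers of AlexNet followed by the three fully connected layers. Its input is a batch of batch_size images; it returns the output of the last fully connected layer (FC3) together with the list of parameters collected in parameters.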

def inference(images):
    parameters = []

    with tf.name_scope('conv1') as scope:
        kernel = tf.Variable(tf.truncated_normal([11, 11, 3, 64], dtype = tf.float32, stddev = 1e-1), name = 'weights')
        conv = tf.nn.conv2d(images, kernel, [1, 4, 4, 1], padding = 'SAME')
        biases = tf.Variable(tf.constant(0.0, shape = [64], dtype = tf.float32), trainable = True, name = 'biases')
        bias = tf.nn.bias_add(conv, biases)
        conv1 = tf.nn.relu(bias, name = scope)
        print_activations(conv1)
        parameters += [kernel, biases]

        #lrn1 = tf.nn.lrn(conv1, 4, bias = 1.0, alpha = 0.001 / 9, beta = 0.75, name = 'lrn1')
        pool1 = tf.nn.max_pool(conv1, ksize = [1, 3, 3, 1], strides = [1, 2, 2, 1], padding = 'VALID', name = 'pool1')
        print_activations(pool1)

    with tf.name_scope('conv2') as scope:
        kernel = tf.Variable(tf.truncated_normal([5, 5, 64, 192], dtype = tf.float32, stddev = 1e-1), name = 'weights')
        conv = tf.nn.conv2d(pool1, kernel, [1, 1, 1, 1], padding = 'SAME')
        biases = tf.Variable(tf.constant(0.0, shape = [192], dtype = tf.float32), trainable = True, name = 'biases')
        bias = tf.nn.bias_add(conv, biases)
        conv2 = tf.nn.relu(bias, name = scope)
        parameters += [kernel, biases]
        print_activations(conv2)

        #lrn2 = tf.nn.lrn(conv2, 4, bias = 1.0, alpha = 0.001 / 9, beta = 0.75, name = 'lrn2')
        pool2 = tf.nn.max_pool(conv2, ksize = [1, 3, 3, 1], strides = [1, 2, 2, 1], padding = 'VALID', name = 'pool2')
        print_activations(pool2)

    with tf.name_scope('conv3') as scope:
        kernel = tf.Variable(tf.truncated_normal([3, 3, 192, 384], dtype = tf.float32, stddev = 1e-1), name = 'weights')
        conv = tf.nn.conv2d(pool2, kernel, [1, 1, 1, 1], padding = 'SAME')
        biases = tf.Variable(tf.constant(0.0, shape = [384], dtype = tf.float32), trainable = True, name = 'biases')
        bias = tf.nn.bias_add(conv, biases)
        conv3 = tf.nn.relu(bias, name = scope)
        parameters += [kernel, biases]
        print_activations(conv3)

    with tf.name_scope('conv4') as scope:
        kernel = tf.Variable(tf.truncated_normal([3, 3, 384, 256], dtype = tf.float32, stddev = 1e-1), name = 'weights')
        conv = tf.nn.conv2d(conv3, kernel, [1, 1, 1, 1], padding = 'SAME')
        biases = tf.Variable(tf.constant(0.0, shape = [256], dtype = tf.float32), trainable = True, name = 'biases')
        bias = tf.nn.bias_add(conv, biases)
        conv4 = tf.nn.relu(bias, name = scope)
        parameters += [kernel, biases]
        print_activations(conv4)

    with tf.name_scope('conv5') as scope:
        kernel = tf.Variable(tf.truncated_normal([3, 3, 256, 256], dtype = tf.float32, stddev = 1e-1), name = 'weights')
        conv = tf.nn.conv2d(conv4, kernel, [1, 1, 1, 1], padding = 'SAME')
        biases = tf.Variable(tf.constant(0.0, shape = [256], dtype = tf.float32), trainable = True, name = 'biases')
        bias = tf.nn.bias_add(conv, biases)
        conv5 = tf.nn.relu(bias, name = scope)
        parameters += [kernel, biases]
        print_activations(conv5)

        pool5 = tf.nn.max_pool(conv5, ksize = [1, 3, 3, 1], strides = [1, 2, 2, 1], padding = 'VALID', name = 'pool5')
        print_activations(pool5)
    # Define the fully connected layers.
    with tf.name_scope('FC1') as scope:
        # Flatten pool5 to [batch_size, dim] before the first matmul.
        reshape=tf.reshape(pool5,[batch_size,-1])
        dim=reshape.get_shape()[1].value
        weight=variable_with_weight_loss(shape=[dim,4096],stddev=0.01,wl=0.004)
        biases=tf.Variable(tf.constant(0.0,shape=[4096]),dtype=tf.float32,trainable=True)
        FC1=tf.nn.relu(tf.matmul(reshape,weight)+biases)
        parameters += [weight, biases]
        print_activations(FC1)
    with tf.name_scope('FC2') as scope:
        weight=variable_with_weight_loss(shape=[4096,4096],stddev=0.001,wl=0.004)
        biases=tf.Variable(tf.constant(0.0,shape=[4096]),dtype=tf.float32,trainable=True)
        FC2=tf.nn.relu(tf.matmul(FC1,weight)+biases)
        parameters += [weight, biases]
        print_activations(FC2)
    with tf.name_scope('FC3') as scope:
        weight=variable_with_weight_loss(shape=[4096,1000],stddev=0.001,wl=0.004)
        biases=tf.Variable(tf.constant(0.0,shape=[1000]),dtype=tf.float32,trainable=True)
        # ReLU is kept here for benchmarking only; a real classifier would
        # pass the raw FC3 values to Softmax.
        FC3=tf.nn.relu(tf.matmul(FC2,weight)+biases)
        parameters += [weight, biases]
        print_activations(FC3)

    return FC3, parameters
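
The shapes that print_activations reports for the convolution and pooling layers can be predicted by hand. The sketch below is a plain-Python illustration, not part of the benchmark; it assumes TensorFlow's usual output-size rules (ceil(size / stride) for 'SAME' padding and floor((size - kernel) / stride) + 1 for 'VALID' padding), and the helper names conv_out and trace_shapes are made up for this example.

```python
import math


def conv_out(size, kernel, stride, padding):
    """Output length along one spatial axis for a conv or pooling layer."""
    if padding == "SAME":
        return math.ceil(size / stride)
    if padding == "VALID":
        return (size - kernel) // stride + 1
    raise ValueError("padding must be 'SAME' or 'VALID'")


# (name, kernel, stride, padding, out_channels), in the order used by inference().
layers = [
    ("conv1", 11, 4, "SAME", 64),
    ("pool1", 3, 2, "VALID", 64),
    ("conv2", 5, 1, "SAME", 192),
    ("pool2", 3, 2, "VALID", 192),
    ("conv3", 3, 1, "SAME", 384),
    ("conv4", 3, 1, "SAME", 256),
    ("conv5", 3, 1, "SAME", 256),
    ("pool5", 3, 2, "VALID", 256),
]


def trace_shapes(image_size=224):
    """Return (name, height, width, channels) for every layer in turn."""
    size = image_size
    shapes = []
    for name, kernel, stride, padding, channels in layers:
        size = conv_out(size, kernel, stride, padding)
        shapes.append((name, size, size, channels))
    return shapes


for shape in trace_shapes():
    print(shape)
```

With a 224 x 224 input this gives a 6 x 6 x 256 tensor after pool5, so the flattened input to FC1 has 9216 values per image.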

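2. Measuring the Speed of the Forward and Backward Passes

This part follows the order in which the functions execute. run_benchmark() runs first. The benchmark does not read any real images; it generates a random tensor of shape batch_size x 224 x 224 x 3 as the network input and passes it to inference(images). After initializing the network's variables it calls time_tensorflow_run() to time the forward pass. It then uses tf.nn.l2_loss(pool5) to compute the L2 loss (half the sum of squares) of the network output; the variable is named pool5 in the code, although inference() returns the FC3 output. Finally it builds the gradients of that loss with respect to the variables in parameters, and evaluating these gradients is what performs the backward pass.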

def run_benchmark():
    with tf.Graph().as_default():
        # Create a batch of random input images
        image_size = 224
        images = tf.Variable(tf.random_normal([batch_size, image_size, image_size, 3], dtype = tf.float32, stddev = 1e-1))
        pool5, parameters = inference(images)
        
        # Initialize all variables in the network
        init = tf.global_variables_initializer()
        sess = tf.Session()
        sess.run(init)
        
        # Call time_tensorflow_run to time the forward pass
        time_tensorflow_run(sess, pool5, "Forward")
        
        # Measure the forward-backward time
        objective = tf.nn.l2_loss(pool5)
        grad = tf.gradients(objective, parameters)
        time_tensorflow_run(sess, grad, "Forward-backward")
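
What the gradient step computes can be illustrated on a toy problem. The NumPy sketch below is an assumption-laden stand-in, not the benchmark itself: the "network" is a single linear layer y = x @ w, the objective is sum(y ** 2) / 2 (the same definition tf.nn.l2_loss uses), and the analytic gradient is checked against finite differences. All function names here are made up for this example.

```python
import numpy as np


def l2_loss(t):
    """Half the sum of squares, the definition used by tf.nn.l2_loss."""
    return 0.5 * np.sum(t ** 2)


def forward(x, w):
    """Toy network: one linear layer without bias or activation."""
    return x @ w


def backward(x, w):
    """Analytic gradient of l2_loss(forward(x, w)) with respect to w."""
    y = forward(x, w)  # the backward pass needs the forward result first
    return x.T @ y     # d(0.5 * ||x w||^2) / dw = x^T (x w)


rng = np.random.default_rng(0)
x = rng.normal(size=(4, 3))
w = rng.normal(size=(3, 2))

# Central finite differences as an independent check of the analytic gradient.
eps = 1e-6
numeric = np.zeros_like(w)
for i in range(w.shape[0]):
    for j in range(w.shape[1]):
        w_plus = w.copy()
        w_minus = w.copy()
        w_plus[i, j] += eps
        w_minus[i, j] -= eps
        numeric[i, j] = (l2_loss(forward(x, w_plus)) - l2_loss(forward(x, w_minus))) / (2 * eps)

max_error = float(np.max(np.abs(numeric - backward(x, w))))
print("max abs difference:", max_error)
```

The backward function has to run the forward computation before it can form the gradient, which is why the second measurement in the benchmark is labelled "Forward-backward" rather than "Backward".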

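time_tensorflow_run() executes the network through session.run(target): when target is pool5 it runs the forward pass only, and when target is grad it runs the forward pass followed by the backward pass.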

def time_tensorflow_run(session, target, info_string):
    num_steps_burn_in = 10
    total_duration = 0.0
    total_duration_squared = 0.0

    for i in range(num_batches + num_steps_burn_in):
        start_time = time.time()
        _ = session.run(target)
        duration = time.time() - start_time
        if i >= num_steps_burn_in:
            if not i % 10:
                print('%s: step %d, duration = %.3f' %(datetime.now(), i - num_steps_burn_in, duration))
            total_duration += duration
            total_duration_squared += duration * duration

    mn = total_duration / num_batches
    vr = total_duration_squared / num_batches - mn * mn
    # Clamp at zero: rounding can make vr slightly negative when durations are nearly equal.
    sd = math.sqrt(max(vr, 0.0))
    print('%s: %s across %d steps, %.3f +/- %.3f sec / batch' %(datetime.now(), info_string, num_batches, mn, sd))
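
The statistics in time_tensorflow_run() come from two running sums: the mean is sum(d) / n and the variance is sum(d * d) / n - mean * mean. The sketch below shows the same pattern without TensorFlow, timing an arbitrary Python callable. The names benchmark and mean_and_std are made up for this example, and clamping the variance at zero before the square root is my own precaution against rounding error.

```python
import math
import time


def mean_and_std(durations):
    """Mean and standard deviation from the sum and the sum of squares."""
    n = len(durations)
    total = sum(durations)
    total_squared = sum(d * d for d in durations)
    mean = total / n
    variance = total_squared / n - mean * mean
    # Rounding can push the variance slightly below zero; clamp before sqrt.
    return mean, math.sqrt(max(variance, 0.0))


def benchmark(fn, num_batches=20, burn_in=5):
    """Call fn() burn_in times untimed, then time num_batches further calls."""
    durations = []
    for i in range(num_batches + burn_in):
        start = time.perf_counter()
        fn()
        elapsed = time.perf_counter() - start
        if i >= burn_in:  # warm-up iterations are excluded from the statistics
            durations.append(elapsed)
    return mean_and_std(durations)


mean, std = mean_and_std([0.1, 0.2, 0.3])
print("fixed durations: %.3f +/- %.3f sec" % (mean, std))

run_mean, run_std = benchmark(lambda: sum(range(100000)))
print("sum(range(100000)): %.6f +/- %.6f sec / call" % (run_mean, run_std))
```

The warm-up iterations matter in the real benchmark because the first few session.run calls typically include one-time setup costs that would otherwise distort the mean.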

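Results

My GPU is fairly weak, so the computation is slow, but the forward pass is clearly several times faster than the forward-backward pass.

From other bloggers' posts and from several books I learned that LRN is no longer used in many current networks, because its benefit is not obvious and it slows the network down. I therefore removed all LRN operations from the network and ran the benchmark again, with the following results.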
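
To see where the extra cost of LRN comes from, the NumPy sketch below normalizes every activation by a sum of squares over neighbouring channels. It assumes the formula documented for tf.nn.lrn, output = input / (bias + alpha * sqr_sum) ** beta, where sqr_sum covers the channels within depth_radius of the current one; the default values are taken from the commented-out tf.nn.lrn calls in the code. The function name lrn is made up for this example, and it is an illustration rather than a replacement for the TensorFlow op.

```python
import numpy as np


def lrn(x, depth_radius=4, bias=1.0, alpha=0.001 / 9, beta=0.75):
    """Local response normalization across the last (channel) axis."""
    channels = x.shape[-1]
    squared = x ** 2
    sqr_sum = np.zeros_like(x)
    for d in range(channels):
        # Window of neighbouring channels, clipped at the tensor boundary.
        lo = max(0, d - depth_radius)
        hi = min(channels, d + depth_radius + 1)
        sqr_sum[..., d] = squared[..., lo:hi].sum(axis=-1)
    return x / (bias + alpha * sqr_sum) ** beta


# A tiny NHWC tensor of ones: batch 1, 2 x 2 pixels, 16 channels.
x = np.ones((1, 2, 2, 16), dtype=np.float32)
y = lrn(x)
print(y.shape, float(y[0, 0, 0, 0]), float(y[0, 0, 0, 8]))
```

Each output value needs a windowed sum over up to 2 * depth_radius + 1 channels, and the backward pass has to differentiate through that coupling between channels as well, which is extra work on top of the convolution itself.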

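I do not know how much removing LRN changes the recognition results (let us assume for now that accuracy drops), but comparing the two runs shows that removing LRN clearly speeds up the computation, especially the backward computation, which became about 40% faster.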

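Full Code

To make it easy to experiment, here is the complete benchmark code. It uses the TensorFlow 1.x API (for example tf.Session and tf.truncated_normal).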

# -*- coding: utf-8 -*-
"""
Created on Sun May  6 19:28:05 2018

@author: most_pan
"""

from datetime import datetime
import math
import time
import tensorflow as tf

batch_size = 32
num_batches = 100

def variable_with_weight_loss(shape,stddev,wl):
    var=tf.Variable(tf.truncated_normal(shape,stddev=stddev))
    if wl is not None:
        weight_loss=tf.multiply(tf.nn.l2_loss(var),wl,name='weight_loss')
        tf.add_to_collection('losses',weight_loss)
    return var

def print_activations(t):
    print(t.op.name, '', t.get_shape().as_list())

def inference(images):
    parameters = []

    with tf.name_scope('conv1') as scope:
        kernel = tf.Variable(tf.truncated_normal([11, 11, 3, 64], dtype = tf.float32, stddev = 1e-1), name = 'weights')
        conv = tf.nn.conv2d(images, kernel, [1, 4, 4, 1], padding = 'SAME')
        biases = tf.Variable(tf.constant(0.0, shape = [64], dtype = tf.float32), trainable = True, name = 'biases')
        bias = tf.nn.bias_add(conv, biases)
        conv1 = tf.nn.relu(bias, name = scope)
        print_activations(conv1)
        parameters += [kernel, biases]

        #lrn1 = tf.nn.lrn(conv1, 4, bias = 1.0, alpha = 0.001 / 9, beta = 0.75, name = 'lrn1')
        pool1 = tf.nn.max_pool(conv1, ksize = [1, 3, 3, 1], strides = [1, 2, 2, 1], padding = 'VALID', name = 'pool1')
        print_activations(pool1)

    with tf.name_scope('conv2') as scope:
        kernel = tf.Variable(tf.truncated_normal([5, 5, 64, 192], dtype = tf.float32, stddev = 1e-1), name = 'weights')
        conv = tf.nn.conv2d(pool1, kernel, [1, 1, 1, 1], padding = 'SAME')
        biases = tf.Variable(tf.constant(0.0, shape = [192], dtype = tf.float32), trainable = True, name = 'biases')
        bias = tf.nn.bias_add(conv, biases)
        conv2 = tf.nn.relu(bias, name = scope)
        parameters += [kernel, biases]
        print_activations(conv2)

        #lrn2 = tf.nn.lrn(conv2, 4, bias = 1.0, alpha = 0.001 / 9, beta = 0.75, name = 'lrn2')
        pool2 = tf.nn.max_pool(conv2, ksize = [1, 3, 3, 1], strides = [1, 2, 2, 1], padding = 'VALID', name = 'pool2')
        print_activations(pool2)

    with tf.name_scope('conv3') as scope:
        kernel = tf.Variable(tf.truncated_normal([3, 3, 192, 384], dtype = tf.float32, stddev = 1e-1), name = 'weights')
        conv = tf.nn.conv2d(pool2, kernel, [1, 1, 1, 1], padding = 'SAME')
        biases = tf.Variable(tf.constant(0.0, shape = [384], dtype = tf.float32), trainable = True, name = 'biases')
        bias = tf.nn.bias_add(conv, biases)
        conv3 = tf.nn.relu(bias, name = scope)
        parameters += [kernel, biases]
        print_activations(conv3)

    with tf.name_scope('conv4') as scope:
        kernel = tf.Variable(tf.truncated_normal([3, 3, 384, 256], dtype = tf.float32, stddev = 1e-1), name = 'weights')
        conv = tf.nn.conv2d(conv3, kernel, [1, 1, 1, 1], padding = 'SAME')
        biases = tf.Variable(tf.constant(0.0, shape = [256], dtype = tf.float32), trainable = True, name = 'biases')
        bias = tf.nn.bias_add(conv, biases)
        conv4 = tf.nn.relu(bias, name = scope)
        parameters += [kernel, biases]
        print_activations(conv4)

    with tf.name_scope('conv5') as scope:
        kernel = tf.Variable(tf.truncated_normal([3, 3, 256, 256], dtype = tf.float32, stddev = 1e-1), name = 'weights')
        conv = tf.nn.conv2d(conv4, kernel, [1, 1, 1, 1], padding = 'SAME')
        biases = tf.Variable(tf.constant(0.0, shape = [256], dtype = tf.float32), trainable = True, name = 'biases')
        bias = tf.nn.bias_add(conv, biases)
        conv5 = tf.nn.relu(bias, name = scope)
        parameters += [kernel, biases]
        print_activations(conv5)

        pool5 = tf.nn.max_pool(conv5, ksize = [1, 3, 3, 1], strides = [1, 2, 2, 1], padding = 'VALID', name = 'pool5')
        print_activations(pool5)
    # Define the fully connected layers.
    with tf.name_scope('FC1') as scope:
        # Flatten pool5 to [batch_size, dim] before the first matmul.
        reshape=tf.reshape(pool5,[batch_size,-1])
        dim=reshape.get_shape()[1].value
        weight=variable_with_weight_loss(shape=[dim,4096],stddev=0.01,wl=0.004)
        biases=tf.Variable(tf.constant(0.0,shape=[4096]),dtype=tf.float32,trainable=True)
        FC1=tf.nn.relu(tf.matmul(reshape,weight)+biases)
        parameters += [weight, biases]
        print_activations(FC1)
    with tf.name_scope('FC2') as scope:
        weight=variable_with_weight_loss(shape=[4096,4096],stddev=0.001,wl=0.004)
        biases=tf.Variable(tf.constant(0.0,shape=[4096]),dtype=tf.float32,trainable=True)
        FC2=tf.nn.relu(tf.matmul(FC1,weight)+biases)
        parameters += [weight, biases]
        print_activations(FC2)
    with tf.name_scope('FC3') as scope:
        weight=variable_with_weight_loss(shape=[4096,1000],stddev=0.001,wl=0.004)
        biases=tf.Variable(tf.constant(0.0,shape=[1000]),dtype=tf.float32,trainable=True)
        # ReLU is kept here for benchmarking only; a real classifier would
        # pass the raw FC3 values to Softmax.
        FC3=tf.nn.relu(tf.matmul(FC2,weight)+biases)
        parameters += [weight, biases]
        print_activations(FC3)

    return FC3, parameters

def time_tensorflow_run(session, target, info_string):
    num_steps_burn_in = 10
    total_duration = 0.0
    total_duration_squared = 0.0

    for i in range(num_batches + num_steps_burn_in):
        start_time = time.time()
        _ = session.run(target)
        duration = time.time() - start_time
        if i >= num_steps_burn_in:
            if not i % 10:
                print('%s: step %d, duration = %.3f' %(datetime.now(), i - num_steps_burn_in, duration))
            total_duration += duration
            total_duration_squared += duration * duration

    mn = total_duration / num_batches
    vr = total_duration_squared / num_batches - mn * mn
    # Clamp at zero: rounding can make vr slightly negative when durations are nearly equal.
    sd = math.sqrt(max(vr, 0.0))
    print('%s: %s across %d steps, %.3f +/- %.3f sec / batch' %(datetime.now(), info_string, num_batches, mn, sd))

def run_benchmark():
    with tf.Graph().as_default():
        image_size = 224
        images = tf.Variable(tf.random_normal([batch_size, image_size, image_size, 3], dtype = tf.float32, stddev = 1e-1))
        pool5, parameters = inference(images)

        init = tf.global_variables_initializer()
        sess = tf.Session()
        sess.run(init)

        time_tensorflow_run(sess, pool5, "Forward")

        objective = tf.nn.l2_loss(pool5)
        grad = tf.gradients(objective, parameters)
        time_tensorflow_run(sess, grad, "Forward-backward")

run_benchmark()

Reference Book

《Tensorflow 实战》 (TensorFlow in Practice), by Huang Wenjian et al.
