Building Face, Age, and Gender Recognition with Tornado and TensorFlow from Scratch (Part 2)

Latest recommended article published on 2021-12-20 16:13:08

nanjingdreamfly



Article link: https://blog.csdn.net/nanjingdreamfly/article/details/62886909


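This article walks through building a face recognition system with Tornado and TensorFlow, focusing on age and gender recognition. It covers the L2 norm, convolutional neural networks, pooling layers, fully connected layers, and dropout. For training, it discusses the error function, optimizers such as Adadelta and MomentumOptimizer, and strategies for preventing overfitting. For prediction, it uses softmax regression to classify age ranges. It also looks at other age recognition algorithms and at training and prediction for the gender recognition model. Finally, it reflects on ways to improve model accuracy.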


Training the Age Recognition Model

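def main(argv=None):
    # A graph keeps a stack of name scopes: name_scope(...) pushes a new name
    # onto the stack, and that name is used for everything created below it.
    # as_default() makes this new graph the default graph inside the block.
    with tf.Graph().as_default():

        # This line matters: it selects the model.
        model_fn = select_model(FLAGS.model_type)
        # Assume the default model is selected here. The author also implements
        # inception v3 and levi_hassner_bn.
        # Open the metadata file and figure out nlabels, and size of epoch
        # The age model's labels are the age ranges ['(0, 2)','(4, 6)','(8, 12)','(15, 20)','(25, 32)','(38, 43)','(48, 53)','(60, 100)'], so there should be 8 labels.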

As you can see, select_model picks the model. We take def levi_hassner(nlabels, images, pkeep, is_training): as our example.
The model is named after this paper: Gil Levi and Tal Hassner, Age and Gender Classification using Convolutional Neural Networks, IEEE Workshop on Analysis and Modeling of Faces and Gestures (AMFG), at the IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), Boston, June 2015.
The source code of the model:

    weight_decay = 0.0005
    weights_regularizer = tf.contrib.layers.l2_regularizer(weight_decay)
    with tf.variable_scope("LeviHassner", "LeviHassner", [images]) as scope:
        with tf.contrib.slim.arg_scope(
                [convolution2d, fully_connected],
                weights_regularizer=weights_regularizer,
                biases_initializer=tf.constant_initializer(1.),
                weights_initializer=tf.random_normal_initializer(stddev=0.005),
                trainable=True):
            with tf.contrib.slim.arg_scope(
                    [convolution2d],
                    weights_initializer=tf.random_normal_initializer(stddev=0.01)):

                conv1 = convolution2d(images, 96, [7,7], [4, 4], padding='VALID', biases_initializer=tf.constant_initializer(0.), scope='conv1')
                pool1 = max_pool2d(conv1, 3, 2, padding='VALID', scope='pool1')
                norm1 = tf.nn.local_response_normalization(pool1, 5, alpha=0.0001, beta=0.75, name='norm1')
                conv2 = convolution2d(norm1, 256, [5, 5], [1, 1], padding='SAME', scope='conv2') 
                pool2 = max_pool2d(conv2, 3, 2, padding='VALID', scope='pool2')
                norm2 = tf.nn.local_response_normalization(pool2, 5, alpha=0.0001, beta=0.75, name='norm2')
                conv3 = convolution2d(norm2, 384, [3, 3], [1, 1], biases_initializer=tf.constant_initializer(0.), padding='SAME', scope='conv3')
                pool3 = max_pool2d(conv3, 3, 2, padding='VALID', scope='pool3')
                flat = tf.reshape(pool3, [-1, 384*6*6], name='reshape')
                full1 = fully_connected(flat, 512, scope='full1')
                drop1 = tf.nn.dropout(full1, pkeep, name='drop1')
                full2 = fully_connected(drop1, 512, scope='full2')
                drop2 = tf.nn.dropout(full2, pkeep, name='drop2')

This is a typical convolutional neural network structure for classification.
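The reshape to 384*6*6 depends on the spatial size of the input. As a rough check, the sketch below traces the feature-map size through the convolution and pooling layers listed above, using TensorFlow's rules for 'VALID' and 'SAME' padding. The 227x227 input size is an assumption (it is not stated in this section), and the helper names are made up for illustration.

```python
import math

def out_size(size, kernel, stride, padding):
    """Output length along one spatial axis under TensorFlow's padding rules."""
    if padding == 'VALID':
        return (size - kernel) // stride + 1
    if padding == 'SAME':
        return math.ceil(size / stride)
    raise ValueError('unknown padding: %r' % padding)

# (name, kernel, stride, padding), copied from the layer definitions above.
# The local_response_normalization layers do not change the spatial size.
LAYERS = [
    ('conv1', 7, 4, 'VALID'),
    ('pool1', 3, 2, 'VALID'),
    ('conv2', 5, 1, 'SAME'),
    ('pool2', 3, 2, 'VALID'),
    ('conv3', 3, 1, 'SAME'),
    ('pool3', 3, 2, 'VALID'),
]

def trace_sizes(input_size):
    """Return the spatial size after each layer for a square input."""
    sizes = {}
    size = input_size
    for name, kernel, stride, padding in LAYERS:
        size = out_size(size, kernel, stride, padding)
        sizes[name] = size
    return sizes

sizes = trace_sizes(227)  # assumed input size, not taken from the article
flat_length = 384 * sizes['pool3'] ** 2  # 384 channels come from conv3
print(sizes)
print(flat_length)
```

Under that assumption pool3 comes out at 6x6, which is consistent with the 384*6*6 used in the reshape; a different input size would need a different flatten length.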

The L2 Norm

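Consider the line weights_regularizer = tf.contrib.layers.l2_regularizer(weight_decay).
l2_regularizer refers to the L2 norm. From linear algebra and matrix theory we also know the L1 norm, but the more popular norm for regularization is the L2 norm, ||W||2. It is in no way inferior to the L1 norm, and it goes by two well-known names: in regression, a model that uses it is called "ridge regression", and it is also called "weight decay".
Its great strength is that it mitigates a very important problem in machine learning: overfitting. As explained earlier, overfitting means the error is small during training but large at test time. The model is complex enough to fit every training sample, yet it performs terribly when predicting new samples. Put plainly, it is very good at exams and very bad at real-world work.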
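The figure below shows roughly what overfitting looks like (the green line).
[Figure: an overfitted classifier (green line) compared with a regularized classifier (black line) on two classes of points]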
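The figure contains two classes, shown as blue and red circles. The green line is the overfitted classifier. It follows the training data exactly, depends heavily on it, and will probably perform worse on unseen data than the black line, which represents a regularized model. The goal of regularization is therefore a simple model without any unnecessary complexity. We choose L2 regularization to achieve this: it adds the sum of the squares of all the weights in the network to the loss function. A model that uses large weights receives a large penalty, and a model that uses small weights receives a small one.
This is why we pass the regularizer argument when defining the weights and assign it an l2_regularizer. This tells TensorFlow to track the L2 regularization terms for those variables (and to weight them by the parameter reg_constant, which corresponds to weight_decay in the code above). All regularization terms are added to a collection that the loss function can access: tf.GraphKeys.REGULARIZATION_LOSSES.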

The sum of all regularization losses is then added to the previously computed cross-entropy to give the model's total loss.
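As a minimal sketch of that sum in plain Python (this is not the TensorFlow code itself): the penalty below uses the scale * sum(w^2) / 2 form that tf.contrib.layers.l2_regularizer computes, while the function names, the cross-entropy value, and the weight lists are made-up values for illustration only.

```python
def l2_penalty(weights, scale):
    """L2 penalty for one group of weights: scale * sum(w**2) / 2."""
    return scale * sum(w * w for w in weights) / 2.0

def total_loss(cross_entropy, weight_groups, scale):
    """Cross-entropy plus the sum of the L2 penalties of every weight group."""
    regularization_losses = [l2_penalty(w, scale) for w in weight_groups]
    return cross_entropy + sum(regularization_losses)

weight_decay = 0.0005                # same value as in the model code
small_weights = [0.1, -0.2, 0.05]    # hypothetical layer with small weights
large_weights = [10.0, -20.0, 5.0]   # hypothetical layer with large weights
cross_entropy = 1.2                  # hypothetical data loss

small_penalty = l2_penalty(small_weights, weight_decay)
large_penalty = l2_penalty(large_weights, weight_decay)
loss = total_loss(cross_entropy, [small_weights, large_weights], weight_decay)
print(small_penalty, large_penalty, loss)
```

The large weights are 100 times the small ones, so their penalty is 10,000 times larger, which is the "large weights, large penalty" behaviour described above.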

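The intuition: constraining the parameters to be small limits the influence of some of the polynomial's components, which is roughly equivalent to reducing the number of parameters, so overfitting is reduced.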
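To make the shrinking effect concrete, here is a small sketch that uses one-weight ridge regression as a stand-in for the network (the network itself is not trained this way). For the objective sum((y - w*x)^2) + lam * w^2 the minimizer has the closed form w = sum(x*y) / (sum(x*x) + lam). The data points and lam values are invented for illustration.

```python
def ridge_slope(xs, ys, lam):
    """Minimizer of sum((y - w*x)**2) + lam * w**2 for a single weight w."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)

# Toy data lying exactly on y = 2x (hypothetical values).
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]

# A larger penalty strength pulls the fitted weight toward zero.
lams = (0.0, 14.0, 126.0)
slopes = [ridge_slope(xs, ys, lam) for lam in lams]
for lam, w in zip(lams, slopes):
    print(lam, w)
```

With no penalty the fit recovers the slope of 2; as lam grows, the weight is pushed toward zero and that term contributes less and less to the prediction.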
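Next we come to with tf.variable_scope("LeviHassner", "LeviHassner", [images]) as scope:
There are quite a few scopes here, so what is a scope?
To manage variables better, TensorFlow provides the variable scope mechanism. For details, see this official TensorFlow article: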
https://www.tensorflow.org/programmers_guide/variable_scope
Next comes something more complicated, tf.contrib.slim. What on earth is that? Here is the official explanation:
https://github.com/tensorflow/tensorflow/tree/master/tensorflow/contrib/slim
TF-Slim is a lightweight library for defining, training and evaluating complex models in TensorFlow. Components of tf-slim can be freely mixed with native tensorflow, as well as other frameworks, such as tf.contrib.learn.
What we use here is arg_scope:
arg_scope: provides a new scope named arg_scope that allows a user to define default arguments for specific operations within that scope.

Convolutional Neural Networks

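A convolutional neural network (CNN) consists of an input layer, convolutional layers, activation functions, pooling layers, and fully connected layers: INPUT-CONV-RELU-POOL-FC.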
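To show what each stage of INPUT-CONV-RELU-POOL-FC does to the data, here is a toy single-channel version in plain Python. It is only a sketch: the function names, the 4x4 image, the 2x2 kernel, and the fully connected weights are invented for illustration, and real layers work on many channels and learn their weights.

```python
def conv2d_valid(image, kernel):
    """Slide the kernel over the image with stride 1 and no padding."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[r + i][c + j] * kernel[i][j]
                 for i in range(kh) for j in range(kw))
             for c in range(out_w)]
            for r in range(out_h)]

def relu(fmap):
    """Replace negative activations with zero."""
    return [[max(0, v) for v in row] for row in fmap]

def max_pool(fmap, size, stride):
    """Keep the largest value in each size x size window."""
    out_h = (len(fmap) - size) // stride + 1
    out_w = (len(fmap[0]) - size) // stride + 1
    return [[max(fmap[r * stride + i][c * stride + j]
                 for i in range(size) for j in range(size))
             for c in range(out_w)]
            for r in range(out_h)]

def fully_connected(fmap, weights, bias):
    """Flatten the feature map and take a weighted sum plus bias."""
    flat = [v for row in fmap for v in row]
    return sum(x * w for x, w in zip(flat, weights)) + bias

image = [[1, 2, 0, 1],
         [0, 1, 3, 1],
         [2, 1, 0, 0],
         [1, 0, 1, 2]]
kernel = [[1, 0],
          [0, -1]]

conv_out = conv2d_valid(image, kernel)    # 4x4 input -> 3x3 feature map
relu_out = relu(conv_out)
pool_out = max_pool(relu_out, size=2, stride=1)  # 3x3 -> 2x2
score = fully_connected(pool_out, weights=[1, -1, 2, 1], bias=1)
print(conv_out, relu_out, pool_out, score)
```

Each stage shrinks or filters the representation, and the fully connected stage turns what remains into a score; the real model repeats the conv and pool stages three times before its two fully connected layers.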
Now we finally reach the network layers. Let's look at the code:

                conv1 = convolution2d(images, 96, [7,7], [4, 4], padding='VALID', biases_initializer=tf.constant_initializer(0.), scope='conv1')
                pool1 = max_pool2d(conv1, 3, 2, padding='VALID', scope='pool1')
                conv2 = convolution2d(pool1, 256, [5, 5], [1, 1], padding='SAME', scope='conv2') 
                pool2 = max_pool2d(conv2, 3, 2, padding='VALID', scope='pool2')
                co