TensorFlow Age and Gender Recognition (Age & Gender)

Author: kebi199312

Categories: machine vision, image algorithms, deep learning. Tags: deep learning, age and gender recognition

Article link: https://blog.csdn.net/kebi199312/article/details/86978876

1. Configuration

Environment: Windows 7 (64-bit), 8 GB RAM, GeForce GTX 1050 GPU; Python 3.5.2, tensorflow-gpu 1.8.0, CUDA 9.0, cuDNN 7.1.4, opencv_contrib_python-3.3.0.10.

For setting up tensorflow-gpu on Windows, see my other post: https://blog.csdn.net/kebi199312/article/details/86549637

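Code and data referenced for age and gender recognition:

Source code: https://github.com/dpressel/rude-carnie

Dataset download (Adience dataset): http://www.openu.ac.il/home/hassner/Adience/data.html#agegender

The Adience dataset contains 26,580 images of 2,284 subjects, labeled with 8 age ranges (0~2, 4~6, 8~13, 15~20, 25~32, 38~43, 48~53, 60~), and includes noise as well as pose and lighting variation. The aligned archive holds cropped and aligned faces; faces holds the original data. fold_0_data.txt through fold_4_data.txt contain the labels for all of the data, while fold_frontal_0_data.txt through fold_frontal_4_data.txt label only the approximately frontal faces. Each record has the fields user_id (Flickr account ID), original_image (image file name), face_id (person identifier), age, gender, x, y, dx, dy (face bounding box), tilt_ang (tilt angle), fiducial_yaw_angle (yaw angle from the fiducial points) and fiducial_score (fiducial score).

Because the FTP site is blocked, here is a Baidu Netdisk download link: https://pan.baidu.com/s/1P938ZuI9xjU2KM--JxJjBA (password: kihq)

Common image databases: https://blog.csdn.net/qq_14845119/article/details/51913171

Reference code: https://github.com/GilLevi/AgeGenderDeepLearning, also available for download at: https://download.csdn.net/download/kebi199312/10906170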

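2. Algorithm implementation

Gender classification is naturally a binary classification problem. Age prediction is by nature a regression problem, but the approach used here divides age into several ranges (0~2, 4~6, 8~13, 15~20, 25~32, 38~43, 48~53, 60~) and treats each range as one class, which turns age prediction into a multi-class classification problem.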
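To make the "age range as class" idea concrete, here is a minimal sketch that maps an age in years to a class index. The function name age_to_class and the upper bound of 100 for the open-ended last range are assumptions made for illustration; they do not come from the referenced code.

```python
# Age ranges as listed in the article. The last range is open-ended ("60~"),
# so an upper bound of 100 is assumed here purely for illustration.
AGE_RANGES = [(0, 2), (4, 6), (8, 13), (15, 20), (25, 32), (38, 43), (48, 53), (60, 100)]


def age_to_class(age):
    """Return the index of the range that contains `age`, or None if it falls in a gap."""
    for index, (low, high) in enumerate(AGE_RANGES):
        if low <= age <= high:
            return index
    return None


print(age_to_class(1))   # 0
print(age_to_class(30))  # 4
print(age_to_class(22))  # None: 22 lies between the 15~20 and 25~32 ranges
```

Note that the ranges leave gaps (for example 21 to 24), so a label derived this way is only defined for ages that fall inside one of the eight ranges.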

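A convolutional neural network (CNN) is built from an input layer, convolutional layers, activation functions, pooling layers and fully connected layers. The source code offers two network models, Inception V3 and Levi_Hassner_Bn. Inception V3 consists of a group of non-Inception modules followed by three groups of Inception modules: ordinary convolutional layers do the preprocessing, and the three Inception module groups come after them; interested readers can look up the Inception V3 network. Levi_Hassner_Bn is much shallower, with 3 convolutional layers, 3 pooling layers and 2 fully connected layers; the small number of layers helps avoid overfitting.

The role of each building block of the Levi_Hassner_Bn network is described below.

Convolutional layer (conv layer)

A convolutional layer is generally used for feature extraction; it can enhance certain features of the original image and reduce noise. The number of channels in its output equals the number of kernels N. Figure 1 shows how a convolutional layer is computed:

Figure 1: How a convolutional layer is computed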
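Since the figure itself is not reproduced here, the following is a small pure-Python sketch of the sliding-window computation for a single channel and a single kernel, without padding. The function name conv2d_valid and the sample values are made up for illustration; real frameworks do the same thing over many channels and kernels at once.

```python
def conv2d_valid(image, kernel, stride=1):
    """Slide a square kernel over a square image (no padding) and sum the products."""
    k = len(kernel)
    out_size = (len(image) - k) // stride + 1
    output = []
    for row in range(out_size):
        out_row = []
        for col in range(out_size):
            total = 0
            for i in range(k):
                for j in range(k):
                    total += image[row * stride + i][col * stride + j] * kernel[i][j]
            out_row.append(total)
        output.append(out_row)
    return output


image = [
    [1, 2, 3, 0],
    [0, 1, 2, 3],
    [3, 0, 1, 2],
    [2, 3, 0, 1],
]
kernel = [
    [1, 0],
    [0, 1],
]
print(conv2d_valid(image, kernel))  # [[2, 4, 6], [0, 2, 4], [6, 0, 2]]
```

A 4*4 input with a 2*2 kernel and stride 1 gives a 3*3 output, which agrees with the output-size formula given later in the article.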

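Activation layer (ReLU)

Applies a non-linear mapping to the output of the convolutional layer.

Pooling layer (max pooling layer)

A pooling layer discards the less important parts of the feature maps produced by convolution, reducing the dimensionality of the extracted features and, in turn, the number of network parameters. On the one hand this shrinks the feature maps, which lowers the computational cost of the network and helps to some extent against overfitting; on the other hand it compresses the features and keeps the dominant ones. A pooling layer does not change the number of channels. Max pooling is used here: the maximum value in each n*n window becomes the sampled value. Figure 2 shows the computation for 2*2 max pooling:

Figure 2: How a pooling layer is computed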
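As a stand-in for the figure, here is a minimal pure-Python sketch of 2*2 max pooling with stride 2 on a single channel. The function name max_pool and the sample feature map are illustrative only.

```python
def max_pool(feature_map, size=2, stride=2):
    """Take the maximum of each size*size window of a square feature map."""
    out_size = (len(feature_map) - size) // stride + 1
    return [
        [
            max(
                feature_map[r * stride + i][c * stride + j]
                for i in range(size)
                for j in range(size)
            )
            for c in range(out_size)
        ]
        for r in range(out_size)
    ]


feature_map = [
    [1, 3, 2, 4],
    [5, 6, 1, 2],
    [7, 2, 9, 0],
    [1, 8, 3, 4],
]
print(max_pool(feature_map))  # [[6, 4], [8, 9]]
```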

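Fully connected layer

A fully connected layer connects all of the features and passes its output to a classifier (here a softmax classifier), which produces the classification result. The output of a fully connected layer is an n*1 vector.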
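The two steps described above, a fully connected layer followed by softmax, can be sketched in a few lines of plain Python. The weights, biases and input below are made-up numbers chosen so the arithmetic is easy to follow; the function names are illustrative as well.

```python
import math


def fully_connected(inputs, weights, biases):
    """One output per neuron: dot product of the inputs with that neuron's weights, plus its bias."""
    return [
        sum(x * w for x, w in zip(inputs, neuron_weights)) + bias
        for neuron_weights, bias in zip(weights, biases)
    ]


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    largest = max(logits)
    exps = [math.exp(value - largest) for value in logits]  # shift for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


features = [1.0, 2.0]
weights = [[0.5, 0.5], [1.0, -1.0], [0.0, 1.0]]  # 3 neurons, 2 inputs each
biases = [0.0, 0.0, 0.0]

logits = fully_connected(features, weights, biases)
probabilities = softmax(logits)
print(logits)                                    # [1.5, -1.0, 2.0]
print(probabilities.index(max(probabilities)))   # 2, the predicted class
```

The length of the output vector is the number of neurons (3 here), and the predicted class is the index with the largest probability.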

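Computation flow of the network:

conv1: input -> convolution -> ReLU -> max pool 1

conv2: convolution -> ReLU -> max pool 2

conv3: convolution -> ReLU -> max pool 3

fully connected 1: matrix multiplication -> ReLU -> dropout

fully connected 2: matrix multiplication -> ReLU -> softmax

Overall architecture: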

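Computing the output size of an image after convolutional and pooling layers.

Output size (tensor) of a convolutional layer:

Definitions:

O = output image size;

I = input image size;

F = kernel size of the convolutional layer;

N = number of kernels in the convolutional layer;

S = convolution stride, which determines how many steps it takes to slide to the edge;

P = padding, the number of rings of zeros added around the border so that the kernel, moving from its start position in steps of the stride, lands exactly on the last position;

The output size is computed as:

$O=\frac{I-F+2P}{S}+1$ (1)

The number of output channels equals the number of kernels N.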

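Choosing the padding P: zero padding adds a border of width P around the image, filled with zeros, which leaves the original input values unchanged. The kernel size F and the padding P are related as follows: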

F=3 => zero padding with 1

F=5 => zero padding with 2

F=7 => zero padding with 3

$P=\frac{F-1}{2}$ (2)

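Feature-map sizes can be computed in two modes, 'VALID' and 'SAME', and both apply to convolutional and pooling layers alike. This article rounds any non-integer result up, which matches the Caffe pooling convention followed by the table below; TensorFlow's own 'VALID' mode rounds the division down instead. With padding='VALID' in a convolutional layer, P=0, so: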

$O=\frac{I-F}{S}+1$ (3)

With padding='SAME', P is chosen according to formula (2), so:

$O=\frac{I-F+2P}{S}+1$ (4)
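The padding rule (2) and the size formulas (3) and (4) are easy to check numerically. The sketch below uses two illustrative helper names, same_padding and conv_output_size, and examples in which the division is exact, so the rounding convention does not matter.

```python
def same_padding(kernel_size):
    """Formula (2): P = (F - 1) / 2, for odd kernel sizes."""
    return (kernel_size - 1) // 2


def conv_output_size(input_size, kernel_size, stride, padding):
    """Formulas (3) and (4): O = (I - F + 2P) / S + 1 (integer division)."""
    return (input_size - kernel_size + 2 * padding) // stride + 1


print(same_padding(3), same_padding(5), same_padding(7))  # 1 2 3
print(conv_output_size(227, 7, 4, 0))                     # 56, 'VALID' with P = 0
print(conv_output_size(28, 5, 1, same_padding(5)))        # 28, 'SAME' keeps the size at stride 1
```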

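Output size of a pooling layer:

Definitions:

O = output image size;

I = input image size;

F = kernel (window) size of the pooling layer;

N = number of pooling kernels;

S = pooling stride;

P = padding;

The output size is computed as:

$O=\frac{I-F}{S}+1$ (5)

Unlike a convolutional layer, a pooling layer does not change the number of channels.

Output size of a fully connected layer:

The length of the output vector of a fully connected layer equals its number of neurons.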

Layer	Tensor size	Operation
Input	227*227*3
conv1	56*56*96	96 7*7 filters at stride 4, pad 0: (227-7)/4 + 1 = 56
max pool1	28*28*96	3*3 filters at stride 2, pad 0: (56-3)/2 + 1 = 27.5, rounded up to 28
conv2	28*28*256	256 5*5 filters at stride 1, pad 2: (28-5+2*2)/1 + 1 = 28
max pool2	14*14*256	3*3 filters at stride 2, pad 0: (28-3)/2 + 1 = 13.5, rounded up to 14
conv3	14*14*384	384 3*3 filters at stride 1, pad 1: (14-3+2*1)/1 + 1 = 14
max pool3	7*7*384	3*3 filters at stride 2, pad 0: (14-3)/2 + 1 = 6.5, rounded up to 7
fc1	1*1*512	512 neurons
fc2	1*1*512	512 neurons
softmax	1*1*8	8 neurons
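The sizes in the table can be reproduced with a short script. This is only a sketch of the arithmetic, not code from the referenced repository: the helper names are made up, and non-integer results are rounded up because that is the convention the article states.

```python
import math


def output_size(input_size, kernel_size, stride, padding=0):
    """O = (I - F + 2P) / S + 1, rounding a non-integer result up as the article does."""
    return math.ceil((input_size - kernel_size + 2 * padding) / stride + 1)


# (name, kernel size F, stride S, padding P, output channels; None keeps the channel count)
LAYERS = [
    ("conv1", 7, 4, 0, 96),
    ("max pool1", 3, 2, 0, None),
    ("conv2", 5, 1, 2, 256),
    ("max pool2", 3, 2, 0, None),
    ("conv3", 3, 1, 1, 384),
    ("max pool3", 3, 2, 0, None),
]


def trace_shapes(size=227, channels=3):
    """Return (layer name, spatial size, channels) for the input and every layer."""
    shapes = [("Input", size, channels)]
    for name, kernel, stride, pad, out_channels in LAYERS:
        size = output_size(size, kernel, stride, pad)
        if out_channels is not None:
            channels = out_channels
        shapes.append((name, size, channels))
    return shapes


for name, size, channels in trace_shapes():
    print("{:<10} {}*{}*{}".format(name, size, size, channels))
```

Swapping math.ceil for math.floor shows how much the result depends on the rounding convention: the pooling outputs would then be 27, 13 and 6 instead of 28, 14 and 7.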

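Network model:

Layer 1 (convolutional): the input is a 227*227 image with 3 channels, where 3 is its depth (R, G, B). The convolutional layer has 96 filters with a 7*7 kernel and stride 4; each filter therefore holds three 7*7 kernels, one per input channel. Padding is 0 and the activation function is ReLU. Pooling is overlapping max pooling with a 3*3 window, stride 2 and pad 0, giving a final output of 28*28*96. The computation is shown in Figure 3;

Figure 3: Network structure of the first layer

Layer 2 (convolutional): the input is the 28*28*96 feature map from layer 1 (the three color channels were already combined by the first convolution). This layer has 256 filters of size 5*5 with stride 1 and pad 2, for which the AlexNet structure can serve as a reference; the pooling layer uses the same parameters as above, and the final output is 14*14*256;

Layer 3 (convolutional): the input is the 14*14*256 feature map; there are 384 filters with a 3*3 kernel, stride 1 and pad 1; the pooling layer uses the same parameters as above, and the final output is 7*7*384;

Layer 4 (fully connected): the first fully connected layer, with 512 neurons;

Layer 5 (fully connected): the second fully connected layer, with 512 neurons;

Layer 6 (output): the output layer; age is a multi-class problem, so the number of output neurons is 8.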

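Network training:

1. Parameter initialization: weights are initialized from a Gaussian (normal) distribution with mean 0 and standard deviation 0.01;

2. Training: dropout is used to limit overfitting. A 256*256 image is fed in and randomly cropped to 227*227, with the crop kept around the center of the face;

3. Prediction: a 256*256 input image is cropped into five 227*227 images. Four of the crops are anchored at the four corners of the 256*256 image and the fifth is centered on the face. The network predicts on all five crops and the predictions are averaged.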
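The five-crop prediction in step 3 can be sketched as follows. This is an illustration rather than the repository's code: the function names are made up, and the fifth crop is taken at the image center, on the assumption that the aligned face images are roughly centered.

```python
IMAGE_SIZE = 256
CROP_SIZE = 227


def five_crop_boxes(image_size=IMAGE_SIZE, crop_size=CROP_SIZE):
    """Return (left, top, right, bottom) boxes: the four corners, then the center."""
    margin = image_size - crop_size
    center = margin // 2  # assumption: the face center coincides with the image center
    offsets = [(0, 0), (margin, 0), (0, margin), (margin, margin), (center, center)]
    return [(x, y, x + crop_size, y + crop_size) for x, y in offsets]


def average_predictions(predictions):
    """Average several per-class probability lists element by element."""
    count = len(predictions)
    return [sum(values) / count for values in zip(*predictions)]


boxes = five_crop_boxes()
for box in boxes:
    print(box)

# Two made-up two-class predictions, averaged class by class.
print(average_predictions([[0.2, 0.8], [0.4, 0.6]]))
```

In a real pipeline each box would be used to slice the image, the network would be run on every crop, and average_predictions would combine the five softmax outputs.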

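3. Using tf.app.flags.FLAGS

Source code: https://github.com/dpressel/rude-carnie

The source code makes heavy use of tf.app.flags. TensorFlow's tf.app.flags is similar to the argparse module: both let you define input parameters that can be passed on the command line, much like reading argv.

tf.app.flags provides several definers, including tf.app.flags.DEFINE_string, tf.app.flags.DEFINE_boolean and tf.app.flags.DEFINE_integer.

The following example shows how to use tf.app.flags.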

import tensorflow as tf

# Arguments: flag name, default value (here the path for 'model_dir'), and description; pass '' if no description is wanted
tf.app.flags.DEFINE_string('model_dir', 'AgeGenderDeepLearning-master/Folds/tf/test_fold_is_0/run-7656/',
                           "Model directory (where training data lives)")
tf.app.flags.DEFINE_integer('max_steps', 20000, "description")
tf.app.flags.DEFINE_boolean('single_look', False, "single look at the image or multiple crops")

FLAGS = tf.app.flags.FLAGS

def main(_):
    print(FLAGS.model_dir)
    print(FLAGS.max_steps)
    print(FLAGS.single_look)

if __name__ == '__main__':
    tf.app.run()

Figure 4: Using tf.app.flags

Output:

AgeGenderDeepLearning-master/Folds/tf/test_fold_is_0/run-7656/
20000
False

The code defines string, integer and boolean flags and then sets FLAGS = tf.app.flags.FLAGS. The entry point at the end of the script is:

if __name__ == '__main__':
    tf.app.run()

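tf.app.run() is the entry point, similar to main() in C/C++: it first parses the flag definitions and then calls main(). When the program reaches tf.app.run(), execution jumps to main(). Flags are added with functions such as tf.app.flags.DEFINE_string(), and inside main() each one is read as an attribute of FLAGS under the name it was defined with.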
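Because the text compares tf.app.flags with argparse, a standard-library counterpart of the example above may help readers who do not have TensorFlow 1.x installed. This is a sketch of the analogy, not code from the repository; build_parser is a made-up name.

```python
import argparse


def build_parser():
    """argparse counterpart of the three flags defined in the example above."""
    parser = argparse.ArgumentParser(description='argparse version of the flags example')
    parser.add_argument('--model_dir', type=str,
                        default='AgeGenderDeepLearning-master/Folds/tf/test_fold_is_0/run-7656/',
                        help='Model directory (where training data lives)')
    parser.add_argument('--max_steps', type=int, default=20000, help='description')
    # store_true yields a boolean that defaults to False, like DEFINE_boolean(..., False, ...)
    parser.add_argument('--single_look', action='store_true',
                        help='single look at the image or multiple crops')
    return parser


def main(argv=None):
    # argv=None would make argparse read sys.argv[1:], as a command-line script does
    args = build_parser().parse_args(argv)
    print(args.model_dir)
    print(args.max_steps)
    print(args.single_look)
    return args


# An empty list means "no command-line arguments", so the defaults are printed
args = main([])
```

Passing main(['--max_steps', '100', '--single_look']) instead overrides two of the defaults, in the same way that command-line arguments override flag defaults.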

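4. Data preprocessing

Reference code: https://github.com/dpressel/rude-carnie/preproc.py

Extract the aligned and faces archives of the Adience dataset, change 'fold_dir', 'data_dir' and 'output_dir' in preproc.py to your own directories, and run the script. My settings are:

'fold_dir' is changed to:

'AgeGenderDeepLearning-master/Folds/train_val_txt_files_per_fold/test_fold_is_0'

'data_dir' is changed to:

'AdienceBenchmarkOfUnfilteredFacesForGenderAndAgeClassification/aligned/aligned'

'output_dir' is changed to:

'AgeGenderDeepLearning-master/Folds/tf/test_fold_is_0'

Here 'fold_dir' relies on the https://github.com/GilLevi/AgeGenderDeepLearning-master/Folds folder of the reference code, which already splits the data into training and test sets and labels it. 'data_dir' is the path of the extracted aligned directory, and 'output_dir' is the tf/test_fold_is_0 folder under AgeGenderDeepLearning-master/Folds. 'train_list' and 'valid_list' are set to the age lists (age_train.txt and age_val.txt), meaning that age is what will be recognized. With these set, preproc.py converts the Adience dataset into TFRecords files, split into a training set and a test (validation) set, with each image resized to a 256*256 RGB image. The generated TFRecords files are shown in Figure 5.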

tf.app.flags.DEFINE_string('fold_dir', 'AgeGenderDeepLearning-master/Folds/train_val_txt_files_per_fold/test_fold_is_0',
                           'Fold directory')

tf.app.flags.DEFINE_string('data_dir', 'AdienceBenchmarkOfUnfilteredFacesForGenderAndAgeClassification/aligned/aligned',
                           'Data directory')


tf.app.flags.DEFINE_string('output_dir', 'AgeGenderDeepLearning-master/Folds/tf/test_fold_is_0',
                           'Output directory')
tf.app.flags.DEFINE_string('train_list', 'age_train.txt',
                           'Training list')
tf.app.flags.DEFINE_string('valid_list', 'age_val.txt',
                           'Test list')

Figure 5.1: The parts that need to be changed

Figure 5.2: TFRecords files generated after the run

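5. Building and training the model

Reference code: https://github.com/dpressel/rude-carnie/model.py

https://github.com/dpressel/rude-carnie/train.py

The code that builds the age and gender models is in model.py. To make the convolutional networks easier to write, it uses the high-level API tensorflow.contrib.slim, which wraps common networks and utilities so that they are convenient to call.

Once the network model is defined, the next step is training; the training code is in train.py. Taking age training as an example, modify the relevant parameters as shown in Figure 6. Set 'train_dir' to the directory containing the generated TFRecords; 'max_steps', 'batch_size' and other parameters can be adjusted as needed.

My settings are:

'train_dir' is changed to the directory containing the generated TFRecords:

'AgeGenderDeepLearning-master/Folds/tf/test_fold_is_0'

'max_steps' is 40000; training on the GPU took about 8 hours, and a smaller value shortens the run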

tf.app.flags.DEFINE_string('pre_checkpoint_path', '',
                           """If specified, restore this pretrained model """
                           """before beginning any training.""")

tf.app.flags.DEFINE_string('train_dir', 'AgeGenderDeepLearning-master/Folds/tf/test_fold_is_0',
                           'Training directory')

tf.app.flags.DEFINE_boolean('log_device_placement', False,
                            """Whether to log device placement.""")

tf.app.flags.DEFINE_integer('num_preprocess_threads', 4,
                            'Number of preprocessing threads')

tf.app.flags.DEFINE_string('optim', 'Momentum',
                           'Optimizer')

tf.app.flags.DEFINE_integer('image_size', 227,
                            'Image size')

tf.app.flags.DEFINE_float('eta', 0.01,
                          'Learning rate')

tf.app.flags.DEFINE_float('pdrop', 0.,
                          'Dropout probability')

tf.app.flags.DEFINE_integer('max_steps', 20000,
                          'Number of iterations')

tf.app.flags.DEFINE_integer('steps_per_decay', 10000,
                            'Number of steps before learning rate decay')
tf.app.flags.DEFINE_float('eta_decay_rate', 0.1,
                          'Learning rate decay')

tf.app.flags.DEFINE_integer('epochs', -1,
                            'Number of epochs')

tf.app.flags.DEFINE_integer('batch_size', 128,
                            'Batch size')

tf.app.flags.DEFINE_string('checkpoint', 'checkpoint',
                          'Checkpoint name')

tf.app.flags.DEFINE_string('model_type', 'default',
                           'Type of convnet')

tf.app.flags.DEFINE_string('pre_model',
                            '',#'./inception_v3.ckpt',
                           'checkpoint file')

Figure 6: Parameters to change in train.py
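The 'eta', 'steps_per_decay' and 'eta_decay_rate' flags together describe a learning-rate schedule. The sketch below assumes they combine into a staircase exponential decay (multiply the rate by eta_decay_rate once every steps_per_decay steps); that is an assumption based on the flag descriptions, and train.py may combine them differently.

```python
ETA = 0.01               # 'eta': initial learning rate
STEPS_PER_DECAY = 10000  # 'steps_per_decay'
ETA_DECAY_RATE = 0.1     # 'eta_decay_rate'


def learning_rate(step, eta=ETA, steps_per_decay=STEPS_PER_DECAY, decay_rate=ETA_DECAY_RATE):
    """Assumed staircase decay: the rate drops by decay_rate every steps_per_decay steps."""
    return eta * decay_rate ** (step // steps_per_decay)


for step in (0, 9999, 10000, 19999):
    print(step, learning_rate(step))
```

Under this assumption, a run with max_steps of 20000 would spend its first half at 0.01 and its second half at roughly 0.001.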

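After 20000 iterations, the generated checkpoint files are in the run-{pid} directory (pid is the process ID), as shown in Figure 7.

Figure 7: Checkpoint files generated after training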

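6. Validating the model (age)

Reference code: https://github.com/dpressel/rude-carnie/guess.py

The validation script is guess.py. Modify the relevant parameters in the code as shown in Figure 8; the test image is shown in Figure 9:

My final settings are:

'model_dir' is changed to the directory of the generated checkpoint files:

'AgeGenderDeepLearning-master/Folds/tf/test_fold_is_0/run-7656/'

'class_type' is set to 'age', because age is what was trained earlier
'filename' is set to the path of the image to validate, here '3.jpg'
'face_detection_model': the model chosen is haarcascade_frontalface_alt.xml, a Haar feature classifier for face detection that ships with OpenCV. Both haarcascade_frontalface_alt.xml and haarcascade_frontalface_alt2.xml, found under sources\data\haarcascades in the OpenCV installation directory, are Haar classifiers for detecting faces. The yolo_tiny model can be used instead: its checkpoint file can be downloaded from https://github.com/gliese581gg/YOLO_tensorflow/ and the YOLO detection code is at https://github.com/gliese581gg/YOLO_tensorflow/blob/master/YOLO_tiny_tf.py.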

tf.app.flags.DEFINE_string('model_dir', 'AgeGenderDeepLearning-master/Folds/tf/test_fold_is_0/run-504/',
                           'Model directory (where training data lives)')

tf.app.flags.DEFINE_string('class_type', 'age',
                           'Classification type (age|gender)')

tf.app.flags.DEFINE_string('device_id', '/cpu:0',
                           'What processing unit to execute inference on')

tf.app.flags.DEFINE_string('filename', '3.jpg',
                           'File (Image) or File list (Text/No header TSV) to process')

tf.app.flags.DEFINE_string('target', '',
                           'CSV file containing the filename processed along with best guess and score')

tf.app.flags.DEFINE_string('checkpoint', 'checkpoint',
                          'Checkpoint basename')

tf.app.flags.DEFINE_string('model_type', 'default',
                           'Type of convnet')

tf.app.flags.DEFINE_string('requested_step', '', 'Within the model directory, a requested step to restore e.g., 9000')

tf.app.flags.DEFINE_boolean('single_look', False, 'single look at the image or multiple crops')

tf.app.flags.DEFINE_string('face_detection_model', 'haarcascade_frontalface_alt.xml', 'Do frontal face detection with model specified')

tf.app.flags.DEFINE_string('face_detection_type', 'cascade', 'Face detection model type (yolo_tiny|cascade)')

Figure 8: Parameters to change in guess.py
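Once the network has produced one probability per age class, the final guess is simply the class with the highest probability. The sketch below illustrates that last step only; the label strings follow the age ranges listed in this article, and both they and the function name best_guess are assumptions rather than what guess.py actually uses.

```python
# Labels taken from the age ranges listed in the article (illustrative strings).
AGE_LABELS = ['0~2', '4~6', '8~13', '15~20', '25~32', '38~43', '48~53', '60~']


def best_guess(probabilities, labels=AGE_LABELS):
    """Return the label with the highest probability, together with that probability."""
    best_index = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return labels[best_index], probabilities[best_index]


# Made-up softmax output for one image (e.g. the average over several crops).
probs = [0.01, 0.02, 0.05, 0.10, 0.60, 0.15, 0.05, 0.02]
label, score = best_guess(probs)
print('Guess: %s, prob = %.2f' % (label, score))
```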