Model training is driven by the model_main.py script, which wraps the SSD training framework; which model to train, and its parameters, are specified in a pipeline config file.
The sample configs live under models/research/object_detection/samples/configs; each .config file describes one network configuration. Here we use ssd_resnet50_v1_fpn as the backbone.
Since there is no ready-made config for this task, copy the corresponding sample config and adapt it for face detection.
Changes:
ssd {
inplace_batchnorm_update: true
freeze_batchnorm: false
num_classes: 1 # face detection only, so a single class; background is not counted here
box_coder {
faster_rcnn_box_coder {
y_scale: 10.0
x_scale: 10.0
height_scale: 5.0
width_scale: 5.0
}
}
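The box_coder scales above multiply the regression targets. As a rough illustration, here is a pure-Python sketch of the faster_rcnn_box_coder encoding (not the library code itself; `encode` and `_center_size` are hypothetical helper names):

```python
import math

# Scale factors from the config above.
Y_SCALE, X_SCALE, H_SCALE, W_SCALE = 10.0, 10.0, 5.0, 5.0

def _center_size(b):
    """Convert (ymin, xmin, ymax, xmax) to (ycenter, xcenter, height, width)."""
    ymin, xmin, ymax, xmax = b
    return ((ymin + ymax) / 2, (xmin + xmax) / 2, ymax - ymin, xmax - xmin)

def encode(box, anchor):
    """Encode a ground-truth box against an anchor, faster_rcnn_box_coder style."""
    ya, xa, ha, wa = _center_size(anchor)
    y, x, h, w = _center_size(box)
    ty = (y - ya) / ha * Y_SCALE      # scaled center offsets
    tx = (x - xa) / wa * X_SCALE
    th = math.log(h / ha) * H_SCALE   # scaled log size ratios
    tw = math.log(w / wa) * W_SCALE
    return ty, tx, th, tw

# An anchor that exactly matches the ground truth encodes to all zeros.
print(encode((0.2, 0.2, 0.4, 0.4), (0.2, 0.2, 0.4, 0.4)))
```

Larger scale factors produce proportionally larger regression targets, and therefore a larger localization loss, which is what the comment later in this document alludes to.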
Update the file paths at the end of the config:
train_input_reader: {
tf_record_input_reader {
input_path: "/Users/apple/Downloads/11人脸识别/数据集/widerface/tf-data/train.record"
}
label_map_path: "/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/models/research/object_detection/data/face_label_map.pbtxt"
}
eval_config: {
metrics_set: "coco_detection_metrics"
use_moving_averages: false
num_examples: 8000
}
eval_input_reader: {
tf_record_input_reader {
input_path: "/Users/apple/Downloads/11人脸识别/数据集/widerface/tf-data/test.record"
}
label_map_path: "/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/models/research/object_detection/data/face_label_map.pbtxt"
shuffle: false
num_readers: 1
}
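Before launching training it is worth confirming that every path referenced in the config actually exists on your machine; a trivial sketch (the `missing` helper is just for illustration):

```python
import os

def missing(paths):
    """Return the subset of configured paths that do not exist on this machine."""
    return [p for p in paths if not os.path.exists(p)]

# Fill in the input_path and label_map_path values from your own config.
problems = missing([
    ".",  # placeholder; replace with your train.record, test.record, and label map paths
])
if problems:
    print("fix these paths before training:", problems)
```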
Since we are not using a pretrained model, comment out the checkpoint line:
train_config: {
#fine_tune_checkpoint: "PATH_TO_BE_CONFIGURED/model.ckpt"
batch_size: 64
sync_replicas: true
startup_delay_steps: 0
replicas_to_aggregate: 8
num_steps: 25000
data_augmentation_options {
random_horizontal_flip {
}
}
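random_horizontal_flip, the only augmentation enabled here, mirrors the image and its boxes left-to-right. For normalized (ymin, xmin, ymax, xmax) boxes, only the x coordinates change; a minimal sketch (`hflip_box` and `maybe_flip` are hypothetical helper names; the image pixels would be mirrored the same way):

```python
import random

def hflip_box(box):
    """Mirror a normalized (ymin, xmin, ymax, xmax) box left-to-right."""
    ymin, xmin, ymax, xmax = box
    # The old right edge becomes the new left edge, reflected about x = 0.5.
    return (ymin, 1.0 - xmax, ymax, 1.0 - xmin)

def maybe_flip(boxes, p=0.5):
    """Flip all boxes of an image with probability p (the usual default)."""
    if random.random() < p:
        return [hflip_box(b) for b in boxes]
    return boxes
```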
Next, train the model:
Open a terminal and cd into the research directory (the parent of the object_detection folder). The command takes several flags: the pipeline config path, the model directory (where checkpoints and logs are written), the number of training steps, and a logging flag.
python3 object_detection/model_main.py --pipeline_config_path=/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/models/research/object_detection/samples/configs/ssd_resnet50_v1_fpn_shared_box_predictor_640x640_coco14_sync_face.config --model_dir=/Users/apple/Downloads/11人脸识别/数据集/widerface/resnet50_fpn_model --num_train_steps=100000 --alsologtostderr
Run the command and training starts.
After training completes, export the frozen inference graph (.pb file):
python3 object_detection/export_inference_graph.py --input_type=image_tensor --pipeline_config_path=/anaconda3/envs/tensorflow/lib/python3.6/site-packages/tensorflow/models/research/object_detection/samples/configs/ssd_resnet50_v1_fpn_shared_box_predictor_640x640_coco14_sync_face.config --trained_checkpoint_prefix=/Users/apple/Downloads/11人脸识别/数据集/widerface/resnet50_fpn_model/model.ckpt-1 --output_directory=/Users/apple/Downloads/11人脸识别/数据集/widerface/resnet50_fpn_model/pb
Walking through the config:
ssd {
inplace_batchnorm_update: true
freeze_batchnorm: false
num_classes: 1
box_coder {
faster_rcnn_box_coder {
y_scale: 10.0 # scales up the regression target; larger targets also make the final localization loss larger
x_scale: 10.0
height_scale: 5.0
width_scale: 5.0
}
}
# Decide whether each anchor box is a positive or negative sample by matching it against the ground-truth face boxes: IoU of 0.5 or above is positive, below 0.5 is negative
matcher {
argmax_matcher {
matched_threshold: 0.5
unmatched_threshold: 0.5
ignore_thresholds: false
negatives_lower_than_unmatched: true
force_match_for_each_row: true
use_matmul_gather: true
}
}
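With matched_threshold and unmatched_threshold both set to 0.5, an anchor is positive when its best IoU against any ground-truth box is at least 0.5 and negative otherwise; nothing falls into an "ignore" band. A pure-Python sketch of that rule (`iou` and `label_anchor` are illustrative helpers, not the library API):

```python
def iou(a, b):
    """Intersection-over-union of two (ymin, xmin, ymax, xmax) boxes."""
    ymin, xmin = max(a[0], b[0]), max(a[1], b[1])
    ymax, xmax = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ymax - ymin) * max(0.0, xmax - xmin)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def label_anchor(anchor, gt_boxes, threshold=0.5):
    """Positive if the best IoU reaches the threshold, else negative."""
    best = max((iou(anchor, g) for g in gt_boxes), default=0.0)
    return "positive" if best >= threshold else "negative"
```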
#anchor generator
#generates candidate bounding boxes of different sizes and shapes according to the parameters below
anchor_generator {
multiscale_anchor_generator {
min_level: 3
max_level: 7
anchor_scale: 4.0
aspect_ratios: [1.0, 2.0, 0.5] # face width:height ratios of 1:1, 2:1, and 1:2
scales_per_octave: 2
}
}
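The multiscale_anchor_generator places anchors on FPN levels 3 through 7 (strides 8 through 128). Each location gets len(aspect_ratios) * scales_per_octave = 6 anchors, and the base anchor edge at a level is anchor_scale * stride. A sketch of the resulting sizes, assuming the usual RetinaNet-style formulation (`anchor_sizes` is an illustrative helper):

```python
import math

ANCHOR_SCALE = 4.0
ASPECT_RATIOS = [1.0, 2.0, 0.5]
SCALES_PER_OCTAVE = 2

def anchor_sizes(level):
    """(height, width) in input pixels of every anchor at one pyramid level."""
    stride = 2 ** level
    base = ANCHOR_SCALE * stride
    sizes = []
    for i in range(SCALES_PER_OCTAVE):
        scale = 2 ** (i / SCALES_PER_OCTAVE)  # 1.0 and sqrt(2) per octave
        for a in ASPECT_RATIOS:
            # Width/height keep area base*scale squared while ratio w:h equals a.
            sizes.append((base * scale / math.sqrt(a), base * scale * math.sqrt(a)))
    return sizes

# Level 3 (stride 8): the smallest anchors start at 32x32 pixels.
print(anchor_sizes(3))
```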
#Input image size: smaller images train faster, but larger inputs are preferable because they preserve more of the original image's information
#640x640, 512x512, and 384x384 are all divisible by 128
image_resizer {
fixed_shape_resizer {
height: 640
width: 640
}
}
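With FPN levels 3 through 7, the strides are 8 through 128, so an input size that divides cleanly by 128 yields integer feature-map sizes at every level:

```python
def feature_map_sizes(image_size, min_level=3, max_level=7):
    """Spatial size of each FPN level for a square input of image_size pixels."""
    return {l: image_size // 2 ** l for l in range(min_level, max_level + 1)}

print(feature_map_sizes(640))  # {3: 80, 4: 40, 5: 20, 6: 10, 7: 5}
```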
#box prediction head
box_predictor {
#convolution parameters shared across all feature levels
weight_shared_convolutional_box_predictor {
depth: 256 # 256 channels in the head's conv layers
class_prediction_bias_init: -4.6 # initial class-logit bias (focal-loss prior)
conv_hyperparams {
activation: RELU_6,
regularizer {
l2_regularizer {
weight: 0.0004
}
}
initializer {
random_normal_initializer {
stddev: 0.01
mean: 0.0
}
}
batch_norm {
scale: true,
decay: 0.997,
epsilon: 0.001,
}
}
num_layers_before_predictor: 4 # 4 conv layers before the final prediction layer
kernel_size: 3 # 3x3 convolution kernels
}
}
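class_prediction_bias_init: -4.6 is the RetinaNet focal-loss prior trick: the class-logit bias is initialized so every anchor starts with a foreground probability of roughly 0.01, which keeps the huge number of easy background anchors from dominating the loss early in training. The value comes from -log((1 - pi)/pi) with pi = 0.01:

```python
import math

def prior_bias(pi):
    """Logit bias that makes the sigmoid output roughly pi at initialization."""
    return -math.log((1 - pi) / pi)

def sigmoid(x):
    return 1 / (1 + math.exp(-x))

print(round(prior_bias(0.01), 2))  # -4.6, matching class_prediction_bias_init
```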
#feature extractor: the backbone network
feature_extractor {
type: 'ssd_resnet50_v1_fpn'
#the FPN spans backbone levels 3 through 7
fpn {
min_level: 3
max_level: 7
}
min_depth: 16 # minimum number of channels
depth_multiplier: 1.0 # width multiplier for the network
#convolution hyperparameters
conv_hyperparams {
activation: RELU_6, # ReLU6 activation
regularizer {
#L2 regularization
l2_regularizer {
weight: 0.0004
}
}
initializer {
truncated_normal_initializer {
stddev: 0.03
mean: 0.0
}
}
batch_norm {
scale: true,
decay: 0.997,
epsilon: 0.001,
}
}
override_base_feature_extractor_hyperparams: true
}
#losses
loss {
#classification loss
classification_loss {
weighted_sigmoid_focal {
alpha: 0.25
gamma: 2.0
}
}
#localization (box regression) loss
localization_loss {
weighted_smooth_l1 {
}
}
classification_weight: 1.0
localization_weight: 1.0
}
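weighted_sigmoid_focal is the focal loss: sigmoid cross-entropy down-weighted by the factor (1 - p_t)^gamma, so well-classified (mostly background) anchors contribute little, while alpha balances positive against negative samples. A per-anchor sketch (illustrative helper, not the library implementation):

```python
import math

def sigmoid_focal_loss(logit, label, alpha=0.25, gamma=2.0):
    """Focal loss for one anchor; label is 1 (face) or 0 (background)."""
    p = 1 / (1 + math.exp(-logit))
    pt = p if label == 1 else 1 - p           # probability of the true class
    alpha_t = alpha if label == 1 else 1 - alpha
    return -alpha_t * (1 - pt) ** gamma * math.log(pt)

# A confidently correct anchor is down-weighted almost to zero;
# a confidently wrong one keeps a large loss.
print(sigmoid_focal_loss(5.0, 1), sigmoid_focal_loss(-5.0, 1))
```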
#loss normalization settings
normalize_loss_by_num_matches: true
normalize_loc_loss_by_codesize: true
#optimizer: momentum SGD with a cosine-decay learning-rate schedule
optimizer {
momentum_optimizer: {
learning_rate: {
cosine_decay_learning_rate {
learning_rate_base: .04
total_steps: 25000
warmup_learning_rate: .013333
warmup_steps: 2000
}
}
momentum_optimizer_value: 0.9
}
use_moving_average: false
}
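The cosine_decay_learning_rate schedule warms up linearly from warmup_learning_rate to learning_rate_base over the first warmup_steps, then decays along a cosine curve toward zero at total_steps. A sketch of that schedule, assuming the common warmup-plus-cosine formulation without hold steps (`learning_rate` is an illustrative helper):

```python
import math

# Values from the optimizer block above.
BASE_LR, TOTAL_STEPS = 0.04, 25000
WARMUP_LR, WARMUP_STEPS = 0.013333, 2000

def learning_rate(step):
    """Linear warmup to BASE_LR, then cosine decay to ~0 at TOTAL_STEPS."""
    if step < WARMUP_STEPS:
        return WARMUP_LR + (BASE_LR - WARMUP_LR) * step / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / (TOTAL_STEPS - WARMUP_STEPS)
    return 0.5 * BASE_LR * (1 + math.cos(math.pi * progress))

# Peak learning rate is reached right at the end of warmup.
print(learning_rate(0), learning_rate(2000), learning_rate(25000))
```

Note that total_steps in the config is 25000 while the training command above passes --num_train_steps=100000; steps past total_steps stay at the final (near-zero) rate, so you may want to align the two values.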