======================================================
01. Creating the training data
Use tf_convert_data.py to convert the raw data into training data (TFRecord format):
python tf_convert_data.py --dataset_name=pascalvoc --dataset_dir=D:\datas\cv\VOCtrainval_06-Nov-2007\VOCdevkit\VOC2007 --output_name=voc_2007_train --output_dir=./aiqm131/pascalvoc
python tf_convert_data.py --dataset_name=pascalvoc --dataset_dir=D:\datas\cv\VOCtest_06-Nov-2007\VOCdevkit\VOC2007 --output_name=voc_2007_test --output_dir=./aiqm131/pascalvoc
NOTE: Under the hood this calls pascalvoc_to_tfrecords.run(FLAGS.dataset_dir, FLAGS.output_dir, FLAGS.output_name).
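To make the conversion step more concrete, the sketch below parses one Pascal VOC annotation and normalizes its boxes to [0, 1], which is roughly the information stored per image in a TFRecord example. It uses only the standard library and is an illustration, not the actual code of pascalvoc_to_tfrecords.py; the function name parse_voc_annotation and the sample XML are assumptions.

```python
import xml.etree.ElementTree as ET

# Hypothetical, minimal annotation in the Pascal VOC XML layout.
SAMPLE_XML = """
<annotation>
  <size><width>500</width><height>375</height><depth>3</depth></size>
  <object>
    <name>dog</name>
    <difficult>0</difficult>
    <bndbox><xmin>50</xmin><ymin>75</ymin><xmax>250</xmax><ymax>300</ymax></bndbox>
  </object>
</annotation>
"""


def parse_voc_annotation(xml_text):
    """Return (image_shape, objects) with boxes normalized to [0, 1]."""
    root = ET.fromstring(xml_text)
    size = root.find("size")
    height = int(size.find("height").text)
    width = int(size.find("width").text)
    depth = int(size.find("depth").text)
    objects = []
    for obj in root.findall("object"):
        box = obj.find("bndbox")
        objects.append({
            "label": obj.find("name").text,
            "difficult": int(obj.find("difficult").text),
            # Order is (ymin, xmin, ymax, xmax), each divided by the image size.
            "bbox": (
                float(box.find("ymin").text) / height,
                float(box.find("xmin").text) / width,
                float(box.find("ymax").text) / height,
                float(box.find("xmax").text) / width,
            ),
        })
    return (height, width, depth), objects


shape, objects = parse_voc_annotation(SAMPLE_XML)
print(shape, objects)
```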
======================================================
02. train_ssd_network.py: main training program
Run arguments:
Simplest run command:
python train_ssd_network.py --dataset_dir=./ai13/pascalvoc
Final run command (training from random initialization):
python train_ssd_network.py --dataset_dir=./ai13/pascalvoc --num_classes=21 --model_name=ssd_300_vgg --batch_size=2 --save_summaries_secs=30 --save_interval_secs=600 --negative_ratio=3 --train_dir=/tmp/tfmodel/ai13_voc
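The --negative_ratio flag controls how many negative (background) anchors contribute to the classification loss for each positive anchor. The following is a minimal plain-Python sketch of that hard negative mining idea, assuming the common SSD formulation (keep the highest-loss negatives, capped at negative_ratio times the number of positives). select_hard_negatives is a hypothetical helper, not a function from this repository, and the repository's exact rule may differ.

```python
def select_hard_negatives(losses, is_positive, negative_ratio=3):
    """Return sorted indices of the negative anchors kept for the loss.

    losses:       per-anchor classification loss values
    is_positive:  per-anchor booleans (True if matched to a ground-truth box)
    """
    num_pos = sum(is_positive)
    neg_indices = [i for i, pos in enumerate(is_positive) if not pos]
    # Cap the number of negatives at negative_ratio * number of positives.
    num_neg = min(len(neg_indices), negative_ratio * num_pos)
    # Keep the negatives with the largest loss (the "hardest" ones).
    hardest = sorted(neg_indices, key=lambda i: losses[i], reverse=True)[:num_neg]
    return sorted(hardest)


losses = [0.9, 0.1, 0.8, 0.05, 0.7, 0.6, 0.2]
is_positive = [True, False, False, False, False, False, False]
print(select_hard_negatives(losses, is_positive, negative_ratio=3))
print(select_hard_negatives(losses, is_positive, negative_ratio=1))
```

With a smaller ratio, fewer background anchors enter the loss, which is the difference between the --negative_ratio=3 and --negative_ratio=1 commands in these notes.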
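Running with transfer learning:
NOTE: The pretrained model was trained on the VOC2007 dataset. The trained VGG300 (ssd_300_vgg) model parameters are stored in the checkpoints folder, and transfer learning starts from them.
Steps:
-1. Modify the network structure where necessary.
-2. Unzip ssd_300_vgg.ckpt.zip into a folder and add a checkpoint file (the checkpoint index) to that folder.
-3. Run: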
python train_ssd_network.py --dataset_dir=./ai13/pascalvoc --num_classes=21 --model_name=ssd_300_vgg --batch_size=2 --save_summaries_secs=30 --save_interval_secs=600 --negative_ratio=1 --train_dir=/tmp/tfmodel/ai13_voc --checkpoint_path=D:\tmp\tfmodel\voc2 --learning_rate=0.000001 --end_learning_rate=0.000001
-4. Run again, restoring only part of the model parameters, and continue training:
python train_ssd_network.py --dataset_dir=./ai13/pascalvoc --num_classes=21 --model_name=ssd_300_vgg --batch_size=2 --save_summaries_secs=30 --save_interval_secs=600 --negative_ratio=1 --train_dir=D:\tmp\tfmodel\ai13_voc --checkpoint_path=D:\tmp\tfmodel\voc2 --learning_rate=0.000001 --checkpoint_exclude_scopes=ssd_300_vgg/block4_box,ssd_300_vgg/block7_box,ssd_300_vgg/block8_box,ssd_300_vgg/block9_box,ssd_300_vgg/block10_box,ssd_300_vgg/block11_box --ignore_missing_vars=True
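Step 2 mentions adding a checkpoint file next to the unzipped weights. In TensorFlow 1.x this is a small text file that names the latest checkpoint prefix in the folder. The sketch below writes such a file by hand. The helper name write_checkpoint_index is hypothetical, and the prefix ssd_300_vgg.ckpt is assumed from the zip file name above.

```python
import os
import tempfile


def write_checkpoint_index(ckpt_dir, prefix="ssd_300_vgg.ckpt"):
    """Write a TF 1.x style 'checkpoint' index file into ckpt_dir."""
    os.makedirs(ckpt_dir, exist_ok=True)
    path = os.path.join(ckpt_dir, "checkpoint")
    with open(path, "w", encoding="utf-8") as f:
        f.write('model_checkpoint_path: "%s"\n' % prefix)
        f.write('all_model_checkpoint_paths: "%s"\n' % prefix)
    return path


# Demo in a temporary folder; in practice pass the folder given to --checkpoint_path.
demo_dir = tempfile.mkdtemp()
index_path = write_checkpoint_index(demo_dir)
with open(index_path, encoding="utf-8") as f:
    print(f.read())
```

For the commands above, the target folder would be the one passed as --checkpoint_path (D:\tmp\tfmodel\voc2).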
============================================================================
03. Model evaluation
Evaluation command:
python eval_ssd_network.py --dataset_dir=./ai13/pascalvoc --checkpoint_path=D:\tmp\tfmodel\voc2
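The evaluation script reports AP_VOC07 and AP_VOC12 values. As a rough illustration of the VOC2007 metric, the sketch below computes 11-point interpolated average precision from a precision/recall curve. It is a plain-Python sketch of the standard definition, not the TensorFlow implementation used by eval_ssd_network.py; the function name and the sample values are assumptions.

```python
def average_precision_voc07(recalls, precisions):
    """11-point interpolated AP.

    For each threshold t in {0.0, 0.1, ..., 1.0}, take the maximum precision
    among points whose recall is >= t (0 if there is none), then average.
    """
    total = 0.0
    for k in range(11):
        t = k / 10.0
        candidates = [p for r, p in zip(recalls, precisions) if r >= t]
        total += max(candidates) if candidates else 0.0
    return total / 11.0


# Hypothetical curve: 5 detections ranked by score, 5 ground-truth boxes.
recalls = [0.2, 0.4, 0.4, 0.6, 0.8]
precisions = [1.0, 1.0, 0.67, 0.75, 0.8]
print(average_precision_voc07(recalls, precisions))
```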
============================================================================
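04. Suggested directions for model optimization:
-a. Switch the backbone to MobileNetV3 (replacing BN with group normalization, GroupNorm).
-b. Replace SSD's multi-scale prediction structure with a feature pyramid structure.
-c. Move from single-scale training to an image pyramid.
-d. Loss optimization: replace CrossEntropyLoss with Focal_Loss; overlap metric: IOU --> GIOU.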
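Item -d can be made concrete with two small formulas. The sketch below implements a binary focal loss for a single prediction and GIoU for two axis-aligned boxes in plain Python. The alpha and gamma defaults (0.25 and 2.0) are values commonly used with focal loss and are an assumption here, as are the function names; nothing in this block comes from the repository itself.

```python
import math


def focal_loss(p, y, alpha=0.25, gamma=2.0):
    """Binary focal loss for predicted probability p and label y in {0, 1}."""
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    # (1 - p_t) ** gamma down-weights examples that are already well classified.
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


def giou(box_a, box_b):
    """Generalized IoU of two boxes given as (xmin, ymin, xmax, ymax)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / union
    # Area of the smallest box enclosing both inputs.
    hull = (max(ax2, bx2) - min(ax1, bx1)) * (max(ay2, by2) - min(ay1, by1))
    return iou - (hull - union) / hull


print(focal_loss(0.9, 1), focal_loss(0.6, 1))
print(giou((0, 0, 1, 1), (2, 0, 3, 1)))
```

Unlike IoU, GIoU stays informative for boxes that do not overlap (it becomes negative), so 1 - GIoU can serve as a regression loss even when the predicted and target boxes are disjoint.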
01 - Training set conversion: run output
D:\Anaconda\python.exe D:/AI20/HJZ/06-CV/01-目标检测/20200314--TensorFlowSSD物体检测/TensorFlowSSD/tf_convert_data.py --dataset_name=pascalvoc --dataset_dir=D:\AI20\HJZ\06-CV\cv-datas\VOCtrainval_06-Nov-2007\VOCdevkit\VOC2007 --output_name=voc_2007_train --output_dir=./ai20-1/pascalvoc
Dataset directory: D:\AI20\HJZ\06-CV\cv-datas\VOCtrainval_06-Nov-2007\VOCdevkit\VOC2007
Output directory: ./ai20-1/pascalvoc
Successfully created output folder: ./ai20-1/pascalvoc
>> Converting image 5011/5011
Finished converting the Pascal VOC dataset!
Process finished with exit code 0
0102 - Test set conversion: run output
D:\Anaconda\python.exe D:/AI20/HJZ/06-CV/01-目标检测/20200314--TensorFlowSSD物体检测/TensorFlowSSD/tf_convert_data.py --dataset_name=pascalvoc --dataset_dir=D:\AI20\HJZ\06-CV\cv-datas\VOCtest_06-Nov-2007\VOCdevkit\VOC2007 --output_name=voc_2007_test --output_dir=./ai20-1/pascalvoc
Dataset directory: D:\AI20\HJZ\06-CV\cv-datas\VOCtest_06-Nov-2007\VOCdevkit\VOC2007
Output directory: ./ai20-1/pascalvoc
>> Converting image 4952/4952
Finished converting the Pascal VOC dataset!
Process finished with exit code 0
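The converter splits its output into several .tfrecord shards. The 25 test shards listed in the evaluation log further down (voc_2007_test_000 to voc_2007_test_024) are consistent with 200 images per shard for the 4952 test images. That shard size is inferred from those numbers rather than read from the script, so treat it as an assumption. A quick check of the arithmetic:

```python
import math

SAMPLES_PER_FILE = 200  # assumed shard size, inferred from the logs


def shard_names(output_name, num_images, samples_per_file=SAMPLES_PER_FILE):
    """List shard file names in the <name>_<3-digit index>.tfrecord pattern."""
    num_shards = math.ceil(num_images / samples_per_file)
    return ["%s_%03d.tfrecord" % (output_name, i) for i in range(num_shards)]


test_shards = shard_names("voc_2007_test", 4952)
train_shards = shard_names("voc_2007_train", 5011)
print(len(test_shards), test_shards[-1])
print(len(train_shards), train_shards[-1])
```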
02. train_ssd_network.py: main training program
Run arguments:
Simplest run command:
python train_ssd_network.py --dataset_dir=./ai13/pascalvoc
Final run command (training from random initialization; very slow):
python train_ssd_network.py --dataset_dir=./ai20-1/pascalvoc --num_classes=21 --model_name=ssd_300_vgg --batch_size=2 --save_summaries_secs=30 --save_interval_secs=600 --negative_ratio=3 --train_dir=/tmp/tfmodel/ai20_voc
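Running with transfer learning:
NOTE: The pretrained model was trained on the VOC2007 dataset. The trained VGG300 (ssd_300_vgg) model parameters are stored in the checkpoints folder, and transfer learning starts from them.
Steps:
-1. Modify the network structure where necessary.
-2. Unzip ssd_300_vgg.ckpt.zip into a folder and add a checkpoint file (the checkpoint index) to that folder.
-3. Run: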
python train_ssd_network.py --dataset_dir=./ai13/pascalvoc --num_classes=21 --model_name=ssd_300_vgg --batch_size=2 --save_summaries_secs=30 --save_interval_secs=600 --negative_ratio=1 --train_dir=/tmp/tfmodel/ai20_voc --checkpoint_path=D:\tmp\tfmodel\voc2 --learning_rate=0.000001 --end_learning_rate=0.000001 --ignore_missing_vars=True
NOTE: --ignore_missing_vars=True is set because some parts of the graph are not used.
-4. Run again, restoring only part of the model parameters, and continue training:
python train_ssd_network.py --dataset_dir=./ai13/pascalvoc --num_classes=21 --model_name=ssd_300_vgg --batch_size=2 --save_summaries_secs=30 --save_interval_secs=600 --negative_ratio=1 --train_dir=D:\tmp\tfmodel\ai13_voc --checkpoint_path=D:\tmp\tfmodel\voc2 --learning_rate=0.000001 --checkpoint_exclude_scopes=ssd_300_vgg/block4_box,ssd_300_vgg/block7_box,ssd_300_vgg/block8_box,ssd_300_vgg/block9_box,ssd_300_vgg/block10_box,ssd_300_vgg/block11_box --ignore_missing_vars=True
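The --checkpoint_exclude_scopes flag in step 4 lists the prediction heads (the *_box scopes) that should not be restored from the checkpoint, so they start from fresh initial values. A minimal sketch of that filtering logic, assuming a simple name-prefix match; variables_to_restore and the variable names below are illustrative and not taken from the training script:

```python
def variables_to_restore(variable_names, exclude_scopes):
    """Return the variable names that do not fall under any excluded scope."""
    scopes = [s.strip() for s in exclude_scopes.split(",") if s.strip()]
    # Match on "scope/" so that a scope does not also exclude longer sibling names.
    return [
        name for name in variable_names
        if not any(name.startswith(scope + "/") for scope in scopes)
    ]


# Hypothetical variable names, for illustration only.
names = [
    "ssd_300_vgg/conv1/conv1_1/weights",
    "ssd_300_vgg/block4_box/conv_cls/weights",
    "ssd_300_vgg/block11_box/conv_loc/biases",
]
exclude = "ssd_300_vgg/block4_box,ssd_300_vgg/block11_box"
print(variables_to_restore(names, exclude))
```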
INFO:tensorflow:Restoring parameters from /tmp/tfmodel/voc2/ssd_300_vgg.ckpt
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Evaluation [1/5]
INFO:tensorflow:Evaluation [2/5]
INFO:tensorflow:Evaluation [3/5]
INFO:tensorflow:Evaluation [4/5]
INFO:tensorflow:Evaluation [5/5]
2020-04-07 21:27:39.767998: W .\tensorflow/core/grappler/optimizers/graph_optimizer_stage.h:233] Failed to run optimizer ArithmeticOptimizer, stage HoistCommonFactor. Error: Node average_precision_voc07/ArithmeticOptimizer/HoistCommonFactor_Add_AddN is missing output properties at position :0 (num_outputs=0)
AP_VOC07/mAP[0.52727272727272723]
AP_VOC12/mAP[0.52750000000000008]
Time spent : 44.685 seconds.
Time spent per BATCH: 8.937 seconds.
INFO:tensorflow:Finished evaluation at 2020-04-07-13:27:41
Process finished with exit code 0
03. Model evaluation
Evaluation command:
python eval_ssd_network.py --dataset_dir=./ai13/pascalvoc --checkpoint_path=D:\tmp\tfmodel\voc2
D:\Anaconda\python.exe D:/AI20/HJZ/06-CV/01-目标检测/20200314--TensorFlowSSD物体检测/TensorFlowSSD/eval_ssd_network.py --dataset_dir=./ai13/pascalvoc ----checkpoint_path=D:\tmp\tfmodel\voc2
WARNING:tensorflow:From D:/AI20/HJZ/06-CV/01-目标检测/20200314--TensorFlowSSD物体检测/TensorFlowSSD/eval_ssd_network.py:114: get_or_create_global_step (from tensorflow.contrib.framework.python.ops.variables) is deprecated and will be removed in a future version.
Instructions for updating:
Please switch to tf.train.get_or_create_global_step
# =========================================================================== #
# Training | Evaluation flags:
# =========================================================================== #
{'batch_size': <absl.flags._flag.Flag object at 0x000002BF03B7C390>,
'checkpoint_path': <absl.flags._flag.Flag object at 0x000002BF03B7C550>,
# =========================================================================== #
# SSD net parameters:
# =========================================================================== #
{'anchor_offset': 0.5,
'anchor_ratios': [[2, 0.5],
[2, 0.5, 3, 0.3333333333333333],
[2, 0.5, 3, 0.3333333333333333],
[2, 0.5, 3, 0.3333333333333333],
[2, 0.5],
[2, 0.5]],
'anchor_size_bounds': [0.15, 0.9],
'anchor_sizes': [(21.0, 45.0),
(45.0, 99.0),
(99.0, 153.0),
(153.0, 207.0),
(207.0, 261.0),
(261.0, 315.0)],
'anchor_steps': [8, 16, 32, 64, 100, 300],
'feat_layers': ['block4', 'block7', 'block8', 'block9', 'block10', 'block11'],
'feat_shapes': [(38, 38), (19, 19), (10, 10), (5, 5), (3, 3), (1, 1)],
'img_shape': (300, 300),
'no_annotation_label': 21,
'normalizations': [20, -1, -1, -1, -1, -1],
'num_classes': 21,
'prior_scaling': [0.1, 0.1, 0.2, 0.2]}
# =========================================================================== #
# Training | Evaluation dataset files:
# =========================================================================== #
['.\\ai13\\pascalvoc\\voc_2007_test_000.tfrecord',
'.\\ai13\\pascalvoc\\voc_2007_test_001.tfrecord',
'.\\ai13\\pascalvoc\\voc_2007_test_002.tfrecord',
'.\\ai13\\pascalvoc\\voc_2007_test_003.tfrecord',
'.\\ai13\\pascalvoc\\voc_2007_test_004.tfrecord',
'.\\ai13\\pascalvoc\\voc_2007_test_005.tfrecord',
'.\\ai13\\pascalvoc\\voc_2007_test_006.tfrecord',
'.\\ai13\\pascalvoc\\voc_2007_test_007.tfrecord',
'.\\ai13\\pascalvoc\\voc_2007_test_008.tfrecord',
'.\\ai13\\pascalvoc\\voc_2007_test_009.tfrecord',
'.\\ai13\\pascalvoc\\voc_2007_test_010.tfrecord',
'.\\ai13\\pascalvoc\\voc_2007_test_011.tfrecord',
'.\\ai13\\pascalvoc\\voc_2007_test_012.tfrecord',
'.\\ai13\\pascalvoc\\voc_2007_test_013.tfrecord',
'.\\ai13\\pascalvoc\\voc_2007_test_014.tfrecord',
'.\\ai13\\pascalvoc\\voc_2007_test_015.tfrecord',
'.\\ai13\\pascalvoc\\voc_2007_test_016.tfrecord',
'.\\ai13\\pascalvoc\\voc_2007_test_017.tfrecord',
'.\\ai13\\pascalvoc\\voc_2007_test_018.tfrecord',
'.\\ai13\\pascalvoc\\voc_2007_test_019.tfrecord',
'.\\ai13\\pascalvoc\\voc_2007_test_020.tfrecord',
'.\\ai13\\pascalvoc\\voc_2007_test_021.tfrecord',
'.\\ai13\\pascalvoc\\voc_2007_test_022.tfrecord',
'.\\ai13\\pascalvoc\\voc_2007_test_023.tfrecord',
'.\\ai13\\pascalvoc\\voc_2007_test_024.tfrecord']
WARNING:tensorflow:From D:/AI20/HJZ/06-CV/01-目标检测/20200314--TensorFlowSSD物体检测/TensorFlowSSD/eval_ssd_network.py:239: streaming_mean (from tensorflow.contrib.metrics.python.ops.metric_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Please switch to tf.metrics.mean
INFO:tensorflow:Evaluating None
INFO:tensorflow:Starting evaluation at 2020-04-07-13:03:19
INFO:tensorflow:Graph was finalized.
2020-04-07 21:03:19.529399: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX AVX2
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Evaluation [1/10]
INFO:tensorflow:Evaluation [2/10]
INFO:tensorflow:Evaluation [3/10]
INFO:tensorflow:Evaluation [4/10]
INFO:tensorflow:Evaluation [5/10]
INFO:tensorflow:Evaluation [6/10]
INFO:tensorflow:Evaluation [7/10]
INFO:tensorflow:Evaluation [8/10]
INFO:tensorflow:Evaluation [9/10]
INFO:tensorflow:Evaluation [10/10]
2020-04-07 21:04:02.166986: W .\tensorflow/core/grappler/optimizers/graph_optimizer_stage.h:233] Failed to run optimizer ArithmeticOptimizer, stage HoistCommonFactor. Error: Node average_precision_voc07/ArithmeticOptimizer/HoistCommonFactor_Add_AddN is missing output properties at position :0 (num_outputs=0)
AP_VOC07/mAP[0.012943937885002591]
AP_VOC12/mAP[0.012852645534828538]
Time spent : 44.715 seconds.
Time spent per BATCH: 4.471 seconds.
INFO:tensorflow:Finished evaluation at 2020-04-07-13:04:03
Process finished with exit code 0