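This post describes scene text detection with the SSD model, using the COCO-Text dataset as the example.
Build:
1. Compile error when building with CUDA 8: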
/usr/include/boost/property_tree/detail/json_parser_read.hpp:257:264: error: ‘type name’ declared as function returning an array
Fix: the gcc version is too old. Upgrading to gcc 5 (5.3 in my case) resolves it.
sudo apt-get install software-properties-common
sudo add-apt-repository ppa:ubuntu-toolchain-r/test
sudo apt-get update
sudo apt-get install gcc-5 g++-5
cd /usr/bin
sudo rm gcc
sudo ln -s gcc-5 gcc
sudo rm g++
sudo ln -s g++-5 g++
Recompile after switching compilers.
2. make: *** [.build_release/lib/libcaffe.so.1.0.0-rc3] Error
Fix:
sudo apt-get install libopenblas-dev
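Again, recompile after installing the package.
Dataset preparation:
Use the COCO-Text dataset.
1. Convert the COCO-Text dataset to the Pascal VOC dataset layout. The procedure is described in part 2 (dataset preparation) of the blog post "[Training and testing notes] Text-Detection-with-FRCN" and is not repeated here.
2. Rename formatted_dataset to VOC2007 and place it under $HOME/data/VOCdevkit.
3. Create the LMDB data: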
cd $CAFFE_ROOT
# Create the trainval.txt, test.txt, and test_name_size.txt in data/VOC0712/
./data/VOC0712/create_list.sh
# You can modify the parameters in create_data.sh if needed.
# It will create lmdb files for trainval and test with encoded original image:
# - $HOME/data/VOCdevkit/VOC0712/lmdb/VOC0712_trainval_lmdb
# - $HOME/data/VOCdevkit/VOC0712/lmdb/VOC0712_test_lmdb
# and make soft links at examples/VOC0712/
./data/VOC0712/create_data.sh
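As a sanity check on what create_list.sh writes, the sketch below builds the same kind of list lines in Python. It assumes each line of trainval.txt and test.txt pairs an image path with its annotation path, both relative to the VOCdevkit root; the helper name make_list_lines is hypothetical and not part of the SSD repository.

```python
def make_list_lines(image_ids, name="VOC2007"):
    """Return 'image_path annotation_path' lines relative to the devkit root.

    Assumption: the Pascal VOC layout with JPEGImages/ and Annotations/
    subfolders, as produced by the formatting step above.
    """
    lines = []
    for image_id in image_ids:
        image_path = "{}/JPEGImages/{}.jpg".format(name, image_id)
        annotation_path = "{}/Annotations/{}.xml".format(name, image_id)
        lines.append("{} {}".format(image_path, annotation_path))
    return lines


if __name__ == "__main__":
    for line in make_list_lines(["000001", "000002"]):
        print(line)
```

If the generated list files contain paths that do not exist under your dataset directory, the LMDB creation step is the first place where it will show up.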
Notes:
1. Change the dataset path in create_list.sh and create_data.sh.
create_list.sh:
#root_dir=$HOME/data/VOCdevkit/
root_dir="<your dataset directory>"
sub_dir=ImageSets/Main
bash_dir="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
for dataset in trainval test
do
dst_file=$bash_dir/$dataset.txt
if [ -f $dst_file ]
then
rm -f $dst_file
fi
#for name in VOC2007 VOC2012
# only the VOC2007 folder exists here
for name in VOC2007
do
if [[ $dataset == "test" && $name == "VOC2012" ]]
then
continue
create_data.sh:
cur_dir=$(cd $( dirname ${BASH_SOURCE[0]} ) && pwd )
root_dir=$cur_dir/../..
cd $root_dir
redo=1
#data_root_dir="$HOME/data/VOCdevkit"
data_root_dir="<your dataset directory>"
dataset_name="VOC0712"
mapfile="$root_dir/data/$dataset_name/labelmap_voc.prototxt"
anno_type="detection"
Because there are only two classes here (background and text), change the label map (labelmap_voc.prototxt) to:
item {
name: "none_of_the_above"
label: 0
display_name: "background"
}
item {
name: "text"
label: 1
display_name: "text"
}
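For datasets with a different set of classes, a label map in the same shape can be generated instead of edited by hand. This is a minimal sketch that assumes the item layout shown above, with label 0 reserved for the background entry; make_labelmap is a hypothetical helper, not part of Caffe.

```python
def make_labelmap(class_names):
    """Build label map text with background at label 0, then one item per class."""
    items = [("none_of_the_above", 0, "background")]
    for label, name in enumerate(class_names, start=1):
        items.append((name, label, name))

    blocks = []
    for name, label, display_name in items:
        blocks.append(
            'item {{\n  name: "{}"\n  label: {}\n  display_name: "{}"\n}}'.format(
                name, label, display_name
            )
        )
    return "\n".join(blocks) + "\n"


if __name__ == "__main__":
    # Two classes in total: background plus "text".
    print(make_labelmap(["text"]))
```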
Training:
# It will create model definition files and save snapshot models in:
# - $CAFFE_ROOT/models/VGGNet/VOC0712/SSD_300x300/
# and job file, log file, and the python script in:
# - $CAFFE_ROOT/jobs/VGGNet/VOC0712/SSD_300x300/
# and save temporary evaluation results in:
# - $HOME/data/VOCdevkit/results/VOC2007/SSD_300x300/
# It should reach 77.* mAP at 120k iterations.
python examples/ssd/ssd_pascal.py
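Problem 1: num_test_image does not match the dataset
Fix: replace 4952 with the number of images in your test set; the COCO-Text test set used here has 840.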
#Evaluate on whole test set.
#num_test_image = 4952
num_test_image = 840
test_batch_size = 8
# Ideally test_batch_size should be divisible by num_test_image,
# otherwise mAP will be slightly off the true value.
test_iter = int(math.ceil(float(num_test_image) / test_batch_size))
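The test image count can be read from the test list rather than typed in. The sketch below assumes the list is a text file with one entry per line (for example the test.txt produced by create_list.sh); count_ids and compute_test_iter are hypothetical helper names.

```python
import math


def count_ids(list_path):
    """Count the non-empty lines in a list file (one test image per line)."""
    with open(list_path) as f:
        return sum(1 for line in f if line.strip())


def compute_test_iter(num_test_image, test_batch_size):
    """Number of test iterations needed to cover the whole test set."""
    return int(math.ceil(float(num_test_image) / test_batch_size))


if __name__ == "__main__":
    # 840 test images with a batch size of 8 divide evenly.
    print(compute_test_iter(840, 8))
```

When the image count is not a multiple of the batch size, the last batch wraps around and some images are evaluated twice, which is why the mAP can be slightly off.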
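Problem 2: loss = nan
Because this is a scene text dataset, the loss in the first iterations is very large. In my run it turned into loss = nan at around iteration 40.
Cause: exploding gradients. The gradients grow so large that training cannot continue.
General measures: reduce base_lr in solver.prototxt by at least an order of magnitude. If there are several loss layers, find the one that causes the explosion and reduce its loss_weight in train_val.prototxt instead of lowering the global base_lr. Reference: "Reasons why the loss becomes nan when training with Caffe".
Fix:
Reduce base_lr to one tenth of its original value. Edit lines 229 and 232 of examples/ssd/ssd_pascal.py, changing 0.0004 to 0.00004 and 0.00004 to 0.000004.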
# If true, use batch norm for all newly added layers.
# Currently only the non batch norm version has been tested.
use_batchnorm = False
lr_mult = 1
# Use different initial learning rate.
if use_batchnorm:
#base_lr = 0.0004
base_lr = 0.00004
else:
# A learning rate for batch_size = 1, num_gpus = 1.
#base_lr = 0.00004
base_lr = 0.000004
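Why a tenfold smaller learning rate helps can be seen on a toy problem. The sketch below is not SSD training; it runs plain gradient descent on the one-dimensional function 0.5 * curvature * w^2, where the update diverges once lr exceeds 2 / curvature. All names and values are illustrative assumptions.

```python
def gradient_descent(lr, curvature=100.0, w0=1.0, steps=50):
    """Run gradient descent on f(w) = 0.5 * curvature * w**2 and return the final w."""
    w = w0
    for _ in range(steps):
        grad = curvature * w  # derivative of 0.5 * curvature * w**2
        w = w - lr * grad
    return w


if __name__ == "__main__":
    # lr above 2 / curvature: |w| doubles every step and blows up.
    print(abs(gradient_descent(0.03)))
    # lr ten times smaller: w shrinks towards the minimum at 0.
    print(abs(gradient_descent(0.003)))
```

In a real network the same runaway growth eventually overflows to inf and then nan, which matches the log turning to loss = nan after a few dozen iterations.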
问题3:OpenCV Error: Assertion failed
OpenCV Error: Assertion failed ((scn == 3 || scn == 4) && (depth == CV_8U || depth == CV_32F)) in ipp_cvtColor, file /home/user1/opencv-3.1.0/modules/imgproc/src/color.cpp, line 7646
terminate called after throwing an instance of 'cv::Exception'
what(): /home/user1/opencv-3.1.0/modules/imgproc/src/color.cpp:7646: error: (-215) (scn == 3 || scn == 4) && (depth == CV_8U || depth == CV_32F) in function ipp_cvtColor
*** Aborted at 1482480286 (unix time) try "date -d @1482480286" if you are using GNU date ***
PC: @ 0x7f7e541abcc9 (unknown)
*** SIGABRT (@0x3e900004df8) received by PID 19960 (TID 0x7f7e227fd700) from PID 19960; stack trace: ***
@ 0x7f7e541abd40 (unknown)
@ 0x7f7e541abcc9 (unknown)
@ 0x7f7e541af0d8 (unknown)
@ 0x7f7e54f61535 (unknown)
@ 0x7f7e54f5f6d6 (unknown)
@ 0x7f7e54f5f703 (unknown)
@ 0x7f7e54f5f922 (unknown)
@ 0x7f7e4d12fca0 cv::error()
@ 0x7f7e4d12fe20 cv::error()
@ 0x7f7e4b574c89 cv::ipp_cvtColor()
@ 0x7f7e4b57e4d4 cv::cvtColor()
@ 0x7f7e5600758a caffe::AdjustSaturation()
@ 0x7f7e5600c77a caffe::RandomSaturation()
@ 0x7f7e5600ce96 caffe::ApplyDistort()
@ 0x7f7e561dfeac caffe::DataTransformer<>::DistortImage()
@ 0x7f7e561c7beb caffe::AnnotatedDataLayer<>::load_batch()
@ 0x7f7e560abc29 caffe::BasePrefetchingDataLayer<>::InternalThreadEntry()
@ 0x7f7e55ff39f0 caffe::InternalThread::entry()
@ 0x7f7e4abf9a4a (unknown)
@ 0x7f7e4678a182 start_thread
@ 0x7f7e5426f47d (unknown)
@ 0x0 (unknown)
Aborted (core dumped)
Fix: add 'force_color': True to train_transform_param at line 175 of examples/ssd/ssd_pascal.py.
Reference: OpenCV Error: Assertion failed #353
train_transform_param = {
'mirror': True,
'mean_value': [104, 117, 123],
#added
'force_color': True,
'resize_param': {
'prob': 1,
'resize_mode': P.Resize.WARP,
'height': resize_height,
'width': resize_width,
'interp_mode': [
P.Resize.LINEAR,
P.Resize.AREA,
P.Resize.NEAREST,
P.Resize.CUBIC,
P.Resize.LANCZOS4,
],
},
问题4:Check failed: mean_values_.size() == 1
F1203 16:07:24.865304 12717 data_transformer.cpp:621] Check failed: mean_values_.size() == 1 || mean_values_.size() == img_channels Specify either 1 mean_value or as many as channels: 1
*** Check failure stack trace: ***
@ 0x7f6168187daa (unknown)
@ 0x7f6168187ce4 (unknown)
@ 0x7f61681876e6 (unknown)
@ 0x7f616818a687 (unknown)
@ 0x7f61689df73d caffe::DataTransformer<>::Transform()
@ 0x7f61689e0993 caffe::DataTransformer<>::Transform()
@ 0x7f61689ebcdb caffe::DataTransformer<>::Transform()
@ 0x7f61689ebd98 caffe::DataTransformer<>::Transform()
@ 0x7f61689ebe3e caffe::DataTransformer<>::Transform()
@ 0x7f616887005b caffe::AnnotatedDataLayer<>::load_batch()
@ 0x7f616884f6dc caffe::BasePrefetchingDataLayer<>::InternalThreadEntry()
@ 0x7f61689a1445 caffe::InternalThread::entry()
@ 0x7f615e23ba4a (unknown)
@ 0x7f615729c184 start_thread
@ 0x7f6166aabbed (unknown)
@ (nil) (unknown)
Fix: add 'force_color': True to test_transform_param at line 213 of examples/ssd/ssd_pascal.py.
test_transform_param = {
'mean_value': [104, 117, 123],
'force_color': True,
'resize_param': {
'prob': 1,
'resize_mode': P.Resize.WARP,
'height': resize_height,
'width': resize_width,
'interp_mode': [P.Resize.LINEAR],
},
}
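The condition in the failed check can be restated in a few lines, which makes it clearer why force_color helps: three mean values are configured, but a grayscale image is decoded with one channel ("channels: 1" in the log). The function below is a hypothetical restatement of that condition, not Caffe code.

```python
def mean_values_ok(mean_values, img_channels):
    """Mirror of the logged check: one mean value, or one per image channel."""
    return len(mean_values) == 1 or len(mean_values) == img_channels


if __name__ == "__main__":
    mean_value = [104, 117, 123]
    # Grayscale image decoded as-is: 3 mean values against 1 channel fails.
    print(mean_values_ok(mean_value, 1))
    # With force_color the image is decoded as 3 channels, so the check passes.
    print(mean_values_ok(mean_value, 3))
```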
Problem 5: CUDNN_STATUS_INTERNAL_ERROR
F0616 16:54:55.034394 3070141376 cudnn_conv_layer.cpp:53] Check failed: status == CUDNN_STATUS_SUCCESS (4 vs. 0) CUDNN_STATUS_INTERNAL_ERROR
Cause: not enough memory.
Fix: reduce batch_size. Here I reduced the training batch_size from 32 to 8 by editing lines 338 and 339 of examples/ssd/ssd_pascal.py.
#Divide the mini-batch to different GPUs.
#batch_size = 32
#accum_batch_size = 32
batch_size = 8
accum_batch_size = 8
iter_size = accum_batch_size / batch_size
solver_mode = P.Solver.CPU
device_id = 0
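Setting both values to 8 also shrinks the effective batch per weight update. An alternative, sketched below under the assumption that the solver accumulates gradients over iter_size forward/backward passes, is to lower only batch_size and leave accum_batch_size at 32. plan_batches is a hypothetical helper that follows the same arithmetic as the excerpt above.

```python
import math


def plan_batches(batch_size, accum_batch_size, num_gpus=1):
    """Return (batch_size_per_device, iter_size) for the given settings."""
    batch_size_per_device = int(math.ceil(float(batch_size) / num_gpus))
    iter_size = int(
        math.ceil(float(accum_batch_size) / (batch_size_per_device * num_gpus))
    )
    return batch_size_per_device, iter_size


if __name__ == "__main__":
    # The setting used in this post: 8 images per update.
    print(plan_batches(8, 8))
    # Same memory footprint per pass, but 32 images accumulated per update.
    print(plan_batches(8, 32))
```

The second option keeps memory use per pass the same as batch_size = 8 while each update still sees 32 images, at the cost of more passes per update.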
Problem 6: Check failed: label_to_name_.find(label) != label_to_name_.end() Cannot find label: 2 in the label map
F1027 detection_output_layer.cu:143] Check failed: label_to_name_.find(label) != label_to_name_.end() Cannot find label: 2 in the label map
This happens because the label map was reduced from 21 classes to 2 while num_classes still says 21.
Fix:
1. In examples/ssd/ssd_pascal.py, change num_classes on line 269 from 21 to 2.
# MultiBoxLoss parameters.
# num_classes = 21
num_classes = 2
share_location = True
background_label_id=0
train_on_diff_gt = True
normalization_mode = P.Loss.VALID
code_type = P.PriorBox.CENTER_SIZE
ignore_cross_boundary_bbox = False
mining_type = P.MultiBoxLoss.MAX_NEGATIVE
neg_pos_ratio = 3.
2. In examples/ssd/score_ssd_pascal.py, change num_classes on line 277 from 21 to 2.
# MultiBoxLoss parameters.
# num_classes = 21
num_classes = 2
share_location = True
background_label_id=0
train_on_diff_gt = True
normalization_mode = P.Loss.VALID
code_type = P.PriorBox.CENTER_SIZE
ignore_cross_boundary_bbox = False
mining_type = P.MultiBoxLoss.MAX_NEGATIVE
neg_pos_ratio = 3.
After the .py files are changed and run again, num_classes in the train.prototxt, test.prototxt, and deploy.prototxt files in the following directories changes accordingly:
caffe/jobs/VGGNet/VOC0712/SSD_300x300
caffe/jobs/VGGNet/VOC0712/SSD_300x300_score
caffe/models/VGGNet/VOC0712/SSD_300x300
caffe/models/VGGNet/VOC0712/SSD_300x300_score
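Before retraining, it can be worth checking that num_classes agrees with the label map, since a mismatch is what triggers the "Cannot find label" failure. The sketch below assumes the label map uses `label: N` fields as in the two-class example earlier; both helper names are hypothetical.

```python
import re


def labels_in_map(labelmap_text):
    """Return the sorted integer labels found in label map text."""
    return sorted(int(m) for m in re.findall(r"label:\s*(\d+)", labelmap_text))


def labelmap_matches(labelmap_text, num_classes):
    """True if the label map defines exactly the labels 0 .. num_classes - 1."""
    return labels_in_map(labelmap_text) == list(range(num_classes))


if __name__ == "__main__":
    labelmap = (
        'item {\n  name: "none_of_the_above"\n  label: 0\n  display_name: "background"\n}\n'
        'item {\n  name: "text"\n  label: 1\n  display_name: "text"\n}\n'
    )
    print(labelmap_matches(labelmap, 2))   # consistent with num_classes = 2
    print(labelmap_matches(labelmap, 21))  # the mismatch behind Problem 6
```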