[Training/Testing Log] SSD: Single Shot Detector for Scene Text Detection

This post records how to train and test the SSD model for scene text detection, using COCO-Text as the example dataset.


Compilation:

1. Error when compiling with CUDA 8

/usr/include/boost/property_tree/detail/json_parser_read.hpp:257:264: error: ‘type name’ declared as function returning an array

Solution: the gcc version is too old; upgrading it to 5.3 fixes the error.

sudo apt-get install software-properties-common
sudo add-apt-repository ppa:ubuntu-toolchain-r/test
sudo apt-get update
sudo apt-get install gcc-5 g++-5

cd /usr/bin
sudo rm gcc
sudo ln -s gcc-5 gcc
sudo rm g++
sudo ln -s g++-5 g++

Recompile and the error goes away.

2. make: *** [.build_release/lib/libcaffe.so.1.0.0-rc3] Error

Solution:

sudo apt-get install libopenblas-dev
Likewise, recompile after installing the package and the error goes away.

Dataset preparation:

We use the COCO-Text dataset.

1. Convert the COCO-Text dataset into the Pascal VOC format. The conversion procedure is described in part two ("preparing the dataset") of the post "[Training/Testing Log] Text-Detection-with-FRCN" and is not repeated here; a rough conversion sketch is also given after step 3 below.

2. Rename formatted_dataset to VOC2007 and place it under $HOME/data/VOCdevkit.

3. Create the LMDB data:

cd $CAFFE_ROOT
# Create the trainval.txt, test.txt, and test_name_size.txt in data/VOC0712/
./data/VOC0712/create_list.sh
# You can modify the parameters in create_data.sh if needed.
# It will create lmdb files for trainval and test with encoded original image:
#   - $HOME/data/VOCdevkit/VOC0712/lmdb/VOC0712_trainval_lmdb
#   - $HOME/data/VOCdevkit/VOC0712/lmdb/VOC0712_test_lmdb
# and make soft links at examples/VOC0712/
./data/VOC0712/create_data.sh
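
For step 1 above, the core of the conversion is writing one Pascal-VOC-style XML annotation per image with a single "text" class. The sketch below is not the script from the referenced post; the COCO-Text JSON layout it assumes (top-level 'imgs', 'anns' and 'imgToAnns' dicts, boxes stored as [x, y, w, h]) and the file name COCO_Text.json are assumptions you may need to adapt.

import json
import os
import xml.etree.ElementTree as ET

def voc_xml(file_name, width, height, boxes):
    # boxes: list of (xmin, ymin, xmax, ymax) tuples for the single "text" class
    root = ET.Element('annotation')
    ET.SubElement(root, 'folder').text = 'VOC2007'
    ET.SubElement(root, 'filename').text = file_name
    size = ET.SubElement(root, 'size')
    ET.SubElement(size, 'width').text = str(width)
    ET.SubElement(size, 'height').text = str(height)
    ET.SubElement(size, 'depth').text = '3'
    for xmin, ymin, xmax, ymax in boxes:
        obj = ET.SubElement(root, 'object')
        ET.SubElement(obj, 'name').text = 'text'
        ET.SubElement(obj, 'difficult').text = '0'
        bb = ET.SubElement(obj, 'bndbox')
        for tag, val in zip(('xmin', 'ymin', 'xmax', 'ymax'), (xmin, ymin, xmax, ymax)):
            ET.SubElement(bb, tag).text = str(int(round(val)))
    return ET.ElementTree(root)

with open('COCO_Text.json') as f:                  # assumed annotation file name
    ct = json.load(f)

out_dir = os.path.expanduser('~/data/VOCdevkit/VOC2007/Annotations')
if not os.path.isdir(out_dir):
    os.makedirs(out_dir)
for img_id, img in ct['imgs'].items():
    boxes = []
    for ann_id in ct['imgToAnns'].get(str(img_id), []):
        x, y, w, h = ct['anns'][str(ann_id)]['bbox']   # COCO boxes are [x, y, w, h]
        boxes.append((x, y, x + w, y + h))
    if boxes:
        tree = voc_xml(img['file_name'], img['width'], img['height'], boxes)
        tree.write(os.path.join(out_dir, os.path.splitext(img['file_name'])[0] + '.xml'))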

Notes:

1. Change the dataset paths in create_list.sh and create_data.sh.

create_list.sh:

#root_dir=$HOME/data/VOCdevkit/
root_dir="改为自己的数据集目录'
sub_dir=ImageSets/Main
bash_dir="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
for dataset in trainval test
do
  dst_file=$bash_dir/$dataset.txt
  if [ -f $dst_file ]
  then
    rm -f $dst_file
  fi
  #for name in VOC2007 VOC2012
  # only the VOC2007 folder exists here
  for name in VOC2007
  do
    if [[ $dataset == "test" && $name == "VOC2012" ]]
    then
      continue

create_data.sh:

cur_dir=$(cd $( dirname ${BASH_SOURCE[0]} ) && pwd )
root_dir=$cur_dir/../..

cd $root_dir

redo=1
#data_root_dir="$HOME/data/VOCdevkit"
data_root_dir="改为自己的数据集目录"
dataset_name="VOC0712"
mapfile="$root_dir/data/$dataset_name/labelmap_voc.prototxt"
anno_type="detection"


2. Change the dataset classes in labelmap_voc.prototxt.

Since there are only two classes here, background and text, change it to:

item {
  name: "none_of_the_above"
  label: 0
  display_name: "background"
}
item {
  name: "text"
  label: 1
  display_name: "text"
}


Training:

# It will create model definition files and save snapshot models in:
#   - $CAFFE_ROOT/models/VGGNet/VOC0712/SSD_300x300/
# and job file, log file, and the python script in:
#   - $CAFFE_ROOT/jobs/VGGNet/VOC0712/SSD_300x300/
# and save temporary evaluation results in:
#   - $HOME/data/VOCdevkit/results/VOC2007/SSD_300x300/
# It should reach 77.* mAP at 120k iterations.
python examples/ssd/ssd_pascal.py


Problem 1: num_test_image is wrong

Solution: replace 4952 with the number of images in your test set; the COCO-Text test split used here has 840 images.

#Evaluate on whole test set.
#num_test_image = 4952
num_test_image = 840
test_batch_size = 8
# Ideally test_batch_size should be divisible by num_test_image,
# otherwise mAP will be slightly off the true value.
test_iter = int(math.ceil(float(num_test_image) / test_batch_size))
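
If you are unsure of the count, it can be read off the test list generated by create_list.sh in the previous step (the path below assumes the default data/VOC0712 location from that script):

# Count the entries in the generated test list to get num_test_image.
with open('data/VOC0712/test.txt') as f:
    print(sum(1 for line in f if line.strip()))   # 840 for this COCO-Text split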


Problem 2: loss = nan

Because this is a scene-text dataset, the loss in the first iterations is very large; in my own run the loss turned into nan around iteration 40.

Cause: exploding gradients. The gradients become extremely large and learning cannot proceed.

General remedy: reduce base_lr in solver.prototxt by at least an order of magnitude. If there are multiple loss layers, find out which loss layer causes the explosion and reduce that layer's loss_weight in train_val.prototxt instead of lowering the global base_lr. Reference: "Reasons why the loss becomes nan when training with Caffe".

Solution:

Reduce base_lr to one tenth of its original value. Edit lines 229 and 232 of /examples/ssd/ssd_pascal.py, changing 0.0004 to 0.00004 and 0.00004 to 0.000004:

# If true, use batch norm for all newly added layers.
# Currently only the non batch norm version has been tested.
use_batchnorm = False
lr_mult = 1
# Use different initial learning rate.
if use_batchnorm:
    # base_lr = 0.0004
    base_lr = 0.00004
else:
    # A learning rate for batch_size = 1, num_gpus = 1.
    #base_lr = 0.00004
    base_lr = 0.000004


PS: lowering the learning rate may make the loss converge very slowly. I also considered switching the pretrained model, i.e. replacing the official pretrained fully convolutional reduced (atrous) VGGNet with an already-trained SSD300* model, but since num_classes was changed from 21 to 2 the dimensions no longer match, so only the official pretrained model can be used. Apart from lowering base_lr I have not found a better workaround yet.

Problem 3: OpenCV Error: Assertion failed

OpenCV Error: Assertion failed ((scn == 3 || scn == 4) && (depth == CV_8U || depth == CV_32F)) in ipp_cvtColor, file /home/user1/opencv-3.1.0/modules/imgproc/src/color.cpp, line 7646
terminate called after throwing an instance of 'cv::Exception'
what(): /home/user1/opencv-3.1.0/modules/imgproc/src/color.cpp:7646: error: (-215) (scn == 3 || scn == 4) && (depth == CV_8U || depth == CV_32F) in function ipp_cvtColor

*** Aborted at 1482480286 (unix time) try "date -d @1482480286" if you are using GNU date ***
PC: @ 0x7f7e541abcc9 (unknown)
*** SIGABRT (@0x3e900004df8) received by PID 19960 (TID 0x7f7e227fd700) from PID 19960; stack trace: ***
@ 0x7f7e541abd40 (unknown)
@ 0x7f7e541abcc9 (unknown)
@ 0x7f7e541af0d8 (unknown)
@ 0x7f7e54f61535 (unknown)
@ 0x7f7e54f5f6d6 (unknown)
@ 0x7f7e54f5f703 (unknown)
@ 0x7f7e54f5f922 (unknown)
@ 0x7f7e4d12fca0 cv::error()
@ 0x7f7e4d12fe20 cv::error()
@ 0x7f7e4b574c89 cv::ipp_cvtColor()
@ 0x7f7e4b57e4d4 cv::cvtColor()
@ 0x7f7e5600758a caffe::AdjustSaturation()
@ 0x7f7e5600c77a caffe::RandomSaturation()
@ 0x7f7e5600ce96 caffe::ApplyDistort()
@ 0x7f7e561dfeac caffe::DataTransformer<>::DistortImage()
@ 0x7f7e561c7beb caffe::AnnotatedDataLayer<>::load_batch()
@ 0x7f7e560abc29 caffe::BasePrefetchingDataLayer<>::InternalThreadEntry()
@ 0x7f7e55ff39f0 caffe::InternalThread::entry()
@ 0x7f7e4abf9a4a (unknown)
@ 0x7f7e4678a182 start_thread
@ 0x7f7e5426f47d (unknown)
@ 0x0 (unknown)
Aborted (core dumped)

Solution: add 'force_color': True to train_transform_param at line 175 of /examples/ssd/ssd_pascal.py.

Reference: OpenCV Error: Assertion failed #353

train_transform_param = {
        'mirror': True,
        'mean_value': [104, 117, 123],
        # added
        'force_color': True,
        'resize_param': {
                'prob': 1,
                'resize_mode': P.Resize.WARP,
                'height': resize_height,
                'width': resize_width,
                'interp_mode': [
                        P.Resize.LINEAR,
                        P.Resize.AREA,
                        P.Resize.NEAREST,
                        P.Resize.CUBIC,
                        P.Resize.LANCZOS4,
                        ],
                },

Problem 4: Check failed: mean_values_.size() == 1

F1203 16:07:24.865304 12717 data_transformer.cpp:621] Check failed: mean_values_.size() == 1 || mean_values_.size() == img_channels Specify either 1 mean_value or as many as channels: 1
*** Check failure stack trace: ***
    @     0x7f6168187daa  (unknown)
    @     0x7f6168187ce4  (unknown)
    @     0x7f61681876e6  (unknown)
    @     0x7f616818a687  (unknown)
    @     0x7f61689df73d  caffe::DataTransformer<>::Transform()
    @     0x7f61689e0993  caffe::DataTransformer<>::Transform()
    @     0x7f61689ebcdb  caffe::DataTransformer<>::Transform()
    @     0x7f61689ebd98  caffe::DataTransformer<>::Transform()
    @     0x7f61689ebe3e  caffe::DataTransformer<>::Transform()
    @     0x7f616887005b  caffe::AnnotatedDataLayer<>::load_batch()
    @     0x7f616884f6dc  caffe::BasePrefetchingDataLayer<>::InternalThreadEntry()
    @     0x7f61689a1445  caffe::InternalThread::entry()
    @     0x7f615e23ba4a  (unknown)
    @     0x7f615729c184  start_thread
    @     0x7f6166aabbed  (unknown)
    @              (nil)  (unknown)

Solution: likewise, add 'force_color': True to test_transform_param at line 213 of /examples/ssd/ssd_pascal.py.

test_transform_param = {
        'mean_value': [104, 117, 123],
        'force_color': True,
        'resize_param': {
                'prob': 1,
                'resize_mode': P.Resize.WARP,
                'height': resize_height,
                'width': resize_width,
                'interp_mode': [P.Resize.LINEAR],
                },
        }


Problem 5: Check failed: status == CUDNN_STATUS_SUCCESS

F0616 16:54:55.034394 3070141376 cudnn_conv_layer.cpp:53] Check failed: status == CUDNN_STATUS_SUCCESS (4 vs. 0) CUDNN_STATUS_INTERNAL_ERROR

Cause: not enough GPU memory.

Solution: reduce batch_size. For example, I reduced the training batch_size from 32 to 8 by editing lines 338 and 339 of /examples/ssd/ssd_pascal.py.

#Divide the mini-batch to different GPUs.
#batch_size = 32
#accum_batch_size = 32
batch_size = 8
accum_batch_size = 8
iter_size = accum_batch_size / batch_size
solver_mode = P.Solver.CPU
device_id = 0
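
Note that the script derives iter_size from these two values, so lowering both to 8 also lowers the effective batch size. The relationship is sketched below; keeping accum_batch_size at 32 is a hypothetical alternative, not what the run above used.

# Caffe accumulates gradients over iter_size forward/backward passes before each
# weight update, so the effective batch size is batch_size * iter_size.
batch_size = 8                     # what fits in GPU memory
accum_batch_size = 32              # hypothetical: keep the original effective batch
iter_size = accum_batch_size // batch_size       # 4 accumulation steps
effective_batch = batch_size * iter_size         # 32, same as before, no extra memory
print(effective_batch)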

Problem 6: Check failed: label_to_name_.find(label) != label_to_name_.end() Cannot find label: 2 in the label map

F1027 detection_output_layer.cu:143] Check failed: label_to_name_.find(label) != label_to_name_.end() Cannot find label: 2 in the label map

This is caused by changing the number of classes from 21 to 2.

Solution:

1. In examples/ssd/ssd_pascal.py, change num_classes at line 269 from 21 to 2.

# MultiBoxLoss parameters.
# num_classes = 21
num_classes = 2
share_location = True
background_label_id=0
train_on_diff_gt = True
normalization_mode = P.Loss.VALID
code_type = P.PriorBox.CENTER_SIZE
ignore_cross_boundary_bbox = False
mining_type = P.MultiBoxLoss.MAX_NEGATIVE
neg_pos_ratio = 3.

2. In examples/ssd/score_ssd_pascal.py, change num_classes at line 277 from 21 to 2.

# MultiBoxLoss parameters.
# num_classes = 21
num_classes = 2
share_location = True
background_label_id=0
train_on_diff_gt = True
normalization_mode = P.Loss.VALID
code_type = P.PriorBox.CENTER_SIZE
ignore_cross_boundary_bbox = False
mining_type = P.MultiBoxLoss.MAX_NEGATIVE
neg_pos_ratio = 3.

After the .py files are changed, num_classes in the train.prototxt, test.prototxt, and deploy.prototxt files regenerated under the following directories will change accordingly:

caffe/jobs/VGGNet/VOC0712/SSD_300x300
caffe/jobs/VGGNet/VOC0712/SSD_300x300_score


caffe/models/VGGNet/VOC0712/SSD_300x300
caffe/models/VGGNet/VOC0712/SSD_300x300_score
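
A quick sanity check (my own sketch, not part of the original workflow) is to confirm that the regenerated prototxt files under those directories really contain num_classes: 2, assuming it is run from $CAFFE_ROOT:

import glob

# Scan the regenerated prototxt files and report the num_classes lines they contain.
patterns = ['jobs/VGGNet/VOC0712/SSD_300x300*/*.prototxt',
            'models/VGGNet/VOC0712/SSD_300x300*/*.prototxt']
for path in sorted(p for pat in patterns for p in glob.glob(pat)):
    with open(path) as f:
        hits = [line.strip() for line in f if 'num_classes' in line]
    if hits:
        print('%s: %s' % (path, hits))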


