YOLO Installation
Getting the Source
Clone the darknet repository:
git clone https://github.com/pjreddie/darknet.git
Installation
Edit the Makefile to enable GPU, CUDNN, and so on; set an option to 0 to disable it:
GPU=1
CUDNN=1
OPENCV=1
OPENMP=0
DEBUG=0
Then run:
make -j32
If it finishes without errors, the build is complete.
License Plate Detection
Converting the Sample Format
For details on preparing samples, see: YOLO Samples.
The VOC format uses XML annotations, with bounding boxes given as:
<xmin> <ymin> <xmax> <ymax>
For example, here is an annotation snippet:
<annotation>
    <folder>VOC2007</folder>
    <filename>000001.jpg</filename>
    <source>
        <database>The VOC2007 Database</database>
        <annotation>PASCAL VOC2007</annotation>
        <image>flickr</image>
        <flickrid>341012865</flickrid>
    </source>
    <owner>
        <flickrid>Fried Camels</flickrid>
        <name>Jinky the Fruit Bat</name>
    </owner>
    <size>
        <width>353</width>
        <height>500</height>
        <depth>3</depth>
    </size>
    <segmented>0</segmented>
    <object>
        <name>dog</name>
        <pose>Left</pose>
        <truncated>1</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>48</xmin>
            <ymin>240</ymin>
            <xmax>195</xmax>
            <ymax>371</ymax>
        </bndbox>
    </object>
    <object>
        <name>person</name>
        <pose>Left</pose>
        <truncated>1</truncated>
        <difficult>0</difficult>
        <bndbox>
            <xmin>8</xmin>
            <ymin>12</ymin>
            <xmax>352</xmax>
            <ymax>498</ymax>
        </bndbox>
    </object>
</annotation>
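As a quick sanity check, annotations in this format can be read with Python's standard xml.etree module. The parse_voc helper below is illustrative, not part of darknet; the embedded XML is the article's snippet truncated to one object:

```python
# Parse a PASCAL VOC annotation and extract each object's class and box.
import xml.etree.ElementTree as ET

VOC_XML = """<annotation>
  <size><width>353</width><height>500</height><depth>3</depth></size>
  <object>
    <name>dog</name>
    <bndbox><xmin>48</xmin><ymin>240</ymin><xmax>195</xmax><ymax>371</ymax></bndbox>
  </object>
</annotation>"""

def parse_voc(xml_text):
    """Return (width, height, [(name, xmin, ymin, xmax, ymax), ...])."""
    root = ET.fromstring(xml_text)
    size = root.find('size')
    w = int(size.find('width').text)
    h = int(size.find('height').text)
    objects = []
    for obj in root.iter('object'):
        name = obj.find('name').text
        box = obj.find('bndbox')
        coords = tuple(int(box.find(t).text) for t in ('xmin', 'ymin', 'xmax', 'ymax'))
        objects.append((name,) + coords)
    return w, h, objects

print(parse_voc(VOC_XML))  # -> (353, 500, [('dog', 48, 240, 195, 371)])
```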
The YOLO annotation format is:
<class id> <box center x / image width> <box center y / image height> <box width / image width> <box height / image height>
The XML annotation above corresponds to the following YOLO labels:
11 0.341359773371 0.609 0.416430594901 0.262
14 0.507082152975 0.508 0.974504249292 0.972
Here 11 and 14 are the class ids, (0.341, 0.609) and (0.507, 0.508) are the normalized center coordinates, and (0.416, 0.262) and (0.974, 0.972) are the normalized widths and heights.
Run voc_label.py from the scripts directory to perform the conversion.
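The coordinate math can be sketched as follows. This mirrors the convert() function in voc_label.py (including its 1-pixel center shift); exact behavior may differ between versions of the script:

```python
# VOC corner coordinates -> YOLO normalized center/size.
def convert(size, box):
    """size = (img_w, img_h), box = (xmin, xmax, ymin, ymax)."""
    dw = 1.0 / size[0]
    dh = 1.0 / size[1]
    x = (box[0] + box[1]) / 2.0 - 1   # box center x (script shifts by 1 pixel)
    y = (box[2] + box[3]) / 2.0 - 1   # box center y
    w = box[1] - box[0]               # box width
    h = box[3] - box[2]               # box height
    return (x * dw, y * dh, w * dw, h * dh)

# The dog box from the XML above: image 353x500, box (48, 195, 240, 371)
x, y, w, h = convert((353, 500), (48, 195, 240, 371))
print(x, y, w, h)  # ~ 0.3414 0.609 0.4164 0.262, matching the first YOLO line
```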
Modifying the Network
License plate detection involves only one class, so the network needs the following changes.
Copy "cfg/yolov2-tiny-voc.cfg" to "cfg/yolov2-tiny-lpd.cfg" and edit it as follows:
###########

[convolutional]
batch_normalize=1
size=3
stride=1
pad=1
filters=1024
activation=leaky

[convolutional]
size=1
stride=1
pad=1
filters=30   # num * (classes + coords + 1)
activation=linear

[region]
anchors = 1.08,1.19, 3.42,4.41, 6.63,11.38, 9.42,5.11, 16.62,10.52
bias_match=1
classes=1
coords=4
num=5
softmax=1
jitter=.2
rescore=1
object_scale=5
noobject_scale=1
class_scale=1
coord_scale=1
absolute=1
thresh = .6
random=1
The changes are:
classes=1 : there is only one class (the license plate).
filters=30 : the filters value of the convolutional layer before [region] must be recomputed as num × (classes + coords + 1) = 5 × (1 + 4 + 1) = 30.
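The filters formula can be checked with a one-line helper (illustrative, not part of darknet):

```python
# filters of the conv layer feeding [region] = num * (classes + coords + 1),
# where num, classes and coords come from the [region] section of the cfg.
def region_filters(num, classes, coords):
    return num * (classes + coords + 1)

print(region_filters(num=5, classes=1, coords=4))   # -> 30 (our 1-class cfg)
print(region_filters(num=5, classes=20, coords=4))  # -> 125 (original 20-class VOC cfg)
```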
Other Files
Edit cfg/data/lpd/lpd.data, where train, test, and valid point to the corresponding list files produced by running voc_label.py:
classes=1
train = /mnt/d/DataSets/LPR/2018_trainval.txt
valid = /mnt/d/DataSets/LPR/2018_val.txt
test = /mnt/d/DataSets/LPR/2018_test.txt
names = [yourdir]/darknet/cfg/data/lpd/lpd.names
backup = [yourdir]/darknet/backup/
Change cfg/data/lpd/lpd.names to:
LicensePlate
Training
./darknet detector train cfg/data/lpd/lpd.data cfg/yolov2-tiny-lpd.cfg darknet53.conv.74 -gpus 0,1 >> backup/training.log
Visualizing Training
Create two files, extract_log.py and train_visualization.py (listed below), and run them in order:
python extract_log.py
python train_visualization.py
The visualized training loss of the license plate detection network:
As the loss curve shows, training has not yet converged to an optimum.
extract_log.py:
#!/usr/bin/env python

in_log_file = '../../backup/training.log'
out_loss_file = './train_loss.log'
out_iou_file = './train_iou.log'


def extract_log(log_file, new_log_file, key_word):
    f = open(log_file)
    train_log = open(new_log_file, 'w')
    for line in f:
        # remove Sync log of multi-gpu
        if 'Syncing' in line:
            continue
        # remove error log
        if 'nan' in line:
            continue
        if key_word in line:
            train_log.write(line)
    f.close()
    train_log.close()


extract_log(in_log_file, out_loss_file, 'images')
extract_log(in_log_file, out_iou_file, 'IOU')
train_visualization.py:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
# %matplotlib inline

loss_log_file = './train_loss.log'
lines = 100
skiprows = [x for x in range(lines)]  # skip the first `lines` log rows
print(skiprows)

result = pd.read_csv(loss_log_file, skiprows=skiprows, error_bad_lines=False,
                     names=['loss', 'avg', 'rate', 'seconds', 'images'])
result.head()

# each column holds text such as '1: 102.5' or '110.2 avg'; keep the second token
result['loss'] = result['loss'].str.split(' ').str.get(1)
result['avg'] = result['avg'].str.split(' ').str.get(1)
result['rate'] = result['rate'].str.split(' ').str.get(1)
result['seconds'] = result['seconds'].str.split(' ').str.get(1)
result['images'] = result['images'].str.split(' ').str.get(1)
result.head()
result.tail()
# print(result.head())
# print(result.tail())
# print(result.dtypes)

print(result['loss'])
print(result['avg'])
print(result['rate'])
print(result['seconds'])
print(result['images'])

result['loss'] = pd.to_numeric(result['loss'])
result['avg'] = pd.to_numeric(result['avg'])
result['rate'] = pd.to_numeric(result['rate'])
result['seconds'] = pd.to_numeric(result['seconds'])
result['images'] = pd.to_numeric(result['images'])
result.dtypes

fig = plt.figure()
ax = fig.add_subplot(1, 1, 1)
ax.plot(result['avg'].values, label='avg_loss')
# ax.plot(result['loss'].values, label='loss')
ax.legend(loc='best')
ax.set_title('The loss curves')
ax.set_xlabel('batches')
fig.savefig('avg_loss')
# fig.savefig('loss')
Testing
./darknet detector test cfg/data/lpd/lpd.data cfg/yolov2-tiny-lpd.cfg backup/yolov2-tiny-lpd_80000.weights data/009.jpg -thresh 0.3
Plate with overlaid text: 83% confidence.
Not sure whether it is the plate color or the viewing angle, but this one reaches only 5% confidence; possibly there were no white plates in the training samples.
Low resolution: 42% confidence, and only 1% for the motorcycle plate (the threshold here is 0.3).
Multiple vehicles: 73% and 70% confidence respectively.
Night scene: 87% confidence.
Oblique view: 67% confidence.
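The -thresh flag simply drops detections whose confidence falls below the threshold, which is why the 5% and 1% boxes above only appear when the threshold is lowered. A minimal sketch (labels and values here are illustrative, taken from the results quoted above):

```python
# Keep only detections at or above the confidence threshold,
# mimicking the effect of darknet's -thresh option.
def apply_thresh(detections, thresh=0.3):
    return [(label, conf) for label, conf in detections if conf >= thresh]

detections = [('plate_with_text', 0.83), ('white_plate', 0.05),
              ('low_res_plate', 0.42), ('motorcycle_plate', 0.01)]
print(apply_thresh(detections, 0.3))  # only the 0.83 and 0.42 detections remain
```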