Introduction
This series of articles covers how to train your own classifier in OpenCV using AdaBoost with Haar-like features, and how to use that classifier for object detection.
Haar refers to a type of feature, while AdaBoost is a boosting (weighted resampling) technique; the trained result is properly called an AdaBoost classifier, so strictly speaking there is no such thing as a "Haar classifier". The face-detection .xml files that ship with OpenCV were trained with exactly this Haar + AdaBoost pipeline.
The overall process breaks down into the following steps:
1. Prepare the training sample images, both positive and negative
2. Generate the sample description file
3. Train on the samples
4. Detect targets
This article covers steps 1 and 2.
1. Prepare the training sample images, both positive and negative
1) Collecting positive samples:
A positive sample is an image that contains only the object to be detected, usually a cropped local patch, ideally converted to grayscale. For example, if you want to detect faces, a positive sample should contain as little besides the face as possible; a small margin of background is fine, but not more. For annotating positives there are two labeling tools: (1) OpenCV's imageClipper, and (2) objectMarker. Both let you draw a bounding rectangle around the object, automatically write the sample description file, and automatically step to the next image in the folder. I used objectMarker. objectMarker download link
When annotating, try to keep the aspect ratio consistent, i.e. use a roughly square rectangle around the object; the exact size of the square matters little. OpenCV's recommended training size is 20x20, but the next step (generating the sample description file) can easily rescale other sizes to 20x20 or 20x40. An example of the info.txt description file produced after annotation:
The rawdata folder holds all the large images to be annotated, and objectMarker.exe sits at the same level as the rawdata folder. This description file is already close to the format OpenCV requires. After objectMarker generates info.txt in step 1, we only need minor format adjustments: use editplus or ultraedit to strip the rawdata path prefix from every line, then save the file as sample_pos.dat (or any name you like).
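The same path cleanup can also be scripted rather than done by hand in editplus. A minimal sketch, assuming objectMarker wrote lines like `rawdata/xxx.bmp 1 x y w h` (adjust the prefix to match your info.txt):

```python
def strip_rawdata_prefix(lines, prefix="rawdata/"):
    """Drop the folder prefix from every annotation line of info.txt."""
    return [line.replace(prefix, "") for line in lines]

# Usage (file names as in this article):
# with open("info.txt", "r") as src, open("sample_pos.dat", "w") as dst:
#     dst.writelines(strip_rawdata_prefix(src.readlines()))
```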
objectMarker only accepts .bmp files, which is a hassle, so instead I generate the positive samples and write sample_pos.dat directly with Python file operations.
In each line, the positive sample's file name is followed by a 1, meaning the image contains one positive sample, whose bounding box starts at the top-left corner (0, 0) with width 228 and height 72. The width and height are 228 and 72 because I scaled every positive sample to 228x72, so each positive image is entirely one sample; they will be rescaled again later. Meaning of the coordinates:
Code:
import os
import random
from math import cos, sin, pi

import cv2
import numpy as np
import tensorflow as tf
from PIL import Image, ImageDraw, ImageFont

index = {"京": 0, "沪": 1, "津": 2, "渝": 3, "冀": 4, "晋": 5, "蒙": 6, "辽": 7, "吉": 8, "黑": 9, "苏": 10, "浙": 11, "皖": 12,
         "闽": 13, "赣": 14, "鲁": 15, "豫": 16, "鄂": 17, "湘": 18, "粤": 19, "桂": 20, "琼": 21, "川": 22, "贵": 23, "云": 24,
         "藏": 25, "陕": 26, "甘": 27, "青": 28, "宁": 29, "新": 30, "0": 31, "1": 32, "2": 33, "3": 34, "4": 35, "5": 36,
         "6": 37, "7": 38, "8": 39, "9": 40, "A": 41, "B": 42, "C": 43, "D": 44, "E": 45, "F": 46, "G": 47, "H": 48,
         "J": 49, "K": 50, "L": 51, "M": 52, "N": 53, "P": 54, "Q": 55, "R": 56, "S": 57, "T": 58, "U": 59, "V": 60,
         "W": 61, "X": 62, "Y": 63, "Z": 64}
chars = ["京", "沪", "津", "渝", "冀", "晋", "蒙", "辽", "吉", "黑", "苏", "浙", "皖", "闽", "赣", "鲁", "豫", "鄂", "湘", "粤", "桂",
         "琼", "川", "贵", "云", "藏", "陕", "甘", "青", "宁", "新", "0", "1", "2", "3", "4", "5", "6", "7", "8", "9", "A",
         "B", "C", "D", "E", "F", "G", "H", "J", "K", "L", "M", "N", "P", "Q", "R", "S", "T", "U", "V", "W", "X",
         "Y", "Z"]
# CHAR_SET_LEN = 31 + 24 + 5*34  # 31 provinces, 24 letters, 34 digits + letters
CHAR_SET_LEN = 34  # the later predict = tf.reshape needs a uniform size, so it must be MAX_CAPTCHA * CHAR_SET_LEN
MAX_CAPTCHA = 7
def AddSmudginess(img, Smu):
    """Blend a random 50x50 smudge patch from Smu into img."""
    rows = r(Smu.shape[0] - 50)
    cols = r(Smu.shape[1] - 50)
    adder = Smu[rows:rows + 50, cols:cols + 50]
    adder = cv2.resize(adder, (50, 50))
    img = cv2.resize(img, (50, 50))
    img = cv2.bitwise_not(img)
    img = cv2.bitwise_and(adder, img)
    img = cv2.bitwise_not(img)
    return img
def rot(img, angel, shape, max_angel):
    """Apply a slight perspective skew to the image.
    img       input image
    angel     skew angle for this call
    shape     shape of the input image
    max_angel maximum skew angle, used to size the output canvas
    """
    size_o = [shape[1], shape[0]]
    size = (shape[1] + int(shape[0] * cos((float(max_angel) / 180) * pi)), shape[0])
    interval = abs(int(sin((float(angel) / 180) * pi) * shape[0]))
    pts1 = np.float32([[0, 0], [0, size_o[1]], [size_o[0], 0], [size_o[0], size_o[1]]])
    if angel > 0:
        pts2 = np.float32([[interval, 0], [0, size[1]], [size[0], 0], [size[0] - interval, size_o[1]]])
    else:
        pts2 = np.float32([[0, 0], [interval, size[1]], [size[0] - interval, 0], [size[0], size_o[1]]])
    M = cv2.getPerspectiveTransform(pts1, pts2)
    dst = cv2.warpPerspective(img, M, size)
    return dst
def rotRandrom(img, factor, size):
    """Add a random perspective distortion."""
    shape = size
    pts1 = np.float32([[0, 0], [0, shape[0]], [shape[1], 0], [shape[1], shape[0]]])
    pts2 = np.float32([[r(factor), r(factor)], [r(factor), shape[0] - r(factor)], [shape[1] - r(factor), r(factor)],
                       [shape[1] - r(factor), shape[0] - r(factor)]])
    M = cv2.getPerspectiveTransform(pts1, pts2)
    dst = cv2.warpPerspective(img, M, size)
    return dst
def tfactor(img):
    """Add random hue/saturation/brightness jitter."""
    hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
    hsv[:, :, 0] = hsv[:, :, 0] * (0.8 + np.random.random() * 0.2)
    hsv[:, :, 1] = hsv[:, :, 1] * (0.3 + np.random.random() * 0.7)
    hsv[:, :, 2] = hsv[:, :, 2] * (0.2 + np.random.random() * 0.8)
    img = cv2.cvtColor(hsv, cv2.COLOR_HSV2BGR)
    return img
def random_envirment(img, data_set):
    """Composite a random natural background into the black areas of img."""
    index = r(len(data_set))
    env = cv2.imread(data_set[index])
    env = cv2.resize(env, (img.shape[1], img.shape[0]))
    bak = (img == 0)
    bak = bak.astype(np.uint8) * 255
    inv = cv2.bitwise_and(bak, env)
    img = cv2.bitwise_or(inv, img)
    return img
def GenCh(f, val, rand_choose_platecolor):
    """Render a Chinese character."""
    img = Image.new("RGB", (45, 70), (255, 255, 255))
    draw = ImageDraw.Draw(img)
    if rand_choose_platecolor == 0:
        draw.text((0, 2), val, fill=(0, 0, 0), font=f)
    else:
        draw.text((0, 2), val, fill=(0, 255, 255), font=f)
    img = img.resize((23, 70))
    A = np.array(img)
    return A
def GenCh1(f, val, rand_choose_platecolor):
    """Render a letter or digit."""
    img = Image.new("RGB", (23, 70), (255, 255, 255))
    draw = ImageDraw.Draw(img)
    if rand_choose_platecolor == 0:
        draw.text((0, 2), val, fill=(0, 0, 0), font=f)
    else:
        draw.text((0, 2), val, fill=(0, 255, 255), font=f)
    A = np.array(img)
    return A
def AddGauss(img, level):
    """Blur the image (note: cv2.blur is a box filter, not a Gaussian)."""
    return cv2.blur(img, (level * 2 + 1, level * 2 + 1))
def r(val):
    """Random integer in [0, val)."""
    return int(np.random.random() * val)
def rand_range(lo, hi):
    return lo + r(hi - lo)
def AddNoiseSingleChannel(single):
    """Add random noise to one channel, scaled so the result cannot overflow 255."""
    diff = 255 - single.max()
    noise = np.random.normal(0, 1 + r(6), single.shape)
    noise = (noise - noise.min()) / (noise.max() - noise.min())
    noise = diff * noise
    noise = noise.astype(np.uint8)
    dst = single + noise
    return dst
def addNoise(img, sdev=0.5, avg=10):
    img[:, :, 0] = AddNoiseSingleChannel(img[:, :, 0])
    img[:, :, 1] = AddNoiseSingleChannel(img[:, :, 1])
    img[:, :, 2] = AddNoiseSingleChannel(img[:, :, 2])
    return img
def gen_rand():
    """Generate a random 7-character plate: province + letter + 5 alphanumerics.
    Returns the plate string and the list of character indices."""
    name = ""
    label = []
    label.append(rand_range(0, 31))   # province character
    label.append(rand_range(41, 65))  # letter
    for i in range(5):
        label.append(rand_range(31, 65))  # digit or letter
    name += chars[label[0]]
    name += chars[label[1]]
    for i in range(5):
        name += chars[label[i + 2]]
    return name, label
def label2vec(lab):
    """Encode a label (list of 7 character indices) as a one-hot vector."""
    text_len = len(lab)
    if text_len != MAX_CAPTCHA:
        raise ValueError('a plate must have 7 characters')
    vector = np.zeros(MAX_CAPTCHA * CHAR_SET_LEN)
    for i, c in enumerate(lab):
        if i == 0:
            idx = c  # province: indices 0-30 fit in the first block as-is
        elif i == 1:
            idx = c - 41 + CHAR_SET_LEN  # letter: shift 41-64 into the second block
        else:
            idx = (c - 31) + CHAR_SET_LEN * i  # alphanumeric: shift 31-64 into block i
        vector[idx] = 1
    return vector
def vec2text(vec):
    """Decode a one-hot vector back into the plate string."""
    char_pos = vec.nonzero()[0]
    text = []
    for i, c in enumerate(char_pos):
        if i == 0:
            pos = c  # province
        elif i == 1:
            pos = c + 7  # letter: same as c - CHAR_SET_LEN + 41
        else:
            pos = c - CHAR_SET_LEN * i + 31  # alphanumeric
        char_code = chars[pos]
        text.append(char_code)
    return "".join(text)
# Convert a color image to grayscale (color carries no useful information here)
def convert2gray(img):
    if len(img.shape) > 2:
        gray = np.mean(img, -1)
        # The mean is faster; the standard conversion would be:
        # r, g, b = img[:,:,0], img[:,:,1], img[:,:,2]
        # gray = 0.2989 * r + 0.5870 * g + 0.1140 * b
        return gray
    else:
        return img
class GenPlate:
    def __init__(self, fontCh, fontEng, NoPlates):
        self.fontC = ImageFont.truetype(fontCh, 43, 0)
        self.fontE = ImageFont.truetype(fontEng, 60, 0)
        self.img = np.array(Image.new("RGB", (226, 70), (255, 255, 255)))
        self.bg = []
        self.noplates_path = []
        for parent, parent_folder, filenames in os.walk(NoPlates):
            for filename in filenames:
                path = parent + "/" + filename
                self.noplates_path.append(path)
    def draw(self, val):
        offset = 2
        rand_choose_platecolor = random.choice([0, 1])
        self.img[0:70, offset + 8:offset + 8 + 23] = GenCh(self.fontC, val[0], rand_choose_platecolor)
        self.img[0:70, offset + 8 + 23 + 6:offset + 8 + 23 + 6 + 23] = GenCh1(self.fontE, val[1], rand_choose_platecolor)
        for i in range(5):
            base = offset + 8 + 23 + 6 + 23 + 17 + i * 23 + i * 6
            self.img[0:70, base:base + 23] = GenCh1(self.fontE, val[i + 2], rand_choose_platecolor)
        return self.img
    def generate(self, text):
        if len(text) != 7:
            # the original fell through to an undefined `com` here
            raise ValueError('len(text) != 7')
        rand_choose_plate = random.choice([0, 1])
        fg = self.draw(text)
        fg = cv2.bitwise_not(fg)
        if rand_choose_plate == 0:
            self.bg = cv2.resize(cv2.imread("./images/template_blue.bmp"), (226, 70))
        else:
            self.bg = cv2.resize(cv2.imread("./images/template_yellow.bmp"), (226, 70))
        com = cv2.bitwise_or(fg, self.bg)
        com = rot(com, r(60) - 30, com.shape, 30)
        com = rotRandrom(com, 10, (com.shape[1], com.shape[0]))
        com = tfactor(com)
        com = random_envirment(com, self.noplates_path)
        com = AddGauss(com, 1 + r(4))
        com = addNoise(com)
        return com
    def genBatch(self, batchSize, outputPath, size):
        batch_x = np.zeros([batchSize, size[0] * size[1]])
        batch_y = np.zeros([batchSize, MAX_CAPTCHA * CHAR_SET_LEN])
        if not os.path.exists(outputPath):
            os.mkdir(outputPath)
        for i in range(batchSize):
            char_name, label = gen_rand()
            img = self.generate(char_name)  # the original called the global G here
            img = cv2.resize(img, size)
            # cv2.imshow('img', img)
            # cv2.waitKey(0)
            img = convert2gray(img)
            batch_x[i, :] = img.flatten() / 255
            batch_y[i, :] = label2vec(label)
            cv2.imwrite(outputPath + "/" + str(i).zfill(8) + ".jpg", img)
            with open("sample_pos.dat", "a", encoding='utf-8') as f:
                f.write("{_name_}.jpg 1 0 0 {_w_} {_h_}\n".format(
                    _name_=str(i).zfill(8), _w_=size[0], _h_=size[1]))
        return batch_x, batch_y
G = GenPlate("./font/platech.ttf", './font/platechar.ttf', "./NoPlates")
if __name__ == '__main__':
    # quick test
    Image_X, Batch_y = G.genBatch(3000, "./plate", (228, 72))
    with tf.Session() as sess:
        max_idx_l = sess.run(tf.argmax(tf.reshape(Batch_y, [-1, MAX_CAPTCHA, CHAR_SET_LEN]), 2))
        text = max_idx_l[0].tolist()
        vector = np.zeros(MAX_CAPTCHA * CHAR_SET_LEN)
        i = 0
        for n in text:
            vector[i * CHAR_SET_LEN + n] = 1
            i += 1
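As a sanity check on the label layout used above, the offsets can be exercised in isolation. This is a self-contained restatement of the label2vec/vec2text index math (same blocks, same shifts), not the script itself:

```python
import numpy as np

# 7 blocks of CHAR_SET_LEN entries: block 0 holds the province index as-is,
# block 1 the letter index shifted by -41, blocks 2-6 the index shifted by -31.
CHAR_SET_LEN = 34
MAX_CAPTCHA = 7

def encode(label):
    vec = np.zeros(MAX_CAPTCHA * CHAR_SET_LEN)
    for i, c in enumerate(label):
        if i == 0:
            vec[c] = 1                            # province: 0-30
        elif i == 1:
            vec[c - 41 + CHAR_SET_LEN] = 1        # letter: 41-64
        else:
            vec[(c - 31) + CHAR_SET_LEN * i] = 1  # alphanumeric: 31-64
    return vec

def decode(vec):
    label = []
    for i, c in enumerate(vec.nonzero()[0]):
        if i == 0:
            label.append(int(c))
        elif i == 1:
            label.append(int(c - CHAR_SET_LEN + 41))
        else:
            label.append(int(c - CHAR_SET_LEN * i + 31))
    return label

label = [4, 43, 33, 45, 52, 31, 40]  # indices into the chars table above
assert decode(encode(label)) == label
```

The block positions grow monotonically from slot to slot, which is why decoding from the sorted `nonzero()` positions recovers the characters in order.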
Positive sample images:
2) Collecting negative samples:
A negative sample is any image that contains none of the object to be detected, so in principle sky, beaches, mountains, anything will do. In practice that is usually effort badly spent. Most detection problems of this kind live in video surveillance, where the camera angle and height are more or less fixed. If you know what your camera will typically see, you can choose negatives very specifically and get much better results for less work. For example, say you are building anomaly detection for a railway station square; pedestrian detection is a required step, and the video background is mostly the square's pavement and buildings. Grab one frame when the square is empty, plus one per lighting condition and time of day, then cut random blocks of H=400, W=600 out of those frames; every block is one negative sample. A handful of frames can yield thousands or tens of thousands of negatives. A negative image only needs to be no smaller than the positive sample size: when OpenCV uses one of your negative images it automatically crops a patch the same size as the positives out of it (see the OpenCV function cvGetNextFromBackgroundData()). Targeted negatives also keep training focused: oceans and mountains contribute nothing to the detector and only inflate training time. I saved video frames as images for my negatives, 5000 of them, though as explained above the number of images is not the number of negative samples. Put these images in a folder named neg, with pos and neg under the same directory. A screenshot:
Likewise, name the negative description file "neg_sample.dat"; open the .dat and use editplus to replace "jpg" with "jpg 1 0 0 W H", and the negative description file is done. I also wrote a small program that automatically generates any number of negatives at a given size by cropping the center of each background image:
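The random-block extraction described above can be sketched as a small helper. The 400x600 block size comes from the text; the function itself and its usage are an assumption, not part of the original scripts:

```python
import numpy as np

def random_patches(image, n, winH=400, winW=600):
    """Cut n random winH x winW blocks out of one background frame.
    Each returned block is one negative sample."""
    H, W = image.shape[:2]
    patches = []
    for _ in range(n):
        y = np.random.randint(0, H - winH + 1)
        x = np.random.randint(0, W - winW + 1)
        patches.append(image[y:y + winH, x:x + winW])
    return patches

# Usage: load one empty-square frame with cv2.imread, then
# cv2.imwrite each patch into the neg folder.
```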
'''
Created on 2017-10-05
@author: XT
'''
import os
import cv2
import numpy as np
# Convert a color image to grayscale
def convert2gray(img):
    if len(img.shape) > 2:
        gray = np.mean(img, -1)
        # The mean is faster; the standard conversion would be:
        # r, g, b = img[:,:,0], img[:,:,1], img[:,:,2]
        # gray = 0.2989 * r + 0.5870 * g + 0.1140 * b
        return gray
    else:
        return img
file_dir = "F:\\2345Do\\"
classes = {"StreetScene"}
i = 0
for index, name in enumerate(classes):
    class_path = file_dir + name + "\\"
    for img_name in os.listdir(class_path):
        img_path = class_path + img_name  # path of each image
        image = cv2.imread(img_path)
        H, W, _ = image.shape
        if W >= 600 and H >= 400:
            y = H // 2
            x = W // 2
            winW = 600 // 2  # half the target crop width (// keeps the .dat fields integers)
            winH = 400 // 2  # half the target crop height
            # cv2.rectangle(image, (x - winW, y - winH), (x + winW, y + winH), (0, 255, 0), 2)
            # cv2.imshow('rectimage', image)
            # cv2.waitKey(0)
            cropImg = image[y - winH:y + winH, x - winW:x + winW]
            grayimg = convert2gray(cropImg)
            cv2.imwrite('F:\\objectmarker\\neg\\streets{:08d}.jpg'.format(i), grayimg)
            with open("F:\\objectmarker\\sample_neg.dat", 'a', encoding='utf-8') as f:
                f.write("streets{_i_}.jpg 1 0 0 {_w_} {_h_}\n".format(_i_=str(i).zfill(8), _w_=winW * 2, _h_=winH * 2))
            i += 1
The generated images and the description file (named neg_sample.dat or sample_neg.dat, the name does not matter):
If that still is not enough, you can also resize whole images to H=400, W=600:
'''
Created on 2017-10-05
@author: XT
'''
import os
import cv2
import numpy as np
# Convert a color image to grayscale
def convert2gray(img):
    if len(img.shape) > 2:
        gray = np.mean(img, -1)
        # The mean is faster; the standard conversion would be:
        # r, g, b = img[:,:,0], img[:,:,1], img[:,:,2]
        # gray = 0.2989 * r + 0.5870 * g + 0.1140 * b
        return gray
    else:
        return img
file_dir = "F:\\2345Do\\"
classes = {"StreetScene"}
i = 0
for index, name in enumerate(classes):
    class_path = file_dir + name + "\\"
    for img_name in os.listdir(class_path):
        img_path = class_path + img_name  # path of each image
        image = cv2.imread(img_path)
        winW = 600  # target width
        winH = 400  # target height
        resizeImg = cv2.resize(image, (winW, winH))
        # cv2.imshow("resizeImg", resizeImg)
        # cv2.waitKey(0)
        grayimg = convert2gray(resizeImg)
        # the 5268 offset continues the numbering after the crops generated above
        cv2.imwrite('F:\\objectmarker\\new_neg\\streets{:08d}.jpg'.format(i + 5268), grayimg)
        with open("F:\\objectmarker\\new_sample_neg.dat", 'a', encoding='utf-8') as f:
            f.write("streets{_i_}.jpg 1 0 0 {_w_} {_h_}\n".format(_i_=str(i + 5268).zfill(8), _w_=winW, _h_=winH))
        i += 1
Resize result:
2. Generate the sample description file
The sample description file here means the .vec file, which holds binary data prepared for OpenCV training. Only positive samples need a .vec file; negatives do not, their .dat file is enough. To generate it we use the opencv_createsamples.exe utility that ships with OpenCV, normally found in the /bin folder of the OpenCV install directory (Ctrl+F is your friend). If it is missing, compiling it yourself is quick; a prebuilt version is also available here: http://en.pudn.com/downloads204/sourcecode/graph/texture_mapping/detail958471_en.html (someone else's compiled OpenCV project; the exe is under bin). Note that this exe depends on cv200.dll, cxcore200.dll and highgui200.dll, so keep all four files in the same directory.
Create the sample description file with opencv_createsamples.exe; first put sample_pos.dat into the pos folder:
Open cmd.exe, cd to the directory containing opencv_createsamples.exe, and run:
opencv_createsamples.exe -info ./pos/sample_pos.dat -vec ./pos/sample_pos.vec -num 3011 -w 40 -h 20 -show YES
Parameter notes: -info, the sample description (.dat) file
-vec, name and path of the output sample description (.vec) file
-num, the total number of samples. Note that this is the number of annotated samples (here w=40, h=20), not the number of large images; it is simply the sum of the second column of the sample description file.
A small counting program:
'''
Created on 2017-10-07
@author: XT
'''
with open("F:\\objectmarker\\neg\\sample_neg.dat", 'r', encoding='utf-8') as f1, \
     open("F:\\objectmarker\\neg\\new_sample_neg.dat", 'w', encoding='utf-8') as f2:
    total_sum = 0  # running sample count
    for line in f1:
        str_line = line.split(" ")[:-2]  # drop the old width/height fields
        str_line.append('600')
        str_line.append('400')
        new_line = " ".join(str_line)
        total_sum += int(str_line[1])  # column 2 is the per-image sample count
        f2.write(new_line)
        f2.write("\n")
print("sample count:", total_sum)
-w -h: the size every sample is scaled to. The nice part is that you do not need to resize the rectangles annotated in step 1 yourself; these parameters do the uniform scaling for you.
-show: whether to display each sample. YES is fine for a handful of samples, but with many samples set it to NO or omit the option entirely, or you will be closing windows until you cry.
In practice:
cd /d F:\objectmarker
opencv_createsamples.exe -info ./pos/sample_pos.dat -vec ./pos/sample_pos.vec -num 3011 -w 40 -h 20