[Script] One-click training of your own object-detection dataset with OpenCV's cascade classifier

Mitre-Z

Last modified 2023-03-23 18:31:50


Tags: python opencv

First published 2022-04-29 17:17:18

Article link: https://blog.csdn.net/m0_57193659/article/details/124450430


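This article presents a Python script that simplifies training an OpenCV cascade classifier, covering parameter configuration and testing. The author describes switching from YOLO to OpenCV's cascade classifier and notes that, under time pressure, OpenCV offered a dependable object-detection solution. Training requires preparing positive and negative samples and tuning parameters such as sample size, number of stages, and feature type. The testing stage shows static object detection and stresses how the training parameters affect recognition accuracy.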


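This script configures the OpenCV cascade classifier's parameters and runs the training in one step, simplifying how the tools are invoked and configured.

I recently had to build an object-detection project. I started with the YOLO framework, but after a round of bold, overconfident "optimization" plus three hours of intensive training, I was lucky if it detected one out of ten test samples. Short on time and out of options, I tried OpenCV's cascade classifier, and the result, while hardly spectacular, was usable. When it counts, OpenCV comes to the rescue.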

Steps:

1. Prepare the base files:

Create a folder that contains at least the following files:

The script is open-sourced on GitHub:

https://github.com/Z-MiCTrue/iOCO-Cascade_Classifier

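Notes:

# opencv_traincascade.exe does not depend on the OpenCV version installed for Python (for example, the trainer here comes from OpenCV 4.5.2, while the OpenCV in my Python environment is 3.4.1).

# Put positive samples in the pos folder and negative samples in the neg folder. Positive samples should ideally share the same aspect ratio: the algorithm later downsamples them to a single size, and a consistent aspect ratio keeps the features from being distorted. Negative samples can be any size, as long as they are no smaller than the training window. Keep the ratio of positive to negative samples at roughly 1:3. The figure below shows an example layout (image file names are arbitrary):

# The xml folder only needs to exist.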
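Before training, the layout described above can be sanity-checked with a few lines of Python. This is only a minimal sketch: the function name `check_workspace`, the set of image extensions, and treating the 1:3 ratio as a warning are assumptions of mine, not part of the author's script.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to match your own samples.
IMAGE_SUFFIXES = {'.jpg', '.jpeg', '.png', '.bmp'}


def check_workspace(root):
    """Return (pos_count, neg_count, problems) for a training folder."""
    root = Path(root)
    problems = []
    # The pos, neg and xml folders must all exist.
    for name in ('pos', 'neg', 'xml'):
        if not (root / name).is_dir():
            problems.append(f'missing folder: {name}')
    # Count image files directly inside pos and neg.
    counts = {}
    for name in ('pos', 'neg'):
        folder = root / name
        if folder.is_dir():
            counts[name] = sum(
                1 for p in folder.iterdir() if p.suffix.lower() in IMAGE_SUFFIXES
            )
        else:
            counts[name] = 0
    # Warn when there are fewer than about 3 negatives per positive.
    if counts['pos'] and counts['neg'] < 3 * counts['pos']:
        problems.append('fewer than 3 negatives per positive')
    return counts['pos'], counts['neg'], problems
```

Calling `check_workspace('.')` from the working folder returns the two sample counts and a list of human-readable problems; an empty list means the layout matches the description above.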

2. Start training

Training script train.py (excerpt):

import cv2
import os

from options import Parameters


class train_xml:
    def __init__(self, para):
        self.work_path = os.getcwd()
        self.para = para
        # Sample counters
        self.pos_num = 0
        self.neg_num = 0

    def generate_txt(self):
        # Write the positive-sample description file (pos.txt) in the expected format
        write_str = ''
        for root, dirs, files in os.walk(self.work_path + '\\pos'):  # directory, subdirectories, files
            for img_name in files:
                img = cv2.imread('pos\\' + img_name)
                if img is None:  # skip files that OpenCV cannot read as images
                    print(f'log: unreadable file skipped: {img_name}')
                    continue
                h, w = img.shape[:2]
                if w < self.para.aim_w or h < self.para.aim_h:  # discard samples smaller than the target size
                    print('log: positive sample discarded')
                else:
                    write_str += f'pos\\{img_name} 1 0 0 {w} {h}\n'
                    self.pos_num += 1
        with open('pos.txt', 'w') as result_file:
            result_file.write(write_str)
        # Write the negative-sample description file (neg.txt)
        write_str = ''
        for root, dirs, files in os.walk(self.work_path + '\\neg'):  # directory, subdirectories, files
            for img_name in files:
                write_str += f'neg\\{img_name}\n'
                self.neg_num += 1
        with open('neg.txt', 'w') as result_file:
            result_file.write(write_str[:-1])
        # Build the positive-sample vec file and print the command being run
        cmd = f'opencv_createsamples.exe -info {self.work_path}\\pos.txt ' \
              f'-vec pos.vec -num {self.pos_num} -w {self.para.aim_w} -h {self.para.aim_h}'
        print(f'command: {cmd}')
        os.system(cmd)

    def start_train(self, batch_size=48):
        # batch_size works like a batch size: the number of positive samples used by each stage.
        # It must be smaller than the total number of positive samples; too large a value makes training fail.
        # Write the batch file start.bat
        if self.pos_num <= batch_size:
            pos_use = self.pos_num
        else:
            pos_use = batch_size
        if 3 * pos_use >= self.neg_num:
            neg_use = self.neg_num
        else:
            neg_use = 3 * pos_use
        write_str = f'opencv_traincascade -data xml -vec pos.vec -bg {self.work_path}\\neg.txt ' \
                    f'-numStages {self.para.numStages} ' \
                    f'-featureType {self.para.featureType} ' \
                    f'-minHitRate {self.para.minHitRate} ' \
                    f'-maxFalseAlarmRate {self.para.maxFalseAlarmRate} ' \
                    f'-mode {self.para.mode} ' \
                    f'-w {self.para.aim_w} -h {self.para.aim_h} -numPos {pos_use} -numNeg {neg_use}\n\npause'
        with open('start.bat', 'w') as result_file:
            result_file.write(write_str)
        # Enter 1 to start training
        continue_switch = int(input('\nFile writing completed. Continue? (0 & 1)\necho: '))
        if continue_switch:
            cmd = 'start.bat'
            print(f'\ncommand: {cmd}')
            os.system(cmd)


if __name__ == '__main__':
    parameters = Parameters()

    worker = train_xml(parameters)
    worker.generate_txt()
    worker.start_train(batch_size=36)
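The way start_train picks -numPos and -numNeg can be isolated into a small pure function, which makes it easy to see what the script will pass to the trainer before anything is run. This is a sketch with a hypothetical helper name (`pick_sample_counts`); it mirrors the script's logic rather than anything prescribed by OpenCV.

```python
def pick_sample_counts(pos_total, neg_total, batch_size=48, neg_per_pos=3):
    """Mirror the script's choice of -numPos and -numNeg.

    pos_total / neg_total are the sample counts found on disk;
    neg_per_pos is the roughly 1:3 positive-to-negative ratio from the article.
    """
    # Use at most batch_size positives per stage.
    pos_use = min(pos_total, batch_size)
    # Use about three negatives per positive, capped by what is available.
    neg_use = min(neg_total, neg_per_pos * pos_use)
    return pos_use, neg_use


if __name__ == '__main__':
    # 100 positives and 1000 negatives, with the batch_size=36 used in train.py
    print(pick_sample_counts(100, 1000, batch_size=36))
```

With 100 positives, 1000 negatives and batch_size=36, the call yields 36 positives and 108 negatives per stage.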

Parameter configuration script options.py (excerpt):

class Parameters:
    def __init__(self):
        # Target size of the positive samples; also the detection window size
        self.aim_w = 28
        self.aim_h = 28
        # Training parameters
        self.numStages = 20
        self.featureType = 'LBP'
        self.minHitRate = 0.996
        self.maxFalseAlarmRate = 0.12
        self.mode = 'ALL'
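As a rough guide to how numStages, minHitRate and maxFalseAlarmRate interact: the per-stage rates multiply across stages, so the whole cascade's hit rate is bounded from below by minHitRate^numStages and its false-alarm rate from above by maxFalseAlarmRate^numStages. The sketch below (the helper name `overall_rates` is mine) just does that arithmetic; real training may stop early or behave better than these bounds.

```python
def overall_rates(num_stages, min_hit_rate, max_false_alarm_rate):
    """Estimate whole-cascade bounds from the per-stage targets.

    Returns (lower bound on the overall hit rate,
             upper bound on the overall false-alarm rate).
    """
    return min_hit_rate ** num_stages, max_false_alarm_rate ** num_stages


if __name__ == '__main__':
    # Values from options.py above
    hit, false_alarm = overall_rates(20, 0.996, 0.12)
    print(f'overall hit rate >= {hit:.3f}')
    print(f'overall false-alarm rate <= {false_alarm:.1e}')
```

For the values in options.py, the hit-rate bound works out to roughly 0.92 and the false-alarm bound is vanishingly small, which suggests that lowering minHitRate or adding stages mainly costs recall.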

Paths and similar parameters are detected and written automatically; everything that needs tuning lives in options.py. The parameters to watch are:

# Target size of the positive samples; also the detection window size
self.aim_w = 228
self.aim_h = 228

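# This sets the input image size used in training; keep its aspect ratio as close as possible to that of the positive samples. Two things need to be considered:

First, the hardware's RAM limit. My laptop has 16 GB of RAM, and at a little over 230 pixels memory usage was already close to 90%, so training really is memory-hungry.

Second, the pixel size of the target in the images you will actually process. A larger, sharper training sample does not automatically give better results in use, because detection typically downsamples the input image at a series of scales and slides a window of this target size over each one. The target size should therefore be one of the values obtained by dividing the object's real size in the image by the scale factors (for example 1.1, 1.1^2, ...). The accompanying figure (taken from the web) illustrates this: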
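To make the scale argument concrete, the sketch below lists the object sizes a window of a given side can match when the image is repeatedly shrunk by the scale factor. It is a simplification under my own assumptions (square window, square image, simple rounding, hypothetical function name `detectable_sizes`); OpenCV's actual implementation handles rounding and bounds differently.

```python
def detectable_sizes(window, image_side, scale_factor=1.1):
    """List approximate object sizes (in pixels) matched by a sliding window.

    Shrinking the image by scale_factor**k and sliding a window of side
    `window` over it is equivalent to looking for objects of side
    window * scale_factor**k in the original image.
    """
    sizes = []
    scale = 1.0
    # Stop once the equivalent object size no longer fits in the image.
    while window * scale <= image_side:
        sizes.append(round(window * scale))
        scale *= scale_factor
    return sizes


if __name__ == '__main__':
    # A 28-pixel window on a 40-pixel-wide image
    print(detectable_sizes(28, 40))
```

Objects smaller than the window never appear in the list, which is why a large training size can hurt detection on images where the target is small.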

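The meaning of some of the other parameters:

# These training parameters need to be adjusted for the specific scenario (for example, train with LBP when texture features matter most). I won't explain them here; for the underlying principles, see this article:

https://spacevision.blog.csdn.net/article/details/82012519

# Training example (it really eats memory):

The trainer also appears to run single-threaded: on my 8-core, 16-thread CPU, utilization settled at only ten-odd percent. This is probably one of its most regrettable shortcomings.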

3. Test the result

Test script test.py (excerpt):

import numpy as np
import cv2


class Static_detection:
    def __init__(self):
        self.classifier = cv2.CascadeClassifier('cascade.xml')
        self.img = None

    def detect(self, img, draw_box=False):
        loc = []
        res = self.classifier.detectMultiScale(img, scaleFactor=1.1, minNeighbors=8, minSize=(28, 28))
        if draw_box:
            for (x, y, w, h) in res:
                cv2.rectangle(img, (x, y), (x + w, y + h), (255, 0, 0), 2)
                loc.append([x, y, w, h])
            loc = np.array(loc)
            cv2.imshow('detection', img)
            cv2.waitKey(0)
            print('locations are: ', loc)
            return loc
        else:
            for (x, y, w, h) in res:
                loc.append([x, y, w, h])
            loc = np.array(loc)
            return loc


if __name__ == '__main__':
    eagle_eye = Static_detection()
    frame = cv2.imread('test.jpg', 1)
    # frame = cv2.resize(frame, None, fx=1/10, fy=1/10, interpolation=cv2.INTER_AREA)
    eagle_eye.detect(frame, draw_box=True)

Place this .py file in the xml folder to run it. Test result: