Notes on training Faster R-CNN and other models with Wider Face or your own dataset

AQUILIOS

Last modified 2024-02-10 07:29:39

Tags: object detection, python, career change to IT

First published 2024-02-09 22:57:33

Article link: https://blog.csdn.net/AQUILIOS/article/details/136087373

Contents

1 Source of the model code
2 Preparing the Wider Face dataset for training
3 Results
To be continued...

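1 Source of the model code

The model code I use comes from the Bilibili creator @霹雳吧啦Wz. His data-loading code was written with the Pascal VOC dataset in mind, so using any other dataset inevitably runs into some problems. This post records how I solved them.
PS: if you use the Ultralytics version of YOLOv8, their code is able to ignore incorrectly annotated boxes.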

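2 Preparing the Wider Face dataset for training

2.1 Converting to VOC format

I followed the approach in this article. After one pass through its workflow, the annotations are available in VOC, COCO, and YOLO formats.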
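
To make the conversion step concrete, here is a minimal sketch of turning Wider Face annotations into VOC-style XML. It is not the script from the referenced article. It assumes the usual Wider Face ground-truth text layout (an image path, a face count, then one row per face starting with `x y w h`), and it assumes that an image with zero faces is still followed by one placeholder row. The function names `parse_wider_annotations` and `to_voc_xml` are hypothetical, and the XML contains only the fields a typical VOC reader needs.

```python
import xml.etree.ElementTree as ET


def parse_wider_annotations(lines):
    """Parse Wider Face ground-truth text lines into {image_path: [boxes]}.

    Each box is returned as VOC-style corners (xmin, ymin, xmax, ymax).
    """
    records = {}
    i = 0
    while i < len(lines):
        image_path = lines[i].strip()
        count = int(lines[i + 1])
        # Assumption: an image with 0 faces still has one placeholder row.
        n_rows = max(count, 1)
        boxes = []
        for row in lines[i + 2 : i + 2 + count]:
            # Wider Face stores left, top, width, height; the rest are attributes.
            x, y, w, h = (int(v) for v in row.split()[:4])
            boxes.append((x, y, x + w, y + h))
        records[image_path] = boxes
        i += 2 + n_rows
    return records


def to_voc_xml(filename, width, height, boxes, label="face"):
    """Build a minimal VOC-style annotation string for one image."""
    root = ET.Element("annotation")
    ET.SubElement(root, "filename").text = filename
    size = ET.SubElement(root, "size")
    ET.SubElement(size, "width").text = str(width)
    ET.SubElement(size, "height").text = str(height)
    ET.SubElement(size, "depth").text = "3"
    for box in boxes:
        obj = ET.SubElement(root, "object")
        ET.SubElement(obj, "name").text = label
        ET.SubElement(obj, "difficult").text = "0"
        bndbox = ET.SubElement(obj, "bndbox")
        for tag, value in zip(("xmin", "ymin", "xmax", "ymax"), box):
            ET.SubElement(bndbox, tag).text = str(value)
    return ET.tostring(root, encoding="unicode")
```

The image width and height are not stored in the Wider Face text file, so in practice they would have to be read from each image before calling `to_voc_xml`. Note that a row with `w` or `h` equal to 0 converts to a box with `xmax == xmin` or `ymax == ymin`, which is exactly the kind of box that causes the problem described in section 2.2.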

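2.2 Incorrectly annotated boxes that stop training

I initially used the retinanet model from his repository. Its code validates the annotations: when it encounters a box whose width or height is zero (or negative), it raises an error. The relevant code is at lines 482-493 of retinaNet/network_files/retinanet.py: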

        if targets is not None:
            for target_idx, target in enumerate(targets):
                boxes = target["boxes"]
                # True wherever xmax <= xmin or ymax <= ymin
                degenerate_boxes = boxes[:, 2:] <= boxes[:, :2]
                if degenerate_boxes.any():
                    # I added this `continue`; without it, training stops here.
                    # The rest of this branch is now unreachable.
                    continue
                    # print the first degenerate box
                    bb_idx = torch.where(degenerate_boxes.any(dim=1))[0][0]
                    degen_bb: List[float] = boxes[bb_idx].tolist()
                    raise ValueError("All bounding boxes should have positive height and width."
                                     " Found invalid box {} for target at index {}."
                                     .format(degen_bb, target_idx))
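
Skipping the check keeps training running, but the degenerate boxes stay in the targets. An alternative, shown here only as a sketch and not as part of the original repository, is to drop such boxes when the annotations are loaded. The helper `split_degenerate_boxes` is a hypothetical name; it applies the same condition as the check above (`xmax <= xmin` or `ymax <= ymin`) to plain Python lists in `[xmin, ymin, xmax, ymax]` order.

```python
def split_degenerate_boxes(boxes):
    """Separate usable boxes from degenerate ones.

    A box is degenerate when its width or height is zero or negative,
    i.e. xmax <= xmin or ymax <= ymin.
    """
    valid, degenerate = [], []
    for box in boxes:
        xmin, ymin, xmax, ymax = box
        if xmax <= xmin or ymax <= ymin:
            degenerate.append(box)
        else:
            valid.append(box)
    return valid, degenerate


if __name__ == "__main__":
    annotations = [
        [10, 20, 40, 60],  # normal box
        [5, 5, 5, 12],     # zero width
        [7, 9, 30, 9],     # zero height
    ]
    valid, degenerate = split_degenerate_boxes(annotations)
    print("kept:", valid)
    print("dropped:", degenerate)
```

If every box of an image turns out to be degenerate, the image ends up with no targets at all, so whether to keep or skip such an image is a separate decision that depends on how the dataset class handles empty annotations.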