(8) fastai object detection

_helen_520

Last modified: 2022-11-03 16:38:58

Category: fastai study notes. Tags: object detection, computer vision, deep learning

First published: 2022-08-25 17:07:09

Article link: https://blog.csdn.net/haronchou/article/details/126518447

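This post starts from fastai's SSD on the VOC2007 dataset, taken from the fastai 2018 course part 2, lesson 8 and lesson 9. My study notes for those two lessons are: (7) fastai 2018 lesson8 object detection ~ lesson9 object detection (_helen_520's blog, CSDN).
The corresponding code runs on fastai 0.7, i.e. fastai1/old: https://nbviewer.org/github/fastai/fastai1/tree/master/courses/
- Many functions there are hand-written, so you can study in detail how they work.
- pascal.ipynb and pascal_multi.ipynb use a hand-written dataloader, so the code is more detailed.
jav ported it to fastai 1.0, using the fastai 1.0 dataloader library and the ready-made data block API. The training results are better, which makes it good for study. I need to add the mAP metric myself.
Below I study how the fastai 1.0 SSD.ipynb code is implemented and how it maps to the logic in the paper.
- More importantly: improve the mAP metric step by step.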

0. References

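fastai course: fastai2018 part2 lesson8~9 cover object detection. However, the dl2 Jupyter notebooks are all based on fastai 0.7. They can also be run in a fastai1 environment, as long as the fastai 0.7 source library is included. They can be viewed at https://nbviewer.org/github/fastai/fastai1/tree/master/courses/
jav implemented these in the fastai1 library environment and optimized them. There are three versions (v1, v2, v3), and new data augmentation methods were added. https://forums.fast.ai/t/ssd-object-detection-in-v1/52512/14
IceVision is a CV library.
There is also a fastai object detection library: Metrics | fastai_object_detection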

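From jav's forum post:
Professor Howard presented an example of SSD (Single Shot Object Detection) in the 2018 advanced course (part 2). It is a very interesting but complex notebook; unfortunately, it was written in Fastai V0.7.
I have ported the SSD notebook to Fastai V1, and it works very well. You can find it in my GitHub at https://colab.research.google.com/github/jav0927/course-v3/blob/master/SSD_Object_Detection.ipynb
This is the most challenging model in the course and a great learning tool. I hope it is valuable to the community. In developing the model, I was inspired by Henin's earlier work.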

1. Debugging notes

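To make comparisons easier, first write a metric function that computes the mAP metric for object detection.

[YOLO study] Recall, Precision, Average Precision (AP), Intersection-over-Union (IoU), by Joe_quan (DevPress community)

Precision and recall; adjusting the threshold; the precision-recall curve
- If you want to evaluate a classifier's performance, a good approach is to watch how precision and recall change as the threshold varies. A good classifier should behave as follows: airplanes make up a large share of the images it flags, and it correctly identifies as many airplanes as possible before it starts flagging geese. In other words, recall grows while precision stays high. A poor classifier may have to give up a lot of precision to gain recall. Articles typically use a precision-recall curve to show a classifier's trade-off between precision and recall.
- The meaning and computation of mean Average Precision (mAP), the performance metric in object detection (asasasaababab's blog, CSDN)
  ① Different rankings give different precision. ②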
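
To make point ① concrete, here is a minimal NumPy sketch of all-point interpolated average precision for a single class. It assumes each detection has already been marked as a true or false positive (for example by an IoU test against the ground truth); the function name and the toy inputs are illustrative and do not come from any of the linked articles.

```python
import numpy as np

def average_precision(scores, is_true_positive, num_gt):
    """All-point interpolated AP for one class (illustrative sketch).

    scores: confidence of each detection
    is_true_positive: 1 if the detection matched a ground-truth box, else 0
    num_gt: number of ground-truth boxes for this class
    """
    order = np.argsort(-np.asarray(scores, dtype=float))  # rank by descending confidence
    tp = np.asarray(is_true_positive, dtype=float)[order]
    cum_tp = np.cumsum(tp)
    cum_fp = np.cumsum(1.0 - tp)
    recall = cum_tp / num_gt
    precision = cum_tp / (cum_tp + cum_fp)

    # Pad both curves, then make precision non-increasing when read left to right.
    rec = np.concatenate(([0.0], recall, [1.0]))
    prec = np.concatenate(([0.0], precision, [0.0]))
    for i in range(len(prec) - 2, -1, -1):
        prec[i] = max(prec[i], prec[i + 1])

    # Sum the rectangle areas at the points where recall changes.
    idx = np.where(rec[1:] != rec[:-1])[0]
    return float(np.sum((rec[idx + 1] - rec[idx]) * prec[idx + 1]))

# Two true positives and two false positives, ranked in two different ways.
ap_tp_first = average_precision([0.9, 0.8, 0.7, 0.6], [1, 1, 0, 0], num_gt=2)
ap_fp_first = average_precision([0.9, 0.8, 0.7, 0.6], [0, 0, 1, 1], num_gt=2)
print(ap_tp_first, ap_fp_first)
```

The set of detections is the same in both calls; only the order in which the confidence scores rank them differs, and the AP drops when the false positives are ranked first.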

1.1 mAP metric implementation

Reference: https://fastai1.fast.ai/metrics.html describes how to implement a custom metric in fastai1

mAP implementation in fastai

Reading the mean_average_precision code: mAP for object detection

https://img-blog.csdnimg.cn/5af8d43fa7204ab09fe4b02ca683ef7a.png

https://www.jianshu.com/p/82be426f776e

What does mAP mean in object detection? (Zhihu)

import numpy as np
from mean_average_precision import MetricBuilder

# [xmin, ymin, xmax, ymax, class_id, difficult, crowd]
gt = np.array([
    [439, 157, 556, 241, 0, 0, 0],
    [437, 246, 518, 351, 0, 0, 0],
    [515, 306, 595, 375, 0, 0, 0],
    [407, 386, 531, 476, 0, 0, 0],
    [544, 419, 621, 476, 0, 0, 0],
    [609, 297, 636, 392, 0, 0, 0]
])

# [xmin, ymin, xmax, ymax, class_id, confidence]
preds = np.array([
    [429, 219, 528, 247, 0, 0.460851],
    [433, 260, 506, 336, 0, 0.269833],
    [518, 314, 603, 369, 0, 0.462608],
    [592, 310, 634, 388, 0, 0.298196],
    [403, 384, 517, 461, 0, 0.382881],
    [405, 429, 519, 470, 0, 0.369369],
    [433, 272, 499, 341, 0, 0.272826],
    [413, 390, 515, 459, 0, 0.619459]
])

# print list of available metrics
print(MetricBuilder.get_metrics_list())

# create metric_fn
metric_fn = MetricBuilder.build_evaluation_metric("map_2d", async_mode=True, num_classes=1)

# add some samples to evaluation
for i in range(10):
    metric_fn.add(preds, gt)

# compute PASCAL VOC metric
print(f"VOC PASCAL mAP: {metric_fn.value(iou_thresholds=0.5, recall_thresholds=np.arange(0., 1.1, 0.1))['mAP']}")

# compute PASCAL VOC metric at the all points
print(f"VOC PASCAL mAP in all points: {metric_fn.value(iou_thresholds=0.5)['mAP']}")

# compute metric COCO metric
print(f"COCO mAP: {metric_fn.value(iou_thresholds=np.arange(0.5, 1.0, 0.05), recall_thresholds=np.arange(0., 1.01, 0.01), mpolicy='soft')['mAP']}")
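
The `iou_thresholds` argument above decides when a prediction counts as a match. As a rough illustration of the quantity being thresholded, the sketch below computes IoU for two boxes in the same `[xmin, ymin, xmax, ymax]` layout. It is a plain-Python sketch with a hypothetical function name, and it ignores any pixel-offset convention the library itself may apply.

```python
def iou(box_a, box_b):
    """Intersection over union of two [xmin, ymin, xmax, ymax] boxes (illustrative sketch)."""
    inter_w = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    inter_h = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = inter_w * inter_h
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# First ground-truth box and first prediction from the arrays above (coordinates only).
first_gt = [439, 157, 556, 241]
first_pred = [429, 219, 528, 247]
overlap = iou(first_gt, first_pred)
print(overlap)
```

Under this definition the first prediction overlaps the first ground-truth box only slightly, so at a threshold of 0.5 it would not match that particular box.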

1.2 fastai 2018 lesson9 SSD

How does mAP run in fastai 2.x? How does the pipeline get from the network output to NMS and then to the final detection boxes?
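
As a point of reference for the NMS step asked about above, here is a minimal NumPy sketch of greedy non-maximum suppression. It is not the notebook's implementation: the function name, the `[x1, y1, x2, y2]` box layout, and the default threshold are assumptions made for illustration.

```python
import numpy as np

def nms(boxes, scores, iou_threshold=0.5):
    """Greedy NMS (illustrative sketch). Returns indices of the boxes that are kept."""
    boxes = np.asarray(boxes, dtype=float).reshape(-1, 4)
    scores = np.asarray(scores, dtype=float)
    x1, y1, x2, y2 = boxes.T
    areas = (x2 - x1) * (y2 - y1)
    order = np.argsort(-scores)  # highest confidence first
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        rest = order[1:]
        # Overlap between the current best box and every remaining box.
        iw = np.clip(np.minimum(x2[i], x2[rest]) - np.maximum(x1[i], x1[rest]), 0, None)
        ih = np.clip(np.minimum(y2[i], y2[rest]) - np.maximum(y1[i], y1[rest]), 0, None)
        inter = iw * ih
        overlap = inter / (areas[i] + areas[rest] - inter)
        # Drop the boxes that overlap the current best box too much.
        order = rest[overlap <= iou_threshold]
    return keep

kept = nms([[0, 0, 10, 10], [1, 1, 11, 11], [20, 20, 30, 30]], [0.9, 0.8, 0.7])
print(kept)
```

In the toy call, the second box overlaps the first heavily and is suppressed, while the third box is far away and survives.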

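fit_one_cycle->learn.fit->

There are two main callbacks: TrainEvalCallback and Recorder

TrainEvalCallback usually does not do much work. It records the number of epochs and the current training progress (which batch of which epoch has been reached), and handles progress control.

The logic of fit: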
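
In rough outline, and only as a simplified sketch rather than fastai's actual source, a fit loop that notifies callbacks at each stage might look like the following. All class and function names here are hypothetical; `ProgressTracker` only mimics the bookkeeping role described for TrainEvalCallback above, and `LossRecorder` stands in for a recorder-style callback.

```python
class Callback:
    """Base class with no-op hooks (hypothetical, for illustration only)."""
    def on_train_begin(self, **kw): pass
    def on_epoch_begin(self, **kw): pass
    def on_batch_end(self, **kw): pass
    def on_epoch_end(self, **kw): pass

class ProgressTracker(Callback):
    """Counts finished epochs and batches."""
    def on_train_begin(self, **kw): self.epoch, self.num_batch = 0, 0
    def on_batch_end(self, **kw): self.num_batch += 1
    def on_epoch_end(self, **kw): self.epoch += 1

class LossRecorder(Callback):
    """Stores the loss of every batch."""
    def on_train_begin(self, **kw): self.losses = []
    def on_batch_end(self, loss, **kw): self.losses.append(loss)

def fit(epochs, batches, train_step, callbacks):
    """Run train_step on every batch and notify all callbacks at each stage."""
    def call(name, **kw):
        for cb in callbacks:
            getattr(cb, name)(**kw)
    call('on_train_begin')
    for epoch in range(epochs):
        call('on_epoch_begin', epoch=epoch)
        for xb, yb in batches:
            loss = train_step(xb, yb)
            call('on_batch_end', loss=loss)
        call('on_epoch_end', epoch=epoch)

tracker, recorder = ProgressTracker(), LossRecorder()
# Toy "training step": the loss is just the absolute difference of the two numbers.
fit(2, [(1, 2), (3, 5), (4, 4)], lambda xb, yb: abs(xb - yb), [tracker, recorder])
print(tracker.epoch, tracker.num_batch, recorder.losses)
```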

2. SSD code reading notes

Based on fastai v1, VOC, and the SSD network

2.1 Dataloader

Rectangle's bbox format: [xcol, yrow, width, height]
The format in the original VOC labels is also [xcol, yrow, width, height]
fastai's format is: [y1, x1, y2, x2]
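
The two layouts above can be converted into each other with a few additions and subtractions. The helpers below are a minimal sketch with hypothetical names; they assume plain continuous coordinates and do not apply the one-pixel offset that some older code subtracts from the bottom-right corner.

```python
def xywh_to_yxyx(box):
    """[xcol, yrow, width, height] -> [y1, x1, y2, x2]."""
    x, y, w, h = box
    return [y, x, y + h, x + w]

def yxyx_to_xywh(box):
    """[y1, x1, y2, x2] -> [xcol, yrow, width, height], e.g. for drawing a Rectangle."""
    y1, x1, y2, x2 = box
    return [x1, y1, x2 - x1, y2 - y1]

example = [155, 96, 196, 174]      # made-up box in [xcol, yrow, width, height]
corners = xywh_to_yxyx(example)    # same box in [y1, x1, y2, x2]
print(corners, yxyx_to_xywh(corners))
```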

What happens to the data
- See <3. Function relationship diagrams>

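from fastai.vision import *  # fastai v1; brings in Path, torch, get_annotations, ObjectItemList, etc.
path = Path('/home/helen/dataset/pascal_2007')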
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu') 


trn_im_names, trn_truths = get_annotations(path/'train.json')
val_im_names, val_truths = get_annotations(path/'valid.json')
tst_im_names, tst_truths = get_annotations(path/'test.json') 


tot_im_names, tot_truths = [trn_im_names + val_im_names, trn_truths + val_truths]
# Create a dictionary containing the composite of the above
img_y_dict = dict(zip(tot_im_names, tot_truths))
# Define a function, based on the dictionary created above, to use in a Fastai Data Block to structure the input data
truth_data_func = lambda o: img_y_dict[o.name]


sz=224       # Image size
bs=128        # Batch size


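np.random.seed(35)
# cutout randomly punches a few holes in the image. It is an extra transform, so it is
# passed via xtra_tfms (the first positional argument of get_transforms is do_flip).
tfms = get_transforms(max_rotate=4., max_zoom=1.1, p_affine=0.5, p_lighting=0.5,
                      xtra_tfms=[cutout(n_holes=(1,4), length=(10, 160), p=.5)])
data = (ObjectItemList
        .from_folder(path/'train')  # get_files -> ImageList
        .split_by_rand_pct(0.02)  # -> ItemLists
        .label_from_func(truth_data_func)  # -> LabelLists
        # Note: tfms=[] means the augmentations in `tfms` above are not applied here
        .transform(tfms=[],size=sz,tfm_y=True,resize_method=ResizeMethod.SQUISH)
        .databunch(bs=bs,collate_fn=bb_pad_collate,num_workers=8,pin_memory=False)  # -> ImageDataBunch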
        .normalize(imagenet_stats)
       )

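2.2 Loss function design

The classification task helps the regression task.

Since the loss has two parts, the two loss values can be computed separately.
- Following the detn_l1 and detn_acc metrics in pascal.ipynb, the L1 loss can be printed on its own. This shows what the l1-loss, bbox-loss and clas-loss each look like per epoch, and how much each contributes.
- How do you add a custom metric in fastai1?
2.2.1 Custom metric design in fastai1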
https://fastai1.fast.ai/metrics.html

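Training metrics

Metrics for training fastai models are simply functions that take input and target tensors and return some metric of interest for training. You can write your own metrics by defining a function of that type and passing it to Learner in the metrics parameter, or use one of the following pre-defined functions.

The parameter list of a metric is: (input, target).
After reading the code, I found that you can set learn.metrics directly, i.e. just append to metrics. Both losses and metrics can be written this way.
- So the next step is to check where the metrics are computed, how they are called, and what the interface looks like.

Compared with the above: if you pass a regular function, it is wrapped in the AverageMetric callback, and it is only called during the validation phase.
- AverageMetric.func holds the function that was passed in.
The call happens in CallbackHandler's __call__ function. The metrics are called first, then the other callbacks.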
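
The wrapping described above can be sketched in a few lines. This is an illustrative stand-in, not fastai's class: it assumes the wrapper accumulates a batch-size-weighted sum of the metric over the validation batches and divides by the sample count at the end of the epoch. NumPy arrays stand in for tensors, and all names are hypothetical.

```python
import numpy as np

def l1_metric(pred, target):
    """A metric is just a function of (input, target) that returns a number."""
    return float(np.abs(pred - target).mean())

class AverageMetricSketch:
    """Averages a per-batch metric over an epoch, weighted by batch size (illustrative)."""
    def __init__(self, func):
        self.func = func

    def on_epoch_begin(self):
        self.val, self.count = 0.0, 0

    def on_batch_end(self, last_output, last_target):
        n = len(last_target)
        self.count += n
        self.val += n * self.func(last_output, last_target)

    def on_epoch_end(self):
        return self.val / self.count

metric = AverageMetricSketch(l1_metric)
metric.on_epoch_begin()
metric.on_batch_end(np.array([1.0, 2.0]), np.array([0.0, 0.0]))  # batch of 2, L1 = 1.5
metric.on_batch_end(np.array([4.0]), np.array([1.0]))            # batch of 1, L1 = 3.0
epoch_value = metric.on_epoch_end()
print(epoch_value)
```

Weighting by batch size keeps a smaller final batch from counting as much as a full one, which is why the result differs from a plain mean of the two per-batch values.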

3. Function relationship diagrams

Class diagram analysis of the fastai folder in fastai1:

python: using Graphviz and pyreverse to analyze class files and automatically generate UML diagrams (一从际发's blog, CSDN)
Automatically drawing UML class diagrams and function call graphs (Call Graph) in Python (虾米小馄饨's blog, CSDN)

# On Linux
sudo apt install graphviz
conda install python-graphviz
pip install pylint
# https://blog.csdn.net/qq_36879201/article/details/126244459 : pycallgraph failed to install, repeatedly reporting subprocess-exited-with-error
pip show setuptools
pip install --upgrade setuptools==57.5.0
pip install pycallgraph
# Usage:
pyreverse -o png -p fastai ~/fastai1/fastai

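This generates two images: packages_fastai.png and classes_fastai.png. The figure below is packages_fastai.png, the package dependency graph.

https://download.csdn.net/download/haronchou/86868480 fastai class relationship and inheritance diagram.

Below is a rough overview diagram of fastai:

Data loading in the SSD.py source:

Inheritance diagram of the data loading classes

ImageDataBunch inherits from DataBunch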

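x,y=next(iter(data.train_dl)) draws its data from the LabelList, LabelList(x, y), where x is an ObjectItemList and y is an ObjectCategoryList.

dl.dataset's __getitem__ calls x's __getitem__, which in turn calls ImageList's get function and then the __getitem__ of the grandparent class ItemList.
- The grandparent class ItemList fetches the image path; the parent class ImageList opens the image; the child class ObjectItemList handles the transform.
dl.dataset's __getitem__ then calls x's apply_tfms.
When y is transformed, the bbox is created, and the bbox is normalized to [-1, 1].
- The bbox format: when it enters fastai it is [y1, x1, y2, x2]. After normalization each y becomes y/h*2 - 1 and each x becomes x/w*2 - 1, so the values lie in [-1, 1].
- Since there is no padding, the image is scaled directly.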
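
The normalization described above can be checked with a small sketch. The helper below is hypothetical and only illustrates the arithmetic, assuming pixel coordinates are mapped linearly so that 0 goes to -1 and the image height or width goes to 1.

```python
def scale_bbox_to_unit(box, h, w):
    """Map a pixel-space [y1, x1, y2, x2] box into the [-1, 1] range (illustrative)."""
    y1, x1, y2, x2 = box
    return [y1 / h * 2 - 1, x1 / w * 2 - 1, y2 / h * 2 - 1, x2 / w * 2 - 1]

full_image = scale_bbox_to_unit([0, 0, 224, 224], h=224, w=224)
inner_box = scale_bbox_to_unit([56, 112, 168, 224], h=224, w=224)
print(full_image, inner_box)
```

A box that covers the whole 224x224 image maps to the corners of the [-1, 1] square, and the image centre maps to 0.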

Appendix: meaning of the arrows in UML class diagrams