Custom Detection Training
Article Outline
- Overview of Detectron2
- Overview of our custom dataset
- Install Detectron2 dependencies
- Download custom Detectron2 object detection data
- Visualize Detectron2 training data
- Write our Detectron2 training configuration
- Run Detectron2 training
- Evaluate Detectron2 performance
- Run Detectron2 inference on test images
Custom Detectron2 Training Resources
Overview of Detectron2
Detectron2 is a popular, modular computer vision model library based on PyTorch. It is the second iteration of Detectron, which was originally written in Caffe2. Detectron2 lets you plug state-of-the-art computer vision models into your own workflow. Quoting the Detectron2 release blog:
Detectron2 includes all the models that were available in the original Detectron, such as Faster R-CNN, Mask R-CNN, RetinaNet, and DensePose. It also features several new models, including Cascade R-CNN, Panoptic FPN, and TensorMask, and we will continue to add more algorithms. We’ve also added features such as synchronous Batch Norm and support for new datasets like LVIS
In this post, we review how to train Detectron2 on custom data, specifically for object detection. By the time you finish reading, you will also be familiar with the Detectron2 ecosystem and able to generalize to the other capabilities it includes.
Overview of Our Custom Data
We will train our custom Detectron2 detector on public blood cell detection data hosted for free at Roboflow. The blood cell detection dataset is representative of the small custom object detection dataset that one might collect to build a custom object detection system. Notably, blood cell detection is not a capability that Detectron2 offers out of the box, so we need to train the underlying networks to fit our custom task.
If you want to follow the tutorial step by step, you can fork this public blood cell dataset. Otherwise, you can upload your own dataset in any format (more below).
Install Detectron2 dependencies
To get started, make a copy of this Colab Notebook Implementing Detectron2 on Custom Data. Google Colab provides free GPU resources, so make sure to enable them under Runtime → Change runtime type → GPU.
To start training our custom detector, we install torch==1.5 and torchvision==0.6. After importing torch, we can check its version and make doubly sure that a GPU is available; the check prints 1.5.0+cu101 True.
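The version and GPU check described above can be sketched as follows. This is a minimal sketch, assuming torch has already been installed in the runtime; the helper name `describe_torch` is hypothetical, and the import is guarded so the snippet also runs where torch is missing.

```python
import importlib.util


def describe_torch():
    """Return (version, cuda_available), or None when torch is not installed."""
    # Hypothetical helper: guard the import so the check degrades gracefully.
    if importlib.util.find_spec("torch") is None:
        return None
    import torch
    return torch.__version__, torch.cuda.is_available()


status = describe_torch()
if status is None:
    print("torch is not installed in this runtime")
else:
    # In the Colab setup described above, this is where 1.5.0+cu101 True appears.
    print(*status)
```

If the second value is False, the GPU runtime has most likely not been enabled yet.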
Then we pip install the Detectron2 library and import a number of its submodules.
!pip install detectron2==0.1.3 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cu101/torch1.5/index.html

import detectron2
from detectron2.utils.logger import setup_logger
setup_logger()

# import some common libraries
import numpy as np
import cv2
import random
from google.colab.patches import cv2_imshow

# import some common detectron2 utilities
from detectron2 import model_zoo
from detectron2.engine import DefaultPredictor
from detectron2.config import get_cfg
from detectron2.utils.visualizer import Visualizer
from detectron2.data import MetadataCatalog
from detectron2.data.catalog import DatasetCatalog
Detectron2 dependencies installed.
Download custom Detectron2 object detection data
We download our custom data in COCO JSON format from Roboflow with a single line of code. This is the only line of code you need to change to train on your own custom objects!
NOTE: In this tutorial we export object detection data with bounding boxes. Roboflow does not currently support semantic segmentation annotation formats. Sign up to be notified when we do.
If you have unlabeled images, you will first need to label them. For free, open source labeling tools, we recommend the guides on getting started with LabelImg or getting started with CVAT. Try labeling ~50 images to proceed with this tutorial. To improve your model's performance later, you will want to label more.
You may also consider building a free object detection dataset from Open Images.
Once you have labeled data, create a free Roboflow account to move your data into Roboflow. You can then drag your dataset in, in any format (VOC XML, COCO JSON, TensorFlow Object Detection CSV, etc.).
Once the data is uploaded, you can choose preprocessing and augmentation steps.
Then click Generate and Download, and you will be able to choose the COCO JSON format.
When prompted, be sure to select “Show Code Snippet.” This outputs a download curl script, so you can easily port your data into Colab in the proper format.
Detectron2 keeps track of the available datasets in a registry, so we must register our custom data with Detectron2 before it can be invoked for training.
from detectron2.data.datasets import register_coco_instances
register_coco_instances("my_dataset_train", {}, "/content/train/_annotations.coco.json", "/content/train")
register_coco_instances("my_dataset_val", {}, "/content/valid/_annotations.coco.json", "/content/valid")
register_coco_instances("my_dataset_test", {}, "/content/test/_annotations.coco.json", "/content/test")
Detectron2 data registered.
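Before training, it can help to sanity-check the annotation file that was just registered. The sketch below is an illustration rather than part of the original notebook: it uses only the standard library, assumes the standard COCO JSON layout (`images`, `annotations`, `categories`), and the helper name `count_annotations_per_category` is hypothetical. The small in-memory document is made up and stands in for `/content/train/_annotations.coco.json`.

```python
import json


def count_annotations_per_category(coco):
    """Map each category name to the number of boxes annotated with it."""
    id_to_name = {c["id"]: c["name"] for c in coco["categories"]}
    counts = {name: 0 for name in id_to_name.values()}
    for ann in coco["annotations"]:
        counts[id_to_name[ann["category_id"]]] += 1
    return counts


# Made-up stand-in for a real export; with real data you would instead do:
#   with open("/content/train/_annotations.coco.json") as f:
#       coco = json.load(f)
coco = json.loads("""
{
  "images": [{"id": 0, "file_name": "a.jpg", "width": 640, "height": 480}],
  "categories": [
    {"id": 0, "name": "cells"},
    {"id": 1, "name": "Platelets"},
    {"id": 2, "name": "RBC"},
    {"id": 3, "name": "WBC"}
  ],
  "annotations": [
    {"id": 0, "image_id": 0, "category_id": 2, "bbox": [10, 20, 50, 40]},
    {"id": 1, "image_id": 0, "category_id": 2, "bbox": [80, 30, 45, 45]},
    {"id": 2, "image_id": 0, "category_id": 3, "bbox": [200, 100, 90, 95]}
  ]
}
""")

counts = count_annotations_per_category(coco)
print(counts)
```

A category with zero boxes still appears in the category list. The made-up `cells` entry here mirrors the `cells` row that reports `nan` AP in the evaluation table later in this post, which would be consistent with NUM_CLASSES being set to 4 for three cell types.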
Visualize Detectron2 training data
Detectron2 makes it easy to view our training data and confirm that it was imported correctly. We do so with the following:
# visualize training data
my_dataset_train_metadata = MetadataCatalog.get("my_dataset_train")
dataset_dicts = DatasetCatalog.get("my_dataset_train")

import random
from detectron2.utils.visualizer import Visualizer

# draw the annotations of three randomly chosen training images
for d in random.sample(dataset_dicts, 3):
    img = cv2.imread(d["file_name"])
    # cv2 loads BGR; [:, :, ::-1] reverses the channel order for the Visualizer
    visualizer = Visualizer(img[:, :, ::-1], metadata=my_dataset_train_metadata, scale=0.5)
    vis = visualizer.draw_dataset_dict(d)
    cv2_imshow(vis.get_image()[:, :, ::-1])
Looks like our dataset registered correctly.
Write our Detectron2 training configuration
Next we write our custom training configuration.
cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file("COCO-Detection/faster_rcnn_X_101_32x8d_FPN_3x.yaml"))
cfg.DATASETS.TRAIN = ("my_dataset_train",)
cfg.DATASETS.TEST = ("my_dataset_val",)

cfg.DATALOADER.NUM_WORKERS = 4
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-Detection/faster_rcnn_X_101_32x8d_FPN_3x.yaml")  # Let training initialize from model zoo
cfg.SOLVER.IMS_PER_BATCH = 4
cfg.SOLVER.BASE_LR = 0.001

cfg.SOLVER.WARMUP_ITERS = 1000
cfg.SOLVER.MAX_ITER = 1500  # adjust up if val mAP is still rising, adjust down if overfit
cfg.SOLVER.STEPS = (1000, 1500)
cfg.SOLVER.GAMMA = 0.05

cfg.MODEL.ROI_HEADS.BATCH_SIZE_PER_IMAGE = 64
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 4

cfg.TEST.EVAL_PERIOD = 500
The biggest choice we have made here is the type of object detection model: the large Faster R-CNN. Detectron2 gives you many options for your model architecture, which you can browse in the Detectron2 model zoo.
For object detection alone, the model zoo offers a number of models.
The other big config choice we have made is the MAX_ITER parameter. It specifies how many iterations the model trains for; you may need to adjust it up or down based on the validation metrics you are seeing.
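To see how the solver settings interact, the sketch below approximates the learning rate at a given iteration in plain Python. It assumes Detectron2's default linear warmup with a warmup factor of 0.001 followed by multi-step decay, neither of which is set explicitly in the configuration above; the function name `lr_at` is hypothetical, and the real scheduler may differ in details.

```python
from bisect import bisect_right

BASE_LR = 0.001
WARMUP_ITERS = 1000
STEPS = (1000, 1500)
GAMMA = 0.05
WARMUP_FACTOR = 0.001  # assumed default, not set in the config above


def lr_at(iteration):
    """Approximate learning rate: linear warmup, then step decay by GAMMA."""
    if iteration < WARMUP_ITERS:
        alpha = iteration / WARMUP_ITERS
        warmup = WARMUP_FACTOR * (1 - alpha) + alpha
    else:
        warmup = 1.0
    # Each milestone already reached multiplies the rate by GAMMA.
    decay = GAMMA ** bisect_right(STEPS, iteration)
    return BASE_LR * warmup * decay


for it in (0, 500, 999, 1000, 1499):
    print(it, lr_at(it))
```

Under these assumptions the rate climbs for the first 1000 iterations and is then cut by a factor of 20 for the remaining 500, so changing MAX_ITER usually means revisiting WARMUP_ITERS and STEPS as well.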
Run Detectron2 training
Before starting training, we need to make sure that the model is evaluated against our validation set during training. Unfortunately, this does not happen by default 🤔.
We can easily do this by defining a custom trainer based on the DefaultTrainer with the COCOEvaluator:
import os

from detectron2.engine import DefaultTrainer
from detectron2.evaluation import COCOEvaluator

class CocoTrainer(DefaultTrainer):

    @classmethod
    def build_evaluator(cls, cfg, dataset_name, output_folder=None):
        # write evaluation output to ./coco_eval unless a folder is given
        if output_folder is None:
            os.makedirs("coco_eval", exist_ok=True)
            output_folder = "coco_eval"
        return COCOEvaluator(dataset_name, cfg, False, output_folder)
Now that we have our CocoTrainer, we can kick off training.
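The training call itself is not shown above. A minimal sketch of what kicking off training could look like follows, assuming the `cfg` and `CocoTrainer` defined earlier. `resume_or_load` and `train` are methods of Detectron2's `DefaultTrainer`, while the wrapper `run_training` is a hypothetical helper added here so the steps can be read in one place.

```python
import os


def run_training(cfg, trainer_cls):
    """Create the output directory, build the trainer, and run training."""
    # Checkpoints and logs are written under cfg.OUTPUT_DIR.
    os.makedirs(cfg.OUTPUT_DIR, exist_ok=True)
    trainer = trainer_cls(cfg)
    # resume=False starts from cfg.MODEL.WEIGHTS instead of the last checkpoint.
    trainer.resume_or_load(resume=False)
    trainer.train()
    return trainer


# In the notebook this would be called as:
#   trainer = run_training(cfg, CocoTrainer)
```

Keeping the returned object in a variable named `trainer` matters, because the evaluation code further down reads `trainer.model`.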
Training will run for a while and print evaluation metrics on our validation set. Curious about what mAP means for evaluation? Check out this article on breaking down mAP.
Once training is finished, we can move on to evaluation and inference!
Evaluate Detectron2 performance
First, we can display the results in TensorBoard to see how the training procedure performed.
There are a lot of metrics of interest in there, most notably total_loss and validation mAP.
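If TensorBoard is not at hand, the same scalars can usually be read back from disk. The sketch below assumes that training wrote a `metrics.json` file into `cfg.OUTPUT_DIR` with one JSON object per line and keys such as `iteration` and `total_loss`; treat the file name and keys as assumptions to verify against your own output folder. `load_scalar` is a hypothetical helper.

```python
import json


def load_scalar(path, key):
    """Return (iteration, value) pairs for `key` from a JSON-lines metrics file."""
    points = []
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            record = json.loads(line)
            # Not every record carries every metric, so skip the ones without it.
            if key in record and "iteration" in record:
                points.append((record["iteration"], record[key]))
    return points


# Example, assuming the default output folder:
#   losses = load_scalar("output/metrics.json", "total_loss")
#   print(losses[-1])
```

The resulting pairs can be plotted or simply inspected to check whether total_loss is still falling when training stops.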
We then run the same evaluation procedure used for our validation mAP on the test set.
from detectron2.data import DatasetCatalog, MetadataCatalog, build_detection_test_loader
from detectron2.evaluation import COCOEvaluator, inference_on_dataset

cfg.MODEL.WEIGHTS = os.path.join(cfg.OUTPUT_DIR, "model_final.pth")
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.85
predictor = DefaultPredictor(cfg)
evaluator = COCOEvaluator("my_dataset_test", cfg, False, output_dir="./output/")
val_loader = build_detection_test_loader(cfg, "my_dataset_test")
# evaluate the trained model held in memory by the trainer
inference_on_dataset(trainer.model, val_loader, evaluator)
Yielding:
Accumulating evaluation results...
DONE (t=0.03s).
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.592
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.881
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.677
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.178
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.613
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.411
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.392
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.633
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.684
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.257
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.709
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.439
[06/23 18:39:47 d2.evaluation.coco_evaluation]: Evaluation results for bbox:
| AP | AP50 | AP75 | APs | APm | APl |
|:------:|:------:|:------:|:------:|:------:|:------:|
| 59.169 | 88.066 | 67.740 | 17.805 | 61.333 | 41.070 |
[06/23 18:39:47 d2.evaluation.coco_evaluation]: Per-category bbox AP:
| category | AP | category | AP | category | AP |
|:-----------|:-------|:-----------|:-------|:-----------|:-------|
| cells | nan | Platelets | 40.141 | RBC | 60.326 |
| WBC | 77.039 | | | | |
This evaluation gives you a good idea of how your new custom Detectron2 detector will perform in the wild. Again, if you are curious to learn more about these metrics, see this post breaking down mAP.
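The `IoU=0.50:0.95` labels in the table refer to intersection over union, the overlap measure used to decide whether a predicted box matches a ground-truth box. A minimal sketch of that computation follows, assuming boxes in `(x1, y1, x2, y2)` corner format; the function name `box_iou` is hypothetical and this is not the evaluator's own code.

```python
def box_iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# A prediction shifted by half its width against a 2x2 ground-truth box.
iou = box_iou((0, 0, 2, 2), (1, 0, 3, 2))
print(round(iou, 3))

# Thresholds averaged by the headline AP: 0.50, 0.55, ..., 0.95
thresholds = [0.50 + 0.05 * i for i in range(10)]
print([round(t, 2) for t in thresholds if iou >= t])
```

With an IoU of about one third, this pair would not count as a match at any of the ten thresholds. Stricter thresholds accept fewer matches, which is in line with AP75 being lower than AP50 in the table above.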
Run Detectron2 inference on test images
And finally, we can run our new custom Detectron2 detector on real images! Note that these are images the model has never seen.
cfg.MODEL.WEIGHTS = os.path.join(cfg.OUTPUT_DIR, "model_final.pth")
cfg.DATASETS.TEST = ("my_dataset_test", )
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.7  # set the testing threshold for this model
predictor = DefaultPredictor(cfg)
test_metadata = MetadataCatalog.get("my_dataset_test")

from detectron2.utils.visualizer import ColorMode
import glob

for imageName in glob.glob('/content/test/*jpg'):
    im = cv2.imread(imageName)
    outputs = predictor(im)
    v = Visualizer(im[:, :, ::-1],
                   metadata=test_metadata,
                   scale=0.8
                   )
    out = v.draw_instance_predictions(outputs["instances"].to("cpu"))
    cv2_imshow(out.get_image()[:, :, ::-1])
The model makes good predictions on these images, showing that it has learned to identify red blood cells, white blood cells, and platelets.
You may consider adjusting SCORE_THRESH_TEST to change the confidence a detection must reach before the model reports it.
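The effect of the threshold can be illustrated without a model. The sketch below filters a made-up list of detections at the two thresholds used in this post, 0.7 and 0.85; the scores and the helper name `keep_confident` are invented for illustration.

```python
def keep_confident(detections, threshold):
    """Keep detections whose score exceeds `threshold`."""
    return [d for d in detections if d["score"] > threshold]


# Made-up detections for illustration only.
detections = [
    {"label": "RBC", "score": 0.95},
    {"label": "RBC", "score": 0.80},
    {"label": "WBC", "score": 0.72},
    {"label": "Platelets", "score": 0.40},
]

for threshold in (0.7, 0.85):
    kept = keep_confident(detections, threshold)
    print(threshold, [d["label"] for d in kept])
```

Raising the threshold trades recall for precision: fewer boxes are drawn, but the ones that remain are those the model is most confident about.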
You can now save the weights at os.path.join(cfg.OUTPUT_DIR, "model_final.pth") for future inference by exporting them to Google Drive.
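Exporting could look like the sketch below. It assumes Google Drive has already been mounted in Colab and that the final checkpoint is named `model_final.pth`, as in the code above; `export_weights` and the destination folder are hypothetical.

```python
import os
import shutil


def export_weights(output_dir, dest_dir, filename="model_final.pth"):
    """Copy the trained checkpoint from output_dir into dest_dir."""
    src = os.path.join(output_dir, filename)
    if not os.path.isfile(src):
        raise FileNotFoundError(f"no checkpoint at {src}")
    os.makedirs(dest_dir, exist_ok=True)
    return shutil.copy2(src, os.path.join(dest_dir, filename))


# In Colab, after mounting Drive, this might be called as
# (the destination path is an assumption about where Drive is mounted):
#   export_weights(cfg.OUTPUT_DIR, "/content/drive/My Drive/detectron2_weights")
```

Checking for the file first gives a clear error if training did not finish and the final checkpoint was never written.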
You can also read the underlying prediction tensors from the outputs object to use elsewhere in your app.
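For a detection model, the `instances` entry of `outputs` carries `pred_boxes`, `scores`, and `pred_classes` fields. A sketch of turning them into plain Python records follows; `instances_to_records` is a hypothetical helper, and the class-name list passed in is an assumption about how the dataset's categories are ordered.

```python
def instances_to_records(instances, class_names):
    """Convert Detectron2-style detection instances into a list of dicts."""
    boxes = instances.pred_boxes.tensor.tolist()  # [[x1, y1, x2, y2], ...]
    scores = instances.scores.tolist()
    classes = instances.pred_classes.tolist()
    return [
        {"box": box, "score": score, "label": class_names[cls]}
        for box, score, cls in zip(boxes, scores, classes)
    ]


# In the notebook this might be called as:
#   names = test_metadata.thing_classes
#   records = instances_to_records(outputs["instances"].to("cpu"), names)
```

Plain dictionaries like these are easy to serialize to JSON or hand to the rest of an application without a dependency on Detectron2.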
Conclusion
Congratulations! Now you know how to train your own custom Detectron2 detector on a completely new domain.
Not seeing the results you need to move forward? Object detection models have improved since the release of the Detectron2 model zoo, so consider checking out some of our other tutorials, such as How to Train YOLOv5 and How to Train YOLOv4, or this writeup on improvements in YOLO v5.
Translated from: https://towardsdatascience.com/how-to-train-detectron2-on-custom-object-detection-data-be9d1c233e4