Detectron2 Source Code Analysis - demo - Object Detection 2 - Parsing the Output Data

Input command:

python demo/demo.py --config-file configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml --input 001.jpg --output results --opts MODEL.WEIGHTS detectron2://COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl 
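For readers who prefer to poke at the output in a Python session, the same inference can be reproduced programmatically. The sketch below is a minimal, assumed equivalent of the command above, built from Detectron2's model_zoo helpers and DefaultPredictor; the image name and threshold simply mirror the demo arguments.

```python
import cv2
from detectron2 import model_zoo
from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor

# Build the same config the demo command uses.
cfg = get_cfg()
cfg.merge_from_file(model_zoo.get_config_file(
    "COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml"))
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url(
    "COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml")
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.5  # the demo's default --confidence-threshold

predictor = DefaultPredictor(cfg)
img = cv2.imread("001.jpg")   # BGR, the format DefaultPredictor expects by default
outputs = predictor(img)      # dict with an "instances" entry, shown in the log below
print(outputs["instances"])
```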
 

Input image: (image not reproduced here)

Output result: (visualization not reproduced here)

predictor() returns the following data:

[05/14 15:39:49 detectron2]: Arguments: Namespace(confidence_threshold=0.5, config_file='configs/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x.yaml', input=['001.jpg'], opts=['MODEL.WEIGHTS', 'detectron2://COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl'], output='results', video_input=None, webcam=False)
: cpu_device= cpu
[05/14 15:39:51 fvcore.common.checkpoint]: Loading checkpoint from detectron2://COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl
[05/14 15:39:51 fvcore.common.file_io]: URL https://dl.fbaipublicfiles.com/detectron2/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl cached in /home/lappai/.torch/fvcore_cache/detectron2/COCO-InstanceSegmentation/mask_rcnn_R_50_FPN_3x/137849600/model_final_f10217.pkl
[05/14 15:39:51 fvcore.common.checkpoint]: Reading a file from 'Detectron2 Model Zoo'
: args.input= ['001.jpg']
  __call__:
: run_on_image predictions= {'instances': Instances(num_instances=16, image_height=342, image_width=512, fields=[pred_boxes: Boxes(tensor([[8.4740e+00, 4.6892e+01, 1.4996e+02, 3.3636e+02],
        [1.2094e+02, 2.8676e+01, 2.4164e+02, 3.4125e+02],
        [3.9989e+02, 1.1977e+02, 5.0410e+02, 3.4135e+02],
        [2.3525e+02, 5.9974e+01, 3.8057e+02, 3.4017e+02],
        [3.5989e+02, 1.0638e+02, 4.3428e+02, 3.2155e+02],
        [4.1590e+02, 1.0385e+02, 4.4406e+02, 1.5214e+02],
        [2.7101e+02, 8.3826e+01, 3.0224e+02, 1.5380e+02],
        [2.8008e+02, 1.1305e+02, 3.2311e+02, 1.8048e+02],
        [3.1624e+02, 1.6404e+02, 4.0676e+02, 2.9497e+02],
        [3.0986e+02, 5.6478e+01, 3.8312e+02, 2.0319e+02],
        [1.1140e+00, 8.9818e+01, 6.5928e+01, 1.8706e+02],
        [0.0000e+00, 1.0031e+02, 5.6573e+01, 3.3716e+02],
        [1.3246e-01, 1.2312e+02, 6.7227e+01, 1.6550e+02],
        [1.3788e-02, 8.6321e+01, 2.8173e+01, 1.4170e+02],
        [4.8467e+02, 1.7300e+02, 5.1018e+02, 2.8373e+02],
        [4.0865e+02, 9.6892e+01, 4.2856e+02, 1.4300e+02]], device='cuda:0')), scores: tensor([0.9969, 0.9952, 0.9943, 0.9886, 0.9663, 0.9632, 0.8624, 0.7518, 0.6952,
        0.6793, 0.5957, 0.5795, 0.5773, 0.5474, 0.5355, 0.5209],
       device='cuda:0'), pred_classes: tensor([ 0,  0,  0,  0,  0,  0,  0,  0, 26,  0,  0,  0,  0,  0, 26,  0],
       device='cuda:0'), pred_masks: tensor([[[False, False, False,  ..., False, False, False],
         [False, False, False,  ..., False, False, False],
         ...,
         [False, False, False,  ..., False, False, False]],

        ...,

        [[False, False, False,  ..., False, False, False],
         [False, False, False,  ..., False, False, False],
         ...,
         [False, False, False,  ..., False, False, False]]], device='cuda:0')])}
[05/14 15:39:51 detectron2]: 001.jpg: detected 16 instances in 0.12s
: args.output= results
: out_filename= results
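The args.output / out_filename prints above come from demo.py's output handling. A paraphrased sketch of that per-image loop follows (helper names approximate, not a verbatim copy; demo is the VisualizationDemo instance the script builds):

```python
import os
from detectron2.data.detection_utils import read_image

def save_visualizations(demo, input_paths, output):
    # Paraphrase of demo.py's per-image loop.
    for path in input_paths:
        img = read_image(path, format="BGR")   # demo.py reads images as BGR
        predictions, visualized_output = demo.run_on_image(img)
        if output:
            if os.path.isdir(output):
                # --output is an existing directory: one file per input image
                out_filename = os.path.join(output, os.path.basename(path))
            else:
                # otherwise --output is treated as a single output filename
                assert len(input_paths) == 1
                out_filename = output
            visualized_output.save(out_filename)
```

That matches the log: "results" was not an existing directory here, so it was used directly as the output filename.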
 

The output follows the format documented at:

https://detectron2.readthedocs.io/tutorials/models.html#model-output-format

Model Output Format

When in training mode, the builtin models output a dict[str->ScalarTensor] with all the losses.

When in inference mode, the builtin models output a list[dict], one dict for each image. Based on the tasks the model is doing, each dict may contain the following fields:

  • “instances”: Instances object with the following fields:

    • “pred_boxes”: Boxes object storing N boxes, one for each detected instance.

    • “scores”: Tensor, a vector of N scores.

    • “pred_classes”: Tensor, a vector of N labels in range [0, num_categories).

    • “pred_masks”: a Tensor of shape (N, H, W), masks for each detected instance.

    • “pred_keypoints”: a Tensor of shape (N, num_keypoint, 3). Each row in the last dimension is (x, y, score). Scores are larger than 0.

  • “sem_seg”: Tensor of (num_categories, H, W), the semantic segmentation prediction.

  • “proposals”: Instances object with the following fields:

    • “proposal_boxes”: Boxes object storing N boxes.

    • “objectness_logits”: a torch vector of N scores.

  • “panoptic_seg”: A tuple of (Tensor, list[dict]). The tensor has shape (H, W), where each element represents the segment id of the pixel. Each dict describes one segment id and has the following fields:

    • “id”: the segment id

    • “isthing”: whether the segment is a thing or stuff

    • “category_id”: the category id of this segment. It represents the thing class id when isthing==True, and the stuff class id otherwise.
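To make the field list concrete, here is a minimal sketch of how those instance fields are usually read back, assuming outputs is the dict returned by the predictor (as in the programmatic example near the top):

```python
# `outputs` is the dict returned by the predictor for one image.
instances = outputs["instances"].to("cpu")   # move off the GPU for inspection

print(len(instances))               # N, the number of detected instances (16 here)
print(instances.image_size)         # (image_height, image_width) = (342, 512)
print(instances.pred_boxes.tensor)  # (N, 4) boxes as (x1, y1, x2, y2) in pixels
print(instances.scores)             # (N,) confidence scores
print(instances.pred_classes)       # (N,) class ids in [0, num_categories)
print(instances.pred_masks.shape)   # (N, H, W) boolean masks
```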

 

We are running instance segmentation, so comparing the specification above with the actual log, the rough structure and meaning of the output data is:

Instances

num_instances=16,  // number of detected instances, i.e. the number of objects found

image_height=342,  // size of the image being analyzed
image_width=512,

fields             // sub-fields of the structure

      pred_boxes: Boxes(tensor[16x4]),   // box coordinates of the N instances (16 objects here), one (x1, y1, x2, y2) row per instance
               device  //='cuda:0', the device the model ran on

      scores: tensor[16],                // detection confidence of each of the N instances, in descending order
               device  //='cuda:0'

      pred_classes: tensor[16],          // class id of each instance; mostly 0 here, which is "person" in COCO's contiguous class ordering
               device  //='cuda:0'

      pred_masks: tensor[16x342x512],    // boolean masks of shape (N, H, W): one full-resolution mask per instance, True where a pixel belongs to that instance
               device  //='cuda:0'

      pred_keypoints                     // absent from this log; the command does not enable keypoint detection
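The pred_masks layout follows directly from the spec quoted earlier: (N, H, W) = (16, 342, 512), one boolean mask per instance at full image resolution. Below is a quick sketch to sanity-check this and to turn the class ids into names; the "coco_2017_val" metadata key is an assumption based on this being a standard COCO Model Zoo config.

```python
from detectron2.data import MetadataCatalog

masks = instances.pred_masks          # (16, 342, 512) boolean tensor
print(masks.sum(dim=(1, 2)))          # pixel area covered by each instance mask

# Map contiguous class ids to readable names via the dataset metadata.
thing_classes = MetadataCatalog.get("coco_2017_val").thing_classes
for cls_id, score in zip(instances.pred_classes.tolist(), instances.scores.tolist()):
    print(thing_classes[cls_id], round(score, 3))   # id 0 prints as "person"
```

For the log above, fourteen of the sixteen detections are class 0 ("person"); the lookup resolves the two class-26 entries as well.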

 

 

 

 
