Detectron2 Official Tutorials, Translated: Use Models


Article link: https://blog.csdn.net/lz867422770/article/details/107620049


Use Models

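Models in Detectron2 are built by functions such as build_model, build_backbone, and build_roi_heads:

from detectron2.modeling import build_model
model = build_model(cfg)  # cfg is a detectron2 config; returns a torch.nn.Module

build_model only creates the model structure and fills it with random parameters. The sections below describe how to load an existing checkpoint and how to use the model object.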

Load/Save a Checkpoint

from detectron2.checkpoint import DetectionCheckpointer
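DetectionCheckpointer(model).load(file_path_or_url)  # load a file, usually from cfg.MODEL.WEIGHTS

checkpointer = DetectionCheckpointer(model, save_dir="output")
checkpointer.save("model_999")  # saves to output/model_999.pth

Detectron2's checkpointer recognizes model files in PyTorch's .pth format as well as .pkl files.

The model files can be manipulated directly: use torch.{load, save} for .pth files and pickle.{dump, load} for .pkl files.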
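As a rough illustration of the pickle side of this, the sketch below round-trips a checkpoint-like dict through a .pkl file and then edits it. The dict layout (a "model" key mapping parameter names to values), the parameter names, and the file name are assumptions made for the example, and plain lists stand in for the arrays a real checkpoint would hold.

```python
import os
import pickle
import tempfile

# Hypothetical stand-in for a checkpoint: parameter names mapped to plain lists.
# The "model" key and the parameter names are assumptions for this sketch.
checkpoint = {
    "model": {
        "backbone.conv1.weight": [0.1, 0.2],
        "backbone.conv1.bias": [0.0],
    }
}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "model_demo.pkl")
    with open(path, "wb") as f:
        pickle.dump(checkpoint, f)  # write the dict as a .pkl file
    with open(path, "rb") as f:
        loaded = pickle.load(f)  # read it back as an ordinary Python dict

# The loaded object can be edited like any other dict, e.g. dropping one entry.
loaded["model"].pop("backbone.conv1.bias")
param_names = sorted(loaded["model"])
print(param_names)
```

The same pattern should apply to .pth files with torch.load and torch.save in place of the pickle calls.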

Use a Model

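A model is called as outputs = model(inputs), where inputs is a list[dict]. Each dict corresponds to one image, and the required keys depend on the type of model and on whether the model is in training or evaluation mode. For example, for inference all existing models expect the “image” key, and optionally “height” and “width”. The details of the input and output formats are given below.

Training: in training mode, all models are required to be used under an EventStorage. The training statistics are put into this storage: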

from detectron2.utils.events import EventStorage
with EventStorage() as storage:
    losses = model(inputs)

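Inference: if you only want to run simple inference with an existing model, DefaultPredictor is a wrapper around the model that provides exactly this. Its default behavior includes model loading and preprocessing, and it operates on a single image rather than a batch.

A model can also be used directly for inference like this:

import torch

model.eval()  # switch to evaluation mode
with torch.no_grad():  # no gradients are needed for inference
    outputs = model(inputs)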

Model Input Format

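Users can implement custom models that support any arbitrary input format. Here we describe the standard input format that all built-in models in detectron2 support. They all take a list[dict] as input, and each dict corresponds to one image.

The dict may contain the following keys:

image: Tensor in (C, H, W) format. The meaning of the channels is defined by cfg.INPUT.FORMAT. Image normalization, if any, is performed inside the model using cfg.MODEL.PIXEL_{MEAN,STD}
height, width: the desired output height and width, which are not necessarily the same as the height and width of the image field
instances: an Instances object used for training, with the following fields:
    gt_boxes: a Boxes object storing N boxes, one for each instance
    gt_classes: Tensor of long type, a vector of N labels in range [0, num_categories)
    gt_masks: a PolygonMasks or BitMasks object storing N masks, one for each instance
    gt_keypoints: a Keypoints object storing N keypoint sets, one for each instance
proposals: an Instances object used only in Fast R-CNN style models
sem_seg: Tensor[int] in (H, W) format, the semantic segmentation ground truth for training. Values represent category labels starting from 0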
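To make the role of height and width concrete: when the requested output size differs from the size of the image tensor, predictions have to be mapped from one resolution to the other. The following is a minimal sketch of that idea for a single box. The function rescale_box and the example numbers are hypothetical, and this is not a copy of detectron2's own post-processing code.

```python
def rescale_box(box, input_hw, output_hw):
    """Map an (x0, y0, x1, y1) box from the input resolution to the output resolution."""
    in_h, in_w = input_hw
    out_h, out_w = output_hw
    scale_x = out_w / in_w  # horizontal scale factor
    scale_y = out_h / in_h  # vertical scale factor
    x0, y0, x1, y1 = box
    return (x0 * scale_x, y0 * scale_y, x1 * scale_x, y1 * scale_y)


# The model saw a 400x600 (H, W) image, but the caller asked for an 800x1200 output.
scaled = rescale_box((60.0, 40.0, 300.0, 200.0), input_hw=(400, 600), output_hw=(800, 1200))
print(scaled)
```

This is why a resized image can be fed to the model while the predictions are still reported in the coordinates of the original image.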

How it connects to data loader:

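The output of the default DatasetMapper is a dict that follows the format above. After the data loader performs batching, it becomes a list[dict], which can be used directly to train the built-in models.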
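Because images in one batch may have different sizes, batching here can be as simple as keeping the per-image dicts in a list instead of stacking them into one tensor. Below is a minimal sketch of such a collate step. The function name identity_collate is hypothetical, and placeholder strings stand in for the image tensors.

```python
def identity_collate(samples):
    """Return the per-image dicts unchanged, so a batch is simply a list[dict]."""
    return list(samples)


# Placeholder strings stand in for the (C, H, W) image tensors.
mapped = [
    {"image": "tensor_a", "height": 480, "width": 640},
    {"image": "tensor_b", "height": 600, "width": 800},
]
batch = identity_collate(mapped)
print(len(batch), batch[1]["width"])
```

A function of this shape could be passed as collate_fn to torch.utils.data.DataLoader to obtain list[dict] batches.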

Model Output Format

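In training mode, the built-in models output a dict[str->ScalarTensor] containing all the losses.

In inference mode, the built-in models output a list[dict], one dict per image. Depending on the task the model performs, each dict may contain the following fields (their meanings largely mirror the training inputs described above, so they are listed without further commentary):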
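In a training loop, the entries of this dict are usually summed into a single scalar before backpropagation. A minimal sketch, with plain floats standing in for the scalar tensors and with illustrative key names:

```python
# Floats stand in for the scalar tensors; the key names are illustrative.
loss_dict = {"loss_cls": 0.5, "loss_box_reg": 0.25, "loss_mask": 0.25}

# Sum every loss term into one scalar to optimize.
total_loss = sum(loss_dict.values())
print(total_loss)
```

With real tensors the same expression yields a scalar tensor, on which backward() can then be called.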

“instances”: Instances object with the following fields:
    “pred_boxes”: Boxes object storing N boxes, one for each detected instance.
    “scores”: Tensor, a vector of N scores.
    “pred_classes”: Tensor, a vector of N labels in range [0, num_categories).
    “pred_masks”: a Tensor of shape (N, H, W), masks for each detected instance.
    “pred_keypoints”: a Tensor of shape (N, num_keypoint, 3). Each row in the last dimension is (x, y, score). Scores are larger than 0.
“sem_seg”: Tensor of (num_categories, H, W), the semantic segmentation prediction.
“proposals”: Instances object with the following fields:
    “proposal_boxes”: Boxes object storing N boxes.
    “objectness_logits”: a torch vector of N scores.
“panoptic_seg”: A tuple of (Tensor, list[dict]). The tensor has shape (H, W), where each element represents the segment id of the pixel. Each dict describes one segment id and has the following fields:
    “id”: the segment id
    “isthing”: whether the segment is a thing or stuff
    “category_id”: the category id of this segment. It represents the thing class id when isthing==True, and the stuff class id otherwise.
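As an illustration of how the two parts of “panoptic_seg” fit together, the sketch below counts the pixels of each segment and pairs the count with the segment's description. Nested lists stand in for the (H, W) tensor, and every value is made up for the example.

```python
from collections import Counter

# Nested lists stand in for the (H, W) tensor of segment ids.
panoptic_ids = [
    [1, 1, 2],
    [1, 2, 2],
    [3, 3, 3],
]
# One dict per segment id, using the fields listed above.
segments_info = [
    {"id": 1, "isthing": True, "category_id": 0},
    {"id": 2, "isthing": True, "category_id": 0},
    {"id": 3, "isthing": False, "category_id": 5},
]

# Count how many pixels belong to each segment id.
pixel_counts = Counter(seg_id for row in panoptic_ids for seg_id in row)

# Pair each segment description with its pixel area.
summary = [
    (info["id"], "thing" if info["isthing"] else "stuff", info["category_id"], pixel_counts[info["id"]])
    for info in segments_info
]
print(summary)
```

Segments 1 and 2 share category_id 0 but have different ids, which is how two instances of the same thing class are told apart.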

Partially execute a model:

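Sometimes you may only need an intermediate tensor inside a model, such as the output of a particular layer. Since there are typically hundreds of intermediate tensors, there is no single API that provides the intermediate result you need. You have the following options:

Write a (sub)model. Following the tutorial, you can rewrite a model component (e.g. a head) so that it does the same job as the existing component but also returns the output you need.
Partially execute a model. You can create the model as usual, but run it with custom code instead of its forward() method. For example, the following code obtains the mask features before the mask head: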

from detectron2.structures import ImageList

images = ImageList.from_tensors(...)  # preprocessed input tensor
model = build_model(cfg)
model.eval()  # evaluation mode, so the box head returns predicted instances
features = model.backbone(images.tensor)
proposals, _ = model.proposal_generator(images, features)
instances = model.roi_heads._forward_box(features, proposals)
mask_features = [features[f] for f in model.roi_heads.in_features]
mask_features = model.roi_heads.mask_pooler(mask_features, [x.pred_boxes for x in instances])