MMDetection Source Code Analysis

yuyuelongfly

Last modified: 2022-04-13 09:59:42

Category: Deep Learning. Tags: Deep Learning

First published: 2022-04-12 21:38:35

Article link: https://blog.csdn.net/Cxiazaiyu/article/details/123995333

Link to the MMDetection open-source code:

Chapter 1: Architecture Design and Implementation

Configuration files:

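Each model directory under configs contains a metafile.yml. It lists the different implementations (different backbone, neck, and so on) that belong to one family (Collections), together with the links to the corresponding weights.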
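
To make the idea concrete, here is a minimal sketch of what such a file holds once parsed. The dict below stands in for the result of loading a metafile.yml with a YAML parser. The key names (`Collections`, `Models`, `In Collection`, `Config`, `Weights`) follow the layout commonly seen in MMDetection 2.x metafiles and should be treated as an assumption; the model names and URLs are made-up placeholders, and `models_by_collection` is a hypothetical helper, not part of MMDetection.

```python
from collections import defaultdict

# Hand-written stand-in for a parsed metafile.yml.
# Every name and URL here is a placeholder, not a real checkpoint.
metafile = {
    'Collections': [
        {'Name': 'Toy Detector'},
    ],
    'Models': [
        {
            'Name': 'toy_r50_fpn_1x',
            'In Collection': 'Toy Detector',
            'Config': 'configs/toy/toy_r50_fpn_1x.py',
            'Weights': 'https://example.com/toy_r50_fpn_1x.pth',
        },
        {
            'Name': 'toy_r101_fpn_1x',
            'In Collection': 'Toy Detector',
            'Config': 'configs/toy/toy_r101_fpn_1x.py',
            'Weights': 'https://example.com/toy_r101_fpn_1x.pth',
        },
    ],
}


def models_by_collection(meta):
    """Group (name, config, weights) tuples under their collection name."""
    grouped = defaultdict(list)
    for model in meta.get('Models', []):
        grouped[model['In Collection']].append(
            (model['Name'], model['Config'], model['Weights']))
    return dict(grouped)


grouped = models_by_collection(metafile)
for collection, entries in grouped.items():
    print(collection)
    for name, config, weights in entries:
        print(f'  {name}: {config} -> {weights}')
```

Each entry under `Models` is one concrete implementation of the family, which is why the backbone usually shows up in the model name.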

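The config of each model can be divided into the following parts:

model: describes the model structure (backbone, neck, head) and its parameters; the loss is configured inside the head. get_targets (which builds training targets, shaped like the feature maps, from the annotations) and decode (which turns feature maps into final predictions) are both defined directly as methods of the head, and the loss method calls get_targets.

schedule: describes the optimizer and the learning policy.

dataset: train_pipeline and test_pipeline, i.e. the transforms applied at train and test time.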
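
The three parts above can be sketched as one abbreviated config in the Python-dict style that MMDetection 2.x uses. This is only an illustration: the module types are ones that exist in MMDetection 2.x, but the numbers are placeholders and several required fields (normalization, padding, data loaders, runner) are left out, so it is not a complete training recipe.

```python
# Abbreviated config sketch; values are illustrative, not a tuned recipe.

# model: backbone / neck / head. Note that the losses live inside the head.
model = dict(
    type='RetinaNet',
    backbone=dict(type='ResNet', depth=50),
    neck=dict(
        type='FPN',
        in_channels=[256, 512, 1024, 2048],
        out_channels=256),
    bbox_head=dict(
        type='RetinaHead',
        num_classes=80,
        in_channels=256,
        loss_cls=dict(type='FocalLoss', use_sigmoid=True, loss_weight=1.0),
        loss_bbox=dict(type='L1Loss', loss_weight=1.0)),
)

# schedule: optimizer and learning-rate policy.
optimizer = dict(type='SGD', lr=0.01, momentum=0.9, weight_decay=0.0001)
lr_config = dict(policy='step', warmup='linear', warmup_iters=500, step=[8, 11])

# dataset: transforms applied at train time and at test time.
train_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(type='LoadAnnotations', with_bbox=True),
    dict(type='Resize', img_scale=(1333, 800), keep_ratio=True),
    dict(type='RandomFlip', flip_ratio=0.5),
    dict(type='DefaultFormatBundle'),
    dict(type='Collect', keys=['img', 'gt_bboxes', 'gt_labels']),
]
test_pipeline = [
    dict(type='LoadImageFromFile'),
    dict(
        type='MultiScaleFlipAug',
        img_scale=(1333, 800),
        flip=False,
        transforms=[
            dict(type='Resize', keep_ratio=True),
            dict(type='RandomFlip'),
            dict(type='ImageToTensor', keys=['img']),
            dict(type='Collect', keys=['img']),
        ]),
]

# The loss entries are found under the head, not at the top level of model.
loss_keys = [key for key in model['bbox_head'] if key.startswith('loss')]
print(loss_keys)
```

Only the train pipeline loads annotations; the test pipeline wraps its transforms in a test-time-augmentation step instead.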

mmdet/models
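
How a head groups loss, get_targets and decode can be shown with a toy example. The sketch below is not MMDetection code: `ToyPointHead` is a hypothetical, dependency-free class in which "objects" are integer positions on a 1-D grid and the "feature" is a list of scores. It only illustrates the call pattern described for the model part of the config: loss calls get_targets, and decode maps the predicted feature back to objects.

```python
class ToyPointHead:
    """Toy 1-D 'detection head': objects are integer positions on a grid."""

    def __init__(self, grid_size, score_thr=0.5):
        self.grid_size = grid_size
        self.score_thr = score_thr

    def get_targets(self, gt_positions):
        """Annotations -> target with the same shape as the predicted feature."""
        target = [0.0] * self.grid_size
        for pos in gt_positions:
            target[pos] = 1.0
        return target

    def loss(self, pred_heatmap, gt_positions):
        """Mean squared error; the targets are built here via get_targets."""
        target = self.get_targets(gt_positions)
        sq_err = [(p - t) ** 2 for p, t in zip(pred_heatmap, target)]
        return dict(loss_heatmap=sum(sq_err) / len(sq_err))

    def decode(self, pred_heatmap):
        """Predicted feature -> positions whose score exceeds the threshold."""
        return [i for i, s in enumerate(pred_heatmap) if s > self.score_thr]


head = ToyPointHead(grid_size=5)
pred = [0.1, 0.9, 0.0, 0.2, 0.8]

losses = head.loss(pred, gt_positions=[1, 4])  # training path
detections = head.decode(pred)                 # inference path
print(losses, detections)
```

Keeping target generation inside the head means the detector itself never needs to know how annotations are encoded.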

Chapter 2: Network Structure Design -- mmdetection/mmdet/models

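At the model level, detectors are the Architecture: they define the detection framework. The base class of all detection frameworks is BaseDetector, which defines the interface. This includes the framework's forward interface (covering both the forward_train and the forward_test logic), a few properties (whether the model has a neck, whether the ROI head contains a shared head, whether it predicts masks, and so on), the extract_feats interface, the show_result method, and others.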

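# Imports as used by MMDetection 2.x (mmcv 1.x); adjust for other versions.
from abc import ABCMeta

import torch
from mmcv.runner import BaseModule, auto_fp16


class BaseDetector(BaseModule, metaclass=ABCMeta):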
    """Base class for detectors."""
    @auto_fp16(apply_to=('img', ))
    def forward(self, img, img_metas, return_loss=True, **kwargs):
        """Calls either :func:`forward_train` or :func:`forward_test` depending
        on whether ``return_loss`` is ``True``.

        Note this setting will change the expected inputs. When
        ``return_loss=True``, img and img_meta are single-nested (i.e. Tensor
        and List[dict]), and when ``return_loss=False``, img and img_meta
        should be double nested (i.e.  List[Tensor], List[List[dict]]), with
        the outer list indicating test time augmentations.
        """
        if torch.onnx.is_in_onnx_export():
            assert len(img_metas) == 1
            return self.onnx_export(img[0], img_metas[0])

        if return_loss:
            return self.forward_train(img, img_metas, **kwargs)
        else:
            return self.forward_test(img, img_metas, **kwargs)

    def forward_train(self, imgs, img_metas, **kwargs):
        """
        Args:
            imgs (Tensor): of shape (N, C, H, W) encoding input images.
                Typically these should be mean centered and std scaled.
            img_metas (list[dict]): List of image info dict where each dict
                has: 'img_shape', 'scale_factor', 'flip', and may also contain
                'filename', 'ori_shape', 'pad_shape', and 'img_norm_cfg'.
                For details on the values of these keys, see
                :class:`mmdet.datasets.pipelines.Collect`.
            kwargs (keyword arguments): Specific to concrete implementation.
        """
        # NOTE the batched image size information may be useful, e.g.
        # in DETR, this is needed for the construction of masks, which is
        # then used for the transformer_head.
        batch_input_shape = tuple(imgs[0].size()[-2:])
        for img_meta in img_metas:
            img_meta['batch_input_shape'] = batch_input_shape

    def forward_test(self, imgs, img_metas, **kwargs):
        """
        Args:
            imgs (List[Tensor]): the outer list indicates test-time
                augmentations and inner Tensor should have a shape NxCxHxW,
                which contains all images in the batch.
            img_metas (List[List[dict]]): the outer list indicates test-time
                augs (multiscale, flip, etc.) and the inner list indicates
                images in a batch.
        """
        for var, name in [(imgs, 'imgs'), (img_metas, 'img_metas')]:
            if not isinstance(var, list):
                raise TypeError(f'{name} must be a list, but got {type(var)}')

        num_augs = len(imgs)
        if num_augs != len(img_metas):
            raise ValueError(f'num of augmentations ({len(imgs)}) '
                             f'!= num of image meta ({len(img_metas)})')

        # NOTE the batched image size information may be useful, e.g.
        # in DETR, this is needed for the construction of masks, which is
        # then used for the transformer_head.
        for img, img_meta in zip(imgs, img_metas):
            batch_size = len(img_meta)
            for img_id in range(batch_size):
                img_meta[img_id]['batch_input_shape'] = tuple(img.size()[-2:])

        if num_augs == 1:
            # proposals (List[List[Tensor]]): the outer list indicates
            # test-time augs (multiscale, flip, etc.) and the inner list
            # indicates images in a batch.
            # The Tensor should have a shape Px4, where P is the number of
            # proposals.
            if 'proposals' in kwargs:
                kwargs['proposals'] = kwargs['proposals'][0]
            # ... (the rest of forward_test is cut off in this excerpt)
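
The dispatch and the single-nested versus double-nested input convention can be tried without installing anything. The following is a simplified sketch, not the MMDetection class: `ToyBaseDetector` and `ToyDetector` are hypothetical names, a string stands in for the image tensor, and the single-augmentation branch calls a `simple_test` hook. That hook name follows what MMDetection 2.x uses, but that part of the method is not visible in the excerpt above, so treat it as an assumption.

```python
from abc import ABCMeta, abstractmethod


class ToyBaseDetector(metaclass=ABCMeta):
    """Dependency-free stand-in that mimics the forward dispatch."""

    def __init__(self, neck=None):
        self.neck = neck

    @property
    def with_neck(self):
        """Whether a neck was configured."""
        return self.neck is not None

    @abstractmethod
    def forward_train(self, img, img_metas, **kwargs):
        """Return a dict of losses (inputs are single nested)."""

    @abstractmethod
    def simple_test(self, img, img_metas, **kwargs):
        """Return predictions for a single augmentation (assumed hook)."""

    def forward_test(self, imgs, img_metas, **kwargs):
        # Test inputs are double nested: the outer list is over augmentations.
        for var, name in [(imgs, 'imgs'), (img_metas, 'img_metas')]:
            if not isinstance(var, list):
                raise TypeError(f'{name} must be a list, but got {type(var)}')
        if len(imgs) != len(img_metas):
            raise ValueError('number of augmentations does not match')
        if len(imgs) == 1:
            return self.simple_test(imgs[0], img_metas[0], **kwargs)
        raise NotImplementedError('multi-augmentation testing is omitted here')

    def forward(self, img, img_metas, return_loss=True, **kwargs):
        if return_loss:
            return self.forward_train(img, img_metas, **kwargs)
        return self.forward_test(img, img_metas, **kwargs)


class ToyDetector(ToyBaseDetector):
    def forward_train(self, img, img_metas, **kwargs):
        return dict(loss=float(len(img_metas)))

    def simple_test(self, img, img_metas, **kwargs):
        return [meta['filename'] for meta in img_metas]


det = ToyDetector()
batch = 'batch-of-2-images'  # placeholder for an (N, C, H, W) tensor
metas = [dict(filename='a.jpg'), dict(filename='b.jpg')]

train_out = det.forward(batch, metas, return_loss=True)      # single nested
test_out = det.forward([batch], [metas], return_loss=False)  # double nested
print(train_out, test_out, det.with_neck)
```

Passing single-nested inputs with `return_loss=False` raises a TypeError in this sketch, which matches the list check in the forward_test excerpt.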
