mmdet 3.0 Series: The BaseDataset Class

KevinDB

Article link: https://blog.csdn.net/everysigleday/article/details/127426920

BaseDataset

Introduction

In mmdet 3.0, the former CustomDataset has been replaced by BaseDataset and BaseDetDataset, the latter being used for detection. BaseDataset lives in mmengine.

`__init__`

        # Full initialize the dataset.
        if not lazy_init:
            self.full_init()

In the __init__ function, self.full_init(), which is defined in BaseDataset, is called to initialize the object unless lazy_init is True.
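The effect of lazy_init can be illustrated with a small stand-in class. This is a minimal sketch that does not import mmengine: TinyDataset and its _load helper are hypothetical names, and only lazy_init, full_init, and _fully_initialized mirror names from the source.

```python
class TinyDataset:
    """Hypothetical stand-in that mimics the lazy-initialization pattern."""

    def __init__(self, lazy_init: bool = False):
        self._fully_initialized = False
        self.data_list = []
        # Same guard as in the excerpt above: load eagerly unless asked not to.
        if not lazy_init:
            self.full_init()

    def _load(self):
        # Placeholder for the (potentially expensive) annotation loading.
        return [{"img_path": "a.jpg"}, {"img_path": "b.jpg"}]

    def full_init(self):
        # Calling full_init twice is harmless: the second call returns early.
        if self._fully_initialized:
            return
        self.data_list = self._load()
        self._fully_initialized = True


eager = TinyDataset()
lazy = TinyDataset(lazy_init=True)
print(len(eager.data_list), len(lazy.data_list))  # 2 0

lazy.full_init()
print(len(lazy.data_list))  # 2
```

Deferring the load this way is useful when only the dataset's configuration is needed and reading the annotation file would be wasted work.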

`self.full_init`

In self.full_init, initialization proceeds in the following steps:

1. Call self.load_data_list to load the annotations from the annotation file.
2. Call self.filter_data to filter the annotations according to filter_cfg.
3. (Optional) If self._indices is not None, call self._get_unserialized_subset to slice the dataset.
4. (Optional) If self.serialize_data is True, call self._serialize_data() to serialize self.data_list.

    def full_init(self):
        """Load annotation file and set ``BaseDataset._fully_initialized`` to
        True.

        If ``lazy_init=False``, ``full_init`` will be called during the
        instantiation and ``self._fully_initialized`` will be set to True. If
        ``obj._fully_initialized=False``, the class method decorated by
        ``force_full_init`` will call ``full_init`` automatically.

        Several steps to initialize annotation:

            - load_data_list: Load annotations from annotation file.
            - filter data information: Filter annotations according to
              filter_cfg.
            - slice_data: Slice dataset according to ``self._indices``
            - serialize_data: Serialize ``self.data_list`` if
            ``self.serialize_data`` is True.
        """
        if self._fully_initialized:
            return
        # load data information
        self.data_list = self.load_data_list()
        # filter illegal data, such as data that has no annotations.
        self.data_list = self.filter_data()
        # Get subset data according to indices.
        if self._indices is not None:
            self.data_list = self._get_unserialized_subset(self._indices)

        # serialize data_list
        if self.serialize_data:
            self.data_bytes, self.data_address = self._serialize_data()

        self._fully_initialized = True
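The serialization step returns two objects, data_bytes and data_address. The idea, as far as I understand it, is to pack every item into one flat buffer and remember where each item ends, so that dataloader workers can share the buffer instead of copying a large list of Python dicts. The sketch below shows that idea with the standard library only; mmengine's own implementation is not reproduced here, and get_item is a hypothetical helper.

```python
import pickle
from itertools import accumulate

data_list = [
    {"img_path": "xxx/xxx_1.jpg", "height": 604, "width": 640},
    {"img_path": "xxx/xxx_2.jpg", "height": 320, "width": 460},
]

# Pickle every item, then pack the results into one contiguous buffer.
chunks = [pickle.dumps(item, protocol=4) for item in data_list]
data_bytes = b"".join(chunks)
# data_address[i] is the end offset of item i inside data_bytes.
data_address = list(accumulate(len(chunk) for chunk in chunks))


def get_item(idx: int) -> dict:
    """Recover item idx from the packed buffer (hypothetical helper)."""
    start = 0 if idx == 0 else data_address[idx - 1]
    end = data_address[idx]
    return pickle.loads(data_bytes[start:end])


print(get_item(1)["img_path"])  # xxx/xxx_2.jpg
```

Storing only end offsets is enough because item i always starts where item i - 1 ends.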

`self.load_data_list`

self.load_data_list calls mmengine's load function to read a yml, json, or pickle file. Loading the file yields a dict, which is the annotation format that BaseDataset defines for mmdet 3.0:

{
    'metainfo':
        {
            'classes': ('person', 'bicycle', 'car', 'motorcycle'),
            ...
        },
    'data_list':
        [
            {
                "img_path": "xxx/xxx_1.jpg",
                "height": 604,
                "width": 640,
                "instances":
                [
                  {
                    "bbox": [0, 0, 10, 20],
                    "bbox_label": 1,
                    "ignore_flag": 0
                  },
                  {
                    "bbox": [10, 10, 110, 120],
                    "bbox_label": 2,
                    "ignore_flag": 0
                  }
                ]
              },
            {
                "img_path": "xxx/xxx_2.jpg",
                "height": 320,
                "width": 460,
                "instances":
                [
                  {
                    "bbox": [10, 0, 20, 20],
                    "bbox_label": 3,
                    "ignore_flag": 1,
                  }
                ]
              },
            ...
        ]
}

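Each dict element of data_list is then passed to self.parse_data_info, which essentially joins its img_path field with the path prefix. The processed elements are added to self.data_list, and self.data_list is returned at the end.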
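The loading flow can be tried out without mmengine. The following is a rough sketch under stated assumptions: json.load stands in for mmengine's load function, the annotation file is written to a temporary directory, and img_prefix is a hypothetical path prefix.

```python
import json
import os.path as osp
import tempfile

annotations = {
    "metainfo": {"classes": ["person", "bicycle", "car", "motorcycle"]},
    "data_list": [
        {
            "img_path": "xxx/xxx_1.jpg",
            "height": 604,
            "width": 640,
            "instances": [
                {"bbox": [0, 0, 10, 20], "bbox_label": 1, "ignore_flag": 0},
            ],
        },
    ],
}

with tempfile.TemporaryDirectory() as tmp_dir:
    ann_file = osp.join(tmp_dir, "annotations.json")
    with open(ann_file, "w") as f:
        json.dump(annotations, f)

    # Stand-in for mmengine's load(): read the file back into a dict.
    with open(ann_file) as f:
        loaded = json.load(f)

# The two top-level keys of the annotation format.
assert "metainfo" in loaded and "data_list" in loaded

img_prefix = "data/coco"  # hypothetical path prefix
data_list = []
for raw_data_info in loaded["data_list"]:
    # Mimics the prefix join that parse_data_info performs.
    raw_data_info["img_path"] = osp.join(img_prefix, raw_data_info["img_path"])
    data_list.append(raw_data_info)

print(data_list[0]["img_path"])
```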

`self.parse_data_info`

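This function parses the annotation shown above into the target format. Its input is a single element of data_list, i.e. a dict holding the annotations of one image. If the annotation format changes, the function can be overridden to handle the new format.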

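The default prefix is data_prefix: dict = dict(img_path=''). The function joins each prefix in self.data_prefix with the matching field of the dict (img_path by default) and returns the dict.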

    def parse_data_info(self, raw_data_info: dict) -> Union[dict, List[dict]]:
        """Parse raw annotation to target format.

        Args:
            raw_data_info (dict): Raw data information load from ``ann_file``

        Returns:
            list or list[dict]: Parsed annotation.
        """
        for prefix_key, prefix in self.data_prefix.items():
            assert prefix_key in raw_data_info, (
                f'raw_data_info: {raw_data_info} dose not contain prefix key'
                f'{prefix_key}, please check your data_prefix.')
            raw_data_info[prefix_key] = osp.join(prefix,
                                                 raw_data_info[prefix_key])
        return raw_data_info
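When the raw annotations use a different layout, an override can first map them to the target format and then apply the same prefix join. The sketch below is written as a standalone function rather than a real subclass; the raw keys file_name and boxes, and the data/images prefix, are hypothetical and chosen only for illustration.

```python
import os.path as osp

data_prefix = dict(img_path="data/images")  # hypothetical prefix


def parse_data_info(raw_data_info: dict) -> dict:
    """Adapt a hypothetical raw record to the target format."""
    # Hypothetical raw keys: 'file_name' and 'boxes' of (x1, y1, x2, y2, label).
    data_info = {
        "img_path": raw_data_info["file_name"],
        "height": raw_data_info["height"],
        "width": raw_data_info["width"],
        "instances": [
            {"bbox": list(box[:4]), "bbox_label": box[4], "ignore_flag": 0}
            for box in raw_data_info["boxes"]
        ],
    }
    # Same prefix join as the base implementation shown above.
    for prefix_key, prefix in data_prefix.items():
        assert prefix_key in data_info, f"missing prefix key {prefix_key}"
        data_info[prefix_key] = osp.join(prefix, data_info[prefix_key])
    return data_info


raw = {"file_name": "xxx_1.jpg", "height": 604, "width": 640,
       "boxes": [(0, 0, 10, 20, 1)]}
parsed = parse_data_info(raw)
print(parsed["instances"][0])
# {'bbox': [0, 0, 10, 20], 'bbox_label': 1, 'ignore_flag': 0}
```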

`self.filter_data`

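self.filter_data filters the annotations according to filter_cfg. The base class does not implement any filtering and simply returns self.data_list; if self.data_list needs to be filtered in a particular way, a subclass can override this method.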

    def filter_data(self) -> List[dict]:
        """Filter annotations according to filter_cfg. Defaults return all
        ``data_list``.

        If some ``data_list`` could be filtered according to specific logic,
        the subclass should override this method.

        Returns:
            list[int]: Filtered results.
        """
        return self.data_list
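A subclass override might look roughly like the sketch below, written here as a standalone function so it runs without mmengine. The filter_cfg keys filter_empty_gt and min_size are assumptions made for this example, as is the choice to treat instances with ignore_flag set as absent.

```python
from typing import List


def filter_data(data_list: List[dict], filter_cfg: dict) -> List[dict]:
    """Drop images that are too small or have no usable instances."""
    # Both keys are assumptions for this sketch.
    filter_empty_gt = filter_cfg.get("filter_empty_gt", False)
    min_size = filter_cfg.get("min_size", 0)

    kept = []
    for data_info in data_list:
        # An instance counts only if it is not flagged as ignored.
        valid = [inst for inst in data_info.get("instances", [])
                 if not inst.get("ignore_flag", 0)]
        if filter_empty_gt and not valid:
            continue
        if min(data_info["width"], data_info["height"]) < min_size:
            continue
        kept.append(data_info)
    return kept


data_list = [
    {"img_path": "xxx/xxx_1.jpg", "height": 604, "width": 640,
     "instances": [{"bbox": [0, 0, 10, 20], "bbox_label": 1, "ignore_flag": 0}]},
    {"img_path": "xxx/xxx_2.jpg", "height": 320, "width": 460,
     "instances": [{"bbox": [10, 0, 20, 20], "bbox_label": 3, "ignore_flag": 1}]},
]
filtered = filter_data(data_list, dict(filter_empty_gt=True, min_size=32))
print([d["img_path"] for d in filtered])  # ['xxx/xxx_1.jpg']
```

With an empty filter_cfg the function keeps everything, which matches the default behavior of returning the full data_list.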
