OpenPCDet Code Reading: Data Preprocessing Module

mezdh

Tags: python, object detection, artificial intelligence, computer vision, neural networks, deep learning, ubuntu

Article link: https://blog.csdn.net/mezdh/article/details/133799127

Data Preprocessing Module

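The data-handling code lives in pcdet/datasets inside the OpenPCDet repository. Besides the subpackages for the individual datasets (KITTI, nuScenes, Waymo and so on), the directory contains two core files, `__init__.py` and `dataset.py`. Each is covered in turn below.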

`__init__.py`

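This file defines the class DistributedSampler(_DistributedSampler), which inherits from PyTorch's official distributed sampler (torch.utils.data.DistributedSampler) and customizes part of its behavior. It overrides two methods, `__init__` and `__iter__`, and inherits everything else (for example `set_epoch` and `__len__`) unchanged. The overridden parts are: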

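`__init__` method:
- Takes a `shuffle` argument (default True) alongside `dataset`, `num_replicas` and `rank`, and stores it as `self.shuffle` to control whether the indices are shuffled in each epoch.
`__iter__` method:
- When `shuffle` is True, it creates a `torch.Generator`, seeds it with `self.epoch`, and calls `torch.randperm` to produce a random permutation of the indices.
- When `shuffle` is False, it produces the indices in sequential order.
- It pads the index list by repeating indices from its start, so that the length equals `self.total_size` exactly (the code asserts this).
- Finally it takes every `self.num_replicas`-th index starting at `self.rank`, so that each process receives its own share of `self.num_samples` samples.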
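
The index arithmetic in `__iter__` can be sketched in plain Python. This is only an illustration under stated assumptions: the helper name `shard_indices` is hypothetical, `random.Random(epoch)` stands in for the seeded `torch.Generator` with `torch.randperm`, and `num_samples` is assumed to be ceil(len(dataset) / num_replicas), which is what the PyTorch parent class computes when `drop_last` is False.

```python
import math
import random


def shard_indices(dataset_len, num_replicas, rank, epoch=0, shuffle=True):
    """Return the indices one rank would iterate over (illustrative helper)."""
    # Bookkeeping assumed from the parent sampler: ceil(N / R) samples per rank.
    num_samples = math.ceil(dataset_len / num_replicas)
    total_size = num_samples * num_replicas

    indices = list(range(dataset_len))
    if shuffle:
        # Stand-in for torch.randperm(..., generator=g) after g.manual_seed(epoch).
        random.Random(epoch).shuffle(indices)

    # Pad by repeating the head of the list until the length is total_size.
    indices += indices[:total_size - len(indices)]
    assert len(indices) == total_size

    # Strided slice: rank, rank + R, rank + 2R, ...
    shard = indices[rank:total_size:num_replicas]
    assert len(shard) == num_samples
    return shard


shards = [shard_indices(10, num_replicas=4, rank=r, shuffle=False) for r in range(4)]
print(shards)  # [[0, 4, 8], [1, 5, 9], [2, 6, 0], [3, 7, 1]]
```

With 10 samples and 4 processes, two indices (0 and 1) are repeated so that every process gets exactly 3 samples. Because the seed depends only on the epoch, every rank builds the same permutation and the strided slices do not collide.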

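The purpose of these changes is to control how samples are ordered and split across processes in distributed training: with the overridden `__iter__`, shuffling can be switched off (as is done for evaluation) while the per-rank split stays deterministic. The annotated code is as follows: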

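# Role of __init__.py: when this directory is imported as a package, __init__.py is executed first.
# In older Python versions (before 3.3), a directory without __init__.py could not be imported as a package.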
import torch
from functools import partial
from torch.utils.data import DataLoader
from torch.utils.data import DistributedSampler as _DistributedSampler

from pcdet.utils import common_utils

from .dataset import DatasetTemplate
from .kitti.kitti_dataset import KittiDataset
from .nuscenes.nuscenes_dataset import NuScenesDataset
from .waymo.waymo_dataset import WaymoDataset
from .pandaset.pandaset_dataset import PandasetDataset
from .lyft.lyft_dataset import LyftDataset
from .once.once_dataset import ONCEDataset
from .argo2.argo2_dataset import Argo2Dataset
from .custom.custom_dataset import CustomDataset

__all__ = {
    'DatasetTemplate': DatasetTemplate,
    'KittiDataset': KittiDataset,
    'NuScenesDataset': NuScenesDataset,
    'WaymoDataset': WaymoDataset,
    'PandasetDataset': PandasetDataset,
    'LyftDataset': LyftDataset,
    'ONCEDataset': ONCEDataset,
    'CustomDataset': CustomDataset,
    'Argo2Dataset': Argo2Dataset
}


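class DistributedSampler(_DistributedSampler):  # parent: PyTorch's distributed sampler (imported as _DistributedSampler)

    # dataset (Dataset) - a Dataset instance, or any Python object implementing __len__; used to generate sample indices.
    # num_replicas (int, optional) - number of processes taking part in distributed training.
    #     If None, it is taken from the current distributed process group. Default: None.
    # rank (int, optional) - index of the current process among the num_replicas processes.
    #     If None, it is taken from the current distributed process group. Default: None.
    # shuffle (bool, optional) - whether to shuffle the sample indices. Default: True in this subclass.
    #
    # The sampler is consumed through __iter__, which returns an iterator over sample indices.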

    def __init__(self, dataset, num_replicas=None, rank=None, shuffle=True):  # subclass constructor
        super().__init__(dataset, num_replicas=num_replicas, rank=rank)  # parent constructor
        self.shuffle = shuffle

    def __iter__(self):
        if self.shuffle:
            g = torch.Generator()  # create a random number generator by hand
            # PyTorch drives random number generation through torch.Generator.
            # Normally we do not instantiate one ourselves: PyTorch keeps a global
            # default generator, and random operations use it unless told otherwise.
            #   1. Using the default generator:
            #        torch.manual_seed(1)
            #        torch.randperm(5)               # e.g. tensor([0, 4, 2, 3, 1])
            #   2. Using a hand-made generator:
            #        g = torch.Generator()
            #        g.manual_seed(1)
            #        torch.randperm(5, generator=g)  # also tensor([0, 4, 2, 3, 1])

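            g.manual_seed(self.epoch)  # seed the generator
            # A seed selects one particular pseudo-random sequence: the same seed
            # always reproduces the same sequence.
            # Why use the epoch as the seed? Every process then computes the same
            # permutation for a given epoch, so the strided slices taken below do not
            # overlap, and a new epoch (set through set_epoch) gives a different shuffle.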
                                                                    
            indices = torch.randperm(len(self.dataset), generator=g).tolist()  # randperm returns a tensor; tolist() converts it to a Python list
        else:
            indices = torch.arange(len(self.dataset)).tolist()  # sequential indices, no randomness involved

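        # If total_size is larger than len(indices), repeat indices from the start to fill the gap.
        # Adding two lists concatenates them.
        indices += indices[:(self.total_size - len(indices))]
        assert len(indices) == self.total_size  # the lengths must now match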

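        # List slicing a[start:stop:step]: from start, up to but not including stop, in steps of step.
        # Start at this process's rank, step by the number of processes, stop at total_size.
        indices = indices[self.rank:self.total_size:self.num_replicas]
        assert len(indices) == self.num_samples  # the length must equal the per-process sample count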

        return iter(indices)  # return an iterator over the indices
        # Example:
        #   a = [1, 2, 3]
        #   it = iter(a)   # create an iterator object
        #   next(it)       # 1
        #   next(it)       # 2
        #   next(it)       # 3
                                                
                                                
def build_dataloader(dataset_cfg, class_names, batch_size, dist, root_path=None, workers=4, seed=None,
                     logger=None, training=True, merge_all_iters_to_one_epoch=False, total_epochs=0):
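    # Build the dataset and wrap it in a DataLoader.
    # Args:
    #     dataset_cfg: dataset configuration
    #     class_names: class names
    #     batch_size: batch size
    #     dist: whether distributed training is used
    #     root_path: dataset root directory
    #     workers: number of DataLoader worker processes
    #     seed: random seed
    #     logger: logger
    #     training: whether to build the dataset in training mode
    #     merge_all_iters_to_one_epoch: whether to merge all iterations into one epoch
    #     total_epochs: total number of epochs
    # Returns:
    #     dataset: the dataset
    #     dataloader: the data loader
    #     sampler: the sampler

    # Instantiate the dataset class selected by name in the config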
    dataset = __all__[dataset_cfg.DATASET](
        dataset_cfg=dataset_cfg,
        class_names=class_names,
        root_path=root_path,
        training=training,
        logger=logger,
    )

    if merge_all_iters_to_one_epoch:
        assert hasattr(dataset, 'merge_all_iters_to_one_epoch')  # hasattr checks whether the object has the named attribute
        dataset.merge_all_iters_to_one_epoch(merge=True, epochs=total_epochs)

    if dist:
        # Training uses PyTorch's stock sampler; evaluation uses the subclass above,
        # whose overridden __iter__ supports shuffle=False.
        if training:
            sampler = torch.utils.data.distributed.DistributedSampler(dataset)
        else:
            rank, world_size = common_utils.get_dist_info()
            sampler = DistributedSampler(dataset, world_size, rank, shuffle=False)  # build the distributed sampler
    else:
        sampler = None
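    # Create the DataLoader. Nothing is loaded or sampled yet; __getitem__ is only called once iteration starts.
    # In single-GPU runs (sampler is None) the DataLoader handles loading on its own and shuffles when training.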
    dataloader = DataLoader(
        dataset, batch_size=batch_size, pin_memory=True, num_workers=workers,
        shuffle=(sampler is None) and training, collate_fn=dataset.collate_batch,
        drop_last=False, sampler=sampler, timeout=0, worker_init_fn=partial(common_utils.worker_init_fn, seed=seed)
    )

    return dataset, dataloader, sampler
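
`build_dataloader` picks the dataset class by looking up `dataset_cfg.DATASET` in the `__all__` dictionary. Below is a minimal sketch of that name-to-class dispatch. The classes `ToyKittiDataset` and `ToyCustomDataset`, the `REGISTRY` dictionary and the `build_dataset` helper are hypothetical stand-ins, and a `SimpleNamespace` replaces the real config object.

```python
from types import SimpleNamespace


class ToyKittiDataset:
    """Hypothetical stand-in for a real dataset class."""

    def __init__(self, dataset_cfg, class_names, training=True):
        self.dataset_cfg = dataset_cfg
        self.class_names = class_names
        self.training = training


class ToyCustomDataset(ToyKittiDataset):
    """Second hypothetical class, only here to show the lookup."""


# Registry keyed by the string that appears in the config.
REGISTRY = {
    'ToyKittiDataset': ToyKittiDataset,
    'ToyCustomDataset': ToyCustomDataset,
}


def build_dataset(dataset_cfg, class_names, training=True):
    # Same pattern as __all__[dataset_cfg.DATASET](...) in build_dataloader.
    dataset_cls = REGISTRY[dataset_cfg.DATASET]
    return dataset_cls(dataset_cfg=dataset_cfg, class_names=class_names, training=training)


cfg = SimpleNamespace(DATASET='ToyCustomDataset')
dataset = build_dataset(cfg, class_names=['Car', 'Pedestrian'], training=False)
print(type(dataset).__name__)  # ToyCustomDataset
```

With this pattern, supporting a new dataset only needs one more dictionary entry, and a name that is not registered fails early with a KeyError.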

`dataset.py`

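This file defines the template class `DatasetTemplate`, which inherits from `torch_data.Dataset` (that is, `torch.utils.data.Dataset`) and extends it for 3D object detection. The main additions are: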

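Data processing and augmentation: `DatasetTemplate` creates a `DataProcessor` and a `DataAugmentor` to preprocess and augment the data. This covers operations such as removing out-of-range points, shuffling points and voxelization, as needed for point cloud detection.
Point feature encoder: a `PointFeatureEncoder` selects and encodes the per-point features (for example x, y, z), giving the model a suitable point representation.
Dataset configuration and parameters: `DatasetTemplate` holds dataset-specific settings such as the point cloud range and the voxel size. They are set when the dataset is constructed, so that different datasets can be accommodated.
Merging all iterations into one epoch: the `merge_all_iters_to_one_epoch` method controls whether all iterations are merged into a single epoch, which can be useful for some special training setups.
Saving predictions: `generate_prediction_dicts` is the hook for converting model predictions into the dataset's own coordinate system and optionally saving them to disk. In the template its body is empty; subclasses implement it.
Unified data processing interface: `prepare_data` applies the same processing to every raw sample, including filtering, preprocessing and encoding.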
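
The class filtering and labelling step inside `prepare_data` can be illustrated without NumPy. The sketch below is a plain-Python approximation, not the library code: the helper name `filter_and_label` is hypothetical and the boxes are made-up toy values. The real method works on NumPy arrays and uses `common_utils.keep_arrays_by_name` for the selection.

```python
def filter_and_label(gt_names, gt_boxes, class_names):
    """Keep boxes of wanted classes and append a 1-based class index to each."""
    kept = [i for i, name in enumerate(gt_names) if name in class_names]
    names = [gt_names[i] for i in kept]
    boxes = [
        list(gt_boxes[i]) + [float(class_names.index(gt_names[i]) + 1)]
        for i in kept
    ]
    return names, boxes


class_names = ['Car', 'Pedestrian']
gt_names = ['Car', 'Van', 'Pedestrian', 'Car']
# Toy boxes in the layout [x, y, z, dx, dy, dz, heading]
gt_boxes = [
    [10.0, 1.0, 0.0, 4.0, 1.8, 1.5, 0.0],
    [20.0, 2.0, 0.0, 5.0, 2.0, 2.0, 0.1],
    [5.0, -3.0, 0.0, 0.8, 0.6, 1.7, 1.2],
    [30.0, 0.5, 0.0, 4.2, 1.7, 1.4, 3.1],
]

names, boxes = filter_and_label(gt_names, gt_boxes, class_names)
print(names)                   # ['Car', 'Pedestrian', 'Car']
print([b[-1] for b in boxes])  # [1.0, 2.0, 1.0]
```

The class index starts at 1 rather than 0. Presumably this keeps 0 free, so that the all-zero rows added when boxes are padded during batching cannot be mistaken for a real class.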
The annotated code is as follows:

from collections import defaultdict  # returns a default value for a missing key instead of raising KeyError
from pathlib import Path

import numpy as np
import torch.utils.data as torch_data

from ..utils import common_utils
from .augmentor.data_augmentor import DataAugmentor
from .processor.data_processor import DataProcessor
from .processor.point_feature_encoder import PointFeatureEncoder


class DatasetTemplate(torch_data.Dataset):
    def __init__(self, dataset_cfg=None, class_names=None, training=True, root_path=None, logger=None):
        super().__init__()  # super() gives access to the parent class; this runs the parent's initializer
        self.dataset_cfg = dataset_cfg  # dataset configuration
        self.training = training  # training mode flag
        self.class_names = class_names  # classes to detect
        self.logger = logger  # logger
        self.root_path = root_path if root_path is not None else Path(self.dataset_cfg.DATA_PATH)  # dataset root directory
        if self.dataset_cfg is None or class_names is None:
            return

        self.point_cloud_range = np.array(self.dataset_cfg.POINT_CLOUD_RANGE, dtype=np.float32)  # point cloud range
        # Create the point feature encoder
        self.point_feature_encoder = PointFeatureEncoder(
            self.dataset_cfg.POINT_FEATURE_ENCODING,
            point_cloud_range=self.point_cloud_range
        )
        # Create the data augmentor (training only)
        self.data_augmentor = DataAugmentor(
            self.root_path, self.dataset_cfg.DATA_AUGMENTOR, self.class_names, logger=self.logger
        ) if self.training else None
        # Create the data processor
        self.data_processor = DataProcessor(
            self.dataset_cfg.DATA_PROCESSOR, point_cloud_range=self.point_cloud_range, training=self.training
        )

        self.grid_size = self.data_processor.grid_size  # grid size = point cloud range / voxel size
        self.voxel_size = self.data_processor.voxel_size  # voxel size
        self.total_epochs = 0
        self._merge_all_iters_to_one_epoch = False

        if hasattr(self.data_processor, "depth_downsample_factor"):
            self.depth_downsample_factor = self.data_processor.depth_downsample_factor
        else:
            self.depth_downsample_factor = None

    
    @property
    def mode(self):
        """@property lets a method be accessed like an attribute: self.mode"""
        return 'train' if self.training else 'test'

    def __getstate__(self):
        """Return state values to be pickled.
        Copies the instance attributes (self.__dict__) and returns them without 'logger'.
        """
        d = dict(self.__dict__)
        del d['logger']  # drop the logger entry
        return d

    def __setstate__(self, d):
        self.__dict__.update(d)  # restore the instance attributes from dict d

    @staticmethod
    def generate_prediction_dicts(batch_dict, pred_dicts, class_names, output_path=None):
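        # @staticmethod takes neither self (the instance) nor cls (the class); it is called like a plain function.
        # @classmethod also takes no self, but its first parameter must be cls.
        #   1. self refers to a specific instance. A static method has no self, so it behaves like an ordinary function.
        #   2. cls refers to the class itself.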
        """
        To support a custom dataset, implement this function to receive the predicted results from the model, and then
        transform the unified normative coordinate to your required coordinate, and optionally save them to disk.

        Args:
            batch_dict: dict of original data from the dataloader
            pred_dicts: dict of predicted results from the model
                pred_boxes: (N, 7), Tensor
                pred_scores: (N), Tensor
                pred_labels: (N), Tensor
            class_names:
            output_path: if it is not None, save the results to this path
        Returns:

        """

    def merge_all_iters_to_one_epoch(self, merge=True, epochs=None):
        """
        Merge all iterations into a single epoch.
        """
        if merge:
            self._merge_all_iters_to_one_epoch = True
            self.total_epochs = epochs
        else:
            self._merge_all_iters_to_one_epoch = False

    def __len__(self):
        # Like a pure virtual function in C++: every concrete subclass must override it
        raise NotImplementedError

    def __getitem__(self, index):
        """
        To support a custom dataset, implement this function to load the raw data (and labels), then transform them to
        the unified normative coordinate and call the function self.prepare_data() to process the data and send them
        to the model.

        Args:
            index:

        Returns:

        """
        raise NotImplementedError

    def prepare_data(self, data_dict):
        """
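        Takes a data dict in the unified coordinate system (points, boxes and class names) and applies
        class filtering, data augmentation, point feature encoding and the remaining preprocessing steps.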
        Args:
            data_dict:
                points: optional, (N, 3 + C_in)
                gt_boxes: optional, (N, 7 + C) [x, y, z, dx, dy, dz, heading, ...]
                gt_names: optional, (N), string
                ...

        Returns:
            data_dict:
                frame_id: string
                points: (N, 3 + C_in)
                gt_boxes: optional, (N, 7 + C) [x, y, z, dx, dy, dz, heading, ...]
                gt_names: optional, (N), string
                use_lead_xyz: bool
                voxels: optional (num_voxels, max_points_per_voxel, 3 + C)
                voxel_coords: optional (num_voxels, 3)
                voxel_num_points: optional (num_voxels)
                ...
        """
        # In training mode, augment the data; a mask marks the boxes whose class is in class_names
        if self.training:
            assert 'gt_boxes' in data_dict, 'gt_boxes should be provided for training'
            # Boolean array: for each entry of gt_names, is it one of the classes in self.class_names?
            # For KITTI, for example, data_dict['gt_names'] may be ['Car', 'Pedestrian', 'Cyclist']
            gt_boxes_mask = np.array([n in self.class_names for n in data_dict['gt_names']], dtype=np.bool_)
            # Augmentation. **data_dict unpacks all key-value pairs of data_dict, so the call below
            # builds a new dict with gt_boxes_mask added and passes it to data_augmentor.forward
            data_dict = self.data_augmentor.forward(
                data_dict={
                    **data_dict,
                    'gt_boxes_mask': gt_boxes_mask
                }
            )

        # Keep only the gt_boxes of the classes to detect
        if data_dict.get('gt_boxes', None) is not None:  # dict.get returns the value for the key, or the given default (None here) if the key is missing
            # Indices (np.array) of the entries of data_dict['gt_names'] that appear in class_names
            selected = common_utils.keep_arrays_by_name(data_dict['gt_names'], self.class_names)
            # Use selected to keep the wanted gt_boxes and gt_names
            data_dict['gt_boxes'] = data_dict['gt_boxes'][selected]
            data_dict['gt_names'] = data_dict['gt_names'][selected]
            # Map each class name in this frame's gt_names to its 1-based index in class_names.
            # Example: with class_names = ['car', 'person'] and a frame with
            # gt_names = ['car', 'person', 'car', 'car'] (three cars, one person), gt_classes = [1, 2, 1, 1]
            gt_classes = np.array([self.class_names.index(n) + 1 for n in data_dict['gt_names']], dtype=np.int32)
            # Append the class index as the last column of each gt box.
            # np.concatenate joins arrays along an axis. In reshape(-1, 1), -1 means that this
            # dimension is inferred from the array size, so the result has one column and as many rows as needed.
            gt_boxes = np.concatenate((data_dict['gt_boxes'], gt_classes.reshape(-1, 1).astype(np.float32)), axis=1)
            data_dict['gt_boxes'] = gt_boxes

            # If 2D boxes are present, filter them with selected as well
            if data_dict.get('gt_boxes2d', None) is not None:
                data_dict['gt_boxes2d'] = data_dict['gt_boxes2d'][selected]

        # Select which point attributes to use (e.g. x, y, z) and encode them
        if data_dict.get('points', None) is not None:
            data_dict = self.point_feature_encoder.forward(data_dict)

        # Preprocess the point cloud: remove points outside point_cloud_range, shuffle the points, and convert the point cloud to voxels
        data_dict = self.data_processor.forward(
            data_dict=data_dict
        )

        if self.training and len(data_dict['gt_boxes']) == 0:
            """
                In training mode, if no gt_boxes remain after filtering,
                draw a random index within the dataset length
                and call __getitem__ to return that sample's data dict instead.
            """
            new_index = np.random.randint(self.__len__())
            return self.__getitem__(new_index)

        data_dict.pop('gt_names', None)  # pop() removes the key and returns its value (or the given default if the key is absent)

        return data_dict

    @staticmethod
    def collate_batch(batch_list, _unused=False):
        """
        Frames in a batch contain different numbers of gt boxes (and points), so they cannot
        simply be stacked. collate_batch gathers the values of each key across the samples
        and merges them with key-specific rules.

        It receives a list of samples, each a data dict holding one point cloud and its
        ground truth (gt boxes, classes and so on), and returns a single batch dict.
        """
        # defaultdict(list) returns an empty list for a missing key
        data_dict = defaultdict(list)
        # Group the values of every sample in the batch by key
        for cur_sample in batch_list:
            for key, val in cur_sample.items():
                data_dict[key].append(val)
        batch_size = len(batch_list)
        ret = {}

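        # Merge the values collected for each key. For padded keys: find the largest size,
        # allocate a zero array and copy each sample in, because the batch needs consistent dimensions.
        for key, val in data_dict.items():  # .items() yields (key, value) tuples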
            try:
                # voxels: optional (num_voxels, max_points_per_voxel, 3 + C)
                # voxel_coords: optional (num_voxels, 3)
                # voxel_num_points: optional (num_voxels)
                if key in ['voxels', 'voxel_num_points']:
                    ret[key] = np.concatenate(val, axis=0)
                elif key in ['points', 'voxel_coords']:
                    coors = []
                    for i, coor in enumerate(val):
                        # Prepend the sample's batch index as a new first column
                        coor_pad = np.pad(coor, ((0, 0), (1, 0)), mode='constant', constant_values=i)
                        """
                            ((0, 0), (1, 0))
                            first axis (rows): pad 0 rows before and 0 rows after;
                            second axis (columns): pad 1 column before and 0 columns after
                            mode='constant' pads with a constant value
                            constant_values=i makes that value i, the sample's index in the batch
                        """
                        coors.append(coor_pad)
                    ret[key] = np.concatenate(coors, axis=0)  # (N1 + N2 + ..., 1 + C): all samples stacked, batch index in column 0
                elif key in ['gt_boxes']:
                    max_gt = max([len(x) for x in val])  # largest number of 3D boxes in any frame of the batch
                    batch_gt_boxes3d = np.zeros((batch_size, max_gt, val[0].shape[-1]), dtype=np.float32)  # zero-filled (B, max_gt, box_dim) array
                    for k in range(batch_size):
                        batch_gt_boxes3d[k, :val[k].__len__(), :] = val[k]  # val[k] is the k-th frame in the batch
                    ret[key] = batch_gt_boxes3d
                # gt_boxes2d is handled like gt_boxes
                elif key in ['gt_boxes2d']: 
                    max_boxes = 0
                    max_boxes = max([len(x) for x in val])
                    batch_boxes2d = np.zeros((batch_size, max_boxes, val[0].shape[-1]), dtype=np.float32) # (B, N, 4)
                    for k in range(batch_size):
                        if val[k].size > 0:
                            batch_boxes2d[k, :val[k].__len__(), :] = val[k]
                    ret[key] = batch_boxes2d
                elif key in ["images", "depth_maps"]:
                    # Get largest image size (H, W)
                    max_h = 0
                    max_w = 0
                    for image in val:
                        max_h = max(max_h, image.shape[0])
                        max_w = max(max_w, image.shape[1])

                    # Change size of images
                    images = []
                    for image in val:
                        pad_h = common_utils.get_pad_params(desired_size=max_h, cur_size=image.shape[0])
                        pad_w = common_utils.get_pad_params(desired_size=max_w, cur_size=image.shape[1])
                        pad_width = (pad_h, pad_w)
                        # Pad with nan, to be replaced later in the pipeline.
                        pad_value = np.nan

                        if key == "images":
                            pad_width = (pad_h, pad_w, (0, 0))
                        elif key == "depth_maps":
                            pad_width = (pad_h, pad_w)

                        image_pad = np.pad(image,
                                           pad_width=pad_width,
                                           mode='constant',
                                           constant_values=pad_value)

                        images.append(image_pad)
                    ret[key] = np.stack(images, axis=0) # (B, H, W, C)
                else:
                    ret[key] = np.stack(val, axis=0)
            except:
                print('Error in collate_batch: key=%s' % key)
                raise TypeError

        ret['batch_size'] = batch_size
        return ret
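
The two less obvious rules in `collate_batch`, zero-padding `gt_boxes` to a common length and prefixing each point with its batch index, can be sketched with plain lists. This is only an approximation of the NumPy code above: the helper names `pad_boxes` and `add_batch_index` are hypothetical, and the toy boxes have 3 values instead of the real 7 + C + 1.

```python
def pad_boxes(batch_boxes, box_dim):
    """Zero-pad every sample's box list to the largest box count in the batch."""
    max_gt = max(len(boxes) for boxes in batch_boxes)
    padded = []
    for boxes in batch_boxes:
        rows = [list(b) for b in boxes]
        rows += [[0.0] * box_dim for _ in range(max_gt - len(rows))]
        padded.append(rows)
    return padded


def add_batch_index(batch_points):
    """Prefix each point with the index of its sample, then flatten the batch."""
    flat = []
    for i, points in enumerate(batch_points):
        flat.extend([[float(i)] + list(p) for p in points])
    return flat


# Two samples: the first has one box, the second has two.
batch_boxes = [
    [[1.0, 2.0, 1.0]],
    [[3.0, 4.0, 2.0], [5.0, 6.0, 1.0]],
]
print(pad_boxes(batch_boxes, box_dim=3))
# [[[1.0, 2.0, 1.0], [0.0, 0.0, 0.0]], [[3.0, 4.0, 2.0], [5.0, 6.0, 1.0]]]

batch_points = [
    [[0.1, 0.2, 0.3]],
    [[1.0, 1.1, 1.2], [2.0, 2.1, 2.2]],
]
print(add_batch_index(batch_points))
# [[0.0, 0.1, 0.2, 0.3], [1.0, 1.0, 1.1, 1.2], [1.0, 2.0, 2.1, 2.2]]
```

Padding keeps the boxes in a regular (B, max_gt, box_dim) layout, while the points stay in one flat list in which the first column records the sample each point belongs to.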
