3D-Detection Paper Series 1 ---- PointPillars -- the create_data part

LimitOut

Article link: https://blog.csdn.net/LimitOut/article/details/106210669

$\color{red}{\text{Note: this post analyzes the PointPillars code, not SECOND.}}$

I. create_data preparation

Prepare the data in the layout described by the official repository, placing each kind of file in its own folder.

└── KITTI_DATASET_ROOT
       ├── training    <-- 7481 train data
       |   ├── image_2 <-- for visualization
       |   ├── calib
       |   ├── label_2
       |   ├── velodyne
       |   └── velodyne_reduced <-- empty directory
       └── testing     <-- 7518 test data
           ├── image_2 <-- for visualization
           ├── calib
           ├── velodyne
           └── velodyne_reduced <-- empty directory

Then run: python create_data.py create_kitti_info_file --data_path=KITTI_DATASET_ROOT

II. Code, part (1)

Let's look at the code in create_data.py:

def create_kitti_info_file(data_path,
                           save_path=None,
                           create_trainval=False,
                           relative_path=True):

This is the parameter list of the original function; the KITTI raw data root is passed in as data_path.

    train_img_ids = _read_imageset_file("./data/ImageSets/train.txt")
    val_img_ids = _read_imageset_file("./data/ImageSets/val.txt")
    trainval_img_ids = _read_imageset_file("./data/ImageSets/trainval.txt")
    test_img_ids = _read_imageset_file("./data/ImageSets/test.txt")

The files under ImageSets hold the KITTI sample indices that belong to the training, validation, and test splits.
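To make the split files concrete, here is a minimal sketch of what a reader such as `_read_imageset_file` could look like. It assumes one zero-padded sample index per line; the function name `read_imageset_file` and the temporary file are illustrative stand-ins, not the repository's code.

```python
import tempfile
from pathlib import Path


def read_imageset_file(path):
    """Read one sample index per line, e.g. '000009' -> 9 (assumed file format)."""
    with open(path, "r") as f:
        return [int(line) for line in f if line.strip()]


# Throwaway file standing in for ./data/ImageSets/train.txt
with tempfile.TemporaryDirectory() as tmp:
    split_file = Path(tmp) / "train.txt"
    split_file.write_text("000000\n000003\n000009\n")
    train_img_ids = read_imageset_file(split_file)
```

The resulting list of integers is what gets passed as `image_ids` to `get_kitti_image_info`.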

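# arguments are omitted in this excerpt for the test and val calls;
# they follow the same pattern as the train call below
kitti_infos_test = kitti.get_kitti_image_info(...)
kitti_infos_val = kitti.get_kitti_image_info(...)
kitti_infos_train = kitti.get_kitti_image_info(data_path,
        training=True,
        velodyne=True,
        calib=True,
        image_ids=train_img_ids,
        relative_path=relative_path)

This produces the info records for the three splits. Each image_info contains information about the image, the velodyne point cloud, the calibration, and the ground truth.

Below is the prototype of get_kitti_image_info; it returns a list.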

def get_kitti_image_info(path,
                         training=True,
                         label_info=True,
                         velodyne=False,
                         calib=False,
                         image_ids=7481,
                         extend_matrix=True,
                         num_worker=8,
                         relative_path=True,
                         with_imageshape=True):

The information generated by get_kitti_image_info looks like this:

{'image_idx': ...,
 'pointcloud_num_features': 4,
 'velodyne_path': ...,
 'img_path': ...,
 'img_shape': ...,
 'calib/P0': ..., 'calib/P1': ..., 'calib/P2': ..., 'calib/P3': ...,
 'calib/R0_rect': ..., 'calib/Tr_velo_to_cam': ..., 'calib/Tr_imu_to_velo': ...,
 'annos': {'name': ..., 'truncated': ..., 'occluded': ..., 'alpha': ..., 'bbox': ...,
           'dimensions': ..., 'location': ..., 'rotation_y': ..., 'score': ...,
           'index': ..., 'group_ids': ..., 'difficulty': ...}}

Next comes `_calculate_num_points_in_gt(data_path, kitti_infos_train, relative_path)`.

Its prototype is:

def _calculate_num_points_in_gt(data_path,
                                infos,
                                relative_path,
                                remove_outside=True,
                                num_features=4):

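The purpose of this function is to add the annos["num_points_in_gt"] attribute, i.e. the number of lidar points that fall inside each labeled obstacle in a frame. Along the way the boxes are converted from the camera coordinate system to the lidar coordinate system.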
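The counting step can be sketched as follows. This is a simplified stand-in, not the repository's implementation: it assumes the boxes are already in lidar coordinates, that the first three numbers are the geometric center of the box, and that the yaw rotates about the z axis. The name `count_points_in_boxes` and the sample values are made up for illustration.

```python
import numpy as np


def count_points_in_boxes(points, boxes):
    """points: (N, >=3) lidar xyz; boxes: (M, 7) as [cx, cy, cz, dx, dy, dz, yaw].

    Assumption: (cx, cy, cz) is the geometric box center, yaw is about z.
    """
    counts = np.zeros(len(boxes), dtype=np.int32)
    for i, (cx, cy, cz, dx, dy, dz, yaw) in enumerate(boxes):
        shifted = points[:, :3] - np.array([cx, cy, cz])
        c, s = np.cos(yaw), np.sin(yaw)
        # rotate by -yaw so the box becomes axis-aligned in its local frame
        local_x = c * shifted[:, 0] + s * shifted[:, 1]
        local_y = -s * shifted[:, 0] + c * shifted[:, 1]
        inside = (
            (np.abs(local_x) <= dx / 2)
            & (np.abs(local_y) <= dy / 2)
            & (np.abs(shifted[:, 2]) <= dz / 2)
        )
        counts[i] = inside.sum()
    return counts


points = np.array([[10.0, 0.0, 0.0, 1.0],
                   [10.5, 0.2, 0.1, 1.0],
                   [30.0, 5.0, 0.0, 1.0]])
boxes = np.array([[10.0, 0.0, 0.0, 4.0, 2.0, 2.0, 0.0],
                  [30.0, 5.0, 0.0, 4.0, 2.0, 2.0, np.pi / 2]])
num_points_in_gt = count_points_in_boxes(points, boxes)
```

Here the first box contains two points and the second contains one.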

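The test, val, and train splits all go through similar processing, and the results are stored in pkl files. One frame's record in the final pkl looks like this: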

{'image_idx': 9, 'pointcloud_num_features': 4, 'velodyne_path': 'training/velodyne/000009.bin', 'img_path': 'training/image_2/000009.png', 'img_shape': array([ 375, 1242], dtype=int32), 'calib/P0': array([[721.5377,   0.    , 609.5593,   0.    ],
       [  0.    , 721.5377, 172.854 ,   0.    ],
       [  0.    ,   0.    ,   1.    ,   0.    ],
       [  0.    ,   0.    ,   0.    ,   1.    ]]), 'calib/P1': array([[ 721.5377,    0.    ,  609.5593, -387.5744],
       [   0.    ,  721.5377,  172.854 ,    0.    ],
       [   0.    ,    0.    ,    1.    ,    0.    ],
       [   0.    ,    0.    ,    0.    ,    1.    ]]), 'calib/P2': array([[7.215377e+02, 0.000000e+00, 6.095593e+02, 4.485728e+01],
       [0.000000e+00, 7.215377e+02, 1.728540e+02, 2.163791e-01],
       [0.000000e+00, 0.000000e+00, 1.000000e+00, 2.745884e-03],
       [0.000000e+00, 0.000000e+00, 0.000000e+00, 1.000000e+00]]), 'calib/P3': array([[ 7.215377e+02,  0.000000e+00,  6.095593e+02, -3.395242e+02],
       [ 0.000000e+00,  7.215377e+02,  1.728540e+02,  2.199936e+00],
       [ 0.000000e+00,  0.000000e+00,  1.000000e+00,  2.729905e-03],
       [ 0.000000e+00,  0.000000e+00,  0.000000e+00,  1.000000e+00]]), 'calib/R0_rect': array([[ 0.9999239 ,  0.00983776, -0.00744505,  0.        ],
       [-0.0098698 ,  0.9999421 , -0.00427846,  0.        ],
       [ 0.00740253,  0.00435161,  0.9999631 ,  0.        ],
       [ 0.        ,  0.        ,  0.        ,  1.        ]]), 'calib/Tr_velo_to_cam': array([[ 7.533745e-03, -9.999714e-01, -6.166020e-04, -4.069766e-03],
       [ 1.480249e-02,  7.280733e-04, -9.998902e-01, -7.631618e-02],
       [ 9.998621e-01,  7.523790e-03,  1.480755e-02, -2.717806e-01],
       [ 0.000000e+00,  0.000000e+00,  0.000000e+00,  1.000000e+00]]), 'calib/Tr_imu_to_velo': array([[ 9.999976e-01,  7.553071e-04, -2.035826e-03, -8.086759e-01],
       [-7.854027e-04,  9.998898e-01, -1.482298e-02,  3.195559e-01],
       [ 2.024406e-03,  1.482454e-02,  9.998881e-01, -7.997231e-01],
       [ 0.000000e+00,  0.000000e+00,  0.000000e+00,  1.000000e+00]]), 'annos': {'name': array(['Car', 'Car', 'Car', 'DontCare', 'DontCare'], dtype='<U8'), 'truncated': array([ 0.,  0.,  0., -1., -1.]), 'occluded': array([ 0,  2,  0, -1, -1]), 'alpha': array([ -1.5 ,   1.75,   1.78, -10.  , -10.  ]), 'bbox': array([[601.96, 177.01, 659.15, 229.51],
       [600.14, 177.09, 624.65, 193.31],
       [574.98, 178.64, 598.45, 194.01],
       [710.6 , 167.73, 736.68, 182.35],
       [758.52, 156.27, 782.52, 179.23]]), 'dimensions': array([[ 3.2 ,  1.61,  1.66],
       [ 3.66,  1.44,  1.61],
       [ 3.37,  1.41,  1.53],
       [-1.  , -1.  , -1.  ],
       [-1.  , -1.  , -1.  ]]), 'location': array([[ 7.000e-01,  1.760e+00,  2.388e+01],
       [ 2.400e-01,  1.840e+00,  6.637e+01],
       [-2.190e+00,  1.960e+00,  6.825e+01],
       [-1.000e+03, -1.000e+03, -1.000e+03],
       [-1.000e+03, -1.000e+03, -1.000e+03]]), 'rotation_y': array([ -1.48,   1.76,   1.75, -10.  , -10.  ]), 'score': array([0., 0., 0., 0., 0.]), 'index': array([ 0,  1,  2, -1, -1], dtype=int32), 'group_ids': array([0, 1, 2, 3, 4], dtype=int32), 'difficulty': array([ 0, -1, -1, -1, -1], dtype=int32), 'num_points_in_gt': array([215,   4,   1,  -1,  -1], dtype=int32)}

The complete PKL has been dumped to a txt file, which you can download and look through.
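If you would rather inspect a pkl directly, a small helper along these lines may be enough. It is a sketch that assumes the file is a pickled list of dicts with the keys shown in the record above; `summarize_infos` and the file name `infos_demo.pkl` are hypothetical, and the demo record is a cut-down copy of frame 9.

```python
import pickle
import tempfile
from pathlib import Path

import numpy as np


def summarize_infos(info_path):
    """Return (image_idx, velodyne_path, number of non-DontCare objects) per frame."""
    with open(info_path, "rb") as f:
        infos = pickle.load(f)
    rows = []
    for info in infos:
        names = info["annos"]["name"]
        n_real = int(np.sum(names != "DontCare"))
        rows.append((info["image_idx"], info["velodyne_path"], n_real))
    return rows


# Stand-in record with only the fields the helper reads
demo_info = {
    "image_idx": 9,
    "velodyne_path": "training/velodyne/000009.bin",
    "annos": {"name": np.array(["Car", "Car", "Car", "DontCare", "DontCare"])},
}
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "infos_demo.pkl"  # hypothetical file name
    with open(demo_path, "wb") as f:
        pickle.dump([demo_info], f)
    rows = summarize_infos(demo_path)
```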

Code, part (2)

The second part of the code reduces the point clouds:

def create_reduced_point_cloud(data_path,
                               train_info_path=None,
                               val_info_path=None,
                               test_info_path=None,
                               save_path=None,
                               with_back=False):

    # the main call
    _create_reduced_point_cloud(data_path, train_info_path, save_path)

The main function is shown below:

import pathlib
import pickle

import numpy as np
# box_np_ops and prog_bar come from the repository itself


def _create_reduced_point_cloud(data_path,
                                info_path,
                                save_path=None,
                                back=False):
    # open the pkl file saved in the previous step
    with open(info_path, 'rb') as f:
        kitti_infos = pickle.load(f)
    for info in prog_bar(kitti_infos):
        v_path = info['velodyne_path']
        v_path = pathlib.Path(data_path) / v_path
        points_v = np.fromfile(
            str(v_path), dtype=np.float32, count=-1).reshape([-1, 4])
        rect = info['calib/R0_rect']
        P2 = info['calib/P2']
        Trv2c = info['calib/Tr_velo_to_cam']
        # first remove z < 0 points
        # keep = points_v[:, -1] > 0
        # points_v = points_v[keep]
        # then remove outside.

        # back=True flips x, so the points behind the car are the ones kept
        if back:
            points_v[:, 0] = -points_v[:, 0]
        # remove the points that fall outside the view of the P2 camera
        points_v = box_np_ops.remove_outside_points(points_v, rect, Trv2c, P2,
                                                    info["img_shape"])

        if save_path is None:
            # default target: the sibling "velodyne_reduced" directory;
            # converted to str so the "_back" suffix can be appended
            save_filename = str(v_path.parent.parent /
                                (v_path.parent.stem + "_reduced") / v_path.name)
            # save_filename = str(v_path) + '_reduced'
        else:
            save_filename = str(pathlib.Path(save_path) / v_path.name)
        # the x-flipped copy gets a "_back" suffix, so it does not overwrite
        # the forward-facing reduced file
        if back:
            save_filename += "_back"
        # binary mode, since tofile writes raw float32 bytes
        with open(save_filename, 'wb') as f:
            points_v.tofile(f)

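In short, for every point cloud referenced in the previously generated pkl files, this step removes the redundant points that fall outside the field of view of the P2 camera.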
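The idea of "removing points outside the P2 camera" can be sketched with a simple project-and-test filter. This is a simplification under stated assumptions: all three matrices are the 4x4 extended versions stored in the pkl, and a point is kept when it projects in front of the camera and inside the image. The repository's `box_np_ops.remove_outside_points` may be implemented differently; `keep_points_in_image` and the toy matrices below are illustrative only.

```python
import numpy as np


def keep_points_in_image(points, rect, Trv2c, P2, img_shape):
    """points: (N, 4) lidar [x, y, z, intensity]; rect, Trv2c, P2: 4x4; img_shape: (H, W)."""
    xyz1 = np.concatenate([points[:, :3], np.ones((len(points), 1))], axis=1)
    cam = (rect @ Trv2c @ xyz1.T).T          # rectified camera coordinates
    img = (P2 @ cam.T).T                     # homogeneous pixel coordinates
    depth = img[:, 2]
    in_front = depth > 0
    safe_depth = np.where(in_front, depth, 1.0)  # avoid dividing by <= 0
    u = img[:, 0] / safe_depth
    v = img[:, 1] / safe_depth
    h, w = img_shape
    keep = in_front & (u >= 0) & (u < w) & (v >= 0) & (v < h)
    return points[keep]


# Toy calibration: lidar (x forward, y left, z up) -> camera (x right, y down, z forward)
rect = np.eye(4)
Trv2c = np.array([[0.0, -1.0, 0.0, 0.0],
                  [0.0, 0.0, -1.0, 0.0],
                  [1.0, 0.0, 0.0, 0.0],
                  [0.0, 0.0, 0.0, 1.0]])
P2 = np.array([[700.0, 0.0, 600.0, 0.0],
               [0.0, 700.0, 180.0, 0.0],
               [0.0, 0.0, 1.0, 0.0],
               [0.0, 0.0, 0.0, 1.0]])
points = np.array([[10.0, 0.0, 0.0, 0.5],    # straight ahead: kept
                   [-10.0, 0.0, 0.0, 0.5],   # behind the car: dropped
                   [5.0, 20.0, 0.0, 0.5]])   # far to the left: dropped
kept = keep_points_in_image(points, rect, Trv2c, P2, (375, 1242))
```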

Code, part (3)

Main function:

def create_groundtruth_database(data_path,
                                info_path=None,
                                used_classes=None,
                                database_save_path=None,
                                db_info_save_path=None,
                                relative_path=True,
                                lidar_only=False,
                                bev_only=False,
                                coors_range=None):

Main body:

    # root_path and database_save_path are pathlib.Path objects set up earlier
    # in the function (that setup is omitted from this excerpt)
    # open the train pkl file
    with open(info_path, 'rb') as f:
        kitti_infos = pickle.load(f)
    all_db_infos = {}
    # default to every detection class
    if used_classes is None:
        used_classes = list(kitti.get_classes())
        # drop the DontCare class
        used_classes.pop(used_classes.index('DontCare'))
    for name in used_classes:
        all_db_infos[name] = []
    group_counter = 0
    for info in prog_bar(kitti_infos):
        velodyne_path = info['velodyne_path']
        if relative_path:
            # velodyne_path = str(root_path / velodyne_path) + "_reduced"
            velodyne_path = str(root_path / velodyne_path)
        num_features = 4
        if 'pointcloud_num_features' in info:
            num_features = info['pointcloud_num_features']
        points = np.fromfile(
            velodyne_path, dtype=np.float32, count=-1).reshape([-1, num_features])

        image_idx = info["image_idx"]
        rect = info['calib/R0_rect']
        P2 = info['calib/P2']
        Trv2c = info['calib/Tr_velo_to_cam']
        if not lidar_only:
            points = box_np_ops.remove_outside_points(points, rect, Trv2c, P2,
                                                        info["img_shape"])

        annos = info["annos"]
        # obstacle class names
        names = annos["name"]
        bboxes = annos["bbox"]
        difficulty = annos["difficulty"]
        gt_idxes = annos["index"]
        # number of labeled obstacles (DontCare has index -1 and is excluded)
        num_obj = np.sum(annos["index"] >= 0)
        # 3D boxes in the camera coordinate system
        rbbox_cam = kitti.anno_to_rbboxes(annos)[:num_obj]
        # 3D boxes in the lidar coordinate system: [xyz_lidar, w, l, h, r]
        rbbox_lidar = box_np_ops.box_camera_to_lidar(rbbox_cam, rect, Trv2c)
        if bev_only: # set z and h to limits
            assert coors_range is not None
            rbbox_lidar[:, 2] = coors_range[2]
            rbbox_lidar[:, 5] = coors_range[5] - coors_range[2]

        group_dict = {}
        group_ids = np.full([bboxes.shape[0]], -1, dtype=np.int64)
        if "group_ids" in annos:
            group_ids = annos["group_ids"]
        else:
            group_ids = np.arange(bboxes.shape[0], dtype=np.int64)
        # boolean mask of shape (num_points, num_boxes): which points lie in which box
        point_indices = box_np_ops.points_in_rbbox(points, rbbox_lidar)
        for i in range(num_obj):
            # e.g. 7465_Cyclist_4.bin (image_idx is an int, so it is not zero-padded)
            filename = f"{image_idx}_{names[i]}_{gt_idxes[i]}.bin"
            filepath = database_save_path / filename
            # points inside the i-th obstacle box
            gt_points = points[point_indices[:, i]]

            # make xyz relative to the box position
            gt_points[:, :3] -= rbbox_lidar[i, :3]
            # write the points of this box to its own file (binary mode)
            with open(filepath, 'wb') as f:
                gt_points.tofile(f)
            if names[i] in used_classes:
                if relative_path:
                    db_path = str(database_save_path.stem + "/" + filename)
                else:
                    db_path = str(filepath)
                db_info = {
                    "name": names[i],
                    "path": db_path,
                    "image_idx": image_idx,
                    "gt_idx": gt_idxes[i],
                    "box3d_lidar": rbbox_lidar[i],
                    "num_points_in_gt": gt_points.shape[0],
                    "difficulty": difficulty[i],
                    # "group_id": -1,
                    # "bbox": bboxes[i],
                }

                local_group_id = group_ids[i]
                # if local_group_id >= 0:
                if local_group_id not in group_dict:
                    group_dict[local_group_id] = group_counter
                    group_counter += 1
                db_info["group_id"] = group_dict[local_group_id]
                if "score" in annos:
                    db_info["score"] = annos["score"][i]
                all_db_infos[names[i]].append(db_info)
    for k, v in all_db_infos.items():
        print(f"load {len(v)} {k} database infos")
    # write the per-obstacle info records to a file
    with open(db_info_save_path, 'wb') as f:
        pickle.dump(all_db_infos, f)

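Put simply, for every point cloud .bin file this step extracts the points inside each obstacle box, shifts their xyz to be relative to the box position, and writes them to a folder as files named like 7465_Cyclist_4.bin (image index, class name, object index). It also collects the per-obstacle information for every point cloud and saves it to kitti_dbinfos_train.pkl.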
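Reading one of these per-obstacle files back could look roughly like the sketch below. It assumes the file holds float32 rows of `num_features` values and that the stored xyz were shifted by the first three entries of `box3d_lidar`, as in the code above. The helper `load_gt_sample`, the file name, and the sample values are hypothetical.

```python
import tempfile
from pathlib import Path

import numpy as np


def load_gt_sample(bin_path, box3d_lidar, num_features=4):
    """Load a per-obstacle point file and move it back to its lidar position."""
    pts = np.fromfile(str(bin_path), dtype=np.float32).reshape(-1, num_features)
    # undo the shift that was applied before the file was written
    pts[:, :3] += np.asarray(box3d_lidar[:3], dtype=np.float32)
    return pts


# Round trip with made-up points and a made-up box
original = np.array([[10.5, 1.0, -0.5, 0.25],
                     [11.0, 1.5, -1.0, 0.5]], dtype=np.float32)
box3d_lidar = np.array([10.0, 1.0, -1.0, 1.6, 3.9, 1.5, 0.0])
with tempfile.TemporaryDirectory() as tmp:
    sample_path = Path(tmp) / "9_Car_0.bin"  # hypothetical file name
    shifted = original.copy()
    shifted[:, :3] -= box3d_lidar[:3].astype(np.float32)
    shifted.tofile(str(sample_path))
    restored = load_gt_sample(sample_path, box3d_lidar)
```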

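For details on the calib entries used here, see the KITTI calib file analysis referenced at the end of this post.

Below is the content of the calib file 00000.txt: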

P0: 7.070493000000e+02 0.000000000000e+00 6.040814000000e+02 0.000000000000e+00 0.000000000000e+00 7.070493000000e+02 1.805066000000e+02 0.000000000000e+00 0.000000000000e+00 0.000000000000e+00 1.000000000000e+00 0.000000000000e+00
P1: 7.070493000000e+02 0.000000000000e+00 6.040814000000e+02 -3.797842000000e+02 0.000000000000e+00 7.070493000000e+02 1.805066000000e+02 0.000000000000e+00 0.000000000000e+00 0.000000000000e+00 1.000000000000e+00 0.000000000000e+00
P2: 7.070493000000e+02 0.000000000000e+00 6.040814000000e+02 4.575831000000e+01 0.000000000000e+00 7.070493000000e+02 1.805066000000e+02 -3.454157000000e-01 0.000000000000e+00 0.000000000000e+00 1.000000000000e+00 4.981016000000e-03
P3: 7.070493000000e+02 0.000000000000e+00 6.040814000000e+02 -3.341081000000e+02 0.000000000000e+00 7.070493000000e+02 1.805066000000e+02 2.330660000000e+00 0.000000000000e+00 0.000000000000e+00 1.000000000000e+00 3.201153000000e-03
R0_rect: 9.999128000000e-01 1.009263000000e-02 -8.511932000000e-03 -1.012729000000e-02 9.999406000000e-01 -4.037671000000e-03 8.470675000000e-03 4.123522000000e-03 9.999556000000e-01
Tr_velo_to_cam: 6.927964000000e-03 -9.999722000000e-01 -2.757829000000e-03 -2.457729000000e-02 -1.162982000000e-03 2.749836000000e-03 -9.999955000000e-01 -6.127237000000e-02 9.999753000000e-01 6.931141000000e-03 -1.143899000000e-03 -3.321029000000e-01
Tr_imu_to_velo: 9.999976000000e-01 7.553071000000e-04 -2.035826000000e-03 -8.086759000000e-01 -7.854027000000e-04 9.998898000000e-01 -1.482298000000e-02 3.195559000000e-01 2.024406000000e-03 1.482454000000e-02 9.998881000000e-01 -7.997231000000e-01
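The pkl record shown earlier stores every calib matrix as 4x4, while the file holds 3x4 rows (3x3 for R0_rect). A minimal parser that performs that padding might look like the sketch below. It assumes each line has the form `key: numbers...`; `parse_calib_text` is an illustrative name and the demo values are made up rather than taken from a real calib file.

```python
import numpy as np


def parse_calib_text(text, extend_matrix=True):
    """Parse 'key: v1 v2 ...' lines into matrices, optionally padded to 4x4."""
    calib = {}
    for line in text.strip().splitlines():
        if ":" not in line:
            continue
        key, values = line.split(":", 1)
        nums = np.array([float(v) for v in values.split()])
        if key == "R0_rect":
            mat = nums.reshape(3, 3)
            if extend_matrix:
                ext = np.eye(4)
                ext[:3, :3] = mat
                mat = ext
        else:
            mat = nums.reshape(3, 4)
            if extend_matrix:
                # append the homogeneous row [0, 0, 0, 1]
                mat = np.vstack([mat, [0.0, 0.0, 0.0, 1.0]])
        calib["calib/" + key] = mat
    return calib


demo_text = """P2: 700 0 600 45 0 700 180 0 0 0 1 0
R0_rect: 1 0 0 0 1 0 0 0 1
Tr_velo_to_cam: 0 -1 0 0 0 0 -1 0 1 0 0 0"""
calib = parse_calib_text(demo_text)
```

With 4x4 matrices, the chain P2 @ R0_rect @ Tr_velo_to_cam can be applied to homogeneous lidar points with plain matrix multiplication.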

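Also, since my own understanding is limited, some parts may be misinterpreted; please feel free to point out any mistakes. Thanks.

Reference: calib file analysis https://blog.csdn.net/QFJIZHI/article/details/103682310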
