3D Point Cloud Object Detection with Complex-YOLO (Training, Part 1): KITTI Dataset Preprocessing and Preparation


Vanessa Ni

Category: 3D Object Detection

Article link: https://blog.csdn.net/weixin_44145782/article/details/118029730


KITTI dataset

Download dataset

KITTI 3D Object Detection Evaluation 2017 link
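Download the four parts, 41.4 GB in total.
After extraction there are four kinds of content (camera calibration matrices in calib, RGB images in image_2, labels in label_2, and point cloud data in velodyne), each split into testing and training data. The training set contains 7,481 frames (each image is paired with the point cloud of the same scene), and the testing set contains 7,518 frames (with no label_2 data).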
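Before preprocessing, it can help to confirm that the extracted folders line up. The following is a minimal sketch, not part of the original project: the folder layout in `EXPECTED` and the helper names `count_frames` and `is_consistent` are assumptions for illustration, so adjust them to wherever you extracted the archives.

```python
from pathlib import Path

# Assumed layout after extraction (adjust to your own setup):
#   <root>/training/{calib,image_2,label_2,velodyne}
#   <root>/testing/{calib,image_2,velodyne}      (no label_2 for testing)
EXPECTED = {
    "training": ["calib", "image_2", "label_2", "velodyne"],
    "testing": ["calib", "image_2", "velodyne"],
}

def count_frames(root):
    """Return {split: {subfolder: number_of_files}}; a missing folder counts as 0."""
    root = Path(root)
    counts = {}
    for split, subdirs in EXPECTED.items():
        counts[split] = {}
        for sub in subdirs:
            folder = root / split / sub
            if folder.is_dir():
                counts[split][sub] = sum(1 for p in folder.iterdir() if p.is_file())
            else:
                counts[split][sub] = 0
    return counts

def is_consistent(counts):
    """True if every subfolder within a split holds the same number of files."""
    return all(len(set(per_split.values())) == 1 for per_split in counts.values())
```

With a complete download, every training subfolder should report 7,481 files and every testing subfolder 7,518, matching the figures above.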

Data Preprocessing

Part 1 3D Point Cloud Data Cropping

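Clone this TensorFlow implementation of VoxelNet, which contains the relevant data preprocessing code: voxelnet
This TensorFlow version of VoxelNet first crops every point cloud in the training set. Concretely, it uses the camera calibration matrices stored in calib to project the 3D points onto the 2D RGB image: Tr_velo_to_cam maps the 3D point cloud coordinates into the 3D coordinate system of camera 0, R_rect then rectifies the result so that the image planes of the multiple cameras lie in the same plane, and finally the projection matrix P of the target camera projects the points onto that camera's image plane. Points that fall outside the 2D RGB image are removed.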
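The cropping step described above can be sketched as follows. This is an illustration under stated assumptions rather than the project's actual code: the function name `crop_to_image` is made up here, both `R0_rect` and `Tr_velo_to_cam` are assumed to be already padded to 4x4, and points behind the camera are rejected using the sign of the projected depth.

```python
import numpy as np

def crop_to_image(points, P, R0_rect, Tr_velo_to_cam, width, height):
    """Keep only the lidar points that project inside a width x height image.

    points:         (N, 4) array of x, y, z, intensity in the lidar frame
    P:              (3, 4) projection matrix of the target camera
    R0_rect:        (4, 4) rectification matrix (assumed already padded)
    Tr_velo_to_cam: (4, 4) lidar -> camera 0 transform (assumed already padded)
    """
    n = points.shape[0]
    # Homogeneous lidar coordinates (x, y, z, 1), arranged as columns: (4, N)
    xyz1 = np.hstack([points[:, :3], np.ones((n, 1))]).T
    # P * R0_rect * Tr_velo_to_cam * point, for all points at once: (3, N)
    proj = P @ R0_rect @ Tr_velo_to_cam @ xyz1
    depth = proj[2]
    in_front = depth > 0  # points behind the camera cannot appear in the image

    # Perspective division, done only where the depth is positive
    u = np.full(n, -1.0)
    v = np.full(n, -1.0)
    u[in_front] = proj[0, in_front] / depth[in_front]
    v[in_front] = proj[1, in_front] / depth[in_front]

    inside = in_front & (u >= 0) & (u < width) & (v >= 0) & (v < height)
    return points[inside]
```

The intensity column is carried along untouched, so the cropped array can be written back to a .bin file in the same four-column layout.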

Two points need attention here.

Image reading: with scipy 1.6.3, imread can no longer be imported. There are two fixes: 1) downgrade scipy, or 2) switch libraries and use from imageio import imread

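Output file names: keep them consistent with the originals, i.e. six digits, zero-padded on the left
output_name = PC_NEW_ROOT + str(frame).zfill(6) + '.bin'
print(output_name)
# str(frame): int -> str
# zfill(6): pad with leading zeros to six digits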
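As a small variation on the naming rule above, the sketch below builds the same six-digit file name with `os.path.join`, so it does not matter whether the output root ends with a path separator. The helper name `output_path` and the `root` argument are hypothetical and stand in for PC_NEW_ROOT.

```python
import os

def output_path(root, frame):
    """Build '<root>/<frame as six zero-padded digits>.bin'."""
    # f"{frame:06d}" is equivalent to str(frame).zfill(6) for non-negative ints
    return os.path.join(root, f"{frame:06d}.bin")
```

String concatenation, as in the original line, works just as well as long as the root path ends with a separator.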
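If you want to reproduce VoxelNet, the parts of this project that were originally generated from C need to be deleted and then regenerated on your own machine. Also, when pip install is slow (switching the package index is the only fix), I strongly recommend the USTC mirror: https://pypi.mirrors.ustc.edu.cn/simple/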
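Returning to the image-reading issue mentioned earlier: if the same script has to run in environments with different library versions, one option is to look up whichever imread is available at runtime. This is only a sketch; the helper `resolve_imread` is hypothetical and is not part of scipy, imageio, or the VoxelNet project.

```python
import importlib

def resolve_imread(candidates=("scipy.misc", "imageio")):
    """Return the first available `imread` function, or None if none is found.

    Each candidate module is tried in order. A module that cannot be imported,
    or that no longer exposes `imread`, is skipped.
    """
    for name in candidates:
        try:
            module = importlib.import_module(name)
        except ImportError:
            continue
        reader = getattr(module, "imread", None)
        if callable(reader):
            return reader
    return None
```

For a single machine, the one-line `from imageio import imread` replacement described above is simpler and is what the walkthrough below uses.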

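A walkthrough of Crop.py follows. I recommend stepping through the whole flow once with breakpoints in a debugger; the intermediate matrices become very clear that way.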

import numpy as np
from imageio import imread
# imageio is used here for reading images

CAM = 2 # the points are finally projected onto camera 2 (options: 0, 1, 2, 3)

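def load_velodyne_points(filename):
    # Read the data in one .bin file; 000000.bin, for example, holds roughly 19,000 points, i.e. about 19,000 rows.
    # The 4 columns are x, y, z, intensity. Intensity is the return strength, which depends on the
    # distance between the object and the lidar and on the reflectivity of the object itself.
    points = np.fromfile(filename, dtype=np.float32).reshape(-1, 4)
    return points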

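def load_calib(calib_dir):
    # P2 * R0_rect * Tr_velo_to_cam * y
    # The projection is ultimately done with the transform above

    # Read the file line by line, drop each line's leading label, and drop the last line
    with open(calib_dir) as f:
        lines = f.readlines()
    lines = [ line.split()[1:] for line in lines ][:-1]

    # Projection matrix, 3x4
    # The final target is camera 2, i.e. the left color camera
    P = np.array(lines[CAM], dtype=np.float32).reshape(3,4)

    # Tr_velo_to_cam, 4x4 once the row [0, 0, 0, 1] is appended
    # Transforms the lidar coordinate system into the camera 0 coordinate system (both are 3D)
    Tr_velo_to_cam = np.array(lines[5], dtype=np.float32).reshape(3,4)
    Tr_velo_to_cam = np.concatenate(  [ Tr_velo_to_cam, np.array([0,0,0,1]).reshape(1,4)  ]  , 0     )

    # R_cam_to_rect, 4x4
    # Rectification: places the image planes of the multiple cameras in the same plane
    R_cam_to_rect = np.eye(4)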
    R_cam_to_rect[: