KITTI Dataset: LiDAR-to-Image Coordinate Transformations

幸福回头

Categories: Deep Learning, Python. Tags: autonomous driving

Article link: https://blog.csdn.net/zt1091574181/article/details/114838741

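I spent quite a while working out the coordinate transformations in KITTI and could not find a detailed explanation online, so I am writing down what I have learned here for reference.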

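Three coordinate systems are involved in the KITTI dataset:

1. The LiDAR (velodyne) coordinate system (the blue axes in Figure 1)

2. The camera coordinate system (the red axes in Figure 1)

3. The image coordinate system (the camera image shown in Figure 2)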

Figure 1

Figure 2

Coordinate transformations in KITTI rely on the following two folders:

label_2 and calib

label_2 holds the KITTI annotations. A sample file:

Truck 0.00 0 -1.57 599.41 156.40 629.75 189.25 2.85 2.63 12.34 0.47 1.49 69.44 -1.56
Car 0.00 0 1.85 387.63 181.54 423.81 203.12 1.67 1.87 3.69 -16.53 2.39 58.49 1.57
Cyclist 0.00 3 -1.65 676.60 163.95 688.98 193.93 1.86 0.60 2.02 4.59 1.32 45.84 -1.55
DontCare -1 -1 -10 503.89 169.71 590.61 190.13 -1 -1 -1 -1000 -1000 -1000 -10
DontCare -1 -1 -10 511.35 174.96 527.81 187.45 -1 -1 -1 -1000 -1000 -1000 -10
DontCare -1 -1 -10 532.37 176.35 542.68 185.27 -1 -1 -1 -1000 -1000 -1000 -10
DontCare -1 -1 -10 559.62 175.83 575.40 183.15 -1 -1 -1 -1000 -1000 -1000 -10

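Each line of an annotation file contains the following space-separated fields, in this order:

1. type: object class, e.g. Car, Truck, Cyclist, DontCare
2. truncated: float from 0 (not truncated) to 1 (truncated), describing how far the object extends beyond the image boundary
3. occluded: integer 0 to 3 (0 fully visible, 1 partly occluded, 2 largely occluded, 3 unknown)
4. alpha: observation angle of the object, in [-pi, pi]
5. bbox: 2D bounding box in the image (left, top, right, bottom), in pixels
6. dimensions: 3D object size (height, width, length), in meters
7. location: 3D object position (x, y, z) in camera coordinates, in meters
8. rotation_y: rotation around the camera Y axis, in [-pi, pi]

DontCare regions carry placeholder values (-1, -10, -1000) in the fields that do not apply.

Note that location is annotated in camera coordinates, not LiDAR coordinates. Any conversion between the LiDAR frame, the camera frame, and the image therefore needs the calib file.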
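To make the field order concrete, here is a minimal parsing sketch. It assumes the 15-field layout of the KITTI object development kit (type, truncated, occluded, alpha, 4 bbox values, 3 dimensions, 3 location values, rotation_y); the names `KittiLabel` and `parse_label_line` are hypothetical and not part of any KITTI tool.

```python
from typing import NamedTuple, Tuple


class KittiLabel(NamedTuple):
    """One line of a label_2 file (hypothetical container, for illustration)."""
    obj_type: str
    truncated: float
    occluded: int
    alpha: float
    bbox: Tuple[float, ...]        # (x1, y1, x2, y2) in pixels
    dimensions: Tuple[float, ...]  # (height, width, length) in meters
    location: Tuple[float, ...]    # (x, y, z) in camera coordinates, meters
    rotation_y: float


def parse_label_line(line: str) -> KittiLabel:
    fields = line.split()
    values = [float(v) for v in fields[1:15]]  # the 14 numeric fields
    return KittiLabel(
        obj_type=fields[0],
        truncated=values[0],
        occluded=int(values[1]),
        alpha=values[2],
        bbox=tuple(values[3:7]),
        dimensions=tuple(values[7:10]),
        location=tuple(values[10:13]),
        rotation_y=values[13],
    )


# The "Car" line from the sample annotation file above.
car = parse_label_line(
    "Car 0.00 0 1.85 387.63 181.54 423.81 203.12 1.67 1.87 3.69 -16.53 2.39 58.49 1.57"
)
print(car.obj_type, car.location)
```

Keeping location and bbox as separate tuples makes it harder to mix up the camera-frame position (meters) with the image-frame box (pixels).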

A sample calib file:

P0: 7.070493000000e+02 0.000000000000e+00 6.040814000000e+02 0.000000000000e+00 0.000000000000e+00 7.070493000000e+02 1.805066000000e+02 0.000000000000e+00 0.000000000000e+00 0.000000000000e+00 1.000000000000e+00 0.000000000000e+00
P1: 7.070493000000e+02 0.000000000000e+00 6.040814000000e+02 -3.797842000000e+02 0.000000000000e+00 7.070493000000e+02 1.805066000000e+02 0.000000000000e+00 0.000000000000e+00 0.000000000000e+00 1.000000000000e+00 0.000000000000e+00
P2: 7.070493000000e+02 0.000000000000e+00 6.040814000000e+02 4.575831000000e+01 0.000000000000e+00 7.070493000000e+02 1.805066000000e+02 -3.454157000000e-01 0.000000000000e+00 0.000000000000e+00 1.000000000000e+00 4.981016000000e-03
P3: 7.070493000000e+02 0.000000000000e+00 6.040814000000e+02 -3.341081000000e+02 0.000000000000e+00 7.070493000000e+02 1.805066000000e+02 2.330660000000e+00 0.000000000000e+00 0.000000000000e+00 1.000000000000e+00 3.201153000000e-03
R0_rect: 9.999128000000e-01 1.009263000000e-02 -8.511932000000e-03 -1.012729000000e-02 9.999406000000e-01 -4.037671000000e-03 8.470675000000e-03 4.123522000000e-03 9.999556000000e-01
Tr_velo_to_cam: 6.927964000000e-03 -9.999722000000e-01 -2.757829000000e-03 -2.457729000000e-02 -1.162982000000e-03 2.749836000000e-03 -9.999955000000e-01 -6.127237000000e-02 9.999753000000e-01 6.931141000000e-03 -1.143899000000e-03 -3.321029000000e-01
Tr_imu_to_velo: 9.999976000000e-01 7.553071000000e-04 -2.035826000000e-03 -8.086759000000e-01 -7.854027000000e-04 9.998898000000e-01 -1.482298000000e-02 3.195559000000e-01 2.024406000000e-03 1.482454000000e-02 9.998881000000e-01 -7.997231000000e-01

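P0, P1, P2 and P3 are the projection matrices of the four cameras on the KITTI recording platform (cameras 0 and 1 are grayscale, 2 and 3 are color). Each has 12 numbers, reshaped to (3, 4).

R0_rect: 9 numbers, reshaped to (3, 3), then zero-padded to (4, 4) with the bottom-right element set to 1.

Tr_velo_to_cam: 12 numbers, reshaped to (3, 4), then zero-padded to (4, 4) with the bottom-right element set to 1.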
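The reshape-and-pad rules above can be sketched as follows. This is an illustrative helper only: `parse_calib` is a hypothetical name, and the text it parses is the three relevant lines of the sample calib file, written with fewer trailing zeros.

```python
import numpy as np

# Three lines from the sample calib file in this article (same values).
SAMPLE_CALIB = """\
P2: 7.070493e+02 0.0 6.040814e+02 4.575831e+01 0.0 7.070493e+02 1.805066e+02 -3.454157e-01 0.0 0.0 1.0 4.981016e-03
R0_rect: 9.999128e-01 1.009263e-02 -8.511932e-03 -1.012729e-02 9.999406e-01 -4.037671e-03 8.470675e-03 4.123522e-03 9.999556e-01
Tr_velo_to_cam: 6.927964e-03 -9.999722e-01 -2.757829e-03 -2.457729e-02 -1.162982e-03 2.749836e-03 -9.999955e-01 -6.127237e-02 9.999753e-01 6.931141e-03 -1.143899e-03 -3.321029e-01
"""


def parse_calib(text):
    """Return (P2, R0_rect, Tr_velo_to_cam) as (3,4), (4,4), (4,4) arrays."""
    raw = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # skip blank lines
        key, values = line.split(":", 1)
        raw[key.strip()] = np.array(values.split(), dtype=float)

    P2 = raw["P2"].reshape(3, 4)

    # Start from the identity so the padding is zeros with a 1 at [3, 3].
    R0_rect = np.eye(4)
    R0_rect[:3, :3] = raw["R0_rect"].reshape(3, 3)

    Tr_velo_to_cam = np.eye(4)
    Tr_velo_to_cam[:3, :] = raw["Tr_velo_to_cam"].reshape(3, 4)
    return P2, R0_rect, Tr_velo_to_cam


P2, R0_rect, Tr_velo_to_cam = parse_calib(SAMPLE_CALIB)
print(P2.shape, R0_rect.shape, Tr_velo_to_cam.shape)
```

Starting from `np.eye(4)` gives the same result as allocating zeros and then setting the bottom-right element to 1.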

The coordinate transformations are as follows:

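Let y be a point in the LiDAR coordinate system in homogeneous form (x, y, z, 1). The raw point cloud stores (x, y, z, r), where r is the reflectance, so r has to be replaced by 1 before the matrices are applied.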

Tr_velo_to_cam * y: transforms the LiDAR point y into the camera coordinate system.

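R0_rect * Tr_velo_to_cam * y: transforms the LiDAR point into the rectified camera coordinate system, which is the frame that the label locations and the projection matrix P2 refer to. The result has the form (x, y, z, 1); the first three components are the camera-frame coordinates. A result with z < 0 means the point lies behind the camera.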
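A minimal sketch of this step, using the matrix values from the sample calib file above. The helper names `pad_to_4x4` and `velo_to_rect_cam` and the two test points are illustrative assumptions.

```python
import numpy as np

# Values copied from the sample calib file in this article.
R0 = np.array([
    [ 9.999128e-01, 1.009263e-02, -8.511932e-03],
    [-1.012729e-02, 9.999406e-01, -4.037671e-03],
    [ 8.470675e-03, 4.123522e-03,  9.999556e-01],
])
TR = np.array([
    [ 6.927964e-03, -9.999722e-01, -2.757829e-03, -2.457729e-02],
    [-1.162982e-03,  2.749836e-03, -9.999955e-01, -6.127237e-02],
    [ 9.999753e-01,  6.931141e-03, -1.143899e-03, -3.321029e-01],
])


def pad_to_4x4(m):
    """Zero-pad a (3,3) or (3,4) matrix to (4,4) with a 1 at the bottom right."""
    out = np.eye(4)
    out[:m.shape[0], :m.shape[1]] = m
    return out


def velo_to_rect_cam(points_xyz):
    """Map (n,3) LiDAR points to (n,3) rectified camera coordinates."""
    pts = np.asarray(points_xyz, dtype=float).reshape(-1, 3)
    homo = np.hstack((pts, np.ones((pts.shape[0], 1))))  # (x, y, z, 1)
    cam = (pad_to_4x4(R0) @ pad_to_4x4(TR) @ homo.T).T   # (n, 4)
    return cam[:, :3]


# One point 10 m along the LiDAR +x axis and one 10 m along -x.
points = [[10.0, 0.0, 0.0], [-10.0, 0.0, 0.0]]
cam_xyz = velo_to_rect_cam(points)
in_front = cam_xyz[:, 2] > 0  # keep only points with positive camera depth
print(cam_xyz)
print(in_front)
```

With this calibration the LiDAR x axis maps almost exactly onto the camera z axis, so the first point ends up in front of the camera and the second behind it.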

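P2 * R0_rect * Tr_velo_to_cam * y: projects the LiDAR point into the image captured by camera 2. The result has the form (u, v, w). Note: u and v must be divided by w and then rounded to obtain the final pixel coordinates.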
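The full chain, including the division by w, might look like the sketch below. The matrices are the sample calib values from this article; `velo_to_pixel` is a hypothetical helper name, and the check w > 0 is used here as the "in front of the camera" test.

```python
import numpy as np

# Values copied from the sample calib file in this article.
P2 = np.array([
    [7.070493e+02, 0.0, 6.040814e+02,  4.575831e+01],
    [0.0, 7.070493e+02, 1.805066e+02, -3.454157e-01],
    [0.0, 0.0, 1.0, 4.981016e-03],
])
R0 = np.array([
    [ 9.999128e-01, 1.009263e-02, -8.511932e-03],
    [-1.012729e-02, 9.999406e-01, -4.037671e-03],
    [ 8.470675e-03, 4.123522e-03,  9.999556e-01],
])
TR = np.array([
    [ 6.927964e-03, -9.999722e-01, -2.757829e-03, -2.457729e-02],
    [-1.162982e-03,  2.749836e-03, -9.999955e-01, -6.127237e-02],
    [ 9.999753e-01,  6.931141e-03, -1.143899e-03, -3.321029e-01],
])


def pad_to_4x4(m):
    out = np.eye(4)
    out[:m.shape[0], :m.shape[1]] = m
    return out


def velo_to_pixel(points_xyz):
    """Project (n,3) LiDAR points to (n,2) float pixel coordinates.

    Also returns a boolean mask that is True where w > 0, i.e. where the
    point is in front of the camera and the projection is meaningful.
    """
    pts = np.asarray(points_xyz, dtype=float).reshape(-1, 3)
    homo = np.hstack((pts, np.ones((pts.shape[0], 1))))
    uvw = (P2 @ pad_to_4x4(R0) @ pad_to_4x4(TR) @ homo.T).T  # (n, 3)
    in_front = uvw[:, 2] > 0
    uv = uvw[:, :2] / uvw[:, 2:3]  # divide u and v by w
    return uv, in_front


uv, in_front = velo_to_pixel([[10.0, 0.0, 0.0]])
pixels = np.round(uv).astype(int)  # final integer pixel coordinates
print(uv, pixels, in_front)
```

Multiplying the three matrices once and reusing the (3, 4) product is cheaper when many points share the same calibration; the step-by-step form is kept here for readability.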

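A 3D bounding box has 8 corner points before projection, and projecting them gives 8 points in image coordinates. Taking the minimum and maximum of these 8 points along each axis gives (x1, y1, x2, y2), which is the final 2D bounding box.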
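As a small illustration of the min/max step, the sketch below turns already-projected corners into 2D boxes. The function name `corners_to_bbox` and the corner values are made up for the example, and the optional clipping to the image size is an addition that the description above does not require.

```python
import numpy as np


def corners_to_bbox(corners_2d, img_w=None, img_h=None):
    """Reduce (n,8,2) projected corners to (n,4) boxes (x1, y1, x2, y2)."""
    corners_2d = np.asarray(corners_2d, dtype=float)
    x1y1 = corners_2d.min(axis=1)  # smallest u and v over the 8 corners
    x2y2 = corners_2d.max(axis=1)  # largest u and v over the 8 corners
    boxes = np.hstack((x1y1, x2y2))
    if img_w is not None and img_h is not None:
        # Optional: clamp boxes that extend past the image border.
        boxes[:, [0, 2]] = np.clip(boxes[:, [0, 2]], 0, img_w - 1)
        boxes[:, [1, 3]] = np.clip(boxes[:, [1, 3]], 0, img_h - 1)
    return boxes


# Eight made-up pixel positions standing in for one projected 3D box.
corners = [[
    [100, 50], [140, 50], [140, 90], [100, 90],
    [110, 60], [150, 60], [150, 100], [110, 100],
]]
boxes = corners_to_bbox(corners)
print(boxes)
```

The resulting box is the tightest axis-aligned rectangle around the projected corners, so it is usually somewhat larger than a hand-annotated 2D box around the visible object.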

Reference code for the conversion:

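import os

import numpy as np


def velodyne2img(calib_dir, img_id, velo_box):
    """
    Project 3D boxes from the velodyne frame to 2D boxes in the camera-2 image.

    :param calib_dir: directory containing the calib files
    :param img_id: id of the image to convert (calib file name without extension)
    :param velo_box: (n,8,4) array of n 3D boxes in the velodyne frame; 8 corners per box,
                     each corner in homogeneous form (x,y,z,1)
    :return: (n,4) array of 2D boxes (x1,y1,x2,y2) in the image frame
    """
    # Read the transformation matrices from the calib file
    calib_txt = os.path.join(calib_dir, img_id) + '.txt'
    with open(calib_txt, 'r') as f:
        calib_lines = [line.strip() for line in f]
    for calib_line in calib_lines:
        if calib_line.startswith('P2:'):
            P2 = calib_line.split()[1:]
            P2 = np.array(P2, dtype='float').reshape(3, 4)
        elif calib_line.startswith('R0_rect:'):
            # (3,3) rotation, zero-padded to (4,4) with a 1 at the bottom right
            R0_rect = np.zeros((4, 4))
            R0 = calib_line.split()[1:]
            R0 = np.array(R0, dtype='float').reshape(3, 3)
            R0_rect[:3, :3] = R0
            R0_rect[-1, -1] = 1
        elif calib_line.startswith('Tr_velo_to_cam:'):
            # (3,4) rigid transform, padded to (4,4) with a 1 at the bottom right
            velo_to_cam = np.zeros((4, 4))
            velo2cam = calib_line.split()[1:]
            velo2cam = np.array(velo2cam, dtype='float').reshape(3, 4)
            velo_to_cam[:3, :] = velo2cam
            velo_to_cam[-1, -1] = 1

    tran_mat = P2.dot(R0_rect).dot(velo_to_cam)  # 3x4

    # Flatten all corners to (4, n*8), project, and restore the (n,8,3) layout
    velo_box = np.asarray(velo_box, dtype='float').reshape(-1, 4).T
    img_box = np.dot(tran_mat, velo_box).T
    img_box = img_box.reshape(-1, 8, 3)

    # Divide u and v by w; this assumes every corner is in front of the camera (w > 0)
    img_box[:, :, 0] = img_box[:, :, 0] / img_box[:, :, 2]
    img_box[:, :, 1] = img_box[:, :, 1] / img_box[:, :, 2]
    img_box = img_box[:, :, :2]  # (n,8,2)

    # The 2D box is the min/max over the 8 projected corners
    x1y1 = np.min(img_box, axis=1)
    x2y2 = np.max(img_box, axis=1)
    result = np.hstack((x1y1, x2y2))  # (n,4)

    return result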

This article draws on several other blog posts; thanks to their authors for sharing!
