Pointpillars（三）工程实践

逍遥郎wj

已于 2025-06-10 12:06:18 修改

阅读量741

点赞数 21

分类专栏： 3D感知 Pointpillars 文章标签：自动驾驶目标检测

于 2025-05-06 15:25:30 首次发布

本文链接：https://blog.csdn.net/xiaoyaolangwj/article/details/147669103

版权

3D感知同时被 2 个专栏收录

5 篇文章

订阅专栏

Pointpillars

3 篇文章

订阅专栏

一、点云标注

1、力推-CVAT

这个github网址统计了标注工具。

https://github.com/HumanSignal/awesome-data-labeling

按照下面这个网址的推荐：我尝试了两种标注工具。实测心得如下：

https://zhuanlan.zhihu.com/p/694524430

labelCloud标注时，对于点云放大缩小不敏感，特别慢，而且调整框的大小，需要手动修改右边的数字，不够精确。而且输出多次标注文件为空。
- 如果是通过pip安装的，特别是在conda环境下：那么软件的配置文件路径为：
- 通过pip show labelCloud来定位安装路径。/home/vincent/.conda/envs/labelcloud/lib/python3.9/site-packages/labelCloud
- 然后在安装路径下搜索文件。
- 参考下面链接评论区的指引：
- 利用 labelCloud 开源工具标注自己的点云数据集为KITTI标注格式教程（支持pcd、bin格式点云）_如何标注kitti数据集-CSDN博客y
- 要在路径中所有文件中找到near_plane和far_plane，因此搜索：
- grep "plane" ./* -r
- 定位路径为：./resources/default_config.ini:，可以找到软件的配置选项。
- 如果是git源码安装路径为labelCloud下的resources路径下default_config.ini。
cvat网页版的，非常方便。用github登录注册一下就可以了。在线管理方便。唯一限制就是上传文件必须是zip格式的，并且必须不超过4个G。

3D点云目标检测数据集标注工具保姆级教程——CVAT （附json转kitti代码）_3d点云标注工具-CSDN博客 这篇文章讲标注，讲的很好。也包含了，如何将json文件，转为kitti格式的txt标注文件。

cvat官网Leading Image & Video Data Annotation Platform | CVAT一看就是正经大工具。

也有官网手册Documentation | CVAT

这个网址对cvat做了简介：CVAT——计算机视觉标注工具-CSDN博客

4.1.CVAT——目标检测的标注详细步骤_cvat标注工具教程-CSDN博客

2、建议尝试-SUSTechPOINTS

这个软件最强大的地方在于：按住ctrl键，直接鼠标画框就可以自动标注，然后在左侧调整一下边界即可，如果标定效果不好，还可以点击小灯泡，让他继续自动匹配。拖动俯视图上的箭头还可以实现旋转。让标定工作大大简化。

SUSTechPOINTS三维点云标注工具使用-CSDN博客

这篇文章也包含了将label转成OpenPCDet的代码。SUSTech代表的应该是南方科技大学。

中文说明https://github.com/naurril/SUSTechPOINTS/blob/dev-auto-annotate/README_guide.md可参照教程：下面这个链接更正经。【OpenPCDet】自定义数据集（kitti格式）训练PointPillars并评估&可视化，全过程debug_pointpillars训练自己的数据-CSDN博客

SUSTechPOINTS三维点云标注工具使用-CSDN博客

我在安装时：如果是python=3.9的conda环境，无法启动。建议采用3.8的conda环境。

SUSTechPOINTS/README_cn.md at dev-auto-annotate · naurril/SUSTechPOINTS · GitHub

SUSTechPOINTS/README_guide.md at dev-auto-annotate · naurril/SUSTechPOINTS · GitHub

官网给的这两个教程可以详细读一读。

二、工程实施-带着问题前行

GPU租赁平台：

AutoDL算力云 | 弹性、好用、省钱。租GPU就上AutoDL

橘皮优 | 全球C2C GPU算力租用平台

智星云 AI Galaxy | GPU云服务器 GPU服务器租用远程GPU租用深度学习服务器 | 免费GPU 便宜GPU

工程（三）——多激光雷达标定、融合、聚类

1、把当前跑通的conda环境复制出来。

conda activate pcdet
conda env export > environment.yml

# 将这个yml文件拷贝到新的Anaconda环境中。然后基于这个文件，创建新的conda环境。

conda env create -f environment.yml
conda activate pcdet
conda list
conda env list

或者手动复制~/anaconda3/envs/myenv（Linux/Mac）

2、如何训练自己的训练集

3D目标检测（4）：OpenPCDet训练篇--自定义数据集 (二十五)实践出真知——OpenPCDet 制作pointpillars自定义数据集 - 梦里寻梦的文章 - 知乎

上面这个链接也提及了SUSTechPOINTS工具。

根据自己需求，调整点云范围，需要注意的是x和y轴范围与voxel_size的比值需要是16的倍数，如（69.12+69.12）/0.16 /16= 54

里面提及400张标注还是不太够。

评论区的内容值得学习：数据泛化、标注前转车身坐标系。

3D目标检测（4）：OpenPCDet训练篇--自定义数据集https://zhuanlan.zhihu.com/p/407302009这个知乎作者程法说也是跑通的人。注意他的评论区。

同时讲解了对于NX设备的环境部署问题。

自动驾驶系统进阶与项目实战（十）基于PointPillars的点云三维目标检测和TensorRT实战（1）_tensorrt pointpillars-CSDN博客

3、只有激光雷达坐标系如何变换。

kitti这个标注，它是以相机2坐标系为基准的。

4、如何训练。

5、如何连续检测。

6、检测结果如何通过ros2的topic形式发送出去。

7、显示问题。如何把检测框体，通过汽车模型的方式替换掉。

8、如何筛选可行驶区域。

9、如何渲染整个环境。

10、地面问题处理。

三、思想实验

上面都是查找的别人如何做的。实操也很难自己收集数据，挨个标定。来测试。

那么这个操作就需要小批量的操作。

1、数据集：

第一步获取自己的数据：我们就用kitti中的点云数据。

我们选择其training中的前一千帧作为训练集，测试集中的前一千帧作为测试集。

2、标注

标注文件相应的拷贝过来，这样就构建了自己的训练集。

3、标注转换

我们用training中的训练集，比如选择中间的10帧数据。放在CVAT或者SUSTechPOINTS下去标注形成JSON文件，然后将JSON文件转为OPenPCDet可以训练的txt文件。

检查对比自己标注之后生成的txt与原始txt的数值差异。

4、训练

采用以上只有点云和标注文件的数据集。没有任何图像的情况下，训练模型。

当前数量只有1000个。然后尝试将训练集提升到2000个，3000个，4000个，测试集一次增长500，看看效果。记录训练时长和评价指标。

四、实操

官网文档：OpenPCDet/docs/CUSTOM_DATASET_TUTORIAL.md at master · open-mmlab/OpenPCDet · GitHub

1、构建数据集

先查看OPenPCDet对于自定义数据集的目录要求。按照路径生成自己的数据集文档。详情参照二、训练自己的模型中提及的链接:

【OpenPCDet】自定义数据集（kitti格式）训练PointPillars并评估&可视化，全过程debug_pointpillars训练自己的数据-CSDN博客

custom
├── testing
│   ├── velodyne  # 点云数据
├── training
│   ├── label_2  # 标签文件
│   ├── velodyne

其中点云数据的格式是bin文件。如果采集到的数据是rosbag包，则先进行bag2pcd，然后pcd2bin。得到bin格式的点云数据。代码如下：

bag2pcd.py:

import rospy
import rosbag
from sensor_msgs.msg import PointCloud2
from sensor_msgs import point_cloud2
import struct
 
def pointcloud2_to_pcd(point_cloud2_msg, filename):
    # 更新头部信息以包含intensity
    header = f"""# .PCD v0.7 - Point Cloud Data file format
VERSION 0.7
FIELDS x y z intensity
SIZE 4 4 4 4
TYPE F F F F
COUNT 1 1 1 1
WIDTH {point_cloud2_msg.width}
HEIGHT {point_cloud2_msg.height}
VIEWPOINT 0 0 0 1 0 0 0
POINTS {point_cloud2_msg.width * point_cloud2_msg.height}
DATA ascii
"""
 
    # 将点云数据（包括intensity）转换为ASCII格式并保存到PCD文件
    with open(filename, 'w') as f:
        f.write(header)
        for p in point_cloud2.read_points(point_cloud2_msg, field_names=("x", "y", "z", "intensity"), skip_nans=True):
            f.write(f"{' '.join(str(value) for value in p)}\n")
 
def main():
    bag_file = '/home/dxy/SUSTechPOINTS/备份/rosbag/train.bag'
    topic = '/velodyne_points'
    output_directory = '/home/dxy/SUSTechPOINTS/备份/lidar/'
    frame_count = 0
 
    with rosbag.Bag(bag_file, 'r') as bag:
        for topic, msg, t in bag.read_messages(topics=[topic]):
            if 1:
                filename = f"{output_directory}{t.to_nsec()}.pcd"
                pointcloud2_to_pcd(msg, filename)
                frame_count += 1
                print(f"Processed frame {frame_count}: Saved {filename}")
            else:
                print(f"Message is not of type PointCloud2: {type(msg)}")
 
    print(f"Total frames processed: {frame_count}")
 
if __name__ == "__main__":
    main()

pcd2bin.py:

import os
import numpy as np
 
def read_pcd(filepath):
    lidar = []
    header_passed = False
    with open(filepath, 'r') as f:
        for line in f:
            line = line.strip()
            if line.startswith('DATA'):
                header_passed = True
                continue
            if header_passed:
                linestr = line.split()
                if len(linestr) == 3:
                    linestr_convert = list(map(float, linestr)) + [1.0]
                    linestr_convert[2] += 0
                    print("!!!!!!!!!!!!!!!!!!!!!!ERROR")
                    lidar.append(linestr_convert)
                elif len(linestr) == 4:
                    linestr_convert = list(map(float, linestr))
                    linestr_convert[2] += 0
                    lidar.append(linestr_convert)
    return np.array(lidar)
 
def pcd2bin(pcdfolder, binfolder, start_idx, end_idx):
    ori_path = pcdfolder
    des_path = binfolder
    if not os.path.exists(des_path):
        os.makedirs(des_path)
 
    for idx in range(start_idx, end_idx + 1):
        filename = f"{idx:06d}"  # 格式化文件名，确保是六位数字，例如000001
        velodyne_file = os.path.join(ori_path, filename + '.pcd')
        if os.path.exists(velodyne_file):  # 确保文件存在
            pl = read_pcd(velodyne_file)
            pl = pl.reshape(-1, 4).astype(np.float32)
            velodyne_file_new = os.path.join(des_path, filename + '.bin')
            pl.tofile(velodyne_file_new)
        else:
            print(f"File not found: {velodyne_file}")
 
if __name__ == "__main__":
    pcdfolder = "/home/dxy/SUSTechPOINTS/data/备份/lidar_copy"
    binfolder = "/home/dxy/SUSTechPOINTS/data/备份/lidar_bin"
    
    # 可以在这里设置开始和结束的帧
    start_frame = 1
    end_frame = 35
    
    pcd2bin(pcdfolder, binfolder, start_idx=start_frame, end_idx=end_frame)

经过上面的步骤，就可以得到velodyne文件夹下所需的bin格式点云数据。

复制训练集：

cp ../data_object_velodyne/training/velodyne/000*.bin ./custom/training/velodyne/

cp ../data_object_label_2/training/label_2/000* ./training/label_2/

复制测试集：

cp ../data_object_velodyne/testing/velodyne/000*.bin ./custom/testing/

2、小标注数据集

但是此处要补充一些内容，我观察SUSTechPoints的输入为pcd格式，所以为了标注kitti中的数据集，需要将bin点云文件，转为pcd文件。

bin2pcd.py

import numpy as np
import open3d as o3d
import os
import argparse

def bin_to_pcd(bin_file_path, pcd_file_path):
    """
    将 KITTI Velodyne 点云的 .bin 文件转换为 .pcd 文件
    :param bin_file_path: 输入的 .bin 文件路径
    :param pcd_file_path: 输出的 .pcd 文件路径
    """
    # 读取 .bin 文件
    points = np.fromfile(bin_file_path, dtype=np.float32).reshape(-1, 4)
    # 提取 x, y, z 坐标（第4列是强度，通常不需要）
    points = points[:, :3]

    # 创建 Open3D 点云对象
    pcd = o3d.geometry.PointCloud()
    pcd.points = o3d.utility.Vector3dVector(points)

    # 保存为 .pcd 文件
    o3d.io.write_point_cloud(pcd_file_path, pcd)
    print(f"点云文件已成功保存到 {pcd_file_path}")

def main():
    # 解析命令行参数
    parser = argparse.ArgumentParser(description="将 KITTI Velodyne 点云的 .bin 文件转换为 .pcd 文件")
    parser.add_argument("bin_file", type=str, help="输入的 .bin 文件路径")
    parser.add_argument("pcd_file", type=str, help="输出的 .pcd 文件路径")
    args = parser.parse_args()

    # 转换文件
    bin_to_pcd(args.bin_file, args.pcd_file)

if __name__ == "__main__":
    main()

将label文件从JSON转化成OpenPCDet的格式 txt

import json

filename = "777.json"

with open(filename, 'r') as f:
    data = json.load(f)

label_list = []
for obj_dict in data:
    label_name = obj_dict["obj_type"]
    pos_xyz = obj_dict["psr"]["position"]
    rot_xyz = obj_dict["psr"]["rotation"]
    scale_xyz = obj_dict["psr"]["scale"]
    
    temp = str(label_name) + " "
    for xyz_dict in [pos_xyz, scale_xyz]:
        for key in ["x", "y", "z"]:
            temp += str(xyz_dict[key])
            temp += " "
    temp += str(rot_xyz["z"]) + "\n"
    label_list.append(temp)

txt_name = filename.split(".")[0] + ".txt"
with open(txt_name, "w") as f:
    for label in label_list:
        f.write(label)

标注工具SUSTechPOINTS使用教程 - 减肥囧途的文章 - 知乎
https://zhuanlan.zhihu.com/p/687518464

1、链接讲解了如何修改标签：在SUSTechPOINTS/public/js/obj_cfg.js中配置。修改obj_type_map和popularCategories

2、在tools/trans_kitti_labels.py文件中作者提供了，kitti转为SUSTechPOINTS格式的方法。

3、在data目录下，新建文件夹，跟example同一级目录。然后在网页的scene中可以看到你建立的文件夹的名字。修改完标签，添加完目录就可以标注了。

严格按照kitti中的类别名字来修改标注配置文件。

3、如何查看kitti数据集的标注情况。

现在需要查看kitti数据集标注情况，特别是关心它是只标注了摄像头视野范围，还是标注了全部数据？带着问题去探索。

kitti标注数据中包含了截断率和难度系数（截断率和难度系数主要用于评估数据集的标注质量或模型的检测性能，而不是算法训练的直接需求。），这对我们来说是没有用的，我们自定义数据集这块可以不包含这些内容。

所以我们只关注位置、大小、朝向即可。

我先把少量kitti数据标注文件，转成JSON格式。甚至手动复制的方式，来完成标注，看看可否查看标注情况。

3.1 实践流程：

（1）先将SUSTechPoints标签文件，修改为kitti的类别。在example同级目录下，新建kitti目录。

（2）在kitti目录下，新建lidar和label文件夹，然后选择四到五个点云和标注文件，将点云文件拷贝到lidar中。

（3）将标注文件拷贝到label下。通过代码或者手动将里面的位置、大小、朝向信息拷贝到json文件中，JSON的格式，仿照example中的格式。

（4）重启网页查看，新建的JSON是否可以对应上。

（5）然后自己重新标注一下其他kitti文件。检查标注信息。

（6）通过脚本，在kitti数据集中的标注信息，将截断率和难度系数等信息删除掉。只剩下位置、大小、朝向信息。

后续优化部分：将源码中，需要通过读取kitti格式，修改为通过读取JSON来训练。

（7）修改源码，直接读取JSON格式。

（8）将以上自定义数据集投入到OPenPCDet中，启动训练。

（9）评估训练效果。

执行：

3.2 绕不开的坐标变换

读这个源码：注意排序问题。

import os
import json
import math
import numpy as np
import sys
 
def trans_detection_label(src_label_path, tgt_label_path, start_idx=None, end_idx=None):
    files = os.listdir(src_label_path)
    files.sort()  # 确保文件按名称排序
 
    # 初始化最大ID为0
    max_id = 0
 
    # 如果指定了 start_idx 和 end_idx，过滤出范围内的文件
    if start_idx is not None and end_idx is not None:
        files = [f for f in files if f.split('.')[0].isdigit() and start_idx <= int(f.split('.')[0]) <= end_idx]
 
    for fname in files:
        frame, _ = os.path.splitext(fname)
        print(frame)
 
        kitti_lines = []
        with open(os.path.join(src_label_path, fname), encoding='utf-8') as f:
            labels = json.load(f, strict=False)
            for label in labels:
                obj_type = label["obj_type"]
 
                # if label.get('obj_attr') == 'static':
                #     continue  # 跳过当前对象
                
                # 根据条件修改 obj_type
                if obj_type == 'Scooter':
                    obj_type = 'Bicycle'
                elif obj_type == 'Bus':
                    obj_type = 'Truck'
                
                if obj_type == 'Bicycle':
                    obj_type = 'Cyclist'
 
                box_id = int(label["obj_id"])
                box_id += 282
 
 
                # 更新最大ID
                if int(box_id) > max_id:
                    max_id = int(box_id)
 
                box_position_x = label['psr']['position']['x']
                box_position_y = label['psr']['position']['y']
                box_position_z = label['psr']['position']['z']
                box_scale_x = label['psr']['scale']['x']
                box_scale_y = label['psr']['scale']['y']
                box_scale_z = label['psr']['scale']['z']
                box_position_z_kitti = float(box_position_z) + 0 - float(box_scale_z / 2)
                rotation_yaw = -float(label['psr']['rotation']['z']) - math.pi / 2
 
                kitti_lines.append(f'{obj_type} 1.0 0 0.0 -1 -1 -1 -1 {box_scale_z:.4f} {box_scale_y:.4f} {box_scale_x:.4f} '
                                   f'{box_position_x:.4f} {box_position_y:.4f} {box_position_z_kitti:.4f} {rotation_yaw:.4f}\n')
 
            with open(os.path.join(tgt_label_path, frame + ".txt"), 'w') as outfile:
                outfile.writelines(kitti_lines)
                    # 在处理完所有文件后打印最大ID
    print(f"The maximum ID in the sequence is: {max_id}")
 
if __name__ == "__main__":
    src_label = "/home/dxy/SUSTechPOINTS/data/备份/label_copy"  # 替换成自己的路径
    tgt_label = "/home/dxy/SUSTechPOINTS/data/备份/label_kitti/"
    # 这里你可以指定开始和结束的索引，例如处理000001到001000范围内的文件
    start_idx = 1
    end_idx = 100
    trans_detection_label(src_label, tgt_label, start_idx, end_idx)

kitti标注文件，每行对应一个目标，字段用空格分割。

类别截断率难度系数 alpha x1 y1 x2 y2 h w l x y z ry

我们看每一行的内容：在写入每行时，逐个顺序解析。

为了解析方便，代码我做了分行处理，但是写到文件中是一行数据。

在将susTechpoints生成的JSON文件输出为kitti的txt时，注意最后写入的内容：

 kitti_lines.append(
f'{obj_type} 1.0 0 0.0 -1 -1 -1 -1 
{box_scale_z:.4f} {box_scale_y:.4f} {box_scale_x:.4f} 
'f'{box_position_x:.4f} {box_position_y:.4f} {box_position_z_kitti:.4f} 
{rotation_yaw:.4f}
\n')

第二行代码的内容是：
- 物体类别、截断程度、遮挡等级、观察角度alpha、
- 最后四个-1是二维边界框坐标（相机坐标系下：左上角和右下角的像素坐标，格式为 xmin ymin xmax ymax。）
第三行：三维物体尺寸：顺序：高度y、宽度x、长度z（单位：米）。
- 这边存个疑：正经都是长、宽、高的顺序。
第四行：三维物体中心坐标
- 相机坐标系下的位置，格式为 x、y、z（单位：米）。
- 注意是相机坐标系下的x右、y下、z前的顺序。
- 特殊约定：底部在地面（y=0），高度向下为负。
第五行：三维物体绕 Y 轴旋转角度 rotation_y，
- 浮点数，范围在 −π 到 π 之间，表示物体的朝向角度。
- 正值表示顺时针旋转，负值表示逆时针旋转。

看上面这个图，描述的是kitti坐标变换。

相机：x右、y下、z前：绕y：正值表示顺时针旋转，负值表示逆时针旋转。
雷达：x前、y左、z上：绕z：正值表示顺时针旋转，负值表示逆时针旋转。

所以代码第三行：

需要将我们在激光点云下的z作为高传入、宽是相机的x，对应点云的y，长是相机的z，对应点云的x

第四行：代码中，直接传入激光雷达的x,y,z我看着感觉很怪异，此处也存个疑。有可能需要加载坐标转换文件去看一下具体使用时的操作。

因为现在我们是逆转化（从kitti转为susTechPoints），我们自己可以按照相机的转换一下。此处相机是x右、y下、z前。转为JSON。

相机z超前，x朝右为正，对应激光雷达y饭方向，pos_y = -x；这个对称可做可不做。
相机z超前，与激光雷达x同向，因此pos_x = z值。
相机的高度y，对应激光雷达z的反向。这块看如何定义坐标变换。

box_position_z_kitti = float(box_position_z) + 0 - float(box_scale_z / 2)

代码中，是标定位置高度减去 box高度的一半得到 kitti的高度。
- 反之，则是kitti加上box高度的一半。
- 此处存疑，我们先按照代码中的一一对应来查看最终给的效果。如果模型最终检测效果可以对应，则说明没问题。
- 如果有问题，我们再去矫正，同时检查代码是否有修正。
- 最大的可能是，此处不改，输出时改。
角度：按照代码是：取负值，减去pi/2。反过来就是kitti加上pi/2取负值。这个实测是对的。

data_object_label_2/training/label_2/000005.txt

Pedestrian 0.00 0 1.94 330.06 178.74 360.77 238.64 1.87 0.96 0.65 -8.50 2.07 23.02 1.59
DontCare -1 -1 -10 606.64 170.67 621.06 184.34 -1 -1 -1 -1000 -1000 -1000 -10
DontCare -1 -1 -10 606.00 170.91 621.35 184.28 -1 -1 -1 -1000 -1000 -1000 -10
DontCare -1 -1 -10 605.68 171.21 620.77 184.34 -1 -1 -1 -1000 -1000 -1000 -10
DontCare -1 -1 -10 566.39 168.89 585.07 184.56 -1 -1 -1 -1000 -1000 -1000 -10

转为json：

txt	1.87	0.96	0.65	-8.50	2.07	23.02	1.59
kitti格式	h	w	l	x	y	z	ry
JSON	scale_z	scale_y	scale_x	pos_y	-(y-h/2)	pox_x	-0.0192

[{"psr": {"position": {"x": 23.02, "y": -8.5, "z": 2.07}, "scale": {"x": 0.65, "y": 0.96, "z": 1.87}, "rotation": {"x": -0.0, "y": 0.0, "z": -0.0192}}, "obj_type": "Pedestrian", "obj_id": "0"}]

经过网页显示，不正确的位姿。

经过与实际图像对比，发现pos_y确实需要是 -x才可以。

txt	1.87	0.96	0.65	-8.50	2.07	23.02	1.59
kitti格式	h	w	l	x	y	z	ry
JSON	scale_z	scale_y	scale_x	-pos_y	pos_z	pox_x	-0.0192

修改一个pos_y看看结果：

x,y对应的上。但是z轴不匹配。需要研究一下高度偏高了。因为我们直接使用了kitti的原始数据。我们按照代码公式转换一下。

按照官网，box的原点在box底部中心，这个底部中心相对于相机坐标系的位置。朝下为正。所以才有了kitti中的2.07的高度。相机在车顶支架上，高度离地面差不多刚好2米。所以，地面上的物体的box相机坐标系中位置坐标，高度y方向是2.07。

官网原文：

x-axis -> right (length), y-axis -> bottom (height), z-axis -> forward (width) 3B bb corner coordinates in camera coordinate frame, coordinate system is at the bottom center of the box.

如果我们当前，相对于激光雷达来做基准，那么这个高度应该是-2.07或者-2.07 + h/2。对这两个值进行尝试。

点击这个小灯。auto adjust。检查json文件:

[
  {
    "obj_id": "0",
    "obj_type": "Pedestrian",
    "psr": {
      "position": {
        "x": 23.399055750847452,
        "y": 8.54686493827783,
        "z": -0.9240000000596047
      },
      "rotation": {
        "x": 0,
        "y": 0,
        "z": -2.958333082130389
      },
      "scale": {
        "x": 0.65,
        "y": 0.96,
        "z": 1.87
      }
    }
  }
]

这个朝向也是正确的。这个SusTechPoints确实好用，可以自动识别朝向。

结果最近的pos_z竟是h/2。所以可以获得kitti转化为json的数据。最后，我们回顾总结一下：

txt	1.87	0.96	0.65	-8.50	2.07	23.02	1.59
kitti格式	h	w	l	x	y	z	ry
JSON	scale_z	scale_y	scale_x	-pos_y	-(y-h/2)	pox_x	-0.0192

[
  {
    "obj_id": "0",
    "obj_type": "Pedestrian",
    "psr": {
      "position": {
        "x": 23.399055750847452,
        "y": 8.54686493827783,
        "z": -0.9240000000596047
      },
      "rotation": {
        "x": 0,
        "y": 0,
        "z": -2.958333082130389
      },
      "scale": {
        "x": 0.65,
        "y": 0.96,
        "z": 1.87
      }
    }
  }
]

这个高度值，原始kitti的标注信息是相对于相机2的位置，当前我们相对于自身激光雷达时，直接写我们基于雷达的位置信息对于模型训练来说是可以的。所以原始代码中的转换是可以用的。

咱们这里如果是为了转换显示，可以去通过susTechPoints自适应调整来完成标定即可。主要还是为了弄清楚底层的原理。

因此绕不开的坐标变换只要在选择box长、宽、高时，注意坐标变换即可。其他的xyz可以按照雷达的输入来进行。当只有雷达时，坐标系自动转到了雷达坐标系下。

基于对应关系，我们再查看几个其他几个的变换。经过多个测试：结论如下。

txt	1.87	0.96	0.65	-8.50	2.07	23.02	1.59
kitti格式	h	w	l	x	y	z	ry
JSON	scale_z	scale_y	scale_x	-pos_y	-(y-h/2)	pox_x	-（ry+pi/2）

总结规律如下：

相机z轴和雷达x轴重合，所以kitti的z轴坐标与pox_x对应。
相机x轴和雷达y轴相反，所以kitti的x轴坐标与-pos_y对应。
相机y轴与雷达z轴相反，
- 所以，相机坐标系中的y轴的数值，是box的底部与相机之间的高度差。
- 如果想要box的中间高度，需要高度差减去box高度的一半，获得box中间到相机的高度差。
- 这个高度差因为是低于激光雷达高度的，所以应该是负值。
因此代码中。需要修改为如下：
- 后面七个数分别为：
- scale_z, scale_y,scale_x, -pos_y, h/2 - pos_z, pos_x,-rot_z-pi/2

具体坐标如何，需要实践中去测试探索。因为网上教程是不用这么麻烦的。

最彻底解决方案是：将训练源码改为支持json格式的输入。

3.3 将kitti原始数据变成自定义数据集。

保留类别，然后用1.0 0 0.0 -1 -1 -1 -1替换掉所有文件中的对应部分。保留后方的大小、位置、和朝向。假装是我们自己标定的数据集。

写一个脚本来替换每行的内容。

'{obj_type} 1.0 0 0.0 -1 -1 -1 -1 
{box_scale_z:.4f} {box_scale_y:.4f} {box_scale_x:.4f} 
'f'{box_position_x:.4f} {box_position_y:.4f} {box_position_z_kitti:.4f} 
{rotation_yaw:.4f}

脚本如下：读取每个文件，然后读取每个文件中的每行，然后替换掉每行的内容。关闭文件。如果类别是指定的三个类别，则保留，否则则跳过['Car', 'Pedestrian', 'Cyclist']。

import os
import math
import numpy as np
import sys
 
def trans_detection_label(src_label_path, tgt_label_path, start_idx=None, end_idx=None):
    files = os.listdir(src_label_path)
    files.sort()  # 确保文件按名称排序
 
     # 指定要替换的固定值
    fixed_values = "1.0 0 0.0 -1 -1 -1 -1"


    # 如果指定了 start_idx 和 end_idx，过滤出范围内的文件
    if start_idx is not None and end_idx is not None:
        files = [f for f in files if f.split('.')[0].isdigit() and start_idx <= int(f.split('.')[0]) <= end_idx]
 
    for fname in files:
        frame, _ = os.path.splitext(fname)
        print(frame)
 
        with open(os.path.join(src_label_path, fname), encoding='utf-8') as f_in, open(
            os.path.join(tgt_label_path, frame + ".txt"), 'w') as f_out:

            for line in f_in:
                parts = line.strip().split(' ')

                # 根据条件修改 obj_type
                # if parts[0] == "DontCare":
                if parts[0] in ['Car', 'Pedestrian', 'Cyclist']:
                    # 保留类别、尺寸、位置和朝向
                    new_line = f"{parts[0]} {fixed_values} {parts[8]} {parts[9]} {parts[10]} {parts[11]} {parts[12]} {parts[13]} {parts[14]}"
                    f_out.write(new_line + '\n')
                else:
                    continue  # 跳过DontCare行


    print(f"The finished frame in the sequence is : {frame}")
 
if __name__ == "__main__":
    src_label = "/home/vincent/data/handle_labels/data_object_label_2/training/label_2"  # 替换成自己的路径
    tgt_label = "/home/vincent/data/handle_labels/data_object_label_2/training/labels"
    # 这里你可以指定开始和结束的索引，例如处理000001到001000范围内的文件
    start_idx = 0
    end_idx = 10000
    trans_detection_label(src_label, tgt_label, start_idx, end_idx)

这个代码有问题。如果用此代码生成的box的位置关系都是相对于相机坐标的。而如果我们用绝对坐标，也就是用激光雷达的坐标，那么就应该将其转换到激光雷达坐标系下，否则，就会出现：在生成gt_database时，里面的文件是空的。因为标签位置与点云位置对不上。通过配置文件过滤筛选时，导致，所有的筛选全为空。

KITTI 数据集中的 data_object_velodyne 点云数据是相对于激光雷达坐标系的。

PREPARE: {
                 filter_by_min_points: ['Car:5', 'Pedestrian:5', 'Cyclist:5'],
                #  filter_by_difficulty: [-1],
              }

这行代码，必须要转换为激光雷达点云文件bin的坐标。

new_line = f"{parts[0]} {fixed_values} {parts[8]} {parts[9]} {parts[10]} {parts[11]} {parts[12]} {parts[13]} {parts[14]}"

所以上面这行代码要转换到激光雷达坐标系下（后面运行程序生成gt_database为空时，就要考虑这个坐标关系了。）：这解释了上面存疑，为何在从json转kitti时，要用x,y,kitti_z这个顺序顺排了。参照上面这个对应关系。

part_kitti	[8]	[9]	[10]	[11]	[12]	[13]	[14]
kitti格式	h	w	l	x	y	z	ry
txt	1.87	0.96	0.65	-8.50	2.07	23.02	1.59
JSON	scale_z	scale_y	scale_x	-pos_y	-(y-h/2)	pox_x	-（ry+pi/2）
custom_txt	h	w	l	z	-x	y	ry
custom_txt	1.87	0.96	0.65	23.02	8.5	2.07	1.59
part_custom	[8]	[9]	[10]	[13]	-[11]	[12]	[14]

再去试试效果。修改转换代码。

new_line = f"{parts[0]} {fixed_values} {parts[8]} {parts[9]} {parts[10]} {parts[13]} {-float(parts[11])} {parts[12]} {parts[14]}"

import os
import math
import numpy as np
import sys
 
def trans_detection_label(src_label_path, tgt_label_path, start_idx=None, end_idx=None):
    files = os.listdir(src_label_path)
    files.sort()  # 确保文件按名称排序
 
     # 指定要替换的固定值
    fixed_values = "1.0 0 0.0 -1 -1 -1 -1"


    # 如果指定了 start_idx 和 end_idx，过滤出范围内的文件
    if start_idx is not None and end_idx is not None:
        files = [f for f in files if f.split('.')[0].isdigit() and start_idx <= int(f.split('.')[0]) <= end_idx]
 
    for fname in files:
        frame, _ = os.path.splitext(fname)
        print(frame)
 
        with open(os.path.join(src_label_path, fname), encoding='utf-8') as f_in, open(
            os.path.join(tgt_label_path, frame + ".txt"), 'w') as f_out:

            for line in f_in:
                parts = line.strip().split(' ')

                # 根据条件修改 obj_type
                # if parts[0] == "DontCare":
                if parts[0] in ['Car', 'Pedestrian', 'Cyclist']:
                    # 保留类别、尺寸、位置和朝向
                    new_line = f"{parts[0]} {fixed_values} {parts[8]} {parts[9]} {parts[10]} {parts[13]} {-float(parts[11])} {parts[12]} {parts[14]}"
                    f_out.write(new_line + '\n')
                else:
                    continue  # 跳过DontCare行


    print(f"The finished frame in the sequence is : {frame}")
 
if __name__ == "__main__":
    src_label = "/home/vincent/data/handle_labels/data_object_label_2/training/label_2"  # 替换成自己的路径
    tgt_label = "/home/vincent/data/handle_labels/data_object_label_2/training/labels"
    # 这里你可以指定开始和结束的索引，例如处理000001到001000范围内的文件
    start_idx = 0
    end_idx = 10000
    trans_detection_label(src_label, tgt_label, start_idx, end_idx)

2025-05-21 10:01:48,543   INFO  ----------- Create dataloader & network & optimizer -----------
2025-05-21 10:01:48,602   INFO  Database filter by min points Car: 22989 => 927
2025-05-21 10:01:48,602   INFO  Database filter by min points Pedestrian: 3615 => 431
2025-05-21 10:01:48,602   INFO  Database filter by min points Cyclist: 1312 => 147
2025-05-21 10:01:48,606   INFO  Loading Custom dataset.
2025-05-21 10:01:48,645   INFO  Total samples for CUSTOM dataset: 5984

此时因为我们只用了三个类别['Car', 'Pedestrian', 'Cyclist']，所以数据要比kitti完成数据集要少。好消息是：可以运行了。可以开始跑通train。

scp labels.zip name@ip:path

scp [用户名]@[服务器地址]:[远程文件路径] [本地目录]

3.4 用自定义数据集来训练

训练自己的模型

【OpenPCDet】自定义数据集（kitti格式）训练PointPillars并评估&可视化，全过程debug

【3D目标检测】OpenPCDet自定义数据集训练_openpcdet 自己数据集-CSDN博客

这篇文章包含了，点云格式转换，标注格式转换，标注工具，训练方法等内容。

提及了常见错误信息和debug内容。

推荐了一个远程工具：nomachine。

（1）自定义目录结构：

custom
├── testing
│   ├── velodyne  # 点云数据
├── training
│   ├── label_2  # 标签文件
│   ├── velodyne

将上面咱们修改过的labels.zip的内容，放入到上面label_2中，里面原来的标注文件删除。

（2）数据集准备和预处理

运行分割数据集代码：splitDatasets.py

"""
2024.03.21
author:alian
数据预处理操作
1.数据集分割
"""
import os
import random
import shutil
import numpy as np


def get_train_val_txt_kitti(src_path):
    """
    数据格式:KITTI
    # For KITTI Dataset
    └── KITTI_DATASET_ROOT
        ├── training    <-- 7481 train data
        |   ├── image_2 <-- for visualization
        |   ├── calib
        |   ├── label_2
        |   └── velodyne
        └── testing     <-- 7580 test data
            ├── image_2 <-- for visualization
            ├── calib
            └── velodyne
            
    src_path: KITTI_DATASET_ROOT kitti文件夹

    """
    # 1.自动生成数据集划分文件夹ImageSets
    set_path = "%s/ImageSets/"%src_path
    if os.path.exists(set_path):  # 如果文件存在
        shutil.rmtree(set_path)  # 清空原始数据
        os.makedirs(set_path)  # 重新创建
    else:
        os.makedirs(set_path)  # 自动新建文件夹
    
    # 2.训练样本分割  生成train.txt val.txt trainval.txt
    train_list = os.listdir(os.path.join(src_path,'training','velodyne'))
    random.shuffle(train_list)  # 打乱顺序，随机采样
    # 设置训练和验证的比例
    train_p = 0.8

    # 开始写入分割文件
    f_train = open(os.path.join(set_path, "train.txt"), 'w')
    f_val = open(os.path.join(set_path, "val.txt"), 'w')
    f_trainval = open(os.path.join(set_path, "trainval.txt"), 'w')
    
    for i,src in enumerate(train_list):
        if i<int(len(train_list)*train_p): # 训练集的数量
            f_train.write(src[:-4] + '\n')
            f_trainval.write(src[:-4] + '\n')
        else:
            f_val.write(src[:-4] + '\n')
            f_trainval.write(src[:-4] + '\n')

    # 3.测试样本分割  生成test.txt
    test_list = os.listdir(os.path.join(src_path,'testing','velodyne'))
    f_test = open(os.path.join(set_path, "test.txt"), 'w')
    for i,src in enumerate(test_list):
        f_test.write(src[:-4] + '\n')


if __name__=='__main__':
    """
    	src_path: 数据目录
    """
    src_path = './data/custom'
    # get_train_val_txt_kitti_det3d(src_path)
    get_train_val_txt_kitti(src_path)

输入：数据集路径
输出：在输入的数据集路径下生成

python splitDatasets.py

└── ImageSets # 数据集划分文件
            ├── train.txt
            ├── val.txt
            ├── test.txt
            └── trainval.txt
├── testing
│   ├── velodyne  # 点云数据
├── training
│   ├── label_2  # 标签文件
│   ├── velodyne

（3）生成标准数据格式

建议复制kitti_dataset.py、kitti_dataset.yaml，

重命名为custom_dataset.py、kitti_custom_dataset.yaml，修改文件路径如下：

Openpcdet-Test-master/pcdet/datasets/custom/custom_dataset.py
Openpcdet-Test-master/tools/cfgs/dataset_configs/kitti_custom_dataset.yaml

VScode远程修改服务器上的代码

import copy
import pickle
import os
 
import numpy as np
from skimage import io
 
from ...ops.roiaware_pool3d import roiaware_pool3d_utils
from ...utils import box_utils, common_utils, object3d_custom
from ..dataset import DatasetTemplate
# 定义属于自己的数据集，集成数据集模板
class CustomDataset(DatasetTemplate):
    def __init__(self, dataset_cfg, class_names, training=True, root_path=None, logger=None, ext='.bin'):
        """
        Args:
            root_path:
            dataset_cfg:
            class_names:
            training:
            logger:
        """
        super().__init__(
            dataset_cfg=dataset_cfg, class_names=class_names, training=training, root_path=root_path, logger=logger
        )
        self.split = self.dataset_cfg.DATA_SPLIT[self.mode]
        self.root_split_path = os.path.join(self.root_path, ('training' if self.split != 'test' else 'testing'))
 
        split_dir = os.path.join(self.root_path, 'ImageSets',(self.split + '.txt'))
        self.sample_id_list = [x.strip() for x in open(split_dir).readlines()] if os.path.exists(split_dir) else None
 
        self.custom_infos = []
        self.include_custom_data(self.mode)
        self.ext = ext
 
    # 用于导入自定义数据
    def include_custom_data(self, mode):
        if self.logger is not None:
            self.logger.info('Loading Custom dataset.')
        custom_infos = []
 
        for info_path in self.dataset_cfg.INFO_PATH[mode]:
            info_path = self.root_path / info_path
            if not info_path.exists():
                continue
            with open(info_path, 'rb') as f:
                infos = pickle.load(f)
                custom_infos.extend(infos)
        
        self.custom_infos.extend(custom_infos)
 
        if self.logger is not None:
            self.logger.info('Total samples for CUSTOM dataset: %d' % (len(custom_infos)))
    
    # 用于获取标签的标注信息
    def get_infos(self, num_workers=4, has_label=True, count_inside_pts=True, sample_id_list=None):
        import concurrent.futures as futures
        # 线程函数，主要是为了多线程读取数据，加快处理速度
        
        # 处理一帧
        def process_single_scene(sample_idx):
            print('%s sample_idx: %s' % (self.split, sample_idx))
            # 创建一个用于存储一帧信息的空字典
            info = {}
            # 定义该帧点云信息，pointcloud_info
            pc_info = {'num_features': 4, 'lidar_idx': sample_idx}
            # 将pc_info这个字典作为info字典里的一个键值对的值，其键名为‘point_cloud’添加到info里去
            info['point_cloud'] = pc_info
            '''
            # image信息和calib信息都暂时不需要
            # image_info = {'image_idx': sample_idx, 'image_shape': self.get_image_shape(sample_idx)}
            # info['image'] = image_info
            # calib = self.get_calib(sample_idx)
            # P2 = np.concatenate([calib.P2, np.array([[0., 0., 0., 1.]])], axis=0)
            # R0_4x4 = np.zeros([4, 4], dtype=calib.R0.dtype)
            # R0_4x4[3, 3] = 1.
            # R0_4x4[:3, :3] = calib.R0
            # V2C_4x4 = np.concatenate([calib.V2C, np.array([[0., 0., 0., 1.]])], axis=0)
            # calib_info = {'P2': P2, 'R0_rect': R0_4x4, 'Tr_velo_to_cam': V2C_4x4}
            # info['calib'] = calib_info
            '''
            if has_label:
                # 通过get_label函数，读取出该帧的标签标注信息
                obj_list = self.get_label(sample_idx)
                # 创建用于存储该帧标注信息的空字典
                annotations = {}
                # 下方根据标注文件里的属性将对应的信息加入到annotations的键值对，可以根据自己的需求取舍
                annotations['name'] = np.array([obj.cls_type for obj in obj_list])
                # annotations['truncated'] = np.array([obj.truncation for obj in obj_list])
                # annotations['occluded'] = np.array([obj.occlusion for obj in obj_list])
                # annotations['alpha'] = np.array([obj.alpha for obj in obj_list])
                # annotations['bbox'] = np.concatenate([obj.box2d.reshape(1, 4) for obj in obj_list], axis=0)
                annotations['dimensions'] = np.array([[obj.l, obj.h, obj.w] for obj in obj_list])  # lhw(camera) format
                annotations['location'] = np.concatenate([obj.loc.reshape(1, 3) for obj in obj_list], axis=0)
                annotations['rotation_y'] = np.array([obj.ry for obj in obj_list])
                annotations['score'] = np.array([obj.score for obj in obj_list])
                # annotations['difficulty'] = np.array([obj.level for obj in obj_list], np.int32)
 
                # 统计有效物体的个数，即去掉类别名称为“Dontcare”以外的
                num_objects = len([obj.cls_type for obj in obj_list if obj.cls_type != 'DontCare'])
                # 统计物体的总个数，包括了Dontcare
                num_gt = len(annotations['name'])
                # 获得当前的index信息
                index = list(range(num_objects)) + [-1] * (num_gt - num_objects)
                annotations['index'] = np.array(index, dtype=np.int32)
 
                # 从annotations里提取出从标注信息里获取的location、dims、rots等信息，赋值给对应的变量
                loc = annotations['location'][:num_objects]
                dims = annotations['dimensions'][:num_objects]
                rots = annotations['rotation_y'][:num_objects]
                # 由于我们的数据集本来就是基于雷达坐标系标注，所以无需坐标转换
                #loc_lidar = calib.rect_to_lidar(loc)
                loc_lidar = self.get_calib(loc)
                # 原来的dims排序是高宽长hwl,现在转到pcdet的统一坐标系下,按lhw排布
                l, h, w = dims[:, 0:1], dims[:, 1:2], dims[:, 2:3]
                
                # 由于我们基于雷达坐标系标注，所以获取的中心点本来就是空间中心，所以无需从底面中心转到空间中心
                # bottom center -> object center: no need for loc_lidar[:, 2] += h[:, 0] / 2
                # print("sample_idx: ", sample_idx, "loc: ", loc, "loc_lidar: " , sample_idx, loc_lidar)
                # get gt_boxes_lidar see https://zhuanlan.zhihu.com/p/152120636
                # loc_lidar[:, 2] += h[:, 0] / 2
                gt_boxes_lidar = np.concatenate([loc_lidar, l, w, h, -(np.pi / 2 + rots[..., np.newaxis])], axis=1)
                # 将雷达坐标系下的真值框信息存入annotations中
                annotations['gt_boxes_lidar'] = gt_boxes_lidar
                # 将annotations这整个字典作为info字典里的一个键值对的值
                info['annos'] = annotations
            
            return info
            # 后续的由于没有calib信息和image信息，所以可以直接注释
            '''
            #     if count_inside_pts:
            #         points = self.get_lidar(sample_idx)
            #         calib = self.get_calib(sample_idx)
            #         pts_rect = calib.lidar_to_rect(points[:, 0:3])
            #         fov_flag = self.get_fov_flag(pts_rect, info['image']['image_shape'], calib)
            #         pts_fov = points[fov_flag]
            #         corners_lidar = box_utils.boxes_to_corners_3d(gt_boxes_lidar)
            #         num_points_in_gt = -np.ones(num_gt, dtype=np.int32)
            #         for k in range(num_objects):
            #             flag = box_utils.in_hull(pts_fov[:, 0:3], corners_lidar[k])
            #             num_points_in_gt[k] = flag.sum()
            #         annotations['num_points_in_gt'] = num_points_in_gt
            # return info
            '''
        sample_id_list = sample_id_list if sample_id_list is not None else self.sample_id_list
        if sample_id_list is None:
            raise ValueError("Sample ID list is None. Check the dataset configuration and files.")  
              
        with futures.ThreadPoolExecutor(num_workers) as executor:
            infos = executor.map(process_single_scene, sample_id_list)
        return list(infos)
        # 此时返回值infos是列表，列表元素为字典类型
                
    # 用于获取标定信息
    def get_calib(self, loc):
        # calib_file = self.root_split_path / 'calib' / ('%s.txt' % idx)
        # assert calib_file.exists()
        # return calibration_kitti.Calibration(calib_file)
        
        # loc_lidar = np.concatenate([np.array((float(loc_obj[2]),float(-loc_obj[0]),float(loc_obj[1]-2.3)),dtype=np.float32).reshape(1,3) for loc_obj in loc])
        # return loc_lidar
        # 这里做了一个由相机坐标系到雷达坐标系翻转（都遵从右手坐标系），但是 -2.3这个数值具体如何得来需要再看下
 
        # 我们的label中的xyz就是在雷达坐标系下,不用转变,直接赋值
        loc_lidar = np.concatenate([np.array((float(loc_obj[0]),float(loc_obj[1]),float(loc_obj[2])),dtype=np.float32).reshape(1,3) for loc_obj in loc])
        return loc_lidar
                
    # 用于获取标签
    def get_label(self, idx):
        # 从指定路径中提取txt内容
        label_file = self.root_split_path / 'label_2' / ('%s.txt' % idx)
        print(f"Label file path: {label_file}")  # 添加调试信息

        assert label_file.exists(), f"Label file not found: {label_file}"  # 添加详细错误信息
        # 主要就是从这个函数里获取具体的信息
        return object3d_custom.get_objects_from_label(label_file)
 
    # 用于获取雷达点云信息
    def get_lidar(self, idx, getitem):
        """
            Loads point clouds for a sample
                Args:
                    index (int): Index of the point cloud file to get.
                Returns:
                    np.array(N, 4): point cloud.
        """
        # get lidar statistics
        if getitem == True:
            lidar_file = self.root_split_path + '/velodyne/' + ('%s.bin' % idx)
            
        else:
            lidar_file = self.root_split_path / 'velodyne' / ('%s.bin' % idx)
        return np.fromfile(str(lidar_file), dtype=np.float32).reshape(-1, 4)
 
    # 用于数据集划分
    def set_split(self, split):
        super().__init__(
            dataset_cfg=self.dataset_cfg, class_names=self.class_names, training=self.training, root_path=self.root_path, logger=self.logger
        )
        self.split = split
        self.root_split_path = self.root_path / ('training' if self.split != 'test' else 'testing')
 
        split_dir = self.root_path / 'ImageSets' / (self.split + '.txt')
        self.sample_id_list = [x.strip() for x in open(split_dir).readlines()] if split_dir.exists() else None
 
    # 创建真值数据库
    # Create gt database for data augmentation
    def create_groundtruth_database(self, info_path=None, used_classes=None, split='train'):
            import torch
    
            database_save_path = Path(self.root_path) / ('gt_database' if split == 'train' else ('gt_database_%s' % split))
            db_info_save_path = Path(self.root_path) / ('custom_dbinfos_%s.pkl' % split)
    
            database_save_path.mkdir(parents=True, exist_ok=True)
            all_db_infos = {}
    
            with open(info_path, 'rb') as f:
                infos = pickle.load(f)
    
            for k in range(len(infos)):
                print('gt_database sample: %d/%d' % (k + 1, len(infos)))
                info = infos[k]
                sample_idx = info['point_cloud']['lidar_idx']
                points = self.get_lidar(sample_idx,False)
                annos = info['annos']
                names = annos['name']
                # difficulty = annos['difficulty']
                # bbox = annos['bbox']
                gt_boxes = annos['gt_boxes_lidar']
    
                num_obj = gt_boxes.shape[0]
                point_indices = roiaware_pool3d_utils.points_in_boxes_cpu(
                    torch.from_numpy(points[:, 0:3]), torch.from_numpy(gt_boxes)
                ).numpy()  # (nboxes, npoints)
    
                for i in range(num_obj):
                    filename = '%s_%s_%d.bin' % (sample_idx, names[i], i)
                    filepath = database_save_path / filename
                    gt_points = points[point_indices[i] > 0]
    
                    gt_points[:, :3] -= gt_boxes[i, :3]
                    with open(filepath, 'w') as f:
                        gt_points.tofile(f)
    
                    if (used_classes is None) or names[i] in used_classes:
                        db_path = str(filepath.relative_to(self.root_path))  # gt_database/xxxxx.bin
                        # db_info = {'name': names[i], 'path': db_path, 'image_idx': sample_idx, 'gt_idx': i,
                        #            'box3d_lidar': gt_boxes[i], 'num_points_in_gt': gt_points.shape[0],
                        #            'difficulty': difficulty[i], 'bbox': bbox[i], 'score': annos['score'][i]}
                        db_info = {'name': names[i], 'path': db_path,  'gt_idx': i,
                                'box3d_lidar': gt_boxes[i], 'num_points_in_gt': gt_points.shape[0], 'score': annos['score'][i]}
                        
                        if names[i] in all_db_infos:
                            all_db_infos[names[i]].append(db_info)
                        else:
                            all_db_infos[names[i]] = [db_info]
            for k, v in all_db_infos.items():
                print('Database %s: %d' % (k, len(v)))
    
            with open(db_info_save_path, 'wb') as f:
                pickle.dump(all_db_infos, f)
    # 生成预测字典信息
    @staticmethod
    def generate_prediction_dicts(batch_dict, pred_dicts, class_names, output_path=None):
        """
        Args:
            batch_dict:
                frame_id:
            pred_dicts: list of pred_dicts
                pred_boxes: (N,7), Tensor
                pred_scores: (N), Tensor
                pred_lables: (N), Tensor
            class_names:
            output_path:
        Returns:
        """
        def get_template_prediction(num_smaples):
            ret_dict = {
                'name': np.zeros(num_smaples), 'alpha' : np.zeros(num_smaples),
                'dimensions': np.zeros([num_smaples, 3]), 'location': np.zeros([num_smaples, 3]),
                'rotation_y': np.zeros(num_smaples), 'score': np.zeros(num_smaples),
                'boxes_lidar': np.zeros([num_smaples, 7])
            }
            return ret_dict
 
        def generate_single_sample_dict(batch_index, box_dict):
            pred_scores = box_dict['pred_scores'].cpu().numpy()
            pred_boxes = box_dict['pred_boxes'].cpu().numpy()
            pred_labels = box_dict['pred_labels'].cpu().numpy()
 
            # Define an empty template dict to store the prediction information, 'pred_scores.shape[0]' means 'num_samples'
            pred_dict = get_template_prediction(pred_scores.shape[0])
            # If num_samples equals zero then return the empty dict
            if pred_scores.shape[0] == 0:
                return pred_dict
 
            # No calibration files
 
            # pred_boxes_camera = box_utils.boxes3d_lidar_to_kitti_camera(pred_boxes,None)
 
            pred_dict['name'] = np.array(class_names)[pred_labels - 1]
            # pred_dict['alpha'] = -np.arctan2(-pred_boxes[:, 1], pred_boxes[:, 0]) + pred_boxes_camera[:, 6]
            # pred_dict['dimensions'] = pred_boxes_camera[:, 3:6]
            # pred_dict['location'] = pred_boxes_camera[:, 0:3]
            # pred_dict['rotation_y'] = pred_boxes_camera[:, 6]
            pred_dict['score'] = pred_scores
            pred_dict['boxes_lidar'] = pred_boxes
 
            return pred_dict
 
        annos = []
        for index, box_dict in enumerate(pred_dicts):
            frame_id = batch_dict['frame_id'][index]
 
            single_pred_dict = generate_single_sample_dict(index, box_dict)
            single_pred_dict['frame_id'] = frame_id
            annos.append(single_pred_dict)
 
            # Output pred results to Output-path in .txt file 
            if output_path is not None:
                cur_det_file = output_path / ('%s.txt' % frame_id)
                with open(cur_det_file, 'w') as f:
                    bbox = single_pred_dict['bbox']
                    loc = single_pred_dict['location']
                    dims = single_pred_dict['dimensions']  # lhw -> hwl: lidar -> camera
 
                    for idx in range(len(bbox)):
                        print('%s -1 -1 %.4f %.4f %.4f %.4f %.4f %.4f %.4f %.4f %.4f %.4f %.4f %.4f %.4f'
                            % (single_pred_dict['name'][idx], single_pred_dict['alpha'][idx],
                                bbox[idx][0], bbox[idx][1], bbox[idx][2], bbox[idx][3],
                                dims[idx][1], dims[idx][2], dims[idx][0], loc[idx][0],
                                loc[idx][1], loc[idx][2], single_pred_dict['rotation_y'][idx],
                                single_pred_dict['score'][idx]), file=f)
            return annos
    def evaluation(self, det_annos, class_names, **kwargs):
            if 'annos' not in self.custom_infos[0].keys():
                return None, {}
 
            from .kitti_object_eval_python import eval as kitti_eval
 
            eval_det_annos = copy.deepcopy(det_annos)
            eval_gt_annos = [copy.deepcopy(info['annos']) for info in self.custom_infos]
            ap_result_str, ap_dict = kitti_eval.get_official_eval_result(eval_gt_annos, eval_det_annos, class_names)
 
            return ap_result_str, ap_dict
    # 用于返回训练帧的总个数
    def __len__(self):
        if self._merge_all_iters_to_one_epoch:
            return len(self.sample_id_list) * self.total_epochs
 
        return len(self.custom_infos)
 
    # 用于将点云与3D标注框均转至前述统一坐标定义下，送入数据基类提供的self.prepare_data()
    def __getitem__(self, index):  ## 修改如下
        if self._merge_all_iters_to_one_epoch:
            index = index % len(self.custom_infos)
 
        info = copy.deepcopy(self.custom_infos[index])
        sample_idx = info['point_cloud']['lidar_idx']
        points = self.get_lidar(sample_idx, True)
        input_dict = {
            'frame_id': self.sample_id_list[index],
            'points': points
        }
 
        if 'annos' in info:
            annos = info['annos']
            annos = common_utils.drop_info_with_name(annos, name='DontCare')
            gt_names = annos['name']
            gt_boxes_lidar = annos['gt_boxes_lidar']
            input_dict.update({
                'gt_names': gt_names,
                'gt_boxes': gt_boxes_lidar
            })
 
        data_dict = self.prepare_data(data_dict=input_dict)
 
        return data_dict
 
# 用于创建自定义数据集的信息
def create_custom_infos(dataset_cfg, class_names, data_path, save_path, workers=4):
    dataset = CustomDataset(dataset_cfg=dataset_cfg, class_names=class_names, root_path=data_path, training=False)
    train_split, val_split = 'train', 'val'
   # 定义文件的路径和名称
    train_filename = save_path / ('custom_infos_%s.pkl' % train_split)
    val_filename = save_path / ('custom_infos_%s.pkl' % val_split)
    trainval_filename = save_path / 'custom_infos_trainval.pkl'
    test_filename = save_path / 'custom_infos_test.pkl'
 
    print('---------------Start to generate data infos---------------')
 
    dataset.set_split(train_split)
    # 执行完上一步，得到train相关的保存文件，以及sample_id_list的值为train.txt文件下的数字
    # 下面是得到train.txt中序列相关的所有点云数据的信息，并且进行保存
    custom_infos_train = dataset.get_infos(num_workers=workers, has_label=True, count_inside_pts=True)
    with open(train_filename, 'wb') as f:
        pickle.dump(custom_infos_train, f)
    print('Custom info train file is saved to %s' % train_filename)
 
    dataset.set_split(val_split)
    # 对验证集的数据进行信息统计并保存
    custom_infos_val = dataset.get_infos(num_workers=workers, has_label=True, count_inside_pts=True)
    with open(val_filename, 'wb') as f:
        pickle.dump(custom_infos_val, f)
    print('Custom info val file is saved to %s' % val_filename)
 
    with open(trainval_filename, 'wb') as f:
        pickle.dump(custom_infos_train + custom_infos_val, f)
    print('Custom info trainval file is saved to %s' % trainval_filename)
 
 
    dataset.set_split('test')
    # kitti_infos_test = dataset.get_infos(num_workers=workers, has_label=False, count_inside_pts=False)
    custom_infos_test = dataset.get_infos(num_workers=workers, has_label=False, count_inside_pts=False)
    with open(test_filename, 'wb') as f:
        pickle.dump(custom_infos_test, f)
    print('Custom info test file is saved to %s' % test_filename)
 
    
 
    print('---------------Start create groundtruth database for data augmentation---------------')
    # 用trainfile产生groundtruth_database
    # 只保存训练数据中的gt_box及其包围点的信息，用于数据增强    
    dataset.set_split(train_split)
    dataset.create_groundtruth_database(info_path=train_filename, split=train_split)
 
    print('---------------Data preparation Done---------------')
 
if __name__=='__main__':
    import sys
    if sys.argv.__len__() > 1 and sys.argv[1] == 'create_custom_infos':
        import yaml
        from pathlib import Path
        from easydict import EasyDict
        dataset_cfg = EasyDict(yaml.safe_load(open(sys.argv[2])))
        ROOT_DIR = (Path(__file__).resolve().parent / '../../../').resolve()
        # # 创建 CustomDataset 实例后，调用 set_split 方法初始化 sample_id_list
        # dataset = CustomDataset(
        #     dataset_cfg=dataset_cfg,
        #     class_names=['Car', 'Pedestrian', 'Cyclist'],  # 修改类别
        #     root_path=ROOT_DIR / 'data' / 'custom',
        #     training=False
        # )
        # dataset.set_split('train')  # 添加这行代码初始化 sample_id_list

        create_custom_infos(
            dataset_cfg=dataset_cfg,
            class_names=['Car', 'Pedestrian', 'Cyclist'], # 1.修改类别
            data_path=ROOT_DIR / 'data' / 'custom',
            save_path=ROOT_DIR / 'data' / 'custom'
        )

DATASET: 'CustomDataset'
# DATA_PATH: '../data/custom'
DATA_PATH: '/home/northlab/Downloads/test/OpenPCDet/data/custom'

# If this config file is modified then pcdet/models/detectors/detector3d_template.py:
# Detector3DTemplate::build_networks:model_info_dict needs to be modified.
POINT_CLOUD_RANGE: [-75.2, -75.2, -2, 75.2, 75.2, 4]

# MAP_CLASS_TO_KITTI: {
#     'Vehicle': 'Car',
#     'Pedestrian': 'Pedestrian',
#     'Cyclist': 'Cyclist',
# }

DATA_SPLIT: {
    'train': train,
    'test': val
}

INFO_PATH: {
    'train': [custom_infos_train.pkl],
    'test': [custom_infos_val.pkl],
}

GET_ITEM_LIST: ["points"]
FOV_POINTS_ONLY: True

POINT_FEATURE_ENCODING: {
    encoding_type: absolute_coordinates_encoding,
    used_feature_list: ['x', 'y', 'z', 'intensity'],
    src_feature_list: ['x', 'y', 'z', 'intensity'],
}

DATA_AUGMENTOR:
    DISABLE_AUG_LIST: ['placeholder']
    AUG_CONFIG_LIST:
        - NAME: gt_sampling
          USE_ROAD_PLANE: False             # Notice that 'USE_ROAD_PLANE'
          DB_INFO_PATH:
              - custom_dbinfos_train.pkl   # pcdet/datasets/augmentor/database_ampler.py:line 26
          PREPARE: {
             filter_by_min_points: ['Car:5', 'Pedestrian:5', 'Cyclist:5'],  # 2.修改类别
          }

          SAMPLE_GROUPS: ['Car:20', 'Pedestrian:15', 'Cyclist:15']  # 3. 修改类别
          NUM_POINT_FEATURES: 4
          DATABASE_WITH_FAKELIDAR: False
          REMOVE_EXTRA_WIDTH: [0.0, 0.0, 0.0]
          LIMIT_WHOLE_SCENE: True

        - NAME: random_world_flip
          ALONG_AXIS_LIST: ['x', 'y']

        - NAME: random_world_rotation
          WORLD_ROT_ANGLE: [-0.78539816, 0.78539816]

        - NAME: random_world_scaling
          WORLD_SCALE_RANGE: [0.95, 1.05]

DATA_PROCESSOR:
    - NAME: mask_points_and_boxes_outside_range
      REMOVE_OUTSIDE_BOXES: True

    - NAME: shuffle_points
      SHUFFLE_ENABLED: {
        'train': True,
        'test': False
      }

    - NAME: transform_points_to_voxels
      VOXEL_SIZE: [0.05, 0.05, 0.1]
      MAX_POINTS_PER_VOXEL: 5
      MAX_NUMBER_OF_VOXELS: {
        'train': 16000,
        'test': 40000
      }

python -m pcdet.datasets.custom.custom_dataset create_custom_infos tools/cfgs/dataset_configs/kitti_custom_dataset.yaml

结果如下

gt_database sample: 5984/5984
Database Car: 22965
Database Tram: 424
Database Cyclist: 1302
Database Pedestrian: 3579
Database Misc: 776
Database Truck: 870
Database Van: 2343
Database Person_sitting: 182
---------------Data preparation Done---------------

（4）模型训练配置文件修改

选用模型为potinpillar，修改文件如下：

OpenPCDet/tools/cfgs/kitti_models/pointpillar.yaml

在同一路径下，我新建了一个pointpillar_custom.yaml文件。

CLASS_NAMES: ['Car', 'Pedestrian', 'Cyclist']   # 自己的类别

DATA_CONFIG: 
    _BASE_CONFIG_: ./tools/cfgs/dataset_configs/kitti_custom_dataset.yaml   # 此处修改对应路径
    POINT_CLOUD_RANGE: [0, -39.68, -3, 69.12, 39.68, 1]   # 自己的范围
    DATA_PROCESSOR:
        - NAME: mask_points_and_boxes_outside_range
          REMOVE_OUTSIDE_BOXES: True

        - NAME: shuffle_points
          SHUFFLE_ENABLED: {
            'train': True,
            'test': False
          }

        - NAME: transform_points_to_voxels
          VOXEL_SIZE: [0.16, 0.16, 4]
          MAX_POINTS_PER_VOXEL: 32
          MAX_NUMBER_OF_VOXELS: {
            'train': 16000,
            'test': 40000
          }
    DATA_AUGMENTOR:
        DISABLE_AUG_LIST: ['placeholder']
        AUG_CONFIG_LIST:
            - NAME: gt_sampling
              USE_ROAD_PLANE: False
              DB_INFO_PATH:
                   - custom_dbinfos_train.pkl
                  # - kitti_dbinfos_train.pkl      # /pcdet/datasets/augmentor/data_augmentor.py", line 24
              PREPARE: {
                 filter_by_min_points: ['Car:5', 'Pedestrian:5', 'Cyclist:5'],
                 # filter_by_difficulty: [-1],
              }

              SAMPLE_GROUPS: ['Car:15','Pedestrian:15', 'Cyclist:15']
              NUM_POINT_FEATURES: 4
              DATABASE_WITH_FAKELIDAR: False
              REMOVE_EXTRA_WIDTH: [0.0, 0.0, 0.0]
              LIMIT_WHOLE_SCENE: False

            - NAME: random_world_flip
              ALONG_AXIS_LIST: ['x']

            - NAME: random_world_rotation
              WORLD_ROT_ANGLE: [-0.78539816, 0.78539816]

            - NAME: random_world_scaling
              WORLD_SCALE_RANGE: [0.95, 1.05]

MODEL:
    NAME: PointPillar

    VFE:
        NAME: PillarVFE
        WITH_DISTANCE: False
        USE_ABSLOTE_XYZ: True
        USE_NORM: True
        NUM_FILTERS: [64]

    MAP_TO_BEV:
        NAME: PointPillarScatter
        NUM_BEV_FEATURES: 64

    BACKBONE_2D:
        NAME: BaseBEVBackbone
        LAYER_NUMS: [3, 5, 5]
        LAYER_STRIDES: [2, 2, 2]
        NUM_FILTERS: [64, 128, 256]
        UPSAMPLE_STRIDES: [1, 2, 4]
        NUM_UPSAMPLE_FILTERS: [128, 128, 128]

    DENSE_HEAD:
        NAME: AnchorHeadSingle
        CLASS_AGNOSTIC: False

        USE_DIRECTION_CLASSIFIER: True
        DIR_OFFSET: 0.78539
        DIR_LIMIT_OFFSET: 0.0
        NUM_DIR_BINS: 2

        ANCHOR_GENERATOR_CONFIG: [
            {
                'class_name': 'Car',
                'anchor_sizes': [[3.9, 1.6, 1.56]],
                'anchor_rotations': [0, 1.57],
                'anchor_bottom_heights': [-1.78],
                'align_center': False,
                'feature_map_stride': 2,
                'matched_threshold': 0.6,
                'unmatched_threshold': 0.45
            },
            {
                'class_name': 'Pedestrian',
                'anchor_sizes': [[0.8, 0.6, 1.73]],
                'anchor_rotations': [0, 1.57],
                'anchor_bottom_heights': [-0.6],
                'align_center': False,
                'feature_map_stride': 2,
                'matched_threshold': 0.5,
                'unmatched_threshold': 0.35
            },
            {
                'class_name': 'Cyclist',
                'anchor_sizes': [[1.76, 0.6, 1.73]],
                'anchor_rotations': [0, 1.57],
                'anchor_bottom_heights': [-0.6],
                'align_center': False,
                'feature_map_stride': 2,
                'matched_threshold': 0.5,
                'unmatched_threshold': 0.35
            }
        ]

        TARGET_ASSIGNER_CONFIG:
            NAME: AxisAlignedTargetAssigner
            POS_FRACTION: -1.0
            SAMPLE_SIZE: 512
            NORM_BY_NUM_EXAMPLES: False
            MATCH_HEIGHT: False
            BOX_CODER: ResidualCoder

        LOSS_CONFIG:
            LOSS_WEIGHTS: {
                'cls_weight': 1.0,
                'loc_weight': 2.0,
                'dir_weight': 0.2,
                'code_weights': [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0]
            }

    POST_PROCESSING:
        RECALL_THRESH_LIST: [0.3, 0.5, 0.7]
        SCORE_THRESH: 0.1
        OUTPUT_RAW_SCORE: False

        EVAL_METRIC: kitti

        NMS_CONFIG:
            MULTI_CLASSES_NMS: False
            NMS_TYPE: nms_gpu
            NMS_THRESH: 0.01
            NMS_PRE_MAXSIZE: 4096
            NMS_POST_MAXSIZE: 500


OPTIMIZATION:
    BATCH_SIZE_PER_GPU: 4
    NUM_EPOCHS: 80

    OPTIMIZER: adam_onecycle
    LR: 0.003
    WEIGHT_DECAY: 0.01
    MOMENTUM: 0.9

    MOMS: [0.95, 0.85]
    PCT_START: 0.4
    DIV_FACTOR: 10
    DECAY_STEP_LIST: [35, 45]
    LR_DECAY: 0.1
    LR_CLIP: 0.0000001

    LR_WARMUP: False
    WARMUP_EPOCH: 1

    GRAD_NORM_CLIP: 10

（5）启动训练

python tools/train.py --cfg_file tools/cfgs/kitti_models/pointpillar_custom.yaml --batch_size=4 --epochs=320

（6）Debug

2025-05-20 18:17:59,974   INFO  ----------- Create dataloader & network & optimizer -----------
2025-05-20 18:18:00,033   INFO  Database filter by min points Car: 23056 => 0
2025-05-20 18:18:00,034   INFO  Database filter by min points Pedestrian: 3611 => 0
2025-05-20 18:18:00,034   INFO  Database filter by min points Cyclist: 1323 => 0

2025-05-20 16:00:40,954   INFO  **********************Start training cfgs/kitti_models/pointpillar_custom(default)**********************
epochs:   0%|                                                                                                                                                                                                                       | 0/320 [00:00<?, ?it/s]
Traceback (most recent call last):                                                                                                                                                                                                 | 0/1496 [00:00<?, ?it/s]
  File "/home/northlab/Downloads/test/OpenPCDet/tools/train.py", line 233, in <module>
    main()
  File "/home/northlab/Downloads/test/OpenPCDet/tools/train.py", line 178, in main
    train_model(
  File "/home/northlab/Downloads/test/OpenPCDet/tools/train_utils/train_utils.py", line 180, in train_model
    accumulated_iter = train_one_epoch(
  File "/home/northlab/Downloads/test/OpenPCDet/tools/train_utils/train_utils.py", line 33, in train_one_epoch
    batch = next(dataloader_iter)
  File "/home/northlab/.conda/envs/pcdet/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 630, in __next__
    data = self._next_data()
  File "/home/northlab/.conda/envs/pcdet/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 1345, in _next_data
    return self._process_data(data)
  File "/home/northlab/.conda/envs/pcdet/lib/python3.9/site-packages/torch/utils/data/dataloader.py", line 1371, in _process_data
    data.reraise()
  File "/home/northlab/.conda/envs/pcdet/lib/python3.9/site-packages/torch/_utils.py", line 694, in reraise
    raise exception
ValueError: Caught ValueError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "/home/northlab/.conda/envs/pcdet/lib/python3.9/site-packages/torch/utils/data/_utils/worker.py", line 308, in _worker_loop
    data = fetcher.fetch(index)
  File "/home/northlab/.conda/envs/pcdet/lib/python3.9/site-packages/torch/utils/data/_utils/fetch.py", line 51, in fetch
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/northlab/.conda/envs/pcdet/lib/python3.9/site-packages/torch/utils/data/_utils/fetch.py", line 51, in <listcomp>
    data = [self.dataset[idx] for idx in possibly_batched_index]
  File "/home/northlab/Downloads/test/OpenPCDet/pcdet/datasets/custom/custom_dataset.py", line 375, in __getitem__
    data_dict = self.prepare_data(data_dict=input_dict)
  File "/home/northlab/Downloads/test/OpenPCDet/pcdet/datasets/dataset.py", line 185, in prepare_data
    data_dict = self.data_augmentor.forward(
  File "/home/northlab/Downloads/test/OpenPCDet/pcdet/datasets/augmentor/data_augmentor.py", line 302, in forward
    data_dict = cur_augmentor(data_dict=data_dict)
  File "/home/northlab/Downloads/test/OpenPCDet/pcdet/datasets/augmentor/database_sampler.py", line 468, in __call__
    sampled_boxes = np.stack([x['box3d_lidar'] for x in sampled_dict], axis=0).astype(np.float32)
  File "<__array_function__ internals>", line 200, in stack
  File "/home/northlab/.conda/envs/pcdet/lib/python3.9/site-packages/numpy/core/shape_base.py", line 460, in stack
    raise ValueError('need at least one array to stack')
ValueError: need at least one array to stack

我启动train时，会提示加载数据失败。查看Dataloader才发现，对于每个box都会进行点数的筛查。导致没有筛查通过。经过反思以上内容，认为是标签文件从kitti转为custom时，此时需要的是相对雷达的坐标。而直接保留kitti还是相机坐标系，因此导致custom训练失败。

纠正方法，在上面，注意调整个人标注与kitti坐标系的转换上面。

3.5 评估训练效果。

python tools/demo.py --cfg_file ./tools/cfgs/kitti_models/pointpillar_custom.yaml --data_path ./data/custom/testing/velodyne/ --ckpt ./output/cfgs/kitti_models/pointpillar_custom/default/ckpt/latest_model.pth

这个查看是关掉一个，显示下一个。

第二个感觉非常差劲。

结论：目前可以跑通整个流程，但是细节上优化不够。需要重新研究输入的训练标注情况和输出的位置关系。

效果真不好。

待完善：

1、需要改写demo.py，既要把结果通过open3d显示，又要通过日志的形式将类别和位置输出出来。然后看看以何种方式分发转发出去。

2、当前只用了三个类别：['Car', 'Pedestrian', 'Cyclist']，我们可以再把['Truck']加入进去，看看效果。重新修改类别和配置文件，看看效果如何？

import os
import math
import numpy as np
import sys
 
def trans_detection_label(src_label_path, tgt_label_path, start_idx=None, end_idx=None):
    files = os.listdir(src_label_path)
    files.sort()  # 确保文件按名称排序
 
     # 指定要替换的固定值
    fixed_values = "1.0 0 0.0 -1 -1 -1 -1"


    # 如果指定了 start_idx 和 end_idx，过滤出范围内的文件
    if start_idx is not None and end_idx is not None:
        files = [f for f in files if f.split('.')[0].isdigit() and start_idx <= int(f.split('.')[0]) <= end_idx]
 
    for fname in files:
        frame, _ = os.path.splitext(fname)
        print(frame)
 
        with open(os.path.join(src_label_path, fname), encoding='utf-8') as f_in, open(
            os.path.join(tgt_label_path, frame + ".txt"), 'w') as f_out:

            for line in f_in:
                parts = line.strip().split(' ')

                # 根据条件修改 obj_type
                # if parts[0] == "DontCare":
                if parts[0] in ['Car', 'Pedestrian', 'Cyclist', 'Truck']:
                    # 保留类别、尺寸、位置和朝向
                    new_line = f"{parts[0]} {fixed_values} {parts[8]} {parts[9]} {parts[10]} {parts[13]} {-float(parts[11])} {parts[12]} {parts[14]}"
                    f_out.write(new_line + '\n')
                    if parts[0] == "Truck":
                        print("++++++" , frame, "===========")
                else:
                    continue  # 跳过DontCare行


    print(f"The finished frame in the sequence is : {frame}")
 
if __name__ == "__main__":
    src_label = "/home/vincent/data/handle_labels/data_object_label_2/training/label_2"  # 替换成自己的路径
    tgt_label = "/home/vincent/data/handle_labels/data_object_label_2/training/labels"
    # 这里你可以指定开始和结束的索引，例如处理000001到001000范围内的文件
    start_idx = 0
    end_idx = 10000
    trans_detection_label(src_label, tgt_label, start_idx, end_idx)

重新生成zip压缩包。上传到服务器。然后解压。copy到指定路径下。生成数据分割。

删除*.pkl，删除ImageSets文件夹。删除gt_database文件夹。

按照四章中，3.4节的操作。预处理数据集。

修改：Openpcdet-Test-master/pcdet/datasets/custom/custom_dataset.py中的类别信息。

create_custom_infos(
   dataset_cfg=dataset_cfg,
   class_names=['Car', 'Pedestrian', 'Cyclist', 'Truck'], # 1.修改类别
   data_path=ROOT_DIR / 'data' / 'custom',
   save_path=ROOT_DIR / 'data' / 'custom'
   )

Openpcdet-Test-master/tools/cfgs/dataset_configs/kitti_custom_dataset.yaml

在OpenPCDet/tools/cfgs/kitti_models/pointpillar_custom.yaml文件中，修改类别，并将ANCHOR配置：

      {
          'class_name': 'Truck',
          'anchor_sizes': [[7.0, 2.5, 3.0]],  # Truck 的尺寸
          'anchor_rotations': [0, 1.57],
          'anchor_bottom_heights': [-2.0],    # 根据 Truck 的高度调整
          'align_center': False,
          'feature_map_stride': 2,
          'matched_threshold': 0.6,          # 根据需要调整
          'unmatched_threshold': 0.45        # 根据需要调整
      }

2025-06-10 12:03:35,300   INFO  ----------- Create dataloader & network & optimizer -----------
2025-06-10 12:03:35,363   INFO  Database filter by min points Car: 22997 => 921
2025-06-10 12:03:35,363   INFO  Database filter by min points Pedestrian: 3619 => 427
2025-06-10 12:03:35,363   INFO  Database filter by min points Cyclist: 1294 => 152
2025-06-10 12:03:35,363   INFO  Database filter by min points Truck: 897 => 856
2025-06-10 12:03:35,367   INFO  Loading Custom dataset.
2025-06-10 12:03:35,406   INFO  Total samples for CUSTOM dataset: 5984

4、通过CPU虚拟机尝试调用模型来检测。

5、启动ros包，让其检测，看看效果。

只要跑通流程即可，并无真实数据

6、后处理，通过获取的框。优化渲染显示效果。

7、将输出结果封装发送

8、研究可行驶区域划分

五、部署问题

ONNX和TensorRT，是模型部署和推理加速的应用。是模型训练好之后部署的问题。

点击链接查看和 Kimi 的对话

User: 我用OpenPCDet训练好pointpillars模型之后，onnx负责做什么？

Kimi: 在使用 OpenPCDet 训练好 PointPillars 模型后，ONNX（Open Neural Network Exchange）通常用于模型的转换和优化，以便将其部署到不同的硬件平台或推理框架中。以下是 ONNX 在此场景中的主要作用：

### 1. **模型转换**
- **格式统一**：OpenPCDet 训练的模型通常是以 PyTorch 格式保存的，而 ONNX 是一种开放的模型格式，可以将 PyTorch 模型转换为 ONNX 格式。这使得模型能够被不同的推理框架（如 TensorRT、ONNX Runtime、OpenVINO 等）加载和使用。
- **跨框架兼容性**：ONNX 提供了一种标准化的方式，使得模型可以无缝地从 PyTorch 转换到其他深度学习框架或推理引擎。例如，你可以将 PyTorch 训练好的 PointPillars 模型转换为 ONNX 格式，然后在 TensorRT 中进行优化，以实现更高的推理速度。

### 2. **模型优化**
- **推理加速**：ONNX Runtime 是一个高性能的推理引擎，可以直接加载 ONNX 格式的模型并进行推理。它对模型进行了优化，包括算子融合、内存优化等，能够提高模型的推理速度，同时减少资源消耗。
- **硬件适配**：ONNX 格式的模型可以被不同的硬件加速器（如 GPU、FPGA、ASIC 等）加载和优化。例如，通过将模型转换为 ONNX 格式，你可以利用 NVIDIA 的 TensorRT 或 OpenVINO 等工具，针对特定硬件进行进一步的优化，以实现最佳的推理性能。

### 3. **部署支持**
- **跨平台部署**：ONNX 格式的模型可以被多种操作系统和硬件平台支持，包括但不限于 Windows、Linux、macOS、移动设备等。这使得 PointPillars 模型可以更容易地部署到不同的应用场景中。
- **简化部署流程**：ONNX 提供了统一的接口和工具，使得模型的部署过程更加简单。例如，你可以使用 ONNX Runtime 提供的 API，将模型快速部署到服务器或边缘设备上。

### 4. **模型验证**
- **一致性检查**：在将 PyTorch 模型转换为 ONNX 格式后，可以通过 ONNX 的工具对模型进行验证，确保转换后的模型与原始模型在功能上保持一致。这有助于发现转换过程中可能出现的错误或不兼容的算子。
- **调试支持**：ONNX 提供了丰富的调试工具，可以帮助开发者检查模型在转换和推理过程中的行为，确保模型的正确性。

### 5. **模型共享**
- **模型共享与协作**：ONNX 格式的模型文件是开放的，可以方便地在团队成员之间共享，或者在不同的项目中复用。这有助于提高开发效率，促进团队协作。

总结
在 OpenPCDet 训练好 PointPillars 模型后，ONNX 的主要作用是将模型从 PyTorch 格式转换为一种通用的、跨框架的格式，以便进行进一步的优化、部署和共享。

通过 ONNX，你可以将模型部署到不同的硬件平台，利用各种推理引擎的优势，实现高效的推理性能。