Code Reading Notes (9): NEW pointpillars

南徐炼丹大师

Last modified: 2023-09-10 19:51:02

First published: 2023-07-05 10:08:02

Article link: https://blog.csdn.net/weixin_45699456/article/details/131542766

GitHub - zhulf0804/PointPillars: A Simple PointPillars PyTorch Implenmentation for 3D Lidar (KITTI) Detection.

train.py

Argument setup

    parser = argparse.ArgumentParser(description='Configuration Parameters')
    parser.add_argument('--data_root', default='/mnt/ssd1/lifa_rdata/det/kitti',
                        help='your data root for kitti')
    parser.add_argument('--saved_path', default='pillar_logs')
    parser.add_argument('--batch_size', type=int, default=6)
    parser.add_argument('--num_workers', type=int, default=4)
    parser.add_argument('--nclasses', type=int, default=3)
    parser.add_argument('--init_lr', type=float, default=0.00025)
    parser.add_argument('--max_epoch', type=int, default=160)
    parser.add_argument('--log_freq', type=int, default=8)
    parser.add_argument('--ckpt_freq_epoch', type=int, default=20)
    parser.add_argument('--no_cuda', action='store_true',
                        help='whether to use cuda')

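- `data_root`: defaults to '/mnt/ssd1/lifa_rdata/det/kitti'; the root directory of the KITTI dataset.
- `saved_path`: defaults to 'pillar_logs'; the directory where logs and checkpoints are written.
- `batch_size`: integer, default 6; the batch size.
- `num_workers`: integer, default 4; the number of worker subprocesses used by the data loader.
- `nclasses`: integer, default 3; the number of object classes.
- `init_lr`: float, default 0.00025; the initial learning rate.
- `max_epoch`: integer, default 160; the number of training epochs.
- `log_freq`: integer, default 8; a summary is written every this many training steps.
- `ckpt_freq_epoch`: integer, default 20; a checkpoint is saved every this many epochs.
- `no_cuda`: boolean flag; when it is passed, the model stays on the CPU instead of using CUDA.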
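To see how these flags behave, here is a minimal sketch that rebuilds a subset of the parser and feeds it an explicit argument list instead of the real command line. Only three of the flags are reproduced; the variable names `defaults` and `custom` are made up for the illustration.

```python
import argparse

# A subset of the flags from train.py. parse_args() receives an explicit list
# so the sketch runs without a real command line.
parser = argparse.ArgumentParser(description='Configuration Parameters')
parser.add_argument('--batch_size', type=int, default=6)
parser.add_argument('--init_lr', type=float, default=0.00025)
parser.add_argument('--no_cuda', action='store_true', help='whether to use cuda')

defaults = parser.parse_args([])                             # nothing passed
custom = parser.parse_args(['--batch_size', '2', '--no_cuda'])

print(defaults.batch_size, defaults.no_cuda)  # 6 False
print(custom.batch_size, custom.no_cuda)      # 2 True
```

Because `--no_cuda` uses `action='store_true'`, it takes no value: its mere presence turns the attribute to `True`.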

main()

    setup_seed()  # fix the random seeds so that runs are reproducible
    # data: build the datasets first, then wrap them in dataloaders
    train_dataset = Kitti(data_root=args.data_root,
                          split='train')
    val_dataset = Kitti(data_root=args.data_root,
                        split='val')
    train_dataloader = get_dataloader(dataset=train_dataset,
                                      batch_size=args.batch_size,
                                      num_workers=args.num_workers,
                                      shuffle=True)
    val_dataloader = get_dataloader(dataset=val_dataset,
                                    batch_size=args.batch_size,
                                    num_workers=args.num_workers,
                                    shuffle=False)
    # use CUDA unless --no_cuda was passed
    if not args.no_cuda:
        pointpillars = PointPillars(nclasses=args.nclasses).cuda()
    else:
        pointpillars = PointPillars(nclasses=args.nclasses)
    loss_func = Loss()  # loss function

    # total iterations, initial learning rate, optimizer, and LR scheduler (adjusts the learning rate during training)
    max_iters = len(train_dataloader) * args.max_epoch
    init_lr = args.init_lr
    optimizer = torch.optim.AdamW(params=pointpillars.parameters(),
                                  lr=init_lr,
                                  betas=(0.95, 0.99),
                                  weight_decay=0.01)
    scheduler = torch.optim.lr_scheduler.OneCycleLR(optimizer,
                                                    max_lr=init_lr * 10,
                                                    total_steps=max_iters,
                                                    pct_start=0.4,
                                                    anneal_strategy='cos',
                                                    cycle_momentum=True,
                                                    base_momentum=0.95 * 0.895,
                                                    max_momentum=0.95,
                                                    div_factor=10)
    # TensorBoard summary directory (for later visualization) and checkpoint directory
    saved_logs_path = os.path.join(args.saved_path, 'summary')
    os.makedirs(saved_logs_path, exist_ok=True)
    writer = SummaryWriter(saved_logs_path)
    saved_ckpt_path = os.path.join(args.saved_path, 'checkpoints')
    os.makedirs(saved_ckpt_path, exist_ok=True)

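Learning rate scheduler:

`torch.optim.lr_scheduler.OneCycleLR` adjusts the learning rate at every training step. It follows the 1cycle policy: over a single cycle that spans the whole run, the learning rate first rises to a maximum and then anneals down to a value far below the starting one. The policy is intended to speed up convergence and is often reported to help generalization.

The arguments passed to the `OneCycleLR` constructor are:
- `optimizer`: the wrapped PyTorch optimizer, here `AdamW`.
- `max_lr`: the peak learning rate, here `init_lr * 10`.
- `total_steps`: the total number of scheduler steps. Here it is `len(train_dataloader) * args.max_epoch`, because `scheduler.step()` is called once per batch.
- `pct_start`: the fraction of the cycle spent increasing the learning rate. With 0.4, the learning rate rises during the first 40% of the steps. Because `anneal_strategy` is 'cos', the rise follows a cosine curve rather than a straight line.
- `anneal_strategy`: the shape of the annealing curve, 'cos' (cosine) or 'linear'. It applies to both the rising and the falling phase.
- `cycle_momentum`: if True, momentum is cycled inversely to the learning rate between `base_momentum` and `max_momentum`. For Adam-style optimizers the cycled value is the first beta.
- `base_momentum`: the lower momentum bound, reached when the learning rate is at its peak.
- `max_momentum`: the upper momentum bound, which is also the momentum at the start of the cycle.
- `div_factor`: determines the starting learning rate as `max_lr / div_factor`. With `max_lr = init_lr * 10` and `div_factor=10`, training starts exactly at `init_lr`.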
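The shape of this schedule can be reproduced with a few lines of plain Python. The sketch below is a hand-written re-derivation of the two-phase cosine schedule as PyTorch documents it, not the library's own code; the function name `one_cycle_lr` is made up, and `final_div_factor=1e4` is assumed to be the library default since train.py does not set it.

```python
import math

def one_cycle_lr(step, total_steps, max_lr, pct_start=0.4,
                 div_factor=10.0, final_div_factor=1e4):
    """Cosine one-cycle learning rate at a 0-based step (illustrative sketch)."""
    initial_lr = max_lr / div_factor          # starting learning rate
    min_lr = initial_lr / final_div_factor    # learning rate at the very end
    warmup_end = pct_start * total_steps - 1  # last step of the rising phase
    last_step = total_steps - 1

    def cos_anneal(start, end, pct):
        # pct = 0 returns start, pct = 1 returns end
        return end + (start - end) / 2.0 * (math.cos(math.pi * pct) + 1.0)

    if step <= warmup_end:
        return cos_anneal(initial_lr, max_lr, step / warmup_end)
    return cos_anneal(max_lr, min_lr,
                      (step - warmup_end) / (last_step - warmup_end))

init_lr = 0.00025
schedule = [one_cycle_lr(s, 100, init_lr * 10) for s in range(100)]
print(schedule[0], max(schedule), schedule[-1])
```

With 100 steps the peak is reached at step 39, the end of the first 40% of the run, and the final value is four orders of magnitude below the starting one.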

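With the setup done, training starts. The excerpt below is the body of the per-epoch loop; the enclosing `for epoch in ...` line is not shown.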

        print('=' * 20, epoch, '=' * 20)
        train_step, val_step = 0, 0
        for i, data_dict in enumerate(tqdm(train_dataloader)):
            if not args.no_cuda:
                # move the tensors to the cuda
                for key in data_dict:
                    for j, item in enumerate(data_dict[key]):
                        if torch.is_tensor(item):
                            data_dict[key][j] = data_dict[key][j].cuda()

            optimizer.zero_grad()
            # unpack the batch: points, ground-truth boxes, labels, difficulty
            batched_pts = data_dict['batched_pts']
            batched_gt_bboxes = data_dict['batched_gt_bboxes']
            batched_labels = data_dict['batched_labels']
            batched_difficulty = data_dict['batched_difficulty']
            # forward pass: class scores, box regression, direction class, anchor targets
            bbox_cls_pred, bbox_pred, bbox_dir_cls_pred, anchor_target_dict = \
                pointpillars(batched_pts=batched_pts,
                             mode='train',
                             batched_gt_bboxes=batched_gt_bboxes,
                             batched_gt_labels=batched_labels)
            # move channels last and flatten to one row per anchor
            bbox_cls_pred = bbox_cls_pred.permute(0, 2, 3, 1).reshape(-1, args.nclasses)
            bbox_pred = bbox_pred.permute(0, 2, 3, 1).reshape(-1, 7)
            bbox_dir_cls_pred = bbox_dir_cls_pred.permute(0, 2, 3, 1).reshape(-1, 2)

            # anchor targets: labels, label weights, box regression targets, direction labels
            batched_bbox_labels = anchor_target_dict['batched_labels'].reshape(-1)
            batched_label_weights = anchor_target_dict['batched_label_weights'].reshape(-1)
            batched_bbox_reg = anchor_target_dict['batched_bbox_reg'].reshape(-1, 7)
            # batched_bbox_reg_weights = anchor_target_dict['batched_bbox_reg_weights'].reshape(-1)
            batched_dir_labels = anchor_target_dict['batched_dir_labels'].reshape(-1)
            # batched_dir_labels_weights = anchor_target_dict['batched_dir_labels_weights'].reshape(-1)

            # keep positive anchors only (labels in [0, nclasses))
            pos_idx = (batched_bbox_labels >= 0) & (batched_bbox_labels < args.nclasses)
            bbox_pred = bbox_pred[pos_idx]
            batched_bbox_reg = batched_bbox_reg[pos_idx]
            # sin(a - b) = sin(a)*cos(b) - cos(a)*sin(b)
            bbox_pred[:, -1] = torch.sin(bbox_pred[:, -1].clone()) * torch.cos(batched_bbox_reg[:, -1].clone())
            batched_bbox_reg[:, -1] = torch.cos(bbox_pred[:, -1].clone()) * torch.sin(batched_bbox_reg[:, -1].clone())
            bbox_dir_cls_pred = bbox_dir_cls_pred[pos_idx]
            batched_dir_labels = batched_dir_labels[pos_idx]

            # classification branch: keep anchors with a non-zero label weight; labels below 0 are remapped to the background index nclasses
            num_cls_pos = (batched_bbox_labels < args.nclasses).sum()
            bbox_cls_pred = bbox_cls_pred[batched_label_weights > 0]
            batched_bbox_labels[batched_bbox_labels < 0] = args.nclasses
            batched_bbox_labels = batched_bbox_labels[batched_label_weights > 0]

            # compute the losses
            loss_dict = loss_func(bbox_cls_pred=bbox_cls_pred,
                                  bbox_pred=bbox_pred,
                                  bbox_dir_cls_pred=bbox_dir_cls_pred,
                                  batched_labels=batched_bbox_labels,
                                  num_cls_pos=num_cls_pos,
                                  batched_bbox_reg=batched_bbox_reg,
                                  batched_dir_labels=batched_dir_labels)

            # backward pass, parameter update, scheduler step, then optional logging
            loss = loss_dict['total_loss']
            loss.backward()
            # torch.nn.utils.clip_grad_norm_(pointpillars.parameters(), max_norm=35)
            optimizer.step()
            scheduler.step()

            global_step = epoch * len(train_dataloader) + train_step + 1

            if global_step % args.log_freq == 0:
                save_summary(writer, loss_dict, global_step, 'train',
                             lr=optimizer.param_groups[0]['lr'],
                             momentum=optimizer.param_groups[0]['betas'][0])
            train_step += 1
        # save a checkpoint every ckpt_freq_epoch epochs
        if (epoch + 1) % args.ckpt_freq_epoch == 0:
            torch.save(pointpillars.state_dict(), os.path.join(saved_ckpt_path, f'epoch_{epoch + 1}.pth'))

        if epoch % 2 == 0:
            continue

The evaluation code is similar.
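The comment `sin(a - b) = sin(a)*cos(b) - cos(a)*sin(b)` in the training loop describes how the yaw angle is encoded: the prediction is replaced by `sin(a)*cos(b)` and the target by `cos(a)*sin(b)`, so the difference between the two equals `sin(a - b)`. One detail worth checking in the excerpt: the second assignment reads `bbox_pred[:, -1]` after the first assignment has already replaced it, so its cosine is taken of the encoded value rather than of the raw prediction. The sketch below computes both terms from the raw values; the helper name `encode_yaw_pair` is made up, and this is an illustration of the identity, not the repository's code.

```python
import math

def encode_yaw_pair(pred, target):
    """Return (sin(pred)*cos(target), cos(pred)*sin(target)).

    Both terms use the raw prediction, so nothing is overwritten in between.
    """
    return (math.sin(pred) * math.cos(target),
            math.cos(pred) * math.sin(target))

pred, target = 0.9, 0.4
enc_pred, enc_target = encode_yaw_pair(pred, target)

# The difference of the two encoded terms equals sin(pred - target).
print(enc_pred - enc_target, math.sin(pred - target))
```

With tensors, the same effect is obtained by storing the raw prediction in a temporary variable before the first in-place assignment.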

Network definition: pointpillars.py

class PointPillars(nn.Module)

init

def __init__(self,
             nclasses=3,
             voxel_size=[0.16, 0.16, 4],
             point_cloud_range=[0, -39.68, -3, 69.12, 39.68, 1],
             max_num_points=32,
             max_voxels=(16000, 40000)):

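`nclasses`: the number of object classes in the dataset, default 3.
`voxel_size`: the voxel size, default `[0.16, 0.16, 4]`. A voxel is a small box-shaped cell in 3D space; the three elements are its size in meters along the x, y, and z axes. The z size of 4 covers the whole z range (-3 to 1), so every voxel is a vertical column, i.e. a pillar.
`point_cloud_range`: the extent of the point cloud, default `[0, -39.68, -3, 69.12, 39.68, 1]`, ordered as `[x_min, y_min, z_min, x_max, y_max, z_max]`. For example, the x range is 0 to 69.12.
`max_num_points`: the maximum number of points kept in each voxel, default 32.
`max_voxels`: the maximum number of voxels, default `(16000, 40000)`. It caps how many non-empty voxels are kept per point cloud. The two elements are the limits used during training and during inference respectively; they are not per-axis limits.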
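How `voxel_size` and `point_cloud_range` determine which pillar a point lands in can be sketched as follows. This is only the indexing rule, written as an assumption about the usual convention (floor of the offset from the range minimum divided by the voxel size, half-open bounds); the repository's `Voxelization` layer is a separate implementation, and the function name `pillar_index` is made up.

```python
import math

voxel_size = [0.16, 0.16, 4]
point_cloud_range = [0, -39.68, -3, 69.12, 39.68, 1]

def pillar_index(x, y, z):
    """Return (ix, iy) of the pillar containing the point, or None if out of range."""
    x_min, y_min, z_min, x_max, y_max, z_max = point_cloud_range
    inside = (x_min <= x < x_max) and (y_min <= y < y_max) and (z_min <= z < z_max)
    if not inside:
        return None
    ix = math.floor((x - x_min) / voxel_size[0])
    iy = math.floor((y - y_min) / voxel_size[1])
    return ix, iy

print(pillar_index(0.1, -39.6, 0.0))   # (0, 0)
print(pillar_index(10.0, 1.0, -1.0))   # (62, 254)
print(pillar_index(80.0, 0.0, 0.0))    # None
```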

self.nclasses = nclasses
self.pillar_layer = PillarLayer(voxel_size=voxel_size,
                                point_cloud_range=point_cloud_range,
                                max_num_points=max_num_points,
                                max_voxels=max_voxels)
self.pillar_encoder = PillarEncoder(voxel_size=voxel_size,
                                    point_cloud_range=point_cloud_range,
                                    in_channel=9,
                                    out_channel=64)
self.backbone = Backbone(in_channel=64,
                         out_channels=[64, 128, 256],
                         layer_nums=[3, 5, 5])
self.neck = Neck(in_channels=[64, 128, 256],
                 upsample_strides=[1, 2, 4],
                 out_channels=[128, 128, 128])
self.head = Head(in_channel=384, n_anchors=2 * nclasses, n_classes=nclasses)

# anchors
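# one range per class, as [x_min, y_min, z_min, x_max, y_max, z_max]; e.g. the first covers x from 0 to 69.12 and y from -39.68 to 39.68, and since z_min == z_max == -0.6 its anchor centers sit at a fixed height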
ranges = [[0, -39.68, -0.6, 69.12, 39.68, -0.6],
          [0, -39.68, -0.6, 69.12, 39.68, -0.6],
          [0, -39.68, -1.78, 69.12, 39.68, -1.78]]
# one anchor size per class: width, length, height
sizes = [[0.6, 0.8, 1.73], [0.6, 1.76, 1.73], [1.6, 3.9, 1.56]]
# anchor rotation angles in radians (0 and roughly pi / 2)
rotations = [0, 1.57]
self.anchors_generator = Anchors(ranges=ranges,
                                 sizes=sizes,
                                 rotations=rotations)

# train
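# 'pos_iou_thr': IoU (Intersection over Union) threshold for positives. An anchor whose highest IoU with any ground-truth box is >= this value becomes a positive sample.
# 'neg_iou_thr': IoU threshold for negatives. An anchor whose highest IoU is below this value becomes a negative sample; anchors between the two thresholds are ignored.
# 'min_iou_thr': lower bound for the per-ground-truth rule (MaxIoU-style assignment): the anchor that best matches a ground-truth box is also made positive, provided its IoU reaches this value.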
self.assigners = [
    {'pos_iou_thr': 0.5, 'neg_iou_thr': 0.35, 'min_iou_thr': 0.35},
    {'pos_iou_thr': 0.5, 'neg_iou_thr': 0.35, 'min_iou_thr': 0.35},
    {'pos_iou_thr': 0.6, 'neg_iou_thr': 0.45, 'min_iou_thr': 0.45},
]

# val and test
self.nms_pre = 100  # number of highest-scoring boxes kept before non-maximum suppression (NMS)
self.nms_thr = 0.01  # NMS IoU threshold: a box that overlaps a higher-scoring box by at least this much is suppressed
self.score_thr = 0.1  # confidence threshold: boxes scoring below it are discarded
self.max_num = 50  # maximum number of boxes kept per point cloud; if more survive NMS, the self.max_num highest-scoring ones are kept
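The role of the two main assigner thresholds can be illustrated with a simplified rule that classifies a single anchor from its best IoU. This is a sketch under stated assumptions: the function name `assign_anchor` is made up, and the additional rule tied to `min_iou_thr` (promoting the best-matching anchor of each ground-truth box) is left out.

```python
def assign_anchor(max_iou, pos_iou_thr, neg_iou_thr):
    """Classify one anchor from its best IoU against all ground-truth boxes.

    Returns 'positive', 'negative', or 'ignore'.
    """
    if max_iou >= pos_iou_thr:
        return 'positive'
    if max_iou < neg_iou_thr:
        return 'negative'
    return 'ignore'

# Third entry of self.assigners (pos 0.6, neg 0.45)
cfg = {'pos_iou_thr': 0.6, 'neg_iou_thr': 0.45, 'min_iou_thr': 0.45}
for iou in (0.7, 0.5, 0.2):
    print(iou, assign_anchor(iou, cfg['pos_iou_thr'], cfg['neg_iou_thr']))
```

Anchors in the `ignore` band contribute to neither the classification nor the regression loss, which matches the zero label weights filtered out in the training loop.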

class PillarLayer(nn.Module):

class PillarLayer(nn.Module):
    def __init__(self, voxel_size, point_cloud_range, max_num_points, max_voxels):
        super().__init__()
        self.voxel_layer = Voxelization(voxel_size=voxel_size,
                                        point_cloud_range=point_cloud_range,
                                        max_num_points=max_num_points,
                                        max_voxels=max_voxels)

    @torch.no_grad()
    def forward(self, batched_pts):
        '''
        batched_pts: list[tensor], len(batched_pts) = bs
        return: 
               pillars: (p1 + p2 + ... + pb, num_points, c), 
               coors_batch: (p1 + p2 + ... + pb, 1 + 3), 
               num_points_per_pillar: (p1 + p2 + ... + pb, ), (b: batch size)
        '''
        pillars, coors, npoints_per_pillar = [], [], []
        for i, pts in enumerate(batched_pts):
            voxels_out, coors_out, num_points_per_voxel_out = self.voxel_layer(pts)
            # voxels_out: (max_voxel, num_points, c), coors_out: (max_voxel, 3)
            # num_points_per_voxel_out: (max_voxel, )
            pillars.append(voxels_out)
            coors.append(coors_out.long())
            npoints_per_pillar.append(num_points_per_voxel_out)

        pillars = torch.cat(pillars, dim=0)  # (p1 + p2 + ... + pb, num_points, c)
        npoints_per_pillar = torch.cat(npoints_per_pillar, dim=0)  # (p1 + p2 + ... + pb, )
        coors_batch = []
        for i, cur_coors in enumerate(coors):
            coors_batch.append(F.pad(cur_coors, (1, 0), value=i))
        coors_batch = torch.cat(coors_batch, dim=0)  # (p1 + p2 + ... + pb, 1 + 3)

        return pillars, coors_batch, npoints_per_pillar

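This code defines `PillarLayer`, a PyTorch module that voxelizes point clouds into pillars.

`__init__` takes `voxel_size`, `point_cloud_range`, `max_num_points` (maximum points per voxel), and `max_voxels` (maximum number of voxels) and passes them to a `Voxelization` object stored as `voxel_layer`, which performs the actual voxelization.

`forward` runs under the `@torch.no_grad()` decorator, so no gradients are tracked. Its input `batched_pts` is a list of point-cloud tensors, one per sample, so `len(batched_pts)` equals the batch size `bs`.

Each sample is passed through `voxel_layer`. The resulting voxels are appended to the `pillars` list, the voxel coordinates to `coors`, and the per-voxel point counts to `npoints_per_pillar`.

`torch.cat` then concatenates the per-sample results into a `pillars` tensor of shape `(p1 + p2 + ... + pb, num_points, c)`, where `p1 + p2 + ... + pb` is the total number of pillars in the batch, `num_points` is the number of point slots per pillar (`max_num_points`, zero-padded when a pillar holds fewer points), and `c` is the number of features per point.

For the coordinates, `F.pad(cur_coors, (1, 0), value=i)` first prepends the sample index `i` as an extra leading column, and the padded tensors are then concatenated into `coors_batch` of shape `(p1 + p2 + ... + pb, 1 + 3)`. The extra column records which sample each pillar belongs to.

The per-voxel point counts are concatenated the same way into `npoints_per_pillar` of shape `(p1 + p2 + ... + pb, )`.

`forward` finally returns three tensors: `pillars` (the voxelized points), `coors_batch` (the pillar coordinates with the batch index), and `npoints_per_pillar` (the number of real points in each pillar).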
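The batch-index trick can be shown with plain lists. The sketch below mimics what `F.pad(cur_coors, (1, 0), value=i)` followed by `torch.cat` produces; the function name `add_batch_index` and the coordinate values are made up for the illustration.

```python
def add_batch_index(coors_per_sample):
    """Prepend the sample index to every pillar coordinate row.

    coors_per_sample: one list per sample, each holding 3-element coordinates.
    Returns a single flat list of rows [batch_idx, c0, c1, c2].
    """
    coors_batch = []
    for i, cur_coors in enumerate(coors_per_sample):
        for row in cur_coors:
            coors_batch.append([i] + list(row))
    return coors_batch

sample0 = [[0, 1, 0], [5, 7, 0]]   # two pillars in sample 0
sample1 = [[2, 2, 0]]              # one pillar in sample 1
coors_batch = add_batch_index([sample0, sample1])
print(coors_batch)  # [[0, 0, 1, 0], [0, 5, 7, 0], [1, 2, 2, 0]]
```

Keeping the sample index inside each row is what lets all pillars of a batch live in one flat tensor while still being separable later.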

class PillarEncoder(nn.Module)

self.out_channel = out_channel
self.vx, self.vy = voxel_size[0], voxel_size[1]
self.x_offset = voxel_size[0] / 2 + point_cloud_range[0]  # x coordinate of the first pillar center (y_offset below is the same along y)
self.y_offset = voxel_size[1] / 2 + point_cloud_range[1]
self.x_l = int((point_cloud_range[3] - point_cloud_range[0]) / voxel_size[0])
self.y_l = int((point_cloud_range[4] - point_cloud_range[1]) / voxel_size[1])

self.conv = nn.Conv1d(in_channel, out_channel, 1, bias=False)
self.bn = nn.BatchNorm1d(out_channel, eps=1e-3, momentum=0.01)

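`voxel_size[0]` is the voxel size along x and `point_cloud_range[0]` is where the point cloud range starts along x. Half a voxel plus the start gives `x_offset`, the x coordinate of the center of the first pillar; `y_offset` is computed the same way.

`point_cloud_range[3]` and `point_cloud_range[0]` are the end and the start of the range along x. Their difference divided by `voxel_size[0]` gives `x_l`, the number of pillars along x; `y_l` is the corresponding count along y.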
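Plugging in the default values makes these quantities concrete. The sketch uses `round()` rather than `int()` as a guard against floating-point quotients that land just below a whole number; that substitution is a choice made for this illustration.

```python
voxel_size = [0.16, 0.16, 4]
point_cloud_range = [0, -39.68, -3, 69.12, 39.68, 1]

# Center of the first pillar along x and y
x_offset = voxel_size[0] / 2 + point_cloud_range[0]
y_offset = voxel_size[1] / 2 + point_cloud_range[1]

# Number of pillars along x and y
x_l = round((point_cloud_range[3] - point_cloud_range[0]) / voxel_size[0])
y_l = round((point_cloud_range[4] - point_cloud_range[1]) / voxel_size[1])

print(x_offset, y_offset)  # approximately 0.08 and -39.6
print(x_l, y_l)            # 432 496
```

The grid therefore has 432 x 496 pillar cells, which is the canvas size used later in the scatter step.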

forward()

# 1. calculate offset to the points center (in each pillar)
offset_pt_center = pillars[:, :, :3] - torch.sum(pillars[:, :, :3], dim=1, keepdim=True) / npoints_per_pillar[:, None, None]  # (p1 + p2 + ... + pb, num_points, 3)

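First, `pillars[:, :, :3]` selects the first three channels of the last dimension, i.e. the [x, y, z] coordinates of every point.

Next, `torch.sum(pillars[:, :, :3], dim=1, keepdim=True)` sums the coordinates over the points dimension (dim 1). The result has shape `(P, 1, 3)`, where `P = p1 + p2 + ... + pb` is the total number of pillars, and holds the coordinate sum of each pillar.

Then `npoints_per_pillar[:, None, None]` reshapes `npoints_per_pillar` from `(P,)` to `(P, 1, 1)` so that it broadcasts against the sum. It holds the number of real points in each pillar.

Dividing the sum by the point count gives the mean position (centroid) of the points in each pillar. Padded slots are zeros, so they do not change the sum.

`offset_pt_center` therefore has the same shape as `pillars[:, :, :3]` and holds the offset of every point from the centroid of its pillar.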
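A small worked example with plain lists shows the computation, including what happens to padded slots. The function name `offsets_to_centroid` and the numbers are made up for the illustration.

```python
def offsets_to_centroid(pillar, npoints):
    """pillar: list of [x, y, z] rows, zero-padded; npoints: number of real points."""
    sums = [sum(p[k] for p in pillar) for k in range(3)]  # padded rows add 0
    centroid = [s / npoints for s in sums]
    return [[p[k] - centroid[k] for k in range(3)] for p in pillar]

# Two real points followed by two zero-padded slots
pillar = [[1.0, 2.0, 0.0], [3.0, 4.0, 2.0], [0.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
offsets = offsets_to_centroid(pillar, npoints=2)

print(offsets[0])  # [-1.0, -1.0, -1.0]
print(offsets[2])  # [-2.0, -3.0, -1.0]
```

The padded rows end up with non-zero offsets, which is why step 4 of `forward()` multiplies the features by a validity mask.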

# 2. calculate offset to the pillar center
x_offset_pi_center = pillars[:, :, :1] - (
        coors_batch[:, None, 1:2] * self.vx + self.x_offset)  # (p1 + p2 + ... + pb, num_points, 1)
y_offset_pi_center = pillars[:, :, 1:2] - (
        coors_batch[:, None, 2:3] * self.vy + self.y_offset)  # (p1 + p2 + ... + pb, num_points, 1)

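`x_offset_pi_center` and `y_offset_pi_center` are the x and y offsets of every point from the geometric center of its pillar. The center is `index * self.vx + self.x_offset` along x and the analogous expression along y, where the index comes from `coors_batch`.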
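For a single point the same arithmetic looks like this. The sketch assumes the default voxel size and range from the constructor; the function name `offset_to_pillar_center` and the sample point are made up.

```python
vx, vy = 0.16, 0.16
x_offset = vx / 2 + 0.0        # point_cloud_range[0] = 0
y_offset = vy / 2 + (-39.68)   # point_cloud_range[1] = -39.68

def offset_to_pillar_center(x, y, ix, iy):
    """Offset of point (x, y) from the center of pillar (ix, iy)."""
    cx = ix * vx + x_offset
    cy = iy * vy + y_offset
    return x - cx, y - cy

# The point (10.0, 1.0) lies in pillar (62, 254), whose center is about (10.0, 1.04)
dx, dy = offset_to_pillar_center(x=10.0, y=1.0, ix=62, iy=254)
print(dx, dy)
```

Both offsets are bounded by half the voxel size in magnitude, so they are small, well-scaled inputs for the encoder.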

# 3. encoder
features = torch.cat([pillars, offset_pt_center, x_offset_pi_center, y_offset_pi_center],
                     dim=-1)  # (p1 + p2 + ... + pb, num_points, 9) 4+3+1+1
features[:, :, 0:1] = x_offset_pi_center  # tmp
features[:, :, 1:2] = y_offset_pi_center  # tmp

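This concatenates the 9-dimensional per-point feature: 4 raw channels, 3 centroid offsets, and 2 pillar-center offsets. The two lines marked `# tmp` then overwrite the raw x and y channels with the pillar-center offsets.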

        # 4. find mask for (0, 0, 0) and update the encoded features
        # a very beautiful implementation
        voxel_ids = torch.arange(0, pillars.size(1)).to(device)  # (num_points, )
        mask = voxel_ids[:, None] < npoints_per_pillar[None, :]  # (num_points, p1 + p2 + ... + pb)
        mask = mask.permute(1, 0).contiguous()  # (p1 + p2 + ... + pb, num_points)
        features *= mask[:, :, None]

        # 5. embedding
        features = features.permute(0, 2, 1).contiguous()  # (p1 + p2 + ... + pb, 9, num_points)
        features = F.relu(self.bn(self.conv(features)))  # (p1 + p2 + ... + pb, out_channels, num_points)
        pooling_features = torch.max(features, dim=-1)[0]  # (p1 + p2 + ... + pb, out_channels)

        # 6. pillar scatter
        batched_canvas = []
        bs = coors_batch[-1, 0] + 1
        for i in range(bs):
            cur_coors_idx = coors_batch[:, 0] == i
            cur_coors = coors_batch[cur_coors_idx, :]
            cur_features = pooling_features[cur_coors_idx]

            canvas = torch.zeros((self.x_l, self.y_l, self.out_channel), dtype=torch.float32, device=device)
            canvas[cur_coors[:, 1], cur_coors[:, 2]] = cur_features
            canvas = canvas.permute(2, 1, 0).contiguous()
            batched_canvas.append(canvas)
        batched_canvas = torch.stack(batched_canvas, dim=0)  # (bs, out_channel, self.y_l, self.x_l)

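The features are concatenated, masked, passed through a 1x1 convolution with batch norm and ReLU, and max-pooled over the points of each pillar. The pooled features of the P pillars are then scattered to their positions (`coors`) on a canvas of 432 x 496 cells (`x_l` x `y_l`); after the permute, each sample's feature map has shape (64, 496, 432).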
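Steps 4 and 6 can be mimicked with plain lists. The sketch below builds the validity mask by comparing slot indices with the point counts, and scatters one scalar feature per pillar onto a tiny canvas. The function names are made up, and a single feature channel is used instead of 64 to keep the example small.

```python
def valid_point_mask(npoints_per_pillar, max_num_points):
    """mask[p][j] is True when slot j of pillar p holds a real point."""
    return [[j < n for j in range(max_num_points)] for n in npoints_per_pillar]

def scatter(features, coors, x_l, y_l):
    """Place one feature per pillar on an (x_l, y_l) canvas; empty cells stay 0.0."""
    canvas = [[0.0] * y_l for _ in range(x_l)]
    for feat, (ix, iy) in zip(features, coors):
        canvas[ix][iy] = feat
    return canvas

mask = valid_point_mask([2, 4, 1], max_num_points=4)
canvas = scatter([0.5, 0.9], [(0, 1), (2, 0)], x_l=3, y_l=2)

print(mask[0])  # [True, True, False, False]
print(canvas)   # [[0.0, 0.5], [0.0, 0.0], [0.9, 0.0]]
```

The tensor version of the mask does the same comparison through broadcasting, so no Python loop is needed over pillars.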

test.py

    parser = argparse.ArgumentParser(description='Configuration Parameters')
    parser.add_argument('--ckpt', default='pretrained/epoch_160.pth', help='your checkpoint for kitti')
    parser.add_argument('--pc_path', help='your point cloud path')
    parser.add_argument('--calib_path', default='', help='your calib file path')
    parser.add_argument('--gt_path', default='', help='your ground truth path')
    parser.add_argument('--img_path', default='', help='your image path')
    parser.add_argument('--no_cuda', action='store_true',
                        help='whether to use cuda')

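1. `--ckpt`: path to the model checkpoint trained on KITTI. The default is 'pretrained/epoch_160.pth'.

2. `--pc_path`: path to the point cloud file. It has no default and is not marked as required in the parser, but the script reads the points from it, so it has to be supplied in practice.

3. `--calib_path`: path to the calibration file. Optional.

4. `--gt_path`: path to the ground-truth label file. Optional.

5. `--img_path`: path to the image file. Optional.

6. `--no_cuda`: boolean flag. It is True when passed on the command line and False otherwise; when set, CUDA is not used.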

CLASSES = {
    'Pedestrian': 0,
    'Cyclist': 1,
    'Car': 2
}
LABEL2CLASSES = {v: k for k, v in CLASSES.items()}
pcd_limit_range = np.array([0, -40, -3, 70.4, 40, 0.0], dtype=np.float32)

`LABEL2CLASSES` is built by swapping the keys and values of `CLASSES`, so it maps a class index back to its class name.
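A short usage sketch of the inverted dictionary follows; the list `predicted_labels` is made-up data standing in for network output.

```python
CLASSES = {'Pedestrian': 0, 'Cyclist': 1, 'Car': 2}
LABEL2CLASSES = {v: k for k, v in CLASSES.items()}

predicted_labels = [2, 0, 2, 1]  # made-up class indices
names = [LABEL2CLASSES[i] for i in predicted_labels]
print(names)  # ['Car', 'Pedestrian', 'Car', 'Cyclist']
```

The inversion is only safe because every class index is unique; duplicate values would silently overwrite each other.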

pc = read_points(args.pc_path)
pc = point_range_filter(pc)
pc_torch = torch.from_numpy(pc)

if os.path.exists(args.calib_path):
    calib_info = read_calib(args.calib_path)
else:
    calib_info = None

if os.path.exists(args.gt_path):
    gt_label = read_label(args.gt_path)
else:
    gt_label = None

if os.path.exists(args.img_path):
    img = cv2.imread(args.img_path, 1)
else:
    img = None

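`pc = read_points(args.pc_path)`: reads the point cloud from the file at `args.pc_path` with `read_points` and stores it in `pc`.

`pc = point_range_filter(pc)`: passes `pc` to `point_range_filter`, which, judging by its name, drops the points outside the configured range, and stores the filtered cloud back in `pc`.

`pc_torch = torch.from_numpy(pc)`: converts the point cloud to a PyTorch tensor stored in `pc_torch`. `torch.from_numpy()` requires `pc` to be a NumPy array, and the resulting tensor shares memory with it.

The remaining branches check whether each optional path exists and, if so, load the calibration, the ground-truth labels, and the image; otherwise the corresponding variable is set to `None`.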
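A range filter of this kind can be sketched in a few lines. The version below is a hypothetical stand-in written with plain lists, not the repository's `point_range_filter`; the function name `filter_points_by_range`, the strict inequalities, and the sample points are all assumptions.

```python
def filter_points_by_range(points, point_range):
    """Keep points whose (x, y, z) lies strictly inside point_range.

    points: rows of [x, y, z, intensity]
    point_range: [x_min, y_min, z_min, x_max, y_max, z_max]
    """
    x_min, y_min, z_min, x_max, y_max, z_max = point_range
    return [p for p in points
            if x_min < p[0] < x_max and y_min < p[1] < y_max and z_min < p[2] < z_max]

points = [
    [10.0, 0.0, -1.0, 0.3],   # inside the range
    [-5.0, 0.0, -1.0, 0.1],   # behind the sensor (x < 0)
    [20.0, 50.0, 0.0, 0.2],   # too far to the side (y > 39.68)
]
kept = filter_points_by_range(points, [0, -39.68, -3, 69.12, 39.68, 1])
print(kept)  # [[10.0, 0.0, -1.0, 0.3]]
```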

    if calib_info is not None and img is not None:
        tr_velo_to_cam = calib_info['Tr_velo_to_cam'].astype(np.float32)
        r0_rect = calib_info['R0_rect'].astype(np.float32)
        P2 = calib_info['P2'].astype(np.float32)

        image_shape = img.shape[:2]
        result_filter = keep_bbox_from_image_range(result_filter, tr_velo_to_cam, r0_rect, P2, image_shape)

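If both `calib_info` and `img` are not `None`, the following happens:

`calib_info['Tr_velo_to_cam']` is cast to `np.float32` and stored in `tr_velo_to_cam`. This is the transform from the Velodyne (LiDAR) coordinate frame to the camera coordinate frame.

`calib_info['R0_rect']` is cast to `np.float32` and stored in `r0_rect`. This is the rectifying rotation of the reference camera; it maps camera coordinates into the rectified camera frame.

`calib_info['P2']` is cast to `np.float32` and stored in `P2`. This is the projection matrix of camera 2 (the left color camera); it projects rectified camera coordinates to pixel coordinates in that image.

`img.shape[:2]` gives the image shape as (height, width), which is stored in `image_shape`.

`keep_bbox_from_image_range` is called with `result_filter`, `tr_velo_to_cam`, `r0_rect`, `P2`, and `image_shape`, and its return value is stored back in `result_filter`. Judging by its name and arguments, it keeps the boxes that fall within the image.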
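The three calibration matrices chain together to take a LiDAR point to a pixel. The sketch below uses plain lists and assumes all three matrices are padded to 4x4 homogeneous form; the calibration values are made up (an axis swap, an identity rectification, and a simple pinhole camera), and the helper names are hypothetical.

```python
def matvec(m, v):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def lidar_to_pixel(pt, tr_velo_to_cam, r0_rect, p2):
    """Project one LiDAR point (x, y, z) to pixel coordinates (u, v)."""
    cam = matvec(tr_velo_to_cam, [pt[0], pt[1], pt[2], 1.0])  # LiDAR -> camera
    rect = matvec(r0_rect, cam)                               # camera -> rectified
    u, v, w, _ = matvec(p2, rect)                             # rectified -> image plane
    return u / w, v / w

# Made-up calibration: LiDAR (x fwd, y left, z up) -> camera (x right, y down, z fwd)
tr_velo_to_cam = [[0, -1, 0, 0], [0, 0, -1, 0], [1, 0, 0, 0], [0, 0, 0, 1]]
r0_rect = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1]]
p2 = [[700, 0, 600, 0], [0, 700, 180, 0], [0, 0, 1, 0], [0, 0, 0, 1]]

u, v = lidar_to_pixel((10.0, 1.0, -0.5), tr_velo_to_cam, r0_rect, p2)
image_shape = (375, 1242)  # (height, width)
in_image = 0 <= u < image_shape[1] and 0 <= v < image_shape[0]
print(u, v, in_image)  # 530.0 215.0 True
```

A box filter based on the image range would apply this kind of projection to the box corners and keep the boxes whose projections overlap the image.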

result_filter = keep_bbox_from_lidar_range(result_filter, pcd_limit_range)
lidar_bboxes = result_filter['lidar_bboxes']
labels, scores = result_filter['labels'], result_filter['scores']

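This code filters the results by range.

`result_filter` holds the detection results. It is indexed with the keys `lidar_bboxes`, `labels`, and `scores`, so it behaves like a dictionary.
`pcd_limit_range` is the array `[0, -40, -3, 70.4, 40, 0.0]` defined above: the `[x_min, y_min, z_min, x_max, y_max, z_max]` bounds of the region of interest.

The steps are:

`keep_bbox_from_lidar_range()` keeps the boxes that lie within the LiDAR range given by `pcd_limit_range`, judging by its name and arguments.
`lidar_bboxes` are the bounding boxes in LiDAR coordinates, taken from `result_filter`.
`labels` and `scores` are the class indices and confidence scores of those boxes.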
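A filter of this kind could look like the following sketch. It is a hypothetical stand-in, not the repository's `keep_bbox_from_lidar_range`: it assumes each box starts with its center (x, y, z), tests only the center against the range, and uses plain lists. The function name `keep_in_range` and the detections are made up.

```python
def keep_in_range(result, limit_range):
    """Keep detections whose box center (x, y, z) lies strictly inside limit_range."""
    lo, hi = limit_range[:3], limit_range[3:]
    keep = [i for i, box in enumerate(result['lidar_bboxes'])
            if all(lo[k] < box[k] < hi[k] for k in range(3))]
    # Apply the same selection to every field so they stay aligned
    return {key: [values[i] for i in keep] for key, values in result.items()}

result = {
    'lidar_bboxes': [[10.0, 2.0, -1.0, 1.6, 3.9, 1.5, 0.1],
                     [75.0, 0.0, -1.0, 1.6, 3.9, 1.5, 0.0]],  # x beyond 70.4
    'labels': [2, 2],
    'scores': [0.9, 0.4],
}
filtered = keep_in_range(result, [0, -40, -3, 70.4, 40, 0.0])
print(filtered['labels'], filtered['scores'])  # [2] [0.9]
```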
