Experiment Notes | PointMLP | Grouping layer + Geometric Affine

笑稀了的野生俊

Published 2024-09-02 21:18:45

Tags: artificial intelligence, deep learning, computer vision, python

Article link: https://blog.csdn.net/Junseer/article/details/141826859

Introduction

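Ever since PointNet++ burst onto the scene, deep frameworks for point cloud analysis have been a hot topic in the field, and point cloud learning networks have developed at full speed. Like most deep networks, point cloud networks follow the basic idea of "downsample, aggregate features" and extract deep features from the point cloud step by step.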

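Most point cloud networks implement downsampling with a "grouping layer". As it happens, this noob has recently been reproducing the PointMLP code, where LocalGrouper is exactly this downsampling module. Many point cloud networks contain some version of it, so it counts as a core component, and I am writing it down here while I am at it.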

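We all know the classic idea behind image convolution: weight the information of the surrounding pixels onto the center pixel, which achieves both downsampling and feature aggregation. Point clouds work the same way: first select center points by sampling, then find the neighboring points around each center, and finally aggregate the neighbors' information onto the center. That gives the "downsample, aggregate features" effect. On to the code!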
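
Before the real module, the three steps can be sketched in a few lines of plain PyTorch. This is only an illustration under my own assumptions: the sizes are made up, random sampling stands in for the sampling step, and max pooling is used as one common way to aggregate. None of this is taken from the PointMLP source.

```python
import torch

torch.manual_seed(0)
B, N, D = 2, 16, 8      # batch size, points per cloud, feature channels (made-up sizes)
G, K = 4, 3             # centers to keep, neighbors per center

xyz = torch.rand(B, N, 3)
fea = torch.rand(B, N, D)

# 1. Sample: pick G center indices per cloud (random here, purely for illustration).
center_idx = torch.stack([torch.randperm(N)[:G] for _ in range(B)])   # [B, G]
batch = torch.arange(B).view(B, 1)
new_xyz = xyz[batch, center_idx]                                       # [B, G, 3]

# 2. Group: the K nearest original points to every center.
dist = torch.cdist(new_xyz, xyz)                                       # [B, G, N]
knn_idx = dist.topk(K, dim=-1, largest=False).indices                  # [B, G, K]
grouped_fea = fea[batch.view(B, 1, 1), knn_idx]                        # [B, G, K, D]

# 3. Aggregate: collapse each neighborhood onto its center (max pooling as an example).
new_fea = grouped_fea.max(dim=2).values                                # [B, G, D]
```

The cloud goes from N points to G points, and each surviving point carries a feature summarizing its K neighbors.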

Code

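import torch
import torch.nn as nn

# farthest_point_sample, index_points and knn_point are helper functions that are not shown in this post.
class LocalGrouper(nn.Module):
    def __init__(self, channel, groups, kneighbors=24, use_xyz=True, normalize="center", **kwargs):
        """
        Give xyz[b,p,3] and fea[b,p,d], return new_xyz[b,g,3] and new_fea[b,g,k,2d+3] (2d if use_xyz is False)
        :param groups: groups number
        :param kneighbors: k-neighbors
        :param kwargs: others
        """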
        super(LocalGrouper, self).__init__()
        self.groups = groups
        self.kneighbors = kneighbors
        self.use_xyz = use_xyz
        if normalize is not None:
            self.normalize = normalize.lower()
        else:
            self.normalize = None
        if self.normalize not in ["center", "anchor"]:
            print(f"Unrecognized normalize parameter ({self.normalize}), set to None. Should be one of [center, anchor].")
            self.normalize = None
        if self.normalize is not None:
            add_channel = 3 if self.use_xyz else 0
            self.affine_alpha = nn.Parameter(torch.ones([1, 1, 1, channel + add_channel]))
            self.affine_beta = nn.Parameter(torch.zeros([1, 1, 1, channel + add_channel]))

    # xyz:[B,1024,3], points:[B,1024,64]
    def forward(self, xyz, points):
        B, N, C = xyz.shape
        S = self.groups
        xyz = xyz.contiguous()

        # 1. Farthest point sampling picks the center points -> xyz[B,groups,3] & fea[B,groups,64]
        fps_idx = farthest_point_sample(xyz, self.groups).long() # [B, groups]
        new_xyz = index_points(xyz, fps_idx)        # [B, groups, 3]
        new_points = index_points(points, fps_idx)  # [B, groups, 64]

        # 2. K-NN finds the k neighboring points around each center
        idx = knn_point(self.kneighbors, xyz, new_xyz)
        grouped_xyz = index_points(xyz, idx)        # [B, groups, k, 3]
        grouped_points = index_points(points, idx)  # [B, groups, k, d]
        if self.use_xyz:
            grouped_points = torch.cat([grouped_points, grouped_xyz],dim=-1)  # [B, groups, k, d+3]

        # 3. Normalization
        if self.normalize is not None:
            if self.normalize =="center":
                mean = torch.mean(grouped_points, dim=2, keepdim=True)
            if self.normalize =="anchor":
                mean = torch.cat([new_points, new_xyz],dim=-1) if self.use_xyz else new_points
                mean = mean.unsqueeze(dim=-2)  # [B, groups, 1, d+3] (d if use_xyz is False)

            std = torch.std((grouped_points-mean).reshape(B,-1), dim=-1, keepdim=True).unsqueeze(dim=-1).unsqueeze(dim=-1)
            grouped_points = (grouped_points-mean)/(std + 1e-5)

            # 4. Geometric affine transform
            grouped_points = self.affine_alpha*grouped_points + self.affine_beta

        new_points = torch.cat([grouped_points, new_points.view(B, S, 1, -1).repeat(1, 1, self.kneighbors, 1)], dim=-1)
        return new_xyz, new_points
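
One detail that is easy to miss is the last concatenation: each center's own feature is tiled K times and appended to its neighbors, so the returned feature has 2d+3 channels when use_xyz is on. A small shape check, using random tensors and made-up sizes as stand-ins for the real intermediate values:

```python
import torch

B, S, K, d = 2, 5, 24, 64                        # made-up sizes for illustration
grouped_points = torch.randn(B, S, K, d + 3)     # stand-in for normalized neighbor features + xyz
new_points = torch.randn(B, S, d)                # stand-in for the center features

# Same expression as the last line of forward(): tile each center feature K times.
center_tiled = new_points.view(B, S, 1, -1).repeat(1, 1, K, 1)     # [B, S, K, d]
out = torch.cat([grouped_points, center_tiled], dim=-1)            # [B, S, K, 2d+3]
```

With d = 64 the last dimension is 131: the first 67 channels come from the neighbor, the last 64 from its center.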

Ⅰ. Selecting center points: farthest point sampling

fps_idx = farthest_point_sample(xyz, self.groups).long()

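The farthest point sampling (FPS) algorithm picks the indices of groups points from the xyz coordinates, and fps_idx stores those indices. Its shape is [B, groups]: the batch size is B and groups points are sampled per cloud.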

new_xyz = index_points(xyz, fps_idx)

Using fps_idx, gather the sampled coordinates new_xyz from the original xyz coordinates.

new_points = index_points(points, fps_idx)

Using fps_idx, gather from the original points features the features new_points that correspond to the new coordinates.
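
The post does not show farthest_point_sample or index_points, so here is a minimal sketch of how they could be written in plain PyTorch. It is my own assumption about their behavior, inferred from how they are called above, and not necessarily what the PointMLP repository ships. Starting from point 0 is also an arbitrary choice made so the example is deterministic.

```python
import torch

def index_points(points, idx):
    """Gather points[b, idx[b, ...]] -> [B, *idx.shape[1:], C]."""
    B = points.shape[0]
    # Batch index shaped [B, 1, ..., 1] so it broadcasts against idx.
    batch = torch.arange(B, device=points.device).view(B, *([1] * (idx.dim() - 1)))
    return points[batch, idx]

def farthest_point_sample(xyz, npoint):
    """Greedy FPS: repeatedly pick the point farthest from all points chosen so far."""
    B, N, _ = xyz.shape
    chosen = torch.zeros(B, npoint, dtype=torch.long, device=xyz.device)
    min_dist = torch.full((B, N), float("inf"), device=xyz.device)
    farthest = torch.zeros(B, dtype=torch.long, device=xyz.device)  # assumption: start at index 0
    batch = torch.arange(B, device=xyz.device)
    for i in range(npoint):
        chosen[:, i] = farthest
        center = xyz[batch, farthest].unsqueeze(1)       # [B, 1, 3]
        dist = ((xyz - center) ** 2).sum(dim=-1)         # squared distance to the newest center, [B, N]
        min_dist = torch.minimum(min_dist, dist)         # distance to the nearest chosen center
        farthest = min_dist.argmax(dim=-1)
    return chosen

xyz = torch.tensor([[[0., 0., 0.], [1., 0., 0.], [0.1, 0., 0.], [0., 5., 0.]]])  # [1, 4, 3]
fps_idx = farthest_point_sample(xyz, 3)   # [1, 3]
new_xyz = index_points(xyz, fps_idx)      # [1, 3, 3]
```

In this toy cloud the point at (0.1, 0, 0) is the one left out, since it sits almost on top of the starting point and adds the least coverage.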

Ⅱ. Selecting neighbor points: the KNN algorithm

idx = knn_point(self.kneighbors, xyz, new_xyz)

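The K-nearest-neighbors (KNN) algorithm finds, for each point in new_xyz, its kneighbors nearest points among the original coordinates xyz, and returns the indices of those neighbors as idx. idx has shape [B, groups, kneighbors]: for every newly sampled point (new_xyz), it holds the indices of the kneighbors closest points in the original point cloud.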
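
knn_point is not shown in the post either. A minimal sketch of what it could look like, assuming it follows the argument order used above (k, all points, query points) and uses Euclidean distance; this is an illustration, not the repository's implementation:

```python
import torch

def knn_point(nsample, xyz, new_xyz):
    """For each query in new_xyz [B, S, 3], return the indices [B, S, nsample]
    of its nearest points in xyz [B, N, 3]."""
    dist = torch.cdist(new_xyz, xyz)                              # pairwise distances, [B, S, N]
    return dist.topk(nsample, dim=-1, largest=False).indices     # smallest distances first

xyz = torch.tensor([[[0., 0., 0.], [1., 0., 0.], [2., 0., 0.], [10., 0., 0.]]])  # [1, 4, 3]
new_xyz = xyz[:, [0, 3]]          # two centers: point 0 and point 3
idx = knn_point(2, xyz, new_xyz)  # [1, 2, 2]
```

Because the centers are themselves drawn from xyz, each center shows up as its own nearest neighbor (distance 0).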

Ⅲ. Standardization (normalization)

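How should we understand this? It simply subtracts the mean from the feature vectors (centering) and then divides by the standard deviation. Intuitively, all the points in feature space are first shifted to the neighborhood of the origin and then divided by the standard deviation, so that the features end up with a standard deviation of 1.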

(1) Centering

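Here is a concrete example: suppose the batch size is 3, new_points holds 2 center points obtained by FPS, grouped_points holds the 4 nearest neighbors of each center found by KNN, and the feature dimension is 4. Centering then subtracts the center point's value from every point in its group (this is the "anchor" mode; in "center" mode the group mean is subtracted instead).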

This relies on torch broadcasting: new_points: [3,2,1,4] → [3,2,4,4]

import torch
# [B, G, K, D] = [3, 2, 4, 4]
new_points = torch.randint(0, 10, (3, 2, 1, 4))    # [3,2,1,4]
grouped_points = torch.randint(0, 10, (3, 2, 4, 4))# [3,2,4,4]
Centralization = grouped_points - new_points

(2) Standard deviation

Standardization builds on centering by dividing by a standard deviation. The standard deviation is computed as:

std = torch.std((grouped_points-mean).reshape(B,-1), dim=-1, keepdim=True).unsqueeze(dim=-1).unsqueeze(dim=-1)

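Honestly, I find this standard deviation computation rather puzzling. One point cloud sample now has G*K points, each with D feature channels. The code takes all G*K*D values together and computes a single standard deviation from them, whereas normally I would expect a separate standard deviation for each feature dimension. This really left me confused.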
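
To make the difference concrete, the sketch below puts the line above next to a per-channel variant. std_channel is purely my own hypothetical alternative for comparison and is not part of LocalGrouper; the input is random data with made-up sizes.

```python
import torch

torch.manual_seed(0)
B, G, K, D = 3, 2, 4, 4
centered = torch.randn(B, G, K, D)   # stands in for (grouped_points - mean)

# What the LocalGrouper line does: one scalar per sample, over all G*K*D values.
std_scalar = torch.std(centered.reshape(B, -1), dim=-1, keepdim=True)   # [B, 1]
std_scalar = std_scalar.unsqueeze(dim=-1).unsqueeze(dim=-1)              # [B, 1, 1, 1]

# Hypothetical per-channel alternative: one value per sample and per channel.
std_channel = torch.std(centered, dim=(1, 2), keepdim=True)              # [B, 1, 1, D]
```

With the scalar version every channel of a sample is divided by the same number, so the relative scale between channels is left untouched; the per-channel version would rescale each channel independently.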

(3) Standardization

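After subtracting mean from every point (the group mean in "center" mode, the center point in "anchor" mode), divide by the standard deviation std. The 1e-5 is added to avoid division by zero.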

grouped_points = (grouped_points-mean)/(std + 1e-5)
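
A quick sanity check of what this line produces, written as a standalone sketch on random data in "center" mode (the sizes and inputs are made up for illustration):

```python
import torch

torch.manual_seed(0)
B, G, K, D = 3, 2, 4, 4
grouped_points = torch.rand(B, G, K, D) * 10

mean = torch.mean(grouped_points, dim=2, keepdim=True)                   # "center" mode, [B, G, 1, D]
std = torch.std((grouped_points - mean).reshape(B, -1), dim=-1, keepdim=True)
std = std.unsqueeze(dim=-1).unsqueeze(dim=-1)                            # [B, 1, 1, 1]
normalized = (grouped_points - mean) / (std + 1e-5)

per_sample_std = normalized.reshape(B, -1).std(dim=-1)   # close to 1 for every sample
group_mean = normalized.mean(dim=2)                      # close to 0 in "center" mode
```

So the features of each sample as a whole have unit standard deviation, while individual groups or channels can still be wider or narrower than 1.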

Ⅳ. Geometric affine module: a linear transform

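This operation is a design unique to PointMLP. My understanding is this: the geometric structure of different local regions can differ a lot, which means different feature extractors may be needed to capture those differences. Shared residual MLPs struggle with this diversity, because they are designed to be general rather than tailored to particular geometric structures. Introducing learnable parameters can alleviate this to some extent.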

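The last step applies an affine transform to the normalized grouped_points. self.affine_alpha is a scale factor and self.affine_beta is an offset; both are learnable and are updated during training. If you read the code closely, it is not hard to see that the geometric affine is really just an element-wise linear transform, with one scale and one offset per channel.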

grouped_points = self.affine_alpha*grouped_points + self.affine_beta
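
A small standalone sketch of this line, assuming the same parameter shapes as in __init__ above. The hand-written values for channel 0 only imitate what training might do and are made up for illustration:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
B, G, K, D = 3, 2, 4, 7
affine_alpha = nn.Parameter(torch.ones(1, 1, 1, D))    # scale, initialized to 1
affine_beta = nn.Parameter(torch.zeros(1, 1, 1, D))    # offset, initialized to 0

x = torch.randn(B, G, K, D)
out = affine_alpha * x + affine_beta    # broadcasts over B, G and K

# Imitate a training update on channel 0 (illustrative values only).
with torch.no_grad():
    affine_alpha[..., 0] = 2.0
    affine_beta[..., 0] = 0.5
out2 = affine_alpha * x + affine_beta
```

At initialization the transform is the identity, so the module starts out as plain normalization; afterwards the same per-channel scale and shift is applied to every point in every group.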

Summary

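Having written this far, it is not hard to see that LocalGrouper is quite easy to understand. As a core component of point cloud networks, the LocalGrouper module does a good job of playing the role of "convolution", which is why it holds such a firm place.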
