PointNet++: Network Structure and Code Implementation

COOLRANEN

Last modified: 2023-07-25 16:41:29


First published: 2023-07-20 17:26:37

Article link: https://blog.csdn.net/m0_57122465/article/details/131826764


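PointNet++ improves recognition of local structure in point clouds through hierarchical feature learning, which consists of farthest point sampling, grouping, and feature extraction. For classification, it combines set abstraction layers with PointNet. For segmentation, it uses a feature propagation strategy to upsample and fuse features.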


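Preface:

PointNet++ builds on PointNet, which is weak at recognizing local structure. This is visible in PointNet's architecture (see Figure 1): it applies max pooling over the features of the whole point set and therefore discards local features. PointNet++ instead adopts deep hierarchical feature learning to improve recognition of local structure.
For details, please refer to the paper: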

PointNet++: Deep Hierarchical Feature Learning on Point Sets in a Metric Space

https://proceedings.neurips.cc/paper_files/paper/2017/file/d8bf84be3800d12f74d8b05e9b89836f-Paper.pdf

Figure 1:

Network structure of PointNet++

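I. Classification task

1.1 Hierarchical feature extraction (set abstraction)

1.1.1 Sampling: centroids are sampled from the point cloud using farthest point sampling (FPS).

The point cloud has N points, the batch size is B, and npoint centroids are to be sampled. The third dimension of xyz holds the x, y, z spatial coordinates of each point.

The steps of farthest point sampling are:

1. Initialize centroids as an all-zero tensor of shape [B, npoint], initialize distance as a tensor of shape [B, N] filled with 1e10, and initialize farthest to an index chosen at random from the N points.

2. Take the random initial point as the first centroid, then compute the distance from every point in the cloud to it and store the result in dist. The distance used is the squared Euclidean distance, (x1-x2)**2 + (y1-y2)**2 + (z1-z2)**2, i.e. the sum of the squared coordinate differences. Compare dist with distance element-wise and write into distance every value of dist that is smaller than the value at the same position, so that distance always holds each point's distance to its nearest selected centroid. The point with the largest value in distance becomes the next farthest point, and centroids now stores the first and second centroids. Repeat the procedure: use the updated farthest point as the centroid, compute distances, and select the next point, until npoint points have been selected.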

import torch
import torch.nn as nn
import torch.nn.functional as F


def farthest_point_sample(xyz, npoint):
    """
    Input:
        xyz: pointcloud data, [B, N, 3]
        npoint: number of samples
    Return:
        centroids: sampled pointcloud index, [B, npoint]
    """
    device = xyz.device
    B, N, C = xyz.shape
    centroids = torch.zeros(B, npoint, dtype=torch.long).to(device)
    distance = torch.ones(B, N).to(device) * 1e10
    farthest = torch.randint(0, N, (B,), dtype=torch.long).to(device)
    # torch.randint draws random integers in [0, N) and returns them as a tensor
    batch_indices = torch.arange(B, dtype=torch.long).to(device)
    for i in range(npoint):
        centroids[:, i] = farthest
        # record the current farthest point (random on the first pass) as a centroid
        centroid = xyz[batch_indices, farthest, :].view(B, 1, 3)
        # coordinates of that centroid, shape [B, 1, 3]
        dist = torch.sum((xyz - centroid) ** 2, -1)
        # squared Euclidean distance from every point to the centroid, shape [B, N]
        mask = dist < distance
        # boolean [B, N] mask marking the points whose new distance is smaller than the stored one
        distance[mask] = dist[mask]
        # dist[mask] is a 1-D tensor of the dist values where mask is True, in order;
        # writing it into distance keeps, for every point, the distance to its
        # nearest centroid chosen so far
        farthest = torch.max(distance, -1)[1]
        # index of the point that is farthest from all centroids chosen so far
    return centroids
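To make the update rule easier to trace by hand, here is a minimal plain-Python sketch of the same loop for a single point cloud. The function name `fps_indices` and the `start` argument are assumptions made for this illustration (the code above picks the starting index at random); it is not the author's implementation.

```python
def fps_indices(points, npoint, start=0):
    """Farthest point sampling over a list of (x, y, z) tuples.

    `start` replaces the random initial index so the run is reproducible
    (an assumption for this illustration only).
    """
    # squared distance from each point to its nearest chosen centroid
    distance = [1e10] * len(points)
    farthest = start
    chosen = []
    for _ in range(npoint):
        chosen.append(farthest)
        cx, cy, cz = points[farthest]
        for i, (x, y, z) in enumerate(points):
            d = (x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2
            if d < distance[i]:
                distance[i] = d
        # next centroid: the point farthest from all centroids chosen so far
        farthest = max(range(len(points)), key=lambda i: distance[i])
    return chosen


pts = [(0, 0, 0), (1, 0, 0), (0, 2, 0), (5, 5, 0), (5, 6, 0)]
print(fps_indices(pts, 3))  # [0, 4, 2]
```

Starting from index 0, the farthest point is index 4; after the distances are updated against that second centroid, index 2 holds the largest remaining distance, so the cluster near the origin and the cluster near (5, 5, 0) are both covered.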

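1.1.2 Grouping: the grouping layer gathers points in the neighborhood of each centroid to form a local point region, so that the network can pay more attention to local information in the point cloud. The procedure is:

1. In query_ball_point, each centroid sampled in the previous step becomes the center of a group formed from the points around it. A centroid and the other points of its group are assumed to lie inside one ball centered at the centroid. The squared distance from every point to the centroid is computed, and points outside the ball (squared distance greater than r squared) have their index replaced by the sentinel value N. The indices are then sorted in ascending order and the first nsample are kept; note that this keeps the nsample lowest-index points inside the ball, which are not necessarily the nsample nearest ones. If the region around a centroid is sparse (fewer than nsample points inside the ball), the slots among the first nsample that still hold N are filled with copies of the first point, so nsample indices are still returned. The function returns the indices of each group.

2. Using the extracted group indices, new_xyz and new_points are gathered from points (the full point cloud data). Both are subsets of the input, with one group per centroid. The last dimension of new_xyz contains only the xyz spatial coordinates, while the last dimension of new_points also contains other features, such as the normal vector nx, ny, nz.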

def query_ball_point(radius, nsample, xyz, new_xyz):
    """
    Input:
        radius: local region radius
        nsample: max sample number in local region
        xyz: all points, [B, N, 3]
        new_xyz: query points, [B, S, 3]
    Return:
        group_idx: grouped points index, [B, S, nsample]
    """
    device = xyz.device
    B, N, C = xyz.shape
    _, S, _ = new_xyz.shape
    group_idx = torch.arange(N, dtype=torch.long).to(device).view(1, 1, N).repeat([B, S, 1])
    # reshape [0, ..., N-1] to (1, 1, N), then repeat it B times along dim 0 and S times along dim 1
    # (once along dim 2), giving shape [B, S, N]
    sqrdists = square_distance(new_xyz, xyz)
    # sqrdists: [B, S, N], squared Euclidean distance from each centroid to every point
    group_idx[sqrdists > radius ** 2] = N
    # points outside the ball get the sentinel index N, which excludes them from the neighborhood
    group_idx = group_idx.sort(dim=-1)[0][:, :, :nsample]
    # sort the indices in ascending order along the last dim (sentinels end up last) and keep the
    # first nsample: these are the lowest-index in-ball points, not the nearest ones; shape [B, S, nsample]
    group_first = group_idx[:, :, 0].view(B, S, 1).repeat([1, 1, nsample])
    # a group may contain fewer than nsample in-ball points, so the first point is used as padding;
    # group_first has shape [B, S, nsample], every entry being the first neighbor index of its query point
    mask = group_idx == N
    group_idx[mask] = group_first[mask]
    # any sentinel N left among the first nsample slots is replaced by the first point's index
    return group_idx


def sample_and_group(npoint, radius, nsample, xyz, points, returnfps=False):
    """
    Input:
        npoint: number of centroids to sample
        radius: ball query radius
        nsample: number of points per group
        xyz: input points position data, [B, N, 3]
        points: input points data, [B, N, D]
    Return:
        new_xyz: sampled points position data, [B, npoint, 3]
        new_points: sampled points data, [B, npoint, nsample, 3+D]
    """
    B, N, C = xyz.shape
    S = npoint
    fps_idx = farthest_point_sample(xyz, npoint)               # indices of the FPS centroids, [B, npoint]
    new_xyz = index_points(xyz, fps_idx)                       # coordinates of the centroids, [B, npoint, C]
    idx = query_ball_point(radius, nsample, xyz, new_xyz)      # indices of the nsample points grouped around each centroid, [B, npoint, nsample]
    grouped_xyz = index_points(xyz, idx)                       # coordinates of all grouped points, [B, npoint, nsample, C]
    grouped_xyz_norm = grouped_xyz - new_xyz.view(B, S, 1, C)  # subtract each group's centroid from its points

    if points is not None:
        grouped_points = index_points(points, idx)
        new_points = torch.cat([grouped_xyz_norm, grouped_points], dim=-1)
        # concatenate along the last (feature) dimension: [B, npoint, nsample, C+D]
    else:
        new_points = grouped_xyz_norm
    if returnfps:
        return new_xyz, new_points, grouped_xyz, fps_idx
    else:
        return new_xyz, new_points


def sample_and_group_all(xyz, points):
    """
    Input:
        xyz: input points position data, [B, N, 3]
        points: input points data, [B, N, D]
    Return:
        new_xyz: sampled points position data, [B, 1, 3]
        new_points: sampled points data, [B, 1, N, 3+D]
    """
    # treat all points as a single group, i.e. simply add a dimension of length 1

    device = xyz.device
    B, N, C = xyz.shape
    new_xyz = torch.zeros(B, 1, C).to(device)
    # new_xyz is the centroid, taken to be the origin
    grouped_xyz = xyz.view(B, 1, N, C)
    # normally each group's points have the centroid subtracted; the centroid is the origin here,
    # so the result is grouped_xyz itself
    if points is not None:
        new_points = torch.cat([grouped_xyz, points.view(B, 1, N, -1)], dim=-1)
        # in view(B, 1, N, -1) the -1 is inferred automatically, so this equals view(B, 1, N, D)
    else:
        new_points = grouped_xyz
    return new_xyz, new_points
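The masking, sorting, and padding steps of query_ball_point can be traced on a tiny example. The following is a plain-Python sketch for a single centroid; the name `ball_query` and the sample points are hypothetical and only mirror the logic of the tensor code above.

```python
def ball_query(radius, nsample, points, center):
    """Indices of up to `nsample` points inside the ball around `center`."""
    n = len(points)
    idx = []
    for i, p in enumerate(points):
        d2 = sum((a - b) ** 2 for a, b in zip(p, center))
        # n is the sentinel meaning "outside the ball"
        idx.append(n if d2 > radius ** 2 else i)
    # ascending sort of indices (not distances); sentinels go to the end
    idx = sorted(idx)[:nsample]
    first = idx[0]
    # pad remaining sentinel slots with the first in-ball index
    return [first if i == n else i for i in idx]


pts = [(0.9, 0, 0), (3, 0, 0), (0.1, 0, 0), (0.5, 0, 0), (0, 0, 0)]
print(ball_query(1.0, 2, pts, (0, 0, 0)))  # [0, 2]
print(ball_query(0.2, 3, pts, (0, 0, 0)))  # [2, 4, 2]
```

In the first call, index 0 is selected although it is the farthest in-ball point, which shows that selection follows index order rather than distance. In the second call only two points fall inside the ball, so the third slot is padded with the first index.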

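The functions above also rely on index_points, shown below.

It uses batch_indices together with idx (the two must have matching shapes) to gather the selected points from the point set by index. The indexing technique is the same as NumPy's integer array indexing.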

def index_points(points, idx):
    """

    Input:
        points: input points data, [B, N, C]
        idx: sample index data, [B, S]
    Return:
        new_points:, indexed points data, [B, S, C]
    """
    device = points.device
    B = points.shape[0]
    view_shape = list(idx.shape)
    view_shape[1:] = [1] * (len(view_shape) - 1)
    # every dimension after the first is set to 1, so view_shape becomes [B, 1]
    repeat_shape = list(idx.shape)
    repeat_shape[0] = 1
    # repeat_shape becomes [1, S]
    batch_indices = torch.arange(B, dtype=torch.long).to(device).view(view_shape).repeat(repeat_shape)
    # arange gives [0, ..., B - 1]; view turns it into a column [B, 1]; repeat expands it to [B, S]
    new_points = points[batch_indices, idx, :]
    # for every (batch index, point index) pair, pick the corresponding point from points
    return new_points
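What index_points computes can be stated without tensors: for each batch, look up the listed indices in that batch's points. The sketch below does this with nested lists; `gather_points` is a hypothetical helper written for this illustration. It also shows why the function works for idx of shape [B, S, K], as used for grouped indices, and not only for the [B, S] shape named in the docstring.

```python
def gather_points(points, idx):
    """points: [B][N][C] nested lists; idx: [B][S] or [B][S][K] nested index lists."""
    def pick(batch_points, index):
        if isinstance(index, int):
            return batch_points[index]
        return [pick(batch_points, i) for i in index]

    # each batch only ever indexes into its own points
    return [pick(points[b], idx[b]) for b in range(len(points))]


points = [
    [[0, 0, 0], [1, 1, 1], [2, 2, 2]],        # batch 0
    [[10, 10, 10], [11, 11, 11], [12, 12, 12]],  # batch 1
]
print(gather_points(points, [[2, 0], [1, 1]]))
print(gather_points(points, [[[0, 1]], [[2, 2]]]))
```

The first call returns one point per index ([B, S, C]); the second returns one group of points per index list ([B, S, K, C]).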

1.1.3 Feature extraction layer

The sampled and grouped points are then fed into a PointNet, which lets PointNet attend to local detail. Set abstraction is applied twice; the set abstraction code is shown below:

class PointNetSetAbstraction(nn.Module):
    def __init__(self, npoint, radius, nsample, in_channel, mlp, group_all):
        super(PointNetSetAbstraction, self).__init__()
        self.npoint = npoint
        self.radius = radius
        self.nsample = nsample
        self.mlp_convs = nn.ModuleList()
        self.mlp_bns = nn.ModuleList()
        last_channel = in_channel
        for out_channel in mlp:
            self.mlp_convs.append(nn.Conv2d(last_channel, out_channel, 1))
            self.mlp_bns.append(nn.BatchNorm2d(out_channel))
            last_channel = out_channel
        self.group_all = group_all

    def forward(self, xyz, points):
        """
        Input:
            xyz: input points position data, [B, C, N]
            points: input points data, [B, D, N]
        Return:
            new_xyz: sampled points position data, [B, C, S]
            new_points_concat: sample points feature data, [B, D', S]
        """
        xyz = xyz.permute(0, 2, 1)
        if points is not None:
            points = points.permute(0, 2, 1)

        if self.group_all:
            new_xyz, new_points = sample_and_group_all(xyz, points)
        else:
            new_xyz, new_points = sample_and_group(self.npoint, self.radius, self.nsample, xyz, points)
        # new_xyz: sampled points position data, [B, npoint, C]
        # new_points: sampled points data, [B, npoint, nsample, C+D]
        new_points = new_points.permute(0, 3, 2, 1) # [B, C+D, nsample,npoint]
        for i, conv in enumerate(self.mlp_convs):
            bn = self.mlp_bns[i]
            new_points =  F.relu(bn(conv(new_points)))
        # shared MLP followed by max pooling over each group, i.e. a local PointNet
        new_points = torch.max(new_points, 2)[0]
        new_xyz = new_xyz.permute(0, 2, 1)
        return new_xyz, new_points

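In the classification task, the two set abstraction layers are followed by one more PointNet, which produces a global feature tensor. A multilayer perceptron then changes the number of channels, and a final softmax outputs the probability of each class.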
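The way shapes shrink through these layers can be traced with a small helper. This is a sketch only: `set_abstraction_shapes` is a hypothetical function, and the layer settings (512 and 128 centroids, 32 and 64 samples per group, the MLP widths) are illustrative assumptions, not values given in this post.

```python
def set_abstraction_shapes(B, N, D, npoint, nsample, mlp, group_all=False):
    """Trace tensor shapes through one set abstraction layer.

    B: batch size, N: input points, D: input feature channels (0 if none).
    """
    S = 1 if group_all else npoint   # number of groups (centroids)
    K = N if group_all else nsample  # points per group
    return {
        "grouped": (B, S, K, 3 + D),     # after sampling and grouping
        "conv_in": (B, 3 + D, K, S),     # after permute(0, 3, 2, 1)
        "conv_out": (B, mlp[-1], K, S),  # after the shared 1x1 conv MLP
        "new_xyz": (B, 3, S),
        "new_points": (B, mlp[-1], S),   # after max over the K axis
    }


B, N, D = 8, 1024, 0
layers = [  # illustrative settings (assumed)
    dict(npoint=512, nsample=32, mlp=[64, 64, 128], group_all=False),
    dict(npoint=128, nsample=64, mlp=[128, 128, 256], group_all=False),
    dict(npoint=None, nsample=None, mlp=[256, 512, 1024], group_all=True),
]
trace = []
for cfg in layers:
    shapes = set_abstraction_shapes(B, N, D, **cfg)
    trace.append(shapes["new_points"])
    N, D = shapes["new_xyz"][2], shapes["new_points"][1]
print(trace)  # [(8, 128, 512), (8, 256, 128), (8, 1024, 1)]
```

Under these assumed settings the last layer, which groups all remaining points into one region, leaves a single 1024-channel global feature per cloud for the classification head.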

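II. Segmentation task

Segmentation has to classify every point. The preceding sampling, grouping, and PointNet steps have downsampled the point cloud, so the segmentation task must upsample the features back to the original resolution. The authors propose a hierarchical feature propagation strategy based on distance-based interpolation. Following the network diagram, the features after the first PointNet (call this layer2) and the features after the second PointNet (call this layer3) are combined by distance-based interpolation, which restores the layer3 features to the resolution that follows the first PointNet; the result is then interpolated in the same way against the layer that has not passed through any PointNet.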

The distance-based interpolation formula is:
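For reference, the interpolation can be written as an inverse-distance-weighted average over the k nearest neighbors, which is the form used in the PointNet++ paper. The symbols below are assumptions chosen to match the code in this section: x is a point to be upsampled, x_i are its k nearest points in the coarser layer, f_i^{(j)} is channel j of the feature at x_i, and C is the number of feature channels.

```latex
f^{(j)}(x) = \frac{\sum_{i=1}^{k} w_i(x)\, f_i^{(j)}}{\sum_{i=1}^{k} w_i(x)},
\qquad
w_i(x) = \frac{1}{d(x, x_i)^{p}},
\qquad
j = 1, \dots, C
```

The code in this section corresponds to k = 3 and p = 2: it takes the three nearest points and uses the reciprocal of the squared distance, with a small constant 1e-8 added to the denominator to avoid division by zero.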

The code and its implementation logic are as follows:

def square_distance(src, dst):
    """
    Calculate the squared Euclidean distance between every pair of points.

    src^T * dst = xn * xm + yn * ym + zn * zm;
    sum(src^2, dim=-1) = xn*xn + yn*yn + zn*zn;
    sum(dst^2, dim=-1) = xm*xm + ym*ym + zm*zm;
    dist = (xn - xm)^2 + (yn - ym)^2 + (zn - zm)^2
         = sum(src**2,dim=-1)+sum(dst**2,dim=-1)-2*src^T*dst

    Input:
        src: source points, [B, N, C]
        dst: target points, [B, M, C]
    Output:
        dist: per-point square distance, [B, N, M]
    """
    B, N, _ = src.shape
    _, M, _ = dst.shape
    dist = -2 * torch.matmul(src, dst.permute(0, 2, 1))
    # the batched matrix product computes the inner product of every src/dst pair of points
    dist += torch.sum(src ** 2, -1).view(B, N, 1)
    dist += torch.sum(dst ** 2, -1).view(B, 1, M)
    return dist
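The expansion in the docstring can be checked numerically against the direct definition. The sketch below uses plain Python lists; `square_distance_lists` and `direct` are hypothetical helpers for this check and operate on a single batch.

```python
def square_distance_lists(src, dst):
    """Pairwise squared distances via |a|^2 + |b|^2 - 2 a.b, for lists of points."""
    out = []
    for a in src:
        row = []
        for b in dst:
            dot = sum(x * y for x, y in zip(a, b))
            row.append(sum(x * x for x in a) + sum(y * y for y in b) - 2 * dot)
        out.append(row)
    return out


def direct(src, dst):
    """Pairwise squared distances from the definition, for comparison."""
    return [[sum((x - y) ** 2 for x, y in zip(a, b)) for b in dst] for a in src]


src = [(0, 0, 0), (1, 2, 3)]
dst = [(1, 0, 0), (1, 2, 5), (4, 6, 3)]
print(square_distance_lists(src, dst))  # [[1, 30, 61], [13, 4, 25]]
print(square_distance_lists(src, dst) == direct(src, dst))  # True
```

The result has one row per src point and one column per dst point, matching the [B, N, M] layout of the tensor version with the batch dimension dropped.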

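First, the distance between every layer2 point and every layer3 point is computed and sorted in ascending order, and the three nearest layer3 points are taken as the interpolation sources. The reciprocals of these three distances are summed, and each reciprocal is divided by that sum to give the weights. Multiplying the features by the weights and summing gives the interpolated feature of the new point. The new features are concatenated (cat) with the features of the previous layer and then fused through convolution layers.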

class PointNetFeaturePropagation(nn.Module):
    def __init__(self, in_channel, mlp):
        super(PointNetFeaturePropagation, self).__init__()
        self.mlp_convs = nn.ModuleList()
        self.mlp_bns = nn.ModuleList()
        last_channel = in_channel
        for out_channel in mlp:
            self.mlp_convs.append(nn.Conv1d(last_channel, out_channel, 1))
            self.mlp_bns.append(nn.BatchNorm1d(out_channel))
            last_channel = out_channel

    def forward(self, xyz1, xyz2, points1, points2):
        """
        Input:
            xyz1: input points position data, [B, C, N]
            xyz2: sampled input points position data, [B, C, S]
            points1: input points data, [B, D, N]
            points2: input points data, [B, D, S]
        Return:
            new_points: upsampled points data, [B, D', N]
        """
        xyz1 = xyz1.permute(0, 2, 1)
        xyz2 = xyz2.permute(0, 2, 1)

        points2 = points2.permute(0, 2, 1)
        B, N, C = xyz1.shape
        _, S, _ = xyz2.shape

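        if S == 1:
            interpolated_points = points2.repeat(1, N, 1)
            # only one source point: repeat its feature N times to upsample
        else:
            dists = square_distance(xyz1, xyz2)
            # squared distances between the layer2 points xyz1 and the layer3 points xyz2, shape [B, N, S]
            dists, idx = dists.sort(dim=-1)
            dists, idx = dists[:, :, :3], idx[:, :, :3]
            # sort ascending along the last dim so each of the N points sees its nearest of the S points first,
            # then keep the three nearest: shape [B, N, 3]
            dist_recip = 1.0 / (dists + 1e-8)
            # reciprocal of the distance, the w_i(x) of the paper; the three reciprocals per point are then summed
            norm = torch.sum(dist_recip, dim=2, keepdim=True)
            weight = dist_recip / norm
            # weights: nearer points weigh more; each reciprocal divided by the sum is its share of the total
            interpolated_points = torch.sum(index_points(points2, idx) * weight.view(B, N, 3, 1), dim=2)
            # index_points(points2, idx): points2 is [B, S, D], idx is [B, N, 3], and batch_indices inside
            # the function is [B, N, 3], so the gathered tensor is [B, N, 3, D]; weight is viewed as [B, N, 3, 1]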



        if points1 is not None:
            points1 = points1.permute(0, 2, 1)
            new_points = torch.cat([points1, interpolated_points], dim=-1)
        else:
            new_points = interpolated_points

        new_points = new_points.permute(0, 2, 1)
        for i, conv in enumerate(self.mlp_convs):
            bn = self.mlp_bns[i]
            new_points = F.relu(bn(conv(new_points)))
        return new_points
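The weighting step inside forward can be worked through on a single query point. The following plain-Python sketch mirrors the three-nearest-neighbor, inverse-squared-distance logic; `interpolate_feature` and the sample data are hypothetical and exist only for this illustration.

```python
def interpolate_feature(query, known_xyz, known_feats, k=3, eps=1e-8):
    """Inverse-squared-distance interpolation of features at `query`."""
    d2 = [sum((q - p) ** 2 for q, p in zip(query, xyz)) for xyz in known_xyz]
    # indices of the k nearest known points
    nearest = sorted(range(len(known_xyz)), key=lambda i: d2[i])[:k]
    recip = [1.0 / (d2[i] + eps) for i in nearest]
    norm = sum(recip)
    weights = [r / norm for r in recip]  # sums to 1
    dim = len(known_feats[0])
    return [sum(w * known_feats[i][c] for w, i in zip(weights, nearest))
            for c in range(dim)]


known_xyz = [(1, 0, 0), (0, 2, 0), (0, 0, 2), (10, 10, 10)]
known_feats = [[1.0], [2.0], [3.0], [100.0]]
result = interpolate_feature((0, 0, 0), known_xyz, known_feats)
print(result)  # approximately [1.5]
```

The squared distances to the three nearest points are 1, 4, and 4, so the weights are about 2/3, 1/6, and 1/6, and the interpolated feature is about 2/3 * 1 + 1/6 * 2 + 1/6 * 3 = 1.5. The distant fourth point does not contribute because only the three nearest are used.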
