[Paper Series] WaffleIron

Paper: https://arxiv.org/pdf/2301.10100v1.pdf

Code: https://github.com/valeoai/WaffleIron

Structure:

  1. Method: points + projections <MLP + 2D convolutions>

(1) Point feature extraction is done with an MLP, i.e. the "High-level description" section of the paper.

(2) Projection: the points are projected onto the three planes (x, y), (x, z) and (y, z) in turn.

> we propose to repeatedly project along each main axis. Concretely, we sequentially project on planes (x, y), (x, z) and (y, z) at layer l = 1, l = 2, and l = 3, respectively, and repeat this sequence until layer l = L.

So there are two parameters to control: ① the depth L <a multiple of 3; the authors' rationale is that the points are projected cyclically onto the (x, y), (y, z) and (z, x) planes, but they neither explain what is gained by projecting onto all three planes, nor did I find more than the (x, y) projection in the code, i.e. BEV-style processing>; ② the resolution of the 2D grids (grid_shape).
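To make the cyclic projection concrete, here is a minimal sketch (my own illustration, not the authors' code; the function name, grid resolutions and point-cloud range are made up) of how the flattened 2D cell index of every point could be computed for each of the three planes. Indices like these play the role of the cell_ind tensor consumed by the backbone further below.

import torch

def compute_cell_indices(xyz, grids_shape, pc_min, pc_max):
    """xyz: B x N x 3 point coordinates.
    grids_shape: three (H, W) pairs, one per plane (x, y), (x, z), (y, z).
    pc_min, pc_max: tensors of shape (3,) giving the point-cloud range.
    Returns: B x 3 x N flattened cell index of each point on each plane."""
    planes = [(0, 1), (0, 2), (1, 2)]  # (x, y), (x, z), (y, z)
    cell_ind = []
    for (a, b), (H, W) in zip(planes, grids_shape):
        # Normalize the two kept coordinates to [0, 1), then discretize
        u = (xyz[..., a] - pc_min[a]) / (pc_max[a] - pc_min[a])
        v = (xyz[..., b] - pc_min[b]) / (pc_max[b] - pc_min[b])
        iu = (u.clamp(0, 1 - 1e-6) * H).long()
        iv = (v.clamp(0, 1 - 1e-6) * W).long()
        cell_ind.append(iu * W + iv)  # row-major flattening of (iu, iv)
    return torch.stack(cell_ind, dim=1)  # B x 3 x N

# Example: a 50 m x 50 m x 6 m scene discretized into 256 x 256 cells per plane
# xyz = torch.rand(2, 1024, 3) * torch.tensor([50.0, 50.0, 6.0])
# cell_ind = compute_cell_indices(xyz, [(256, 256)] * 3,
#                                 torch.zeros(3), torch.tensor([50.0, 50.0, 6.0]))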

  2. Embedding layer (x: B×C_in×N; neighbors: B×K×N; output: B×C_out×N)

  • BN (batch normalization): normalize the input point features

  • Point embedding (Conv1d): encode each point individually

  • Neighbor embedding (Conv2d): encode each point's local neighborhood

  • Finally, the point and neighbor embeddings are concatenated and fused with a Conv1d

import torch
import torch.nn as nn


class Embedding(nn.Module):
    def __init__(self, channels_in, channels_out):
        super().__init__()

        # Normalize inputs
        self.norm = nn.BatchNorm1d(channels_in)

        # Point Embedding
        self.conv1 = nn.Conv1d(channels_in, channels_out, 1)

        # Neighborhood embedding
        self.conv2 = nn.Sequential(
            nn.BatchNorm2d(channels_in),
            nn.Conv2d(channels_in, channels_out, 1, bias=False),
            nn.BatchNorm2d(channels_out),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels_out, channels_out, 1, bias=False),
        )

        # Merge point and neighborhood embeddings
        self.final = nn.Conv1d(2 * channels_out, channels_out, 1, bias=True, padding=0)

    def forward(self, x, neighbors):
        """x: B x C_in x N. neighbors: B x K x N. Output: B x C_out x N"""
        # Normalize input
        x = self.norm(x)

        # Point embedding
        point_emb = self.conv1(x)

        # Neighborhood embedding
        gather = []
        # Gather neighbors around each center point
        for ind_nn in range(
            1, neighbors.shape[1]
        ):  # Skip the first neighbor, which is the center point itself
            temp = neighbors[:, ind_nn : ind_nn + 1, :].expand(-1, x.shape[1], -1)
            gather.append(torch.gather(x, 2, temp).unsqueeze(-1))
        # Relative coordinates
        neigh_emb = torch.cat(gather, -1) - x.unsqueeze(-1)  # Size: B x C x N x (K-1)
        # Embedding
        neigh_emb = self.conv2(neigh_emb).max(-1)[0]

        # Merge both embeddings
        return self.final(torch.cat((point_emb, neigh_emb), dim=1))
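A minimal usage sketch for the Embedding module above (shapes and values are made up; in the real pipeline the point features and the k-NN indices, whose first entry is the point itself, come from the data loader):

import torch

B, C_in, C_out, N, K = 2, 5, 256, 1024, 16
emb = Embedding(C_in, C_out)

# Per-point input features, e.g. (x, y, z, intensity, range)
x = torch.randn(B, C_in, N)

# Indices of the K nearest neighbors of each point; index 0 is the point itself
neighbors = torch.randint(0, N, (B, K, N))
neighbors[:, 0, :] = torch.arange(N).unsqueeze(0).expand(B, -1)

tokens = emb(x, neighbors)
print(tokens.shape)  # torch.Size([2, 256, 1024])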

  3. Backbone

  • Channel mixing <one block per depth step; a sketch follows after this list>

> BN

> MLP: Conv1d + ReLU + Conv1d

> scale: Conv1d(groups=channels)

> token + scale(mlp(norm(token)))

  • Spatial mixing <applied per 2D grid in grids_shape: tokens are projected onto the grid, mixed with 2D convolutions, and mapped back to the points; a sketch of a possible implementation follows the backbone code below>

> BN

> ffn: Conv2d + ReLU + Conv2d

> scale

> token + scale(ffn(BN(token)))
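The channel-mixing block is a pre-norm residual MLP applied independently to every token. The sketch below is my reconstruction from the description above (the interface matches how the backbone below calls ChannelMix(channels); the hidden width and attribute names are assumptions, not necessarily the repository's exact code):

import torch
import torch.nn as nn

class ChannelMix(nn.Module):
    """token + scale(mlp(norm(token))), mixing only the channel dimension."""

    def __init__(self, channels, hidden_factor=1):
        super().__init__()
        self.norm = nn.BatchNorm1d(channels)
        self.mlp = nn.Sequential(
            nn.Conv1d(channels, hidden_factor * channels, 1),
            nn.ReLU(inplace=True),
            nn.Conv1d(hidden_factor * channels, channels, 1),
        )
        # Learnable per-channel scaling of the residual branch
        self.scale = nn.Conv1d(channels, channels, 1, groups=channels, bias=False)

    def forward(self, tokens):
        # tokens: B x C x N
        return tokens + self.scale(self.mlp(self.norm(tokens)))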

import numpy as np
import torch
import torch.nn as nn

# ChannelMix, SpatialMix and build_proj_matrix come from the rest of the repository
class WaffleIron(nn.Module):
    def __init__(self, channels, depth, grids_shape):
        super().__init__()
        self.grids_shape = grids_shape
        self.channel_mix = nn.ModuleList([ChannelMix(channels) for _ in range(depth)])
        self.spatial_mix = nn.ModuleList(
            [
                SpatialMix(channels, grids_shape[d % len(grids_shape)])
                for d in range(depth)
            ]
        )

    def forward(self, tokens, cell_ind, occupied_cell):

        # Build projection matrices
        batch_size, num_points = tokens.shape[0], tokens.shape[-1]
        point_ind = (
            torch.arange(num_points, device=tokens.device)
            .unsqueeze(0)
            .expand(batch_size, -1)
            .reshape(1, -1)
        )
        batch_ind = (
            torch.arange(batch_size, device=tokens.device)
            .unsqueeze(1)
            .expand(-1, num_points)
            .reshape(1, -1)
        )
        non_zeros_ind = []
        for i in range(cell_ind.shape[1]):
            non_zeros_ind.append(
                torch.cat((batch_ind, cell_ind[:, i].reshape(1, -1), point_ind), axis=0)
            )
        sp_mat = [
            build_proj_matrix(id, occupied_cell, batch_size, np.prod(sh))
            for id, sh in zip(non_zeros_ind, self.grids_shape)
        ]

        # Actual backbone
        for d, (smix, cmix) in enumerate(zip(self.spatial_mix, self.channel_mix)):
            tokens = smix(tokens, sp_mat[d % len(sp_mat)])
            tokens = cmix(tokens)

        return tokens
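For the spatial-mixing step, here is a hedged sketch of what SpatialMix could look like: the normalized tokens are flattened onto the current 2D grid with the sparse projection matrix sp_mat built in the forward pass above (which, as I understand it, averages the features of the points falling into the same cell), processed with the Conv2d FFN, mapped back to the points, scaled, and added to the residual. The layout of sp_mat, the kernel size and the class interface are assumptions; the repository's actual implementation may differ.

import torch
import torch.nn as nn

class SpatialMix(nn.Module):
    """token + scale(unproject(ffn(project(norm(token))))) on one 2D grid."""

    def __init__(self, channels, grid_shape):
        super().__init__()
        self.grid_shape = grid_shape  # (H, W)
        self.norm = nn.BatchNorm1d(channels)
        self.ffn = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
        )
        self.scale = nn.Conv1d(channels, channels, 1, groups=channels, bias=False)

    def forward(self, tokens, sp_mat):
        # tokens: B x C x N; sp_mat: sparse (B*H*W) x (B*N) projection matrix
        B, C, N = tokens.shape
        H, W = self.grid_shape
        x = self.norm(tokens)
        # Project the point tokens onto the 2D grid (cell-wise feature averaging)
        flat = x.transpose(0, 1).reshape(C, B * N)        # C x (B*N)
        grid = torch.sparse.mm(sp_mat, flat.t()).t()      # C x (B*H*W)
        grid = grid.reshape(C, B, H, W).transpose(0, 1)   # B x C x H x W
        # 2D convolutional FFN on the grid
        grid = self.ffn(grid)
        # Map grid features back to the points via the transposed projection
        grid = grid.transpose(0, 1).reshape(C, B * H * W)
        back = torch.sparse.mm(sp_mat.t(), grid.t()).t()  # C x (B*N)
        back = back.reshape(C, B, N).transpose(0, 1)      # B x C x N
        return tokens + self.scale(back)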
