Strategy 2. Local Feature Aggregation (LFA)
LocSE + Attentive Pooling + residual block
1. Network Architecture Overview
The full network consists of an encoder, a decoder, and a head. The LFA module is used in the encoder, where it extracts features for the randomly sampled points and compensates for the information that random sampling throws away. Besides LFA, the author also uses residual blocks. The code for each component is given below, starting with a rough sketch of the encoder loop.
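As orientation before the individual modules, the encoder can be sketched as a loop that alternates a dilated residual block with random subsampling. This is a minimal sketch assuming the helper names used later in this post (dilated_res_block, random_sample) and an inputs dict of precomputed per-layer coordinates and neighbour indices; the layer widths are illustrative, not the official configuration.

def encoder(self, feature, inputs, is_training):
    f_encoder_list = []
    # d_out per encoder layer -- illustrative values, not the official config
    for i, d_out in enumerate([16, 64, 128, 256]):
        # feature extraction on the current point set (Section 2 below)
        f = self.dilated_res_block(feature, inputs['xyz'][i], inputs['neigh_idx'][i],
                                   d_out, 'Encoder_layer_' + str(i), is_training)
        # drop to the precomputed random subset of points for the next layer
        feature = self.random_sample(f, inputs['sub_idx'][i])
        f_encoder_list.append(feature)
    return f_encoder_list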
2. Dilated Residual Block
The block splits into two paths, a mainstream and a shortcut, which are summed at the end to compensate for information loss.
mainstream: MLP(d_out/2) + LFA(d_out) + MLP(d_out*2)
shortcut: MLP(d_out*2)
residual_output = LeakyReLU(mainstream + shortcut)
After the residual block, the point cloud is downsampled and passed to the next layer (the point subset each layer keeps, together with its N nearest neighbours, is recorded before training starts); see the sketch after the code block below.
Code:
def dilated_res_block(self, feature, xyz, neigh_idx, d_out, name, is_training):
    # mainstream: 1x1 conv down to d_out//2, LFA, then expand to d_out*2 (no activation)
    f_pc = helper_tf_util.conv2d(feature, d_out // 2, [1, 1], name + 'mlp1', [1, 1], 'VALID', True, is_training)
    f_pc = self.building_block(xyz, f_pc, neigh_idx, d_out, name + 'LFA', is_training)
    f_pc = helper_tf_util.conv2d(f_pc, d_out * 2, [1, 1], name + 'mlp2', [1, 1], 'VALID', True, is_training,
                                 activation_fn=None)
    # shortcut: 1x1 conv straight from the input features to d_out*2 (no activation)
    shortcut = helper_tf_util.conv2d(feature, d_out * 2, [1, 1], name + 'shortcut', [1, 1], 'VALID',
                                     activation_fn=None, bn=True, is_training=is_training)
    # sum the two paths and apply LeakyReLU
    return tf.nn.leaky_relu(f_pc + shortcut)
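The downsampling step itself is not shown in this post. Below is a minimal sketch, assuming the precomputed pool_idx described above (for each kept point, the indices of its nearest neighbours in the denser set) and following the gather-then-max-pool pattern of the RandLA-Net reference code; the exact signature is an assumption.

@staticmethod
def random_sample(feature, pool_idx):
    # feature:  [B, N, 1, d] features of the current (denser) point set
    # pool_idx: [B, N', K]   precomputed neighbour indices of the N' kept points
    # returns:  [B, N', 1, d]
    feature = tf.squeeze(feature, axis=2)
    num_neigh = tf.shape(pool_idx)[-1]
    d = feature.get_shape()[-1]
    batch_size = tf.shape(pool_idx)[0]
    pool_idx = tf.reshape(pool_idx, [batch_size, -1])
    # gather the K neighbours of every kept point, then max-pool over them
    pool_features = tf.batch_gather(feature, pool_idx)
    pool_features = tf.reshape(pool_features, [batch_size, -1, num_neigh, d])
    pool_features = tf.reduce_max(pool_features, axis=2, keepdims=True)
    return pool_features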
3. The LFA Module
The LFA module is composed of Local Spatial Encoding (LocSE) and Attentive Pooling, stacked twice:
LocSE + Attentive Pooling + LocSE + Attentive Pooling
3.1 LocSE (Local Spatial Encoding):
Step 1: relative position encoding. Concatenate the centre point, its neighbour points, their relative coordinates, and the Euclidean distances, then apply a 1x1 convolution so the result has the same dimension as the input point features, giving the encoded relative-position feature.
Step 2: spatially encoded feature. Concatenating the encoded relative-position feature with the neighbours' point features yields the spatially encoded feature.
LocSE code:
def relative_pos_encoding(self, xyz, neigh_idx):
    # gather the xyz coordinates of each point's K neighbours: [B, N, K, 3]
    neighbor_xyz = self.gather_neighbour(xyz, neigh_idx)
    # tile each centre point so it lines up with its K neighbours: [B, N, K, 3]
    xyz_tile = tf.tile(tf.expand_dims(xyz, axis=2), [1, 1, tf.shape(neigh_idx)[-1], 1])
    relative_xyz = xyz_tile - neighbor_xyz
    # Euclidean distance from centre to each neighbour: [B, N, K, 1]
    relative_dis = tf.sqrt(tf.reduce_sum(tf.square(relative_xyz), axis=-1, keepdims=True))
    # concat distance, relative offset, centre and neighbour coordinates: [B, N, K, 10]
    relative_feature = tf.concat([relative_dis, relative_xyz, xyz_tile, neighbor_xyz], axis=-1)
    return relative_feature
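The gather_neighbour helper used above is not shown in this post. Here is a minimal sketch consistent with how it is called, indexing a [B, N, d] tensor with [B, N, K] neighbour indices; the reference implementation may differ in detail.

@staticmethod
def gather_neighbour(pc, neighbor_idx):
    # pc:           [B, N, d]  per-point coordinates or features
    # neighbor_idx: [B, N, K]  indices of each point's K nearest neighbours
    # returns:      [B, N, K, d]
    batch_size = tf.shape(pc)[0]
    num_points = tf.shape(pc)[1]
    d = pc.get_shape()[2].value
    index_input = tf.reshape(neighbor_idx, shape=[batch_size, -1])
    features = tf.batch_gather(pc, index_input)
    features = tf.reshape(features, [batch_size, num_points, tf.shape(neighbor_idx)[-1], d])
    return features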
def building_block(self, xyz, feature, neigh_idx, d_out, name, is_training):
    d_in = feature.get_shape()[-1].value
    # --- first LocSE + Attentive Pooling ---
    # encode relative positions, then project them to d_in so dimensions match
    f_xyz = self.relative_pos_encoding(xyz, neigh_idx)
    f_xyz = helper_tf_util.conv2d(f_xyz, d_in, [1, 1], name + 'mlp1', [1, 1], 'VALID', True, is_training)
    # concat neighbour features with the position encoding, pool down to d_out//2
    f_neighbours = self.gather_neighbour(tf.squeeze(feature, axis=2), neigh_idx)
    f_concat = tf.concat([f_neighbours, f_xyz], axis=-1)
    f_pc_agg = self.att_pooling(f_concat, d_out // 2, name + 'att_pooling_1', is_training)
    # --- second LocSE + Attentive Pooling ---
    # re-project the position encoding to d_out//2 (xyz -> d_in -> d_out//2)
    f_xyz = helper_tf_util.conv2d(f_xyz, d_out // 2, [1, 1], name + 'mlp2', [1, 1], 'VALID', True, is_training)
    f_neighbours = self.gather_neighbour(tf.squeeze(f_pc_agg, axis=2), neigh_idx)
    f_concat = tf.concat([f_neighbours, f_xyz], axis=-1)
    f_pc_agg = self.att_pooling(f_concat, d_out, name + 'att_pooling_2', is_training)
    return f_pc_agg
As the code shows, the whole building_block (LFA) runs LocSE and Attentive Pooling twice.
The second round re-encodes the relative positions of the input point coordinates, this time projecting the encoding to length d_out//2 (xyz --> f_xyz(d_in) --> f_xyz(d_out//2)). It is concatenated with the features produced by the first fuse-and-pool round (also of length d_out//2) and pooled again, now to d_out.
3.2 Attentive Pooling:
Idea: generate weights of the same dimension from the points' own features, so each neighbour gets its own importance, then aggregate with a weighted sum.
Structure: MLP + weighted sum + MLP
Steps: the features first pass through a simple MLP (a dense layer), softmax then scores each of the K neighbour points, the features are multiplied by these attention scores, and reduce_sum fuses all neighbour features into one. Finally a shared MLP reshapes the dimension, producing a single local feature for the centre point.
Attentive Pooling code:
def att_pooling(feature_set, d_out, name, is_training):
    # feature_set: [B, N, K, d] concatenated neighbour features
    batch_size = tf.shape(feature_set)[0]
    num_points = tf.shape(feature_set)[1]
    num_neigh = tf.shape(feature_set)[2]
    d = feature_set.get_shape()[3].value
    f_reshaped = tf.reshape(feature_set, shape=[-1, num_neigh, d])
    # score each of the K neighbours with a shared dense layer + softmax over K
    att_activation = tf.layers.dense(f_reshaped, d, activation=None, use_bias=False, name=name + 'fc')
    att_scores = tf.nn.softmax(att_activation, axis=1)
    # attention-weighted sum over the K neighbours
    f_agg = f_reshaped * att_scores
    f_agg = tf.reduce_sum(f_agg, axis=1)
    f_agg = tf.reshape(f_agg, [batch_size, num_points, 1, d])
    # shared MLP maps the aggregated feature to d_out
    f_agg = helper_tf_util.conv2d(f_agg, d_out, [1, 1], name + 'mlp', [1, 1], 'VALID', True, is_training)
    return f_agg
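To make the weighting concrete, here is a tiny NumPy re-enactment of the core of att_pooling (dense scoring, softmax over the neighbours, weighted sum) for a single centre point with K=3 neighbours and d=2 features. The weight matrix W is random here purely for illustration; in the real layer it is learned.

import numpy as np

rng = np.random.default_rng(0)

# one centre point with K=3 neighbours, each carrying d=2 features
f = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [2.0, 2.0]])                 # [K, d]

W = rng.standard_normal((2, 2))            # learned in the real layer

scores = f @ W                             # dense layer without bias: [K, d]
scores = np.exp(scores) / np.exp(scores).sum(axis=0, keepdims=True)  # softmax over K

f_agg = (f * scores).sum(axis=0)           # attention-weighted sum over neighbours
print(f_agg)                               # [d] aggregated feature for the centre point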