[Point Cloud] A Code Walkthrough of the Dilated Residual Block in RandLA-Net

 The code comes from: https://github.com/QingyongHu/RandLA-Net/blob/master/RandLANet.py

 The function dilated_res_block(self, feature, xyz, neigh_idx, d_out, name, is_training) implements the Dilated Residual Block module from the paper. The code is as follows:

def dilated_res_block(self, feature, xyz, neigh_idx, d_out, name, is_training):
        f_pc = helper_tf_util.conv2d(feature, d_out // 2, [1, 1], name + 'mlp1', [1, 1], 'VALID', True, is_training)
        f_pc = self.building_block(xyz, f_pc, neigh_idx, d_out, name + 'LFA', is_training)
        f_pc = helper_tf_util.conv2d(f_pc, d_out * 2, [1, 1], name + 'mlp2', [1, 1], 'VALID', True, is_training,
                                     activation_fn=None)
        shortcut = helper_tf_util.conv2d(feature, d_out * 2, [1, 1], name + 'shortcut', [1, 1], 'VALID',
                                         activation_fn=None, bn=True, is_training=is_training)
        return tf.nn.leaky_relu(f_pc + shortcut)
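Shape-wise, the block's residual structure can be sketched in NumPy. This is only a sketch with hypothetical dimensions: each 1×1 convolution is just a shared per-point linear map, and the LFA body (building_block) is stubbed out as one more linear map.

```python
import numpy as np

def leaky_relu(x, alpha=0.2):
    return np.where(x > 0, x, alpha * x)

# Hypothetical dims: batch 2, N = 5 points, d_in = 6, d_out = 8
B, N, d_in, d_out = 2, 5, 6, 8
feature = np.random.randn(B, 1, N, d_in).astype(np.float32)

# mlp1: d_in -> d_out/2; stand-in for the LFA body: d_out/2 -> d_out;
# mlp2: d_out -> 2*d_out -- all 1x1 convs, i.e. per-point linear maps
W1 = np.random.randn(d_in, d_out // 2).astype(np.float32)
W_lfa = np.random.randn(d_out // 2, d_out).astype(np.float32)
W2 = np.random.randn(d_out, 2 * d_out).astype(np.float32)
W_sc = np.random.randn(d_in, 2 * d_out).astype(np.float32)

f_pc = feature @ W1 @ W_lfa @ W2   # main branch
shortcut = feature @ W_sc          # 1x1-conv shortcut to match channel count
out = leaky_relu(f_pc + shortcut)  # shape: (B, 1, N, 2 * d_out)
```

Both branches end with 2·d_out channels, which is why the shortcut needs its own 1×1 convolution before the addition.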

Here feature has shape [batch_size, 1, N, d_in], where N is the number of points, d_in is the input dimension, d_out is the output dimension, and K is the number of neighbours.

1 After the shared MLP, f_pc has shape [batch_size, 1, N, d_out/2].

2 The main part is the building_block module. Its input xyz has shape [b, N, 3] and neigh_idx has shape [b, N, K]; here d_in = d_out/2 and feature = f_pc.

def building_block(self, xyz, feature, neigh_idx, d_out, name, is_training):
        d_in = feature.get_shape()[-1].value
        f_xyz = self.relative_pos_encoding(xyz, neigh_idx)
        f_xyz = helper_tf_util.conv2d(f_xyz, d_in, [1, 1], name + 'mlp1', [1, 1], 'VALID', True, is_training)
        f_neighbours = self.gather_neighbour(tf.squeeze(feature, axis=2), neigh_idx)
        f_concat = tf.concat([f_neighbours, f_xyz], axis=-1)
        f_pc_agg = self.att_pooling(f_concat, d_out // 2, name + 'att_pooling_1', is_training)

        f_xyz = helper_tf_util.conv2d(f_xyz, d_out // 2, [1, 1], name + 'mlp2', [1, 1], 'VALID', True, is_training)
        f_neighbours = self.gather_neighbour(tf.squeeze(f_pc_agg, axis=2), neigh_idx)
        f_concat = tf.concat([f_neighbours, f_xyz], axis=-1)
        f_pc_agg = self.att_pooling(f_concat, d_out, name + 'att_pooling_2', is_training)
        return f_pc_agg

2.1 relative_pos_encoding(self, xyz, neigh_idx)

def relative_pos_encoding(self, xyz, neigh_idx):
        neighbor_xyz = self.gather_neighbour(xyz, neigh_idx)
        xyz_tile = tf.tile(tf.expand_dims(xyz, axis=2), [1, 1, tf.shape(neigh_idx)[-1], 1])
        relative_xyz = xyz_tile - neighbor_xyz
        relative_dis = tf.sqrt(tf.reduce_sum(tf.square(relative_xyz), axis=-1, keepdims=True))
        relative_feature = tf.concat([relative_dis, relative_xyz, xyz_tile, neighbor_xyz], axis=-1)
        return relative_feature

2.1.1 gather_neighbour: here pc is [b, N, 3] and neighbor_idx is [b, N, K]. neighbor_idx is first reshaped to [b, N*K]; the corresponding features (here, coordinates) are gathered from pc, giving features of shape [b, N*K, 3], which is then reshaped to [b, N, K, 3] for output.

def gather_neighbour(pc, neighbor_idx):
        # gather the coordinates or features of neighboring points
        batch_size = tf.shape(pc)[0]
        num_points = tf.shape(pc)[1]
        d = pc.get_shape()[2].value
        index_input = tf.reshape(neighbor_idx, shape=[batch_size, -1])
        features = tf.batch_gather(pc, index_input)
        features = tf.reshape(features, [batch_size, num_points, tf.shape(neighbor_idx)[-1], d])
        return features

gather_neighbour() fetches the xyz coordinates of each point's neighbours; the relative offsets, distances, and absolute coordinates are then concatenated into relative_feature, whose last dimension is 1 + 3 + 3 + 3 = 10.
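The gather-and-encode steps above can be reproduced in a small NumPy sketch (hypothetical toy dimensions; tf.batch_gather is emulated by indexing each batch element with its own index array):

```python
import numpy as np

# Hypothetical small example: batch 1, N = 4 points, K = 2 neighbours
b, N, K = 1, 4, 2
xyz = np.random.randn(b, N, 3).astype(np.float32)
neigh_idx = np.random.randint(0, N, size=(b, N, K))

# gather_neighbour: flatten indices, gather per batch element, reshape back
flat_idx = neigh_idx.reshape(b, -1)                       # [b, N*K]
neighbor_xyz = np.stack([p[i] for p, i in zip(xyz, flat_idx)])
neighbor_xyz = neighbor_xyz.reshape(b, N, K, 3)           # [b, N, K, 3]

# relative_pos_encoding: tile the centre point over K, subtract, take norm
xyz_tile = np.repeat(xyz[:, :, None, :], K, axis=2)       # [b, N, K, 3]
relative_xyz = xyz_tile - neighbor_xyz
relative_dis = np.sqrt((relative_xyz ** 2).sum(-1, keepdims=True))
relative_feature = np.concatenate(
    [relative_dis, relative_xyz, xyz_tile, neighbor_xyz], axis=-1)
# relative_feature: [b, N, K, 10] = 1 (dist) + 3 + 3 + 3
```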

The returned f_xyz = relative_feature is then passed through a shared MLP that maps it to [b, N, K, d_out/2]. gather_neighbour() next fetches the neighbouring point features f_neighbours of shape [b, N, K, d_out/2], which are concatenated with f_xyz to give f_concat of shape [b, N, K, d_out].
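The concatenation can be sanity-checked shape-wise with a tiny NumPy sketch (hypothetical toy dimensions):

```python
import numpy as np

# Hypothetical shapes: b = 1, N = 4, K = 2, d_out = 8
b, N, K, d_out = 1, 4, 2, 8
f_xyz = np.random.randn(b, N, K, d_out // 2)         # encoded positions
f_neighbours = np.random.randn(b, N, K, d_out // 2)  # gathered point features
f_concat = np.concatenate([f_neighbours, f_xyz], axis=-1)
# f_concat: [b, N, K, d_out], since d_out/2 + d_out/2 = d_out
```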

2.2 att_pooling(f_concat, d_out // 2, name + 'att_pooling_1', is_training)

def att_pooling(feature_set, d_out, name, is_training):
        batch_size = tf.shape(feature_set)[0]
        num_points = tf.shape(feature_set)[1]
        num_neigh = tf.shape(feature_set)[2]
        d = feature_set.get_shape()[3].value
        f_reshaped = tf.reshape(feature_set, shape=[-1, num_neigh, d])
        att_activation = tf.layers.dense(f_reshaped, d, activation=None, use_bias=False, name=name + 'fc')
        att_scores = tf.nn.softmax(att_activation, axis=1)
        f_agg = f_reshaped * att_scores
        f_agg = tf.reduce_sum(f_agg, axis=1)
        f_agg = tf.reshape(f_agg, [batch_size, num_points, 1, d])
        f_agg = helper_tf_util.conv2d(f_agg, d_out, [1, 1], name + 'mlp', [1, 1], 'VALID', True, is_training)
        return f_agg

Here feature_set = f_concat, with shape [b, N, K, d_out]; note that the d_out passed into this function is half of the outer function's d_out.

The reshape turns feature_set into f_reshaped of shape [b*N, K, d_out].

A dense layer follows: a fully connected layer that transforms only the last dimension, with output width equal to the number of units, i.e. the argument d. Because the same bias-free fully connected layer is applied to every neighbourhood, this realises the shared-parameter MLP described in the paper. A softmax over the neighbour axis (axis 1) then gives att_scores of shape [b*N, K, d].

Next, reduce_sum sums the attention-weighted features over the neighbour axis (axis 1); since keepdims is not set, that axis is dropped, giving f_agg of shape [b*N, d].

A reshape then restores f_agg to [b, N, 1, d], and a final shared MLP maps it to [b, N, 1, d_out] for output — where d_out is this function's argument, i.e. half of the outer block's d_out for the first attentive pooling.
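The attentive-pooling core (shared FC → softmax over neighbours → weighted sum) can be sketched in NumPy as follows; dimensions are hypothetical and the final shared MLP is omitted:

```python
import numpy as np

def softmax(x, axis):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

# Hypothetical shapes: b*N = 6 neighbourhoods, K = 3 neighbours, d = 4 channels
bN, K, d = 6, 3, 4
f_reshaped = np.random.randn(bN, K, d).astype(np.float32)

# Shared bias-free fully connected layer scores every neighbour...
W = np.random.randn(d, d).astype(np.float32)
att_scores = softmax(f_reshaped @ W, axis=1)   # softmax over the K axis

# ...and the weighted sum over K collapses each neighbourhood to one vector
f_agg = (f_reshaped * att_scores).sum(axis=1)  # shape: (bN, d)
```

This learned soft weighting replaces a hard max pooling over the K neighbours, which is the point of the attentive pooling unit.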
