1. Technical evolution
bag of words >> learnable bag of words with manually designed features == VLAD
VLAD >> learnable features == NetVLAD
NetVLAD >> learnable point-cloud features == PointNetVLAD (CVPR 2018)
PointNetVLAD >> one module of a robust relocalization system
2. Applications
Using BoW for relocalization in visual SLAM is by now classic. If
1. the features are learnable,
2. the cluster centers are learnable, and
3. a descriptor can be learned from LiDAR data (unaffected by the lighting changes a scene sees over a day, and with a smaller data volume),
then this is exactly a loop-closure detection module for mobile robots running LiDAR SLAM, a module that neither academia nor engineering practice has solved well so far. A minimal sketch of the learnable-cluster aggregation behind points 1 and 2 follows this list.
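For intuition, here is a minimal sketch of a VLAD aggregation layer in which both the soft assignment and the cluster centers are trained, written in the same TensorFlow 1.x style as the listing below. All names here (netvlad_layer, assign_w, centers, proj) are my own illustration, not code from the PointNetVLAD repo:

import tensorflow as tf

def netvlad_layer(features, cluster_size=64, output_dim=256):
    """features: B x N x D local descriptors -> B x output_dim global descriptor."""
    feature_dim = features.get_shape()[2].value
    init = tf.random_normal_initializer(stddev=1.0 / feature_dim ** 0.5)
    # Both the soft-assignment weights and the cluster centers are learnable.
    assign_w = tf.get_variable('assign_w', [feature_dim, cluster_size], initializer=init)
    centers = tf.get_variable('centers', [1, feature_dim, cluster_size], initializer=init)
    # Soft-assign every point feature to every cluster: B x N x K.
    assignment = tf.nn.softmax(tf.einsum('bnd,dk->bnk', features, assign_w))
    # Accumulate assignment-weighted residuals to the centers: B x D x K.
    a_sum = tf.reduce_sum(assignment, axis=1, keepdims=True)  # B x 1 x K
    vlad = tf.einsum('bnk,bnd->bdk', assignment, features) - a_sum * centers
    # Intra-normalize per cluster, flatten, project, L2-normalize.
    vlad = tf.nn.l2_normalize(vlad, axis=1)
    vlad = tf.reshape(vlad, [-1, feature_dim * cluster_size])
    vlad = tf.nn.l2_normalize(vlad, axis=1)
    proj = tf.get_variable('proj', [feature_dim * cluster_size, output_dim], initializer=init)
    return tf.nn.l2_normalize(tf.matmul(vlad, proj), axis=1)

Because the centers sit inside the graph as variables, they are updated by backpropagation just like the feature extractor, which is the whole point of moving from VLAD to NetVLAD.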
3. Pitfalls
But this does not mean LiDAR loop closure is solved. Based on my own hands-on work, here are three possible pitfalls:
1. Flexible PointNet:
The backbone network needs repeated tuning to fit the scene and the hardware budget. Most PointNet implementations copy Charles Qi's original verbatim, as the code below shows, with the network dimensions hard-coded. Compare the recent VINS kidnap-recovery work from Prof. Shen's group at HKUST, which did a lightweight re-tuning on top of NetVLAD; getting PointNetVLAD to run on a real robot requires the same kind of tuning work, and that effort can only be paid for with time.
# TensorFlow 1.x listing; tf_util and the two transform nets come from
# Charles Qi's PointNet codebase. Note the hard-coded CLUSTER_SIZE and OUTPUT_DIM.
import tensorflow as tf
import tf_util
from transform_nets import input_transform_net, feature_transform_net

def forward(point_cloud, is_training, bn_decay=None):
    """PointNetVLAD.
    INPUT:  batch_num_queries x num_pointclouds_per_query x num_points_per_pointcloud x 3
    OUTPUT: batch_num_queries x num_pointclouds_per_query x output_dim
    """
    batch_num_queries = point_cloud.get_shape()[0].value
    num_pointclouds_per_query = point_cloud.get_shape()[1].value
    num_points = point_cloud.get_shape()[2].value
    CLUSTER_SIZE = 64   # number of VLAD clusters, fixed at graph-build time
    OUTPUT_DIM = 256    # final descriptor dimension, fixed at graph-build time
    # Flatten the query dimension so each point cloud is processed independently.
    point_cloud = tf.reshape(point_cloud,
                             [batch_num_queries * num_pointclouds_per_query, num_points, 3])
    # Input T-Net: learn a 3x3 transform that aligns the raw points.
    with tf.variable_scope('transform_net1') as sc:
        input_transform = input_transform_net(point_cloud, is_training, bn_decay, K=3)
    point_cloud_transformed = tf.matmul(point_cloud, input_transform)
    input_image = tf.expand_dims(point_cloud_transformed, -1)
    # Shared per-point MLP, implemented as a 1x3 then 1x1 convolutions.
    net = tf_util.conv2d(input_image, 64, [1, 3],
                         padding='VALID', stride=[1, 1],
                         is_training=is_training,
                         scope='conv1', bn_decay=bn_decay)
    net = tf_util.conv2d(net, 64, [1, 1],
                         padding='VALID', stride=[1, 1],
                         is_training=is_training,
                         scope='conv2', bn_decay=bn_decay)
    # Feature T-Net: learn a 64x64 transform in feature space.
    with tf.variable_scope('transform_net2') as sc:
        feature_transform = feature_transform_net(net, is_training, bn_decay, K=64)
    net_transformed = tf.matmul(tf.squeeze(net, axis=[2]), feature_transform)
    net_transformed = tf.expand_dims(net_transformed, [2])
    net = tf_util.conv2d(net_transformed, 64, [1, 1],
                         padding='VALID', stride=[1, 1],
                         is_training=is_training,
                         scope='conv3', bn_decay=bn_decay)
    # ... (the remaining per-point convolutions and the NetVLAD aggregation follow)
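The fix is mechanical but tedious: lift the hard-coded widths into arguments so the backbone can be thinned for the target scene and hardware. A rough sketch of what I mean, reusing the tf_util helper from the PointNet codebase; forward_flexible, mlp_dims, cluster_size and output_dim are hypothetical names of mine, and the T-Nets and the VLAD layer are omitted for brevity:

def forward_flexible(point_cloud, is_training, bn_decay=None,
                     mlp_dims=(64, 64), cluster_size=32, output_dim=128):
    """Same interface as forward(), but the per-point MLP widths and the
    VLAD cluster/output sizes are parameters instead of constants."""
    b = point_cloud.get_shape()[0].value
    q = point_cloud.get_shape()[1].value
    n = point_cloud.get_shape()[2].value
    net = tf.expand_dims(tf.reshape(point_cloud, [b * q, n, 3]), -1)
    kernel = [1, 3]  # the first conv consumes the xyz axis, the rest are 1x1
    for i, dim in enumerate(mlp_dims):
        net = tf_util.conv2d(net, dim, kernel,
                             padding='VALID', stride=[1, 1],
                             is_training=is_training,
                             scope='conv%d' % (i + 1), bn_decay=bn_decay)
        kernel = [1, 1]
    # ... feed the per-point features into a VLAD layer built with
    # cluster_size and output_dim, then reshape back to b x q x output_dim.

On an embedded platform one would start from a thinner setting such as mlp_dims=(32, 64) with cluster_size=32, then profile and widen only where accuracy demands it.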