PointNet Study (7): input_transform_net

guyuezunting

Category: pointnet. Tags: tensorflow

Article link: https://blog.csdn.net/guyuezunting/article/details/106901658

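This function is the first network called inside get_model, and it contains quite a few statements, so it gets a post of its own.

First, the function's implementation: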

def input_transform_net(point_cloud, is_training, bn_decay=None, K=3):
    """ Input (XYZ) Transform Net, input is BxNx3 gray image
        Return:
            Transformation matrix of size 3xK """
    batch_size = point_cloud.get_shape()[0].value
    num_point = point_cloud.get_shape()[1].value

    input_image = tf.expand_dims(point_cloud, -1)
    net = tf_util.conv2d(input_image, 64, [1,3],
                         padding='VALID', stride=[1,1],
                         bn=True, is_training=is_training,
                         scope='tconv1', bn_decay=bn_decay)
    net = tf_util.conv2d(net, 128, [1,1],
                         padding='VALID', stride=[1,1],
                         bn=True, is_training=is_training,
                         scope='tconv2', bn_decay=bn_decay)
    net = tf_util.conv2d(net, 1024, [1,1],
                         padding='VALID', stride=[1,1],
                         bn=True, is_training=is_training,
                         scope='tconv3', bn_decay=bn_decay)
    net = tf_util.max_pool2d(net, [num_point,1],
                             padding='VALID', scope='tmaxpool')

    net = tf.reshape(net, [batch_size, -1])
    net = tf_util.fully_connected(net, 512, bn=True, is_training=is_training,
                                  scope='tfc1', bn_decay=bn_decay)
    net = tf_util.fully_connected(net, 256, bn=True, is_training=is_training,
                                  scope='tfc2', bn_decay=bn_decay)

    with tf.variable_scope('transform_XYZ') as sc:
        assert(K==3)
        weights = tf.get_variable('weights', [256, 3*K],
                                  initializer=tf.constant_initializer(0.0),
                                  dtype=tf.float32)
        biases = tf.get_variable('biases', [3*K],
                                 initializer=tf.constant_initializer(0.0),
                                 dtype=tf.float32)
        biases += tf.constant([1,0,0,0,1,0,0,0,1], dtype=tf.float32)
        transform = tf.matmul(net, weights)
        transform = tf.nn.bias_add(transform, biases)

    transform = tf.reshape(transform, [batch_size, 3, K])
    return transform

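The inputs are point_cloud, is_training, bn_decay=None and K=3. The call site is:

transform = input_transform_net(point_cloud, is_training, bn_decay, K=3)

For the first three arguments, see PointNet Study (5).

So point_cloud is a 32x1024x3 tensor, and is_training is a bool tensor with no shape specified. bn_decay is the decay coefficient for the batch-normalization moving averages, not a learning rate; it is derived from an exponentially decaying term, so it gradually increases as training progresses. K, according to the function's docstring, is the column count of the 3xK transformation matrix rather than a convolution kernel size. The transform that is finally returned is a 32x3x3 tensor.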
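As a rough illustration of how a bn_decay value can rise over training, here is a minimal pure-Python sketch of a clipped, staircase schedule. The helper name bn_decay_at and all constants (initial momentum 0.5, halving rate, step size 200000, cap 0.99) are assumptions chosen for illustration, not values taken from this post; check them against the training script you actually run.

```python
def bn_decay_at(samples_seen, init_momentum=0.5, decay_rate=0.5,
                decay_step=200000, clip=0.99):
    """Hypothetical helper: batch-norm decay after `samples_seen` samples.

    The momentum term shrinks exponentially in discrete steps (staircase);
    the decay is 1 - momentum, capped at `clip`.
    """
    momentum = init_momentum * decay_rate ** (samples_seen // decay_step)
    return min(clip, 1.0 - momentum)


# The decay starts low and climbs toward the cap as training proceeds.
schedule = [bn_decay_at(s) for s in (0, 200000, 400000, 2000000)]
print(schedule)
```

The point of the sketch is only the direction of change: the momentum term decays, so the decay coefficient grows until it hits the cap.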

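The first two statements read the shape of point_cloud: batch_size = 32 and num_point = 1024.

The third statement appends one dimension to the input point cloud, turning it into a 32x1024x3x1 tensor, input_image.

Statements four, five and six build the convolution layers with tf_util.conv2d. See PointNet Study (8): tf_util.conv2d.

The first convolution layer "tconv1" outputs a tensor of shape [32, 1024, 1, 64], the second layer "tconv2" outputs shape [32, 1024, 1, 128], and the third layer "tconv3" outputs shape [32, 1024, 1, 1024].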
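The shape changes above can be traced with a small NumPy sketch. It is not the tf_util.conv2d implementation: it assumes that a [1,3] VALID kernel over a (B, N, 3, 1) input acts as one shared linear map per point, and that the [1,1] kernels do the same on the channel axis. Batch norm and biases are left out, the weights are random, and pointwise_layer is a hypothetical helper. B and N are shrunk from 32 and 1024 to keep it quick.

```python
import numpy as np

rng = np.random.default_rng(0)
B, N = 2, 8  # small stand-ins for batch_size=32 and num_point=1024

point_cloud = rng.normal(size=(B, N, 3))
input_image = point_cloud[..., np.newaxis]      # like tf.expand_dims(x, -1)


def pointwise_layer(x, out_channels):
    """Hypothetical helper: shared per-point linear map followed by ReLU.

    x has shape (B, N, 1, C_in); batch norm and biases are omitted.
    """
    w = rng.normal(size=(x.shape[-1], out_channels))
    return np.maximum(x @ w, 0.0)


# The [1,3] kernel spans the whole xyz axis, so width 3 collapses to width 1.
# Moving xyz into the channel axis expresses that as a plain matrix product.
net = input_image.transpose(0, 1, 3, 2)         # (B, N, 1, 3)
shapes = [input_image.shape]
for channels in (64, 128, 1024):                # tconv1, tconv2, tconv3
    net = pointwise_layer(net, channels)
    shapes.append(net.shape)

print(shapes)
```

Every point is processed with the same weights and never mixed with its neighbours, which is why the N axis stays intact through all three layers.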

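The seventh statement builds the max-pooling layer. So far, "transform_net1" therefore contains three 2D convolution layers and one max-pooling layer, "tmaxpool". Its output is a tensor of shape [32, 1, 1, 1024]. See the post on pointnet tf_util.max_pool2d.

Since the height and width are both 1, the 1024 channel values computed for each of the 32 samples in the batch can be pulled out directly for the computation that follows.

The eighth statement does exactly that with a reshape, net = tf.reshape(net, [batch_size, -1]), which gives a [32, 1024] tensor.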
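The pooling and reshape steps can be mimicked in NumPy as follows. This is a sketch of the shapes only, with random data and with B and N shrunk from 32 and 1024; it does not call tf_util.max_pool2d.

```python
import numpy as np

B, N, C = 2, 8, 1024  # small stand-ins for 32, 1024 and the 1024 channels
net = np.random.default_rng(0).normal(size=(B, N, 1, C))

# A [num_point, 1] pooling window covers every point, leaving one row per sample.
pooled = net.max(axis=1, keepdims=True)   # (B, 1, 1, C)
flat = pooled.reshape(B, -1)              # like tf.reshape(net, [batch_size, -1])

print(pooled.shape, flat.shape)
```

Each row of flat holds, per channel, the largest response found over all points of that sample.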

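Statements nine and ten:

net = tf_util.fully_connected(net, 512, bn=True, is_training=is_training,
                              scope='tfc1', bn_decay=bn_decay)
net = tf_util.fully_connected(net, 256, bn=True, is_training=is_training,
                              scope='tfc2', bn_decay=bn_decay)

These pass net through two fully connected layers. See the post on pointnet tf_util.fully_connected. Afterwards net is a [32, 256] tensor.

The remaining operations multiply the fully connected output by a weight matrix of shape [256, 3*K] (K=3) and add a bias tensor of shape [9] to which [1,0,0,0,1,0,0,0,1] has been added. Both weights and biases are initialized to zero, so at the start of training the result is exactly the flattened identity matrix. This yields a [32, 9] tensor, transform, which is reshaped to [32, 3, 3] and later used to rotate the point cloud before prediction.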
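A small NumPy sketch of this final step shows why the network starts out as a no-op. The zero initialization and the added [1,0,0,0,1,0,0,0,1] vector follow the function shown earlier; the random inputs, the reduced batch size, and the way the matrix is applied to the points with a batched matrix product are assumptions made for illustration.

```python
import numpy as np

B, K = 4, 3
rng = np.random.default_rng(0)
net = rng.normal(size=(B, 256))                    # stand-in for the tfc2 output
points = rng.normal(size=(B, 16, 3))               # small stand-in point cloud

weights = np.zeros((256, 3 * K))                   # zero-initialized weights
biases = np.zeros(3 * K) + np.eye(3).reshape(-1)   # adds [1,0,0,0,1,0,0,0,1]

# matmul + bias_add, then reshape to one 3xK matrix per sample
transform = (net @ weights + biases).reshape(B, 3, K)

# Assumed usage: rotate each cloud by its own matrix, (N,3) x (3,3) per sample.
aligned = points @ transform

print(transform[0])
```

With zero weights the features in net have no influence yet, so every sample gets the identity matrix and aligned equals points. Training then moves the weights away from zero and the predicted matrices away from the identity.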

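There are two reasons for designing the net this way. First, the convolutions compute features from the x, y and z coordinates of every point and reduce them to one global feature, much like extracting salient points such as edges and corners in image processing. Once the features are extracted, a rotation matrix is computed from them, as one would between a source and a target, which gives an alignment (registration) effect.

The second reason is that point clouds are unordered. The convolution kernels are applied to every point separately and the maximum over all points is taken afterwards, so the same point cloud fed in with a different point order no longer produces a different result.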
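The order-invariance argument can be checked numerically. The sketch below uses NumPy with a single random shared layer instead of the three trained convolution layers; global_feature is a hypothetical helper, and the sizes are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
points = rng.normal(size=(10, 3))     # one cloud with 10 points
w = rng.normal(size=(3, 16))          # shared per-point weights (illustrative)


def global_feature(cloud):
    """Hypothetical helper: shared per-point map, then max over points."""
    per_point = np.maximum(cloud @ w, 0.0)   # each row depends on one point only
    return per_point.max(axis=0)             # max ignores the row order


# Feed the same cloud with its points in a different order.
shuffled = points[rng.permutation(len(points))]
same = np.allclose(global_feature(points), global_feature(shuffled))
print(same)
```

Shuffling only permutes the rows of per_point, and a maximum over rows does not depend on their order, so both calls return the same global feature.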
