CNN for Very Fast Ground Segmentation in Velodyne LiDAR Data

Welcome to my personal blog: zengzeyu.com

Preface


The original paper is given in the references: CNN for Very Fast Ground Segmentation in Velodyne LiDAR Data.PDF
This post presents a new method for removing ground points from a point cloud: the 3D point cloud data is encoded so that a CNN can be trained on it, and the trained network then segments out the ground points.

Ground Segmentation Method


Training Data


First, some context: analysis of the raw KITTI point clouds produced by the Velodyne HDL-64E shows that each frame is roughly 64x4500, while this paper works with 64x360 frames, so the raw data must be downsampled. Within each frame, the points collected by each laser ring over one full rotation are grouped into 360 bins by azimuth; for each bin, a single representative point (or averaged information) is chosen, and its features and label fill the corresponding grid cell, producing the training data the CNN needs. Each point is labelled for binary classification: ground or non-ground. The point features are P = [Px, Py, Pz, Pi, Pr] ([x coordinate, y coordinate, z coordinate, reflection intensity, range]).
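As a sketch of the azimuth binning described above (assuming the points arrive as an (N, ≥3) array whose first and third columns are the horizontal coordinates, since the paper treats Y as the height axis; `azimuth_bins` is an illustrative helper, not from the paper's code):

```python
import numpy as np

def azimuth_bins(points, n_bins=360):
    """Assign each point to one of n_bins azimuth bins (illustrative helper).

    Assumes points[:, 0] and points[:, 2] are the horizontal coordinates,
    i.e. the Y axis is the height direction as in the paper.
    """
    azimuth = np.arctan2(points[:, 2], points[:, 0])            # in (-pi, pi]
    bins = ((azimuth + np.pi) / (2 * np.pi) * n_bins).astype(int)
    return np.clip(bins, 0, n_bins - 1)                         # guard the +pi edge
```

Each ring then contributes one representative point per bin, giving the 64x360 grid.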

A. Data Preparation (Encoding Sparse 3D Data Into a Dense 2D Matrix)


To feed the sparse 3D point cloud into a 2D CNN, the paper encodes it as 2D multi-channel data stored in a matrix M, as shown in the figure below.

[figure]

The matrix M has size 64x360. During downsampling, the points falling in one cell are averaged to produce its representative. To further simplify the data, a single range value computed from [x, z] stands in for the two coordinates: since the paper takes the Y axis as the height direction, x and z form the horizontal pair and can be collapsed this way. Empty cells are filled by linear interpolation from neighbouring cells.
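A minimal sketch of the per-cell averaging, assuming each point already carries its integer (ring, bin) grid index; the helper name and shapes are illustrative, and empty cells are left at 0 here for the interpolation step to fill:

```python
import numpy as np

def grid_average(rows, cols, values, shape=(64, 360)):
    """Average the `values` of all points that fall into the same grid cell."""
    acc = np.zeros(shape)
    cnt = np.zeros(shape)
    np.add.at(acc, (rows, cols), values)   # scatter-add values per cell
    np.add.at(cnt, (rows, cols), 1)        # count points per cell
    return np.divide(acc, cnt, out=np.zeros(shape), where=cnt > 0)

# The range channel collapses the two horizontal coordinates (Y is height):
# r = np.hypot(x, z)
```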

[figure]

B. Training Dataset


The importance of the training dataset goes without saying. The authors built their own semiautomatic tool for ground annotation based on manually selected seed points. The idea follows the region-growing algorithm from image processing, except that the distance between points replaces the grey value as the growing criterion; they found segmentation worked best with distance bounds of [0.03, 0.07] m. A total of 252 KITTI frames from different scenes were manually annotated, then split 7:3 into [training set, validation set].
Since this alone yields too little data, the authors generated additional training data for the remaining ~19k frames by other means, based on point-cloud features such as the minimum height, the height variation, and the distance and height difference between points of adjacent laser rings. They also experimented with artificial 3D LiDAR data, but the results were poor.
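The annotation tool itself is not released; the following is only a rough sketch of distance-based region growing as described, with a naive O(N²) neighbour search (all names are illustrative, and the threshold sits in the paper's [0.03, 0.07] m band):

```python
import numpy as np
from collections import deque

def region_grow(points, seed_idx, dist_thresh=0.05):
    """Mark all points reachable from the seed via hops shorter than dist_thresh."""
    labels = np.zeros(len(points), dtype=bool)
    labels[seed_idx] = True
    queue = deque([seed_idx])
    while queue:
        i = queue.popleft()
        d = np.linalg.norm(points - points[i], axis=1)   # distances to all points
        for j in np.nonzero((d < dist_thresh) & ~labels)[0]:
            labels[j] = True
            queue.append(j)
    return labels
```

An annotator would pick a few seed points on the road surface and let the region grow outward until the point-to-point gaps exceed the threshold.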

C. Topology and Training of the Proposed Networks


Because little training data is available, only shallow CNN architectures are used, all fully convolutional. The convolutional and deconvolutional layers both include ReLU non-linearities, and training uses gradient descent. The network structure is shown below:

[figure]

The matrix M obtained in section A above is the network input. Since classification is done per cell (pixel), the output has the same size as the input; following the labelling ground = 1, the output for each cell is produced through the softmax probability mapping. Deconvolutional layers, widely used in semantic segmentation, appear in 3 of the 4 proposed network structures, including the best-performing one, L05+deconv (the first in the figure above).
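Given the two-channel softmax output, recovering a per-cell ground mask is just an argmax over the class channel (a sketch; `scores` is assumed to be the (2, 64, 360) score map, with ground = channel 1 as above):

```python
import numpy as np

def ground_mask(scores):
    """Per-cell ground prediction from a (2, H, W) score map (ground = class 1)."""
    return np.argmax(scores, axis=0) == 1
```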

The CNN input is first normalized and rescaled: for height, the KITTI data is filtered to remove points above 3 m, while the depth channel d is normalized with a log transform.
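A sketch of that preprocessing (the 3 m ceiling follows the filtering mentioned above; using log1p for the depth channel is one plausible reading of "log normalization", not confirmed by the paper):

```python
import numpy as np

def normalize_channels(height, depth, max_height=3.0):
    """Clip the height channel at max_height and log-normalize the depth channel."""
    h = np.minimum(height, max_height)
    d = np.log1p(np.maximum(depth, 0.0))   # log(1 + d), well-defined at d = 0
    return h, d
```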

[figure]

[figure]

Experimental Results


[figure]

————————————————————————————————

Reproducing the Code in Caffe


This post focuses on reproducing training and prediction for the L05+deconv network.
[figure]

The Caffe code consists of 5 files: 3 Python files plus 2 auto-generated prototxt files.
Python:
- pcl_data_layer.py: data-reading layer class
- net.py: CNN network structure configuration
- solve.py: solver parameter configuration

prototxt:
- pcl_train.prototxt: network definition file, auto-generated by net.py
- solve.prototxt: solver configuration file, auto-generated by solve.py

pcl_data_layer.py

import caffe
import numpy as np
import random
import os
import matplotlib.pyplot as plt
import sys
from enum import Enum

class pointInfo(Enum):
    row = 0
    col = 1
    height = 2
    range = 3
    mark = 4

class PCLSegDataLayer(caffe.Layer):

    def setup(self, bottom, top):

        params = eval(self.param_str)
        self.npy_dir = params["pcl_dir"]
        self.list_name = list()

        # two tops: data and label
        if len(top) != 2:
            raise Exception("Need to define two tops: data and label.")
        # data layers have no bottoms
        if len(bottom) != 0:
            raise Exception("Do not define a bottom.")

        self.load_file_name( self.npy_dir, self.list_name )
        self.idx = 0

    def reshape(self, bottom, top):
        self.data, self.label = self.load_file( self.idx )
        # reshape tops to fit (leading 1 is for batch dimension)
        top[0].reshape(1, *self.data.shape)
        top[1].reshape(1, *self.label.shape)


    def forward(self, bottom, top):
        # assign output
        top[0].data[...] = self.data
        top[1].data[...] = self.label

        # pick next input
        self.idx += 1
        if self.idx == len(self.list_name):
            self.idx = 0

    def backward(self, top, propagate_down, bottom):
        pass

    def load_file(self, idx):
        print("idx", idx)
        in_file = np.load(self.list_name[idx]) #[row, col, height, range, mark]
        in_file = self.rescale_data(in_file)
        # skip frames in which an entire ring is empty
        if not self.is_data_correct(in_file):
            print("skip one frame.")
            self.idx = (self.idx + 1) % len(self.list_name)
            return self.load_file(self.idx)

        in_file = self.fix_nan_point(in_file)
        in_data = in_file[:,:,0:-2]
        in_label = in_file[:,:,-1]
        return in_data, in_label

    def load_file_name(self, path, list_name):
        for file in os.listdir(path):
            file_path = os.path.join(path, file)
            if os.path.isdir(file_path):
                self.load_file_name(file_path, list_name)  # recurse into subdirectories
            else:
                list_name.append(file_path)

    def rescale_data(self, in_file_data):
        # downsample each ring from ~4500 columns to 360: every 25 input
        # columns yield 2 output columns (12 and 13 input points each)
        rescaled_cloud = np.zeros(shape=(64, 360, 5))
        for i in range(64):
            for j in range(1, 181):
                kernel_data_1 = in_file_data[i, (j-1)*25:(j-1)*25+12, :]
                kernel_data_2 = in_file_data[i, (j-1)*25+12:j*25, :]
                rescaled_cloud[i, (j-1)*2] = self.find_point(kernel_data_1)
                rescaled_cloud[i, (j-1)*2+1] = self.find_point(kernel_data_2)
        return rescaled_cloud

    def find_point(self, kernel_store):
        # average range over the non-empty points of the cell
        tmp_range = 0
        tmp_size = 0
        for k in range(kernel_store.shape[0]):
            if kernel_store[k, -2] != 0:
                tmp_range += kernel_store[k, -2]
                tmp_size += 1
        if tmp_size == 0:
            # empty cell: -1 broadcasts over all 5 channels and is
            # repaired later by fix_nan_point interpolation
            return -1
        tmp_range = tmp_range / tmp_size

        # pick the point whose range is closest to the cell average
        global_min_diff = sys.float_info.max
        point_num = 0
        for k in range(kernel_store.shape[0]):
            tmp_diff = abs(tmp_range - kernel_store[k, -2])
            if tmp_diff < global_min_diff:
                global_min_diff = tmp_diff
                point_num = k

        return kernel_store[point_num]


    def fix_nan_point(self, in_cloud):
        # fix edge nan points first
        in_cloud = self.fix_left_edge_nan_point( in_cloud )
        in_cloud = self.fix_right_edge_nan_point( in_cloud )
        # fix central nan points
        for i in range(in_cloud.shape[0]):
            for j in range(1, in_cloud.shape[1]):
                if in_cloud[i, j, -1] == -1:
                    nan_size = 1
                    left = j - 1
                    right = j + 1
                    while in_cloud[i, left, -1] == -1:
                        left -= 1
                        nan_size += 1
                    while in_cloud[i, right, -1] == -1:
                        right += 1
                        nan_size += 1

                    height_diff_cell = (in_cloud[i, right, 2] - in_cloud[i, left, 2]) / nan_size
                    range_diff_cell = (in_cloud[i, right, 3] - in_cloud[i, left, 3]) / nan_size
                    in_cloud[i, j, 2] = in_cloud[i, left, 2] + (j - left) * height_diff_cell
                    in_cloud[i, j, 3] = in_cloud[i, left, 3] + (j - left) * range_diff_cell
                    if abs(j - left) < abs(right-j):
                        in_cloud[i, j, -1] = in_cloud[i, left, -1]
                    else:
                        in_cloud[i, j, -1] = in_cloud[i, right, -1]
        return in_cloud


    def fix_left_edge_nan_point(self, in_cloud):
        for i in range(in_cloud.shape[0]):
            if in_cloud[i, 0, -1] == -1:
                nan_size = 1
                left = 359
                right = 1
                while in_cloud[i,left,-1] == -1:
                    left -= 1
                    nan_size += 1

                while in_cloud[i,right,-1] == -1:
                    right += 1
                    nan_size += 1

                height_diff_cell = (in_cloud[i, right, 2] - in_cloud[i, left, 2]) / nan_size
                range_diff_cell = (in_cloud[i, right, 3] - in_cloud[i, left, 3]) / nan_size
                in_cloud[i, 0, 2] = in_cloud[i, left, 2] + (360 - left) * height_diff_cell
                in_cloud[i, 0, 3] = in_cloud[i, left, 3] + (360 - left) * range_diff_cell
                if abs(360 - left) < right:
                    in_cloud[i, 0, -1] = in_cloud[i, left, -1]
                else:
                    in_cloud[i, 0, -1] = in_cloud[i, right, -1]
        return in_cloud


    def fix_right_edge_nan_point(self, in_cloud):
        for i in range(in_cloud.shape[0]):
            if in_cloud[i, in_cloud.shape[1]-1, -1] == -1:
                nan_size = 1
                left = in_cloud.shape[1]-2
                right = 0
                while in_cloud[i,left,-1] == -1:
                    left -= 1
                    nan_size += 1
                while in_cloud[i,right,-1] == -1:
                    right += 1
                    nan_size +=1

                height_diff_cell = (in_cloud[i, right, 2] - in_cloud[i, left, 2]) / nan_size
                range_diff_cell = (in_cloud[i, right, 3] - in_cloud[i, left, 3]) / nan_size
                in_cloud[i, in_cloud.shape[1]-1, 2] = in_cloud[i, left, 2] + (in_cloud.shape[1]-1 - left) * height_diff_cell
                in_cloud[i, in_cloud.shape[1]-1, 3] = in_cloud[i, left, 3] + (in_cloud.shape[1]-1 - left) * range_diff_cell
                if abs(in_cloud.shape[1]-1 - left) < right + 1:
                    in_cloud[i, in_cloud.shape[1]-1, -1] = in_cloud[i, left, -1]
                else:
                    in_cloud[i, in_cloud.shape[1]-1, -1] = in_cloud[i, right, -1]
        return in_cloud

    def is_data_correct(self, in_cloud):
        for i in range(in_cloud.shape[0]):
            tmp_size = 0
            for j in range(in_cloud.shape[1]):
                if in_cloud[i, j, -1] == -1:
                    tmp_size += 1
            if tmp_size == in_cloud.shape[1]:
                print("tmp_size", tmp_size)
                return False
        return True

According to the generated data format, the data is split into the feature data fed into the data layer and the label (ground truth) used to compute the loss.

net.py

The CNN code is written following the style of the FCN reference code:

import caffe
from caffe import layers as L, params as P


def conv_relu(bottom, nout, ks=3, stride=1, pad=1):
    conv = L.Convolution(bottom, kernel_size=ks, stride=stride,
        num_output=nout, pad=pad,
        param=[dict(lr_mult=1, decay_mult=1), dict(lr_mult=2, decay_mult=0)])
    return conv, L.ReLU(conv, in_place=True)

def deconv_relu(bottom, nout, ks=3, stride=1):
    deconv = L.Deconvolution(bottom, convolution_param=dict(num_output=nout, kernel_size=ks, stride=stride,
            bias_term=False),  param=[dict(lr_mult=0)])
    return deconv, L.ReLU(deconv, in_place=True)


def cnn():
    n = caffe.NetSpec()
    pydata_params = dict()
    pydata_params['pcl_dir'] = '../velodyne/npy/npy_0.5_grid/'
    pylayer = 'PCLSegDataLayer'
    n.data, n.label = L.Python(module='pcl_data_layer', layer=pylayer,
            ntop=2, param_str=str(pydata_params))

    # base net
    n.conv1_1, n.relu1_1 = conv_relu(n.data, nout=24, ks=11, pad=10)
    n.conv2_1, n.relu2_1 = conv_relu(n.relu1_1, nout=48, ks=5, stride=2, pad=2)
    n.conv3_1, n.relu3_1 = conv_relu(n.relu2_1, nout=48)
    n.deconv4_1, n.relu4_1 = deconv_relu(n.relu3_1, nout=24, ks=5, stride=2)
    n.conv5_1, n.relu5_1 = conv_relu(n.relu4_1, nout=64)
    n.conv6_1, n.relu6_1 = conv_relu(n.relu5_1, nout=2, ks=4)

    n.softmax = L.SoftmaxWithLoss(n.conv6_1, n.label)

    return n.to_proto()


def make_net():
    with open('pcl_train.prototxt', 'w') as f:
        f.write(str(cnn()))


if __name__ == '__main__':
    make_net()

Running net.py generates the pcl_train.prototxt file:

layer {
  name: "data"
  type: "Python"
  top: "data"
  top: "label"
  python_param {
    module: "pcl_data_layer"
    layer: "PCLSegDataLayer"
    param_str: "{\'pcl_dir\': \'../velodyne/npy/npy_0.5_grid/\'}"
  }
}
layer {
  name: "conv1_1"
  type: "Convolution"
  bottom: "data"
  top: "conv1_1"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 24
    pad: 10
    kernel_size: 11
    stride: 1
  }
}
layer {
  name: "relu1_1"
  type: "ReLU"
  bottom: "conv1_1"
  top: "conv1_1"
}
layer {
  name: "conv2_1"
  type: "Convolution"
  bottom: "conv1_1"
  top: "conv2_1"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 48
    pad: 2
    kernel_size: 5
    stride: 2
  }
}
layer {
  name: "relu2_1"
  type: "ReLU"
  bottom: "conv2_1"
  top: "conv2_1"
}
layer {
  name: "conv3_1"
  type: "Convolution"
  bottom: "conv2_1"
  top: "conv3_1"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 48
    pad: 1
    kernel_size: 3
    stride: 1
  }
}
layer {
  name: "relu3_1"
  type: "ReLU"
  bottom: "conv3_1"
  top: "conv3_1"
}
layer {
  name: "deconv4_1"
  type: "Deconvolution"
  bottom: "conv3_1"
  top: "deconv4_1"
  param {
    lr_mult: 0
  }
  convolution_param {
    num_output: 24
    bias_term: false
    kernel_size: 5
    stride: 2
  }
}
layer {
  name: "relu4_1"
  type: "ReLU"
  bottom: "deconv4_1"
  top: "deconv4_1"
}
layer {
  name: "conv5_1"
  type: "Convolution"
  bottom: "deconv4_1"
  top: "conv5_1"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 64
    pad: 1
    kernel_size: 3
    stride: 1
  }
}
layer {
  name: "relu5_1"
  type: "ReLU"
  bottom: "conv5_1"
  top: "conv5_1"
}
layer {
  name: "conv6_1"
  type: "Convolution"
  bottom: "conv5_1"
  top: "conv6_1"
  param {
    lr_mult: 1
    decay_mult: 1
  }
  param {
    lr_mult: 2
    decay_mult: 0
  }
  convolution_param {
    num_output: 2
    pad: 1
    kernel_size: 4
    stride: 1
  }
}
layer {
  name: "relu6_1"
  type: "ReLU"
  bottom: "conv6_1"
  top: "conv6_1"
}
layer {
  name: "softmax"
  type: "SoftmaxWithLoss"
  bottom: "conv6_1"
  bottom: "label"
  top: "softmax"
}

solve.py

The solver workflow:

  1. Design the objective to be optimized, along with the training network used for learning and the test network used for evaluation (executed by loading a separate prototxt configuration file)
  2. Optimize iteratively through forward/backward passes to update the parameters
  3. Periodically evaluate the test network (testing once after a configured number of training iterations)
  4. Display the state of the model and the solver throughout optimization

In each iteration, the solver:

  1. Calls forward to compute the output and the corresponding loss
  2. Calls backward to compute the gradients of every layer
  3. Updates the parameters from the gradients according to the chosen solver method
  4. Records and saves the learning rate, snapshots, and state for the iteration

import caffe
import numpy as np
import os

# init
# caffe.set_device(0)
# caffe.set_mode_gpu()

solver = caffe.SGDSolver('solve.prototxt')

for _ in range(25):
    solver.step(4000)
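The solver prototxt consumed above is not shown in the post; a minimal SGD configuration of the kind caffe.SGDSolver expects might look like this (all values here are illustrative assumptions, not taken from the paper):

```protobuf
net: "pcl_train.prototxt"
base_lr: 0.001
momentum: 0.9
weight_decay: 0.0005
lr_policy: "fixed"
display: 100
snapshot: 4000
snapshot_prefix: "snapshots/pcl"
solver_mode: CPU
```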

That's all.


Reference: CNN for Very Fast Ground Segmentation in Velodyne LiDAR Data.PDF
