Processing Point Clouds

Original blog post: http://ronny.rest/tutorials/module/pointclouds_01

1. Point Cloud Data

Point cloud data can be represented as a numpy array with N rows and at least three columns, where each row corresponds to one point and its (x, y, z) position in 3D space. If the point cloud is captured by a lidar sensor, each point may also carry an extra value such as "reflectance", which measures how much of the laser beam was reflected back by the obstacle it hit. In that case, the point cloud would be represented by an N x 4 array.
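
For example, a KITTI-style lidar scan, stored as a flat binary file of float32 values, can be loaded into such an N x 4 array like this (a minimal sketch; the file path is a placeholder, reused in the complete code further below):

import numpy as np

# Each point is stored as four consecutive float32 values: x, y, z, reflectance
points = np.fromfile('./data/000000.bin', dtype=np.float32).reshape([-1, 4])
print(points.shape)    # (N, 4)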

 

2. Image Coordinates vs. Point Cloud Coordinates

2D image coordinates and point cloud coordinates are two completely different conventions. In the figure above, blue marks the image coordinate system and orange marks the 3D point cloud coordinate system. The points to note about each are summarized below:

2.1 Image Coordinates

(1) In image coordinates, all coordinate values are positive.

(2) The origin of the coordinate system is at the top-left corner of the image.

(3) Coordinate values are integers (each pixel is a single point).

2.2 Point Cloud Coordinates

(1) In point cloud coordinates, values can be positive or negative.

(2) Coordinate values are real numbers.

(3) The positive x axis points forward.

(4) The positive y axis points to the left.

(5) The positive z axis points up.
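
A small sketch of what these conventions mean for a few concrete points in the lidar frame:

import numpy as np

# Lidar frame: x = forward, y = left, z = up; values are real and may be negative
points = np.array([
    [ 5.0,  2.0,  0.0],   # 5 m ahead, 2 m to the left, at sensor height
    [10.0, -3.0, -1.5],   # 10 m ahead, 3 m to the right, 1.5 m below the sensor
    [-2.0,  0.0,  0.3],   # 2 m straight behind, 0.3 m above the sensor
])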

 

3. Creating a Birds-Eye View of the Point Cloud

3.1 The Coordinate Systems Involved

To generate a birds-eye view of the point cloud, we must convert from the point cloud's x-y coordinates to the image's x-y coordinates, as shown in the figure above. Before and after this conversion, keep the following in mind:

(1) The x and y values of the point cloud measure the opposite things to x and y in the image.

(2) The x and y axes point in opposite directions.

(3) The birds-eye view image's origin must start at (0,0), and all pixel coordinates must be non-negative.

3.2 Rectangular Region of Interest

Rather than converting the entire point cloud to a birds-eye view, we only convert the points inside a particular region. We therefore create a filter that passes over the whole point cloud and keeps only the region we are interested in. I will specify the extent of the rectangular region as ranges of values relative to the origin: anything to the left of the origin is negative, and anything to the right is positive.

The code below sets up a region of interest extending 10 metres to either side of the origin and 20 metres in front of it.

side_range=(-10, 10)     # left-most to right-most
fwd_range=(0, 20)       # back-most to forward-most

Next, I create a filter to keep only the points inside the rectangular region we are actually interested in:

# EXTRACT THE POINTS FOR EACH AXIS
x_points = points[:, 0]
y_points = points[:, 1]
z_points = points[:, 2]

# FILTER - To return only indices of points within the desired rectangle
# Two filters: front-to-back and side-to-side ranges
# (heights are handled later by clipping; see section 3.5)
# Note left side is positive y axis in LIDAR coordinates
f_filt = np.logical_and((x_points > fwd_range[0]), (x_points < fwd_range[1]))
s_filt = np.logical_and((y_points > -side_range[1]), (y_points < -side_range[0]))
filter = np.logical_and(f_filt, s_filt)
indices = np.argwhere(filter).flatten()

# KEEPERS
x_points = x_points[indices]
y_points = y_points[indices]
z_points = z_points[indices]
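
If you also wanted to drop points outside a height range during this filtering step, a third filter could be combined in the same way before extracting the keepers (a sketch; height_range is assumed to be a (min, max) tuple like the one defined in section 3.5):

# OPTIONAL third filter on height, e.g. height_range = (-2, 0.5)
h_filt = np.logical_and((z_points > height_range[0]), (z_points < height_range[1]))
filter = np.logical_and(np.logical_and(f_filt, s_filt), h_filt)
indices = np.argwhere(filter).flatten()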

3.3 Mapping Point Positions to Pixel Positions

At this point we have a mass of points whose coordinate values are real numbers, and we need to map them to integers. We could simply cast the x and y values to int, but we would lose a lot of resolution. For example, if the points are measured in metres, each pixel would then represent a 1 x 1 metre cell of the world, and any detail smaller than that would be lost. This might be fine when the point cloud contains large objects such as mountains, but for capturing smaller objects such as people and cars it is not the best choice.

The conversion above can be modified slightly to achieve whatever resolution we want: we simply rescale the data before casting to int. If the unit of measurement is metres and we want a resolution of 5 centimetres, we can do the following:

res = 0.05
# CONVERT TO PIXEL POSITION VALUES - Based on resolution
x_img = (-y_points / res).astype(np.int32)  # x axis is -y in LIDAR
y_img = (-x_points / res).astype(np.int32)  # y axis is -x in LIDAR

Notice that not only have the x and y axes been swapped, the direction of each axis has also been flipped, so from here on we can process the data using ordinary 2D image coordinate techniques. For example, with res = 0.05, a point 10 m ahead and 5 m to the left of the sensor (x = 10, y = 5) maps to x_img = -100, y_img = -200; the shift in the next step moves such values into the non-negative range.

3.4 Shifting to a New Origin

The x and y data still cannot be mapped directly into image coordinates, because some of the values are negative. We therefore shift the data so that the minimum value lands at (0,0). With the ranges defined above, the example point from section 3.3 ends up at pixel (100, 200).

# SHIFT PIXELS TO HAVE MINIMUM BE (0,0)
# floor and ceil used to prevent anything being rounded to below 0 after shift
x_img -= int(np.floor(side_range[0] / res))
y_img += int(np.ceil(fwd_range[1] / res))

3.5 Pixel Values

We have now mapped the point cloud onto (x, y) positions in a 2D image; what remains is deciding what value to fill in at each position. A common choice is the height of the point, with two caveats:

(1) Pixel values should be integers.

(2) Pixel values should lie in the range 0-255.

We could take the minimum and maximum heights present in the data and map that span to 0-255. Alternatively, we can fix a height range of interest, clip the heights to it, and map that fixed range to 0-255 instead. The second method works well because it devotes the full 0-255 range to the detail in the region we care about.
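
The first approach, scaling by the data's own extremes, might look like this (a sketch):

# Alternative: map the data's own min/max heights to 0-255
span = z_points.max() - z_points.min()
pixel_values = (((z_points - z_points.min()) / float(span)) * 255).astype(np.uint8)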

In the code below, we fix a height range of interest from 2 metres below the origin to half a metre above it.

height_range = (-2, 0.5)  # bottom-most to upper-most

# CLIP HEIGHT VALUES - to between min and max heights
pixel_values = np.clip(a = z_points,
                           a_min=height_range[0],
                           a_max=height_range[1])

Now we can map these clipped heights to integers in the range 0-255:

def scale_to_255(a, min, max, dtype=np.uint8):
    """ Scales an array of values from specified min, max range to 0-255
        Optionally specify the data type of the output (default is uint8)
    """
    return (((a - min) / float(max - min)) * 255).astype(dtype)

# RESCALE THE HEIGHT VALUES - to be between the range 0-255
pixel_values = scale_to_255(pixel_values, min=height_range[0], max=height_range[1])

3.6 Creating the Image Array

We are now ready to create the image. We initialize an array whose dimensions are determined by the size of the rectangular region of interest and the resolution, then use the x and y values derived from the point cloud to index into this 2D array, filling in the pixel value for each point. (When several points land on the same pixel, only one of their values is kept.)

# INITIALIZE EMPTY ARRAY - of the dimensions we want
x_max = 1+int((side_range[1] - side_range[0])/res)
y_max = 1+int((fwd_range[1] - fwd_range[0])/res)
im = np.zeros([y_max, x_max], dtype=np.uint8)

# FILL PIXEL VALUES IN IMAGE ARRAY
im[y_img, x_img] = pixel_values

3.7 Visualization

The image data is now stored in a numpy array. To view it as an image, we can do the following:

# CONVERT FROM NUMPY ARRAY TO A PIL IMAGE
from PIL import Image
im2 = Image.fromarray(im)
im2.show()

Humans are not good at telling apart shades of grey, so a spectral colour map can make differences in the data easier to see:

import matplotlib.pyplot as plt
plt.imshow(im, cmap="spectral", vmin=0, vmax=255)
plt.show()

3.8 Complete Code

import numpy as np
from PIL import Image
import matplotlib.pyplot as plt


# ==============================================================================
#                                                                   SCALE_TO_255
# ==============================================================================
def scale_to_255(a, min, max, dtype=np.uint8):
    """ Scales an array of values from specified min, max range to 0-255
        Optionally specify the data type of the output (default is uint8)
    """
    return (((a - min) / float(max - min)) * 255).astype(dtype)


# ==============================================================================
#                                                         POINT_CLOUD_2_BIRDSEYE
# ==============================================================================
def point_cloud_2_birdseye(points,
                           res=0.1,
                           side_range=(-50., 50.),  # left-most to right-most
                           fwd_range=(-50., 50.),  # back-most to forward-most
                           height_range=(-2., 2.),  # bottom-most to upper-most
                           ):
    """ Creates an 2D birds eye view representation of the point cloud data.

    Args:
        points:     (numpy array)
                    N rows of points data
                    Each point should be specified by at least 3 elements x,y,z
        res:        (float)
                    Desired resolution in metres to use. Each output pixel will
                    represent a square region res x res in size.
        side_range: (tuple of two floats)
                    (-left, right) in metres
                    left and right limits of rectangle to look at.
        fwd_range:  (tuple of two floats)
                    (-behind, front) in metres
                    back and front limits of rectangle to look at.
        height_range: (tuple of two floats)
                    (min, max) heights (in metres) relative to the origin.
                    All height values will be clipped to this min and max value,
                    such that anything below min will be truncated to min, and
                    the same for values above max.
    Returns:
        2D numpy array representing an image of the birds eye view.
    """
    # EXTRACT THE POINTS FOR EACH AXIS
    x_points = points[:, 0]
    y_points = points[:, 1]
    z_points = points[:, 2]

    # FILTER - To return only indices of points within the desired rectangle
    # Two filters: front-to-back and side-to-side ranges
    # (heights are clipped to height_range further below)
    # Note left side is positive y axis in LIDAR coordinates
    f_filt = np.logical_and((x_points > fwd_range[0]), (x_points < fwd_range[1]))
    s_filt = np.logical_and((y_points > -side_range[1]), (y_points < -side_range[0]))
    filter = np.logical_and(f_filt, s_filt)
    indices = np.argwhere(filter).flatten()

    # KEEPERS
    x_points = x_points[indices]
    y_points = y_points[indices]
    z_points = z_points[indices]

    # CONVERT TO PIXEL POSITION VALUES - Based on resolution
    x_img = (-y_points / res).astype(np.int32)  # x axis is -y in LIDAR
    y_img = (-x_points / res).astype(np.int32)  # y axis is -x in LIDAR

    # SHIFT PIXELS TO HAVE MINIMUM BE (0,0)
    # floor & ceil used to prevent anything being rounded to below 0 after shift
    x_img -= int(np.floor(side_range[0] / res))
    y_img += int(np.ceil(fwd_range[1] / res))

    # CLIP HEIGHT VALUES - to between min and max heights
    pixel_values = np.clip(a=z_points,
                           a_min=height_range[0],
                           a_max=height_range[1])

    # RESCALE THE HEIGHT VALUES - to be between the range 0-255
    pixel_values = scale_to_255(pixel_values,
                                min=height_range[0],
                                max=height_range[1])

    # INITIALIZE EMPTY ARRAY - of the dimensions we want
    x_max = 1 + int((side_range[1] - side_range[0]) / res)
    y_max = 1 + int((fwd_range[1] - fwd_range[0]) / res)
    im = np.zeros([y_max, x_max], dtype=np.uint8)

    # FILL PIXEL VALUES IN IMAGE ARRAY
    im[y_img, x_img] = pixel_values

    return im


if __name__ == '__main__':
    path_bin = './data/000000.bin'
    points = np.fromfile(path_bin, dtype=np.float32, count=-1).reshape([-1, 4])
    img = point_cloud_2_birdseye(points)
    # im2 = Image.fromarray(img)
    # im2.show()
    # Alternative colormaps: turbo, terrain, tab20
    plt.imshow(img, cmap="tab20", vmin=0, vmax=255)
    plt.show()

 

4. Projecting Point Clouds onto 2D Images

4.1 Front-View Projection

To project the lidar sensor's "front view" onto a 2D image, we project the point coordinates onto a cylindrical surface, which can then be unrolled into a flat image: the azimuth angle arctan2(y, x) becomes the image x coordinate, the elevation angle arctan2(z, sqrt(x^2 + y^2)) becomes the image y coordinate, and each is divided by the sensor's angular resolution. The code below follows the equations from the Li et al. 2016 paper:

# h_res = horizontal resolution of the lidar sensor
# v_res = vertical resolution of the lidar sensor
x_img = np.arctan2(y_lidar, x_lidar)/ h_res 
y_img = np.arctan2(z_lidar, np.sqrt(x_lidar**2 + y_lidar**2))/ v_res

The problem with doing it this way is that it places the image seam directly to the right of the car. It makes more sense to place the seam at the very rear of the car, so that the important regions to the front and sides are not interrupted. Keeping those important regions unbroken makes it easier for a convolutional neural network to recognize whole objects in them. The following code fixes this.

# h_res = horizontal resolution of the lidar sensor
# v_res = vertical resolution of the lidar sensor
x_img = np.arctan2(-y_lidar, x_lidar)/ h_res # seam in the back
y_img = np.arctan2(z_lidar, np.sqrt(x_lidar**2 + y_lidar**2))/ v_res

4.2 Scaling the Coordinates

The variables h_res and v_res depend on the lidar sensor being used. The KITTI dataset was collected with a Velodyne HDL-64E sensor; according to its spec sheet, the important characteristics are:

(1) A vertical field of view of 26.9 degrees at 0.4 degree angular spacing, split into +2 degrees above the sensor and -24.9 degrees below it.

(2) A 360 degree horizontal field of view at a resolution of 0.08-0.35 degrees (depending on the rotation rate).

(3) A selectable rotation rate between 5 and 20 Hz.

The code is updated accordingly:

# Resolution and Field of View of LIDAR sensor
h_res = 0.35         # horizontal resolution, assuming rate of 20Hz is used 
v_res = 0.4          # vertical res
v_fov = (-24.9, 2.0) # Field of view (-ve, +ve) along vertical axis
v_fov_total = -v_fov[0] + v_fov[1] 

# Convert to Radians
v_res_rad = v_res * (np.pi/180)
h_res_rad = h_res * (np.pi/180)

# Project into image coordinates
d_lidar = np.sqrt(x_lidar**2 + y_lidar**2)  # horizontal distance of each point from the sensor
x_img = np.arctan2(-y_lidar, x_lidar)/ h_res_rad
y_img = np.arctan2(z_lidar, d_lidar)/ v_res_rad

However, this leaves roughly half the points on the negative side of the x axis and most of them on the negative side of the y axis. To project into a 2D image, we need the minimum coordinate to be at (0,0), so we shift everything:

# SHIFT COORDINATES TO MAKE 0,0 THE MINIMUM
x_min = -360.0/h_res/2    # Theoretical min x value based on specs of sensor
x_img = x_img - x_min     # Shift
x_max = 360.0/h_res       # Theoretical max x value after shifting

y_min = v_fov[0]/v_res    # theoretical min y value based on specs of sensor
y_img = y_img - y_min     # Shift
y_max = v_fov_total/v_res # Theoretical max y value after shifting
y_max = y_max + 5         # UGLY: Fudge factor because the calculations based on
                          # spec sheet do not seem to match the range of angles
                          # collected by sensor in the data.

4.3 Rasterizing to a 2D Image

Now that the 3D point cloud has been projected into 2D image coordinates, we can plot the points:

pixel_values = -d_lidar # Use depth data to encode the value for each pixel
cmap = "jet"            # Color map to use
dpi = 100               # Image resolution
fig, ax = plt.subplots(figsize=(x_max/dpi, y_max/dpi), dpi=dpi)
ax.scatter(x_img,y_img, s=1, c=pixel_values, linewidths=0, alpha=1, cmap=cmap)
ax.set_facecolor((0, 0, 0)) # Set regions with no points to black
ax.axis('scaled')              # {equal, scaled}
ax.xaxis.set_visible(False)    # Do not draw axis tick marks
ax.yaxis.set_visible(False)    # Do not draw axis tick marks
plt.xlim([0, x_max])   # prevent drawing empty space outside of horizontal FOV
plt.ylim([0, y_max])   # prevent drawing empty space outside of vertical FOV
fig.savefig("/tmp/depth.png", dpi=dpi, bbox_inches='tight', pad_inches=0.0)

4.4 Final Code

import matplotlib.pyplot as plt
import numpy as np


def lidar_to_2d_front_view(points,
                           v_res,
                           h_res,
                           v_fov,
                           val="depth",
                           cmap="jet",
                           saveto=None,
                           y_fudge=0.0
                           ):
    """ Takes points in 3D space from LIDAR data and projects them to a 2D
        "front view" image, and saves that image.

    Args:
        points: (np array)
            The numpy array containing the lidar points.
            The shape should be Nx4
            - Where N is the number of points, and
            - each point is specified by 4 values (x, y, z, reflectance)
        v_res: (float)
            vertical resolution of the lidar sensor used.
        h_res: (float)
            horizontal resolution of the lidar sensor used.
        v_fov: (tuple of two floats)
            (minimum_negative_angle, max_positive_angle)
        val: (str)
            What value to use to encode the points that get plotted.
            One of {"depth", "height", "reflectance"}
        cmap: (str)
            Color map to use to color code the `val` values.
            NOTE: Must be a value accepted by matplotlib's scatter function
            Examples: "jet", "gray"
        saveto: (str or None)
            If a string is provided, it saves the image as this filename.
            If None, then it just shows the image.
        y_fudge: (float)
            A hacky fudge factor to use if the theoretical calculations of
            vertical range do not match the actual data.

            For a Velodyne HDL 64E, set this value to 5.
    """

    # DUMMY PROOFING
    assert len(v_fov) == 2, "v_fov must be list/tuple of length 2"
    assert v_fov[0] <= 0, "first element in v_fov must be 0 or negative"
    assert val in {"depth", "height", "reflectance"}, \
        'val must be one of {"depth", "height", "reflectance"}'

    x_lidar = points[:, 0]
    y_lidar = points[:, 1]
    z_lidar = points[:, 2]
    r_lidar = points[:, 3]  # Reflectance
    # Distance relative to origin, viewed from the top (horizontal distance)
    d_lidar = np.sqrt(x_lidar ** 2 + y_lidar ** 2)
    # Absolute (3D) distance relative to origin:
    # d_lidar = np.sqrt(x_lidar ** 2 + y_lidar ** 2 + z_lidar ** 2)

    v_fov_total = -v_fov[0] + v_fov[1]

    # Convert to Radians
    v_res_rad = v_res * (np.pi / 180)
    h_res_rad = h_res * (np.pi / 180)

    # PROJECT INTO IMAGE COORDINATES
    x_img = np.arctan2(-y_lidar, x_lidar) / h_res_rad
    y_img = np.arctan2(z_lidar, d_lidar) / v_res_rad

    # SHIFT COORDINATES TO MAKE 0,0 THE MINIMUM
    x_min = -360.0 / h_res / 2  # Theoretical min x value based on sensor specs
    x_img -= x_min  # Shift
    x_max = 360.0 / h_res  # Theoretical max x value after shifting

    y_min = v_fov[0] / v_res  # theoretical min y value based on sensor specs
    y_img -= y_min  # Shift
    y_max = v_fov_total / v_res  # Theoretical max y value after shifting

    y_max += y_fudge  # Fudge factor if the calculations based on the
    # spec sheet do not match the range of
    # angles collected by the sensor in the data.

    # WHAT DATA TO USE TO ENCODE THE VALUE FOR EACH PIXEL
    if val == "reflectance":
        pixel_values = r_lidar
    elif val == "height":
        pixel_values = z_lidar
    else:
        pixel_values = -d_lidar

    # PLOT THE IMAGE
    # (note: do not reassign cmap here, or the `cmap` argument would be ignored)
    dpi = 100  # Image resolution
    fig, ax = plt.subplots(figsize=(x_max / dpi, y_max / dpi), dpi=dpi)
    ax.scatter(x_img, y_img, s=1, c=pixel_values, linewidths=0, alpha=1, cmap=cmap)
    ax.set_facecolor((0, 0, 0))  # Set regions with no points to black
    ax.axis('scaled')  # {equal, scaled}
    ax.xaxis.set_visible(False)  # Do not draw axis tick marks
    ax.yaxis.set_visible(False)  # Do not draw axis tick marks
    plt.xlim([0, x_max])  # prevent drawing empty space outside of horizontal FOV
    plt.ylim([0, y_max])  # prevent drawing empty space outside of vertical FOV

    if saveto is not None:
        fig.savefig(saveto, dpi=dpi, bbox_inches='tight', pad_inches=0.0)
    else:
        plt.show()


if __name__ == "__main__":
    lidar = np.fromfile('./data/000000.bin', dtype=np.float32, count=-1).reshape([-1, 4])

    HRES = 0.35  # horizontal resolution (assuming 20Hz setting)
    VRES = 0.4  # vertical res
    VFOV = (-24.9, 2.0)  # Field of view (-ve, +ve) along vertical axis
    Y_FUDGE = 5  # y fudge factor for velodyne HDL 64E

    lidar_to_2d_front_view(lidar, v_res=VRES, h_res=HRES, v_fov=VFOV, val="depth",
                           saveto="./tmp/lidar_depth.png", y_fudge=Y_FUDGE)

    lidar_to_2d_front_view(lidar, v_res=VRES, h_res=HRES, v_fov=VFOV, val="height",
                           saveto="./tmp/lidar_height.png", y_fudge=Y_FUDGE)

    lidar_to_2d_front_view(lidar, v_res=VRES, h_res=HRES, v_fov=VFOV,
                           val="reflectance", saveto="./tmp/lidar_reflectance.png",
                           y_fudge=Y_FUDGE)

 

5. Visualizing 3D Point Clouds with Mayavi

Mayavi is a library for 3D visualization of scientific data, and it works very well for point clouds.

import numpy as np

# ==============================================================================
#                                                                     VIZ_MAYAVI
# ==============================================================================
def viz_mayavi(points, vals="distance"):
    x = points[:, 0]  # x position of point
    y = points[:, 1]  # y position of point
    z = points[:, 2]  # z position of point
    # r = lidar[:, 3]  # reflectance value of point
    d = np.sqrt(x ** 2 + y ** 2)  # Map Distance from sensor

    # Plot using mayavi -Much faster and smoother than matplotlib
    import mayavi.mlab

    if vals == "height":
        col = z
    else:
        col = d

    fig = mayavi.mlab.figure(bgcolor=(0, 0, 0), size=(640, 360))
    mayavi.mlab.points3d(x, y, z,
                         col,          # Values used for Color
                         mode="point",
                         colormap='spectral', # 'bone', 'copper', 'gnuplot'
                         # color=(0, 1, 0),   # Used a fixed (r,g,b) instead
                         figure=fig,
                         )
    mayavi.mlab.show()
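
A usage example, loading the same KITTI scan as in the earlier sections (a sketch):

points = np.fromfile('./data/000000.bin', dtype=np.float32).reshape([-1, 4])
viz_mayavi(points, vals="height")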

 

6. Visualizing 3D Point Clouds with Matplotlib

Matplotlib's advantage is ease of installation: anyone working in machine learning or data science almost certainly has it already. For point clouds, though, it has real drawbacks:

(1) It is slow. Trying to visualize an entire lidar point cloud may well crash your machine.

(2) The rendering quality is poor, and it may not show all of the detail in the point cloud.

Mayavi's drawback is that it used to be troublesome to install (it seems easier nowadays). If you want good point cloud visualization, it is worth installing mayavi.

Here is the matplotlib visualization code:

import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import Axes3D

skip = 100   # plot only every n-th point

fig = plt.figure()
ax = fig.add_subplot(111, projection='3d')
point_range = range(0, points.shape[0], skip) # skip points to prevent crash
ax.scatter(points[point_range, 0],   # x
           points[point_range, 1],   # y
           points[point_range, 2],   # z
           c=points[point_range, 2], # height data for color
           cmap='nipy_spectral',  # "spectral" was removed in newer matplotlib
           marker="x")
ax.axis('scaled')  # {equal, scaled}
plt.show()

 
