Point Cloud Processing: Converting a Point Cloud to a Bird's Eye View

W_Tortoise

Article link: https://blog.csdn.net/learning_tortosie/article/details/88828388

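Preface

I have been job hunting recently and found that there are more openings in LiDAR-related work. My foundation in this area is relatively weak, so I plan to catch up.

As long as you start, it is never too late.

This post is a translation of an English blog post about converting point cloud data into a bird's eye view. Translations already exist on CSDN, but I wanted to work through it myself to deepen my understanding.

Next, I am going to study PCL properly!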

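Point cloud data

Point cloud data can be represented as a numpy array with N rows and at least 3 columns. Each row corresponds to one point, whose position in space is given by at least 3 values (x, y, z).
If the point cloud comes from a LIDAR sensor, each point may carry additional values such as "reflectance", a measure of how much of the laser beam was reflected back by the obstacle at that position. In that case the point cloud may be an N x 4 array.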
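As a minimal sketch of that layout, the array below holds made-up values (the numbers and the names `points`, `xyz` and `reflectance` are illustrative assumptions, not real sensor output) and splits the columns into position and reflectance:

```python
import numpy as np

# Hypothetical point cloud: 4 points, columns are x, y, z, reflectance.
points = np.array([
    [ 5.0,  1.0, -1.5, 0.30],
    [12.0, -3.0, -1.7, 0.05],
    [ 2.5,  0.0,  0.2, 0.80],
    [-4.0,  8.0, -1.6, 0.10],
], dtype=np.float32)

xyz = points[:, :3]          # positions, shape (N, 3)
reflectance = points[:, 3]   # one extra value per point, shape (N,)

print(points.shape, xyz.shape, reflectance.shape)
```

The rest of this post only uses the first three columns, so an N x 3 array works just as well.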

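Image and point cloud coordinates

The coordinate system of a point cloud means something quite different from the coordinate system of an image.
In the figure below, the image coordinate system is shown in blue and the point cloud coordinate system in orange.
[Figure: image coordinate system (blue) and point cloud coordinate system (orange)]
Some important things to note about images:

Coordinate values in an image are never negative
The origin is in the top-left corner
Coordinates are integer values

Things to note about point cloud coordinates:

Coordinate values in a point cloud can be positive or negative
Coordinates can take real-valued numbers
The positive x axis points forward
The positive y axis points left
The positive z axis points up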

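Creating a bird's eye view of point cloud data

Relevant coordinate systems for the bird's eye view

To create the bird's eye view, we use the values along the x and y axes of the point cloud.
[Figure: point cloud x and y axes overlaid on the image axes of the bird's eye view]
However, as the figure above shows, we have to be careful and take the following into account:

The x and y axes are swapped: the point cloud's x axis runs along the image's y axis, and the point cloud's y axis runs along the image's x axis
The x and y axes point in the opposite direction from their image counterparts
The values must be shifted so that (0, 0) is the smallest possible value in the image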

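Setting the region of interest

It is often useful to focus on only a specific region of the point cloud. We therefore want to create a filter that keeps only the points inside the region we are interested in.

Since we are looking at the data from above and want to turn it into an image, we will use directions that are more consistent with the image axes. Below, the ranges are specified relative to the origin. Anything to the left of the origin is treated as negative, and anything to the right as positive. The point cloud's x axis is interpreted as the forward direction, which is the upward direction in the bird's eye view.

The code below sets the rectangle of interest to span 10 metres on either side of the origin and 20 metres in front of it.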

side_range=(-10, 10)     # left-most to right-most
fwd_range=(0, 20)       # back-most to forward-most

Next, create a filter that keeps only the points that actually lie inside the specified rectangle.

import numpy as np

# `points` is the (N, 3+) point cloud array described above
# EXTRACT THE POINTS FOR EACH AXIS
x_points = points[:, 0]
y_points = points[:, 1]
z_points = points[:, 2]

# FILTER - To return only indices of points within desired rectangle
# Two filters for: Front-to-back and side-to-side ranges
# Note left side is positive y axis in LIDAR coordinates
f_filt = np.logical_and((x_points > fwd_range[0]), (x_points < fwd_range[1]))
s_filt = np.logical_and((y_points > -side_range[1]), (y_points < -side_range[0]))
filter = np.logical_and(f_filt, s_filt)
indices = np.argwhere(filter).flatten()

# KEEPERS
x_points = x_points[indices]
y_points = y_points[indices]
z_points = z_points[indices]
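To see the filter at work, here is a small sketch on five made-up points (the coordinates and the names `keep` and `kept` are assumptions for illustration). It indexes with the boolean mask directly, which selects the same points as the `np.argwhere(...).flatten()` indices above:

```python
import numpy as np

side_range = (-10, 10)   # left-most to right-most
fwd_range = (0, 20)      # back-most to forward-most

# Hypothetical points (x forward, y left, z up), in metres.
points = np.array([
    [ 5.0,   2.0, -1.0],   # inside the rectangle
    [25.0,   0.0, -1.0],   # too far ahead
    [-3.0,   1.0, -1.0],   # behind the sensor
    [10.0,  12.0, -1.0],   # too far to the left
    [19.0,  -9.0,  0.3],   # inside the rectangle
])

x, y = points[:, 0], points[:, 1]

# side_range is given in image-style directions (left negative), while the
# LIDAR y axis is positive to the left, hence the negated limits.
keep = ((x > fwd_range[0]) & (x < fwd_range[1]) &
        (y > -side_range[1]) & (y < -side_range[0]))

kept = points[keep]
print(kept)
```

Only the first and last points survive. Note that the comparisons are strict, so points lying exactly on the boundary of the rectangle are dropped.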

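Mapping point positions to pixel positions

At the moment we have a bunch of points with real-valued coordinates. To map them onto an image, these values have to be converted to integer positions. We could simply cast all x and y values to integers, but we might lose a lot of resolution. For example, if the points are measured in metres, each pixel would represent a 1 x 1 metre rectangle of the point cloud, and any detail smaller than that would be lost. If you have a point cloud of something like a mountain landscape, that may be fine. But if you want to capture finer detail and recognize people, cars, or even smaller things, this approach is of no use.

However, the approach above can be modified slightly to get the desired level of resolution. Before casting to integers, we can scale the data first. For example, if the unit of measurement is metres and we want a resolution of 5 centimetres, we can do the following: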

res = 0.05
# CONVERT TO PIXEL POSITION VALUES - Based on resolution
x_img = (-y_points / res).astype(np.int32)  # x axis is -y in LIDAR
y_img = (-x_points / res).astype(np.int32)  # y axis is -x in LIDAR

As you can see, the x and y axes are swapped and their directions reversed so that we can work in image coordinates.
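The conversion can be checked by hand on a few values. The sketch below uses made-up coordinates (an assumption for illustration) and shows that `astype(np.int32)` truncates toward zero rather than rounding:

```python
import numpy as np

res = 0.05  # metres per pixel

# Hypothetical LIDAR coordinates in metres (x forward, y left).
x_points = np.array([0.26, 10.01, 19.99])
y_points = np.array([-0.26, 0.04, 9.99])

x_img = (-y_points / res).astype(np.int32)  # image x is -y in LIDAR
y_img = (-x_points / res).astype(np.int32)  # image y is -x in LIDAR

print(x_img)  # 5.2 -> 5, -0.8 -> 0, -199.8 -> -199
print(y_img)  # -5.2 -> -5, -200.2 -> -200, -399.8 -> -399
```

One side effect of truncating toward zero is that values just below and just above zero (for example -0.8 and 0.8) land in the same pixel, so the pixel column and row through the origin cover twice as much ground as the others. Applying `np.floor` before the cast would avoid this, at the cost of changing the offsets used in the next step.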

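Shifting the origin

The x and y values still include negative numbers, so they cannot be projected onto an image yet. We still need to shift the data so that (0, 0) is the smallest possible position.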

# SHIFT PIXELS TO HAVE MINIMUM BE (0,0)
# floor and ceil used to prevent anything being rounded to below 0 after shift
x_img -= int(np.floor(side_range[0] / res))
y_img += int(np.ceil(fwd_range[1] / res))
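As a quick sanity check of the shift, the sketch below starts from made-up pre-shift pixel positions (an assumption; they correspond to points near the edges and the centre of the rectangle at `res = 0.05`) and confirms that the shifted values fit inside an image of the expected size:

```python
import numpy as np

res = 0.05
side_range = (-10, 10)
fwd_range = (0, 20)

# Hypothetical pixel positions before the shift.
x_img = np.array([-199, 0, 199], dtype=np.int32)
y_img = np.array([-399, -200, -1], dtype=np.int32)

x_img -= int(np.floor(side_range[0] / res))   # subtracting -200 adds 200
y_img += int(np.ceil(fwd_range[1] / res))     # adds 400

# Image size implied by the ranges and the resolution.
width = 1 + int((side_range[1] - side_range[0]) / res)
height = 1 + int((fwd_range[1] - fwd_range[0]) / res)

print(x_img, y_img, width, height)
```

After the shift, the left edge of the rectangle maps to small x values and the far (forward) edge maps to small y values, so the top of the image is the area furthest ahead of the sensor.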

Check that the values are now all non-negative:

>>> x_img.min()
7
>>> x_img.max()
199
>>> y_img.min()
1
>>> y_img.max()
199

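Pixel values

So far we have used the point data to determine the x and y positions in the image. What remains is to decide what values to fill in at those pixel positions. One option is to fill them with the height data.
There are two things to keep in mind:

Pixel values should be integers
Pixel values should be in the range 0-255

We could take the minimum and maximum heights from the data and rescale that range to 0-255. An alternative is to set the range of heights we want to focus on, and set anything above or below that range to the maximum or minimum value respectively. This approach is useful because it gives us the most detail within the region of interest.

The code below sets the range to 2 metres below the origin and half a metre above it.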

height_range = (-2, 0.5)  # bottom-most to upper-most

# CLIP HEIGHT VALUES - to between min and max heights
pixel_values = np.clip(a=z_points,
                       a_min=height_range[0],
                       a_max=height_range[1])
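For illustration, here is what the clipping does to a handful of made-up heights (the values are assumptions):

```python
import numpy as np

height_range = (-2, 0.5)  # bottom-most to upper-most

# Hypothetical heights in metres: two of them fall outside the range.
z_points = np.array([-3.1, -2.0, -0.75, 0.5, 1.8])

clipped = np.clip(z_points, a_min=height_range[0], a_max=height_range[1])
print(clipped)  # values below -2 become -2, values above 0.5 become 0.5
```

Everything inside the range is left untouched, so the full 0-255 scale computed in the next step is spent on the heights we care about.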

Next, rescale these values to lie between 0 and 255 and convert the data type to integer.

def scale_to_255(a, min, max, dtype=np.uint8):
    """ Scales an array of values from specified min, max range to 0-255
        Optionally specify the data type of the output (default is uint8)
    """
    return (((a - min) / float(max - min)) * 255).astype(dtype)

# RESCALE THE HEIGHT VALUES - to be between the range 0-255
pixel_values  = scale_to_255(pixel_values, min=height_range[0], max=height_range[1])
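A short worked example, using made-up heights (an assumption) and the `scale_to_255` function exactly as defined above, shows how the ends and the middle of the height range map to pixel values:

```python
import numpy as np

def scale_to_255(a, min, max, dtype=np.uint8):
    """Scale values from the range [min, max] to 0-255 (same as above)."""
    return (((a - min) / float(max - min)) * 255).astype(dtype)

# Hypothetical heights: bottom of the range, the midpoint, top of the range.
heights = np.array([-2.0, -0.75, 0.5])
pixels = scale_to_255(heights, min=-2, max=0.5)

print(pixels)  # the midpoint gives 127.5, which the cast truncates to 127
```

The function assumes its input already lies within `[min, max]`, which is why the heights are clipped first: a value outside the range would scale to something outside 0-255, and the result of casting that to `uint8` should not be relied on.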

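Creating the image array

We are now ready to create the image. First initialize an array whose dimensions depend on the range of values in our rectangle and on the chosen resolution. Then use the x and y values that were converted to pixel positions as indices into the array, and assign the pixel values at those indices.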

# INITIALIZE EMPTY ARRAY - of the dimensions we want
x_max = 1+int((side_range[1] - side_range[0])/res)
y_max = 1+int((fwd_range[1] - fwd_range[0])/res)
im = np.zeros([y_max, x_max], dtype=np.uint8)

# FILL PIXEL VALUES IN IMAGE ARRAY
im[y_img, x_img] = pixel_values
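When several points fall into the same pixel, the assignment `im[y_img, x_img] = pixel_values` keeps only one of them, and NumPy does not guarantee which one when indices repeat. If the highest point per pixel is wanted, one option is `np.maximum.at`. This is an alternative to the assignment above, not part of the original code, and the tiny arrays are made-up values for illustration:

```python
import numpy as np

im = np.zeros((3, 3), dtype=np.uint8)

# Hypothetical pixel positions; the first two points share pixel (row 1, col 2).
y_img = np.array([1, 1, 0])
x_img = np.array([2, 2, 0])
pixel_values = np.array([200, 90, 40], dtype=np.uint8)

# Unbuffered in-place maximum: each pixel ends up with the largest value
# among all points that map to it.
np.maximum.at(im, (y_img, x_img), pixel_values)

print(im)
```

Here the shared pixel ends up with 200, regardless of the order in which the points appear. `ufunc.at` tends to be slower than plain fancy-index assignment, so this is a trade-off between determinism and speed.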

Visualization

At this point the image is stored as a numpy array. To visualize it, we can convert it to a PIL image.

# CONVERT FROM NUMPY ARRAY TO A PIL IMAGE
from PIL import Image
im2 = Image.fromarray(im)
im2.show()

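[Figure: the bird's eye view rendered as a grayscale image]
Humans are not very good at telling shades of grey apart, so we can use a spectral colormap to make the differences easier to see. This can be done in matplotlib.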

import matplotlib.pyplot as plt
# The "spectral" colormap was removed in Matplotlib 2.2; "nipy_spectral" is its replacement
plt.imshow(im, cmap="nipy_spectral", vmin=0, vmax=255)
plt.show()

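[Figure: the same bird's eye view rendered with a spectral colormap]
In fact, the image produced this way carries exactly the same amount of information as the one drawn by PIL, so a machine learning algorithm can still distinguish the height differences even where we humans cannot see them clearly.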

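Complete code

For convenience, all of the code above is put into a single function that returns the bird's eye view as a numpy array. You can then visualize it with any method you like, or feed the numpy array into a machine learning algorithm.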

import numpy as np


# ==============================================================================
#                                                                   SCALE_TO_255
# ==============================================================================
def scale_to_255(a, min, max, dtype=np.uint8):
    """ Scales an array of values from specified min, max range to 0-255
        Optionally specify the data type of the output (default is uint8)
    """
    return (((a - min) / float(max - min)) * 255).astype(dtype)


# ==============================================================================
#                                                         POINT_CLOUD_2_BIRDSEYE
# ==============================================================================
def point_cloud_2_birdseye(points,
                           res=0.1,
                           side_range=(-10., 10.),  # left-most to right-most
                           fwd_range = (-10., 10.), # back-most to forward-most
                           height_range=(-2., 2.),  # bottom-most to upper-most
                           ):
    """ Creates a 2D birds eye view representation of the point cloud data.

    Args:
        points:     (numpy array)
                    N rows of points data
                    Each point should be specified by at least 3 elements x,y,z
        res:        (float)
                    Desired resolution in metres to use. Each output pixel will
                    represent a square region res x res in size.
        side_range: (tuple of two floats)
                    (-left, right) in metres
                    left and right limits of rectangle to look at.
        fwd_range:  (tuple of two floats)
                    (-behind, front) in metres
                    back and front limits of rectangle to look at.
        height_range: (tuple of two floats)
                    (min, max) heights (in metres) relative to the origin.
                    All height values will be clipped to this min and max value,
                    such that anything below min will be truncated to min, and
                    the same for values above max.
    Returns:
        2D numpy array representing an image of the birds eye view.
    """
    # EXTRACT THE POINTS FOR EACH AXIS
    x_points = points[:, 0]
    y_points = points[:, 1]
    z_points = points[:, 2]

    # FILTER - To return only indices of points within desired rectangle
    # Two filters for: Front-to-back and side-to-side ranges
    # Note left side is positive y axis in LIDAR coordinates
    f_filt = np.logical_and((x_points > fwd_range[0]), (x_points < fwd_range[1]))
    s_filt = np.logical_and((y_points > -side_range[1]), (y_points < -side_range[0]))
    filter = np.logical_and(f_filt, s_filt)
    indices = np.argwhere(filter).flatten()

    # KEEPERS
    x_points = x_points[indices]
    y_points = y_points[indices]
    z_points = z_points[indices]

    # CONVERT TO PIXEL POSITION VALUES - Based on resolution
    x_img = (-y_points / res).astype(np.int32)  # x axis is -y in LIDAR
    y_img = (-x_points / res).astype(np.int32)  # y axis is -x in LIDAR

    # SHIFT PIXELS TO HAVE MINIMUM BE (0,0)
    # floor & ceil used to prevent anything being rounded to below 0 after shift
    x_img -= int(np.floor(side_range[0] / res))
    y_img += int(np.ceil(fwd_range[1] / res))

    # CLIP HEIGHT VALUES - to between min and max heights
    pixel_values = np.clip(a=z_points,
                           a_min=height_range[0],
                           a_max=height_range[1])

    # RESCALE THE HEIGHT VALUES - to be between the range 0-255
    pixel_values = scale_to_255(pixel_values,
                                min=height_range[0],
                                max=height_range[1])

    # INITIALIZE EMPTY ARRAY - of the dimensions we want
    x_max = 1 + int((side_range[1] - side_range[0]) / res)
    y_max = 1 + int((fwd_range[1] - fwd_range[0]) / res)
    im = np.zeros([y_max, x_max], dtype=np.uint8)

    # FILL PIXEL VALUES IN IMAGE ARRAY
    im[y_img, x_img] = pixel_values

    return im
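To call the function, the points first have to be loaded into an N x 3 (or N x 4) array. The post does not say where its data comes from, so the following is only a sketch under an assumption: the scan is stored as a flat binary file of `float32` values with 4 values per point (x, y, z, reflectance), a layout used by some public LIDAR datasets. The helper name `load_points_from_bin` and the file name are hypothetical, and the demo writes made-up data so it runs without a dataset:

```python
import os
import tempfile

import numpy as np


def load_points_from_bin(path, n_cols=4):
    """Read a flat float32 binary file into an (N, n_cols) array."""
    return np.fromfile(path, dtype=np.float32).reshape(-1, n_cols)


# Round-trip demo with random made-up points instead of a real scan.
rng = np.random.default_rng(0)
demo = rng.uniform(-10, 10, size=(100, 4)).astype(np.float32)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "demo_scan.bin")
    demo.tofile(path)
    points = load_points_from_bin(path)

print(points.shape)
# The array can then be passed on, for example:
# im = point_cloud_2_birdseye(points, res=0.1)
```

If your data uses a different layout (for example text files or a different number of columns), only the loading step changes; the function itself just needs x, y and z in the first three columns.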