EGO-Planner原文翻译

小锤锤权你胸口

已于 2023-07-27 19:46:36 修改

阅读量4.1k

点赞数 5

文章标签：无人机算法

于 2023-07-26 14:47:39 首次发布

本文链接：https://blog.csdn.net/weixin_44359306/article/details/131938837

版权

EGO-Planner: An ESDF-Free Gradient-Based Local Planner for Quadrotors

Abstract

Gradient-based planners are widely used for quadrotor local planning, in which a Euclidean Signed Distance Field(ESDF) is crucial for evaluating gradient magnitude and direction. Nevertheless, computing such a field has much redundancy since the trajectory optimization procedure only covers a very limited subspace of the ESDF updating range. In this letter, an ESDF-free gradient-based planning framework is proposed, which significantly reduces computation time. The main improvement is that the collision term in penalty function is formulated by comparing the colliding trajectory with a collision-free guiding path. The resulting obstacle information will be stored only if the trajectory hits new obstacles, making the planner only extract necessary obstacle information. Then, we lengthen the time allocation if dynamical feasibility is violated. An anisotropic curve fitting algorithm is introduced to adjust higher order derivatives of the trajectory while maintaining the original shape. Benchmark comparisons and real-world experiments verify its robustness and high-performance. The source code is released as ros packages.

基于梯度的规划器广泛用于四旋翼局部规划，其中欧几里得符号距离场（ESDF）对于评估梯度大小和方向至关重要。然而，由于轨迹优化过程仅覆盖ESDF更新范围的非常有限的子空间，因此计算这样的场具有很大的冗余。在这封信中，提出了一种基于ESDF的无梯度规划框架，大大减少了计算时间。主要的改进是通过将碰撞轨迹与无碰撞引导路径进行比较来制定惩罚函数中的碰撞项。只有当轨迹碰到新的障碍物时，才会存储由此产生的障碍物信息，这使得规划者只能提取必要的障碍物。然后，如果违反动力学可行性，我们延长时间分配。引入了一种各向异性曲线拟合算法，在保持原始形状的同时调整轨迹的高阶导数。基准比较和真实世界的实验验证了它的稳健性和高性能。源代码以ros包的形式发布。

在这里插入图片描述

I. INTRODUCTION

In recent years, the emergence of quadrotor online planning methods has greatly pushed the boundary of aerial autonomy, making drones fly out of laboratories and appear in numerous real-world applications. Among these methods, gradient-based ones, which smooth a trajectory and utilize the gradient information to improve its clearance, have shown great potential and gain more and more popularity [1].

Traditionally, gradient-based planners rely on a pre-built ESDF map to evaluate the gradient magnitude and direction, and use numerical optimization to generate a local optimal solution. Although the optimization programs enjoy fast convergence,they suffer a lot from constructing the required ESDF before-hand. As the statistics (EWOK [2]'s Table II) states, the ESDF computation takes up to about 70% of total processing time for conducting local planning. Therefore, we can safely claim that, building ESDF has become the bottleneck of gradient-based planners, preventing the method from being applied to resource-limited platforms.

Though ESDF is widely used, few works analyze its necessity.Typically, there are two ways to build an ESDF. As detailed in Section II, methods can be categorized as the incremental global updating [3] ones, and the batch local calculation [4] ones. However, neither of them focuses on the trajectory itself. Consequently, too much computation is spent on calculating ESDF values that make no contribution to the planning. In other words, current ESDF-based methods do not serve the trajectory optimization solely and directly. As shown in Fig. 1, for a general autonomous navigation scenario where the drone is expected to avoid collisions locally, the trajectory covers only a limited space of the ESDF updating range. In practice, although some handcrafted rules can decide a slim ESDF range, they lack theoretical rationality and still induce unnecessary computations.

近年来，四旋翼在线规划方法的出现极大地推动了空中自主的边界，使无人机飞出实验室，出现在众多现实世界的应用中。在这些方法中，基于梯度的方法显示出巨大的潜力，并越来越受欢迎[1]，它平滑轨迹并利用梯度信息来提高其清除率。

传统上，基于梯度的规划者依靠预先构建的ESDF地图来评估梯度大小和方向，并使用数值优化来生成局部最优解。尽管优化程序具有快速收敛性，但由于事先构建所需的ESDF，它们受到了很大的影响。正如统计数据（EWOK[2]的表II）所述，ESDF计算约占进行局部规划总处理时间的70%。因此，我们可以放心地说，建立ESDF已经成为基于梯度的规划者的瓶颈，阻碍了该方法应用于资源有限的平台。

尽管ESDF被广泛使用，但很少有工作分析其必要性。通常，有两种方法可以构建ESDF。如第二节所述，方法可分为增量全局更新[3]方法和批量局部计算[4]方法。然而，他们都没有关注轨迹本身。因此，在计算对规划没有贡献的ESDF值上花费了太多的计算。换言之，当前基于ESDF的方法并不能单独和直接地为轨迹优化服务。如图1所示，对于无人机有望避免局部碰撞的一般自主导航场景，轨迹仅覆盖ESDF更新范围的有限空间。在实践中，尽管一些手工制定的规则可以决定较小的ESDF范围，但它们缺乏理论合理性，仍然会引发不必要的计算。

In this letter, we design an $\bf{E}$ SDF-free $\bf{G}$ radient-based l $\bf{O}$ cal planning framework called $\bf{EGO}$ , and we incorporate careful engineering considerations to make it lightweight and robust. The proposed algorithm is composed of a gradient-based spline optimizer and a post-refinement procedure. Firstly, we optimize the trajectory with smoothness, collision, and dynamical feasibility terms. Unlike traditional approaches that query precomputed ESDF, we model the collision cost by comparing the trajectory inside obstacles with a guiding collision-free path. We then project the forces onto the colliding trajectory and generate estimated gradient to wrap the trajectory out of obstacles. During the optimization, the trajectory will rebound a few times between nearby obstacles and finally terminate in a safe region. In this way, we only calculate the gradient when necessary, and avoid computing ESDF in regions irrelevant to the local trajectory. If the resulted trajectory violates dynamical limits, which is usually caused by unreasonable time allocation, the refinement process is activated. During the refinement, trajectory time is reallocated when the limits are exceeded. With the enlarged time allocation, a new B-spline that fits the previous dynamical infeasible one while balancing the feasibility and fitting accuracy is generated. To improve robustness, the fitting accuracy is modeled anisotropically with different penalties on axial and radial directions.

在本文中，我们设计了一个名为 $\bf{EGO}$ 的基于梯度的免ESDF局部规划框架，并结合了仔细的工程考虑，使其轻量级和健壮。所提出的算法由基于梯度的样条优化器和后细化过程组成。首先，我们用光滑性、碰撞和动力学可行性项来优化轨迹。与查询预先计算的ESDF的传统方法不同，我们通过将障碍物内的轨迹与引导无碰撞路径进行比较来对碰撞成本进行建模。然后，我们将力投影到碰撞轨迹上，并生成估计的梯度，以将轨迹包裹在障碍物之外。在优化过程中，轨迹会在附近的障碍物之间反弹几次，最终在安全区域终止。这样，我们只在必要时计算梯度，避免在与局部轨迹无关的区域计算ESDF。如果所得到的轨迹违反了动态极限，这通常是由不合理的时间分配引起的，则激活细化过程。在细化过程中，当超过限制时，轨迹时间会被重新分配。通过扩大时间分配，生成了一个新的B样条，该B样条在平衡可行性和拟合精度的同时，拟合先前的动态不可行B样条。为了提高鲁棒性，对拟合精度进行了各向异性建模，在轴向和径向上具有不同的惩罚。

To the best knowledge of us, this method is the first to achieve gradient-based local planning without an ESDF. Compared to existing state-of-the-art works, the proposed method generates safe trajectories with comparable smoothness and aggressiveness, but lower computation time of over an order of magnitude by omitting the ESDF maintenance. We perform comprehensive tests in simulation and real-world to validate our method. Contributions of this letter are:

We propose a novel and robust gradient-based quadrotor local planning method, which evaluates and projects gradient information directly from obstacles instead of a pre-built ESDF.
We propose a lightweight yet effective trajectory refinement algorithm, which generates smoother trajectories by formulating the trajectory fitting problem with anisotropic error penalization.
We integrate the proposed method into a fully autonomous quadrotor system, and release our software for the reference of the community.

据我们所知，这种方法是第一种在没有ESDF的情况下实现基于梯度的局部规划的方法。与现有的最先进的工作相比，所提出的方法生成了具有相同平滑性和激进性的安全轨迹，但通过省略ESDF维护，将计算时间降低了一个数量级以上。我们在模拟和现实世界中进行了全面的测试，以验证我们的方法。本文的贡献是：

我们提出了一种新的、稳健的基于梯度的四旋翼局部规划方法，该方法直接从障碍物中评估和投影梯度信息，而不是预先构建的ESDF。
我们提出了一种轻量级但有效的轨迹细化算法，该算法通过公式化具有各向异性误差惩罚的轨迹拟合问题来生成更平滑的轨迹。
我们将所提出的方法集成到一个完全自主的四旋翼系统中，并发布我们的软件供社区参考。

II. RELATED WORK

A. Gradient-Based Motion Planning

Gradient-based motion planning is the mainstream for UAV local trajectory generation, which formulates the problem as unconstrained nonlinear optimization. ESDF is first introduced in robotic motion planning by Ratliff et al.[5]. Utilizing its abundant gradient information, many planning frameworks directly optimize trajectories in the configuration space. Nevertheless, optimizing the trajectory in discrete-time [5], [6] is not suitable for drones, because it is much more sensitive to dynamical constraints. Thereby, [7] proposes a continuous-time polynomial trajectory optimization method for UAV planning. However, the involved integral of the potential function causes a heavy computation burden. Besides, the success rate of this method is around 70%, even with random restarts. For these drawbacks, [2] introduces a B-spline parameterization of the trajectory which takes good advantage of the convex hull property. In [8], the success rate is significantly increased by finding a collision-free initial path as the front-end. Moreover, the performance is further improved when the generation of the initial collision-free path takes into account kinodynamic constraints [9], [10]. Zhou et al. [11] incorporate perception awareness to make the system more robust. Among the above approaches, ESDF plays a vital role in evaluating distance with gradient magnitude and direction to nearby obstacles.

基于梯度的运动规划是无人机局部轨迹生成的主流，它将问题表述为无约束非线性优化。由Ratliff等人[5]首次将ESDF引入机器人运动规划中。许多规划框架利用其丰富的梯度信息，直接优化配置空间中的轨迹。然而，在离散时间[5]、[6]中优化轨迹不适用于无人机，因为它对动力学约束更敏感。因此，[7]提出了一种用于无人机规划的连续时间多项式轨迹优化方法。然而，所涉及的势函数的积分造成了沉重的计算负担。此外，即使随机重新启动，这种方法的成功率也在70%左右。针对这些缺点，[2]引入了轨迹的B样条参数化，该参数化充分利用了凸包特性。在[8]中，通过找到无冲突的初始路径作为前端，成功率显著提高。此外，当初始无碰撞路径的生成考虑到动力学约束[9]，[10]时，性能进一步提高。周等人[11]结合感知，使系统更加健壮。在上述方法中，ESDF在评估与附近障碍物的梯度大小和方向的距离方面发挥着至关重要的作用。

B. Euclidean Signed Distance Field (ESDF)

ESDF has long been used to construct objects from noisy sensor data for over two decades [12], and revive interests in robotics motion planning since [5]. Felzenszwalb et al. [4] propose an envelope algorithm that reduces the time complexity of ESDF construction to O(n) with n denoted as voxel numbers. This algorithm is not suitable for incremental building of ESDF, while dynamic updating of the field is often needed during quadrotor flight. To solve this problem, Oleynikova [13] and Han [3] propose incremental ESDF generation methods, namely Voxblox and FIESTA. Although these methods are highly efficient in dynamic updating cases, the generated ESDF almost always contains redundant information that may not be used in the planning procedure at all. As is shown in Fig. 1, this trajectory only sweeps over a very limited subspace of the whole ESDF updating range. Therefore, it is valuable to design a more intelligent and lightweight method, instead of maintaining the whole field.

二十多年来，ESDF一直被用于从有噪声的传感器数据中构建物体[12]，并自[5]以来重新激发了人们对机器人运动规划的兴趣。Felzenszwalb等人[4]提出了一种包络算法，该算法将ESDF构建的时间复杂度降低到 $O (n)$ ，其中 $n$ 表示为体素数。该算法不适用于ESDF的增量构建，而在四旋翼飞行过程中经常需要对场进行动态更新。为了解决这个问题，Oleynikova[13]和Han[3]提出了增量ESDF生成方法，即Voxblox和FIESTA。尽管这些方法在动态更新情况下是高效的，但生成的ESDF几乎总是包含冗余信息，这些信息可能根本不会在规划过程中使用。如图1，该轨迹仅扫过整个ESDF更新范围的非常有限的子空间。因此，设计一种更智能、更轻量级的方法是有价值的，而不是维护整个领域。

III. COLLISION AVOIDANCE FORCE ESTIMATION

In this letter, the decision variables are control points $\mathbf{Q}$ of a B-spline curve. Each $\mathbf{Q}$ possesses its own environment information independently. Initially, a naive B-spline curve $\mathbf{\Phi}$ satisfying terminal constraints is given, regardless of collision. Then, the optimization procedure starts. For each colliding segment detected in an iteration, a collision-free path $\mathbf{\Gamma}$ is generated. Each control point $\mathbf{Q_i}$ of the colliding segment, after that, will be assigned an anchor point $\mathbf{p}_{ij}$ at the obstacle surface with a corresponding repulsive direction vector $\mathbf{v}_{ij}$ , as shown in Fig. 3(a). Denote by $\in \mathbb{N}_{+}$ the index of control points, and $\in \mathbb{N}$ the index of { $\mathbf{p}, \mathbf{v}$ } pair. Note that each { $\mathbf{p}, \mathbf{v}$ } pair only belongs to one specific control point. For brevity, we omit the subscript $ij$ without causing ambiguity. The detailed { $\mathbf{p}, \mathbf{v}$ } pair generation procedure in this letter is summarized in Algorithm 1 and is illustrated in Fig. 3(b). Then the obstacle distance from $\mathbf{Q}_i$ to the $j^{th}$ obstacle is defined as

在本文中，决策变量是B样条曲线的控制点 $\mathbf{Q}$ 。每个 $\mathbf{Q}$ 都独立地拥有自己的环境信息。首先，给出了一条满足终端约束的天真B样条曲线 $\mathbf{\Phi}$ ，而不考虑碰撞。然后，优化过程开始。对于迭代中检测到的每个碰撞段，将生成一个无碰撞路径 $\mathbf{\Gamma}$ 。之后，碰撞段的每个控制点 $\mathbf{Q_i}$ 将在障碍物表面被分配一个锚点 $\mathbf{p}_{ij}$ ，并具有相应的排斥方向向量 $\mathbf{v}_{ij}$ ，如图3a所示。用 $\in \mathbb{N}_{+}$ 表示控制点的索引，用 $\in \mathbb{N}$ 表示 { $\mathbf{p}, \mathbf{v}$ } 对的索引。请注意，每个{ $\mathbf{p}, \mathbf{v}$ }对只属于一个特定的控制点,为了简洁起见，我们省略了下标 $ij$ 而不会引起歧义。本文中详细的{ $\mathbf{p}, \mathbf{v}$ }对生成过程总结在算法1中，并图解于3b中。然后，从 $\mathbf{Q}_i$ 到第 $j^{th}$ 个障碍物的障碍物距离定义为

$\begin{equation} d_{ij} = (\mathbf{Q}_i - \mathbf{p}_{ij}) \cdot \mathbf{v}_{ij} \end{equation}$

在这里插入图片描述

In order to avoid duplicative { $\mathbf{p}, \mathbf{v}$ } pair generation before the trajectory escapes from the current obstacle during the first several iterations, we adopt a criterion that considers an obstacle which the control point $\mathbf{Q}_i$ lies in as newly discovered, only if the current $\mathbf{Q}_i$ satisfies $d_{ij} > 0$ for all valid $j$ . Besides, this criterion allows only necessary obstacles that contribute to the final trajectory to be taken into optimization. Thus, the operation time is significantly reduced.

To incorporate necessary environmental awareness into the local planner, we need to explicitly construct an objective function that keeps the trajectory away from obstacles. ESDF provides this vital collision information but with the price of a heavy computation burden. In addition, as shown in Fig. 2, ESDF-based planners can easily fall into a local minimum and fail to escape from obstacles, due to the insufficient or even wrong information from ESDF. To avoid such situations, an additional front-end is always needed to provide a collision-free initial trajectory. The above methodology outperforms ESDF in providing the vital information for collision avoidance, since the explicitly designed repulsive force can be fairly effective regarding various missions and environments. Moreover, the proposed method has no requirement for collision-free initialization.

为了避免在前几次迭代期间轨迹逃离当前障碍物之前重复生成{ $\mathbf{p}, \mathbf{v}$ }对，我们采用一个准则，仅当当前 $\mathbf{Q}_i$ 对于所有有效的 $j$ 满足 $d_{ij} > 0$ 时，才将控制点 $\mathbf{Q}_i$ 所处的障碍视为新发现的障碍。此外，该标准只允许对最终轨迹有贡献的必要障碍物被纳入优化。因此，操作时间显著减少。

为了将必要的环境意识纳入局部规划器，我们需要明确构建一个目标函数，使轨迹远离障碍。ESDF提供了这种重要的碰撞信息，但代价是沉重的计算负担。此外，如图2所示，由于ESDF提供的信息不足甚至错误，基于ESDF的规划器很容易陷入局部极小值，无法逃离障碍物。为了避免这种情况，总是需要额外的前端来提供无碰撞的初始轨迹。上述方法在为防撞提供重要信息方面优于ESDF，因为明确设计的排斥力在各种任务和环境中都相当有效。此外，所提出的方法对无冲突初始化没有要求。

IV. GRADIENT-BASED TRAJECTORY OPTIMIZATION

A. Problem Formulation

In this letter, the trajectory is parameterized by a uniform B-spline curve $\Phi$ , which is uniquely determined by its degree $p_b$ , $N_c$ control points ${Q_1, Q_2,\dots, Q_{N_c} }$ , and a knot vector ${t_1, t_2,\dots,t_M}$ , where $\mathbf{Q}_i \in \mathbb{R}^3$ , $t_m \in \mathbb{R}$ and $M = N_c + p_b$ .For simplicity and efficiency of trajectory evaluation, the B-spline used in our method is uniform, which means each knot is separated by the same time interval $\Delta t = t_{m+1} − t_m$ from its predecessor. The problem formulation in this letter is based on the current state-of-the-art quadrotor local planning framework Fast-Planner [14].

本文中，轨迹由一个均匀B样条参数化，它由它的阶数 $p_b$ ， $N_c$ 个控制点 ${Q_1, Q_2,\dots, Q_{N_c} }$ ,和一个节点向量 ${t_1, t_2,\dots,t_M}$ , 定义，其中， $\mathbf{Q}_i \in \mathbb{R}^3$ , $t_m \in \mathbb{R}$ 并且 $M = N_c + p_b$ .为了轨迹评估的简单性和效率，我们的方法中使用的B样条是均匀的，这意味着每个节点都被相同的时间间隔 $\Delta t = t_{m+1} − t_m$ 从其前身分开。本文中的问题构建是基于当前最先进的四旋翼局部规划框架Fast-Planner [14]

B-spline enjoys convex hull property, as shown in Fig. 4. This property indicates that a single span of a B-spline curve is merely controlled by $p_b + 1$ successive control points and lies within the convex hull of these points. For example, a span within $t_i, t_{i+1})$ lies inside the convex hull formed by $\{\mathbf{Q}{i-p_b}, \mathbf{Q}{i−p_b+1},\dots, \mathbf{Q}\}$ . Another property is that the kth derivative of a B-spline is still a B-spline with order $p_{b,k} = pb − k$ . Since $\Delta t$ is identical alone $\mathbf{\Phi}$ , the control points of the velocity $\mathbf{V}_i$ , acceleration $\mathbf{A}_i$ , and jerk $\mathbf{J}_i$ curves are obtained by

如图4所示，B样条具有凸包性，该特性表明，B样条曲线的单跨距仅由
$p_b + 1$ 个连续控制点控制并且位于这些控制点组成的凸包中。例如 $t_i, t_{i+1})$ 时段的B样条分布于由 $\{\mathbf{Q}{i-p_b}, \mathbf{Q}{i−p_b+1},\dots, \mathbf{Q}\}$ 组成的凸包中。另一个性质是B样条的K阶导数仍然是一个 $p_{b,k}$ 阶的B样条， $p_{b,k} = pb − k$ . 由于 $\Delta t$ 在折整个 $\mathbf{\Phi}$ 上都相同, 速度控制点 $\mathbf{V}_i$ , 加速度控制点 $\mathbf{A}_i$ ，加加速度控制点 $\mathbf{J}_i$ 由下式确定。

$\begin{equation} \mathbf{V_i} = \frac{\mathbf{Q}_{i + 1} - \mathbf{Q}_i}{\Delta t}, \mathbf{A_i} = \frac{\mathbf{V}_{i + 1} - \mathbf{V}_i}{\Delta t}, \mathbf{J_i} = \frac{\mathbf{A}_{i + 1} - \mathbf{A}_i}{\Delta t}. \end{equation}$

We follow the work of [15] to plan the control points $\mathbf{Q} \in \mathbb{R}^3$ in a reduced space of differentially flat outputs. The optimization problem is then formulated as follows:

我们遵循[15]的工作，在微分平坦输出的缩减空间中规划控制点 $\mathbf{Q} \in \mathbb{R}^3$ 。然后将优化问题公式化如下：

$\begin{equation} \min_{\mathbf{Q}} J = \lambda_s J_s + \lambda_c J_c + \lambda_d J_d \end{equation}$

where $J_s$ is the smoothness penalty, $J_c$ is for collision, and $J_d$ indicates feasibility. $\lambda_s$ , $\lambda_c$ , $\lambda_d$ are weights for each penalty terms

其中 $J_s$ 是平滑度惩罚， $J_c$ 是碰撞， $J_d$ 表示可行性 $\lambda_s$ 、 $\lambda_c$ 、 $\lambda_d$ 是每个惩罚项的权重

1)Smoothness Penalty: In [2], the smoothness penalty is formulized as the time integral over square derivatives of the trajectory (acceleration, jerk, etc.). In [10], only geometric information of the trajectory is taken regardless of time allocation. In this letter, we combine both methods to penalize squared acceleration and jerk without time integration. Benefiting from the convex hull property, minimizing the control points of second and third order derivatives of the B-spline trajectory is sufficient to reduce these derivatives along the whole curve. Therefore, the smoothness penalty function is formulated as

1)平滑惩罚：在[2]中，平滑惩罚被公式化为轨迹平方导数（加速度、加加速度等）上的时间积分。在[10]中，无论时间分配如何，只获取轨迹的几何信息。本文，我们将这两种方法结合起来，在没有时间积分的情况下惩罚加速度和加加速度的平方。得益于凸包性质，最小化B样条轨迹的二阶和三阶导数的控制点足以减少沿整个曲线的这些导数。因此，平滑度惩罚函数公式化为

$\begin{equation} J_s = \sum^{N_c - 1}_{i = 1} \begin{Vmatrix} \mathbf{A}_i \end{Vmatrix}^2_2 + \sum^{N_c - 2}_{i = 1} \begin{Vmatrix} \mathbf{J}_i \end{Vmatrix}^2_2 \end{equation}$

which minimizes high order derivatives, making the whole trajectory smooth.

这使高阶导数最小化，使整个轨迹平滑。

2)Collision Penalty: Collision penalty pushes control points away from obstacles. This is achieved by adopting a safety clearance $s_f$ and punishing control points with $d_{ij} < s_f$ . In order to further facilitate optimization, we construct a twice continuously differentiable penalty function $j_c$ and suppress its slope as $d_{ij}$ decreases, which yields the piecewise function

2)碰撞惩罚：碰撞惩罚将控制点推离障碍物。这是通过采用安全许可 $s_f$ 和用 $d_{ij}<s_f$ 惩罚控制点来实现的。为了进一步促进优化，我们构造了一个两次连续可微的罚函数 $j_c$ ，并随着 $d_{ij}$ 的减小而抑制其斜率，从而得到分段函数

$\begin{equation} \begin{aligned} j_c(i,j) &=\begin{cases} &0 \quad &(c_{ij} \leq 0)\\ &c_{ij}^3 \quad &(0 < c_{ij} \leq s_f)\\ &3s_f c^2_{ij} - 3s_f^2c_{ij} + s_f^3 \quad &(c_{ij} \geq s_f) \end{cases},\\ c_{ij} &= s_f - d_{ij}, \end{aligned} \end{equation}$

where $j_c(i, j)$ is the cost value produced by $\{\mathbf{p},\mathbf{v}\}_j$ pairs on $\mathbf{Q}_i$ . The cost on each $\mathbf{Q}_i$ is evaluated independently and accumulated from all corresponding $\{\mathbf{p},\mathbf{v}\}_j$ pairs. Thus, a control point obtains a higher trajectory deformation weight if it discovers more obstacles. Specifically, the cost value added to the ith control point is $j_c(Q_i) = \sum^{N_p}_{j=1}j_c(i, j)$ , $N_p$ is the number of $\{\mathbf{p},\mathbf{v}\}_j$ pairs belonging to $\mathbf{Q}_i$ . Combining costs on all $\mathbf{Q}_i$ yields the total cost $J_c$ , i.e.,

其中 $j_c(i, j)$ 是由 $\mathbf{Q}_i$ 的 $\{\mathbf{p},\mathbf{v}\}_j$ 对产生的代价值。每一个 $\mathbf{Q}_i$ 的代价值被单独评估并由相关的所有 $\{\mathbf{p},\mathbf{v}\}_j$ 对累加得来。因此，如果控制点发现更多的障碍物，则控制点获得更高的轨迹变形权重。具体地，加到第i个控制点的代价值是 $j_c(Q_i) = \sum^{N_p}_{j=1}j_c(i, j)$ ， $N_p$ 是属于 $\mathbf{Q}_i$ 的 $\{\mathbf{p},\mathbf{v}\}_j$ 的数量，组合所有 $\mathbf{Q}_i$ 的代价得到总代价值

$\begin{equation} J_c = \sum^{N_c}_{i=1}j_c(\mathbf{Q}_i) \end{equation}$

Unlike traditional ESDF-based methods [2], [10], which compute gradient by trilinear interpolation on the field, we obtain gradient by directly computing the derivative of $J_c$ with respect to $\mathbf{Q}_i$ , which gives

与传统的基于ESDF的方法[2]、[10]不同，传统的方法通过场上的三线性插值来计算梯度，我们通过直接计算 $J_c$ 相对于 $\mathbf{Q}_i$ 的导数来获得梯度，这给出了

$$
\begin{equation}
\frac{\partial J_c}{\partial \mathbf{Q}i} = \sum^{N_c}{i = 1} \sum^{N_p}{j=1} \mathbf{v}{ij} \begin{cases}
&0 \quad &(c_{ij} \leq 0)\
&-3c^2_{ij} \quad &(0 < c_{ij} \leq s_f)\
&-6s_f c_{ij} + 3s_f^2 \quad &(c_{ij} > s_f)
\end{cases}
\end{equation}

3)Feasibility Penalty: Feasibility is ensured by restricting the higher order derivatives of the trajectory on every single dimension, i.e., applying $\left|\mathbf{\Phi}^{(k)}_r(t)\right| < \mathbf{\Phi}^{(k)}_{r,max}$ for all $t$ , where $\in {x, y, z}$ indicates each dimension. Thanks to the convex hull property, constraining derivatives of the control points is sufficient for constraining the whole B-spline. Therefore, the penalty function is formulated as

3)可行性约束:可行性通过约束轨迹每一单独维度的高阶导数保证。比如对每一个时间 $t$ 应用 $\left|\mathbf{\Phi}^{(k)}_r(t)\right| < \mathbf{\Phi}^{(k)}_{r,max}$ ，其中 $\in {x, y, z}$ 代表每一个维度。得益于凸包性，约束控制点的导数对于约束整个B样条线是足够的。因此，罚函数被公式化为

$\begin{equation} J_d = \sum^{N_c}_{i=1} \mathcal{w}_v F(\mathbf{V}_i) + \sum^{N_c - 1}_{i=1} w_a F(\mathbf{A}_i) +\sum^{N_c - 2}_{i=1} \mathcal{w}_j F(\mathbf{J}_i) \end{equation}$

where $w_v$ , $w_a$ , $w_j$ are weights for each terms and $F (\cdot)$ is a
twice continuously differentiable metric function of higher order
derivatives of control points.

$\begin{align} F(\mathbf{C}) &= \sum_{r=x,y,z} f(c_r),\\ f(c_r) &= \begin{cases} a_1 c^2_r +b_1 c_r + c_1 \quad& (c_r \leq -c_j)\\ (-\lambda c_m - c_r)^3 \quad &(-c_j < c_r < -\lambda c_m)\\ 0 \quad &(-\lambda c_m \leq c_r \leq \lambda c_m)\\ (-\lambda c_m - c_r)^3 \quad& (\lambda c_m < c_r < c_j)\\ a_2 c^2_r +b_2 c_r + c_2 \quad& (c_r \geq c_j) \end{cases} \end{align}$

where $c_r \in \mathbf{C} \in \{\mathbf{V}i, \mathbf{A}_i, \mathbf{J}_i\}$ , $a_1, b_1, c_1, a_2, b_2, c_2$ are chosen to meet the second-order continuity, $c_m$ is the derivative limit, $c_j$ is the splitting points of the quadratic interval and the cubic interval.$ \lambda < 1 − \epsilon$ is an elastic coefficient with $\epsilon \ll 1$ to make the final results meet the constraints, since the cost function is a tradeoff of all weighted terms.

其中， $c_r \in \mathbf{C} \in \{\mathbf{V}i, \mathbf{A}_i, \mathbf{J}_i\}$ ， $a_1, b_1, c_1, a_2, b_2, c_2$ 被选定用来满足二阶连续条件。 $c_m$ 是导数的限制， $c_j$ 是二次段和三次段的分界。 $\lambda < 1 − \epsilon$ 是一个弹性系数满足 $\epsilon \ll 1$ 以保证最后的结果满足约束，因为代价函数是所有加权项的折衷。

在这里插入图片描述

B. Numerical Optimization

The formulated problem in this letter features in two aspects. Firstly, the objective function $J$ alters adaptively according to the newly found obstacles. It requires the solver to be able to restart fast. Secondly, quadratic terms dominate the formulation of the objective function, making $J$ approximate quadratic. It means that the utilization of Hessian information can significantly accelerate the convergence. However, obtaining the exact inverse Hessian is prohibitive in real-time applications since it consumes nonnegligible massive computation. To circumvent this, quasi-Newton methods that approximate the inverse Hessian from gradient information are adopted.

本文中的公式化问题有两个方面。首先，目标函数 $J$ 根据新发现的障碍自适应地改变。它要求解算器能够快速重新启动。其次，二次项主导了目标函数的公式，使 $J$ 近似为二次项。这意味着Hessian矩阵信息的利用可以显著加速收敛。然而，在实时应用中，获得精确的逆Hessian是令人望而却步的，因为它消耗了不可忽略的大量计算。为了避免这种情况，采用了从梯度信息近似逆Hessian的拟牛顿方法。

Since the performance of a solver is problem dependent, we compare three algorithms belonging to quasi-Newton methods. They are Barzilai-Borwein method [16] which is capable of fast restart with most crude Hessian estimation, truncated Newton method [17] which estimates Hessian by adding multiple tiny perturbations to a given state, L-BFGS method [18] which approximates Hessian from previous objective function evaluations but requires a serial of iterations to reach a relatively accurate estimation. Comparison in Section VI-B states that L-BFGS outperforms the other two algorithms with appropriately selected memory size, balancing the loss of restart and the accuracy of inverse Hessian estimation. This algorithm is briefly explained as follows. For an unconstrained optimization problem $\min_x\in \mathbb{R}_n f(x)$ , the updating for $x$ follows the approximated Newton step

由于求解器的性能取决于问题，我们比较了属于拟牛顿方法的三种算法。它们是Barzilai-Borwein方法[16]，它能够用最粗糙的Hessian估计快速重新启动；截断牛顿方法[17]，它通过向给定状态添加多个微小扰动来估计Hessian；L-BFGS方法[18]，它从以前的目标函数评估中近似Hessian，但需要一系列迭代才能达到相对准确的估计。第VI-B节中的比较表明，L-BFGS在适当选择内存大小的情况下优于其他两种算法，平衡了重新启动的损失和Hessian逆估计的准确性。该算法的简要说明如下。对于无约束优化问题 $\min_x\in \mathbb{R}_n f(x)$ ， $x$ 的更新遵循近似牛顿步

$\begin{equation} \mathbf{x}_{k + 1} = \mathbf{x}_k - \alpha_k \mathbf{H}_k \nabla \mathbf{f}_k, \end{equation}$

where $\alpha_k$ is the step length and $\mathbf{H}_k$ is updated at every iteration by means of the formula

其中 $\alpha_k$ 是步长， $\mathbf{H}_k$ 每轮迭代依照下式更新

$\begin{equation} \mathbf{H}_{k + 1} = \mathbf{V}^T_k \mathbf{H}_k \mathbf{V}_k + \rho_k \mathbf{s}_k \mathbf{s}_k \end{equation}$

where $\rho_k = (\mathbf{y}^T_k \mathbf{s}_k)^{−1}$ , $\mathbf{V}_k = \mathbf{I} − \rho_k \mathbf{y}_k \mathbf{s}^T_k$ , $\mathbf{s}_k = \mathbf{x}_{k+1} − \mathbf{x}_k$ and $\mathbf{y}_k = \nabla \mathbf{f}_{k+1} −\nabla \mathbf{f}_k$ .

Here $\mathbf{H}_k$ is not calculated explicitly. The algorithm right multiplies $\nabla \mathbf{f}_k$ to Equ. (12) and recursively expands for $m$ steps and then yields the efficient two-loop recursion updating method [16], resulting in linear time/space complexity. The weight of Barzilai-Borwein step is used as the initial inverse Hessian $\mathbf{H}^0_k$ for L-BFGS updating, which is

这里 $\mathbf{H}_k$ 不是显式计算的。算法将 $\nabla \mathbf{f}_k$ 右乘Equ. (12)并递归扩展 $m$ 步，然后产生有效的双循环递归更新方法[16]，导致线性时间/空间复杂性。Barzilai-Borwein步骤的权重被用作L-BFGS更新的初始逆Hessian $\mathbf{H}^0_k$ ，其为

$\begin{equation} \mathbf{H}^0_k = \frac{\mathbf{s}^T_{k - 1} \mathbf{y}_{k - 1}}{\mathbf{y}^T_{k - 1} \mathbf{y}_{k - 1}} \mathbf{I} \quad \text{or}\quad \frac{\mathbf{s}^T_{k - 1} \mathbf{y}_{k - 1}}{\mathbf{s}^T_{k - 1} \mathbf{y}_{k - 1}} \mathbf{I} \end{equation}$

A monotone line search under strong Wolfe condition is used to
enforce convergence

利用强Wolfe条件下的单调线性搜索强制收敛

V. TIME RE-ALLOCATION AND TRAJECTORY REFINEMENT

Allocating an accurate time profile before the optimization is unreasonable, since the planner knows no information about the final trajectory then. Therefore, an additional time re-allocation procedure is vital to ensure dynamical feasibility. Previous works [10], [19] parameterize the trajectory as a non-uniform B-spline and iteratively lengthen a subset of knot spans when some segments exceed derivative limits.

However, one knot span $\Delta t_n$ influences multiple control
points and vice versa, leading to high-order discontinuity to
the previous trajectory when adjusting knot spans near the
start state. In this section, a uniform B-spline trajectory $\mathbf{\Phi}_f$ is
re-generated with reasonable time re-allocation according to the
safe trajectory $\mathbf{\Phi}_s$ from IV. Then, an anisotropic curve fitting
method is proposed to make $\mathbf{\Phi}_f$ freely optimize its control points
to meet higher order derivative constraints while maintaining a
nearly identical shape to $\mathbf{\Phi}_s$ .

在优化之前分配准确的时间剖面是不合理的，因为规划器当时不知道关于最终轨迹的信息。因此，额外的时间重新分配程序对于确保动态可行性至关重要。先前的工作[10]，[19]将轨迹参数化为非均匀B样条线，并在某些线段超过导数极限时迭代延长节点跨度的子集。

但是，一个节点跨度 $\Delta t_n$ 会影响多个控制点，反之亦然，导致调整结跨度时的前一轨迹靠近启动状态。在本节中，根据IV的安全轨迹 $\mathbf{\Phi}_s$ 重新生成了合理时间重分配均匀B样条轨迹 $\mathbf{\Phi}_f$ 。然后提出了一种各向异性的适配方法使得 $\mathbf{\Phi}_f$ 自由地优化它的控制点以满足高阶约束同时保持和 $\mathbf{\Phi}_s$ 几乎相同的形状。

Firstly, as Fast-Planner [14] does, we compute the limits exceeding ratio,

首先，正如Fast Planner[14]所做的那样，我们计算超出比率的极限，

$\begin{equation} r_e = \max\{\left|\mathbf{V}_{i,r}/v_m\right|,\sqrt{\left|\mathbf{A}_{j,r}/a_m\right|}, \sqrt[3]{\left|\mathbf{J}_{k,r}/j_m\right|},1\}, \end{equation}$

where $\in \{1,\dots,N_c − 1\}$ , $\in \{1,\dots,N_c − 2\}$ , $\in \{1,\dots,N_c − 3\}$ and $\in \{x, y, z\}$ axis. A notion with subscript $m$ represents the limitation of a derivative. $r_e$ indicates how much we should lengthen the time allocation for $\mathbf{\Phi}_f$ relative to $\mathbf{\Phi}_s$ . Note that $\mathbf{V}_i$ , $\mathbf{A}_j$ and $\mathbf{J}_k$ are inversely proportional to $ \Delta t$, the square of $\Delta t$ and the cubic of $\Delta t$ , respectively, from Equ.2. Then we obtain the new time span of $\mathbf{\Phi}_f$

其中 $\in \{1,\dots,N_c − 1\}$ , $\in \{1,\dots,N_c − 2\}$ , $\in \{1,\dots,N_c − 3\}$ 并且 $\in \{x, y, z\}$ 轴. 下标为 $m$ 的概念表示导数的限制。 $r_e$ 表示相对于 $\mathbf{\Phi}_s$ ，我们应该延长 $\mathbf{\Phi}_f$ 的时间分配多少。注意 $\mathbf{V}_i$ , $\mathbf{A}_j$ 和 $\mathbf{J}_k$ 分别和 $ \Delta t$, $\Delta t$ 的平方和 $\Delta t$ 的三次方成反比, 而后我们得到了 $\mathbf{\Phi}_f$ 新的时间跨度

$\begin{equation} \Delta t' = r_e \Delta t \end{equation}$

$\mathbf{\Phi}_f$ of time span $\Delta t'$ is initially generated under boundary constraints while maintaining the identical shape and control points number to $\mathbf{\Phi}_s$ , by solving a closed-form min-least square problem. The smoothness and feasibility are then refined by optimization. The penalty function $J^{'}$ formulated by linear combinations of smoothness (Section IV-A1), feasibility (Section IV-A3) and curve fitting (introduced later) is

时间跨度为 $\Delta t'$ 的 $\mathbf{\Phi}_f$ 在边界约束下被初始化并保持和 $\mathbf{\Phi}_s$ 相同的形状和控制点,通过求解一个闭式最小二乘问题。然后通过优化来细化平滑度和可行性。惩罚函数 $J^{'}$ 由光滑性 (Section IV-A1)，动力学可行性(Section IV-A3)和曲线拟合(后续介绍)的线性组合得到

$\begin{equation} \min_{\mathbf{Q}} J' = \lambda_s J_s +\lambda_d J_d + \lambda_f J_f \end{equation}$

where $\lambda_f$ is the weight of fitness term.

其中 $\lambda_f$ 是拟合项的权重。

The fitting penalty function $J_f$ is formulated as the integral of anisotropic displacements from points $\mathbf{\Phi}_f(αT')$ to the corresponding $\mathbf{\Phi}_s(αT)$ , where $T$ and $T^{'}$ are the trajectory duration of $\mathbf{\Phi}_s$ and $\mathbf{\Phi}_f$ , $\alpha \in [0, 1]$ . Since the fitted curve $\mathbf{\Phi}_s$ is already collision-free, we assign the axial displacement of two curves with low penalty weight to relax smoothness adjustment restriction, and radial displacement with high penalty weight to avoid collision. To achieve this, we use the spheroidal metric, shown in Fig. 5, such that displacements at the same spheroid surface produce identical penalties. The spheroid we use for $\mathbf{\Phi}_f(αT')$ is obtained by rotating an ellipse centering at $\mathbf{\Phi}_s(αT)$ about one of its principal axes, the tangent line $\dot{\mathbf{\Phi}}_s(αT)$ . So the axial displacement $d_a$ and radial displacement $d_r$ can be calculated by

拟合惩罚函数 $J_f$ 被公式化为从点 $\mathbf{\Phi}_f(αT')$ 到相应的 $\mathbf{\Phi}_s(αT)$ 的各向异性位移的积分，其中 $T$ 和 $T^{'}$ 是 $\mathbf{\Phi}_s$ and $\mathbf{\Phi}_f$ 的轨迹持续时间， $\alpha \in [0, 1]$ 。由于拟合曲线 $\mathbf{\Phi}_s$ 已经是无碰撞的，我们指定两条曲线的轴向位移具有低惩罚权重来放松平滑度调整限制，并指定具有高惩罚权重的径向位移来避免碰撞。为了实现这一点，我们使用了球面度量，如图5所示，这样在同一球面上的位移会产生相同的惩罚。我们用于 $\mathbf{\Phi}_f(αT')$ 的球体是通过绕其主轴之一，切线 $\dot{\mathbf{\Phi}}_s(αT)$ 旋转一个以 $\mathbf{\Phi}_s(αT)$ 为中心的椭圆而获得的。因此，轴向位移 $d_a$ 和径向位移 $d_r$ 可以通过下式确定

$$
\begin{equation}
\begin{aligned}
d_a &= (\mathbf{\Phi}_f - \mathbf{\Phi}_s) \cdot \frac{\dot{\mathbf{\Phi}}_s}{\left|\left| \dot{\mathbf{\Phi}}_s \right| \right|},\
d_r &= \begin{Vmatrix}
(\mathbf{\Phi}_f - \mathbf{\Phi}_s) \times \frac{\dot{\mathbf{\Phi}}_s}{\left|\left| \dot{\mathbf{\Phi}}_s \right| \right|}
\end{Vmatrix}

\end{aligned}

\end{equation}
$$

The fitness penalty function is

拟合项惩罚函数为

$\begin{equation} J_f = \int^1_0 \left[\frac{d_a(\alpha T')^2}{a^2} + \frac{d_r(\alpha T')^2}{b^2}\right]d\alpha, \end{equation}$

where $a$ and $b$ are semi-major and semi-minor axis of the ellipse, respectively. The problem is solved by L-BFGS.

其中 $a$ 和 $b$ 是椭圆的半长轴和半短轴，问题通过L-BFGS求解
在这里插入图片描述

VI. EXPERIMENT RESULTS

A. Implementation Details

在这里插入图片描述

The planning framework is summarized in Algorithm 2. We set the B-spline order as $p_b = 3$ . The number of control points $N_c$ alters around 25, which is determined by the planning horizon (about 7 m) and the initial distance interval (about 0.3 m) of adjacent points. These are empirical parameters that balance the complexity of the problem with degrees of freedom. The time complexity is $O(N_c)$ , since one control point only affects nearby segments according to the local support property of B-spline. The complexity of L-BFGS is also linear on the same relative tolerance. For collision-free path searching, we adopt A*, which has a good advantage that the path $\Gamma$ always tends to be close to the obstacle surface naturally. Therefore, we can directly selectp at $\Gamma$ without obstacle surface searching. For vector $\mathbf{R}_i$ defined in Fig. 3(b), it can be deduced by the property of uniform B-spline parameterization, that the $\mathbf{R}_i$ satisfies

规划框架在算法2中进行了总结。我们将B样条阶设置为 $p_b = 3$ 。控制点 $N_c$ 的数量在25左右改变，这由规划距离（约7m）和相邻点的初始距离间隔（约0.3m）决定。这些是经验参数，可以平衡问题的复杂性和自由度。时间复杂度为 $O(N_c)$ ，因为根据B样条的局部支持特性，一个控制点只影响附近的线段。L-BFGS的复杂性在相同的相对容差上也是线性的。对于无碰撞路径搜索，我们采用A*，这具有一个很好的优点，即路径 $\Gamma$ 总是倾向于自然地靠近障碍物表面。因此，我们可以在 $\Gamma$ 处直接选择p，而无需进行障碍物表面搜索。对于图3（b）中定义的向量 $\mathbf{R}_i$ ，可以通过均匀b样条参数化的性质推导出， $\mathbf{R}_i$ 满足

$\begin{equation} \mathbf{R}_i = \frac{\mathbf{Q}_{i + 1} - \mathbf{Q}_{i - 1}}{2\Delta t} \end{equation}$

which can be efficiently computed. Equ.18 is discretized to a
finite number of points $\mathbf{\Phi}_f(k\Delta t')$ and $\mathbf{\Phi}_s(k\Delta t)$ , where $\in \mathbb{N}, 0 \leq k \leq \lfloor T /\Delta t \rfloor$ . To further enforce safety, a collision check of a circular pipe with a fixed radius around the final trajectory is performed to provide enough obstacle clearance. The optimizer stops when no collision is detected. Real-world experiments are presented on the same flight platform of [19] with depth acquired by Intel RealSense D435.2 Furthermore, we modify the ROS driver of Intel RealSense to enable the laser emitter strobe every other frame. This allows the device to output high quality depth images with the help of the emitter, and along with binocular images free from laser interference. The modified driver is open-sourced as well.

其可以被有效地计算。Equ.18离散为有限个点 $\mathbf{\Phi}_f(k\Delta t')$ 和 $\mathbf{\Phi}_s(k\Delta t)$ ，其中 $\in \mathbb{N}, 0 \leq k \leq \lfloor T /\Delta t \rfloor$ 。为了进一步加强安全性，对最终轨迹周围具有固定半径的圆形管道进行碰撞检查，以提供足够的障碍物间隙。当未检测到冲突时，优化器将停止。真实世界的实验是在[19]的同一飞行平台上进行的，深度由Intel RealSense D435.2获取。此外，我们修改了Intel RealSense的ROS驱动程序，使激光发射器每隔一帧进行一次频闪。这允许该设备在发射器的帮助下输出高质量的深度图像，以及不受激光干扰的双目图像。修改后的驱动程序也是开源的。

B. Optimization Algorithms Comparison

In this section, three different optimization algorithms, including Barzilai-Borwein (BB) method, limited-memory BFGS (L-BFGS) and truncated Newton (T-NEWTON) method [17], are discussed. Specifically, each algorithm runs for 100 times independently in random maps. All relevant parameters including boundary constraints, time allocation, decision variables initialization, and random seeds, are set identical for different algorithms. The data about success rate, computation time and numbers of objective function evaluations are recorded. Only the successful cases are counted due to the data in failed cases is meaningless. The associated results are shown in Table I, which states that L-BFGS significantly outperforms the other two algorithms. L-BFGS characterizes a type of approximation by means of second order Taylor expansions, which is suitable for optimizing the objective function described in Section IV-B. Truncated Newton method approximates the second order optimization direction $\mathbf{H}^{−1}\nabla \mathbf{f}_k$ as well. However, too many objective function evaluations increase the optimization time. BB-method estimates the Hessian as a scalar $\lambda$ times $\mathbf{I}$ . Nevertheless, the insufficient estimation of Hessian still leads to a low convergence rate.

在本节中，讨论了三种不同的优化算法，包括Barzilai-Borwein（BB）方法、有限内存BFGS（L-BFGS）和截断牛顿（T-Newton）方法[17]。具体来说，每个算法在随机映射中独立运行100次。对于不同的算法，包括边界约束、时间分配、决策变量初始化和随机种子在内的所有相关参数都设置为相同。记录有关成功率、计算时间和目标函数评估次数的数据。只统计成功案例，因为失败案例中的数据是没有意义的。相关结果如表I所示，表明L-BFGS显著优于其他两种算法。L-BFGS通过二阶泰勒展开刻画了一种近似类型，适用于优化第IV-B节中描述的目标函数。截断牛顿法也近似于二阶优化方向 $\mathbf{H}^{−1}\nabla \mathbf{f}_k$ 。然而，过多的目标函数评估会增加优化时间。BB方法将Hessian估计为标量 $\lambda$ 乘以 $\mathbf{I}$ 。然而，Hessian的估计不足仍然导致低收敛率。

在这里插入图片描述

C. Trajectory Generation With & Without ESDF

We use the same setting as Section VI-B to perform this comparison. On account of the low success rate explained in [14] when using straight line initialization for an ESDF-based trajectory generator, we adopt a collision-free initialization. Comparison results are in Table II.

For clarity, ESDF-based methods with and without collision-free initialization are abbreviated as EI and ENI. This comparison gives that the proposed EGO algorithm achieves a comparable success rate to ESDF-based methods with collision-free initialization. However, trajectory energy (jerk integral) produced by EGO is slightly higher. This happens because the control points of EGO which contain more than one ${\mathbf{p}, \mathbf{v}}$ pair produce stronger trajectory deformation force than EI does, as described in Section IV-A2. On the other hand, stronger force accelerates the convergence procedure, resulting in shorter optimization time. Some statistics of ENI (shown in gray) can be less convincing because ENI tests can only succeed in fewer challenge cases where the resulting trajectories are naturally smoother with less energy cost and lower velocity, compared to EI and EGO. Something noteworthy is that although the ESDF updating size is reduced to $10 \times 4 \times 2 m^3$ with $0.1 m$ resolution for a $9 m$ trajectory, the ESDF updating still takes up a majority of the computation time.

我们使用与第VI-B节相同的设置来进行此比较。由于[14]中解释的对基于ESDF的轨迹生成器使用直线初始化时成功率较低，我们采用了无碰撞初始化。比较结果见表二。
为了清楚起见，带有和不带有无冲突初始化的基于ESDF的方法缩写为EI和ENI。这种比较表明，所提出的EGO算法实现了与无冲突初始化的基于ESDF的方法相当的成功率。然而，EGO产生的轨迹能量（急动积分）略高。如第IV-A2节所述，发生这种情况是因为包含一个以上 ${\mathbf{p}, \mathbf{v}}$ 对的EGO的控制点产生比EI更强的轨迹变形力。另一方面，更强的力加速了收敛过程，从而缩短了优化时间。ENI的一些统计数据（以灰色显示）可能不太令人信服，因为与EI和EGO相比，ENI测试只能在挑战较少的情况下成功，在这种情况下，产生的轨迹自然更平滑，能量成本更低，速度更低。值得注意的是，尽管对于 $9 m$ 的轨迹，ESDF更新大小减少到了 $10 \times 4 \times 2 m^3$ ，分辨率为 $0.1 m$ ，但ESDF更新仍然占用了大部分计算时间。

D. Multiple Planners Comparison

We compare the proposed planner with two state-of-the-art methods, Fast-Planner [14] and EWOK [2], which utilize ESDF to evaluate obstacle distance and gradient. Each planner runs for ten times of different obstacle densities from the same starts to ends. The average performance statistics and the ESDF computation time are shown in Table III and Fig. 7. Trajectories generated by three methods on a map of 0.5 obstacles/m2 are illustrated in Fig. 8.

From Table III we conclude that the proposed method achieves shorter flight time and trajectory length but ends up in higher energy cost compared to Fast-Planner. This is mainly caused by the front-end kinodynamic path searching in [14]. EWOK suffers twisty trajectories in dense environments, since the objective function contains exponential terms, which leads to unstable convergence in optimization. Furthermore, we conclude that a lot of computation time without ESDF updating is saved by the proposed method.

我们将所提出的规划器与两种最先进的方法Fast planner[14]和EWOK[2]进行了比较，这两种方法利用ESDF来评估障碍物距离和梯度。每个规划器从相同的起点到终点，在不同的障碍物密度下跑十次。平均性能统计数据和ESDF计算时间如表III和图7所示。在 $0.5$ 个障碍物 $/ m 2$ 的地图上，通过三种方法生成的轨迹如图8所示。

从表III中我们得出结论，与Fast Planner相比，所提出的方法实现了更短的飞行时间和轨迹长度，但最终导致更高的能量消耗。这主要是由[14]中的前端动力学路径搜索引起的。EWOK在密集环境中会遇到扭曲的轨迹，因为目标函数包含指数项，这导致优化中的收敛不稳定。此外，我们得出结论，该方法在不更新ESDF的情况下节省了大量计算时间。

在这里插入图片描述

E. Real-World Experiments

We present several experiments in cluttered unknown environments with limited camera FOV. One experiment is to fly by waypoints given in advance. In this experiment, the drone starts from a small office room, passes through the door, flies around in a big cluttered room, and then returns to the office, as illustrated in Fig. 10(a) and Fig. 11. The narrowest passage of indoor experiments is less than one meter as shown in Fig. 6. By contrast, the drone reaches $3.56 m / s$ in such a cluttered environment.

Another indoor experiment is to chase goals arbitrarily and abruptly given during the flight, as shown in Fig. 10©. In this test, limited FOV puts greater challenges that a feasible trajectory must be generated immediately once a new goal is received or collision threat is detected. Thus, this experiment validates that the proposed planner is capable of performing aggressive flight on the premise of feasibility.

In the outdoor experiments, the drone flies through a forest of massive trees and low bushes, as shown in Fig. 10(b) and Fig. 9. Although the wild airflow around the drone causes swinging of the branches and leaves, making the map less reliable, the drone still reaches a speed above 3 m/s. Therefore, the proposed planner can tackle both experimental and field environments. We refer readers to the video3 for more information.

我们提供一系列在杂乱的未知环境中有限FOV下的实验。其中一个饰演是按照提前给定的路径进行飞行，在这个实验中，无人机从一个小办公室起飞，通过门，在一个大杂乱房间四处飞行，然后回到小办公室，如图10(a)和图11展示。最窄的通行宽度小于一米。相比之下，无人机在这种障碍密集的环境下运行速度达到了 $3.56 m / s$

另一个室内实验是追逐随机突然出现的目标点，如图10©所示，在这场实验中，有限的FOV带来了更大的挑战，因为一旦收到一个新的目标点或检测到碰撞风险一个可行的轨迹必须被立即生成。因此，本实验验证了所提出的规划器在可行性的前提下能够执行侵略性飞行。

在室外实验中，无人机飞过一个大量的树和低矮的灌木组成的森林，如图10(b) 和图9尽管无人机周围的狂野气流导致树枝和树叶摆动，使地图不太可靠，但无人机的速度仍然超过3米/秒。因此，所提出的规划器既可以处理实验环境，也可以处理现场环境。我们建议读者参考视频3了解更多信息。

在这里插入图片描述

VII. CONCLUSION AND FUTURE WORK

In this letter, we investigate the necessity of ESDF for gradient-based trajectory planning and propose an ESDF-free local planner. It achieves comparable performance to some state-of-the-art ESDF-based planners but reduces computation time for over an order of magnitude. Benchmark comparisons and real-world experiments validate that it is robust and highly efficient.

The proposed method still has some flaws, which are the local minimum introduced by A* search and the conservative trajectories introduced by unified time re-allocation. Therefore, we will work on performing topological planning to escape the local minimum and re-formulating the problem to generate near-optimal trajectories. The planner is designed for static environments and can tackle slowly moving obstacles (below 0.5 m/s) without any modification. We will work on dynamic environment navigation by moving object detection and topological planning in the future.

本文中我们探索了ESDF更新对基于梯度的轨迹生成的必要性，并且提出了一种免ESDF的局部规划器。他和当前最先进的基于ESDF的规划器表现具有可比性，但降低了计算时间超过一个数量级。标准对比和真实世界实验验证了它是鲁棒并且高效的

提出的方法还有一些问题，那就是，由A*搜索引入的局部极小值和由统一时间重新分配引入的保守轨迹。因此，我们将致力于执行拓扑规划以避开局部极小值，并重新制定问题以生成接近最优的轨迹。该规划器专为静态环境设计，可以在不进行任何修改的情况下处理缓慢移动的障碍物（低于0.5 m/s）。我们将在未来通过运动物体检测和拓扑规划进行动态环境导航。