Introduction to Behavior Trees

Abstraction in programming has evolved our use of computers from basic arithmetic operations to representing complex real-world phenomena using models. More specific to robotics, abstraction has moved us from low-level actuator control and basic sensing to reasoning about higher-level concepts such as behaviors, as I define in my Anatomy of a Robotic System post.


In autonomous systems, we have seen an entire host of abstractions beyond “plain” programming for behavior modeling and execution. Some common ones you may find in the literature include teleo-reactive programs, Petri nets, finite-state machines (FSMs), and behavior trees (BTs). In my experience, FSMs and BTs are the two abstractions you see most often today.


In this post, I will introduce behavior trees with all their terminology, contrast them with finite-state machines, share some examples and software libraries, and as always leave you with some resources if you want to learn more.



As we introduced above, there are several abstractions to help design complex behaviors for an autonomous agent. Generally, these consist of a finite set of entities that map to particular behaviors or operating modes within our system, e.g., “move forward”, “close gripper”, “blink the warning lights”, “go to the charging station”. Each model class has some set of rules that describe when an agent should execute each of these behaviors, and more importantly how the agent should switch between them.


Behavior trees (BTs) are one such abstraction, which I will define by the following characteristics:


  1. Behavior Trees are trees (duh): They start at a root node and are designed to be traversed in a specific order until a terminal state is reached (success or failure).


  2. Leaf nodes are executable behaviors: Each leaf will do something, whether it’s a simple check or a complex action, and will output a status (success, failure, or running). In other words, leaf nodes are where you connect a BT to the lower-level code for your specific application.


  3. Internal nodes control tree traversal: The internal (non-leaf) nodes of the tree will accept the resulting status of their children and apply their own rules to dictate which node should be expanded next.


Behavior trees actually began in the videogame industry to define behaviors for non-player characters (NPCs): Both Unreal Engine and Unity (two major forces in this space) have dedicated tools for authoring BTs. This is no surprise; a big advantage of BTs is that they are easy to compose and modify, even at runtime. However, this sacrifices the ease of designing reactive behaviors (for example, mode switches) compared to some of the other abstractions, as you will see later in this post.


Since then, BTs have also made it into the robotics domain as robots have become increasingly capable of doing more than simple repetitive tasks. Easily the best resource here is the textbook “Behavior Trees in Robotics and AI: An Introduction” by Michele Colledanchise and Petter Ögren. In fact, if you really want to learn the material you should stop reading this post and go directly to the book … but please stick around?


The figure below shows a behavior tree built in Unreal Engine:

2. Behavior Tree Terminology

Let’s dig into the terminology in behavior trees. While the language is not standard across the literature and various software libraries, I will largely follow the definitions in Behavior Trees in Robotics and AI.


At a glance, these are the types of nodes that make up behavior trees and how they are represented graphically:


Behavior trees execute in discrete update steps known as ticks. When a BT is ticked, usually at some specified rate, its child nodes recursively tick based on how the tree is constructed. After a node ticks, it returns a status to its parent, which can be *Success*, *Failure*, or *Running*.


Execution nodes, which are leaves of the BT, can either be Action or Condition nodes. The only difference is that condition nodes can only return Success or Failure within a single tick, whereas action nodes can span multiple ticks and can return Running until they reach a terminal state. Generally, condition nodes represent simple checks (e.g., “is the gripper open?”) while action nodes represent complex actions (e.g., “open the door”).
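To make the tick and status vocabulary concrete, here is a minimal sketch in plain Python. It is not taken from any particular library: the `Status` enum, the `IsGripperOpen` condition, and the `OpenDoor` action (which simply counts ticks as a stand-in for a real long-running action) are all illustrative assumptions.

```python
from enum import Enum


class Status(Enum):
    SUCCESS = "Success"
    FAILURE = "Failure"
    RUNNING = "Running"


class IsGripperOpen:
    """Condition node: an instantaneous check that never returns RUNNING."""

    def __init__(self, gripper_state):
        self.gripper_state = gripper_state  # hypothetical dict holding robot state

    def tick(self):
        return Status.SUCCESS if self.gripper_state["open"] else Status.FAILURE


class OpenDoor:
    """Action node: spans several ticks and reports RUNNING until it is done."""

    def __init__(self, ticks_needed=3):
        self.ticks_needed = ticks_needed  # stand-in for a real, long-running action
        self.ticks_done = 0

    def tick(self):
        self.ticks_done += 1
        if self.ticks_done < self.ticks_needed:
            return Status.RUNNING
        return Status.SUCCESS


condition = IsGripperOpen({"open": True})
action = OpenDoor(ticks_needed=3)

condition_result = condition.tick()
action_history = [action.tick() for _ in range(3)]
print(condition_result.value)
print([status.value for status in action_history])
```

The condition answers within a single tick, while the action has to be ticked repeatedly before it reaches a terminal status.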


Control nodes are internal nodes and define how to traverse the BT given the status of their children. Importantly, children of control nodes can be execution nodes or control nodes themselves. Sequence, Fallback, and Parallel nodes can have any number of children, but differ in how they process said children. Decorator nodes necessarily have one child, and modify its behavior with some custom defined policy.


Here is how the different control nodes work:


  • Sequence nodes execute children in order until one child returns Failure or all children return Success.


  • Fallback nodes execute children in order until one of them returns Success or all children return Failure. These nodes are key in designing recovery behaviors for your autonomous agents.


  • Parallel nodes will execute all their children in “parallel”. This is in quotes because it’s not true parallelism; at each tick, each child node will individually tick in order. Parallel nodes return Success when at least M child nodes (between 1 and N) have succeeded, and Failure when all child nodes have failed.


  • Decorator nodes modify a single child node with a custom policy. A decorator has its own set of rules for changing the status of the “decorated node”. For example, an “Invert” decorator will change Success to Failure, and vice-versa. While decorators can add flexibility to your behavior tree arsenal, you should stick to standard control nodes and common decorators as much as possible so others can easily understand your design.

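The four control nodes described above can be sketched in a few lines of plain Python. This is an illustrative toy, not the API of any BT library: the class names and the fixed-status `Leaf` helper are assumptions, and the Parallel node follows the success and failure rule exactly as worded in the list.

```python
from enum import Enum


class Status(Enum):
    SUCCESS = "Success"
    FAILURE = "Failure"
    RUNNING = "Running"


class Leaf:
    """Test leaf that always returns a fixed status (illustrative only)."""

    def __init__(self, status):
        self.status = status

    def tick(self):
        return self.status


class Sequence:
    def __init__(self, children):
        self.children = children

    def tick(self):
        # Tick children in order; stop at the first one that is not Success.
        for child in self.children:
            status = child.tick()
            if status is not Status.SUCCESS:
                return status
        return Status.SUCCESS


class Fallback:
    def __init__(self, children):
        self.children = children

    def tick(self):
        # Tick children in order; stop at the first one that is not Failure.
        for child in self.children:
            status = child.tick()
            if status is not Status.FAILURE:
                return status
        return Status.FAILURE


class Parallel:
    def __init__(self, children, success_threshold):
        self.children = children
        self.success_threshold = success_threshold  # "M" in the text

    def tick(self):
        # Tick every child once, in order, then count the results.
        results = [child.tick() for child in self.children]
        if results.count(Status.SUCCESS) >= self.success_threshold:
            return Status.SUCCESS
        if results.count(Status.FAILURE) == len(self.children):
            return Status.FAILURE
        return Status.RUNNING


class Invert:
    def __init__(self, child):
        self.child = child

    def tick(self):
        # Swap Success and Failure; pass Running through unchanged.
        status = self.child.tick()
        if status is Status.SUCCESS:
            return Status.FAILURE
        if status is Status.FAILURE:
            return Status.SUCCESS
        return status


ok, fail, busy = Leaf(Status.SUCCESS), Leaf(Status.FAILURE), Leaf(Status.RUNNING)
print(Sequence([ok, busy, fail]).tick().value)
print(Fallback([fail, ok]).tick().value)
```

One consequence of the "all children failed" rule is that a Parallel node with M greater than 1 can stay Running even when M successes are no longer reachable. Some formulations instead return Failure as soon as that happens, so check which policy your library uses.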

3. Robot Example: Searching for Objects

The best way to understand all the terms and graphics in the previous section is through an example. Suppose we have a mobile robot that must search for specific objects in a home environment. Assume the robot knows all the search locations beforehand; in other words, it already has a world model to operate in.

Our mobile robot example. A simulated TurtleBot3 must move in a known map to find blocks of a given color.


Let’s start simple. If there is only one location (we’ll call it A), then the BT is a simple sequence of the necessary actions: Go to the location and then look for the object.



Our first behavior tree. Bask in the simplicity while you can…


We’ve chosen to represent navigation as an action node, as it may take some time for the robot to move (returning Running in the process). On the other hand, we represent vision as a condition node, assuming the robot can detect the object from a single image once it arrives at its destination. I admit, this is totally contrived for the purpose of showing one of each execution node.


One very common design principle you should know is defined in the book as explicit success conditions. In simpler terms, you should almost always check before you act. For example, if you’re already at a specific location, why not check if you’re already there before starting a navigation action?



Explicit success conditions use a Fallback node with a condition preceding an action. The guarded action will only execute if the success condition is not met — in this example if the robot is not at location A.
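As a rough sketch of this "check before you act" pattern, the snippet below wires a condition and an action under a Fallback. The `Robot` class, its `nav_calls` counter, and the assumption that navigation finishes within one tick are all hypothetical simplifications for illustration.

```python
from enum import Enum


class Status(Enum):
    SUCCESS = "Success"
    FAILURE = "Failure"
    RUNNING = "Running"


class Robot:
    """Toy robot model; `location` and `nav_calls` are illustrative names."""

    def __init__(self, location):
        self.location = location
        self.nav_calls = 0


def at_location(robot, target):
    """Condition: is the robot already at the target?"""
    return Status.SUCCESS if robot.location == target else Status.FAILURE


def go_to(robot, target):
    """Action: pretend navigation finishes within one tick."""
    robot.nav_calls += 1
    robot.location = target
    return Status.SUCCESS


def fallback(*children):
    """Return the first child status that is not Failure."""
    for child in children:
        status = child()
        if status is not Status.FAILURE:
            return status
    return Status.FAILURE


def ensure_at(robot, target):
    # Check first, act only if the check fails.
    return fallback(
        lambda: at_location(robot, target),
        lambda: go_to(robot, target),
    )


already_there = Robot(location="A")
elsewhere = Robot(location="B")
status_there = ensure_at(already_there, "A")
status_elsewhere = ensure_at(elsewhere, "A")
print(already_there.nav_calls, elsewhere.nav_calls)
```

The robot that starts at A never triggers the navigation action, while the one that starts at B navigates exactly once.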


Our robot likely operates in an environment with multiple locations, and the idea is to look in all possible locations until we find the object of interest. This can be done by introducing a root-level Fallback node and repeating the above behavior for each location in some specified order.



We can also use Fallback nodes to define reactive behaviors; that is, if one behavior does not work, try the next one, and so on.
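Here is one way the multi-location search could be sketched, building one "go there, then look" subtree per location under a root Fallback. The `WORLD` dictionary, the `robot` state, and the instant navigation are assumptions made purely for illustration.

```python
from enum import Enum


class Status(Enum):
    SUCCESS = "Success"
    FAILURE = "Failure"
    RUNNING = "Running"


def sequence(children):
    """Build a tick function that stops at the first non-Success child."""
    def tick():
        for child in children:
            status = child()
            if status is not Status.SUCCESS:
                return status
        return Status.SUCCESS
    return tick


def fallback(children):
    """Build a tick function that stops at the first non-Failure child."""
    def tick():
        for child in children:
            status = child()
            if status is not Status.FAILURE:
                return status
        return Status.FAILURE
    return tick


# Hypothetical world model: which object (if any) sits at each location.
WORLD = {"A": None, "B": None, "C": "red_block"}
robot = {"location": "start", "visited": []}


def at_loc(loc):
    return lambda: Status.SUCCESS if robot["location"] == loc else Status.FAILURE


def go_to(loc):
    def tick():
        robot["location"] = loc  # navigation is assumed to finish instantly
        robot["visited"].append(loc)
        return Status.SUCCESS
    return tick


def found(obj):
    return lambda: (
        Status.SUCCESS if WORLD[robot["location"]] == obj else Status.FAILURE
    )


def search_tree(locations, obj):
    # One "go there, then look" subtree per location under a root Fallback.
    subtrees = [
        sequence([fallback([at_loc(loc), go_to(loc)]), found(obj)])
        for loc in locations
    ]
    return fallback(subtrees)


root = search_tree(["A", "B", "C"], "red_block")
result = root()
print(result.value, robot["visited"])
```

Each per-location subtree fails when the object is not there, which is what lets the root Fallback move on to the next location.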


Finally, suppose that instead of looking for a single object, we want to consider several objects — let’s say apples and oranges. This use case of composing conditions can be done with Parallel nodes as shown below.


  • If we accept either an apple or an orange (“OR” condition), then we succeed if one node returns Success.


  • If we require both an apple and an orange (“AND” condition), then we succeed if both nodes return Success.


  • If we care about the order of objects, e.g., you must find an apple before finding an orange, then this could be done with a Sequence node instead.



Parallel nodes allow multiple actions and/or conditions to be considered within a single tick.
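The "OR" and "AND" cases can be sketched with a single Parallel helper and two thresholds. Everything here is illustrative: `detections` is a hypothetical set of labels from a vision system, and the optional `failure_threshold` is an assumption I add so that the "AND" case returns Failure, rather than staying Running, when one fruit is missing.

```python
from enum import Enum


class Status(Enum):
    SUCCESS = "Success"
    FAILURE = "Failure"
    RUNNING = "Running"


def parallel(children, success_threshold, failure_threshold=None):
    """Tick all children once, then compare the counts against the thresholds."""
    if failure_threshold is None:
        failure_threshold = len(children)  # default: fail only if all fail
    results = [child() for child in children]
    if results.count(Status.SUCCESS) >= success_threshold:
        return Status.SUCCESS
    if results.count(Status.FAILURE) >= failure_threshold:
        return Status.FAILURE
    return Status.RUNNING


def found(obj, detections):
    """Condition factory: is `obj` in the current set of detections?"""
    return lambda: Status.SUCCESS if obj in detections else Status.FAILURE


def found_either(detections):
    checks = [found("apple", detections), found("orange", detections)]
    return parallel(checks, success_threshold=1)  # "OR": M = 1


def found_both(detections):
    checks = [found("apple", detections), found("orange", detections)]
    # "AND": M = N = 2, and a single miss is enough to fail.
    return parallel(checks, success_threshold=2, failure_threshold=1)


print(found_either({"apple"}).value)
print(found_both({"apple"}).value)
```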


Of course, you can also compose actions in parallel — for example, turning in place until a person is detected for 5 consecutive ticks. While my example is hopefully simple enough to get the basics across, I highly recommend looking at the literature for more complex examples that really show off the power of BTs.


4. Robot Example Revisited: Decorators and Blackboards

I don’t know about you, but looking at the BT above leaves me somewhat uneasy. It’s just the same behavior copied and pasted multiple times underneath a Fallback node. What if you had 20 different locations, and the behavior at each location involved more than just two simplified execution nodes? Things could quickly get messy.


In most software libraries geared for BTs you can define these execution nodes as parametric behaviors that share resources (for example, the same ROS action client for navigation, or object detector for vision). Similarly, you can write code to build complex trees automatically and compose them from a ready-made library of subtrees. So the issue isn’t so much efficiency, but readability.


There is an alternative implementation for this BT, which can extend to many other applications. Here’s the basic idea:


  • Introduce decorators: Instead of duplicating the same subtree for each location, have a single subtree and decorate it with a policy that repeats the behavior until successful.


  • Update the target location at each iteration: Suppose you now have a “queue” of target locations to visit, so at each iteration of the subtree you pop an element from that queue. If the queue eventually ends up empty, then our BT fails.


Many BTs need some notion of shared data, like the location queue we’re discussing. This is where the concept of a blackboard comes in: you’ll find blackboard constructs in most BT libraries out there, and all they really are is a common storage area where individual behaviors can read or write data.


Our example BT could now be refactored as follows. We introduce a “GetLoc” action that pops a location from our queue of known locations and writes it to the blackboard as some parameter target_location. If the queue is empty, this returns Failure; otherwise it returns Success. Then, downstream nodes that deal with navigation can use this target_location parameter, which changes every time the subtree repeats.

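A minimal sketch of this refactoring might look like the following. The blackboard is just a dictionary, and `repeat_until_success` with its `max_attempts` cap is my own assumption about how the repeating decorator could be bounded; real libraries offer their own variants. Navigation and detection are again assumed to finish within one tick.

```python
from collections import deque
from enum import Enum


class Status(Enum):
    SUCCESS = "Success"
    FAILURE = "Failure"
    RUNNING = "Running"


# Hypothetical world model, shared blackboard (a plain dict), and robot state.
WORLD = {"A": None, "B": "red_block", "C": None}
blackboard = {"location_queue": deque(["A", "B", "C"]), "target_location": None}
robot = {"location": "start"}


def get_loc():
    """Pop the next location and publish it as `target_location`."""
    if not blackboard["location_queue"]:
        return Status.FAILURE
    blackboard["target_location"] = blackboard["location_queue"].popleft()
    return Status.SUCCESS


def go_to_target():
    robot["location"] = blackboard["target_location"]  # instant navigation
    return Status.SUCCESS


def found_object():
    found = WORLD[robot["location"]] == "red_block"
    return Status.SUCCESS if found else Status.FAILURE


def sequence(children):
    for child in children:
        status = child()
        if status is not Status.SUCCESS:
            return status
    return Status.SUCCESS


def search_once():
    # The single subtree that replaces the per-location copies.
    return sequence([get_loc, go_to_target, found_object])


def repeat_until_success(child, max_attempts):
    """Decorator: retry the child on Failure, up to `max_attempts` times."""
    for _ in range(max_attempts):
        status = child()
        if status is not Status.FAILURE:
            return status
    return Status.FAILURE


result = repeat_until_success(search_once, max_attempts=3)
print(result.value, robot["location"], list(blackboard["location_queue"]))
```

Only `get_loc` knows about the queue; the navigation node just reads whatever `target_location` currently holds on the blackboard.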


Adding a blackboard and a “Repeat” decorator can greatly simplify our tree, even if the underlying behavior is the same.

You can use blackboards for many other tasks. Here’s another extension of our example: Suppose that after finding an object, the robot should say which object it detected, if any. So, the “FoundApple” and “FoundOrange” conditions could write to a located_objects parameter in the blackboard and a subsequent “Speak” action would read it accordingly. A similar solution could be applied, for instance, if the robot needs to pick up the detected object and has different manipulation policies depending on the type of object.

Fun fact: This section actually came from a real discussion with Davide Faconti, in which… he essentially schooled me. It brings me great joy to turn my humiliation into an educational experience for you all.



Let’s talk about how to program behavior trees! There are quite a few libraries dedicated to BTs, but my two highlights in the robotics space are py_trees and BehaviorTree.CPP.


py_trees is a Python library created by Daniel Stonier.


  • Because it uses an interpreted language like Python, the interface is very flexible and you can basically do what you want… which has its pros and cons. I personally think this is a good choice if you plan on automatically modifying behavior trees at run time.


  • It is being actively developed and with every release you will find new features. However, many of the new developments — not just additional decorators and policy options, but the visualization and logging tools — are already full-steam-ahead with ROS 2. So if you’re still using ROS 1 you will find yourself missing a lot of new things. Check out the PyTrees-ROS Ecosystem page for more details.


  • Some of the terminology and design paradigms are a little bit different from the Behavior Trees in Robotics book. For example, instead of Fallback nodes this library uses Selector nodes, and these behave slightly differently.


Our navigation example using the py_trees library and rqt_py_trees for visualization.


BehaviorTree.CPP is a C++ library developed by Davide Faconti and Michele Colledanchise (yes, one of the book authors). It should therefore be no surprise that this library follows the book notation much more faithfully.


  • This library is quickly gaining traction as the behavior tree library of the ROS developers’ ecosystem, because C++ is similarly the language of production quality development for robotics. In fact, the official ROS 2 navigation stack uses this library in its BT Navigator feature.


  • It heavily relies on an XML based workflow, meaning that the recommended way to author a BT is through XML files. In your code, you register node types with user-defined classes (which can inherit from a rich library of existing classes), and your BT is automatically synthesized!


  • It is paired with a great tool named Groot which is not only a visualizer, but a graphical interface for editing behavior trees. The XML design principle basically means that you can draw a BT and export it as an XML file that plugs into your code.


  • This all works wonderfully if you know the structure of your BT beforehand, but leaves a little to be desired if you plan to modify your trees at runtime. Granted, you can also achieve this using the programmatic approach rather than XML, but this workflow is not documented/recommended, and doesn’t yet play well with the visualization tools.



Our navigation example using the BehaviorTree.CPP library and Groot for visualization.


So how should you choose between these two libraries? They’re both mature, contain a rich set of tools, and integrate well with the ROS ecosystem. It ultimately boils down to whether you want to use C++ or Python for your development. In my example GitHub repo I tried them both out, so you can decide for yourself!



In my time at MathWorks, I was immersed in designing state machines for robotic behavior using Stateflow; in fact, I even did a YouTube livestream on this topic. However, robotics folks often asked me if there were similar tools for modeling behavior trees, which I had never heard of at the time. Fast forward to my first day at CSAIL: my colleague at the time (Daehyung Park) showed me one of his repositories and I finally saw my first behavior tree. It wasn’t long until I was working with them in my project as a layer between planning and execution, which I describe in my 2020 recap blog post.

As someone who has given a lot of thought to “how is a BT different from a FSM?”, I wanted to reaffirm that they both have their strengths and weaknesses, and the best thing you can do is learn when a problem is better suited for one or the other (or both).


The Behavior Trees in Robotics and AI book expands on these thoughts with far more rigor, but here is my attempt to summarize the key ideas:


  • In theory, it is possible to express anything as a BT, FSM, one of the other abstractions, or as plain code. However, each model has its own advantages and disadvantages when it comes to aiding design at larger scale.


  • Specific to BTs vs. FSMs, there is a tradeoff between modularity and reactivity. Generally, BTs are easier to compose and modify while FSMs have their strength in designing reactive behaviors.


Let’s use another robotics example to go deeper into these comparisons. Suppose we have a picking task where a robot must move to an object, grab it by closing its gripper, and then move back to its home position. A side-by-side BT and FSM comparison can be found below. For a simple design like this, both implementations are relatively clean and easy to follow.



Behavior Tree (left) and Finite-State Machine (right) for our robot picking example.


Now, what happens if we want to modify this behavior? Say we first want to check whether the pre-grasp position is valid, and correct if necessary before closing the gripper. With a BT, we can directly insert a subtree along our desired sequence of actions, whereas with a FSM we must rewire multiple transitions. This is what we mean when we claim BTs are great for modularity.
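To see the difference in data-structure terms, here is a rough sketch that stores the BT as a nested tuple and the FSM as a transition table. The node, state, and event names (`PreGraspValid`, `CorrectPreGrasp`, `CheckPreGrasp`, `done`, and so on) are hypothetical labels chosen for illustration.

```python
# Behavior tree as nested (node_type, children) tuples; leaves are strings.
pick_bt = ("Sequence", ["MoveToObject", "CloseGripper", "MoveHome"])

# FSM as a transition table: state -> {event: next_state}.
pick_fsm = {
    "MoveToObject": {"done": "CloseGripper"},
    "CloseGripper": {"done": "MoveHome"},
    "MoveHome": {"done": "Finished"},
}

# Hypothetical pre-grasp correction: succeed if the pose is valid, else fix it.
pregrasp_subtree = ("Fallback", ["PreGraspValid", "CorrectPreGrasp"])


def add_pregrasp_to_bt(tree):
    """Return a copy of the tree with the subtree inserted before CloseGripper."""
    node_type, children = tree
    index = children.index("CloseGripper")
    return (node_type, children[:index] + [pregrasp_subtree] + children[index:])


def add_pregrasp_to_fsm(fsm):
    """Return a copy of the FSM with new states and rewired transitions."""
    new_fsm = {state: dict(edges) for state, edges in fsm.items()}
    new_fsm["MoveToObject"]["done"] = "CheckPreGrasp"  # existing edge rewired
    new_fsm["CheckPreGrasp"] = {
        "valid": "CloseGripper",
        "invalid": "CorrectPreGrasp",
    }
    new_fsm["CorrectPreGrasp"] = {"done": "CheckPreGrasp"}
    return new_fsm


new_bt = add_pregrasp_to_bt(pick_bt)
new_fsm = add_pregrasp_to_fsm(pick_fsm)
print(new_bt)
print(new_fsm)
```

The BT change is a single insertion that leaves the neighboring nodes untouched. The FSM change has to edit an existing transition and add three new ones, and whoever makes it needs to know which states surround the insertion point.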



Modifications to our BT (left) and FSM (right) if we want to add a pre-grasp correction behavior.


On the other hand, there is the issue of reactivity. Suppose our robot is running on a finite power source, so if the battery is low it must return to the charging station before returning to its task. You can implement something like this with BTs, but a fully reactive behavior (that is, the battery state causes the robot to go charge no matter where it is) is easier to implement with a FSM… even if it looks a bit messy.


On the note of “messy”, behavior tree zealots tend to make the argument of “spaghetti state machines” as reasons why you should never use FSMs. I believe that is not a fair comparison. The notion of a hierarchical finite-state machine (HFSM) has been around for a long time and helps avoid this issue if you follow good design practices, as you can see below. However, it is true that managing transitions in a HFSM is still more difficult than adding or removing subtrees in a BT.


There have been specific constructs defined to make BTs more reactive for exactly these applications. For example, there is the notion of a “Reactive Sequence” that can still tick previous children in a sequence even after they have returned Success. In our example, this would allow us to terminate a subtree with Failure if the battery levels are low at any point during that action sequence, which may be what we want.

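The difference between a sequence that remembers its progress and a Reactive Sequence can be sketched as follows. The class names, the `battery` dictionary, and the 20 percent threshold are assumptions for illustration, and `long_task` is a stand-in for an action that stays Running.

```python
from enum import Enum


class Status(Enum):
    SUCCESS = "Success"
    FAILURE = "Failure"
    RUNNING = "Running"


class SequenceWithMemory:
    """Remembers which child is running and does not re-tick earlier ones."""

    def __init__(self, children):
        self.children = children
        self.current = 0

    def tick(self):
        while self.current < len(self.children):
            status = self.children[self.current]()
            if status is Status.RUNNING:
                return status
            if status is Status.FAILURE:
                self.current = 0  # start over on the next tick
                return status
            self.current += 1
        self.current = 0
        return Status.SUCCESS


class ReactiveSequence:
    """Re-ticks every child from the start on each tick."""

    def __init__(self, children):
        self.children = children

    def tick(self):
        for child in self.children:
            status = child()
            if status is not Status.SUCCESS:
                return status
        return Status.SUCCESS


battery = {"level": 100}  # hypothetical shared state, in percent


def battery_ok():
    return Status.SUCCESS if battery["level"] > 20 else Status.FAILURE


def long_task():
    return Status.RUNNING  # stand-in for an action that is still in progress


def run(tree):
    battery["level"] = 100
    first = tree.tick()
    battery["level"] = 10  # battery drops while the task is running
    second = tree.tick()
    return first, second


memory_result = run(SequenceWithMemory([battery_ok, long_task]))
reactive_result = run(ReactiveSequence([battery_ok, long_task]))
print([status.value for status in memory_result])
print([status.value for status in reactive_result])
```

The sequence with memory keeps running the task and never notices the low battery, while the reactive version fails on the second tick. A fuller implementation would also need to halt the running child when an earlier child fails, which this sketch leaves out.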

  • Adding a battery check and charging action to a BT is easy, but note that this check is not reactive — it only occurs at the start of the sequence. Implementing more reactivity would complicate the design of the BT, but is doable with constructs like Reactive Sequences.



  • FSMs can allow this reactivity by allowing the definition of transitions between any two states.



  • Hierarchical FSMs can clean up the diagram. In this case, we define a superstate named “Nominal”, thus defining two clear operating modes between normal operation and charging.

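The flat and hierarchical FSM variants can be compared with simple transition tables. This is only a sketch: the event names, the loop from `MoveHome` back to `MoveToObject`, and the choice to re-enter "Nominal" at `MoveToObject` after charging are all assumptions for illustration.

```python
# Flat FSM: every nominal state needs its own "battery_low" transition.
FLAT_FSM = {
    "MoveToObject": {"done": "CloseGripper", "battery_low": "Charging"},
    "CloseGripper": {"done": "MoveHome", "battery_low": "Charging"},
    "MoveHome": {"done": "MoveToObject", "battery_low": "Charging"},
    "Charging": {"charged": "MoveToObject"},
}

# Hierarchical FSM: one transition on the "Nominal" superstate covers them all.
TOP_LEVEL = {
    "Nominal": {"battery_low": "Charging"},
    "Charging": {"charged": "Nominal"},
}
NOMINAL_SUBSTATES = {
    "MoveToObject": {"done": "CloseGripper"},
    "CloseGripper": {"done": "MoveHome"},
    "MoveHome": {"done": "MoveToObject"},
}
NOMINAL_ENTRY = "MoveToObject"  # assumed entry substate when resuming


def step_hfsm(state, event):
    """Advance a (mode, substate) pair; unknown events leave it unchanged."""
    mode, substate = state
    if event in TOP_LEVEL[mode]:
        new_mode = TOP_LEVEL[mode][event]
        new_sub = NOMINAL_ENTRY if new_mode == "Nominal" else None
        return (new_mode, new_sub)
    if mode == "Nominal" and event in NOMINAL_SUBSTATES[substate]:
        return (mode, NOMINAL_SUBSTATES[substate][event])
    return state


state = ("Nominal", "MoveToObject")
trace = [state]
for event in ["done", "battery_low", "charged", "done"]:
    state = step_hfsm(state, event)
    trace.append(state)
print(trace)
```

In the flat table, the number of `battery_low` transitions grows with the number of nominal states. In the hierarchical version, the superstate handles the event once, no matter how many substates are added.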


Because of this modularity / reactivity tradeoff, I like to think that FSMs are good at managing higher-level operating modes (such as normal operation vs. charging), and BTs are good at building complex sequences of behaviors that are excellent at handling recoveries from failure. So, if this design were up to me, it might be a hybrid that looks something like this:



Best of both worlds: High-level mode switches are handled by a FSM and mode-specific behaviors are managed with BTs.



Thank you for reading through this introductory post, and I look forward to your comments, questions, and suggestions. If you want to try the code examples, check out my example GitHub repository.


To learn more about behavior trees, here are some good resources that I’ve relied on over the past year and a bit.


