1. Problem
在导航时,遇到一个情况,就是突然很多人出现在机器人附近,它没办法规划轨迹,然后启动move base的terminal直接就crash了,之后的操作就没办法进行了
2. Solution
解决方案有两种:
- 事前:在发生crash前处理突发状况
- 事后:发生crash后,怎么重新启动move base以及重新发布目标位置
在这里只提供一个事后的处理方案,事前的解决方案比较麻烦,要到算法里头去找,这个就不容易弄了,事后补救的方案还是容易实现。
1) 监听move base状态
头文件:
#include <move_base_msgs/MoveBaseAction.h>
#include <move_base_msgs/MoveBaseGoal.h>
#include <actionlib_msgs/GoalStatusArray.h>
topic名:
ros::Subscriber goal_sub =node.subscribe("move_base/status", 10, status_callback); //监听move base 状态
ros::Subscriber BaseStatusSub = node.subscribe("/move_base/result",10, BaseStatusSubCallback); //监听move base运行的结果
BasegoalPub = node.advertise<geometry_msgs::PoseStamped>("/move_base_simple/goal", 1);//发布目标位置
callback函数:
ros::Time move_base_stamp;
ros::Time move_base_stamp_last;
int MoveBaseStatus = 0;
bool move_base_status = true;
bool relaunch_status = true;
int move_status = 0;
void status_callback(const actionlib_msgs::GoalStatusArray& msg)
{
move_base_stamp = msg.header.stamp;
if(!msg.status_list.empty())
move_status = msg.status_list.back().status;
// std::cout<<"move base stamp:"<< msg.header.stamp <<std::endl;
}
void BaseStatusSubCallback(const move_base_msgs::MoveBaseActionResult& msg)
{
MoveBaseStatus=msg.status.status;
}
代码解析:
move_base_stamp
: 记录当前时间
move_base_stamp_last
:记录上一个时间
MoveBaseStatus = 0
:记录当前result的结果
move_base_status = true
: 用来判断move base是否crash,true是正常,false就是出问题了
relaunch_status = true
: 启动一次move base,避免重复启动
move base本身就会一直发布自身状态信息,topic名是move_base/status
,信息类型是actionlib_msgs/GoalStatusArray
,包含如下信息
std_msgs/Header header
uint32 seq
time stamp
string frame_id
actionlib_msgs/GoalStatus[] status_list
uint8 PENDING=0
uint8 ACTIVE=1
uint8 PREEMPTED=2
uint8 SUCCEEDED=3
uint8 ABORTED=4
uint8 REJECTED=5
uint8 PREEMPTING=6
uint8 RECALLING=7
uint8 RECALLED=8
uint8 LOST=9
actionlib_msgs/GoalID goal_id
time stamp
string id
uint8 status
string text
同时也会发布处理结果,topic名是/move_base/result
,这个topic返回的是数字,而每个数字都有对应的含义,如下:
# uint8 PENDING = 0 # The goal has yet to be processed by the action server
# uint8 ACTIVE = 1 # The goal is currently being processed by the action server
# uint8 PREEMPTED = 2 # The goal received a cancel request after it started executing
# # and has since completed its execution (Terminal State)
# uint8 SUCCEEDED = 3 # The goal was achieved successfully by the action server (Terminal State)
# uint8 ABORTED = 4 # The goal was aborted during execution by the action server due
# # to some failure (Terminal State)
# uint8 REJECTED = 5 # The goal was rejected by the action server without being processed,
# # because the goal was unattainable or invalid (Terminal State)
# uint8 PREEMPTING = 6 # The goal received a cancel request after it started executing
# # and has not yet completed execution
# uint8 RECALLING = 7 # The goal received a cancel request before it started executing,
# # but the action server has not yet confirmed that the goal is canceled
# uint8 RECALLED = 8 # The goal received a cancel request before it started executing
# # and was successfully cancelled (Terminal State)
# uint8 LOST = 9 # An action client can determine that a goal is LOST. This should not be
# # sent over the wire by an action server
2) 根据move base的status和result做处理
在了解了状态和结果后,在这里我选用的是status里的stamp做判断,思路就是如果隔5s或10s没有接收到新的stamp,就判断move base已经crash了,这时就用重启指令重新启动move base并重新发布目标位置。具体代码如下:
if( move_base_stamp == move_base_stamp_last) move_base_counting++;
else move_base_counting = 0;
if (move_base_counting>=100)
{
move_base_status = false;
move_base_counting = 0;
if (relaunch_status == true) MoveBaseStatus = -1;
}
move_base_stamp_last = move_base_stamp;
if (move_base_status==false)
{
if (relaunch_status == true)
{
ROS_INFO("Move base got killed!!!!!!!!! Try to relaunch move_base");
system("gnome-terminal -e 'bash -c \"roslaunch /opt/ros/melodic/share/move_base/move_base.launch; exec bash\"'");
ros::Duration(5.0).sleep();
relaunch_status = false;
}
base_goal.header.seq = 1;
base_goal.header.stamp =ros::Time::now();
base_goal.header.frame_id = "map";
base_goal.pose.position.x = 0.0;
base_goal.pose.position.y = 0.0;
base_goal.pose.position.z = 0.0;
base_goal.pose.orientation.x = 0.0;
base_goal.pose.orientation.y = 0.0;
base_goal.pose.orientation.z = 0.0;
base_goal.pose.orientation.w = 1.0;
BasegoalPub.publish(base_goal);
ros::Duration(2.0).sleep();
ROS_INFO("Publish move base goal!");
if (MoveBaseStatus != -1)//这一步也很重要,就是说只有当result有变化才跳出现在的crash事后补救状态,不然就一直发布goal,因为有时候发布一次goal会没响应,要多发几次,这个跟topic这种通信方式有关,如果换成action或service就不会有这种Miss的问题
{
ROS_INFO("Recover success!");
move_base_status = true;
relaunch_status = true;
}
}
代码解析:
if( move_base_stamp == move_base_stamp_last) move_base_counting++
:如果前后两个时间戳是一样的,累计数就+1
else move_base_counting = 0
:时间戳不一样就归0
if (move_base_counting>=100)
:当到100,如果按10Hz频率运行,就是10s,具体根据节点的频率和需求来算
if (relaunch_status == true) MoveBaseStatus = -1
:把result的结果设成-1,即非列表里的数
if (move_base_status==false)
:这时就是出现crash的情况了,开始事后补救
system("gnome-terminal -e 'bash -c \"roslaunch /opt/ros/melodic/share/move_base/move_base.launch; exec bash\"'")
:这句很重要!是重新启动move base的指令,最重要的一点是,把move_base.launch放到/opt/ros/melodic/share/move_base路径下,放在/home下启动不了的,当然,如果有大神知道在/home下启动方法,还望告知,感谢
if (relaunch_status == true)
:启动一次后,就不启动了,不然就一直重复启动
BasegoalPub.publish(base_goal)
:下面就是重新发布goal的位置,根据实际情况调整
if (MoveBaseStatus != -1)
:这一步也很重要,就是说只有当result有变化才跳出现在的crash事后补救状态,不然就一直发布goal,因为有时候发布一次goal会没响应,要多发几次,这个跟topic这种通信方式有关,如果换成action或service就不会有这种Miss的问题
如果大家遇到类似问题,欢迎讨论,特别是如何事前处理。