Robust Constrained Learning-based NMPC enabling reliable mobile robot path tracking

Introduction

The goal of this paper is to achieve robust constrained, high-performance path tracking in spite of unknown disturbances.

The idea of this paper: simple process model, high model uncertainty $\xrightarrow{\text{learn}}$ accurate, low-uncertainty model

This paper uses VO (visual odometry) for localization.

This paper differs from traditional constrained NMPC in two respects:

  • In traditional methods the process model is designed in advance and stays fixed. This paper learns a disturbance model that augments the process model, so that the process model can predict the mean and uncertainty of effects.
  • Traditional constrained NMPC does not account for model uncertainty. This paper applies robust constraints in real time considering the learned uncertainty. We provide robust constraint satisfaction when uncertainty is high and increased performance as uncertainty is reduced through learning.

The main contributions of this paper are:

  1. use learned models
  2. account for model uncertainty

[Figure: overall control block diagram of the paper]
RC-LB-NMPC consists of two main parts:

  • the robust constrained, path-tracking NMPC algorithm based on an a priori process model
  • the GP-based disturbance model

Mathematical Formulation

First, a brief introduction to NMPC:
At a given sample time, NMPC finds a sequence of control inputs that optimizes the plant behavior over a prediction horizon based on current state. The first input in the optimal sequence is then applied to the system. The entire process is repeated at the next sample time for the new system state.
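The receding-horizon loop described above can be sketched as follows. This is a minimal illustration rather than the paper's solver: the 1-D integrator plant, the brute-force search over a small set of candidate inputs, and all names (`step`, `horizon_cost`, `nmpc_step`, `run_closed_loop`) are assumptions made for the example.

```python
import itertools

def step(x, u, dt=0.1):
    """Toy plant (an assumption): 1-D integrator, x_{k+1} = x_k + dt * u_k."""
    return x + dt * u

def horizon_cost(x0, inputs, x_ref):
    """Squared tracking error plus a small input penalty, summed over the horizon."""
    x, cost = x0, 0.0
    for u in inputs:
        x = step(x, u)
        cost += (x_ref - x) ** 2 + 0.01 * u ** 2
    return cost

def nmpc_step(x0, x_ref, horizon=3, candidates=(-1.0, -0.5, 0.0, 0.5, 1.0)):
    """Exhaustively search candidate input sequences and return the cheapest one."""
    return min(itertools.product(candidates, repeat=horizon),
               key=lambda seq: horizon_cost(x0, seq, x_ref))

def run_closed_loop(x0, x_ref, n_steps=50):
    """Receding horizon: optimize, apply only the first input, repeat from the new state."""
    x = x0
    for _ in range(n_steps):
        u_seq = nmpc_step(x, x_ref)
        x = step(x, u_seq[0])
    return x
```

Calling `run_closed_loop(0.0, 1.0)` drives the toy state toward the reference. A real NMPC implementation would replace the exhaustive search with a gradient-based nonlinear optimizer.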

Robust Constrained NMPC
  • First, the state transition model

The true system is approximated by the sum of an a priori model and an experience-based, learned model:
$$x_{k+1} = f(x_{k}, u_{k}) + g(a_{k})$$
where:
$f(\cdot)$: a known nonlinear process model representing our knowledge of $f_{true}(\cdot)$
$g(\cdot)$: an (initially unknown) disturbance model representing discrepancies between the a priori model and the actual system behavior. $g(\cdot)$ is modeled as a GP. For simplicity, $a_{k} = (\bar{x}_{k}, u_{k})$
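To make the structure of this model concrete, here is a small sketch of the sum of an a priori model and a disturbance term. The unicycle kinematics in `f_prior` and the speed-dependent heading drift in `g_learned` are assumptions chosen for illustration. In the paper, the disturbance comes from a learned GP, not a fixed formula.

```python
import math

def f_prior(x, u, dt=0.1):
    """A priori unicycle model (an assumption): state (px, py, theta), input (v, omega)."""
    px, py, theta = x
    v, omega = u
    return (px + dt * v * math.cos(theta),
            py + dt * v * math.sin(theta),
            theta + dt * omega)

def g_learned(a):
    """Stand-in for the learned disturbance mean (hypothetical):
    a heading drift proportional to the commanded speed."""
    _, u = a
    v, _ = u
    return (0.0, 0.0, -0.02 * v)

def predict(x, u):
    """x_{k+1} = f(x_k, u_k) + g(a_k), with a_k = (x_k, u_k)."""
    nominal = f_prior(x, u)
    disturbance = g_learned((x, u))
    return tuple(n + d for n, d in zip(nominal, disturbance))
```

For example, `predict((0.0, 0.0, 0.0), (1.0, 0.0))` moves the robot 0.1 forward and adds the hypothetical heading drift of -0.02.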

  • Next, the cost function

Define the cost function to be minimized over the next $K$ time-steps as:
$$J(\bar{x}, u) = (x_{d} - \bar{x})^{T}Q(x_{d} - \bar{x}) + (u_{d} - u)^{T}R(u_{d} - u)$$
where:
$Q$ is a positive semidefinite matrix and $R$ is a positive definite matrix
$x_{d} = (x_{d, k+1}, ..., x_{d, k+K})$: a sequence of desired states
$x = (x_{k+1}, ..., x_{k+K})$: a sequence of uncertain predicted states; $\bar{x}$ is the sequence of mean values based on $x$
$u_{d} = (u_{d, k}, ..., u_{d, k+K-1})$: a sequence of desired inputs
$u = (u_{k}, ..., u_{k+K-1})$: a sequence of inputs
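The quadratic cost above can be evaluated directly once the sequences are stacked into flat vectors. The sketch below assumes scalar states and inputs and plain Python lists for the matrices. The helper names `quadratic_form` and `tracking_cost` are made up for this example.

```python
def quadratic_form(e, W):
    """Return e^T W e for a vector e (list) and a square matrix W (list of rows)."""
    n = len(e)
    return sum(e[i] * W[i][j] * e[j] for i in range(n) for j in range(n))

def tracking_cost(x_d, x_bar, u_d, u, Q, R):
    """J = (x_d - x_bar)^T Q (x_d - x_bar) + (u_d - u)^T R (u_d - u).
    All sequences are assumed to be stacked over the horizon already."""
    e_x = [d - m for d, m in zip(x_d, x_bar)]
    e_u = [d - v for d, v in zip(u_d, u)]
    return quadratic_form(e_x, Q) + quadratic_form(e_u, R)

# Example with K = 2, scalar state and input (all values are illustrative).
Q = [[2.0, 0.0], [0.0, 2.0]]
R = [[1.0, 0.0], [0.0, 1.0]]
J = tracking_cost([1.0, 1.0], [0.5, 0.75], [0.0, 0.0], [0.2, 0.1], Q, R)
```

Here the state errors are 0.5 and 0.25 and the input errors are -0.2 and -0.1, so J = 2(0.25) + 2(0.0625) + 0.04 + 0.01 = 0.675.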

  • Next, define the robust constraints
    They are defined in terms of both the state and the input.
Building on the above, we can formulate the following constrained optimization problem:

$$\{x_{opt}, u_{opt}\} = \underset{x,u}{\arg\min}\, J(\bar{x}, u)$$
subject to
$$\bar{x}_{k+i+1} = f(\bar{x}_{k+i}, u_{k+i}) + g(a_{k+i}), \quad i = 0, ..., K-1$$
$$c_{i}(\bar{x}, u) > 0$$
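These notes do not give the exact form of the constraint functions, so the sketch below is only one hypothetical way to express a robust state constraint and an input constraint as margins that must stay positive. The symmetric bounds, the scalar state, and the 3-sigma inflation of the mean are all assumptions made for illustration.

```python
import math

def robust_state_margin(x_mean, x_var, x_limit, n_sigma=3.0):
    """Margin c = x_limit - (|x_mean| + n_sigma * std); the constraint holds when c > 0."""
    return x_limit - (abs(x_mean) + n_sigma * math.sqrt(x_var))

def input_margin(u, u_limit):
    """Margin c = u_limit - |u|; the constraint holds when c > 0."""
    return u_limit - abs(u)

def is_feasible(x_means, x_vars, inputs, x_limit, u_limit):
    """Every state and input margin over the horizon must be strictly positive."""
    state_ok = all(robust_state_margin(m, v, x_limit) > 0
                   for m, v in zip(x_means, x_vars))
    input_ok = all(input_margin(u, u_limit) > 0 for u in inputs)
    return state_ok and input_ok
```

With a predicted mean of 0.1 and a limit of 0.5, a variance of 0.04 gives a margin of -0.2 (infeasible), while a variance of 0.01 gives a margin of 0.1 (feasible). This mirrors the idea that reducing uncertainty through learning leaves more room for performance.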

The overall flow of the algorithm:
[Figure: algorithm flow]
After the algorithm converges, we apply the first element of the resulting optimal control input sequence for one time-step, and start all over at the next time-step.

Predicting uncertain trajectories

The states are all treated as normally distributed, so the Sigma-Point Transform is used to iteratively predict state sequences.

Define the state $z_{i} = (\bar{x}_{k+i}, \mu(a_{k+i})) \in \mathbb{R}^{2n}$ representing the mean state and disturbance at time $k+i$ with uncertainty $P_{i} = \mathrm{diag}(\Sigma_{k+i}, \Sigma_{gp}(a_{k+i}))$

Repeating this process $K$ times generates the complete sequence $x$. In this way, the $3\sigma$ confidence region accounts for uncertainty arising from both localization and modeling.
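The iterative prediction can be sketched with a basic unscented (sigma-point) transform. This is a simplified illustration under stated assumptions, not the paper's exact formulation: the state is scalar, the joint covariance is diagonal so its square root is taken element by element, and the weighting parameter `kappa` and the callables `f`, `gp_mean`, and `gp_var` are placeholders.

```python
import math

def sigma_points(mean, variances, kappa=1.0):
    """Sigma points and weights for a Gaussian with diagonal covariance."""
    L = len(mean)
    scale = L + kappa
    points, weights = [list(mean)], [kappa / scale]
    for i in range(L):
        offset = math.sqrt(scale * variances[i])
        for sign in (1.0, -1.0):
            p = list(mean)
            p[i] += sign * offset
            points.append(p)
            weights.append(1.0 / (2.0 * scale))
    return points, weights

def propagate(mean, variances, h):
    """Push the sigma points through h and recover the output mean and variance."""
    points, weights = sigma_points(mean, variances)
    ys = [h(p) for p in points]
    y_mean = sum(w * y for w, y in zip(weights, ys))
    y_var = sum(w * (y - y_mean) ** 2 for w, y in zip(weights, ys))
    return y_mean, y_var

def predict_sequence(x0, var0, inputs, f, gp_mean, gp_var):
    """Iterate over the horizon; z = (state mean, disturbance mean), P = diag(variances)."""
    x_bar, var_x, out = x0, var0, []
    for u in inputs:
        z = [x_bar, gp_mean(x_bar, u)]
        P = [var_x, gp_var(x_bar, u)]
        x_bar, var_x = propagate(z, P, lambda p, u=u: f(p[0], u) + p[1])
        out.append((x_bar, var_x))
    return out

def three_sigma_interval(mean, var):
    """Lower and upper end of the 3-sigma interval around the mean."""
    half_width = 3.0 * math.sqrt(var)
    return mean - half_width, mean + half_width
```

Because each step feeds both the state variance and the disturbance variance into the transform, the predicted variance grows along the horizon unless the learned model is confident.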

Gaussian Process Disturbance Model

The learned model depends on disturbance observations collected during previous trials.
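As a rough illustration of how stored disturbance observations turn into a predicted mean and variance, here is a standard GP regression sketch. It assumes scalar inputs, a squared-exponential kernel with fixed hyperparameters, and a small dense linear solver written out by hand. None of these choices are taken from the paper.

```python
import math

def sq_exp(a, b, length=1.0, signal_var=1.0):
    """Squared-exponential kernel for scalar inputs."""
    return signal_var * math.exp(-0.5 * ((a - b) / length) ** 2)

def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x

def gp_predict(a_train, d_train, a_query, noise_var=1e-4):
    """Posterior mean and variance of the disturbance at a_query,
    given past inputs a_train and observed disturbances d_train."""
    n = len(a_train)
    K = [[sq_exp(a_train[i], a_train[j]) + (noise_var if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    k_star = [sq_exp(a, a_query) for a in a_train]
    alpha = solve(K, d_train)   # (K + noise I)^{-1} d
    beta = solve(K, k_star)     # (K + noise I)^{-1} k_star
    mean = sum(k * al for k, al in zip(k_star, alpha))
    var = sq_exp(a_query, a_query) - sum(k * b for k, b in zip(k_star, beta))
    return mean, var
```

Near previously visited inputs the predicted variance is small, and far from them it returns to the prior variance. That is the behaviour the robust constraints rely on: high uncertainty in unexplored conditions, lower uncertainty after experience.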
