PRIMAL: Pathfinding via Reinforcement and Imitation Multi-Agent Learning, a code walkthrough

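Paper: PRIMAL: Pathfinding via Reinforcement and Imitation Multi-Agent Learning
Paper URL: https://arxiv.org/abs/1809.03531
Code: https://github.com/gsartoretti/PRIMAL
Related posts: Survey of dynamic obstacle avoidance strategies for autonomous driving | Dynamic obstacle avoidance strategies for robots | Pedestrian trajectory prediction | Robot navigation | Three papers on reinforcement learning for multi-agent path planning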

1. Preparation

1.1 Setting up the environment:

  1. Create a new environment:
    conda create --name PRIMAL python=3.6
  2. Create a requirements.txt file listing the packages below, then run pip install -r requirements.txt to install them:
Cython==0.28.4
gym==0.9.4
Tensorflow==1.13.0
numpy==1.16.0
matplotlib
imageio
tk
networkx

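Note: the original code targets Python 3.4 + TF 1.3 + NumPy 1.13. My PyCharm version is too recent to work with such an old Python, so I set up Python 3.6; Python 3.6 cannot run TensorFlow 1.3, so I switched to TensorFlow 1.13.
  3. If the bulk install fails, install the packages one at a time (this is what I did).
Note: if gym cannot be installed directly with pip or conda, try one of these two commands: conda install -c conda-forge gym=0.9.4, or pip install gym -i https://pypi.tuna.tsinghua.edu.cn/simple
If installing TensorFlow fails, run these two commands in order: conda install cudatoolkit=10.0, then conda install tensorflow==1.13.1
  4. Once everything is installed, run conda list to check that all packages are present.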
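The version check in step 4 can also be scripted. The following is a minimal sketch and not part of the PRIMAL repository: parse_pins and find_mismatches are hypothetical helper names, and the installed versions are assumed to come from pasted pip freeze output (conda-only packages such as tk will not appear there).

```python
# Pinned packages copied from the requirements list above.
REQUIREMENTS = """\
Cython==0.28.4
gym==0.9.4
Tensorflow==1.13.0
numpy==1.16.0
matplotlib
imageio
tk
networkx
"""


def parse_pins(text):
    """Return {lowercased package name: pinned version, or None if unpinned}."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, _, version = line.partition("==")
        pins[name.strip().lower()] = version.strip() or None
    return pins


def find_mismatches(pins, installed):
    """List (name, wanted, found) for packages that are missing or differ."""
    problems = []
    for name, wanted in pins.items():
        have = installed.get(name)
        if have is None:
            problems.append((name, wanted, "missing"))
        elif wanted is not None and have != wanted:
            problems.append((name, wanted, have))
    return problems


# Example: `installed` built from (made-up) pip freeze output.
installed = parse_pins("gym==0.9.4\nnumpy==1.16.4\n")
for name, wanted, found in find_mismatches(parse_pins(REQUIREMENTS), installed):
    print(name, "wanted:", wanted, "found:", found)
```

Since pip freeze prints the same name==version format as the requirements file, the same parser is reused for both inputs.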

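1.2 Debugging the code

  1. Follow the steps in the README one at a time:
    ① In a terminal, cd into the od_mstar3 folder and run python setup.py build_ext --inplace. This failed with error: Unable to find vcvarsall.bat
    A web search showed that Visual Studio must be installed with the C++ components selected. If they were not selected during installation, add them by modifying the installation from the Tools menu and then restart; vcvarsall.bat is available afterwards. Reference: "Visual Studio 2017 is installed but Unable to find vcvarsall.bat is still reported"

  2. Even after installing it, the error persisted. Following "About error: Unable to find vcvarsall.bat", I changed the return statement of the find_vcvarsall function in msvc9compiler.py to return r"C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\VC\Auxiliary\Build\vcvarsall.bat"
    Another reference: "Unable to find vcvarsall.bat?"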

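  3. With the vcvarsall.bat problem fixed, a new error appeared: fatal error C1083: Cannot open include file: "boost/graph/graph_traits.hpp": No such file or directory. This header belongs to the Boost Graph Library (BGL), a C++ library that has to be downloaded and installed separately; the path search uses it. References: "Boost C++ Libraries: build and install"; "Boost download, install, build, configuration and usage guide (Windows and Linux)"
    ① I downloaded boost_1_53_.zip from the official site and unpacked it, but running bootstrap.bat failed with Failed to build Boost.Build engine (reference: "Handling Boost build errors: Failed to build Boost.Build engine"). After reading a lot of material I still could not fix it, so I downloaded the latest Boost release instead.
    That installation failed as well, with fatal error C1083: Cannot open include file: "corecrt.h". Reference: "Problems and solutions when compiling C++ with the cl command in the native Windows cmd window"

    ② A friend who had Visual Studio 2022 installed could build Boost without trouble, so I reinstalled with VS 2022 too. It finally worked!

    References: "Boost installation tutorial"; the official installation guide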

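1.3 Installing Boost on Linux:

On Linux, Boost can be installed directly from the package manager with sudo apt-get install libboost-all-dev (apt has no package named plain boost; this is the command given in the linked answer): https://stackoverflow.com/questions/12578499/how-to-install-boost-on-ubuntu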

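2. Code walkthrough:

2.1 setup.py

As the authors instruct, this file should be built first.
setup.py uses Cython, which lets you mix Python and C/C++ code in a Python-like syntax to speed Python up.
Tutorial on calling C/C++ code: "Basic Cython usage"

In this project it compiles cython_od_mstar.pyx, whose find_path(world, init_pos, goals, inflation, time_limit) function searches with the ODrM* algorithm. ODrM* plays the role of the expert and generates high-quality paths.
Inputs and outputs: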

    world - matrix specifying obstacles, 1 for obstacle, 0 for free
    init_pos  - [[x, y], ...] specifying start position for each robot
    goals     - [[x, y], ...] specifying goal position for each robot
    inflation - inflation factor for heuristic
    time_limit - time until failure in seconds

    returns:
    [[[x1, y1], ...], [[x2, y2], ...], ...] path in the joint
    configuration space

2.2 A3C_RNN.py

Running Jupyter notebooks on a server from PyCharm is a bit cumbersome, so I converted DRLMAPF_A3C_RNN.ipynb into A3C_RNN.py. This part is responsible for training the model.

2.3 ACNet.py

The _build_net() function in ACNet.py corresponds to the network architecture shown in the paper.

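2.4 The mapf_gym environment

2.4.1 Differences between mapf_gym and mapf_gym_cap

mapf_gym builds on the gym library and is used to set up the environment.
Oddly, its code is almost identical to mapf_gym_cap.py; only the _observe() function differs.
mapf_gym_cap limits the size of mag (the agent's distance to its goal), which matches the repository description below. This caps the goal-distance value in the observation; it is not the same thing as a limited field of view.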

The GitHub README describes the two files as follows:
mapf_gym: Multi-agent path planning gym environment, in which agents learn collective path planning
mapf_gym_cap.py: Multi-agent path planning gym environment, with capped goal distance state value for validation in larger environments
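The "capped goal distance state value" can be pictured with a short sketch. This is an illustration under assumptions, not the repository's _observe() code: goal_vector is a hypothetical function, and the cap value in the example is arbitrary.

```python
import math


def goal_vector(pos, goal, cap=None):
    """Return (dx, dy, mag): the unit direction from pos to goal and the
    distance to the goal, optionally clipped to `cap`."""
    dx = goal[0] - pos[0]
    dy = goal[1] - pos[1]
    mag = math.hypot(dx, dy)
    if mag != 0:
        dx, dy = dx / mag, dy / mag   # normalise the direction
    if cap is not None:
        mag = min(mag, cap)           # the "capped" variant
    return dx, dy, mag


# Uncapped: the distance grows with the size of the map.
print(goal_vector((0, 0), (30, 40)))
# Capped: the direction is unchanged, the distance never exceeds the cap.
print(goal_vector((0, 0), (30, 40), cap=20))
```

Capping keeps the distance input within the range seen during training, which is presumably why the README ties it to validation in larger environments.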

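Looking at the code, training uses mapf_gym and testing uses mapf_gym_cap.py.
Running either file directly fails with NameError: name 'coordinationRatio' is not defined. I could not find out what the coordinationRatio function does, so I commented out the corresponding line, print(coordinationRatio(env)).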

2.4.2 Creating the environment

A3C_RNN.py creates the environment with: gym=mapf_gym.MAPFEnv(num_agents=n, world0=world[0],goals0=world[1])

2.5 mapgenerator.py

Uses Tk (tkinter) to build environments by hand, i.e. to place the obstacles and agents manually.

2.6 primal_testing.py

Loads the model and runs the tests. It fails with [Errno 2] No such file or directory: 'saved_environments/4_agents_10_size_0_density_id_0_environment.npy'; I could not find where this file is supposed to be generated.

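2.7 unittest

Error: pyglet.canvas.xlib.NoSuchDisplayException: Cannot connect to "None". The advice I found online was to run it directly in a terminal.
But running it in a terminal failed too, with your graphic drives do not support OpenGL 2.0; according to what I found online, a GPU is required.

① I tried the method from "Starting virtualized graphics rendering on the server side with a virtual display"; it did not help.
② I then tried the method from "pyglet.canvas.xlib.NoSuchDisplayException: Cannot connect to “None”"; halfway through the installation it turned out to need a GPU as well.

Looking at the code again, I noticed that the authors had originally commented out the import from gym.envs.classic_control import rendering, so I commented it out too. With it commented out, however, there is no graphical display at all.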
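An alternative to commenting the import in and out by hand is to make it optional, so the same file runs on a headless server and on a desktop. This is a sketch of that idea, not what the repository does. HAVE_RENDERING and make_viewer are hypothetical names, and it assumes an old gym release that still ships gym.envs.classic_control.rendering.

```python
try:
    # On a headless machine this import itself can fail, e.g. with pyglet's
    # NoSuchDisplayException; without gym installed it is an ImportError.
    from gym.envs.classic_control import rendering
    HAVE_RENDERING = True
except Exception:
    rendering = None
    HAVE_RENDERING = False


def make_viewer(width, height):
    """Return a rendering.Viewer, or None when rendering is unavailable."""
    if not HAVE_RENDERING:
        return None
    return rendering.Viewer(width, height)


print("rendering available:", HAVE_RENDERING)
```

Code that draws the world can then test for a None viewer and skip drawing, which keeps training and unit tests usable without a display.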

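2.8 GroupLock.py

Handles multithreading. References:
"Python multithreaded programming (1): using the Thread class of the threading module"
"Python multithreaded programming (2): using the Lock class of the threading module"
"[python] The threading module in detail: using the Condition class (3)"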
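The three building blocks named in these references (Thread, Lock, Condition) fit together as in the sketch below. It is not the GroupLock implementation: TurnGate, the group names "A" and "B", and run_demo are all hypothetical and only illustrate letting one group of threads proceed while another group waits.

```python
import threading


class TurnGate:
    """Lets threads of one named group proceed; other groups wait."""

    def __init__(self, first_group):
        self._cond = threading.Condition()
        self._current = first_group

    def wait_for(self, group):
        # Block until it is this group's turn.
        with self._cond:
            self._cond.wait_for(lambda: self._current == group)

    def switch_to(self, group):
        # Hand the turn to another group and wake every waiting thread.
        with self._cond:
            self._current = group
            self._cond.notify_all()


def run_demo():
    gate = TurnGate("A")
    log = []
    log_lock = threading.Lock()   # protects the shared list

    def worker(group, name):
        gate.wait_for(group)
        with log_lock:
            log.append((group, name))

    jobs = [("B", "w0"), ("A", "w1"), ("B", "w2")]
    threads = [threading.Thread(target=worker, args=job) for job in jobs]
    for t in threads:
        t.start()
    threads[1].join()     # the group "A" worker can run right away
    gate.switch_to("B")   # only now may the group "B" workers continue
    for t in threads:
        t.join()
    return log


print(run_demo())
```

The Condition is what makes the waiting cheap: blocked threads sleep until notify_all() is called instead of polling a flag.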

3. Collected errors:

  1. Error: Connection to Python debugger failed: Socket operation on nonsocket: configureBlocking
  2. ImportError: DLL load failed: The specified module could not be found. Fix: reinstall numpy and tensorflow.
  3. Running mapgenerator.py fails with TclError: no display name and no $DISPLAY environment variable. Fix: specify the display manually. ① Run printenv | grep DISPLAY in a terminal to see the display value (mine printed localhost:10.0); ② change root = Tk() to root = Tk(screenName = ':10.0'); ③ add import matplotlib followed by matplotlib.use('Agg') to the imports.
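Hard-coding ':10.0' breaks as soon as the SSH session gets a different display number. The sketch below derives the value from $DISPLAY instead. screen_name_from_display and display_kwargs are hypothetical helpers; the only library facts relied on are that tkinter.Tk accepts a screenName argument and that os.environ behaves like a dict.

```python
import os


def screen_name_from_display(display):
    """Turn 'localhost:10.0' into ':10.0'; return None if nothing usable."""
    if not display:
        return None
    _host, sep, number = display.rpartition(":")
    if not sep:
        return None
    return ":" + number


def display_kwargs(environ=os.environ):
    """Keyword arguments for tkinter.Tk derived from $DISPLAY (may be empty)."""
    name = screen_name_from_display(environ.get("DISPLAY"))
    return {"screenName": name} if name else {}


print(screen_name_from_display("localhost:10.0"))
print(display_kwargs({"DISPLAY": "localhost:10.0"}))
```

In mapgenerator.py the call would then become root = Tk(**display_kwargs()), which falls back to the plain Tk() behaviour when no display is set.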