Paper Collection of Multi-Agent Reinforcement Learning (MARL)
Multi-Agent Reinforcement Learning (MARL) is a research area with strong connections to single-agent RL, multi-agent systems, game theory, evolutionary computation, and optimization theory, with applications in Large Language Models (LLMs) and robotics.
This is a collection of research and review papers on multi-agent reinforcement learning (MARL). The papers are sorted by time. Suggestions and pull requests are welcome.
These references are shared for research purposes. If any author does not want their paper listed here, please feel free to contact us.
Overview
- Tutorial
- Review Papers
- Research Papers
- Framework
- Joint action learning
- Cooperation and competition
- Coordination
- Security
- Self-Play
- Learning To Communicate
- Transfer Learning
- Imitation and Inverse Reinforcement Learning
- Meta Learning
- Application
- Networked MARL (Decentralized Training Decentralized Execution)
- MARL in LLMs (MARL in Large Language Models)
- MARL in Robotics
Tutorial and Books
- Multi-Agent Reinforcement Learning: Foundations and Modern Approaches by Stefano V. Albrecht, Filippos Christianos, Lukas Schäfer, 2023.
- Many-agent Reinforcement Learning by Yaodong Yang, 2021. PhD Thesis.
- Deep Multi-Agent Reinforcement Learning by Jakob N Foerster, 2018. PhD Thesis.
- Multi-Agent Machine Learning: A Reinforcement Approach by H. M. Schwartz, 2014.
- Multiagent Reinforcement Learning by Daan Bloembergen, Daniel Hennes, Michael Kaisers, Peter Vrancx. ECML, 2013.
- Multiagent systems: Algorithmic, game-theoretic, and logical foundations by Shoham Y, Leyton-Brown K. Cambridge University Press, 2008.
Review Papers
- An overview of multi-agent reinforcement learning from game theoretical perspective by Yaodong Yang and Jun Wang. 2020.
- A Survey and Critique of Multiagent Deep Reinforcement Learning by Pablo Hernandez-Leal, Bilal Kartal and Matthew E. Taylor. 2019.
- Multi-Agent Reinforcement Learning: A Selective Overview of Theories and Algorithms by Kaiqing Zhang, Zhuoran Yang, Tamer Başar. 2019.
- A Survey on Transfer Learning for Multiagent Reinforcement Learning Systems by Silva, Felipe Leno da; Costa, Anna Helena Reali. JAIR, 2019.
- Autonomously Reusing Knowledge in Multiagent Reinforcement Learning by Silva, Felipe Leno da; Taylor, Matthew E.; Costa, Anna Helena Reali. IJCAI, 2018.
- Deep Reinforcement Learning Variants of Multi-Agent Learning Algorithms by Castaneda A O. 2016.
- Evolutionary Dynamics of Multi-Agent Learning: A Survey by Bloembergen, Daan, et al. JAIR, 2015.
- Game theory and multi-agent reinforcement learning by Nowé A, Vrancx P, De Hauwere Y M. Reinforcement Learning. Springer Berlin Heidelberg, 2012.
- Multi-agent reinforcement learning: An overview by Buşoniu L, Babuška R, De Schutter B. Innovations in multi-agent systems and applications-1. Springer Berlin Heidelberg, 2010.
- A comprehensive survey of multi-agent reinforcement learning by Busoniu L, Babuska R, De Schutter B. IEEE Transactions on Systems Man and Cybernetics Part C Applications and Reviews, 2008.
- If multi-agent learning is the answer, what is the question? by Shoham Y, Powers R, Grenager T. Artificial Intelligence, 2007.
- From single-agent to multi-agent reinforcement learning: Foundational concepts and methods by Neto G. Learning theory course, 2005.
- Evolutionary game theory and multi-agent reinforcement learning by Tuyls K, Nowé A. The Knowledge Engineering Review, 2005.
- An Overview of Cooperative and Competitive Multiagent Learning by Pieter Jan ’t Hoen, Karl Tuyls, Liviu Panait, Sean Luke, J. A. La Poutré. AAMAS's workshop LAMAS, 2005.
- Cooperative multi-agent learning: the state of the art by Liviu Panait and Sean Luke, 2005.
Research Papers
MARL in LLMs
- Large language model based multi-agents: A survey of progress and challenges by Guo, Taicheng, Xiuying Chen, Yaqi Wang, Ruidi Chang, Shichao Pei, Nitesh V. Chawla, Olaf Wiest, and Xiangliang Zhang. 2024.
- Leveraging Large Language Models for Optimised Coordination in Textual Multi-Agent Reinforcement Learning by Slumbers, Oliver, David Henry Mguni, Kun Shao, and Jun Wang. 2024.
- Theory of mind for multi-agent collaboration via large language models by Li, Huao, Yu Quan Chong, Simon Stepputtis, Joseph Campbell, Dana Hughes, Michael Lewis, and Katia Sycara. 2023.
Framework
- Multi-Agent Constrained Policy Optimisation by Shangding Gu, Jakub Grudzien Kuba, Muning Wen, Ruiqing Chen, Ziyan Wang, Zheng Tian, Jun Wang, Alois Knoll, and Yaodong Yang, 2021.
- Settling the Variance of Multi-Agent Policy Gradients by Kuba Jakub, Muning Wen, Linghui Meng, Shangding Gu, Haifeng Zhang, David Mguni, Jun Wang, and Yaodong Yang, NeurIPS 2021.
- QMIX: Monotonic Value Function Factorisation for Deep Multi-Agent Reinforcement Learning by Tabish Rashid, Mikayel Samvelyan, Christian Schroeder de Witt, Gregory Farquhar, Jakob Foerster, Shimon Whiteson. ICML 2018.
- Mean Field Multi-Agent Reinforcement Learning by Yaodong Yang, Rui Luo, Minne Li, Ming Zhou, Weinan Zhang, and Jun Wang. ICML 2018.
- Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments by Lowe R, Wu Y, Tamar A, et al. arXiv, 2017.
- Deep Decentralized Multi-task Multi-Agent RL under Partial Observability by Omidshafiei S, Pazis J, Amato C, et al. arXiv, 2017.
- Multiagent Bidirectionally-Coordinated Nets for Learning to Play StarCraft Combat Games by Peng P, Yuan Q, Wen Y, et al. arXiv, 2017.
- Robust Adversarial Reinforcement Learning by Lerrel Pinto, James Davidson, Rahul Sukthankar, Abhinav Gupta. arXiv, 2017.
- Stabilising Experience Replay for Deep Multi-Agent Reinforcement Learning by Foerster J, Nardelli N, Farquhar G, et al. arXiv, 2017.
- Multiagent reinforcement learning with sparse interactions by negotiation and knowledge transfer by Zhou L, Yang P, Chen C, et al. IEEE transactions on cybernetics, 2016.
- Decentralised multi-agent reinforcement learning for dynamic and uncertain environments by Marinescu A, Dusparic I, Taylor A, et al. arXiv, 2014.
- CLEANing the reward: counterfactual actions to remove exploratory action noise in multiagent learning by HolmesParker C, Taylor M E, Agogino A, et al. AAMAS, 2014.
- Bayesian reinforcement learning for multiagent systems with state uncertainty by Amato C, Oliehoek F A. MSDM Workshop, 2013.
- Multiagent learning: Basics, challenges, and prospects by Tuyls, Karl, and Gerhard Weiss. AI Magazine, 2012.
- Classes of multiagent q-learning dynamics with epsilon-greedy exploration by Wunder M, Littman M L, Babes M. ICML, 2010.
- Conditional random fields for multi-agent reinforcement learning by Zhang X, Aberdeen D, Vishwanathan S V N. ICML, 2007.
- Multi-agent reinforcement learning using strategies and voting by Partalas, Ioannis, Ioannis Feneris, and Ioannis Vlahavas. ICTAI, 2007.
- A reinforcement learning scheme for a partially-observable multi-agent game by Ishii S, Fujita H, Mitsutake M, et al. Machine Learning, 2005.
- Asymmetric multiagent reinforcement learning by Könönen V. Web Intelligence and Agent Systems, 2004.
- Adaptive policy gradient in multiagent learning by Banerjee B, Peng J. AAMAS, 2003.
- Reinforcement learning to play an optimal Nash equilibrium in team Markov games by Wang X, Sandholm T. NIPS, 2002.
- Multiagent learning using a variable learning rate by Michael Bowling and Manuela Veloso, 2002.
- Value-function reinforcement learning in Markov games by Littman M L. Cognitive Systems Research, 2001.
- Hierarchical multi-agent reinforcement learning by Makar, Rajbala, Sridhar Mahadevan, and Mohammad Ghavamzadeh. The fifth international conference on Autonomous agents, 2001.
- An analysis of stochastic game theory for multiagent reinforcement learning by Michael Bowling and Manuela Veloso, 2000.
Joint action learning
- AWESOME: A general multiagent learning algorithm that converges in self-play and learns a best response against stationary opponents by Conitzer V, Sandholm T. Machine Learning, 2007.
- Extending Q-Learning to General Adaptive Multi-Agent Systems by Tesauro, Gerald. NIPS, 2003.
- Multiagent reinforcement learning: theoretical framework and an algorithm by Hu, Junling, and Michael P. Wellman. ICML, 1998.
- The dynamics of reinforcement learning in cooperative multiagent systems by Claus C, Boutilier C. AAAI, 1998.
- Markov games as a framework for multi-agent reinforcement learning by Littman, Michael L. ICML, 1994.
Cooperation and competition
- Interaction Pattern Disentangling for Multi-Agent Reinforcement Learning by Shunyu Liu, Jie Song, Yihe Zhou, Na Yu, Kaixuan Chen, Zunlei Feng, Mingli Song. TPAMI, 2024.
- Contrastive Identity-Aware Learning for Multi-Agent Value Decomposition by Shunyu Liu, Yihe Zhou, Jie Song, Tongya Zheng, Kaixuan Chen, Tongtian Zhu, Zunlei Feng, Mingli Song. AAAI, 2023.
- Is Centralized Training with Decentralized Execution Framework Centralized Enough for MARL? by Yihe Zhou, Shunyu Liu, Yunpeng Qing, Kaixuan Chen, Tongya Zheng, Yanhao Huang, Jie Song, Mingli Song. 2023.
- Multi-Agent Reinforcement Learning is a Sequence Modeling Problem by Wen, Muning, Jakub Grudzien Kuba, Runji Lin, Weinan Zhang, Ying Wen, Jun Wang, and Yaodong Yang, 2022.
- The Complexity of Markov Equilibrium in Stochastic Games by Daskalakis, Constantinos, Noah Golowich, and Kaiqing Zhang, 2022.
- Trust region policy optimisation in multi-agent reinforcement learning by Kuba, Jakub Grudzien, Ruiqing Chen, Muning Wen, Ying Wen, Fanglei Sun, Jun Wang, and Yaodong Yang, ICLR 2022.
- The Surprising Effectiveness of PPO in Cooperative, Multi-Agent Games by Chao Yu, Akash Velu, Eugene Vinitsky, Yu Wang, Alexandre Bayen, Yi Wu, 2021.
- Human-level performance in 3D multiplayer games with population-based reinforcement learning by Max Jaderberg, Wojciech M. Czarnecki, Iain Dunning, et al. Science 364.6443: 859-865, 2019.
- Emergent complexity through multi-agent competition by Trapit Bansal, Jakub Pachocki, Szymon Sidor, Ilya Sutskever, Igor Mordatch, 2018.
- Learning with opponent learning awareness by Jakob Foerster, Richard Y. Chen, Maruan Al-Shedivat, Shimon Whiteson, Pieter Abbeel, Igor Mordatch, 2018.
- Multi-agent Reinforcement Learning in Sequential Social Dilemmas by Leibo J Z, Zambaldi V, Lanctot M, et al. arXiv, 2017. [Post]
- Cooperative Multi-Agent Control Using Deep Reinforcement Learning by Gupta, J. K., Egorov, M., & Kochenderfer, M. AAMAS 2017.
- Reinforcement Learning in Partially Observable Multiagent Settings: Monte Carlo Exploring Policies with PAC Bounds by Roi Ceren, Prashant Doshi, and Bikramjit Banerjee, pp. 530-538, AAMAS 2016.
- Opponent Modeling in Deep Reinforcement Learning by He H, Boyd-Graber J, Kwok K, et al. ICML, 2016.
- Multiagent cooperation and competition with deep reinforcement learning by Tampuu A, Matiisen T, Kodelja D, et al. arXiv, 2015.
- Emotional multiagent reinforcement learning in social dilemmas by Yu C, Zhang M, Ren F. International Conference on Principles and Practice of Multi-Agent Systems, 2013.
- Multi-agent reinforcement learning in common interest and fixed sum stochastic games: An experimental study by Bab, Avraham, and Ronen I. Brafman. Journal of Machine Learning Research, 2008.
- Combining policy search with planning in multi-agent cooperation by Ma J, Cameron S. Robot Soccer World Cup, 2008.
- Collaborative multiagent reinforcement learning by payoff propagation by Kok J R, Vlassis N. JMLR, 2006.
- Learning to cooperate in multi-agent social dilemmas by de Cote E M, Lazaric A, Restelli M. AAMAS, 2006.
- Learning to compete, compromise, and cooperate in repeated general-sum games by Crandall J W, Goodrich M A. ICML, 2005.
- Sparse cooperative Q-learning by Kok J R, Vlassis N. ICML, 2004.
- Global Convergence of Multi-Agent Policy Gradient in Markov Potential Games by Leonardos, Stefanos, Will Overman, Ioannis Panageas, and Georgios Piliouras. 2021.
- Markov α-Potential Games: Equilibrium Approximation and Regret Analysis by Xin G, et al, 2023.
- A Natural Actor-Critic Framework for Zero-Sum Markov Games by Ahmet A. et al, ICML, 2022.
Coordination
- Collaborating with Humans without Human Data by DJ Strouse, Kevin R. McKee, Matt Botvinick, Edward Hughes, Richard Everett. NeurIPS 2021.
- Coordinated Multi-Agent Imitation Learning by Le H M, Yue Y, Carr P. arXiv, 2017.
- Reinforcement social learning of coordination in networked cooperative multiagent systems by Hao J, Huang D, Cai Y, et al. AAAI Workshop, 2014.
- Coordinating multi-agent reinforcement learning with limited communication by Zhang, Chongjie, and Victor Lesser. AAMAS, 2013.
- Coordination guided reinforcement learning by Lau Q P, Lee M L, Hsu W. AAMAS, 2012.
- Coordination in multiagent reinforcement learning: a Bayesian approach by Chalkiadakis G, Boutilier C. AAMAS, 2003.
- Coordinated reinforcement learning by Guestrin C, Lagoudakis M, Parr R. ICML, 2002.
- Reinforcement learning of coordination in cooperative multi-agent systems by Kapetanakis S, Kudenko D. AAAI/IAAI, 2002.
Security
- Markov Security Games: Learning in Spatial Security Problems by Klima R, Tuyls K, Oliehoek F. The Learning, Inference and Control of Multi-Agent Systems at NIPS, 2016.
- Cooperative Capture by Multi-Agent using Reinforcement Learning, Application for Security Patrol Systems by Yasuyuki S, Hirofumi O, Tadashi M, et al. Control Conference (ASCC), 2015.
- Improving learning and adaptation in security games by exploiting information asymmetry by He X, Dai H, Ning P. INFOCOM, 2015.
Self-Play
- A Comparison of Self-Play Algorithms Under a Generalized Framework by Daniel Hernandez, Kevin Denamganai, Sam Devlin, et al. IEEE Transactions on Games 2021.
- A Unified Game-Theoretic Approach to Multiagent Reinforcement Learning by Marc Lanctot, Vinicius Zambaldi, Audrunas Gruslys, Angeliki Lazaridou, Karl Tuyls, Julien Perolat, David Silver, Thore Graepel. NIPS 2017.
- Deep reinforcement learning from self-play in imperfect-information games by Heinrich, Johannes, and David Silver. arXiv, 2016.
- Fictitious Self-Play in Extensive-Form Games by Heinrich, Johannes, Marc Lanctot, and David Silver. ICML, 2015.
Learning To Communicate
- Learning to ground multi-agent communication with autoencoders by Lin, Toru, Jacob Huh, Christopher Stauffer, Ser Nam Lim, and Phillip Isola. 2021.
- Emergent Communication through Negotiation by Kris Cao, Angeliki Lazaridou, Marc Lanctot, Joel Z Leibo, Karl Tuyls, Stephen Clark, 2018.
- Emergence of Linguistic Communication From Referential Games with Symbolic and Pixel Input by Angeliki Lazaridou, Karl Moritz Hermann, Karl Tuyls, Stephen Clark. ICLR 2018.
- Emergence of Language with Multi-agent Games: Learning to Communicate with Sequences of Symbols by Serhii Havrylov, Ivan Titov. ICLR Workshop, 2017.
- Learning Cooperative Visual Dialog Agents with Deep Reinforcement Learning by Abhishek Das, Satwik Kottur, et al. arXiv, 2017.
- Emergence of Grounded Compositional Language in Multi-Agent Populations by Igor Mordatch, Pieter Abbeel. arXiv, 2017. [Post]
- Cooperation and communication in multiagent deep reinforcement learning by Hausknecht M J. 2017.
- Multi-agent cooperation and the emergence of (natural) language by Lazaridou A, Peysakhovich A, Baroni M. arXiv, 2016.
- Learning to communicate to solve riddles with deep distributed recurrent q-networks by Foerster J N, Assael Y M, de Freitas N, et al. arXiv, 2016.
- Learning to communicate with deep multi-agent reinforcement learning by Foerster J, Assael Y M, de Freitas N, et al. NIPS, 2016.
- Learning multiagent communication with backpropagation by Sukhbaatar S, Fergus R. NIPS, 2016.
- Efficient distributed reinforcement learning through agreement by Varshavskaya P, Kaelbling L P, Rus D. Distributed Autonomous Robotic Systems, 2009.
Transfer Learning
- Simultaneously Learning and Advising in Multiagent Reinforcement Learning by Silva, Felipe Leno da; Glatt, Ruben; and Costa, Anna Helena Reali. AAMAS, 2017.
- Accelerating Multiagent Reinforcement Learning through Transfer Learning by Silva, Felipe Leno da; and Costa, Anna Helena Reali. AAAI, 2017.
- Accelerating multi-agent reinforcement learning with dynamic co-learning by Garant D, da Silva B C, Lesser V, et al. Technical report, 2015.
- Transfer learning in multi-agent systems through parallel transfer by Taylor, Adam, et al. ICML, 2013.
- Transfer learning in multi-agent reinforcement learning domains by Boutsioukis, Georgios, Ioannis Partalas, and Ioannis Vlahavas. European Workshop on Reinforcement Learning, 2011.
- Transfer Learning for Multi-agent Coordination by Vrancx, Peter, Yann-Michaël De Hauwere, and Ann Nowé. ICAART, 2011.
Imitation and Inverse Reinforcement Learning
- On the Utility of Learning about Humans for Human-AI Coordination by Micah Carroll, Rohin Shah, Mark K. Ho, Thomas L. Griffiths, Sanjit A. Seshia, Pieter Abbeel, Anca Dragan. NeurIPS 2019.
- Multi-Agent Adversarial Inverse Reinforcement Learning by Lantao Yu, Jiaming Song, Stefano Ermon. ICML 2019.
- Multi-Agent Generative Adversarial Imitation Learning by Jiaming Song, Hongyu Ren, Dorsa Sadigh, Stefano Ermon. NeurIPS 2018.
- Cooperative inverse reinforcement learning by Hadfield-Menell D, Russell S J, Abbeel P, et al. NIPS, 2016.
- Comparison of Multi-agent and Single-agent Inverse Learning on a Simulated Soccer Example by Lin X, Beling P A, Cogill R. arXiv, 2014.
- Multi-agent inverse reinforcement learning for zero-sum games by Lin X, Beling P A, Cogill R. arXiv, 2014.
- Multi-robot inverse reinforcement learning under occlusion with interactions by Bogert K, Doshi P. AAMAS, 2014.
- Multi-agent inverse reinforcement learning by Natarajan S, Kunapuli G, Judah K, et al. ICMLA, 2010.
Meta Learning
- Continuous Adaptation via Meta-Learning in Nonstationary and Competitive Environments by Al-Shedivat, M. 2018.
Application
- MuZero with Self-competition for Rate Control in VP9 Video Compression by Amol Mandhane, Anton Zhernov, Maribeth Rauh, Chenjie Gu, et al. arXiv 2022.
- MAgent: A Many-Agent Reinforcement Learning Platform for Artificial Collective Intelligence by Zheng L et al. NIPS 2017 & AAAI 2018 Demo. (Github Page)
- Collaborative Deep Reinforcement Learning for Joint Object Search by Kong X, Xin B, Wang Y, et al. arXiv, 2017.
- Multi-Agent Stochastic Simulation of Occupants for Building Simulation by Chapman J, Siebers P, Darren R. Building Simulation, 2017.
- Extending No-MASS: Multi-Agent Stochastic Simulation for Demand Response of residential appliances by Sancho-Tomás A, Chapman J, Sumner M, Darren R. Building Simulation, 2017.
- Safe, Multi-Agent, Reinforcement Learning for Autonomous Driving by Shalev-Shwartz S, Shammah S, Shashua A. arXiv, 2016.
- Applying multi-agent reinforcement learning to watershed management by Mason, Karl, et al. Proceedings of the Adaptive and Learning Agents workshop at AAMAS, 2016.
- Crowd Simulation Via Multi-Agent Reinforcement Learning by Torrey L. AAAI, 2010.
- Traffic light control by multiagent reinforcement learning systems by Bakker, Bram, et al. Interactive Collaborative Information Systems, 2010.
- Multiagent reinforcement learning for urban traffic control using coordination graphs by Kuyer, Lior, et al. Joint European Conference on Machine Learning and Knowledge Discovery in Databases, 2008.
- A multi-agent Q-learning framework for optimizing stock trading systems by Lee J W, Jangmin O. DEXA, 2002.
- Multi-agent reinforcement learning for traffic light control by Wiering, Marco. ICML, 2000.
Networked MARL
- QD-Learning: A Collaborative Distributed Strategy for Multi-Agent Reinforcement Learning Through Consensus Innovations by Kar, Soummya and Moura, José M. F. and Poor, H. Vincent. IEEE Transactions on Signal Processing 2013.
- Fully Decentralized Multi-Agent Reinforcement Learning with Networked Agents by Kaiqing Zhang, Zhuoran Yang, Han Liu, Tong Zhang, Tamer Basar. ICML 2018.
- Value Propagation for Decentralized Networked Deep Multi-agent Reinforcement Learning by Chao Qu, Shie Mannor, Huan Xu, Yuan Qi, Le Song, Junwu Xiong. NeurIPS 2019.
- Multi-agent Reinforcement Learning for Networked System Control by Tianshu Chu, Sandeep Chinchali, Sachin Katti. ICLR 2020.
- F2A2: Flexible fully-decentralized approximate actor-critic for cooperative multi-agent reinforcement learning by Wenhao Li, Bo Jin, Xiangfeng Wang, Junchi Yan, Hongyuan Zha. arXiv 2020.
- Scalable Reinforcement Learning of Localized Policies for Multi-Agent Networked Systems by Guannan Qu, Adam Wierman, Na Li. L4DC 2020.
- Finite-Sample Analysis For Decentralized Batch Multi-Agent Reinforcement Learning With Networked Agents by Zhang, Kaiqing and Yang, Zhuoran and Liu, Han and Zhang, Tong and Başar, Tamer. TAC 2021.