AAAI 2021 Paper List

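AAAI stands for the Association for the Advancement of Artificial Intelligence (formerly the American Association for Artificial Intelligence). The association is one of the major academic organizations in artificial intelligence, and its annual conference is a top-tier international venue in the field. In both the China Computer Federation (CCF) ranking of international academic conferences and Tsinghua University's recently released list of recommended computer science conferences and journals, AAAI is rated as a Class A top-tier conference in artificial intelligence.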

The list of papers accepted to AAAI 2021 is available here:
https://aaai.org/Conferences/AAAI-21/wp-content/uploads/2020/12/AAAI-21_Accepted-Paper-List.Main_.Technical.Track_.pdf

Of these, 82 are reinforcement learning papers:
416: Robust Reinforcement Learning: A Case Study in Linear Quadratic Regulation
Bo Pang, Zhong-Ping Jiang

676: Scalable First-Order Methods for Robust MDPs
Julien Grand Clement, Christian Kroer

710: Maintenance of Social Commitments in Multiagent Systems
Pankaj Telang, Munindar Singh, Neil Yorke-Smith

1137: Self-Supervised Attention-Aware Reinforcement Learning
Haiping Wu, Khimya Khetarpal, Doina Precup

1169: Hierarchical Reinforcement Learning for Integrated Recommendation
Ruobing Xie, Shaoliang Zhang, Rui Wang, Feng Xia, Leyu Lin

2088: Combining Reinforcement Learning with Lin-Kernighan-Helsgaun Algorithm for the Traveling Salesman Problem
Jiongzhi Zheng, Kun He, Jianrong Zhou, Yan Jin, Chumin Li

2136: Learning to Reweight Imaginary Transitions for Model-Based Reinforcement Learning
Wenzhen Huang, Qiyue Yin, Junge Zhang, Kaiqi Huang

2294: Exploration-Exploitation in Multi-Agent Learning: Catastrophe Theory Meets Game Theory
Stefanos Leonardos, Georgios Piliouras

2431: Advice-Guided Reinforcement Learning in a Non-Markovian Environment
Daniel Neider, Jean-Raphaël Gaglione, Ivan Gavran, Ufuk Topcu, Bo Wu, Zhe Xu

2441: Content Masked Loss: Human-Like Brush Stroke Planning in a Reinforcement Learning Painting Agent
Peter Schaldenbrand, Jean Oh

2453: Metrics and Continuity in Reinforcement Learning
Charline Le Lan, Marc G. Bellemare, Pablo Samuel Castro

2666: Synthesis of Search Heuristics for Temporal Planning via Reinforcement Learning
Andrea Micheli, Alessandro Valentini

2971: Lipschitz Lifelong Reinforcement Learning
Erwan Lecarpentier, David Abel, Kavosh Asadi, Yuu Jinnai, Emmanuel Rachelson, Michael L. Littman

3011: Exact Reduction of Huge Action Spaces in General Reinforcement Learning
Sultan Javed Majeed, Marcus Hutter

3094: Visual Tracking via Hierarchical Deep Reinforcement Learning
Dawei Zhang, Zhonglong Zheng, Riheng Jia, Minglu Li

3193: Adaptive Prior-Dependent Correction Enhanced Reinforcement Learning for Natural Language Generation
Wei Cheng, Ziyan Luo, Qiyue Yin

3279: A Hybrid Stochastic Gradient Hamiltonian Monte Carlo Method
Chao Zhang, Zhijian Li, Zebang Shen, Jiahao Xie, Hui Qian

3412: Sequential Generative Exploration Model for Partially Observable Reinforcement Learning
Haiyan Yin, Jianda Chen, Sinno Pan, Sebastian Tschiatschek

3679: Learning Task-Distribution Reward Shaping with Meta-Learning
Haosheng Zou, Tongzheng Ren, Dong Yan, Hang Su, Jun Zhu

3727: Visual Comfort Aware-Reinforcement Learning for Depth Adjustment of Stereoscopic 3D Images
Hak Gu Kim, Minho Park, Sangmin Lee, Seongyeop Kim, Yong Man Ro

3812: Scheduling of Time-­‐Varying Workloads Using Reinforcement Learning
Shanka Subhra Mondal, Nikhil Sheoran, Subrata Mitra

4386: DEAR: Deep Reinforcement Learning for Online Advertising Impression in Recommender Systems
Xiangyu Zhao, Changsheng Gu, Haoshenglun Zhang, Xiwang Yang, Xiaobing Liu, Jiliang Tang, Hui Liu

4719: Complexity and Algorithms for Exploiting Quantal Opponents in Large Two-Player Games
David Milec, Jakub Cerny, Viliam Lisy, Bo An

4999: Bayesian Optimized Monte Carlo Planning
John Mern, Anil Yildiz, Zachary Sunberg, Tapan Mukerji, Mykel Kochenderfer

5008: Towards Effective Context for Meta-Reinforcement Learning: An Approach Based on Contrastive Learning
Haotian Fu, Hongyao Tang, Jianye Hao, Chen Chen, Xidong Feng, Dong Li, Wulong Liu

5012: Improved POMDP Tree Search Planning with Prioritized Action Branching
John Mern, Anil Yildiz, Lawrence Bush, Tapan Mukerji, Mykel Kochenderfer

5046: Anytime Heuristic and Monte Carlo Methods for Large-Scale Simultaneous Coalition Structure Generation and Assignment
Fredrik Präntare, Fredrik Heintz, Herman Appelgren

5101: Reinforcement Learning with Trajectory Feedback
Yonathan Efroni, Nadav Merlis, Shie Mannor

5167: Encoding Human Domain Knowledge to Warm Start Reinforcement Learning
Andrew Silva, Matthew Gombolay

5284: GLIB: Efficient Exploration for Relational Model-Based Reinforcement Learning via Goal-Literal Babbling
Rohan Chitnis, Tom Silver, Joshua Tenenbaum, Leslie Kaelbling, Tomas Lozano-Perez

5303: Provably Good Solutions to the Knapsack Problem via Neural Networks of Bounded Size
Christoph Hertrich, Martin Skutella

5320: WCSAC: Worst-Case Soft Actor Critic for Safety-Constrained Reinforcement Learning
Qisong Yang, Thiago D. Simão, Simon H Tindemans, Matthijs T. J. Spaan

5334: Queue-Learning: A Reinforcement Learning Approach for Providing Quality of Service
Majid Raeis, Ali Tizghadam, Alberto Leon-Garcia

5546: Improving Sample Efficiency in Model-Free Reinforcement Learning from Images
Denis Yarats, Amy Zhang, Ilya Kostrikov, Brandon Amos, Joelle Pineau, Rob Fergus

5657: A Sample-Efficient Algorithm for Episodic Finite-Horizon MDP with Constraints
Krishna C Kalagarla, Rahul Jain, Pierluigi Nuzzo

5712: Resilient Multi-Agent Reinforcement Learning with Adversarial Value Decomposition
Thomy Phan, Lenz Belzner, Thomas Gabor, Andreas Sedlmeier, Fabian Ritz, Claudia Linnhoff-Popien

5906: Domain Adaptation in Reinforcement Learning via Latent Unified State Representation
Jinwei Xing, Takashi Nagata, Kexin Chen, Xinyun Zou, Emre Neftci, Jeffrey Krichmar

5930: Uncertainty-Aware Policy Optimization: A Robust, Adaptive Trust Region Approach
James Queeney, Ioannis Paschalidis, Christos G. Cassandras

5971: Deep Recurrent Belief Propagation Network for POMDPs
Yuhui Wang, Xiaoyang Tan

6031: Inverse Reinforcement Learning from Like-Minded Teachers
Ritesh Noothigattu, Tom Yan, Ariel D Procaccia

6049: FontRL: Chinese Font Synthesis via Deep Reinforcement Learning
Yitian Liu, Zhouhui Lian

6070: Coordination between Individual Agents in Multi-Agent Reinforcement Learning
Yang Zhang, Qingyu Yang, Dou An, Chengwei Zhang

6211: Constrained Risk-Averse Markov Decision Processes
Mohamadreza Ahmadi, Ugo Rosolia, Michel Ingham, Richard M Murray, Aaron Ames

6310: A Deep Reinforcement Learning Approach to First-Order Logic Theorem Proving
Maxwell Crouse, Ibrahim Abdelaziz, Bassem Makni, Spencer Whitehead, Cristina Cornelio, Pavan Kapanipathi, Kavitha Srinivas, Veronika Thost, Michael Witbrock, Achille Fokoue

6343: The Maximin Support Method: An Extension of the D’Hondt Method to Approval-Based Multiwinner Elections
Luis Sanchez-Fernandez, Norberto Fernández García, Jesús Fisteus, Markus Brill

6428: Reinforcement Learning Based Multi-Agent Resilient Control: From Deep Neural Networks to an Adaptive Law
Jian Hou, Fangyuan Wang, Lili Wang, Zhiyong Chen

6610: Learning Game-Theoretic Models of Multiagent Trajectories Using Implicit Layers
Philipp Geiger, Christoph-Nikolas Straehle

6977: DeepTrader: A Deep Reinforcement Learning Approach for Risk-Return Balanced Portfolio Management with Market Conditions Embedding
Zhicheng Wang, Biwei Huang, Shikui Tu, Kun Zhang, Lei Xu

7018: Reinforcement Learning with a Disentangled Universal Value Function for Item Recommendation
Kai Wang, Zhene Zou, Qilin Deng, Jianrong Tao, Runze Wu, Changjie Fan, Liang Chen, Peng Cui

7394: Learning Model-Based Privacy Protection under Budget Constraints
Junyuan Hong, Haotao Wang, Zhangyang Wang, Jiayu Zhou

7572: Towards Fully Automated Manga Translation
Ryota Hinami, Shonosuke Ishiwatari, Kazuhiko Yasuda, Yusuke Matsui

7657: The Value-Improvement Path: Towards Better Representations for Reinforcement Learning
Will Dabney, Andre Barreto, Mark Rowland, Robert Dadashi, John Quan, Marc G. Bellemare, David Silver

7812: Text-Based RL Agents with Commonsense Knowledge: New Challenges, Environments and Baselines
Keerthiram Murugesan, Mattia Atzeni, Pavan Kapanipathi, Pushkar Shukla, Sadhana Kumaravel, Gerald Tesauro, Kartik Talamadupula, Mrinmaya Sachan, Murray Campbell

7911: DSLR: Dynamic to Static Lidar Scan Reconstruction Using Adversarially Trained Auto Encoder
Prashant Kumar, Sabyasachi Sahoo, Vanshil Shah, Vineetha Kondameedi, Abhinav Jain, Akshaj Verma, Chiranjib Bhattacharyya, Vinay Vishwanath

7936: Dynamic Automaton-Guided Reward Shaping for Monte Carlo Tree Search
Alvaro Velasquez, Brett Bissey, Lior Barak, Andre Beckus, Ismail Alkhouri, Daniel Melcer, George Atia

7952: Sample Efficient Reinforcement Learning with REINFORCE
Junzi Zhang, Jongho Kim, Brendan O’Donoghue, Stephen Boyd

8029: Reinforcement Learning of Sequential Price Mechanisms
Gianluca Brero, Alon Eden, Matthias Gerstgrasser, David Parkes, Duncan Rheingans-Yoo

8042: Robust Finite-State Controllers for Uncertain POMDPs
Murat Cubuktepe, Nils Jansen, Sebastian Junges, Ahmadreza Marandi, Marnix Suilen, Ufuk Topcu

8168: TAC: Towered Actor Critic for Handling Multiple Action Types in Reinforcement Learning for Drug Discovery
Sai Krishna Gottipati, Yashaswi Pathak, Boris Sattarov, Sahir, Rohan Nuttall, Mohammad Amini, Matthew E. Taylor, Sarath Chandar

8181: Learning with Safety Constraints: Sample Complexity of Reinforcement Learning for Constrained MDPs
Aria HasanzadeZonuzy, Archana Bura, Dileep Kalathil, Srinivas Shakkottai

8186: Solving Common-Payoff Games with Approximate Policy Iteration
Samuel Sokota, Edward Lockhart, Finbarr Timbers, Elnaz Davoodi, Ryan D’Orazio, Neil Burch, Martin Schmid, Michael Bowling, Marc Lanctot

8323: DeepSynth: Automata Synthesis for Automatic Task Segmentation in Deep Reinforcement Learning
Mohammadhosein Hasanbeig, Natasha Yogananda Jeppu, Alessandro Abate, Tom Melham, Daniel Kroening

8398: Inverse Reinforcement Learning with Explicit Policy Estimates
Navyata Sanghvi, Shinnosuke Usami, Mohit Sharma, Joachim Groeger, Kris Kitani

8545: Mean-Variance Policy Iteration for Risk-Averse Reinforcement Learning
Shangtong Zhang, Bo Liu, Shimon Whiteson

8556: Iterative Bounding MDPs: Learning Interpretable Policies via Non-Interpretable Methods
Nicholay Topin, Stephanie Milani, Fei Fang, Manuela Veloso

8619: Temporal-Logic-Based Reward Shaping for Continuing Reinforcement Learning Tasks
Yuqian Jiang, Sudarshanan Bharadwaj, Bo Wu, Rishi Shah, Ufuk Topcu, Peter Stone

8771: Online 3D Bin Packing with Constrained Deep Reinforcement Learning
Hang Zhao, Qijin She, Chenyang Zhu, Yin Yang, Kai Xu

9385: A General Offline Reinforcement Learning Framework for Interactive Recommendation
Teng Xiao, Donglin Wang

9457: Minimax Regret Optimisation for Robust Planning in Uncertain Markov Decision Processes
Marc Rigter, Bruno Lacerda, Nick Hawes

9459: Planning from Pixels in Atari with Learned Symbolic Representations
Andrea Dittadi, Frederik K Drachmann, Thomas Bolander

9813: Combining Reinforcement Learning and Constraint Programming for Combinatorial Optimization
Quentin Cappart, Thierry Moisan, Louis-Martin Rousseau, Isabeau Prémont-Schwarz, Andre Cire

9862: Distributional Reinforcement Learning via Moment Matching
Thanh Tang Nguyen, Sunil Gupta, Svetha Venkatesh

9869: Non-Asymptotic Convergence of Adam-Type Reinforcement Learning Algorithms under Markovian Sampling
Huaqing Xiong, Tengyu Xu, Yingbin Liang, Wei Zhang

9983: Data-­‐Driven Competitive Algorithms for Online Knapsack and Set Cover
Ali Zeynali, Bo Sun, Mohammad Hajiesmaili, Adam Wierman

10000: Inverse Reinforcement Learning with Natural Language Goals
Li Zhou, Kevin Small

10014: Decentralized Policy Gradient Descent Ascent for Safe Multi-Agent Reinforcement Learning
Songtao Lu, Kaiqing Zhang, Tianyi Chen, Tamer Basar, Lior Horesh

10033: Visual Transfer for Reinforcement Learning via Wasserstein Domain Confusion
Josh Roy, George Konidaris

10098: Policy Optimization as Online Learning with Mediator Feedback
Alberto Maria Metelli, Matteo Papini, Pierluca D’Oro, Marcello Restelli

10284: Model-Free Online Learning in Unknown Sequential Decision Making Problems and Games
Gabriele Farina

10346: Deep Bayesian Quadrature Policy Optimization
Ravi Tej Akella, Kamyar Azizzadenesheli, Mohammad Ghavamzadeh, Animashree Anandkumar, Yisong Yue

7256: K-N-MOMDPs: Towards Interpretable Solutions for Adaptive Management
Jonathan Ferrer Mestres, Thomas Dietterich, Olivier Buffet, Iadine Chades
