【SDRL Notes】

2. Safe RL Baselines

SAFE REINFORCEMENT LEARNING

Safe-Reinforcement-Learning-Baselines

2.1. Safe Single-Agent RL Baselines

Consideration of risk in reinforcement learning, Paper, Not Find Code, (Accepted by ICML 1994)
Multi-criteria Reinforcement Learning, Paper, Not Find Code, (Accepted by ICML 1998)
Lyapunov design for safe reinforcement learning, Paper, Not Find Code, (Accepted by ICML 2002)
Risk-sensitive reinforcement learning, Paper, Not Find Code, (Accepted by Machine Learning, 2002)
Risk-Sensitive Reinforcement Learning Applied to Control under Constraints, Paper, Not Find Code, (Accepted by Journal of Artificial Intelligence Research, 2005)
An actor-critic algorithm for constrained Markov decision processes, Paper, Not Find Code, (Accepted by Systems & Control Letters, 2005)
Reinforcement learning for MDPs with constraints, Paper, Not Find Code, (Accepted by European Conference on Machine Learning 2006)
Discounted Markov decision processes with utility constraints, Paper, Not Find Code, (Accepted by Computers & Mathematics with Applications, 2006)
Constrained reinforcement learning from intrinsic and extrinsic rewards, Paper, Not Find Code, (Accepted by International Conference on Development and Learning 2007)
Safe exploration for reinforcement learning, Paper, Not Find Code, (Accepted by ESANN 2008)
Percentile optimization for Markov decision processes with parameter uncertainty, Paper, Not Find Code, (Accepted by Operations Research, 2010)
Probabilistic goal Markov decision processes, Paper, Not Find Code, (Accepted by IJCAI 2011)
Safe reinforcement learning in high-risk tasks through policy improvement, Paper, Not Find Code, (Accepted by IEEE Symposium on Adaptive Dynamic Programming and Reinforcement Learning (ADPRL) 2011)
Safe Exploration in Markov Decision Processes, Paper, Not Find Code, (Accepted by ICML 2012)
Policy gradients with variance related risk criteria, Paper, Not Find Code, (Accepted by ICML 2012)
Risk aversion in Markov decision processes via near optimal Chernoff bounds, Paper, Not Find Code, (Accepted by NeurIPS 2012)
Safe Exploration of State and Action Spaces in Reinforcement Learning, Paper, Not Find Code, (Accepted by Journal of Artificial Intelligence Research, 2012)
An Online Actor–Critic Algorithm with Function Approximation for Constrained Markov Decision Processes, Paper, Not Find Code, (Accepted by Journal of Optimization Theory and Applications, 2012)
Safe policy iteration, Paper, Not Find Code, (Accepted by ICML 2013)
Reachability-based safe learning with Gaussian processes, Paper, Not Find Code, (Accepted by IEEE CDC 2014)
Safe Policy Search for Lifelong Reinforcement Learning with Sublinear Regret, Paper, Not Find Code, (Accepted by ICML 2015)
High-Confidence Off-Policy Evaluation, Paper, Code, (Accepted by AAAI 2015)
Safe Exploration for Optimization with Gaussian Processes, Paper, Not Find Code, (Accepted by ICML 2015)
Safe Exploration in Finite Markov Decision Processes with Gaussian Processes, Paper, Not Find Code, (Accepted by NeurIPS 2016)
Safe and efficient off-policy reinforcement learning, Paper, Code, (Accepted by NeurIPS 2016)
Safe, Multi-Agent, Reinforcement Learning for Autonomous Driving, Paper, Not Find Code, (Arxiv only, 2016, 530+ citations)
Safe Learning of Regions of Attraction in Uncertain, Nonlinear Systems with Gaussian Processes, Paper, Code, (Accepted by CDC 2016)
Safety-constrained reinforcement learning for MDPs, Paper, Not Find Code, (Accepted by the International Conference on Tools and Algorithms for the Construction and Analysis of Systems 2016)
Convex synthesis of randomized policies for controlled Markov chains with density safety upper bound constraints, Paper, Not Find Code, (Accepted by American Control Conference 2016)
Combating Deep Reinforcement Learning's Sisyphean Curse with Intrinsic Fear, Paper, Not Find Code, (OpenReview and Arxiv only, 2016)
Constrained Policy Optimization (CPO), Paper, Code, (Accepted by ICML 2017)
Risk-constrained reinforcement learning with percentile risk criteria, Paper, Not Find Code, (Accepted by The Journal of Machine Learning Research, 2017)
Probabilistically Safe Policy Transfer, Paper, Not Find Code, (Accepted by ICRA 2017)
Accelerated primal-dual policy optimization for safe reinforcement learning, Paper, Not Find Code, (Arxiv, 2017)
Stagewise safe Bayesian optimization with Gaussian processes, Paper, Not Find Code, (Accepted by ICML 2018)
Leave no Trace: Learning to Reset for Safe and Autonomous Reinforcement Learning, Paper, Code, (Accepted by ICLR 2018)
Safe Model-based Reinforcement Learning with Stability Guarantees, Paper, Code, (Accepted by NeurIPS 2018)
A Lyapunov-based Approach to Safe Reinforcement Learning, Paper, Not Find Code, (Accepted by NeurIPS 2018)
Constrained Cross-Entropy Method for Safe Reinforcement Learning, Paper, Not Find Code, (Accepted by NeurIPS 2018)
Safe Reinforcement Learning via Formal Methods, Paper, Not Find Code, (Accepted by AAAI 2018)
Safe exploration and optimization of constrained MDPs using Gaussian processes, Paper, Not Find Code, (Accepted by AAAI 2018)
Safe reinforcement learning via shielding, Paper, Code, (Accepted by AAAI 2018)
Trial without Error: Towards Safe Reinforcement Learning via Human Intervention, Paper, Code, (Accepted by AAMAS 2018)
Learning-based Model Predictive Control for Safe Exploration and Reinforcement Learning, Paper, Not Find Code, (Accepted by CDC 2018)
The Lyapunov Neural Network: Adaptive Stability Certification for Safe Learning of Dynamical Systems, Paper, Code, (Accepted by CoRL 2018)
OptLayer - Practical Constrained Optimization for Deep Reinforcement Learning in the Real World, Paper, Not Find Code, (Accepted by ICRA 2018)
Safe reinforcement learning on autonomous vehicles, Paper, Not Find Code, (Accepted by IROS 2018)
Safe reinforcement learning: Learning with supervision using a constraint-admissible set, Paper, Not Find Code, (Accepted by Annual American Control Conference (ACC) 2018)
A General Safety Framework for Learning-Based Control in Uncertain Robotic Systems, Paper, Not Find Code, (Accepted by IEEE Transactions on Automatic Control 2018)
Verification and repair of control policies for safe reinforcement learning, Paper, Not Find Code, (Accepted by Applied Intelligence, 2018)
Safe Exploration in Continuous Action Spaces, Paper, Code, (Arxiv only, 2018, 200+ citations)
Safe exploration of nonlinear dynamical systems: A predictive safety filter for reinforcement learning, Paper, Not Find Code, (Arxiv, 2018, 40+ citations)
Batch policy learning under constraints, Paper, Code, (Accepted by ICML 2019)
Safe Policy Improvement with Baseline Bootstrapping, Paper, Not Find Code, (Accepted by ICML 2019)
Convergent Policy Optimization for Safe Reinforcement Learning, Paper, Code, (Accepted by NeurIPS 2019)
Constrained reinforcement learning has zero duality gap, Paper, Not Find Code, (Accepted by NeurIPS 2019)
Reinforcement learning with convex constraints, Paper, Code, (Accepted by NeurIPS 2019)
Reward constrained policy optimization, Paper, Not Find Code, (Accepted by ICLR 2019)
Supervised policy update for deep reinforcement learning, Paper, Code, (Accepted by ICLR 2019)
End-to-end safe reinforcement learning through barrier functions for safety-critical continuous control tasks, Paper, Code, (Accepted by AAAI 2019)
Lyapunov-based safe policy optimization for continuous control, Paper, Not Find Code, (Accepted by ICML Workshop RL4RealLife 2019)
Safe reinforcement learning with model uncertainty estimates, Paper, Not Find Code, (Accepted by ICRA 2019)
Safe reinforcement learning with scene decomposition for navigating complex urban environments, Paper, Code, (Accepted by IV 2019)
Verifiably safe off-model reinforcement learning, Paper, Code, (Accepted by the International Conference on Tools and Algorithms for the Construction and Analysis of Systems 2019)
Probabilistic policy reuse for safe reinforcement learning, Paper, Not Find Code, (Accepted by ACM Transactions on Autonomous and Adaptive Systems (TAAS), 2019)
Projected stochastic primal-dual method for constrained online learning with kernels, Paper, Not Find Code, (Accepted by IEEE Transactions on Signal Processing, 2019)
Resource constrained deep reinforcement learning, Paper, Not Find Code, (Accepted by the 29th International Conference on Automated Planning and Scheduling 2019)
Temporal logic guided safe reinforcement learning using control barrier functions, Paper, Not Find Code, (Arxiv, 2019, 25+ citations)
Safe policies for reinforcement learning via primal-dual methods, Paper, Not Find Code, (Arxiv, 2019, 25+ citations)
Value constrained model-free continuous control, Paper, Not Find Code, (Arxiv, 2019, 35+ citations)
Safe Reinforcement Learning in Constrained Markov Decision Processes (SNO-MDP), Paper, Code, (Accepted by ICML 2020)
Responsive Safety in Reinforcement Learning by PID Lagrangian Methods, Paper, Code, (Accepted by ICML 2020)
Constrained Markov decision processes via backward value functions, Paper, Code, (Accepted by ICML 2020)
Projection-Based Constrained Policy Optimization (PCPO), Paper, Code, (Accepted by ICLR 2020)
First order constrained optimization in policy space (FOCOPS), Paper, Code, (Accepted by NeurIPS 2020)
Safe reinforcement learning via curriculum induction, Paper, Code, (Accepted by NeurIPS 2020)
Constrained episodic reinforcement learning in concave-convex and knapsack settings, Paper, Code, (Accepted by NeurIPS 2020)
Risk-sensitive reinforcement learning: Near-optimal risk-sample tradeoff in regret, Paper, Not Find Code, (Accepted by NeurIPS 2020)
IPO: Interior-point Policy Optimization under Constraints, Paper, Not Find Code, (Accepted by AAAI 2020)
Safe reinforcement learning using robust MPC, Paper, Not Find Code, (Accepted by IEEE Transactions on Automatic Control, 2020)
Safe reinforcement learning via projection on a safe set: How to achieve optimality?, Paper, Not Find Code, (Accepted by IFAC 2020)
Reinforcement learning for safety-critical control under model uncertainty, using control Lyapunov functions and control barrier functions, Paper, Not Find Code, (Accepted by RSS 2020)
Learning Transferable Domain Priors for Safe Exploration in Reinforcement Learning, Paper, Code, (Accepted by the International Joint Conference on Neural Networks (IJCNN) 2020)
Safe reinforcement learning through meta-learned instincts, Paper, Not Find Code, (Accepted by The Conference on Artificial Life 2020)
Learning safe policies with cost-sensitive advantage estimation, Paper, Not Find Code, (OpenReview, 2020)
Safe reinforcement learning using probabilistic shields, Paper, Not Find Code, (2020)
A constrained reinforcement learning based approach for network slicing, Paper, Not Find Code, (Accepted by the IEEE 28th International Conference on Network Protocols (ICNP) 2020)
Safe reinforcement learning: A control barrier function optimization approach, Paper, Not Find Code, (Accepted by the International Journal of Robust and Nonlinear Control)
Exploration-exploitation in constrained MDPs, Paper, Not Find Code, (Arxiv, 2020)
Safe reinforcement learning using advantage-based intervention, Paper, Code, (Accepted by ICML 2021)
Shortest-path constrained reinforcement learning for sparse reward tasks, Paper, Code, (Accepted by ICML 2021)
Density constrained reinforcement learning, Paper, Not Find Code, (Accepted by ICML 2021)
CRPO: A New Approach for Safe Reinforcement Learning with Convergence Guarantee, Paper, Not Find Code, (Accepted by ICML 2021)
Safe Reinforcement Learning by Imagining the Near Future (SMBPO), Paper, Code, (Accepted by NeurIPS 2021)
Exponential Bellman Equation and Improved Regret Bounds for Risk-Sensitive Reinforcement Learning, Paper, Not Find Code, (Accepted by NeurIPS 2021)
Risk-Sensitive Reinforcement Learning: Symmetry, Asymmetry, and Risk-Sample Tradeoff, Paper, Not Find Code, (Accepted by NeurIPS 2021)
Safe reinforcement learning with natural language constraints, Paper, Code, (Accepted by NeurIPS 2021)
Learning policies with zero or bounded constraint violation for constrained MDPs, Paper, Not Find Code, (Accepted by NeurIPS 2021)
Conservative safety critics for exploration, Paper, Not Find Code, (Accepted by ICLR 2021)
WCSAC: Worst-case soft actor critic for safety-constrained reinforcement learning, Paper, Not Find Code, (Accepted by AAAI 2021)
Risk-averse trust region optimization for reward-volatility reduction, Paper, Not Find Code, (Accepted by IJCAI 2021)
AlwaysSafe: Reinforcement Learning Without Safety Constraint Violations During Training, Paper, Code, (Accepted by AAMAS 2021)
Safe Continuous Control with Constrained Model-Based Policy Optimization (CMBPO), Paper, Code, (Accepted by IROS 2021)
Context-aware safe reinforcement learning for non-stationary environments, Paper, Code, (Accepted by ICRA 2021)
Robot Reinforcement Learning on the Constraint Manifold, Paper, Code, (Accepted by CoRL 2021)
Provably efficient safe exploration via primal-dual policy optimization, Paper, Not Find Code, (Accepted by the International Conference on Artificial Intelligence and Statistics 2021)
Safe model-based reinforcement learning with robust cross-entropy method, Paper, Code, (Accepted by the ICLR 2021 Workshop on Security and Safety in Machine Learning Systems)
MESA: Offline Meta-RL for Safe Adaptation and Fault Tolerance, Paper, Code, (Accepted by the Workshop on Safe and Robust Control of Uncertain Systems at NeurIPS 2021)
Safe Reinforcement Learning of Control-Affine Systems with Vertex Networks, Paper, Code, (Accepted by the Conference on Learning for Dynamics and Control 2021)
Can You Trust Your Autonomous Car? Interpretable and Verifiably Safe Reinforcement Learning, Paper, Not Find Code, (Accepted by IV 2021)
Provably safe model-based meta reinforcement learning: An abstraction-based approach, Paper, Not Find Code, (Accepted by CDC 2021)
Recovery RL: Safe Reinforcement Learning with Learned Recovery Zones, Paper, Code, (Accepted by IEEE RAL, 2021)
Reinforcement learning control of constrained dynamic systems with uniformly ultimate boundedness stability guarantee, Paper, Not Find Code, (Accepted by Automatica, 2021)
A predictive safety filter for learning-based control of constrained nonlinear dynamical systems, Paper, Not Find Code, (Accepted by Automatica, 2021)
A simple reward-free approach to constrained reinforcement learning, Paper, Not Find Code, (Arxiv, 2021)
State augmented constrained reinforcement learning: Overcoming the limitations of learning with rewards, Paper, Not Find Code, (Arxiv, 2021)
DESTA: A Framework for Safe Reinforcement Learning with Markov Games of Intervention, Paper, Not Find Code, (Arxiv, 2021)
Safe Exploration in Model-based Reinforcement Learning using Control Barrier Functions, Paper, Not Find Code, (Arxiv, 2021)
Constrained Variational Policy Optimization for Safe Reinforcement Learning, Paper, Code, (Accepted by ICML 2022)
Stability-Constrained Markov Decision Processes Using MPC, Paper, Not Find Code, (Accepted by Automatica, 2022)
Constrained Reinforcement Learning for Vehicle Motion Planning with Topological Reachability Analysis, Paper, Not Find Code, (Accepted by Robotics, 2022)
Safe reinforcement learning using robust action governor, Paper, Not Find Code, (Accepted by Learning for Dynamics and Control, 2022)
A primal-dual approach to constrained Markov decision processes, Paper, Not Find Code, (Arxiv, 2022)
SAUTE RL: Almost Surely Safe Reinforcement Learning Using State Augmentation, Paper, Not Find Code, (Arxiv, 2022)
Finding Safe Zones of policies Markov Decision Processes, Paper, Not Find Code, (Arxiv, 2022)
CUP: A Conservative Update Policy Algorithm for Safe Reinforcement Learning, Paper, Code, (Arxiv, 2022)
SAFER: Data-Efficient and Safe Reinforcement Learning via Skill Acquisition, Paper, Not Find Code, (Arxiv, 2022)
Penalized Proximal Policy Optimization for Safe Reinforcement Learning, Paper, Not Find Code, (Arxiv, 2022)
Mean-Semivariance Policy Optimization via Risk-Averse Reinforcement Learning, Paper, Not Find Code, (Arxiv, 2022)
Convergence and sample complexity of natural policy gradient primal-dual methods for constrained MDPs, Paper, Not Find Code, (Arxiv, 2022)
Guided Safe Shooting: model based reinforcement learning with safety constraints, Paper, Not Find Code, (Arxiv, 2022)
Safe Reinforcement Learning via Confidence-Based Filters, Paper, Not Find Code, (Arxiv, 2022)
TRC: Trust Region Conditional Value at Risk for Safe Reinforcement Learning, Paper, Code, (Accepted by IEEE RAL, 2022)
Efficient Off-Policy Safe Reinforcement Learning Using Trust Region Conditional Value at Risk, Paper, Not Find Code, (Accepted by IEEE RAL, 2022)
Enhancing Safe Exploration Using Safety State Augmentation, Paper, Not Find Code, (Arxiv, 2022)
Towards Safe Reinforcement Learning via Constraining Conditional Value-at-Risk, Paper, Not Find Code, (Accepted by IJCAI 2022)
Safe reinforcement learning of dynamic high-dimensional robotic tasks: navigation, manipulation, interaction, Paper, Not Find Code, (Arxiv, 2022)
Safe Exploration Method for Reinforcement Learning under Existence of Disturbance, Paper, Not Find Code, (Arxiv, 2022)
Guiding Safe Exploration with Weakest Preconditions, Paper, Code, (Arxiv, 2022)
Temporal logic guided safe model-based reinforcement learning: A hybrid systems approach, Paper, Not Find Code, (Accepted by Nonlinear Analysis: Hybrid Systems, 2022)
Provably Safe Reinforcement Learning via Action Projection using Reachability Analysis and Polynomial Zonotopes, Paper, Not Find Code, (Arxiv, 2022)
Model-based Safe Deep Reinforcement Learning via a Constrained Proximal Policy Optimization Algorithm, Paper, Code, (Arxiv, 2022)
Safe Model-Based Reinforcement Learning with an Uncertainty-Aware Reachability Certificate, Paper, Not Find Code, (Arxiv, 2022)
UNIFY: a Unified Policy Designing Framework for Solving Constrained Optimization Problems with Machine Learning, Paper, Not Find Code, (Arxiv, 2022)
Enforcing Hard Constraints with Soft Barriers: Safe Reinforcement Learning in Unknown Stochastic Environments, Paper, Not Find Code, (Arxiv, 2022)
Safe Reinforcement Learning Using Robust Control Barrier Functions, Paper, Not Find Code, (Accepted by IEEE RAL, 2022)
Model-free Neural Lyapunov Control for Safe Robot Navigation, Paper, Code, Demo, (Accepted by IROS 2022)
Safe Reinforcement Learning via Probabilistic Logic Shields, Paper, Code, (Accepted by IJCAI 2023, Distinguished Paper Award)
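A recurring template across many of the constrained-RL entries above (e.g. Reward constrained policy optimization, Responsive Safety by PID Lagrangian Methods, and the primal-dual papers) is Lagrangian relaxation: optimize the penalized reward r - λc, and raise the multiplier λ while the cost constraint is violated. A minimal sketch of that dual-ascent loop on a toy two-armed bandit; the bandit payoffs, budget, and step size are illustrative assumptions, not taken from any listed paper:

```python
def dual_ascent_bandit(budget=0.2, iters=5000, lam_lr=0.05):
    """Lagrangian dual ascent on a toy 2-armed bandit.

    Arm 0 pays reward 1.0 at cost 1.0; arm 1 pays reward 0.5 at cost 0.0.
    Constraint: keep the long-run average cost at or below `budget`."""
    rewards, costs = (1.0, 0.5), (1.0, 0.0)
    lam, total_cost = 0.0, 0.0
    for _ in range(iters):
        # primal step: best response to the current penalized reward r - lam * c
        a = max(range(2), key=lambda i: rewards[i] - lam * costs[i])
        total_cost += costs[a]
        # dual step: raise lam while the constraint is violated, clipped at 0
        lam = max(0.0, lam + lam_lr * (costs[a] - budget))
    return total_cost / iters, lam

avg_cost, lam = dual_ascent_bandit()
```

Here λ oscillates around 0.5, the price at which the two arms tie (1.0 - 0.5 × 1.0 = 0.5 - 0.5 × 0.0), and the time-averaged cost approaches the budget; the deep-RL methods listed above replace the exact best response with a policy-gradient step under the same penalized objective.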

2.2. Safe Multi-Agent RL Baselines

Multi-Agent Constrained Policy Optimisation (MACPO), Paper, Code, (Arxiv, 2021)
MAPPO-Lagrangian, Paper, Code, (Arxiv, 2021)
Decentralized policy gradient descent ascent for safe multi-agent reinforcement learning, Paper, Not Find Code, (Accepted by AAAI 2021)
Safe multi-agent reinforcement learning via shielding, Paper, Not Find Code, (Accepted by AAMAS 2021)
CMIX: Deep Multi-agent Reinforcement Learning with Peak and Average Constraints, Paper, Not Find Code, (Accepted by the Joint European Conference on Machine Learning and Knowledge Discovery in Databases 2021)
Safe multi-agent reinforcement learning through decentralized multiple control barrier functions, Paper, Not Find Code, (Arxiv, 2021)
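Several entries in both lists ("Safe reinforcement learning via shielding", its multi-agent AAMAS 2021 extension, "AlwaysSafe") rely on a shield: a monitor that intercepts each proposed action and overrides it whenever the successor state would be unsafe. A minimal sketch of that interception step; the corridor environment, the fallback rule, and all names here are illustrative assumptions, not the construction from any listed paper (those typically synthesize the shield from a temporal-logic specification):

```python
def shielded_step(state, proposed, actions, transition, unsafe):
    """Execute the proposed action only if its successor state is safe;
    otherwise substitute the first safe alternative. Assumes at least
    one safe action exists from every reachable state."""
    if transition(state, proposed) not in unsafe:
        return proposed
    for a in actions:
        if transition(state, a) not in unsafe:
            return a
    raise RuntimeError(f"no safe action available from state {state!r}")

# Toy 1-D corridor: positions 0..4, where position 4 is a cliff.
ACTIONS = (-1, +1)
UNSAFE = {4}

def corridor(s, a):
    return min(4, max(0, s + a))

assert shielded_step(1, +1, ACTIONS, corridor, UNSAFE) == +1  # safe move passes through
assert shielded_step(3, +1, ACTIONS, corridor, UNSAFE) == -1  # cliff move overridden
```

Because the shield acts only at execution time, any off-the-shelf learner can propose actions; the trade-off is that the transition model and unsafe set must be known (or conservatively over-approximated) in advance.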

3. Surveys

A comprehensive survey on safe reinforcement learning, Paper (Accepted by Journal of Machine Learning Research, 2015)

Safe learning and optimization techniques: Towards a survey of the state of the art, Paper (Accepted by the International Workshop on the Foundations of Trustworthy AI Integrating Learning, Optimization and Reasoning, 2020)

Safe learning in robotics: From learning-based control to safe reinforcement learning, Paper (Accepted by Annual Review of Control, Robotics, and Autonomous Systems, 2021)

Policy learning with constraints in model-free reinforcement learning: A survey, Paper (Accepted by IJCAI 2021)

A Review of Safe Reinforcement Learning: Methods, Theory and Applications, Paper (Arxiv, 2022)

State-wise Safe Reinforcement Learning: A Survey, Paper (Accepted by IJCAI 2023)

4. Theses

Safe reinforcement learning, Thesis (PhD thesis, Philip S. Thomas, University of Massachusetts Amherst, 2015)
Safe Exploration in Reinforcement Learning: Theory and Applications in Robotics, Thesis (PhD thesis, Felix Berkenkamp, ETH Zurich, 2019)
