[Paper Reading] Where Did You Learn That From? Surprising Effectiveness of Membership Inference Attacks Against Temporally Correlated Data in Deep Reinforcement Learning

This work reveals that deep reinforcement learning is strikingly vulnerable to membership inference attacks on temporally correlated data, and proposes an attack framework tailored to this setting. Experiments show the attack infers training-data membership with over 84% accuracy in individual mode and over 97% in collective mode, raising privacy concerns for deploying such models. The learning state of the RL algorithm also significantly influences the degree of privacy leakage.


1. Paper Information

**Title:** Where Did You Learn That From? Surprising Effectiveness of Membership Inference Attacks Against Temporally Correlated Data in Deep Reinforcement Learning
**Year:** 2021
**Venue:** arXiv (preprint)
**Link:** https://arxiv.org/abs/2109.03975
**Authors:** Maziar Gomrokchi, Susan Amin, Hossein Aboutalebi

2. Paper Structure

(Figure: overview diagram of the paper's structure; image not recovered)

3. Paper Content

Abstract

While significant research advances have been made in the field of deep reinforcement learning, a major challenge to widespread industrial adoption of deep reinforcement learning that has recently surfaced but little explored is the potential vulnerability to privacy breaches. In particular, there have been no concrete adversarial attack strategies in literature tailored for studying the vulnerability of deep reinforcement learning algorithms to membership inference attacks.
To address this gap, we propose an adversarial attack framework tailored for testing the vulnerability of deep reinforcement learning algorithms to membership inference attacks. More specifically, we design a series of experiments to investigate the impact of temporal correlation, which naturally exists in reinforcement learning training data, on the probability of information leakage. Furthermore, we study the differences in the performance of collective and individual membership attacks against deep reinforcement learning algorithms. Experimental results show that the proposed adversarial attack framework is surprisingly effective at inferring the data used during deep reinforcement training with an accuracy exceeding 84% in individual and 97% in collective mode on two different control tasks in OpenAI Gym, which raises serious privacy concerns in the deployment of models resulting from deep reinforcement learning. Moreover, we show that the learning state of a reinforcement learning algorithm significantly influences the level of the privacy breach.
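
The post does not reproduce the paper's attack architecture, so the following is only a minimal, hypothetical Python sketch of the general recipe the abstract describes: gather trajectories whose membership in the target policy's training data is known, featurize them, and fit a binary attack classifier (individual mode scores one trajectory at a time). The feature set and the synthetic data generator are placeholders, not the authors' design.

```python
# Minimal, hypothetical sketch of an individual-mode membership inference
# attack on RL trajectories. Placeholder features and synthetic data; NOT
# the paper's architecture.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

def trajectory_features(traj):
    """Summarize a trajectory (a list of (state, action, reward) steps)
    as a fixed-length vector; a real attack would use richer features
    that exploit the temporal correlation between steps."""
    states = np.array([s for s, _, _ in traj])
    rewards = np.array([r for _, _, r in traj])
    return np.concatenate([
        states.mean(axis=0), states.std(axis=0),
        [rewards.sum(), rewards.mean(), len(traj)],
    ])

# Synthetic stand-in for rollouts whose membership label is known: member
# trajectories carry a small statistical bias the classifier can pick up.
rng = np.random.default_rng(0)
def fake_trajectory(member, T=50, state_dim=4):
    bias = 0.3 if member else 0.0
    return [(rng.normal(bias, 1.0, state_dim),  # state
             int(rng.integers(2)),              # action
             rng.normal(bias))                  # reward
            for _ in range(T)]

trajectories = [fake_trajectory(m) for m in [True] * 200 + [False] * 200]
labels = np.array([1] * 200 + [0] * 200)
X = np.stack([trajectory_features(t) for t in trajectories])

X_tr, X_te, y_tr, y_te = train_test_split(X, labels, random_state=0)
attack = RandomForestClassifier(n_estimators=100, random_state=0)
attack.fit(X_tr, y_tr)
print("individual-mode attack accuracy:", attack.score(X_te, y_te))
```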

Related Work

  • Membership inference attacks (see the formulation recalled below)
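
For context, a common formulation of membership inference from the supervised-learning literature (e.g. Shokri et al., 2017), not the paper's RL-specific one: given a trained model $f_\theta$ and a candidate record $z$, the adversary $\mathcal{A}$ outputs 1 if it believes $z$ belongs to the training set $D_{\text{train}}$, and its advantage over random guessing is

$$
\mathrm{Adv}(\mathcal{A}) \;=\; \Pr\!\big[\mathcal{A}(f_\theta, z)=1 \,\big|\, z \in D_{\text{train}}\big] \;-\; \Pr\!\big[\mathcal{A}(f_\theta, z)=1 \,\big|\, z \notin D_{\text{train}}\big].
$$

In this paper's setting, $z$ is not an i.i.d. record but a whole trajectory of temporally correlated states, actions, and rewards, which is precisely the structure the proposed attack exploits.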

Conclusions

1. Reinforcement learning is markedly more vulnerable to membership inference attacks in the collective setting than to attacks on individual trajectories (a simple statistical intuition for this gap is sketched after this list).
2. The maximum trajectory length set by the environment plays an important role in how vulnerable the training data used in a deep RL model is to membership inference attacks.
3. The paper reveals the role of temporal correlation in training the attack, and the extent to which an adversary can exploit this information to design high-accuracy membership inference attacks against deep RL.
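
Regarding point 1, the paper's collective attack is its own construction, but a simple statistical intuition (not the authors' method) already suggests why collective accuracy can exceed individual accuracy: when a whole batch of trajectories shares one membership label, majority-voting several noisy per-trajectory guesses suppresses their individual errors. A toy Monte Carlo illustration, with the batch size chosen arbitrarily:

```python
# Toy illustration (not the paper's method): majority-voting noisy
# per-trajectory membership guesses over a batch that shares one label.
import numpy as np

rng = np.random.default_rng(1)
p_individual = 0.84  # per-trajectory attack accuracy reported in the paper
batch_size = 5       # trajectories per collective decision (assumed; odd to avoid ties)
n_batches = 100_000

# For each batch, count how many of the i.i.d. per-trajectory guesses are
# correct; the batch-level guess is correct when that count is a majority.
correct_guesses = rng.binomial(batch_size, p_individual, size=n_batches)
collective_acc = (correct_guesses > batch_size / 2).mean()
print(f"individual accuracy:             {p_individual:.2f}")
print(f"collective majority-vote accur.: {collective_acc:.3f}")
```

With these assumed numbers the batch-level accuracy already lands near 0.97, echoing (though not reproducing) the individual-versus-collective gap the paper reports.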
