【联邦学习安全综述】A Comprehensive Survey of Privacy-preserving Federated Learning: A Taxonomy, Review, and F

作为综述,本文提供了很多写作素材。

原文标题:A Comprehensive Survey of Privacy-preserving Federated Learning: A Taxonomy, Review, and Future Directions

索引

本文不局限于介绍了以下内容:

  1. 举例了机器学习的大规模应用;数据隐私泄漏的危机事件;数据保护法规
  2. 很多别的FL privacy-preserving综述 1.2节
  3. 主流隐私保护技术;隐私度量
  4. 联邦学习的优势

2 联邦学习

主要研究方向

(1) 高效学习:improving the efficiency and effectiveness of FL [24, 68, 87, 100, 120, 211, 213], (2) 抵御攻击 improving the security of FL to attacks that aim to undermine the integrity of FL models and degrade the model performance [9, 13, 185, 198], and (3) 保护训练集隐私 improving the privacy preservation of FL on private user data and avoiding privacy leakages [116, 137, 144, 193, 199].

完整流程

包括server初始化、client本地训练、global aggregation

分类

根据 data partitioning,可以分为 纵向FL、横向FL、混合FL(这一种对应的是联邦迁移学习)
这篇文章用符号表示的不错。

联邦迁移学习

分类
Based on the categorization of transfer learning [136], FTL methods can be divided into three categories [205]: instance-based FTL, feature-based FTL, and parameter-based FTL.

用什么聚合

聚合 gradient or weight? 优劣分析
gradient:更加容易收敛
weight:不需要每一轮都聚合一次,稍微安全一些

相关概念的区分

在这里插入图片描述

3 主要的 隐私保护 机制

介绍了主流的保护机制,并进行了优劣比较。

Encryption Techniques

主要有:homomorphic encryption, secret sharing, and secure multi-party computation.

homomorphic encryption

homomorphic encryption 还可以分为 partially homomorphic encryption (支持 additively homomorphic encryption [88] 或 multiplicative homomorphic encryption,不可以兼得) [134] and fully homomorphic encryption (supports both additive and multiplicative operations ).
比较:Compared with partially homomorphic encryption, fully homomorphic encryption provides stronger encryption but suffers from computation costs.

secret sharing

Secret sharing [156] is a cryptographic scheme where a secret key consisting of n shares can be reconstructed only if a sufficient number of shares are combined.
缺点:
However, these methods are vulnerable to the dishonest dealer or malicious participants.

secure multi-party computation

优劣:
As its advantages, (1) there is no requirement of trusted third-parties, (2) the tradeoff between data utility and data privacy is eliminated, and (3) a high accuracy is achieved.
The disadvantages are computational overhead and high communication costs.

Perturbation Techniques

In summary, perturbation techniques are simple, efficient, and usually do not require knowledge of the data distribution. However, perturbed data may be vulnerable to probabilistic attacks, and it is difficult to mitigate such risk without reducing the data utility.

differential privacy

分为:global differential and local differential privacy

Additive perturbation

This technique is simple and can preserve the statistical properties [86]. However, it may degrade the data utility and may be vulnerable to a noise reduction.

Multiplicative perturbation

Compared with the additive perturbation, multiplicative perturbation is more effective, because reconstructing the original data from the perturbed data of the multiplicative perturbation is more difficult [50].

Anonymization Techniques

主要有 k-anonymity, l-diversity, and t-closeness。

Privacy-preserving Metrics

分为: (1) privacy metrics for measuring the loss of privacy of a dataset.
(2) utility metrics for measuring the data utility of the protected data for data analysis purposes.

4 联邦学习的隐私保护

主要的保护方法有:encryption-based, perturbation-based, anonymization-based, and hybrid PPFL.

本章先分析 risks, 再提出解决方案。

Privacy Risks in FL

从5个角度分析面临的威胁
在这里插入图片描述

1 攻击者是谁?

internal actors (participating clients and the central server) and external actors (model consumers and eavesdroppers).

2 被动or主动?

With FL, passive attackers only observe computations (e.g., the weights, gradients, and final model) during the training and inference phases [123, 228], whereas active attackers can influence the FL system by manipulating the model parameters to achieve adversarial goals [129, 165].

3 攻击发生阶段

Training phase and inference phase

4 什么形式的泄露

Weight update, gradient update, and the final model

model-parameter-based attacks leak much more sensitive information than the query-based attacks.

5 要窃取什么信息

Inference attacks, including inference of class representatives, memberships , properties of training data, and training samples and labels.
然后他把每一类攻击都讲了一遍。

防御措施

有点类似第三章,但是是针对 FL 场景下做的。

Homomorphic encryption-based PPFL

一般而言, each participant calculates its local gradient using its local dataset, and then encrypts the local gradient, and sends the encrypted gradient to the server for aggregation.
但是具体方法还是有不同的改进

However, the homomorphic encryption utilized in these studies brings about a large communication and
computational overhead.

Secret sharing–based PPFL

SMC-based PPFL

However, the secure aggregation protocol used in this approach incurs significant communication costs. A challenge of SMC-based FL methods is to improve the computational efficiency, because significant computational resources are required to complete a training round in FL frameworks [147].

Perturbation-based PPFL

(1) global differential privacy-based, (2) local differential privacy-based, (3) additive perturbation-based, and (4) multiplicative perturbation-based PPFL methods.
本文对上述四种方法都做了介绍。

Hybrid PPFL

混合了上述几种方法,提出的一个完整的框架。

  • 0
    点赞
  • 1
    收藏
    觉得还不错? 一键收藏
  • 打赏
    打赏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包

打赏作者

学渣渣渣渣渣

你的鼓励将是我创作的最大动力

¥1 ¥2 ¥4 ¥6 ¥10 ¥20
扫码支付:¥1
获取中
扫码支付

您的余额不足,请更换扫码支付或充值

打赏作者

实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值