2021 WSB, Day 4-1: Prof. Patel on Federated Learning for Biometrics Applications

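This post introduces the basic concepts of federated learning, including its motivation, key techniques such as FedAvg, and the challenges it faces. It also discusses how methods such as differential privacy protect user data, and presents applications in scenarios such as biometric authentication.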




Listen to many voices, gather their wisdom, and climb on the shoulders of giants.


The speaker is Prof. Patel from Johns Hopkins University.

Vishal M. Patel is an Assistant Professor in the Department of Electrical and Computer Engineering (ECE) at Johns Hopkins University.

Federated learning for biometrics application

This is a topic we do not come across often, so let's go through it today.

Agenda
Part 1

  • Motivation
  • Federated learning
    • FedAvg
    • SplitNN
  • Privacy-enhancing methods for federated learning

Part 2

  • Applications
    • Face anti-spoofing
    • Active authentication
  • Open problems

AlexNet vs LeNet

Why AlexNet worked:

  • Availability of large annotated data
  • More layers
    • Capture more invariances
  • More computing
    • Availability and affordability of GPUs
  • Better regularization
    • Dropout
  • New nonlinearities
    • Rectified Linear Unit (ReLU)
    • Parametric Rectified Linear Unit (PReLU)

The most important factor is still the dataset.

Getting data is not easy

Collecting and annotating datasets

  • Expensive
  • Labor intensive
  • User privacy issues
    • GDPR: General Data Protection Regulation
    • HIPAA: Health Insurance Portability and Accountability Act, 1996
    • SHIELD: Stop Hacks and Improve Electronic Data Security Act, Jan 1, 2019
    • PCI: Payment Card Industry Data Security Standard, 2004
    • IRB: Institutional Review Board

So we want to know how to protect user privacy.

Data privacy (protect the data)

  • Cancelable biometrics
    • Modify data through revocable and non-invertible transformations
  • BioHashing
    • Random projections are used to generate templates
  • Differential privacy
    • An algorithm is differentially private if its behavior hardly changes when a single individual joins or leaves the dataset
    • Hide unique samples (add noise to data)
  • Homomorphic encryption
    • Perform calculations on encrypted data
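To make the "random projections generate templates" idea concrete, here is a minimal sketch in the spirit of BioHashing. It is a simplification, not the published algorithm: it skips the orthonormalization of the random vectors, and the names `biohash`, `user_seed`, `n_bits`, and the feature values are all made up for illustration. The seed stands in for a revocable user token: if a template leaks, a new seed gives a new template from the same biometric.

```python
import random

def biohash(features, user_seed, n_bits=16):
    """Project a feature vector onto seeded random directions and binarize."""
    rng = random.Random(user_seed)  # the seed plays the role of a revocable token
    bits = []
    for _ in range(n_bits):
        # Draw one random direction per output bit.
        direction = [rng.gauss(0.0, 1.0) for _ in features]
        projection = sum(d * f for d, f in zip(direction, features))
        bits.append(1 if projection > 0 else 0)
    return bits

def hamming_distance(a, b):
    """Number of positions where two bit templates differ."""
    return sum(x != y for x, y in zip(a, b))

features = [0.8, -1.2, 0.3, 2.1, -0.5, 0.9]      # hypothetical biometric features
template = biohash(features, user_seed=1234)
same_again = biohash(features, user_seed=1234)   # same token: identical template
reissued = biohash(features, user_seed=9999)     # new token: template is regenerated
```

Matching would compare templates with `hamming_distance`, so only the binary template, never the raw feature vector, needs to be stored.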

Federated learning (build protection into the models)

  • Machine learning on decentralized data
  • Communication-efficient learning of deep networks from decentralized data, AISTATS 2017, McMahan et al. (Google)
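The core of FedAvg from the McMahan et al. paper is that each client trains locally and the server averages the resulting weights, weighted by how much data each client holds. The sketch below shows that loop on a toy one-parameter model y = w * x. The model, the data, the learning rate, and the function names are assumptions chosen to keep the example small; they are not from the talk.

```python
def local_sgd(w, data, lr=0.05, epochs=5):
    """Run a few epochs of SGD on one client's data for the model y = w * x."""
    for _ in range(epochs):
        for x, y in data:
            grad = 2 * (w * x - y) * x  # gradient of the squared error (w*x - y)^2
            w -= lr * grad
    return w

def fedavg_round(global_w, clients):
    """One round: every client trains locally, the server averages by data size."""
    total = sum(len(data) for data in clients)
    local_ws = [local_sgd(global_w, data) for data in clients]
    return sum(len(data) / total * w for data, w in zip(clients, local_ws))

# Two hypothetical clients whose private data both follow y = 2x.
clients = [
    [(1.0, 2.0), (2.0, 4.0)],
    [(0.5, 1.0), (1.5, 3.0), (3.0, 6.0)],
]

w = 0.0
for _ in range(20):
    w = fedavg_round(w, clients)  # only weights cross the network, never raw data
```

Running several local epochs per round is what makes the method communication-efficient: the clients exchange one weight update per round instead of one per gradient step.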

Federated learning


Federated Learning - Applications

  • Learning over smart phones
    • Mobile-based biometrics applications
    • Active authentication
  • Learning across organizations
    • Multi-institutional collaboration
  • Internet of things
    • Wearable devices, autonomous vehicles, smart homes, …

Federated Learning - Challenges

Communication

  • Federated networks consist of a massive number of devices, which makes communication in the network much slower than local computation (i.e. communication is expensive)
  • Need communication-efficient methods that iteratively send model updates as part of the training process

Systems heterogeneity

  • Storage, computational, and communication capabilities of each device in federated networks may differ due to variability in hardware (CPU, memory), network connectivity (3G, 4G, 5G, Wi-Fi), and power (battery level)
  • Stragglers and device failures are significantly more prevalent, so fault tolerance matters

Non-IID data

  • Devices frequently generate and collect data in a non-identically distributed manner across the network.
  • Unbalanced data
  • Increases the likelihood of stragglers, and may add complexity in terms of modeling, analysis, and evaluation

Privacy issues

An attacker can reconstruct a user's data from the model parameters.

FL with differential privacy

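The idea is simply to add noise.

When transmitting parameters, you threshold them, truncate them, and add noise. With that, an attacker is left with essentially nothing to work with.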
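The threshold-then-noise step can be sketched as follows: clip the update to a maximum L2 norm, then add Gaussian noise before sending it to the server. This is a minimal illustration, assuming L2 clipping and Gaussian noise; the function name and the default values of `clip_norm` and `noise_std` are hypothetical, and choosing `noise_std` to meet a specific privacy budget is not shown here.

```python
import math
import random

def privatize_update(update, clip_norm=1.0, noise_std=0.1, rng=random):
    """Clip an update to a maximum L2 norm, then add Gaussian noise."""
    norm = math.sqrt(sum(u * u for u in update))
    # Scale down only if the update is larger than the allowed norm.
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    clipped = [u * scale for u in update]
    # Independent zero-mean Gaussian noise on every coordinate.
    return [c + rng.gauss(0.0, noise_std) for c in clipped]

raw_update = [3.0, 4.0]                                     # L2 norm is 5
clipped_only = privatize_update(raw_update, noise_std=0.0)  # roughly [0.6, 0.8]
noisy = privatize_update(raw_update, rng=random.Random(0))  # what the client would send
```

Clipping bounds how much any single client can influence the aggregate, which is what lets the added noise mask an individual contribution.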
Three key properties

  • There is a tradeoff between convergence performance and privacy protection levels, i.e., better convergence performance leads to a lower protection level
  • Given a fixed privacy protection level, increasing the number N of overall clients participating in FL can improve the convergence performance
  • There is an optimal number of aggregation times (communication rounds) in terms of convergence performance for a given protection level


Tools


Applications

Face anti-spoofing: checking whether the face presented is a real person

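Active authentication

This means verifying the user continuously. While the user holds the phone, walks, takes photos, types, or touches the screen, each of these signals can be used for ongoing verification and authorization.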

One-class classification problem

The idea is to find a boundary around the data. This can be used to tackle some one-class classification problems.
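As a minimal sketch of "finding a boundary", the code below fits a sphere around the enrolled user's feature vectors and rejects anything outside it. This is a deliberately simple stand-in and not the method from the talk; the function names, the `quantile` parameter, and the feature values are all hypothetical.

```python
import math

def fit_one_class(samples, quantile=0.95):
    """Learn a centre and radius that enclose most of the enrolled user's samples."""
    dim = len(samples[0])
    centre = [sum(s[i] for s in samples) / len(samples) for i in range(dim)]
    dists = sorted(math.dist(s, centre) for s in samples)
    # Radius is the distance at the chosen quantile of the training samples.
    radius = dists[min(len(dists) - 1, int(quantile * len(dists)))]
    return centre, radius

def is_genuine(sample, centre, radius):
    """Accept a sample only if it falls inside the learned boundary."""
    return math.dist(sample, centre) <= radius

# Hypothetical 2-D behavioural features (e.g. from touch or typing) of one user.
enrolled = [(1.0, 1.0), (1.1, 0.9), (0.9, 1.1), (1.0, 1.2), (1.2, 1.0)]
centre, radius = fit_one_class(enrolled)

accepted = is_genuine((1.05, 1.0), centre, radius)  # close to the enrolled data
rejected = is_genuine((3.0, 3.0), centre, radius)   # far from the enrolled data
```

Only the genuine user's data is needed for training, which fits the active authentication setting where a phone never sees labeled impostor samples.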


Summary

Federated learning promises to be an active area of research

Open problems

  • Domain adaptive FL methods
  • Benchmarks
  • Unsupervised and semi-supervised FL
  • Privacy preserving FL methods
  • Novel FL models for biometrics and surveillance applications


Q&A

Q: On safety: what if a user keeps deliberately typing incorrectly on the phone?

  • Data from other users helps compensate
  • What we need is the average model, not the local model

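Q: How can we do FL research when we do not have that many devices?

  • Many people are working on this, and the same idea can be applied in different fields
  • You can try any problem, provided you have that much data
  • If you do not have enough data, split it up and treat the parts as different data centers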
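The "split it and treat the parts as different data centers" suggestion can be sketched as below. Two splits are shown: a shuffled IID split, and a label-sorted split that imitates the non-IID setting. The function names and the toy dataset are assumptions for illustration only.

```python
import math
import random

def split_iid(samples, n_clients, seed=0):
    """Shuffle, then deal samples out so every client sees a similar mix."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    return [shuffled[i::n_clients] for i in range(n_clients)]

def split_by_label(samples, n_clients):
    """Non-IID split: sort by label so each client sees only a few classes."""
    ordered = sorted(samples, key=lambda s: s[1])
    shard = math.ceil(len(ordered) / n_clients)
    return [ordered[i * shard:(i + 1) * shard] for i in range(n_clients)]

# Toy dataset of (sample_id, label) pairs with 4 classes.
samples = [(i, i % 4) for i in range(40)]

iid_clients = split_iid(samples, n_clients=4)
skewed_clients = split_by_label(samples, n_clients=4)
```

Each resulting list then plays the role of one simulated device or data center, so federated training can be studied on a single machine.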

Q: Hi, Vishal, great work! FedPAD is to average the weights, and is a linear solution. Since CNN is a non-linear model, do you have any non-linear solution to combine the parameters from different data centers?

  • When aggregating the parameters, the aggregation itself can also be non-linear; you can add a non-linearity.
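The notes do not record which non-linear aggregation the speaker had in mind. As one hypothetical example, a coordinate-wise median is a non-linear way to combine client weights, and it is less sensitive to a single outlying client than the plain average. The function names and weight values below are made up for illustration.

```python
from statistics import median

def average_aggregate(client_weights):
    """Linear aggregation: coordinate-wise mean of the client weight vectors."""
    return [sum(col) / len(col) for col in zip(*client_weights)]

def median_aggregate(client_weights):
    """Non-linear aggregation: coordinate-wise median of the client weight vectors."""
    return [median(col) for col in zip(*client_weights)]

# Four well-behaved clients and one outlier (hypothetical weight vectors).
client_weights = [
    [1.0, 2.0],
    [1.2, 2.2],
    [0.8, 1.8],
    [1.1, 2.1],
    [50.0, -40.0],
]

mean_model = average_aggregate(client_weights)   # pulled far away by the outlier
median_model = median_aggregate(client_weights)  # stays near the other clients
```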

Q: Thank you, professor. Are there any other privacy protection methods besides differential privacy?

  • Differential privacy is used because it comes with direct proofs
  • There are others, such as cancelable biometrics
  • Nobody has tried these yet, so you could

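Q: And how should we experiment when there are not so many mobile clients?

  • Yes, the more clients the better
  • If you do not have many, you can still try it and see what happens

Q: 100 phones versus only 1 phone?

  • Local data, local model; it is not affected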

Q: Hello professor. Thank you for your presentation. The data should be synchronized when they are uploaded to the server. Is there a particular strategy on servers for data synchronization and integrity in federated learning?

  • The data always stays local; the server only has the model parameters

Q: Thank you for the presentation. I want to know if an owner of the data also owns part of the copyright of the trained model (parameters) according to some laws such as GDPR.

  • GDPR: can you identify the people?
  • It is possible to recover user data, for example by reconstructing images
  • No, that is not the case