【Paper Notes 08】Model inversion attacks that exploit confidence information and basic countermeasures (Model Inversion Attacks)

Series Navigation

My paper-notes channel

【Active Learning】
【Paper Notes 01】Learning Loss for Active Learning, CVPR 2019
【Paper Notes 02】Active Learning For Convolutional Neural Networks: A Core-Set Approach, ICLR 2018
【Paper Notes 03】Variational Adversarial Active Learning, ICCV 2019
【Paper Notes 04】Ranked Batch-Mode Active Learning, ICCV 2016

【Transfer Learning】
【Paper Notes 05】Active Transfer Learning, IEEE T CIRC SYST VID 2020
【Paper Notes 06】Domain-Adversarial Training of Neural Networks, JMLR 2016

【Differential Privacy】
【Paper Notes 07】A Survey on Differentially Private Machine Learning, IEEE CIM 2020

【Model inversion attack】
【Paper Notes 08】Model inversion attacks that exploit confidence information and basic countermeasures, SIGSAC 2015

Model inversion attacks that exploit confidence information and basic countermeasures

Link to the original paper

1 Abstract

Machine-learning (ML) algorithms are increasingly utilized in privacy-sensitive applications such as predicting lifestyle choices, making medical diagnoses, and facial recognition.

In the area of model inversion attacks, Fredrikson et al. recently presented a case study on linear classifiers for personalized medicine, showing that adversarial access to an ML model can be abused to learn individuals' sensitive genomic information.

This paper designs new model inversion attacks that exploit the confidence values a model reveals along with its predictions.

This paper studies two settings:

  • decision trees for lifestyle surveys as used on machine-learning-as-a-service systems
  • neural networks for facial recognition.

Their experiments show that:

  • The attacks can estimate whether a respondent in a lifestyle survey admitted to cheating on their significant other.
  • The attacks can recover recognizable images of people's faces given only their name and access to the ML model.

Beyond the attacks, the paper also explores basic countermeasures, showing that these MI (model inversion) attacks can be mitigated with negligible degradation to model utility.
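One countermeasure the paper discusses is degrading the precision of the confidence scores an API reports, so the attacker has less signal to exploit. A minimal sketch of this idea; the function name and rounding granularity below are my own illustration, not the paper's code:

```python
# Coarsen the confidence vector an ML API returns while leaving the
# predicted label untouched; coarser scores give an inversion attack
# less gradient-like signal to climb.

def round_confidences(confidences, digits=1):
    """Round each class confidence to a fixed number of decimal digits."""
    return [round(c, digits) for c in confidences]

print(round_confidences([0.8731, 0.1269]))  # -> [0.9, 0.1]
```

The label (argmax) is unchanged by the rounding, which is why model utility degrades only negligibly.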

2 Background

2.1 ML basics

In essence, machine learning learns a function from a feature space to a response space:

$f: \mathbb{R}^d \rightarrow Y$

where $d$ is the number of features. If $Y$ is a finite set, we call $f$ a classifier and the elements of $Y$ the class labels; if $Y = \mathbb{R}$, then $f$ is a regression model (or simply a regression).
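To make the two cases of $Y$ concrete, here is a toy pair of hand-written models (purely illustrative, not from the paper):

```python
# Toy instances of f : R^d -> Y with d = 2.

def classifier(x):
    """Y is the finite set {0, 1}, so this f is a classifier."""
    return 1 if x[0] + x[1] > 0 else 0

def regressor(x):
    """Y = R, so this f is a regression model."""
    return 0.5 * x[0] + 0.25 * x[1]

print(classifier([1.0, 2.0]))  # -> 1 (a class label from a finite Y)
print(regressor([1.0, 2.0]))   # -> 1.0 (a real-valued response)
```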

2.2 ML APIs

Setting
Systems that incorporate models $f$ will do so via well-defined application programming interfaces (APIs).

The recent trend towards ML-as-a-service systems exemplifies this model, where users upload their training data and subsequently use such an API to query a model trained by the service.

These APIs are generally exposed over HTTP(S); Microsoft and Google both offer such services.
Depending on the level of access an API grants, two settings arise: black-box and white-box.

2.3 Threat models

black-box
The adversary can only obtain responses online by querying the API.

white-box
The adversary obtains a description of $f$ and can run the model locally.
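The two threat models can be caricatured as follows; the response format and weights here are hypothetical, not from any real service:

```python
# Black-box: the adversary only sees API responses, one query at a time.
# A real ML-as-a-service call would go over HTTP(S); this stub stands in.
def query_black_box(x):
    return {"label": 1, "confidences": [0.3, 0.7]}

# White-box: the adversary holds a description of f (here, its weights)
# and can evaluate it locally as often as desired.
weights = [0.5, -1.0, 0.25]

def f_local(x):
    score = sum(w * xi for w, xi in zip(weights, x))
    return 1 if score > 0 else 0

print(query_black_box([1.0, 0.0, 0.0])["label"])  # -> 1
print(f_local([1.0, 0.0, 0.0]))                   # -> 1
```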

3 The Fredrikson et al. attack

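In outline, the Fredrikson et al. attack enumerates every candidate value of the sensitive feature, checks each candidate against the model's observed output, and weights the survivors by a marginal prior over the sensitive feature. A hedged sketch under those assumptions; the function and variable names are mine, and the paper's actual algorithm also handles noisy, probabilistic error models rather than the exact match used here:

```python
# Sketch of the inversion idea: given black-box access to f, the known
# non-sensitive features x2..xd, the true response y, and a marginal
# prior over the sensitive feature x1, pick the candidate value of x1
# most consistent with what the model outputs.

def invert(f, known_features, y, candidates, prior):
    best, best_score = None, -1.0
    for v in candidates:
        # Exact-match consistency check (a simplification of the
        # paper's error model).
        match = 1.0 if f([v] + known_features) == y else 0.0
        score = match * prior[v]
        if score > best_score:
            best, best_score = v, score
    return best

# Toy model: predicts 1 iff the (sensitive) first feature equals 1.
toy_f = lambda x: 1 if x[0] == 1 else 0
print(invert(toy_f, [0.2, 0.4], 1, [0, 1], {0: 0.6, 1: 0.4}))  # -> 1
```

Enumeration is feasible here because the sensitive feature (e.g., a genetic marker) takes few values; the prior breaks ties among candidates the model cannot distinguish.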

4 MAP inverters for trees

This section presents the inversion attack on decision-tree models.

4.1 Decision Tree

Decision trees come in two flavors:

  • Regression, where the class variable is continuous
  • Classification, where the class variable is discrete

A decision tree model recursively partitions the feature space into disjoint regions $R_1, \ldots, R_m$. Predictions are made for an instance $(x, y)$ by finding the region containing $x$, and returning the most likely value for $y$ observed in the training data within that region.[1]

Basic definition of a decision tree:

$f(x) = \sum_{i=1}^m w_i \phi_i(x), \quad \phi_i(x) \in \{0,1\}$

where $\phi_i(x)$ is the indicator that $x$ falls in region $R_i$, and $w_i$ is the value predicted for that region.
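A minimal sketch of this weighted-indicator form; the 1-D intervals and weights are hand-picked for illustration, not from the paper:

```python
# f(x) = sum_i w_i * phi_i(x) over disjoint regions R_1..R_3 of the
# real line; exactly one indicator phi_i fires for any x.

regions = [(-float("inf"), 0.0), (0.0, 5.0), (5.0, float("inf"))]
weights = [0, 1, 2]  # w_i: most likely training label within R_i

def phi(i, x):
    lo, hi = regions[i]
    return 1 if lo <= x < hi else 0  # indicator of region R_i

def f(x):
    return sum(w * phi(i, x) for i, w in enumerate(weights))

print(f(3.0))  # x falls in R_2, so f returns w_2 = 1
```

Because the regions are disjoint and cover the space, the sum always collapses to the single weight of the region containing $x$.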
