Contents
Introduction
Background paper: Membership inference attacks against machine learning models
1. Membership inference attacks (MIA) can succeed on overfitted models with only black-box access to the model
2. The attack model can be constructed from labeled datasets generated by a set of shadow models (see the sketch after this list)
3. MIA is effective when the target models are overfitted to their training data
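A minimal sketch of the shadow-model construction, assuming the adversary holds data from the same distribution as the target's training set; the model choices, split sizes, and the single (rather than per-class) attack model are illustrative assumptions, not the paper's exact setup:

```python
# Sketch of the shadow-model membership inference attack.
# Each shadow model is trained on an "in" split; its prediction vectors on
# "in" vs. "out" records become the labeled training set for the attack model.
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=6000, n_features=20, random_state=0)

attack_X, attack_y = [], []
for i in range(5):  # a set of shadow models
    Xs, _, ys, _ = train_test_split(X, y, train_size=1000, random_state=i)
    X_in, X_out, y_in, y_out = train_test_split(Xs, ys, test_size=0.5, random_state=i)
    shadow = MLPClassifier(hidden_layer_sizes=(32,), max_iter=300, random_state=i).fit(X_in, y_in)
    # Label the shadow model's prediction vectors: 1 = member, 0 = non-member
    attack_X.append(shadow.predict_proba(X_in));  attack_y.append(np.ones(len(X_in)))
    attack_X.append(shadow.predict_proba(X_out)); attack_y.append(np.zeros(len(X_out)))

attack_model = RandomForestClassifier(random_state=0).fit(np.vstack(attack_X),
                                                          np.concatenate(attack_y))
# To attack: query the black-box target model on a record and classify the
# returned prediction vector with attack_model.
```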
Question: is it feasible to perform MIA on well-generalized models with only black-box access?
On ML privacy risk, the authors' lines of inquiry and their answers:
(1) Is overfitting a root cause of membership disclosure from a machine learning model?
Overfitting can be sufficient but is by no means necessary for exposing membership information from training data.
(2) Is generalization the right solution for membership disclosure?
Existing regularization approaches are insufficient to defeat MIA (this differs from the conclusion of the background paper)
(3) What is the fundamental cause of membership disclosure?
Such information leaks are caused by the unique influence a specific instance in the training set can have on the learning model.
Overfitting is essentially a special case of such unique influences, but the general situation is much more complicated.
To detect overfitting: look for an instance's positive impact on the model's accuracy on the training set and its limited or negative impact on the test set.
To detect unique influences in general: consider the case where an instance both contributes useful information for predicting other instances and brings in noise that uniquely characterizes itself.
Generalization methods that suppress overfitting may reduce the noise introduced by training instances, but they cannot completely remove their unique influences, particularly those essential to the model's predictive power.
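One way to make "unique influence" concrete is a leave-one-out comparison: retrain without the target instance and see how predictions shift on that instance versus everywhere else. A sketch of that idea (the model and the comparison are assumptions, not the paper's metric):

```python
# Sketch: quantify one instance's influence via leave-one-out retraining.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=500, n_features=10, random_state=1)
target_idx = 0  # instance whose influence we probe

full = LogisticRegression(max_iter=1000).fit(X, y)
mask = np.arange(len(X)) != target_idx
loo = LogisticRegression(max_iter=1000).fit(X[mask], y[mask])

# A large shift on the instance itself combined with a small shift elsewhere
# signals a *unique* influence: the instance mostly encoded itself.
self_shift = np.abs(full.predict_proba(X[[target_idx]])
                    - loo.predict_proba(X[[target_idx]])).max()
other_shift = np.abs(full.predict_proba(X[mask])
                     - loo.predict_proba(X[mask])).mean()
print(f"shift on the instance itself: {self_shift:.4f}, "
      f"mean shift elsewhere: {other_shift:.6f}")
```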
GMIA (Generalized MIA)
Overfitted models: the answers (prediction probabilities) to queries on training instances differ significantly from those on other queries.
Well-generalized models: behave similarly on training and test data, so shadow models cannot be used to generate a meaningful training set for the attack model.
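This contrast can be checked directly by comparing a model's confidence on members and non-members; when the two distributions coincide, shadow-model labels carry no signal. A small sketch, using an unconstrained vs. a depth-limited decision tree as stand-ins for overfitted vs. well-generalized models (my choice of models, not the paper's):

```python
# Sketch: member vs. non-member confidence gap on overfitted and
# generalized models of the same task.
from sklearn.tree import DecisionTreeClassifier
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, flip_y=0.1, random_state=2)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.5, random_state=2)

for name, depth in [("overfitted (no depth limit)", None), ("generalized (depth 3)", 3)]:
    model = DecisionTreeClassifier(max_depth=depth, random_state=2).fit(X_tr, y_tr)
    conf_in = model.predict_proba(X_tr).max(axis=1).mean()   # members
    conf_out = model.predict_proba(X_te).max(axis=1).mean()  # non-members
    print(f"{name}: member confidence {conf_in:.3f} vs non-member {conf_out:.3f}")
```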
GMIA: detecting and analyzing vulnerable target records (outliers) to infer their membership
- extract a high-level feature vector from the intermediate outputs of models trained on the data (accessible to the adversary) to estimate whether a given instance is an outlier (see the sketch at the end of this section)
Assumption: an outlier is more likely to be a vulnerable target record when it
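A sketch of the outlier-screening step described above, assuming the adversary trains a reference model on accessible data and uses its hidden-layer activations as high-level features; the architecture, cosine distance, and thresholds here are illustrative, not GMIA's actual parameters:

```python
# Sketch: screen for candidate vulnerable target records (outliers) using
# high-level features taken from a reference model's hidden layer.
import numpy as np
from sklearn.neural_network import MLPClassifier
from sklearn.datasets import make_classification
from sklearn.metrics.pairwise import cosine_distances

X, y = make_classification(n_samples=2000, n_features=20, random_state=3)
# Reference model trained on data accessible to the adversary
ref = MLPClassifier(hidden_layer_sizes=(16,), max_iter=300, random_state=3).fit(X, y)

def hidden_features(model, X):
    """First hidden layer activations (ReLU), used as high-level features."""
    return np.maximum(0, X @ model.coefs_[0] + model.intercepts_[0])

feats = hidden_features(ref, X)
dists = cosine_distances(feats)
np.fill_diagonal(dists, np.inf)  # ignore self-distances

# An instance with few close neighbors in feature space is treated as an
# outlier, hence a candidate vulnerable target record.
neighbor_counts = (dists < 0.05).sum(axis=1)
candidates = np.where(neighbor_counts <= 2)[0]
print(f"{len(candidates)} candidate vulnerable records out of {len(X)}")
```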