Counterfactual Fairness 2021.3.12

Overview

This post summarizes a paper from the machine learning fairness literature that introduces counterfactual fairness: the prediction for an individual (or group) should be the same in the actual world and in a counterfactual world in which only the protected attribute differs. The authors use tools from causal inference to build a framework for modeling fairness.
The main contributions of the paper are:
a causal framework for modeling the relationship between protected attributes and the data, a description of how causal inference techniques can be used to design fair algorithms, and an argument that this causal view is essential for addressing fairness correctly.

Details

I. Background

Let A denote the set of protected attributes of an individual, X the other observable attributes of any particular individual, U the set of relevant latent attributes that are not observed, and Y the outcome to be predicted, which itself might be contaminated with historical biases; $\hat{Y}$ denotes the predictor.

1. Fairness definitions

(1) Fairness Through Unawareness (FTU)
An algorithm is fair so long as any protected attributes A are not explicitly used in the decision-making process.

(2) Individual Fairness (IF)
Given a metric d(·, ·), if individuals i and j are similar under this metric (i.e., d(i, j) is small), then their predictions should be similar: $\hat{Y}(X^{(i)}, A^{(i)}) \approx \hat{Y}(X^{(j)}, A^{(j)})$.

(3) Demographic Parity (DP)
$P(\hat{Y} \mid A = 0) = P(\hat{Y} \mid A = 1)$.

(4) Equality of Opportunity (EO)
$P(\hat{Y} = 1 \mid A = 0, Y = 1) = P(\hat{Y} = 1 \mid A = 1, Y = 1)$
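
These group-level criteria can be estimated directly from model outputs. Below is a minimal sketch (my own illustration, not from the paper) that estimates the demographic parity and equality of opportunity gaps for a binary predictor and a binary protected attribute:

```python
import numpy as np

def demographic_parity_gap(y_hat, a):
    """|P(Y_hat = 1 | A = 0) - P(Y_hat = 1 | A = 1)| for binary predictions."""
    return abs(y_hat[a == 0].mean() - y_hat[a == 1].mean())

def equal_opportunity_gap(y_hat, y, a):
    """|P(Y_hat = 1 | A = 0, Y = 1) - P(Y_hat = 1 | A = 1, Y = 1)|."""
    pos = (y == 1)
    return abs(y_hat[pos & (a == 0)].mean() - y_hat[pos & (a == 1)].mean())

# Toy usage with random binary data.
rng = np.random.default_rng(0)
a = rng.integers(0, 2, size=1000)       # protected attribute
y = rng.integers(0, 2, size=1000)       # true outcome
y_hat = rng.integers(0, 2, size=1000)   # predictions
print(demographic_parity_gap(y_hat, a), equal_opportunity_gap(y_hat, y, a))
```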

2. Causal graphs and counterfactuals

Endogenous variable
In a causal model or causal system, a variable that is determined or influenced by the other variables in the system is called an endogenous variable. A purely endogenous variable is, in principle, completely determined by the other variables in the system; such a variable is in effect redundant and could be replaced by those other variables. A semi-endogenous variable is influenced by the other variables in the system but not completely determined by them, because some factors outside the system also play a role. Both are endogenous variables; the distinction only matters for understanding differences in the causal relationships.
Exogenous variable
In a causal model or causal system, a variable that is independent of all other variables in the system, i.e. that changes in the other variables cannot affect, is called an exogenous variable. Exogenous variables may be determined by factors outside the system. Note that an exogenous variable must be independent of all other variables, so its opposites are both endogenous and semi-endogenous variables. Also, if you add to the model a variable that affects an exogenous variable, that variable is no longer exogenous.

Causally speaking, an exogenous variable is an independent determinant of the dependent variable; statistically speaking, an exogenous variable is linearly independent of the other explanatory variables.
As an example, consider a sugar factory, and suppose the system contains variables such as sugar output, weather, pests, and fuel prices. We can treat sugar output as a purely endogenous variable, fully determined by weather, pests, and fuel prices. Weather is a purely exogenous variable, since sugar output, pests, and fuel prices do not affect the weather. Pests are semi-endogenous: they are influenced by the weather to some extent, but not completely determined by it, since they also depend on external factors such as pesticides and natural predators.
Source: https://www.zhihu.com/question/56223861/answer/160750437

(1) We define a causal model as a triple (U, V, F) of sets:

U is a set of latent background variables, which are factors not caused by any variable in the set V of observable variables;
F is a set of functions $\{f_1, \ldots, f_n\}$, one for each $V_i \in V$, such that $V_i = f_i(pa_i, U_{pa_i})$, where $pa_i \subseteq V \setminus \{V_i\}$ and $U_{pa_i} \subseteq U$.

(2) Steps for computing a counterfactual:
Abduction: use the evidence W = w to determine the posterior distribution of U.
Action: modify the original model F into $F_Z$ by replacing the equations for the variables Z with Z = z.
Prediction: use the modified model $F_Z$ and the distribution P(U | W = w) computed in the first step to compute the implied distribution over the remaining elements of V.

A concrete worked example can be found at https://zhuanlan.zhihu.com/p/120909701.
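
As a small self-contained illustration (my own toy model, not taken from the paper or the linked post), the sketch below walks through the three steps on a deterministic linear model in which A and a latent U determine X, and U determines Y:

```python
# Toy structural causal model (my own example, not from the paper):
#   X = 2*A + U      (observed feature)
#   Y = 3*U + 1      (outcome)
# U is an unobserved background variable; A is the protected attribute.

def compute_counterfactual(x_obs, a_obs, a_cf):
    # 1. Abduction: infer U from the evidence (X = x_obs, A = a_obs).
    #    The model is deterministic given U, so U is recovered exactly.
    u = x_obs - 2 * a_obs
    # 2. Action: intervene, setting A <- a_cf (replace A's equation with a constant).
    a = a_cf
    # 3. Prediction: propagate U through the modified equations.
    x_cf = 2 * a + u
    y_cf = 3 * u + 1
    return x_cf, y_cf

# Observed individual: A = 1, X = 5  =>  U = 3.
print(compute_counterfactual(x_obs=5, a_obs=1, a_cf=0))   # (3, 10)
```

Because the model here is deterministic given U, abduction recovers U exactly; in a noisy model it would instead yield a posterior distribution over U.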

3. Counterfactual fairness

(1) Definition
Predictor $\hat{Y}$ is counterfactually fair if, under any context X = x and A = a,
$$P\left(\hat{Y}_{A \leftarrow a}(U)=y \mid X=x, A=a\right)=P\left(\hat{Y}_{A \leftarrow a^{\prime}}(U)=y \mid X=x, A=a\right)$$
for all y and for any value a' attainable by A.
Differences between $X_a$ and $X_{a'}$ must be caused by variations in A only.
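
The definition can be checked empirically for a given causal model and predictor: infer U from the evidence (X = x, A = a), evaluate $\hat{Y}$ under the interventions A ← a and A ← a', and compare the two distributions. The toy sketch below (an assumed noisy linear model, not from the paper) does this for a predictor that uses X directly, which turns out to be counterfactually unfair, and a predictor that uses only the inferred U, which is fair:

```python
import numpy as np

rng = np.random.default_rng(0)
SIGMA2 = 0.25   # variance of the noise term in X's equation

def sample_u_posterior(x, a, n=10_000):
    """Abduction: posterior of U given X = x, A = a under the toy model
    U ~ N(0,1), X = 2*A + U + eps, eps ~ N(0, SIGMA2)."""
    var = SIGMA2 / (SIGMA2 + 1.0)
    mean = (x - 2 * a) / (SIGMA2 + 1.0)
    return rng.normal(mean, np.sqrt(var), size=n)

def counterfactual_x(x, a, a_cf, u):
    """Action + prediction: recover eps from the evidence, set A <- a_cf,
    and recompute X."""
    eps = x - 2 * a - u
    return 2 * a_cf + u + eps

x_obs, a_obs = 1.5, 1
u = sample_u_posterior(x_obs, a_obs)

# Predictor 1: uses X directly -> its counterfactual distribution shifts with A.
yhat_factual = counterfactual_x(x_obs, a_obs, a_cf=1, u=u)
yhat_counter = counterfactual_x(x_obs, a_obs, a_cf=0, u=u)
print("X-based predictor shift:", yhat_counter.mean() - yhat_factual.mean())  # ~ -2

# Predictor 2: uses only U. U is not a descendant of A, so its value is the
# same under A <- 1 and A <- 0: the predictor is counterfactually fair.
print("U-based predictor shift:", (u - u).mean())                             # 0.0
```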

(2) Examples
Scenario 1: The Red Car.
U: aggressive driving
X: the red car feature
A: race
Y: accident rate
[Figure: causal graph for the Red Car example]
Using the red car feature X to predict accident rate Y would seem to be an unfair prediction.

Scenario 2: High Crime Regions.
U: the totality of socioeconomic factors and policing practices that both influence where an individual may live and how likely they are to be arrested and charged.
X: neighborhood (location)
A: race
Y: crime rates
[Figure: causal graph for the High Crime Regions example]
Higher observed arrest rates in some neighborhoods are due to greater policing there, not because people of different races are any more or less likely to break the law.

(3) Implications
[Figure: implications of the definition.] The key implication, stated as a lemma in the paper, is that $\hat{Y}$ is counterfactually fair if it is a function only of the non-descendants of A in the causal graph.

4. Implementing counterfactual fairness

(1) Algorithm
[Figure: the FairLearning algorithm — for each training point, sample the latent variables U from P(U | x, a), then fit $\hat{Y}$ on the sampled U together with the observable non-descendants of A.]
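
A minimal sketch of the training loop this algorithm implies (my own illustration; `sample_u_posterior` is a placeholder for posterior inference in the postulated causal model, and a linear regressor stands in for the generic predictor):

```python
import numpy as np
from sklearn.linear_model import LinearRegression

def fair_learning(X, A, Y, sample_u_posterior, m=10):
    """Sketch of the FairLearning loop.

    sample_u_posterior(x, a, m) is a placeholder: it should return m posterior
    draws of the latent variable U for one individual, e.g. from MCMC on the
    postulated causal model."""
    U_aug, Y_aug = [], []
    for x, a, y in zip(X, A, Y):
        u_draws = sample_u_posterior(x, a, m)   # abduction for this individual
        U_aug.extend(u_draws)                   # augment the training set
        Y_aug.extend([y] * m)                   # each draw keeps the same label
    # Fit the predictor on the latent (fair) inputs only; observable
    # non-descendants of A could be appended here as extra columns.
    U_aug = np.asarray(U_aug).reshape(-1, 1)    # assumes a scalar U
    return LinearRegression().fit(U_aug, Y_aug)
```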
(2) Designing the input causal model

The paper distinguishes three increasingly strong assumptions (levels) under which a counterfactually fair predictor can be constructed:

Level 1. Build $\hat{Y}$ using only the observable non-descendants of A.

Level 2. Postulate background latent variables that act as non-deterministic causes of the observable variables, based on explicit domain knowledge and learning algorithms. Information about X is passed to $\hat{Y}$ via P(U | x, a).

Level 3. Postulate a fully deterministic model with latent variables.

5. Examples

(1) The Prediction of Success in Law School

The Law School Admission Council conducted a survey across 163 law schools in the United States. It contains information on 21,790 law students, such as their entrance exam scores (LSAT), their grade-point average (GPA) collected prior to law school, and their first-year average grade (FYA).

Given this data, a school may wish to predict whether an applicant will have a high FYA, while also ensuring that these predictions are not biased by an individual's race and sex.

  1. Full: the standard technique of using all features, including sensitive features such as race and sex to make predictions;
  2. Unaware: fairness through unawareness, where we do not use race and sex as features.
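
For illustration (a sketch with toy data, not the authors' exact experimental code), these two baselines amount to fitting the same regression with and without the protected attributes as input columns:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n = 500
race, sex = rng.integers(0, 2, n), rng.integers(0, 2, n)   # protected attributes
gpa, lsat = rng.normal(3.0, 0.4, n), rng.normal(35, 5, n)  # observed scores (toy)
fya = rng.normal(0.0, 1.0, n)                              # first-year average (toy)

# "Full": all features, including the protected ones.
full = LinearRegression().fit(np.column_stack([gpa, lsat, race, sex]), fya)
# "Unaware" (FTU): drop race and sex, but keep features that may still depend on them.
unaware = LinearRegression().fit(np.column_stack([gpa, lsat]), fya)
```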

Level 1 uses any features which are not descendants of race and sex for prediction.

As we believe LSAT, GPA, and FYA are all biased by race and sex, we cannot use any observed features to construct a counterfactually fair predictor as described in Level 1.

Level 2 models latent ‘fair’ variables which are parents of observed variables.
These variables are independent of both race and sex.

We postulate that a latent variable, a student's knowledge (K), affects GPA, LSAT, and FYA scores. We perform inference on this model using the observed training set to estimate the posterior distribution of K.
[Figure: Level 2 model (Fair K) — latent knowledge K, together with race and sex, generates GPA, LSAT, and FYA.]
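
As an illustration of this level, the sketch below writes a latent-knowledge model along these lines in NumPyro (my own choice of library; the priors and parameter names are assumptions rather than the authors' exact specification):

```python
import jax.numpy as jnp
import numpyro
import numpyro.distributions as dist

def fair_k_model(race, sex, gpa=None, lsat=None, fya=None):
    """Level-2 'fair K' sketch: latent knowledge K plus race and sex
    generate GPA, LSAT and FYA (priors here are illustrative)."""
    w = {name: numpyro.sample(name, dist.Normal(0.0, 1.0))
         for name in ["b_g", "w_gk", "w_gr", "w_gs",
                      "b_l", "w_lk", "w_lr", "w_ls",
                      "w_fk", "w_fr", "w_fs"]}
    sigma_g = numpyro.sample("sigma_g", dist.HalfNormal(1.0))
    with numpyro.plate("students", race.shape[0]):
        k = numpyro.sample("k", dist.Normal(0.0, 1.0))   # latent knowledge
        numpyro.sample("gpa", dist.Normal(
            w["b_g"] + w["w_gk"] * k + w["w_gr"] * race + w["w_gs"] * sex,
            sigma_g), obs=gpa)
        numpyro.sample("lsat", dist.Poisson(jnp.exp(
            w["b_l"] + w["w_lk"] * k + w["w_lr"] * race + w["w_ls"] * sex)),
            obs=lsat)
        numpyro.sample("fya", dist.Normal(
            w["w_fk"] * k + w["w_fr"] * race + w["w_fs"] * sex, 1.0),
            obs=fya)
```

Running MCMC (e.g. NUTS) on this model yields posterior samples of k, which then play the role of the fair input when training the FYA predictor, as in the FairLearning sketch above.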

Level 3 models the data using an additive error model, and uses the independent error terms to make predictions.

We model GPA, LSAT, and FYA as continuous variables with additive error terms independent of race and sex (which may in turn be correlated with one another).
[Figure: Level 3 additive-error model for the law school example.]
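
A sketch of this level with toy data (my own illustration): regress GPA and LSAT on race and sex, keep the residuals as estimates of the independent error terms, and predict FYA from those residuals only:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
n = 500
protected = np.column_stack([rng.integers(0, 2, n), rng.integers(0, 2, n)])  # race, sex
gpa = rng.normal(3.0, 0.4, n)
lsat = rng.normal(35.0, 5.0, n)
fya = rng.normal(0.0, 1.0, n)

# Fit each observed score on the protected attributes and keep the residuals,
# which stand in for the additive error terms eps_GPA and eps_LSAT.
eps_gpa = gpa - LinearRegression().fit(protected, gpa).predict(protected)
eps_lsat = lsat - LinearRegression().fit(protected, lsat).predict(protected)

# Predict FYA from the error terms only; by construction they carry no
# (linear) information about race and sex.
fair_model = LinearRegression().fit(np.column_stack([eps_gpa, eps_lsat]), fya)
```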
