Causal Inference Theory: The Potential Outcomes Framework

1. Potential Outcomes Framework

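  A causal effect usually cannot be computed directly, because we can never observe the same unit both with and without the intervention. It is therefore typically inferred from observational data. Since correlation $\neq$ causation, the difference observed between groups does not in general equal the ATE. The Potential Outcomes Framework provides a theory for recovering causal effects from observed outcomes.
Definitions:
$X$: covariates
$T$: treatment indicator; $T=1$ is the treatment group, $T=0$ the control group
$Y$: observed outcome
$Y_0, Y_1$: potential outcomes, i.e. the outcome a unit would have under $T=0$ or under $T=1$
$E(Y_0), E(Y_1)$: means of the potential outcomes, i.e. the mean outcome if everyone received $T=0$ (or $T=1$)
ATE (average treatment effect):
$$\Delta = \mu_1 - \mu_0 = E(Y_1) - E(Y_0)$$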
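To make the definition concrete, here is a minimal sketch on a toy table in which both potential outcomes are known for every unit. The numbers and the helper name `average_treatment_effect` are hypothetical, chosen only for illustration; with real data one of the two columns is always missing.

```python
from statistics import mean

# Hypothetical toy data (not from the original post). Both potential outcomes
# are listed for every unit, which is never possible with real data.
y0 = [0, 0, 1, 1, 0, 1]  # outcome each unit would have under T=0
y1 = [1, 0, 1, 1, 1, 1]  # outcome each unit would have under T=1

def average_treatment_effect(y1, y0):
    """ATE = E(Y_1) - E(Y_0), averaged over all units."""
    return mean(y1) - mean(y0)

# Unit-level effects Y_1 - Y_0; their mean equals the ATE.
individual_effects = [a - b for a, b in zip(y1, y0)]
ate = average_treatment_effect(y1, y0)
print(ate, mean(individual_effects))
```

The two printed numbers agree because the expectation is linear: the mean of the unit-level differences is the difference of the means.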

2. Observational Studies

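  For any given unit we cannot observe the outcomes under both $T=1$ and $T=0$. The observed outcome $Y$ can be written as $Y = Y_1 T + Y_0 (1-T)$. In general the ATE cannot be read directly off observational data: because of confounders, the $T=1$ and $T=0$ groups are not directly comparable, so correlation $\neq$ causation. The correlation can be computed from the observed outcomes; the causal quantity is the ATE we want to estimate.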

Correlation: $E(Y \mid T=1) - E(Y \mid T=0)$
Causation: $E(Y_1) - E(Y_0)$

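2.1 Correlation $\neq$ causation: an example

  Suppose an observational dataset shows a strong correlation between sleeping with shoes on and waking up with a headache. Read as a causal claim, this clearly contradicts common sense:

$$E(Y \mid T=1) - E(Y \mid T=0) = P(\text{headache}=1 \mid \text{shoes on}=1) - P(\text{headache}=1 \mid \text{shoes on}=0)$$

  The correlation is in fact induced by a confounder: drinking. The accompanying figure shows that the share of drinkers differs greatly between the $T=1$ group (slept with shoes on) and the $T=0$ group. To obtain the causal effect (ATE) of sleeping with shoes on, we must remove the influence of drinking, i.e. make the share of drinkers the same in both groups. Only then are the two groups comparable.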
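The example can be reproduced with a small table of counts. The sketch below uses hypothetical numbers, constructed so that headaches depend only on drinking and never on shoes. The pooled comparison still shows a large gap, while the gap vanishes within each drinking stratum.

```python
# Hypothetical counts (not from the original post):
# (drink, shoes, n_people, n_headache)
cells = [
    (1, 1, 80, 64),  # drinkers, shoes on:      80% headache
    (1, 0, 20, 16),  # drinkers, shoes off:     80% headache
    (0, 1, 10, 1),   # non-drinkers, shoes on:  10% headache
    (0, 0, 90, 9),   # non-drinkers, shoes off: 10% headache
]

def headache_rate(cells, shoes, drink=None):
    """Headache rate for a shoes group, optionally within one drinking stratum."""
    chosen = [c for c in cells
              if c[1] == shoes and (drink is None or c[0] == drink)]
    return sum(c[3] for c in chosen) / sum(c[2] for c in chosen)

def drinker_share(cells, shoes):
    """Share of drinkers within a shoes group."""
    group = [c for c in cells if c[1] == shoes]
    return sum(c[2] for c in group if c[0] == 1) / sum(c[2] for c in group)

# Pooled comparison: E(Y | T=1) - E(Y | T=0)
naive_diff = headache_rate(cells, 1) - headache_rate(cells, 0)
# Comparison within each level of the confounder
diff_drinkers = headache_rate(cells, 1, drink=1) - headache_rate(cells, 0, drink=1)
diff_sober = headache_rate(cells, 1, drink=0) - headache_rate(cells, 0, drink=0)

print(naive_diff, diff_drinkers, diff_sober)
print(drinker_share(cells, 1), drinker_share(cells, 0))
```

The second line of output shows the imbalance that drives the spurious correlation: most of the shoes-on group are drinkers, while most of the shoes-off group are not.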

2.2 Correlation $\neq$ causation: proof

$\overline{Y}^{(1)}$ is the mean over all observed samples with $T=1$:
$$\overline{Y}^{(1)} = E(Y \mid T=1) = E(Y_1 T + Y_0(1-T) \mid T=1) = E(Y_1 \mid T=1) \tag{1}$$
However, in general $E(Y_1 \mid T=1) \neq E(Y_1)$, because $E(Y_1)$ is the mean potential outcome if all units received the intervention, not only those that actually did. Hence
$$\begin{aligned} E(Y_1 \mid T=1) - E(Y_0 \mid T=0) &= \overbrace{E(Y_1 - Y_0 \mid T=1)}^{\text{ATT}} + \overbrace{E(Y_0 \mid T=1) - E(Y_0 \mid T=0)}^{\text{bias}} \\ &\neq \Delta = E(Y_1) - E(Y_0) \end{aligned} \tag{2}$$
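The decomposition in equation (2) can be checked numerically. The sketch below uses a hypothetical table of eight units in which both potential outcomes are listed (an assumption that real data never satisfies) and verifies that the observed difference equals ATT plus bias, and differs from the ATE.

```python
from statistics import mean

# Hypothetical units (not from the original post): (y0, y1, t)
units = [
    (1, 1, 1), (1, 1, 1), (0, 1, 1), (1, 1, 1),  # treated
    (0, 1, 0), (0, 0, 0), (0, 1, 0), (1, 1, 0),  # control
]
treated = [u for u in units if u[2] == 1]
control = [u for u in units if u[2] == 0]

def observed(unit):
    """Observed outcome Y = Y_1 * T + Y_0 * (1 - T)."""
    y0, y1, t = unit
    return y1 * t + y0 * (1 - t)

# E(Y | T=1) - E(Y | T=0), computable from observed data alone
association = mean([observed(u) for u in treated]) - mean([observed(u) for u in control])
# ATT = E(Y_1 - Y_0 | T=1)
att = mean([y1 - y0 for y0, y1, _ in treated])
# bias = E(Y_0 | T=1) - E(Y_0 | T=0)
bias = mean([y0 for y0, _, _ in treated]) - mean([y0 for y0, _, _ in control])
# ATE = E(Y_1) - E(Y_0) over all units
ate = mean([y1 - y0 for y0, y1, _ in units])

print(association, att, bias, ate)
```

In this toy table the treated units have a higher baseline $Y_0$ than the control units, which is exactly the selection bias term in (2).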

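3. RCT (randomized controlled trial)

Compared with observational data, data from an RCT satisfy the following conditions, because randomization makes the treatment assignment independent of both the potential outcomes and the covariates:
$$(Y_0, Y_1) \perp T, \qquad X \perp T$$
$Y_1 \perp T$ means that the units observed with $T=0$, had they received the intervention, would have the same potential outcomes in distribution as the $T=1$ units. In other words, whether a unit was treated carries no information about its potential outcomes. Intuitively, because $T \perp X$, the $T=1$ and $T=0$ populations are comparable, so applying the intervention to either gives the same potential outcomes:
$$E(Y_1 \mid T=1) = E(Y_1 \mid T=0) = E(Y_1) \tag{3}$$
$E(Y_1 \mid T=0)$ is a counterfactual quantity: the potential outcome the untreated units would have had if they had been treated. By the consistency assumption (section 4.3), the potential outcome under $T=1$ equals the observed outcome for the treated units, i.e. $E(Y_1 \mid T=1) = \overline{Y}^{(1)}$.
Since (3) holds, and the same argument applies to $Y_0$,
$$\overline{Y}^{(1)} - \overline{Y}^{(0)} = E(Y_1) - E(Y_0) = \Delta$$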
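One way to see why randomization works is to enumerate every possible random assignment on a tiny population and average the difference in means. The sketch below does this for six hypothetical units with half assigned to treatment; the data and the function name `diff_in_means` are illustrative assumptions. Exact fractions are used so the equality is not blurred by rounding.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical potential outcomes for six units (not from the original post).
y0 = [0, 0, 1, 1, 0, 1]
y1 = [1, 0, 1, 1, 1, 1]
n = len(y0)

def diff_in_means(treated_idx):
    """Observed mean of treated minus observed mean of control for one assignment."""
    treated = set(treated_idx)
    control = [i for i in range(n) if i not in treated]
    mean_t = Fraction(sum(y1[i] for i in treated), len(treated))
    mean_c = Fraction(sum(y0[i] for i in control), len(control))
    return mean_t - mean_c

# Every way to assign exactly half of the units to treatment, all equally likely.
assignments = list(combinations(range(n), n // 2))
expected_estimate = sum(diff_in_means(a) for a in assignments) / len(assignments)
true_ate = Fraction(sum(y1), n) - Fraction(sum(y0), n)

print(expected_estimate, true_ate)
```

A single assignment can still miss the ATE (for instance `diff_in_means((0, 1, 2))` is 0 here); it is the average over assignments that matches, which is what unbiasedness means.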

4. Assumptions for an unbiased estimate of the ATE

4.1 Unconfoundedness

  • conditional ignorability

  • exchangeability

    • For RCT data, exchangeability $Y_1 \perp T$ (see equation 3) means that the treatment $T$ is independent of the potential outcomes $Y_t$;
    • For observational data, conditional exchangeability: $Y_1 \perp T \mid X$
    • Unconfoundedness is an untestable assumption

4.2 Positivity

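  • $0 < P(T=1 \mid X) < 1$. If, for some value $x$, all units are treated or all are control, the true ATE cannot be computed; the causal effect is then ill-defined (and some probabilistic derivations would involve division by zero)
  • By Bayes' rule, this assumption is equivalent to overlap between $P(X \mid T=1)$ and $P(X \mid T=0)$, so it is also called overlap
  • When some covariate values break this assumption it is called a positivity violation; in some cases the results can be extrapolated to those regions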
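A simple empirical check of positivity for a discrete covariate is to estimate $P(T=1 \mid X=x)$ per stratum and flag strata where it is 0 or 1. The sketch below assumes hypothetical records of the form `(x, t)`; the function names are illustrative. A sample can only reveal observed violations, it cannot prove that positivity holds in the population.

```python
from collections import defaultdict

# Hypothetical records (not from the original post): (covariate value, treatment)
records = [
    ("young", 1), ("young", 0), ("young", 1),
    ("old", 1), ("old", 1),
    ("mid", 0), ("mid", 1),
]

def propensity_by_stratum(records):
    """Empirical P(T=1 | X=x) for each observed value x."""
    counts = defaultdict(lambda: [0, 0])  # x -> [n_treated, n_total]
    for x, t in records:
        counts[x][0] += t
        counts[x][1] += 1
    return {x: treated / total for x, (treated, total) in counts.items()}

def positivity_violations(records):
    """Strata whose units are all treated or all control."""
    scores = propensity_by_stratum(records)
    return sorted(x for x, p in scores.items() if p <= 0 or p >= 1)

print(propensity_by_stratum(records))
print(positivity_violations(records))
```

Here the "old" stratum contains only treated units, so there is no control group to compare against within that stratum.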

4.3 Consistency

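  • $T=t \Rightarrow Y=Y_t$. This assumption is usually taken to hold by default: once confounders are accounted for, or in a designed experiment, applying the same $T$ always yields the same outcome. Example: let $T$ be whether a person owns a dog and $Y$ whether the person is happy. If owning a dog always gives $Y$ = happy, the assumption holds. If owning a golden retriever ($T=1$) gives $Y=1$ while owning a husky ($T=1$) gives $Y=0$, then $T$ is poorly defined (it lumps together different versions of the treatment) and the experiment should be redesigned.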

5. Adjustment

Adjustment by regression modeling
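If $X$ contains all confounders (a sufficient adjustment set), the data satisfy:
$$(Y_0, Y_1) \perp T \mid X$$
That is, conditional on $X$, the treatment $T$ is independent of the potential outcomes. Within a stratum where $X$ takes a fixed value, all units share the same $X$, so outcome differences between treated and control units are not caused by the confounders: conditioning on $X$ blocks the confounding path through $X$ ($T \leftarrow X \rightarrow Y$).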
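For a discrete $X$ this idea can be sketched without any regression model: compare treated and control units within each stratum of $X$, then average the stratum differences weighted by $P(X=x)$. Note that this is stratification (standardization), a simpler stand-in for the regression modeling named in the heading. The counts and function names below are hypothetical, and the data are built so that the effect is 0.1 in every stratum.

```python
# Hypothetical counts (not from the original post):
# (x, t, n_units, sum_of_outcomes), with binary outcomes
cells = [
    (1, 1, 80, 72),  # X=1, treated: mean outcome 0.9
    (1, 0, 20, 16),  # X=1, control: mean outcome 0.8
    (0, 1, 10, 2),   # X=0, treated: mean outcome 0.2
    (0, 0, 90, 9),   # X=0, control: mean outcome 0.1
]

def mean_outcome(cells, t, x=None):
    """E(Y | T=t), or E(Y | T=t, X=x) when x is given."""
    chosen = [c for c in cells if c[1] == t and (x is None or c[0] == x)]
    return sum(c[3] for c in chosen) / sum(c[2] for c in chosen)

def adjusted_ate(cells):
    """Sum over x of P(X=x) * [E(Y | T=1, X=x) - E(Y | T=0, X=x)]."""
    total = sum(c[2] for c in cells)
    ate = 0.0
    for x in sorted({c[0] for c in cells}):
        p_x = sum(c[2] for c in cells if c[0] == x) / total
        ate += p_x * (mean_outcome(cells, 1, x) - mean_outcome(cells, 0, x))
    return ate

naive = mean_outcome(cells, 1) - mean_outcome(cells, 0)
adjusted = adjusted_ate(cells)
print(naive, adjusted)
```

The naive difference is inflated because treated units are concentrated in the high-outcome stratum $X=1$; the adjusted estimate recovers the effect built into the table. The estimator requires positivity, since every stratum must contain both treated and control units.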

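Under these assumptions the causal effect (ATE) can be inferred from observational data.
[Figure: derivation of the formula for the ATE from observational data]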
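Since the original figure is not available, the following is a sketch of the standard derivation under the three assumptions of section 4. It may differ in detail from the steps the figure showed.

```latex
\begin{aligned}
E(Y_1) &= E_X\big[E(Y_1 \mid X)\big]
  && \text{(law of total expectation)} \\
&= E_X\big[E(Y_1 \mid T=1, X)\big]
  && \text{(conditional exchangeability; positivity makes the conditioning well-defined)} \\
&= E_X\big[E(Y \mid T=1, X)\big]
  && \text{(consistency)}
\end{aligned}
```

The same steps with $T=0$ give $E(Y_0) = E_X[E(Y \mid T=0, X)]$, so $\Delta = E_X\big[E(Y \mid T=1, X) - E(Y \mid T=0, X)\big]$, which involves only observable quantities.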

