Deconfounded Recommendation for Alleviating Bias Amplification — Personal Notes


[Figure: causal graph over U (user), D (historical group distribution), M (group-level user representation), I (item), Y (prediction)]
In practice, the causal graph connects the user's interaction history D, the user U, the user's group-level representation M over item types I, the item I, and the prediction Y. It contains two backdoor paths:

U <- D -> M -> Y
M <- U -> Y

Since it is the embedding of U that we need to correct, the mediation through M (the second backdoor path) does not need to be handled. For the first backdoor path, we could block either D -> U or D -> M; however, M is computed from both U and D and its values are hard to estimate, so blocking D -> M is inconvenient. The simplest way to cut this path is therefore to block D -> U.

Notation:

$u=[u_1,...,u_K],\ u_k \in \mathbb{R}^H$: user representation (embeddings of the $K$ user features)
$x_u=[x_{u,1},...,x_{u,K}]$: user feature values
$d_u=[p_u(g_1),...,p_u(g_N)]$: the user's historical propensity toward each item group
$m=M(d,u)\in \mathbb{R}^H$: the user's group-level representation based on historical interactions
$\mathcal{H}_u$: the set of items the user has interacted with
$q^i=[q_{g_1}^i,...,q_{g_N}^i]\in \mathbb{R}^N$: the probabilities of item $i$ belonging to each group
$v=[v_1,...,v_N],\ v_n \in \mathbb{R}^H$: group embeddings

$$
\begin{aligned}
P&(Y|U=\mathbf{u},I=\mathbf{i}) \\
&=\frac{\sum_{\mathbf{d} \in \mathcal{D}} \sum_{\mathbf{m} \in \mathcal{M}}P(\mathbf{d})P(\mathbf{u}|\mathbf{d})P(\mathbf{m}|\mathbf{d},\mathbf{u})P(\mathbf{i})P(Y|\mathbf{u},\mathbf{i},\mathbf{m})}{P(\mathbf{u})P(\mathbf{i})} \\
&=\sum_{\mathbf{d} \in \mathcal{D}} \sum_{\mathbf{m} \in \mathcal{M}}P(\mathbf{d}|\mathbf{u})P(\mathbf{m}|\mathbf{d},\mathbf{u})P(Y|\mathbf{u},\mathbf{i},\mathbf{m})\\
&=\sum_{\mathbf{d} \in \mathcal{D}}P(\mathbf{d}|\mathbf{u})P(Y|\mathbf{u},\mathbf{i},M(\mathbf{d},\mathbf{u}))\\
&=P(\mathbf{d}_u|\mathbf{u})P(Y|\mathbf{u},\mathbf{i},M(\mathbf{d}_u,\mathbf{u}))\\
P&(Y|do(U=\mathbf{u}),I=\mathbf{i}) \\
&= \sum_{\mathbf{d} \in \mathcal{D}}P(\mathbf{d}|do(U=\mathbf{u}))P(Y|do(U=\mathbf{u}),\mathbf{i},M(\mathbf{d},do(U=\mathbf{u}))) \\
&= \sum_{\mathbf{d} \in \mathcal{D}}P(\mathbf{d})P(Y|do(U=\mathbf{u}),\mathbf{i},M(\mathbf{d},do(U=\mathbf{u})))\\
&= \sum_{\mathbf{d} \in \mathcal{D}}P(\mathbf{d})P(Y|\mathbf{u},\mathbf{i},M(\mathbf{d},\mathbf{u}))
\end{aligned}
$$
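A toy numeric sketch of why conditioning and intervening give different answers (all probabilities below are made-up; D is reduced to two discrete values):

```python
import numpy as np

# Toy discrete example: D takes only two values d0, d1.
P_d = np.array([0.5, 0.5])            # prior P(d) across all users
P_d_given_u = np.array([0.95, 0.05])  # P(d | u): near-deterministic for this user
P_y = np.array([0.9, 0.2])            # hypothetical P(Y=1 | u, i, M(d, u)) per d

# Conditioning weights each d by P(d | u), so the user's own history dominates.
p_cond = (P_d_given_u * P_y).sum()    # ≈ 0.865
# Backdoor adjustment (the do-operator) weights each d by the prior P(d),
# removing the confounding influence of the user's history on U.
p_do = (P_d * P_y).sum()              # ≈ 0.55
```

The interventional estimate pulls the prediction toward the population-level behavior over D, which is exactly what stops the historical distribution from amplifying itself.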
Since the space of D is infinite, the backdoor-adjusted formula above is further simplified: only the distributions D actually observed from interactions are considered, with $d_u$ estimated from the user's interaction history:
$$p_u(g_n)=\sum_{i \in I}p(g_n|i)p(i|u)=\frac{\sum_{i \in \mathcal{H}_u}q_{g_n}^i}{|\mathcal{H}_u|}$$
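A minimal sketch of estimating $d_u$ from the interaction history, assuming toy group-membership vectors $q^i$:

```python
import numpy as np

# Hypothetical group-membership vectors q^i for 3 items over N = 2 groups;
# each row sums to 1 (probability that item i belongs to group g_n).
q = np.array([
    [0.9, 0.1],   # item 0: mostly group g_1
    [0.8, 0.2],   # item 1
    [0.1, 0.9],   # item 2: mostly group g_2
])

# H_u: indices of the items this user interacted with.
H_u = [0, 1, 2]

# p_u(g_n) = (1 / |H_u|) * sum_{i in H_u} q_{g_n}^i
d_u = q[H_u].mean(axis=0)
print(d_u)  # ≈ [0.6 0.4]
```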
$$
\begin{aligned}
P&(Y|do(U=\mathbf{u}),I=\mathbf{i}) \\
&= \sum_{\mathbf{d} \in \mathcal{D}}P(\mathbf{d})P(Y|\mathbf{u},\mathbf{i},M(\mathbf{d},\mathbf{u})) \\
&\approx \sum_{\mathbf{d} \in \mathcal{D}}P(\mathbf{d})f(\mathbf{u},\mathbf{i},M(\mathbf{d},\mathbf{u})) \\
&= f\Big(\mathbf{u},\mathbf{i},M\Big(\sum_{\mathbf{d} \in \mathcal{D}}P(\mathbf{d})\mathbf{d},\mathbf{u}\Big)\Big) \\
&= f(\mathbf{u},\mathbf{i},M(\bar{\mathbf{d}},\mathbf{u}))
\end{aligned}
$$
$M(\bar{\mathbf{d}},\mathbf{u})$ can be computed with a factorization machine (FM):
$$
\begin{aligned}
M(\bar{\mathbf{d}},\mathbf{u}) &= \sum_{a=1}^N\sum_{b=1}^K p(g_a)v_a\odot x_{u,b}\mathbf{u}_b\\
&=\sum_{a=1}^{N+K}\sum_{b=1}^{N+K}w_a\mathbf{c}_a\odot w_b\mathbf{c}_b
\end{aligned}
$$
where
$$
\begin{aligned}
\mathbf{w}&=[\bar{\mathbf{d}},\mathbf{x}_u] \\
\mathbf{c}&=[\mathbf{v},\mathbf{u}]
\end{aligned}
$$
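A sketch of the cross-term part of this computation (toy shapes, random embeddings; the vectorized form is equivalent because the elementwise product distributes over the two sums):

```python
import numpy as np

rng = np.random.default_rng(0)
N, K, H = 2, 3, 4                 # N groups, K user features, embedding dim H

d_bar = np.array([0.6, 0.4])      # averaged group distribution \bar{d}
v = rng.normal(size=(N, H))       # group embeddings v_a
x_u = np.ones(K)                  # user feature values x_{u,b}
u = rng.normal(size=(K, H))       # user feature embeddings u_b

# M(d_bar, u) = sum_a sum_b p(g_a) v_a ⊙ x_{u,b} u_b
# (cross terms between group embeddings and user-feature embeddings).
M = np.zeros(H)
for a in range(N):
    for b in range(K):
        M += (d_bar[a] * v[a]) * (x_u[b] * u[b])

# Equivalent vectorized form: (Σ_a p(g_a) v_a) ⊙ (Σ_b x_{u,b} u_b).
M_vec = (d_bar @ v) * (x_u @ u)
print(np.allclose(M, M_vec))  # True
```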

At inference time, the user's interaction history is split into two periods by timestamp, and the symmetric KL divergence between the two group distributions quantifies how much the user's interests have drifted. The prediction of a vanilla recommender is then fused with that of the deconfounded model:
$$
\begin{aligned}
\eta_u&=\mathrm{KL}(d_u^1\|d_u^2)+\mathrm{KL}(d_u^2\|d_u^1) \\
&=\sum_{n=1}^N P_u^1(g_n)\log\frac{P_u^1(g_n)}{P_u^2(g_n)}+\sum_{n=1}^N P_u^2(g_n)\log\frac{P_u^2(g_n)}{P_u^1(g_n)}\\
Y_{u,i}&=(1-\hat{\eta}_u)\cdot Y_{u,i}^{RS}+\hat{\eta}_u\cdot Y_{u,i}^{DECRS}
\end{aligned}
$$
where $\hat{\eta}_u$ is the min-max-normalized drift, with $\alpha$ a hyperparameter controlling the fusion weight:
$$\hat{\eta}_u=\left(\frac{\eta_u-\eta_{min}}{\eta_{max}-\eta_{min}}\right)^\alpha$$
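A minimal sketch of this inference-time fusion (the `sym_kl` helper, the per-user distributions, and all scores below are hypothetical toy values):

```python
import numpy as np

def sym_kl(p, q, eps=1e-12):
    """Symmetric KL divergence between two group distributions."""
    p, q = p + eps, q + eps
    return np.sum(p * np.log(p / q)) + np.sum(q * np.log(q / p))

# Hypothetical per-user group distributions from two time periods.
d1 = np.array([[0.6, 0.4], [0.9, 0.1], [0.5, 0.5]])
d2 = np.array([[0.4, 0.6], [0.85, 0.15], [0.5, 0.5]])

eta = np.array([sym_kl(a, b) for a, b in zip(d1, d2)])

# Min-max normalize across users, then sharpen with hyperparameter alpha.
alpha = 2.0
eta_hat = ((eta - eta.min()) / (eta.max() - eta.min())) ** alpha

# Fuse vanilla (RS) and deconfounded (DecRS) predictions per user.
y_rs, y_decrs = 0.7, 0.9      # hypothetical scores for one (u, i) pair
y = (1 - eta_hat) * y_rs + eta_hat * y_decrs
```

Users whose interests drifted more (larger $\eta_u$) lean more on the deconfounded prediction; users with stable interests keep the vanilla recommender's score.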
