Retrofitting Analysis

To understand how the retrofitting [1] update rule follows from its objective, we work through the derivation below.

Forward Derivation

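\[ \begin{aligned} \psi(Q) &= \sum_{i=1}^{n}\left[ \alpha_i||q_i-\hat{q_i}||^2 + \sum\beta_{ij}||q_i-q_j||^2 \right] \\ \frac{\partial \psi(Q)}{\partial q_i} &= \alpha_i(q_i-\hat{q_i}) + \sum\beta_{ij}(q_i-q_j) = 0 \\ (\alpha_i+\sum\beta_{ij})q_i -\alpha_i\hat{q_i} -\sum\beta_{ij}q_j &= 0 \\ q_i &= \frac{\sum\beta_{ij}q_j+\alpha_i\hat{q_i}}{\sum\beta_{ij}+\alpha_i} \end{aligned} \]

Here every inner sum runs over the neighbors \(j\) of word \(i\), and the neighbors \(q_j\) are held fixed while differentiating. The constant factor 2 that comes from differentiating the squared norms is dropped, since it does not change where the derivative is zero.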
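
The closed-form update can be applied repeatedly, one vector at a time, until the vectors settle. The following is a minimal sketch of that loop, not the reference implementation from [1]. The function name `retrofit`, the dictionary-based data layout, the default of 10 iterations, and the weights \(\alpha_i = 1\) and \(\beta_{ij} = 1/\text{degree}(i)\) are all assumptions chosen for illustration.

```python
def retrofit(q_hat, neighbors, alpha=1.0, iterations=10):
    """Iteratively apply q_i = (sum_j beta_ij q_j + alpha q_hat_i) / (sum_j beta_ij + alpha).

    q_hat:     dict mapping word -> original vector (list of floats)
    neighbors: dict mapping word -> list of neighboring words
    """
    # Start from copies of the original vectors so q_hat is left untouched.
    q = {word: list(vec) for word, vec in q_hat.items()}
    for _ in range(iterations):
        for i, adjacent in neighbors.items():
            if not adjacent:
                continue  # no edges: q_i simply stays at q_hat_i
            beta = 1.0 / len(adjacent)  # assumed weighting: beta_ij = 1 / degree(i)
            dim = len(q_hat[i])
            # Numerator: alpha * q_hat_i + sum_j beta_ij * q_j
            numerator = [alpha * x for x in q_hat[i]]
            for j in adjacent:
                for d in range(dim):
                    numerator[d] += beta * q[j][d]
            # Denominator: alpha + sum_j beta_ij
            denominator = alpha + beta * len(adjacent)
            q[i] = [x / denominator for x in numerator]
    return q


# Toy example: "a" and "b" are neighbors, "c" has no edges.
q_hat = {"a": [0.0, 0.0], "b": [2.0, 0.0], "c": [0.0, 4.0]}
neighbors = {"a": ["b"], "b": ["a"], "c": []}
q = retrofit(q_hat, neighbors)
```

In this toy graph the two linked vectors are pulled toward each other while staying anchored to their original positions, and the isolated vector does not move. Updating `q` in place means each update already sees the newest values of its neighbors.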

Backward Derivation

This is how I first made sense of the update equation: start from it and work backward to the derivative condition.

The paper [1] states: "We take the first derivative of \(\psi\) with respect to one \(q_i\) vector, and by equating it to zero", which gives the condition:
\[ \frac{\partial\psi(Q)}{\partial q_i} = 0 \]

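Starting from the update rule, multiplying both sides by the denominator and collecting terms gives:

\[ \begin{aligned} q_i &= \frac{\sum\beta_{ij}q_j+\alpha_i\hat{q_i}}{\sum\beta_{ij}+\alpha_i} \\ \alpha_iq_i - \alpha_i\hat{q_i} + \sum\beta_{ij}q_i - \sum\beta_{ij}q_j &= 0 \\ \alpha_i(q_i-\hat{q_i})+ \sum\beta_{ij}(q_i-q_j) &= 0 \end{aligned} \]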

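The left-hand side is exactly the derivative of \(\psi\) with respect to \(q_i\) (up to a constant factor of 2), so the update rule is the solution of:
\[ \frac{\partial\psi(Q)}{\partial q_i} = \alpha_i(q_i-\hat{q_i})+ \sum\beta_{ij}(q_i-q_j) = 0 \]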
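
The backward derivation can also be checked numerically: plug the closed-form \(q_i\) into the derivative expression and confirm that it vanishes. The sketch below does this for one vector with two neighbors. The helper names `closed_form` and `gradient` and all numeric values are made up for illustration.

```python
def closed_form(q_hat_i, neighbor_vecs, alpha_i, betas):
    """q_i = (sum_j beta_ij q_j + alpha_i q_hat_i) / (sum_j beta_ij + alpha_i)."""
    numerator = [alpha_i * x for x in q_hat_i]
    for beta_ij, q_j in zip(betas, neighbor_vecs):
        for d in range(len(q_hat_i)):
            numerator[d] += beta_ij * q_j[d]
    denominator = alpha_i + sum(betas)
    return [x / denominator for x in numerator]


def gradient(q_i, q_hat_i, neighbor_vecs, alpha_i, betas):
    """alpha_i (q_i - q_hat_i) + sum_j beta_ij (q_i - q_j), constant factor 2 dropped."""
    grad = [alpha_i * (a - b) for a, b in zip(q_i, q_hat_i)]
    for beta_ij, q_j in zip(betas, neighbor_vecs):
        for d in range(len(q_i)):
            grad[d] += beta_ij * (q_i[d] - q_j[d])
    return grad


# Arbitrary illustrative values: one vector, two fixed neighbors.
q_hat_i = [1.0, -2.0]
neighbor_vecs = [[0.5, 0.0], [3.0, 1.0]]
alpha_i = 1.0
betas = [0.5, 0.25]

q_i = closed_form(q_hat_i, neighbor_vecs, alpha_i, betas)
grad = gradient(q_i, q_hat_i, neighbor_vecs, alpha_i, betas)
# Each component of grad should be zero up to floating-point rounding.
```

Moving `q_i` away from the closed-form value makes the gradient nonzero, which is a quick way to see that the update is the unique stationary point for fixed neighbors.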

Reference

[1] Faruqui M, Dodge J, Jauhar S K, et al. Retrofitting Word Vectors to Semantic Lexicons. NAACL-HLT, 2015.

Reposted from: https://www.cnblogs.com/fengyubo/p/11158923.html