word2vec comes in two flavors, CBOW and Skip-Gram. The derivations for how the parameters of both models are learned are explained in detail in "word2vec Parameter Learning Explained". While reading Section 1.1 (One-word context), I was puzzled by the derivation of Equation (8) and spent some time on it. The original text reads:
"Let us now derive the update equation of the weights between hidden and output layers. Take the derivative of $E$ with regard to the $j$-th unit's net input $u_j$, we obtain

$$\frac{\partial E}{\partial u_j} = y_j - t_j := e_j$$

where $t_j = \mathbb{1}(j = j^*)$, i.e., $t_j$ will only be 1 when the $j$-th unit is the output word, otherwise $t_j = 0$."
At first I did not understand how this step was reached, but the process turns out to be straightforward:
$$\begin{aligned} E &= \log\sum_{j'=1}^{V}\exp(u_{j'}) - u_{j^*} \\ e_j = \frac{\partial E}{\partial u_j} &= \frac{\exp(u_j)}{\sum_{j'=1}^{V}\exp(u_{j'})} - \frac{\partial u_{j^*}}{\partial u_j} \\ &= y_j - \mathbb{1}(j = j^*) \\ &= y_j - t_j \end{aligned}$$

The key step is the derivative of the second term: $\partial u_{j^*} / \partial u_j$ is 1 when $j = j^*$ and 0 otherwise, which is exactly the indicator $\mathbb{1}(j = j^*) = t_j$, while the first term is just the softmax output $y_j$.
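The gradient $e_j = y_j - t_j$ can be verified numerically. Below is a minimal sketch (not from the paper; the variable names and toy values are my own) that compares the analytic gradient against a finite-difference estimate of $E = \log\sum_{j'}\exp(u_{j'}) - u_{j^*}$:

```python
import math

def loss(u, j_star):
    # E = log(sum_{j'} exp(u_{j'})) - u_{j*}
    return math.log(sum(math.exp(x) for x in u)) - u[j_star]

u = [0.5, -1.2, 0.3, 2.0]  # toy net inputs (assumed values)
j_star = 3                 # index of the actual output word

# Analytic gradient: e_j = y_j - t_j, where y = softmax(u)
Z = sum(math.exp(x) for x in u)
analytic = [math.exp(x) / Z - (1.0 if j == j_star else 0.0)
            for j, x in enumerate(u)]

# Central finite-difference check of each component
eps = 1e-6
for j in range(len(u)):
    up, dn = u[:], u[:]
    up[j] += eps
    dn[j] -= eps
    numeric = (loss(up, j_star) - loss(dn, j_star)) / (2 * eps)
    assert abs(numeric - analytic[j]) < 1e-6
```

Since the softmax outputs sum to 1 and $t$ is one-hot, the components of $e$ also sum to 0, and $e_{j^*} = y_{j^*} - 1$ is always negative.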
Paper notes: derivation of the One-word context formula for the CBOW model in Section 1.1 of "word2vec Parameter Learning Explained"