DIEN (AAAI’19)
这篇还是阿里的工作,进一步改进DIN,不仅要建模出用户行为的多兴趣,还要建模出时序关系。用两层GRU来建模多兴趣时序关系,引入了细粒度的辅助loss,在2018年算是工业界落地最复杂的模型了:
两层GRU分别是Interest Extractor Layer和Interest Evolving Layer。再介绍之前我们先回顾一下GRU是怎么更新的:
u t = σ ( W u i t + U u h t − 1 + b u ) r t = σ ( W r i t + U r h t − 1 + b r ) h ~ t = tanh ( W h i t + r t ∘ U h h t − 1 + b h ) h t = ( 1 − u t ) ∘ h t − 1 + u t ∘ h ~ t ; \begin{aligned} &\mathbf{u}_{t}=\sigma\left(W^{u} \mathbf{i}_{t}+U^{u} \mathbf{h}_{t-1}+\mathbf{b}^{u}\right) \\ &\mathbf{r}_{t}=\sigma\left(W^{r} \mathbf{i}_{t}+U^{r} \mathbf{h}_{t-1}+\mathbf{b}^{r}\right) \\ &\tilde{\mathbf{h}}_{t}=\tanh \left(W^{h} \mathbf{i}_{t}+\mathbf{r}_{t} \circ U^{h} \mathbf{h}_{t-1}+\mathbf{b}^{h}\right) \\ &\mathbf{h}_{t}=\left(\mathbf{1}-\mathbf{u}_{t}\right) \circ \mathbf{h}_{t-1}+\mathbf{u}_{t} \circ \tilde{\mathbf{h}}_{t ;} \end{aligned} ut=σ(Wuit+Uuht−1+bu)rt=σ(Writ+Urht−1+br)h