Detection of Shilling Attack Based on Bayesian Model and User Embedding (ICTAI’2018)
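The user embedding is fed directly into the classifier, so the embedding itself has to be enriched to carry more information.
The user embedding is trained with three losses.
Loss 1: Matrix factorization
The standard matrix factorization loss, obtained by factorizing the user-item rating matrix $R$: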
$$L=\sum_{u \in m, i \in n}\left(r_{u, i}-\widehat{r}_{u, i}\right)^{2}+\lambda\left(\sum_{u}\left\|p_{u}\right\|^{2}+\sum_{i}\left\|q_{i}\right\|^{2}\right)$$
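As a rough illustration of this loss, the sketch below evaluates it with NumPy. It rests on assumptions that the notes do not state: the prediction is taken to be the usual dot product $\widehat{r}_{u,i}=p_u^{\top}q_i$, the residual is squared as in standard matrix factorization, and only observed ratings (marked by a hypothetical `mask` argument) contribute to the error term. The function name `mf_loss` is likewise made up for this example.

```python
import numpy as np

def mf_loss(R, mask, P, Q, lam):
    """Regularized squared-error matrix factorization loss (illustrative sketch).

    R    : (m, n) user-item rating matrix
    mask : (m, n) boolean array, True where a rating is observed (assumption)
    P    : (m, k) user embeddings, row u is p_u
    Q    : (n, k) item embeddings, row i is q_i
    lam  : regularization weight lambda
    """
    R_hat = P @ Q.T                           # assumed prediction: r_hat[u, i] = p_u . q_i
    err = np.where(mask, R - R_hat, 0.0)      # unobserved entries contribute no error
    reg = np.sum(P ** 2) + np.sum(Q ** 2)     # sum_u ||p_u||^2 + sum_i ||q_i||^2
    return np.sum(err ** 2) + lam * reg

# Toy data: two users, two items, one observed rating per user.
R = np.array([[5.0, 0.0],
              [0.0, 3.0]])
mask = R > 0
P = np.eye(2)                                 # user embeddings
Q = np.array([[5.0, 0.0],
              [0.0, 3.0]])                    # item embeddings that reproduce R exactly

loss_exact = mf_loss(R, mask, P, Q, lam=0.1)               # only the penalty remains
loss_zero = mf_loss(R, mask, np.zeros((2, 2)), Q, lam=0.1)  # all-zero user embeddings
```

With embeddings that reproduce the observed ratings exactly, only the regularization term is left, which makes the trade-off controlled by $\lambda$ easy to see.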
Loss 2: User embedding
The rough idea is to build a user-user matrix $M$ and factorize it with the user embeddings.
Constructing the user-user matrix $M$
$M$ is the user-user SPPMI (Shifted Positive Pointwise Mutual Information) matrix, defined as:
$$\begin{gathered} PMI(u, v)=\log \frac{\#(u, v) \cdot|D|}{\# u \cdot \# v} \\ SPPMI(u, v)=\max \{PMI(u, v)-\log s, 0\} \end{gathered}$$
where $\#(u, v)$ denotes the number of items that have been jointly rated by both user $u$ and user $v$, $\#(u)=\sum_{v} \#(u, v)$
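The SPPMI construction can be sketched in NumPy as follows. This is a minimal illustration under several assumptions rather than the paper's implementation: the input is a hypothetical binary matrix `B` with `B[u, i] = 1` when user `u` rated item `i`, self pairs $(u, u)$ are excluded from the counts, $|D|$ is taken to be the sum of all co-rating counts, and `s` is the shift constant inside $\log s$. The function name `sppmi_matrix` is made up for this example.

```python
import numpy as np

def sppmi_matrix(B, s=1.0):
    """User-user SPPMI matrix from a binary rating indicator (illustrative sketch).

    B : (m, n) array, B[u, i] = 1 if user u rated item i, else 0
    s : shift constant; log(s) is subtracted from the PMI
    """
    B = np.asarray(B, dtype=float)
    C = B @ B.T                        # C[u, v] = #(u, v), items rated by both users
    np.fill_diagonal(C, 0.0)           # assumption: self pairs are not counted
    count = C.sum(axis=1)              # #(u) = sum_v #(u, v)
    D = C.sum()                        # assumption: |D| is the total co-rating count
    with np.errstate(divide="ignore", invalid="ignore"):
        pmi = np.log(C * D / np.outer(count, count))
    pmi[~np.isfinite(pmi)] = -np.inf   # pairs with no co-rated items clip to 0 below
    return np.maximum(pmi - np.log(s), 0.0)

# Toy data: users 0 and 1 rate the same two items, user 2 rates a different one.
B = np.array([[1, 1, 0],
              [1, 1, 0],
              [0, 0, 1]])
M = sppmi_matrix(B, s=1.0)
M_shifted = sppmi_matrix(B, s=2.0)
```

A larger `s` raises the threshold a pair must clear to get a non-zero entry, so the matrix becomes sparser and keeps only the strongest user-user associations.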