The FM Algorithm
FM (Factorization Machine) builds on logistic regression by additionally learning parameters for feature-cross (interaction) terms. The model formula is:
$$\hat y = w_0 + \sum_{i=1}^{n} w_i x_i + \sum_{i=1}^{n-1}\sum_{j=i+1}^{n} \langle v_i, v_j \rangle\, x_i x_j$$
where $\langle v_i, v_j \rangle$ denotes the dot product of two vectors of size $k$, and $k$ is called the degree of the FM model:
$$v_i=(v_{i,1},v_{i,2},\dots,v_{i,k})$$

$$v_j=(v_{j,1},v_{j,2},\dots,v_{j,k})$$

$$\langle v_i,v_j \rangle=\sum_{f=1}^{k} v_{i,f}\cdot v_{j,f}$$
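As a concrete illustration, the prediction formula above can be sketched in NumPy. The weights `w0`, `w`, and the factor matrix `V` below are made-up example values, not learned parameters:

```python
import numpy as np

def fm_predict(x, w0, w, V):
    """FM prediction: bias + linear terms + pairwise interactions <v_i, v_j> * x_i * x_j."""
    n = len(x)
    linear = w0 + np.dot(w, x)
    interaction = 0.0
    for i in range(n - 1):
        for j in range(i + 1, n):
            # <v_i, v_j> is the dot product of the two k-dimensional factor vectors
            interaction += np.dot(V[i], V[j]) * x[i] * x[j]
    return linear + interaction

# Toy example: n = 3 features, degree k = 2
x = np.array([1.0, 2.0, 0.5])
w0 = 0.1
w = np.array([0.2, -0.1, 0.3])
V = np.array([[0.1, 0.2],
              [0.3, -0.1],
              [0.0, 0.5]])
print(fm_predict(x, w0, w, V))
```

This is the direct $O(kn^2)$ double loop matching the formula; real implementations typically use the well-known $O(kn)$ reformulation of the interaction term instead.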
A Variant of the Cross-Entropy Loss
The typical form of the cross-entropy loss is:
$$J(\theta)=-\sum_{i=1}^{m}\left[y^{(i)}\log\big(h_\theta(x^{(i)})\big)+(1-y^{(i)})\log\big(1-h_\theta(x^{(i)})\big)\right],\quad y\in\{0,1\}$$
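In code, this loss is a direct sum over the $m$ samples. A minimal sketch (the small `eps` clipping is an assumption for numerical stability, not part of the formula):

```python
import math

def cross_entropy(y_true, y_pred, eps=1e-12):
    """Cross-entropy loss for labels in {0, 1}, summed over all m samples."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1 - eps)  # clip predictions to avoid log(0)
        total += y * math.log(p) + (1 - y) * math.log(1 - p)
    return -total

print(cross_entropy([1, 0], [0.9, 0.2]))
```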
The sigmoid function has the property $1-\sigma(x)=\sigma(-x)$, which follows from:
$$\sigma(x)=\frac{1}{1+e^{-x}}=\frac{1}{\frac{e^x+1}{e^x}}=\frac{e^x}{1+e^x}$$

$$\sigma(x)+\sigma(-x)=\frac{e^x}{1+e^x}+\frac{1}{1+e^x}=1$$
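The derivation can be checked numerically with a few arbitrary inputs:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

# Verify sigma(x) + sigma(-x) == 1, i.e. 1 - sigma(x) == sigma(-x)
for x in [-3.0, -0.5, 0.0, 1.7, 4.2]:
    assert abs((1 - sigmoid(x)) - sigmoid(-x)) < 1e-12
```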
If the sample labels instead use +1 and -1 for positive and negative examples:

- When $y=+1$: $P(y=+1\mid x)=\sigma(\hat y)$
- When $y=-1$: $P(y=-1\mid x)=1-\sigma(\hat y)=\sigma(-\hat y)$
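Since the sign of the label simply flips the argument of the sigmoid, the two cases collapse into the single expression $P(y\mid x)=\sigma(y\hat y)$ for $y\in\{+1,-1\}$. A minimal check, with `y_hat` as a made-up model score:

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

y_hat = 0.8  # example model score (made-up value)

# y = +1: P(y|x) = sigma(y * y_hat) = sigma(y_hat)
# y = -1: P(y|x) = sigma(y * y_hat) = sigma(-y_hat)
p_pos = sigmoid(+1 * y_hat)
p_neg = sigmoid(-1 * y_hat)
assert abs(p_pos + p_neg - 1.0) < 1e-12  # the two cases are complementary
```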