Solving Non-negative Matrix Factorization in Matrix Form

0. Minimization with Constraints

$$
\begin{aligned}
&\min f(x) \\
&\text{s.t.} \\
&(1)\ g_i(x)\leq 0,\quad i=1,2,\cdots,m \\
&(2)\ h_i(x)=0,\quad i=1,2,\cdots,q
\end{aligned}
$$
Using the KKT conditions, the problem can be converted into an unconstrained minimization of the Lagrangian:
$$
y(x|\lambda_i, v_i) = f(x) + \sum_{i=1}^m\lambda_i g_i(x) + \sum_{i=1}^q v_i h_i(x)
$$
where $\lambda_i$ and $v_i$ are the Lagrange multipliers.
A local minimizer $x^*$ (under suitable regularity conditions) satisfies the KKT conditions:

  1. $g_i(x^*)\leq 0,\ i=1,2,\cdots,m$
  2. $h_i(x^*)=0,\ i=1,2,\cdots,q$
  3. $\lambda_i^*\geq 0,\ i=1,2,\cdots,m$
  4. $\lambda_i^* g_i(x^*)=0,\ i=1,2,\cdots,m$
  5. $\nabla y(x^*|\lambda_i^*, v_i^*)=0$
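The five conditions can be checked by hand on a one-dimensional toy problem. The sketch below is only an illustration and is not part of the original derivation: it assumes $f(x)=(x+1)^2$ with the single inequality constraint $g(x)=-x\leq 0$ and no equality constraints, and the helper name `kkt_satisfied` is hypothetical. For this problem the constrained minimizer is $x^*=0$ with multiplier $\lambda^*=2$.

```python
def f_grad(x):
    """Gradient of the assumed objective f(x) = (x + 1)^2."""
    return 2.0 * (x + 1.0)


def g(x):
    """Assumed inequality constraint g(x) = -x <= 0, i.e. x >= 0."""
    return -x


def g_grad(x):
    """Gradient of g(x) = -x."""
    return -1.0


def kkt_satisfied(x, lam, tol=1e-9):
    """Check the KKT conditions for the toy problem (no equality constraints)."""
    primal = g(x) <= tol                                   # condition 1
    dual = lam >= -tol                                     # condition 3
    slack = abs(lam * g(x)) <= tol                         # condition 4
    stationary = abs(f_grad(x) + lam * g_grad(x)) <= tol   # condition 5
    return primal and dual and slack and stationary


print(kkt_satisfied(0.0, 2.0))    # candidate (x*, lambda*) = (0, 2)
print(kkt_satisfied(-1.0, 0.0))   # unconstrained minimizer, which is infeasible
```

The second call fails because the unconstrained minimizer $x=-1$ violates $g(x)\leq 0$, which is exactly the situation the multiplier term is meant to handle.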

1. Definition of Non-negative Matrix Factorization

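Three matrices are involved:
(1) The data matrix V, of size $m\times n$, where m is the feature dimension of each sample and n is the number of samples.

(2) The basis matrix W, of size $m\times d$, where d is the dimension of the samples in the latent feature space, with $W\geq 0$.

(3) The coefficient matrix H, of size $d\times n$, with $H\geq 0$.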

The NMF objective function is:
$$
\begin{aligned}
&\min J = \|V - WH\|_F^2 \\
&\text{s.t.}\ W\geq 0,\ H\geq 0
\end{aligned}
$$
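As a concrete reading of the objective, the following NumPy sketch evaluates $J=\|V-WH\|_F^2$ for given factors. The function name `nmf_objective`, the matrix sizes, and the random test data are assumptions made for illustration only.

```python
import numpy as np


def nmf_objective(V, W, H):
    """Return J = ||V - W H||_F^2, the sum of squared residual entries."""
    R = V - W @ H
    return float(np.sum(R * R))


rng = np.random.default_rng(0)
m, n, d = 6, 5, 2            # feature dimension, sample count, latent dimension
W = rng.random((m, d))       # non-negative basis matrix
H = rng.random((d, n))       # non-negative coefficient matrix
V = W @ H                    # data that is exactly factorizable by (W, H)

# The residual vanishes here because V was built as W @ H.
print(nmf_objective(V, W, H))
```

On real data V is generally not exactly factorizable, so J stays positive and measures the reconstruction error.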

2. Solving the Non-negative Matrix Factorization

$$
\begin{aligned}
&J = \|V - WH\|_F^2 = \mathrm{tr}\left((V-WH)^T(V-WH)\right) \\
&\text{s.t.}\ W\geq 0,\ H\geq 0
\end{aligned}
$$
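Let $A = [A_{ij}]$ and $B = [B_{ij}]$ be the matrices of Lagrange multipliers for the constraints $-W_{ij}\leq 0$ and $-H_{ij}\leq 0$. Adding the constraint terms to the objective J gives: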
$$
\begin{aligned}
J &= \|V-WH\|_F^2 - \sum_{i=1}^m\sum_{j=1}^d A_{ij}W_{ij} - \sum_{i=1}^d\sum_{j=1}^n B_{ij}H_{ij} \\
&= \mathrm{tr}\left((V-WH)^T(V-WH)\right) - \mathrm{tr}(A^TW) - \mathrm{tr}(B^TH)
\end{aligned}
$$

2.1 Differentiating the Objective with Respect to W

Let:
$$
\begin{aligned}
J_1 &= \mathrm{tr}\left((V-WH)^T(V-WH)\right) = \mathrm{tr}(V^TV)-\mathrm{tr}(V^TWH)-\mathrm{tr}(H^TW^TV)+\mathrm{tr}(H^TW^TWH) \\
J_2 &= \mathrm{tr}(A^TW) \\
J_3 &= \mathrm{tr}(B^TH)
\end{aligned}
$$

2.1.1 Derivative with Respect to W

(1) Derivative of $J_1$ with respect to W
$$
\frac{\partial\, \mathrm{tr}(V^TWH)}{\partial W} = \frac{\partial\, \mathrm{tr}(HV^TW)}{\partial W} = (HV^T)^T = VH^T
$$
$$
\frac{\partial\, \mathrm{tr}(H^TW^TV)}{\partial W} = \frac{\partial\, \mathrm{tr}(VH^TW^T)}{\partial W} = VH^T
$$
$$
\frac{\partial\, \mathrm{tr}(H^TW^TWH)}{\partial W} = \frac{\partial\, \mathrm{tr}(WHH^TW^T)}{\partial W} = 2WHH^T
$$
Therefore, since $\mathrm{tr}(V^TV)$ does not depend on W,
$$
\frac{\partial J_1}{\partial W} = -2VH^T + 2WHH^T = -2(V-WH)H^T
$$
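The closed-form gradient of $J_1$ can be sanity-checked numerically. The sketch below compares $-2(V-WH)H^T$ against central finite differences on random matrices; the function names, matrix sizes, and step size `eps` are assumptions chosen for illustration.

```python
import numpy as np


def j1(V, W, H):
    """J_1 = tr((V - W H)^T (V - W H))."""
    R = V - W @ H
    return np.trace(R.T @ R)


def grad_j1_w(V, W, H):
    """Closed-form gradient from the derivation: -2 (V - W H) H^T."""
    return -2.0 * (V - W @ H) @ H.T


def numeric_grad_w(V, W, H, eps=1e-6):
    """Central finite-difference approximation of dJ_1/dW, entry by entry."""
    G = np.zeros_like(W)
    for i in range(W.shape[0]):
        for j in range(W.shape[1]):
            E = np.zeros_like(W)
            E[i, j] = eps
            G[i, j] = (j1(V, W + E, H) - j1(V, W - E, H)) / (2.0 * eps)
    return G


rng = np.random.default_rng(0)
V = rng.random((5, 4))
W = rng.random((5, 2))
H = rng.random((2, 4))

max_err = np.max(np.abs(grad_j1_w(V, W, H) - numeric_grad_w(V, W, H)))
print(max_err)  # should be tiny if the closed form is right
```

Because $J_1$ is quadratic in W, the central difference is exact up to rounding, so any sizeable discrepancy would point to an error in the formula.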
(2) Derivative of $J_2$ with respect to W
$$
\frac{\partial J_2}{\partial W} = \frac{\partial\, \mathrm{tr}(A^TW)}{\partial W} = A
$$
(3) Derivative of J with respect to W
Since $J = J_1 - J_2 - J_3$ and $J_3$ does not depend on W, (1) and (2) give
$$
\frac{\partial J}{\partial W} = -2(V-WH)H^T - A
$$
(4) Applying the KKT conditions
By complementary slackness, $A_{ij}(-W_{ij})=0$ and $B_{ij}(-H_{ij})=0$, so
$$
A \odot W = O, \quad B\odot H = O
$$
where $\odot$ denotes the element-wise (Hadamard) product.
(5) Deriving the update rule
Set $\frac{\partial J}{\partial W}=0$ and take the element-wise product of both sides with W:
$$
-2\left[(V-WH)H^T\right] \odot W - A \odot W = O
$$
Substituting (4) gives
$$
\left[(V-WH)H^T\right] \odot W = O
$$
that is, $(VH^T)_{ij}W_{ij} = (WHH^T)_{ij}W_{ij}$ for every entry. Treating this as a fixed-point equation yields the multiplicative update:
$$
\frac{(VH^T)_{ij}}{(WHH^T)_{ij}}W_{ij} \rightarrow W_{ij}
$$
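The update rule for W translates almost directly into NumPy, since both the product and the quotient are element-wise. This is a minimal sketch rather than a reference implementation: the function name `update_w` is hypothetical, and the small constant `eps` added to the denominator is an assumption that guards against division by zero and does not appear in the derivation.

```python
import numpy as np


def update_w(V, W, H, eps=1e-12):
    """One multiplicative step: W_ij <- W_ij * (V H^T)_ij / (W H H^T)_ij.

    eps is an assumed safeguard against a zero denominator.
    """
    return W * (V @ H.T) / (W @ H @ H.T + eps)


rng = np.random.default_rng(0)
V = rng.random((6, 5))
W = rng.random((6, 2))
H = rng.random((2, 5))

before = np.sum((V - W @ H) ** 2)
W_new = update_w(V, W, H)
after = np.sum((V - W_new @ H) ** 2)

print(before, after)            # the objective should not increase
print(bool(W_new.min() >= 0))   # non-negativity is preserved
```

Because every factor in the update is non-negative, W stays non-negative without any explicit projection step, provided it is initialized with non-negative entries.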

2.1.2 Derivative with Respect to H

Following the same steps as for W, the derivative of J with respect to H is:
$$
\frac{\partial J}{\partial H} = -2W^T(V-WH) - B
$$
Set $\frac{\partial J}{\partial H}=0$ and take the element-wise product of both sides with H:
$$
-2\left[W^T(V-WH)\right]\odot H - B\odot H = O
$$
Since $B\odot H = O$ by complementary slackness, this gives $\left[W^T(V-WH)\right]\odot H=O$.
Therefore:
$$
\frac{[W^TV]_{kj}}{[W^TWH]_{kj}}H_{kj} \rightarrow H_{kj}
$$
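Putting the two update rules together gives an alternating iteration. The sketch below is one possible arrangement under stated assumptions: the function name `nmf`, the random initialization, the fixed iteration count, the order of the two updates, and the `eps` safeguard in the denominators are all illustrative choices that the derivation does not prescribe.

```python
import numpy as np


def nmf(V, d, n_iter=200, eps=1e-12, seed=0):
    """Alternate the multiplicative updates for H and W.

    Returns the factors and the objective value after each iteration.
    """
    rng = np.random.default_rng(seed)
    m, n = V.shape
    W = rng.random((m, d)) + eps   # strictly positive start (assumption)
    H = rng.random((d, n)) + eps
    history = []
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)   # H_kj update
        W *= (V @ H.T) / (W @ H @ H.T + eps)   # W_ij update
        history.append(float(np.sum((V - W @ H) ** 2)))
    return W, H, history


rng = np.random.default_rng(1)
V = rng.random((8, 3)) @ rng.random((3, 10))   # non-negative data of rank at most 3
W, H, history = nmf(V, d=3)

print(history[0], history[-1])   # objective after the first and last iteration
```

The recorded `history` makes it easy to confirm that the objective does not increase from one iteration to the next, which is the behavior expected of these multiplicative updates.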
