It's been a while since I last wrote... After frantically catching up on the week 9 material I had fallen behind on, I'm back to writing weekly summaries. This week's lectures and assignment weren't hard, though I still find the concept of collaborative filtering somewhat confusing. Without further ado, there are two points in this week's assignment worth noting:
1. Gradient of the collaborative filtering cost function
It is actually very similar to linear regression (the plain linear regression cost is recapped just below for comparison), except that in collaborative filtering we learn the features $x^{(1)}, \ldots, x^{(n_m)}$ and the parameters $\theta^{(1)}, \ldots, \theta^{(n_u)}$ simultaneously; also, for convenience, the coefficient is simplified from the original $\frac{1}{2m}$ to $\frac{1}{2}$.
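For comparison, here is the regularized linear regression cost from the earlier weeks of the course (my own recap, not part of this assignment); the collaborative filtering cost below simply drops the $\frac{1}{m}$ factor and regularizes both sets of variables:

$$J(\theta) = \frac{1}{2m}\sum_{i=1}^{m}\left(h_\theta(x^{(i)}) - y^{(i)}\right)^2 + \frac{\lambda}{2m}\sum_{j=1}^{n}\theta_j^2$$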
The cost function has the form:
$$J(\theta^{(1)}, \ldots, \theta^{(n_u)}, x^{(1)}, \ldots, x^{(n_m)}) = \frac{1}{2}\sum_{i,j:\,r(i,j)=1}\left(X_{n_m\times n}(\Theta_{n_u\times n})^T - Y_{n_m\times n_u}\right)^2 + \frac{\lambda}{2}\sum_i (x^{(i)})^2 + \frac{\lambda}{2}\sum_j (\theta^{(j)})^2$$
Taking the partial derivative with respect to each parameter gives:
$$\frac{\partial J}{\partial \theta^{(j)}_k} = \sum_{i:\,r(i,j)=1} x^{(i)}_k\left((\theta^{(j)})^T x^{(i)} - y^{(i,j)}\right) + \lambda\,\theta_k^{(j)}$$
$$\frac{\partial J}{\partial x^{(i)}_k} = \sum_{j:\,r(i,j)=1} \theta^{(j)}_k\left((\theta^{(j)})^T x^{(i)} - y^{(i,j)}\right) + \lambda\, x_k^{(i)}$$
The Matlab implementation is as follows:
% X is n_m x n, Theta is n_u x n, Y and R are n_m x n_u
prediction = X * Theta';  % predicted ratings for every (movie, user) pair
J = 1/2 * sum(sum((prediction.*(R == 1) - Y.*(R == 1)).^2))...
+ lambda/2 * sum(X(:).^2)...
+ lambda/2 * sum(Theta(:).^2);
% Masking with (R == 1) keeps only the entries where a rating exists
Theta_grad = (prediction.*(R == 1) - Y.*(R == 1))' * X + lambda * Theta;  % n_u x n
X_grad = (prediction.*(R == 1) - Y.*(R == 1)) * Theta + lambda * X;       % n_m x n
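To convince myself that the vectorized form really sums only over the pairs with $r(i,j)=1$, here is a tiny self-contained check (the toy matrices below are made up by me, not taken from the exercise):

X      = [1 0; 0 1; 1 1];   % n_m x n: 3 movies, 2 features
Theta  = [1 1; 2 0];        % n_u x n: 2 users
Y      = [2 1; 1 0; 2 2];   % n_m x n_u ratings
R      = [1 1; 1 0; 0 1];   % r(i,j) = 1 where movie i was rated by user j
lambda = 0;
prediction = X * Theta';    % gives [1 2; 1 0; 2 2]
J = 1/2 * sum(sum(((prediction - Y) .* R).^2));
% The only nonzero masked errors are at (1,1) and (1,2), so J = 1/2 * (1 + 1) = 1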
2. Usage of var(x)
When estimating the parameters of the Gaussian distribution in ex8.m, a variance has to be computed. The assignment asks for a variance with the coefficient $\frac{1}{m}$, while var(x) uses $\frac{1}{m-1}$ by default.
var(x)    % or var(x, 0): the default, coefficient 1/(m-1)
var(x, 1) % coefficient 1/m
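As a minimal sketch of how this fits into the exercise (assuming X is the m x n data matrix from ex8, one example per row):

mu     = mean(X);    % 1 x n, per-feature means
sigma2 = var(X, 1);  % 1 x n, per-feature variances with the 1/m coefficient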