Hypothesis: \[{h_\theta }\left( x \right) = {\theta ^T}x = {\theta _0} + {\theta _1}{x_1} + {\theta _2}{x_2} + ... + {\theta _n}{x_n}\] (with the convention x_0 = 1, so that the inner product θ^T x includes the intercept θ_0)
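As a minimal NumPy sketch of the hypothesis (the helper name `h` is ours; x_0 = 1 is already prepended to the feature vector):

```python
import numpy as np

def h(theta, x):
    """Hypothesis h_theta(x) = theta^T x; x must already contain x_0 = 1."""
    return theta @ x

theta = np.array([1.0, 2.0, 3.0])   # theta_0, theta_1, theta_2
x = np.array([1.0, 4.0, 5.0])       # x_0 = 1, then x_1, x_2
print(h(theta, x))                  # 1 + 2*4 + 3*5 = 24.0
```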
Parameters: \[{\theta _0},{\theta _1},{\theta _2},...,{\theta _n}\]
These values can be collected into a single vector Θ.
Cost function: \[J\left( {{\theta _0},{\theta _1},{\theta _2},...,{\theta _n}} \right) = \frac{1}{{2m}}\sum\limits_{i = 1}^m {{{\left( {{h_\theta }\left( {{x^{\left( i \right)}}} \right) - {y^{\left( i \right)}}} \right)}^2}} \]
Written in terms of Θ, the cost function is: \[J\left( \Theta \right) = \frac{1}{{2m}}\sum\limits_{i = 1}^m {{{\left( {{h_\theta }\left( {{x^{\left( i \right)}}} \right) - {y^{\left( i \right)}}} \right)}^2}} \]
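The cost J(Θ) can be sketched in NumPy as follows (the function name `cost` is ours; the design matrix X is assumed to have a leading column of 1s for x_0):

```python
import numpy as np

def cost(theta, X, y):
    """J(Theta) = (1/(2m)) * sum over i of (h_theta(x_i) - y_i)^2."""
    m = len(y)
    residuals = X @ theta - y   # h_theta(x_i) - y_i for every example i
    return residuals @ residuals / (2 * m)

X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])  # first column is x_0 = 1
y = np.array([1.0, 2.0, 3.0])
print(cost(np.array([1.0, 1.0]), X, y))  # perfect fit: J = 0.0
```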
The gradient descent algorithm is:
repeat {
\[{\theta _j}: = {\theta _j} - \alpha \frac{\partial }{{\partial {\theta _j}}}J\left( {{\theta _0},{\theta _1},{\theta _2},...,{\theta _n}} \right)\] (simultaneously update for every j = 0,...,n)
In terms of Θ: \[{\theta _j}: = {\theta _j} - \alpha \frac{\partial }{{\partial {\theta _j}}}J\left( \Theta \right)\] (simultaneously update for every j = 0,...,n)
}
Now consider how to compute the partial derivative term: \[\frac{\partial }{{\partial {\theta _j}}}J\left( {{\theta _0},{\theta _1},{\theta _2},...,{\theta _n}} \right)\]
When n = 1, the algorithm is:
repeat {
\[{\theta _0}: = {\theta _0} - \alpha \underbrace {\frac{1}{m}\sum\limits_{i = 1}^m {\left( {{h_\theta }\left( {{x^{\left( i \right)}}} \right) - {y^{\left( i \right)}}} \right)} }_{\frac{\partial }{{\partial {\theta _0}}}J\left( \Theta \right)}\]
\[{\theta _1}: = {\theta _1} - \alpha \frac{1}{m}\sum\limits_{i = 1}^m {\left( {{h_\theta }\left( {{x^{\left( i \right)}}} \right) - {y^{\left( i \right)}}} \right)} {x^{\left( i \right)}}\]
(simultaneously update θ0, θ1)
}
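The n = 1 case above can be sketched in NumPy like this (the function name `gradient_descent_n1` and the default α and iteration count are our assumptions; note that both gradients are computed before either parameter is updated, which is what "simultaneously update" requires):

```python
import numpy as np

def gradient_descent_n1(x, y, alpha=0.1, iters=1000):
    """Batch gradient descent with a single feature x."""
    m = len(y)
    theta0, theta1 = 0.0, 0.0
    for _ in range(iters):
        err = (theta0 + theta1 * x) - y      # h_theta(x_i) - y_i
        # compute both partial derivatives first ...
        grad0 = err.sum() / m
        grad1 = (err * x).sum() / m
        # ... then update theta0 and theta1 simultaneously
        theta0 -= alpha * grad0
        theta1 -= alpha * grad1
    return theta0, theta1

# data generated from y = 1 + 2x, so the fit should recover theta0 ≈ 1, theta1 ≈ 2
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.0, 3.0, 5.0, 7.0])
print(gradient_descent_n1(x, y))
```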
When n ≥ 1, the algorithm is:
repeat {
\[{\theta _j}: = {\theta _j} - \alpha \frac{1}{m}\sum\limits_{i = 1}^m {\left( {{h_\theta }\left( {{x^{\left( i \right)}}} \right) - {y^{\left( i \right)}}} \right)} x_j^{\left( i \right)}\]
(simultaneously update θj for j = 0,..., n)
}
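The general update can be vectorized: since ∂J/∂θ_j = (1/m) Σ_i (h_θ(x^(i)) − y^(i)) x_j^(i), the whole gradient is (1/m) X^T(XΘ − y), and one vector assignment updates every θ_j simultaneously. A minimal sketch (the function name and the default α/iteration count are our assumptions; X is assumed to carry a leading column of 1s, and α must be chosen small enough for the data at hand):

```python
import numpy as np

def gradient_descent(X, y, alpha=0.1, iters=1500):
    """Batch gradient descent for any number of features n."""
    m, n_plus_1 = X.shape               # n_plus_1 = n + 1 (intercept column included)
    theta = np.zeros(n_plus_1)
    for _ in range(iters):
        grad = X.T @ (X @ theta - y) / m  # all partial derivatives at once
        theta = theta - alpha * grad      # simultaneous update of every theta_j
    return theta

# data generated from y = 3 + 2x, so the fit should recover Theta ≈ [3, 2]
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])
y = np.array([3.0, 5.0, 7.0, 9.0])
print(gradient_descent(X, y))
```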