In logistic regression, suppose the training set consists of $m$ labeled examples:
\[\{(x^{(1)}, y^{(1)}), \dots, (x^{(m)}, y^{(m)})\}\]
and the hypothesis is the sigmoid function:
\[h_\theta(x) = \frac{1}{1 + e^{-\theta^T x}}\]
The loss function is:
\[J(\theta) = -\frac{1}{m}\sum_{i=1}^{m} \left[ y^{(i)} \log h_\theta(x^{(i)}) + (1 - y^{(i)}) \log\bigl(1 - h_\theta(x^{(i)})\bigr) \right]\]
The $j$-th component of the gradient of the loss with respect to the parameters is then:
\[\begin{aligned}
\nabla_{\theta_j} J(\theta) &= -\frac{1}{m}\sum_{i=1}^{m} \Bigl[ y^{(i)} \cdot \frac{1}{h_\theta(x^{(i)})} \cdot \bigl(-h_\theta^2(x^{(i)})\bigr) \cdot e^{-\theta^T x^{(i)}} \cdot \bigl(-x_j^{(i)}\bigr) \\
&\qquad + (1 - y^{(i)}) \cdot \frac{1}{1 - h_\theta(x^{(i)})} \cdot h_\theta^2(x^{(i)}) \cdot e^{-\theta^T x^{(i)}} \cdot \bigl(-x_j^{(i)}\bigr) \Bigr] \\
&= -\frac{1}{m}\sum_{i=1}^{m} \Bigl[ y^{(i)} \bigl(1 - h_\theta(x^{(i)})\bigr) - (1 - y^{(i)})\, h_\theta(x^{(i)}) \Bigr] x_j^{(i)} \\
&= \frac{1}{m}\sum_{i=1}^{m} \bigl( h_\theta(x^{(i)}) - y^{(i)} \bigr)\, x_j^{(i)}
\end{aligned}\]
where the second step uses the identity $h_\theta^2(x)\, e^{-\theta^T x} = h_\theta(x)\bigl(1 - h_\theta(x)\bigr)$.
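The derivation above can be sanity-checked numerically. Below is a minimal NumPy sketch (function and variable names are illustrative, not from the original text): it implements $J(\theta)$ and the closed-form gradient $\frac{1}{m}\sum_i (h_\theta(x^{(i)}) - y^{(i)})\, x_j^{(i)}$, then compares the analytic gradient against a central finite-difference approximation of the loss.

```python
import numpy as np

def sigmoid(z):
    # h_theta(x) = 1 / (1 + exp(-theta^T x))
    return 1.0 / (1.0 + np.exp(-z))

def loss(theta, X, y):
    # J(theta) = -(1/m) * sum[ y*log(h) + (1-y)*log(1-h) ]
    h = sigmoid(X @ theta)
    return -np.mean(y * np.log(h) + (1 - y) * np.log(1 - h))

def gradient(theta, X, y):
    # Final line of the derivation: grad_j = (1/m) * sum_i (h(x_i) - y_i) * x_ij,
    # written in vectorized form as X^T (h - y) / m.
    m = len(y)
    h = sigmoid(X @ theta)
    return X.T @ (h - y) / m

# Illustrative data (random, for the gradient check only)
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))
y = (rng.random(20) < 0.5).astype(float)
theta = rng.normal(size=3)

# Compare analytic gradient with central finite differences
g = gradient(theta, X, y)
eps = 1e-6
g_num = np.array([
    (loss(theta + eps * e, X, y) - loss(theta - eps * e, X, y)) / (2 * eps)
    for e in np.eye(3)
])
print(np.max(np.abs(g - g_num)))  # tiny if the derivation is correct
```

If the closed-form gradient is right, the printed discrepancy is on the order of floating-point noise; a sign or indexing error in the derivation would show up immediately as a large difference.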