These are my notes from studying the Coursera class Neural Networks & Deep Learning by Andrew Ng, section "Gradient descent on m training examples". I'm sharing them here in the hope that they help.
Last class, you saw how to compute derivatives and implement gradient descent with respect to just one training example for logistic regression. Now, we'll do it for m training examples.
To recap, the cost function of logistic regression averages the per-example loss over the whole training set:
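$$J(w, b) = \frac{1}{m}\sum_{i=1}^{m}\mathcal{L}\big(a^{(i)}, y^{(i)}\big) = -\frac{1}{m}\sum_{i=1}^{m}\Big[y^{(i)}\log a^{(i)} + \big(1 - y^{(i)}\big)\log\big(1 - a^{(i)}\big)\Big]$$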
And the prediction for the $i$-th example is the sigmoid applied to a linear function of the input features:
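$$a^{(i)} = \hat{y}^{(i)} = \sigma\big(z^{(i)}\big) = \sigma\big(w^{T} x^{(i)} + b\big), \qquad \text{where } \sigma(z) = \frac{1}{1 + e^{-z}}$$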
According to calculus, the derivative of the overall cost is just the average of the per-example derivatives computed in the last class:
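$$\frac{\partial}{\partial w_j} J(w, b) = \frac{1}{m}\sum_{i=1}^{m} x_j^{(i)}\, dz^{(i)}, \qquad \frac{\partial}{\partial b} J(w, b) = \frac{1}{m}\sum_{i=1}^{m} dz^{(i)}, \qquad \text{where } dz^{(i)} = a^{(i)} - y^{(i)}$$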
So, in a program, the above can be implemented as shown in figure-1. Below is a minimal Python sketch of that computation; it assumes NumPy arrays `X` of shape (m, n) and `y` of length m, a weight vector `w`, a bias `b`, and a learning rate `alpha` (the function and variable names here are illustrative, not from the course):
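```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gradient_descent_step(X, y, w, b, alpha):
    """One step of gradient descent over all m examples,
    written with explicit for loops as in figure-1."""
    m, n = X.shape
    J = 0.0
    dw = np.zeros(n)                    # gradient accumulators
    db = 0.0
    for i in range(m):                  # loop 1: over the m training examples
        z = np.dot(w, X[i]) + b         # z(i) = w^T x(i) + b
        a = sigmoid(z)                  # a(i) = sigma(z(i))
        J += -(y[i] * np.log(a) + (1 - y[i]) * np.log(1 - a))
        dz = a - y[i]                   # dz(i) = a(i) - y(i)
        for j in range(n):              # loop 2: over the n features
            dw[j] += X[i, j] * dz       # dw_j += x_j(i) * dz(i)
        db += dz
    J /= m                              # average the cost and gradients over m
    dw /= m
    db /= m
    w = w - alpha * dw                  # one gradient-descent update
    b = b - alpha * db
    return w, b, J
```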
Note that figure-1 implements just one step of gradient descent, so you have to repeat it multiple times in order to take multiple steps of gradient descent.
It turns out there are two weaknesses in this calculation: you need to write two explicit for loops. The first is over the m training examples; the second is over all the features. Having explicit for loops makes your code run less efficiently. In the deep learning era we keep moving to bigger and bigger data sets, so being able to implement your algorithms without explicit for loops is really important and helps you scale to much bigger data sets. The technique of vectorization allows you to get rid of these explicit for loops.
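As a preview (vectorization is covered properly in the next lectures), here is a minimal NumPy sketch of the same step with both loops replaced by matrix operations; the array shapes are the same as in the figure-1 sketch above:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gradient_descent_step_vectorized(X, y, w, b, alpha):
    """Same computation as figure-1, with no explicit for loops."""
    m = X.shape[0]
    z = X @ w + b                       # all m values of z at once
    a = sigmoid(z)                      # all m predictions at once
    J = -np.mean(y * np.log(a) + (1 - y) * np.log(1 - a))
    dz = a - y                          # vector of per-example dz values
    dw = X.T @ dz / m                   # replaces both for loops
    db = np.mean(dz)
    w = w - alpha * dw
    b = b - alpha * db
    return w, b, J
```

Both versions compute the same w, b, and J; the vectorized one simply hands the summations over examples and features to NumPy's optimized routines.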