Coursera | Andrew Ng (01-week-2-2.12): More Vectorization Examples

This series only adds personal study notes and supplementary derivations on top of the original course; corrections and feedback are welcome. Having studied Andrew Ng's course, I organized it into text to make review and lookup easier. Since I have been studying English, the series is presented mainly in English, and I suggest readers focus on the English as well, so that it serves as preparation for reading academic papers in related fields later on. - ZJ

Coursera course | deeplearning.ai | NetEase Cloud Classroom


Please credit the author and source when reposting: ZJ, WeChat public account "SelfImprovementLab"

Zhihu: https://zhuanlan.zhihu.com/c_147249273

CSDN: http://blog.csdn.net/JUNJUN_ZHAO/article/details/78928763


More Vectorization Examples


In the previous video, you saw a few examples of how vectorization, by using built-in functions and avoiding explicit for-loops, allows you to speed up your code significantly. Let's look at a few more examples. The rule of thumb to keep in mind is: when you're programming your neural networks, or when you're implementing logistic regression, avoid explicit for-loops whenever possible. It's not always possible to never use a for-loop, but when you can use a built-in function or find some other way to compute what you need, you'll often go faster than with an explicit for-loop.

Let's look at another example. Suppose you want to compute a vector u as the product of a matrix A and another vector v. By the definition of matrix multiplication, u_i equals the sum over j of A_ij * v_j. The non-vectorized implementation would be to set u = np.zeros((n, 1)) and then loop over i and over j, accumulating u[i] += A[i][j] * v[j]. So that is two for-loops, over both i and j. The vectorized implementation is simply u = np.dot(A, v), which eliminates both for-loops and is going to be way faster.
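
To make the comparison concrete, here is a minimal sketch of both versions, assuming A is an n-by-n NumPy array and v an n-by-1 column vector (the sizes and names are illustrative):

```python
import numpy as np

n = 100
A = np.random.rand(n, n)     # illustrative matrix
v = np.random.rand(n, 1)     # illustrative column vector

# Non-vectorized version: explicit double for-loop over i and j
u = np.zeros((n, 1))
for i in range(n):
    for j in range(n):
        u[i] += A[i][j] * v[j]

# Vectorized version: a single call to np.dot eliminates both loops
u_vec = np.dot(A, v)

print(np.allclose(u, u_vec))  # True: both give the same result
```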


Let's go through one more example. Say you already have a vector v in memory and you want to apply the exponential operation to every element of v, so you want u to be the vector [e^(v_1), e^(v_2), ..., e^(v_n)]. A non-vectorized implementation would first initialize u to a vector of zeros and then use a for-loop to compute the elements one at a time. But it turns out that Python and NumPy have many built-in functions that let you compute such vectors with a single function call. So to implement this, you would import numpy as np and then just call u = np.exp(v).
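
A minimal sketch of the two implementations described above, with an illustrative input vector:

```python
import math
import numpy as np

v = np.random.rand(5)            # illustrative input vector

# Non-vectorized: initialize u to zeros and fill it one element at a time
u = np.zeros(v.shape)
for i in range(len(v)):
    u[i] = math.exp(v[i])

# Vectorized: a single built-in call applies exp to every element of v
u_vec = np.exp(v)

print(np.allclose(u, u_vec))     # True
```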


Notice that whereas previously you had an explicit for-loop, with just one line of code here, with v as the input vector and u as the output vector, you've gotten rid of the explicit for-loop, and the one-line implementation will be much faster than the one needing an explicit for-loop. In fact, the NumPy library has many such vector-valued functions: np.log(v) computes the element-wise log, np.abs(v) the element-wise absolute value, np.maximum(v, 0) the element-wise maximum of each element of v and 0, v**2 the element-wise square of v, 1/v the element-wise reciprocal, and so on. So whenever you're tempted to write a for-loop, take a look and see whether there's a NumPy built-in function that does it without the for-loop.
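
For reference, here is how a few of the built-in functions just mentioned look when applied to a small illustrative vector:

```python
import numpy as np

v = np.array([-2.0, 0.5, 1.0, 3.0])   # illustrative vector

np.abs(v)            # element-wise absolute value
np.maximum(v, 0)     # element-wise max of each element of v and 0
v ** 2               # element-wise square
1 / v                # element-wise reciprocal
np.log(np.abs(v))    # element-wise log (applied to |v| here so every input is positive)
```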

Now let's take all of these ideas and apply them to our logistic regression gradient descent implementation, and see if we can at least get rid of one of the two for-loops we had. Here is our code for computing the derivatives for logistic regression, and we had two for-loops: one over the training examples, and a second one over the features. In our example we had n_x = 2, but if you had more than two features you would need a for-loop over dw_1, dw_2, dw_3, and so on; effectively there is a "for j = 1 to n_x" loop in which each dw_j gets updated. We'd like to eliminate this second for-loop.

That's what we'll do on this slide. Instead of explicitly initializing dw_1, dw_2, and so on to zero, we get rid of those and make dw a vector: we set dw = np.zeros((n_x, 1)). Then, instead of the for-loop over the individual components, we use the vector-valued operation dw += x^(i) * dz^(i). And finally, instead of dividing each component separately, we just write dw /= m. So now we've gone from two for-loops down to one; we still have the one for-loop that iterates over the individual training examples.
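
Below is a sketch of the resulting single-loop version, assuming X is an (n_x, m) matrix with one training example per column, Y a (1, m) label vector, w an (n_x, 1) weight vector, and b a scalar; the function name and the locally defined sigmoid helper are illustrative:

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def gradients_one_loop(X, Y, w, b):
    """Logistic regression gradients with dw held as a vector,
    so only the loop over the m training examples remains."""
    n_x, m = X.shape
    dw = np.zeros((n_x, 1))                  # dw is now a single (n_x, 1) vector
    db = 0.0
    for i in range(m):
        x_i = X[:, i].reshape(n_x, 1)        # i-th training example as a column
        z_i = np.dot(w.T, x_i) + b           # (1, 1) linear output
        a_i = sigmoid(z_i)                   # (1, 1) activation
        dz_i = a_i - Y[0, i]                 # (1, 1) error term
        dw += x_i * dz_i                     # one vector op replaces the loop over features
        db += dz_i.item()
    dw /= m
    db /= m
    return dw, db
```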


I hope this video gave you a sense of vectorization: by getting rid of one for-loop, your code will already run faster. But it turns out we can do even better. The next video will talk about how to vectorize logistic regression even further, and you'll see a pretty surprising result: without using any for-loops, without needing a for-loop over the training examples, you can write code that processes the entire training set pretty much all at the same time. Let's see that in the next video.


Key Takeaways:

The linear output Z for all m training examples can be written in matrix form:

$Z = w^T X + b$

Z = np.dot(w.T, X) + b
A = sigmoid(Z)

Vectorizing the logistic regression gradient computation:

  • For m examples, dZ has dimension (1, m) and is given by:

    $dZ = A - Y$

  • db is given by:

    $db = \frac{1}{m}\sum_{i=1}^{m} dz^{(i)}$

    db = 1/m * np.sum(dZ)

  • dw is given by:

    $dw = \frac{1}{m} X\, dZ^{T}$

    dw = 1/m * np.dot(X, dZ.T)
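
Putting the summary together, here is a sketch of one fully vectorized gradient descent step under the same assumptions as above (X of shape (n_x, m), Y of shape (1, m), w of shape (n_x, 1), b a scalar); the function name and learning rate are illustrative:

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def gradient_step(X, Y, w, b, learning_rate=0.01):
    """One fully vectorized forward/backward pass and parameter update."""
    m = X.shape[1]
    Z = np.dot(w.T, X) + b        # (1, m): linear output for all m examples at once
    A = sigmoid(Z)                # (1, m): activations
    dZ = A - Y                    # (1, m): error terms
    dw = 1 / m * np.dot(X, dZ.T)  # (n_x, 1)
    db = 1 / m * np.sum(dZ)       # scalar
    w = w - learning_rate * dw    # gradient descent update (learning rate is illustrative)
    b = b - learning_rate * db
    return w, b
```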



