Coursera | Andrew Ng (01-week-2-2.12): More Vectorization Examples

This series only adds personal study notes and supplementary derivations on top of the original course; corrections and feedback are welcome. Having studied Andrew Ng's course, I organized it into text to make review and lookup easier. Since I have been studying English, the series is presented mainly in English, and I suggest readers focus on the English as well, so that it serves as preparation for reading academic papers in related fields later on. - ZJ

Coursera course | deeplearning.ai | NetEase Cloud Classroom


Please credit the author and source when reposting: ZJ, WeChat public account "SelfImprovementLab"

Zhihu: https://zhuanlan.zhihu.com/c_147249273

CSDN: http://blog.csdn.net/JUNJUN_ZHAO/article/details/78928763


More Vectorization Examples


In the previous video, you saw a few examples of how vectorization, by using built-in functions and avoiding explicit for-loops, allows you to speed up your code significantly. Let's look at a few more examples. The rule of thumb to keep in mind is: when you're programming your neural networks, or when you're implementing logistic regression, avoid explicit for-loops whenever possible. It's not always possible to never use a for-loop, but when you can use a built-in function or find some other way to compute what you need, you'll often go faster than with an explicit for-loop.

Let's look at another example. Suppose you want to compute a vector u as the product of a matrix A and another vector v. By the definition of matrix multiplication, u_i equals the sum over j of A_ij * v_j. The non-vectorized implementation would be to set u = np.zeros((n, 1)) and then loop over i and over j, accumulating u[i] += A[i][j] * v[j]. So that is two for-loops, over both i and j. The vectorized implementation is simply u = np.dot(A, v), which eliminates both for-loops and is going to be way faster.
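
To make the comparison concrete, here is a minimal sketch of both versions, assuming A is an n-by-n NumPy array and v an n-by-1 column vector (the sizes and names are illustrative):

```python
import numpy as np

n = 100
A = np.random.rand(n, n)     # illustrative matrix
v = np.random.rand(n, 1)     # illustrative column vector

# Non-vectorized version: explicit double for-loop over i and j
u = np.zeros((n, 1))
for i in range(n):
    for j in range(n):
        u[i] += A[i][j] * v[j]

# Vectorized version: a single call to np.dot eliminates both loops
u_vec = np.dot(A, v)

print(np.allclose(u, u_vec))  # True: both give the same result
```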


Let's go through one more example. Say you already have a vector v in memory and you want to apply the exponential operation to every element of v, so you want u to be the vector [e^(v_1), e^(v_2), ..., e^(v_n)]. A non-vectorized implementation would first initialize u to a vector of zeros and then use a for-loop to compute the elements one at a time. But it turns out that Python and NumPy have many built-in functions that let you compute such vectors with a single function call. So to implement this, you would import numpy as np and then just call u = np.exp(v).
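
A minimal sketch of the two implementations described above, with an illustrative input vector:

```python
import math
import numpy as np

v = np.random.rand(5)            # illustrative input vector

# Non-vectorized: initialize u to zeros and fill it one element at a time
u = np.zeros(v.shape)
for i in range(len(v)):
    u[i] = math.exp(v[i])

# Vectorized: a single built-in call applies exp to every element of v
u_vec = np.exp(v)

print(np.allclose(u, u_vec))     # True
```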


Notice that whereas previously you had an explicit for-loop, with just one line of code here, with v as the input vector and u as the output vector, you've gotten rid of the explicit for-loop, and the one-line implementation will be much faster than the one needing an explicit for-loop. In fact, the NumPy library has many such vector-valued functions: np.log(v) computes the element-wise log, np.abs(v) the element-wise absolute value, np.maximum(v, 0) the element-wise maximum of each element of v and 0, v**2 the element-wise square of v, 1/v the element-wise reciprocal, and so on. So whenever you're tempted to write a for-loop, take a look and see whether there's a NumPy built-in function that does it without the for-loop.
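
For reference, here is how a few of the built-in functions just mentioned look when applied to a small illustrative vector:

```python
import numpy as np

v = np.array([-2.0, 0.5, 1.0, 3.0])   # illustrative vector

np.abs(v)            # element-wise absolute value
np.maximum(v, 0)     # element-wise max of each element of v and 0
v ** 2               # element-wise square
1 / v                # element-wise reciprocal
np.log(np.abs(v))    # element-wise log (applied to |v| here so every input is positive)
```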

Now let's take all of these ideas and apply them to our logistic regression gradient descent implementation, and see if we can at least get rid of one of the two for-loops we had. Here is our code for computing the derivatives for logistic regression, and we had two for-loops: one over the training examples, and a second one over the features. In our example we had n_x = 2, but if you had more than two features you would need a for-loop over dw_1, dw_2, dw_3, and so on; effectively there is a "for j = 1 to n_x" loop in which each dw_j gets updated. We'd like to eliminate this second for-loop.

That's what we'll do on this slide. Instead of explicitly initializing dw_1, dw_2, and so on to zero, we get rid of those and make dw a vector: we set dw = np.zeros((n_x, 1)). Then, instead of the for-loop over the individual components, we use the vector-valued operation dw += x^(i) * dz^(i). And finally, instead of dividing each component separately, we just write dw /= m. So now we've gone from two for-loops down to one; we still have the one for-loop that iterates over the individual training examples.
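
Below is a sketch of the resulting single-loop version, assuming X is an (n_x, m) matrix with one training example per column, Y a (1, m) label vector, w an (n_x, 1) weight vector, and b a scalar; the function name and the locally defined sigmoid helper are illustrative:

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def gradients_one_loop(X, Y, w, b):
    """Logistic regression gradients with dw held as a vector,
    so only the loop over the m training examples remains."""
    n_x, m = X.shape
    dw = np.zeros((n_x, 1))                  # dw is now a single (n_x, 1) vector
    db = 0.0
    for i in range(m):
        x_i = X[:, i].reshape(n_x, 1)        # i-th training example as a column
        z_i = np.dot(w.T, x_i) + b           # (1, 1) linear output
        a_i = sigmoid(z_i)                   # (1, 1) activation
        dz_i = a_i - Y[0, i]                 # (1, 1) error term
        dw += x_i * dz_i                     # one vector op replaces the loop over features
        db += dz_i.item()
    dw /= m
    db /= m
    return dw, db
```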


I hope this video gave you a sense of vectorization: by getting rid of one for-loop, your code will already run faster. But it turns out we can do even better. The next video will talk about how to vectorize logistic regression even further, and you'll see a pretty surprising result: without using any for-loops, without needing a for-loop over the training examples, you can write code that processes the entire training set pretty much all at the same time. Let's see that in the next video.


Key Takeaways:

The linear output Z for all m training examples can be written in matrix form:

$Z = w^T X + b$

Z = np.dot(w.T, X) + b
A = sigmoid(Z)

Vectorizing the logistic regression gradient computation:

  • For m examples, dZ has dimension (1, m) and is given by:

    $dZ = A - Y$

  • db is given by:

    $db = \frac{1}{m}\sum_{i=1}^{m} dz^{(i)}$

    db = 1/m * np.sum(dZ)

  • dw is given by:

    $dw = \frac{1}{m} X\, dZ^{T}$

    dw = 1/m * np.dot(X, dZ.T)
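
Putting the summary together, here is a sketch of one fully vectorized gradient descent step under the same assumptions as above (X of shape (n_x, m), Y of shape (1, m), w of shape (n_x, 1), b a scalar); the function name and learning rate are illustrative:

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def gradient_step(X, Y, w, b, learning_rate=0.01):
    """One fully vectorized forward/backward pass and parameter update."""
    m = X.shape[1]
    Z = np.dot(w.T, X) + b        # (1, m): linear output for all m examples at once
    A = sigmoid(Z)                # (1, m): activations
    dZ = A - Y                    # (1, m): error terms
    dw = 1 / m * np.dot(X, dZ.T)  # (n_x, 1)
    db = 1 / m * np.sum(dZ)       # scalar
    w = w - learning_rate * dw    # gradient descent update (learning rate is illustrative)
    b = b - learning_rate * db
    return w, b
```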



