Coursera | Andrew Ng (01-week-2-2.15): Broadcasting in Python

ZJ_Improve

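This series adds personal study notes, or supplementary derivations, to selected points of the original course. If you find mistakes, corrections are welcome. After studying Andrew Ng's course, I organized it into text so that it is easier to look up and review. Because I am also studying English, the series is written mainly in English, and I suggest that readers likewise rely mainly on the English with the Chinese as support, as groundwork for reading academic papers in the field later on. - ZJ

Coursera course | deeplearning.ai | NetEase Cloud Classroom

Please credit the author and source when reposting: ZJ, WeChat official account 「SelfImprovementLab」

Zhihu: https://zhuanlan.zhihu.com/c_147249273

CSDN: http://blog.csdn.net/JUNJUN_ZHAO/article/details/78950072

Broadcasting in Python

(Subtitles source: NetEase Cloud Classroom)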

In the previous video, I mentioned that broadcasting is another technique that you can use to make your Python code run faster. In this video, let's delve into how broadcasting in Python actually works. Let's explain broadcasting with an example. In this matrix, I've shown the number of calories from carbohydrates, proteins, and fats in 100 grams of four different foods. So for example, 100 grams of apples, it turns out, has 56 calories from carbs, and much less from proteins and fats. Whereas, in contrast, 100 grams of beef has 104 calories from protein and 135 calories from fat.

在前面的视频中，我提到过，广播是一种手段，可以让你的 Python 代码段执行得更快，在这个视频中让我们深入研究一下，Python 中的广播是如何实际运作，假设我们用一个例子来讲广播，在这个矩阵中我们列出了，来自 100 克碳水化合物蛋白质和脂肪的卡路里数量，这四种不同食物的卡路里，所以比如说 100 克苹果的热量，有 56 卡来自碳水化合物远远少于蛋白质和脂肪，而相反，100 克的牛肉，有 104 卡来自蛋白质和 135 卡来自脂肪。

Now, let's say your goal is to calculate the percentage of calories from carbs, proteins and fats for each of the four foods. So, for example, if you look at this column and add up the numbers in that column, you get that 100 grams of apple has 56 plus 1.2 plus 1.8, so that's 59 calories. And so, as a percentage, the share of calories from carbohydrates in an apple would be 56 over 59, that's about 94.9%. So most of the calories in an apple come from carbs, whereas in contrast, most of the calories of beef come from protein and fat, and so on. So the calculation you want is really to sum up each of the four columns of this matrix to get the total number of calories in 100 grams of apples, beef, eggs, and potatoes.

现在，比如说你的目标是，计算四种食物中卡路里有多少百分比，来自碳水化合物蛋白质和脂肪，比如，你看这一列，将整列数字加起来就得到，100 克苹果有 56+1.2+1.8，所以总共是 59 卡，然后苹果中来自碳水化合物，卡路里的百分比是 56/59，大概是 94.9%，所以苹果中大部分热量都来自碳水化合物，而相比之下大多数牛肉的热量，都来自蛋白质和脂肪，你要做的计算其实是，对矩阵这四列求和，得到100 克以下食物的卡路里总量: 苹果，牛肉鸡蛋和土豆。

And then to divide throughout the matrix, so as to get the percentage of calories from carbs, proteins and fats for each of the four foods. So the question is, **can you do this without an explicit for-loop?** Let's take a look at how you could do that. What I'm going to do is show you how you can set, say, this matrix equal to a three by four matrix A. And then with one line of Python code we're going to sum down the columns. So we're going to get four numbers corresponding to the total number of calories in 100 grams of these four different types of foods. And I'm going to use a second line of Python code to divide each of the four columns by its corresponding sum.

然后让整个矩阵各列除以总量，得到卡路里占的百分比，四种食物中来自碳水化合物蛋白质和脂肪热量的百分比各占多少，所以问题是,你可以不用显式 for 循环做吗?，我们看看应该怎么做，我这里要给你介绍的是如何设置..，比如令这个矩阵等于 $3*4$ 矩阵 $A$ ，然后用一行 Python 代码，我们会对各列求和，我们得到四个数字，对应四种不同事物的卡路里总量，这 100 克四种不同类型食物的热量总量，然后我用第二行 Python 代码，让四列每一列都除以对应的和。
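
For contrast, here is a minimal sketch of the explicit-loop computation that the lecture wants to avoid. It is not from the course; the variable names (`percentage_loop`, `total`) are hypothetical, and the matrix values are the ones quoted in the lecture.

```python
import numpy as np

# Calorie table from the lecture (per 100 g).
# Rows: carbs, protein, fat. Columns: apples, beef, eggs, potatoes.
A = np.array([[56.0, 0.0, 4.4, 68.0],
              [1.2, 104.0, 52.0, 8.0],
              [1.8, 135.0, 99.0, 0.9]])

n_rows, n_cols = A.shape
percentage_loop = np.zeros((n_rows, n_cols))

for j in range(n_cols):              # one pass per food (column)
    total = 0.0
    for i in range(n_rows):          # add up carbs + protein + fat
        total += A[i, j]
    for i in range(n_rows):          # divide each entry by its column total
        percentage_loop[i, j] = 100 * A[i, j] / total

print(percentage_loop.round(1))
```

Three nested loops are needed for what is conceptually "sum each column, then divide each column by its sum", which is the bookkeeping that broadcasting removes.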

If that verbal description wasn't very clear, hopefully it will be clearer in a second when we look at the Python code. So here we are in the Jupyter notebook. I've already written this first piece of code to prepopulate the matrix A with the numbers we had just now, so we'll hit Shift+Enter, and there's the matrix A. And now here are the two lines of Python code. First, we're going to compute cal = A.sum(axis=0). axis equals 0 means to sum vertically. We'll say more about that in a little bit.

如果口头描述你们听得不太清楚，希望当你看到 Python 代码时，马上就懂了，所以这是我们的 Jupyter 笔记本，我们写了第一段代码了，把我们刚才的数字填入矩阵A，然后 Shift+Enter 那矩阵 A 就弄好了，然后这是两行 Python 代码，首先我们要计算 cal = A.sum ..，axis=0 意味着竖直相加，我们稍后再说一遍。

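import numpy as np

# 2.15 Broadcasting in Python

# Rows: calories from carbs, protein, fat.
# Columns: apples, beef, eggs, potatoes (per 100 g).
# np.matrix is spelled out because the np.mat alias is not available in NumPy 2.0+.
A = np.matrix([[56.0, 0.0, 4.4, 68.0],
               [1.2, 104.0, 52.0, 8.0],
               [1.8, 135.0, 99.0, 0.9]])
print(A)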

And then print cal. So we'll sum vertically. Now 59 is the total number of calories in the apple, 239 is the total number of calories in the beef, and likewise for the eggs and potatoes. And then we compute percentage = A / cal.reshape(1, 4). Actually we want percentages, so multiply by 100 here. And then let's print percentage. Let's run that. With that command we've taken the matrix A and divided it by this one by four matrix. And this gives us the matrix of percentages. So, as we worked out by hand just now, for the apple in the first column, 94.9% of the calories are from carbs. Let's go back to the slides. So just to repeat the two lines of code we had, this is what we have written out in the Jupyter notebook. To add a bit of detail, this parameter, axis equals zero, means that you want Python to sum vertically.

然后 print cal，我们会在竖直方向求和，现在 59 是苹果中的总卡路里数，239 是牛肉中的总卡路里数，还有鸡蛋和马铃薯之类的，然后我们要计算百分比 percentage=A/cal.reshape(1,4) ，我们其实要得到的是百分比所以这里要乘以 100，然后我们 print percentage，我们跑一下，所以那个命令我们用了矩阵 A，让它除以这个 1×4 矩阵，然后就得到了百分比矩阵，我们手算出了苹果的情况，第一列有 94.9% 卡来自碳水化合物，我们回到幻灯片上来，就重复这两行代码，就在 jupyter 笔记本里写的，这里参数还要加一些细节，这个轴等于 0 意味着我希望 Python 在竖直方向求和。

# axis=0 sums vertically (down each column)
cal = A.sum(axis=0)
print(cal)
# [[  59.   239.   155.4   76.9]]

# print("cal.reshape(1,4)",cal.reshape(1,4))
# A / cal converts to percentages, e.g. 100 * (56 / 59) = 94.915
# Every element of A is divided by the total of the column it sits in
# cal is already a 1 x 4 matrix from the computation above, so cal.reshape(1, 4) is not strictly needed
percentage = 100 * A / (cal.reshape(1, 4))
print('percentage=', percentage)

# [[ 94.91525424   0.           2.83140283  88.42652796]
#  [  2.03389831  43.51464435  33.46203346  10.40312094]
#  [  3.05084746  56.48535565  63.70656371   1.17035111]]
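
The listing above uses NumPy's matrix class, whose column sums are always two-dimensional. NumPy's documentation generally recommends plain arrays instead, so here is a minimal sketch of the same two lines with `np.array`. The names `cal_1d` and `cal` are my own, and `keepdims=True` is one way (an assumption on my part, not something the lecture uses) to get an explicit 1 x 4 row without calling reshape.

```python
import numpy as np

# Same calorie table as above, stored as a regular ndarray.
A = np.array([[56.0, 0.0, 4.4, 68.0],
              [1.2, 104.0, 52.0, 8.0],
              [1.8, 135.0, 99.0, 0.9]])

cal_1d = A.sum(axis=0)                 # shape (4,): a rank-1 array
cal = A.sum(axis=0, keepdims=True)     # shape (1, 4): an explicit row vector

# (3, 4) divided by (1, 4): the row of totals is broadcast down the rows.
percentage = 100 * A / cal

print(cal_1d.shape, cal.shape, percentage.shape)
# (4,) (1, 4) (3, 4)
```

With plain arrays the division also works against the rank-1 `cal_1d`, but keeping an explicit (1, 4) shape makes the intended row-vector orientation visible in the code.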

So if this is axis 0, this means to sum vertically, whereas the horizontal axis is axis 1. So you'd be able to write axis = 1 to sum horizontally instead of vertically. And then this command here is an example of Python broadcasting, where you take a matrix A, so this is a three by four matrix, and you divide it by a one by four matrix. And technically, after this first line of code, the variable cal is already a one by four matrix. So technically you don't need to call reshape here again, so that's actually a little bit redundant. But when I'm writing Python code, if I'm not entirely sure what the dimensions of a matrix are, I often just call a reshape command to make sure that it's the right column vector or row vector, or whatever you want it to be.

这是轴 0 意味着竖直相加，而水平轴是轴 1，我们可以写成是 axis = 1，这样就可以水平求和,而不是竖直求和，然后这里的命令，是Python 广播的另一个例子，当你取矩阵 A，这是一个 3×4 矩阵，你让它除以一个1×4矩阵，技术上在这第一行代码之后变量 cal，已经是一个 1×4 矩阵了，所以技术上你不需要调用reshape，这实际上有点多余，但是当我编写 Python 代码时，如果不完全确定用什么矩阵，不确定矩阵的尺寸，我会经常调用reshape命令,确保它是正确的列向量或行向量，或者你想要的任何形式。
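
A small sketch of the axis convention and of the "reshape to be sure" habit described above. The matrix `M` and the variable names are made up for illustration.

```python
import numpy as np

M = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])

col_sums = M.sum(axis=0)   # sums vertically: one total per column
row_sums = M.sum(axis=1)   # sums horizontally: one total per row
print(col_sums)            # [5. 7. 9.]
print(row_sums)            # [ 6. 15.]

# Both results are rank-1 arrays; reshape pins down the orientation we want.
row_vec = col_sums.reshape(1, 3)   # explicit 1 x 3 row vector
col_vec = row_sums.reshape(2, 1)   # explicit 2 x 1 column vector

# A cheap sanity check on shapes, in the same spirit as the reshape call.
assert row_vec.shape == (1, 3)
assert col_vec.shape == (2, 1)
```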

The reshape command is a constant-time operation. It's an order one, O(1), operation that's very cheap to call. So don't be shy about using the reshape command to make sure that your matrices are the size you need them to be. Now, let's explain in greater detail how this type of operation works. We had a three by four matrix and we divided it by a one by four matrix. So, how can you divide a three by four matrix by a one by four matrix, or by a one by four vector? Let's go through a few more examples of broadcasting. If you take a 4 by 1 vector and add it to a number, what Python will do is take this number and auto-expand it into a four by one vector as well, as follows. And so the vector 1, 2, 3, 4 plus the number 100 ends up with that vector on the right.

reshape命令经常会用到，这是 $o(1)$ 操作成本很低，所以不要害怕使用reshape命令，来确保你的矩阵形状是你想要的，现在我们详细解释一下，这种运算是怎么执行的，我们有一个 $3×4$ 矩阵，我们让它除以一个 $1×4$ 矩阵，那么你怎么让一个 $3×4$ 矩阵，除以 $1×4$ 矩阵呢? 这个 $1×4$ 向量，我们再来看几个广播的例子，如果你取一个 $4×1$ 向量让它和一个数字相加，什么 Python 会做的是将这个数字自动展开，变为一个 $1×4$ 向量就像这样，所以向量1 2 3 4，加上数字100 最终等于右边这个向量。
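
The two points above can be sketched as follows. The array `big` is a made-up example; `np.shares_memory` is used here only to illustrate that reshaping a contiguous array typically returns a view rather than copying the data, which is why the call is cheap.

```python
import numpy as np

# A 4 x 1 column vector plus a number: the number is stretched to 4 x 1.
v = np.array([1, 2, 3, 4]).reshape(4, 1)
result = v + 100
print(result)
# [[101]
#  [102]
#  [103]
#  [104]]

# Reshaping a contiguous array does not copy its elements.
big = np.arange(1_000_000)
col = big.reshape(-1, 1)             # -1 lets NumPy infer the row count
print(col.shape)                     # (1000000, 1)
print(np.shares_memory(big, col))    # True
```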

You're adding 100 to every element. This type of broadcasting works with both column vectors and row vectors, and in fact we used a similar form of broadcasting earlier, where the constant added to a vector was the parameter b in logistic regression. Here's another example. Let's say you have a two by three matrix and you add it to this one by n matrix (here n is 3). So the general case would be if you have some m by n matrix here and you add it to a 1 by n matrix. What Python will do is copy the 1 by n matrix m times to turn it into an m by n matrix, so instead of this one by three matrix, it'll copy it twice in this example to turn it into this.

你就往每个元素加上 100，事实上我们使用这种形式的广播，其中常数是之前视频里的参数 $b$ ，而这种广播，对列向量和行向量一样有用，事实上我们之前已经使用了类似的广播形式，就是我们往向量加上一个常数的时候，在 $logistic$ 回归中就是参数 $b$ ，这里是另一个例子，我们说你有一个 $2×3$ 矩阵，然后你让它加上一个 $1×n$ 矩阵，那么一般情况下如果这里你有个矩阵，并将它加上 $1×n$ 矩阵，什么 Python 会做的是复制矩阵 m 次，把它变成 $m×n$ 矩阵，所以这不再是一个 $1×3$ 矩阵，python 会复制两次把它变成这个形式。
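
A minimal sketch of both cases mentioned above. The numbers in `w`, `X` and `b` are invented for illustration and are not from the course; the 2 x 3 matrix and the row 100, 200, 300 follow the example described in the lecture.

```python
import numpy as np

# Case 1: the scalar parameter b in logistic regression.
w = np.array([[0.5], [-1.0]])          # weights, shape (2, 1)
X = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])        # 3 examples as columns, shape (2, 3)
b = 2.0                                # a plain real number
z = w.T @ X + b                        # b is broadcast across all 3 examples
print(z)                               # [[-1.5 -2.  -2.5]]

# Case 2: a 2 x 3 matrix plus a 1 x 3 row vector.
M = np.array([[1, 2, 3],
              [4, 5, 6]])
row = np.array([[100, 200, 300]])      # shape (1, 3)
print(M + row)
# [[101 202 303]
#  [104 205 306]]
```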

So this is also a two by three matrix, and we'll add these, so you'll end up with the sum on the right. So you've added 100 to the first column, 200 to the second column, and 300 to the third column. And this is basically what we did on the previous slide, except that we used a division operation instead of an addition operation. So one last example, where you have an m by n matrix and you add it to an m by 1 vector, or m by 1 matrix. Then Python just copies this n times horizontally, so you end up with an m by n matrix. So as you can imagine, you copy it horizontally three times, and you add those. When you add them you end up with this. So we've added 100 to the first row and 200 to the second row. Here's the more general principle of broadcasting in Python.

所以 $2×3$ 矩阵然后我们让它们相加，最后你会得到右边的和对吧?，所以你拿了.. 你让第一列加上 100，第二列加上 200 第三列加上300，这基本上是我们在上一张幻灯片中所做的，不过我们用了一个除法运算，而不是加法运算，所以最后一个例子无论你有没有 $m×n$ 矩阵，你都让它加上一个 $m×1$ 向量或者 $m×1$ 矩阵，然后水平复制 n 次，最后你会得到一个 $m×n$ 矩阵，你可以想象一下水平复制三次，然后加起来，当你让它们相加的时候就会得到这个，所以你往第一行加了 100，第二行加了 200，在 Python 广播中有一些通用规则。
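
The column-vector case can be sketched as below. The last part shows a pitfall that is my own addition rather than something from the lecture: a rank-1 array of length 2 does not line up with the rows of a 2 x 3 matrix, which is one reason the explicit reshape habit is useful.

```python
import numpy as np

M = np.array([[1, 2, 3],
              [4, 5, 6]])
col = np.array([[100],
                [200]])                # shape (2, 1)

# The column is stretched horizontally 3 times, then added element-wise.
print(M + col)
# [[101 102 103]
#  [204 205 206]]

# Pitfall: shape (2,) is matched against the LAST axis of M (length 3).
flat = np.array([100, 200])            # shape (2,), not (2, 1)
try:
    M + flat
    failed = False
except ValueError:
    failed = True
print(failed)                          # True

# Reshaping to an explicit column vector fixes it.
print(np.array_equal(M + flat.reshape(2, 1), M + col))   # True
```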

If you have an m by n matrix and you add, subtract, multiply or divide with a 1 by n matrix, then this will copy it m times into an m by n matrix, and then apply the addition, subtraction, multiplication or division element-wise. If, conversely, you were to take the m by n matrix and add, subtract, multiply or divide by an m by 1 matrix, then this would copy it n times, turn it into an m by n matrix, and then apply the operation element-wise. Just one more form of broadcasting: if you have an m by 1 matrix, so that's really a column vector, and you add, subtract, multiply or divide by a real number, so maybe a 1 by 1 matrix, such as that plus 100, then you end up copying this real number m times until you also get another m by 1 matrix.

如果你有一个 $m×n$ 矩阵然后你加上或者减去，乘以或除以一个 $1×n$ 矩阵，那么 python 就会把它复制 n 次变成 $m×n$ 矩阵，然后再逐元素做加法减法，乘法和除法，如果相反，你拿一个 $m×n$ 矩阵加上减去乘以，或者除以 $m×1$ 矩阵那么这也会复制 n 次，把它变成一个 $m×n$ 矩阵，然后逐元素应用操作，这是其中一种广播就是如果你有个 $m×1$ 矩阵，这其实是个列向量然后你让它加上，减去乘以或除以一个实数，所以也许是 1×1 矩阵，所以这样加上 100 最后你就是，把这个实数复制 n 次直到你得到另一个 $m×1$ 矩阵。
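
The "copy, then operate element-wise" rule can be checked against explicit copies made with `np.tile`. This is only a sketch of the mental model with made-up values; as far as I know NumPy does not actually materialize the copies when it broadcasts.

```python
import numpy as np

m, n = 3, 4
M = np.arange(12.0).reshape(m, n)            # m x n matrix
row = np.array([[1.0, 2.0, 4.0, 8.0]])       # 1 x n
col = np.array([[10.0], [20.0], [30.0]])     # m x 1

# Explicit copies, as in the lecture's description.
row_copied = np.tile(row, (m, 1))            # 1 x n stacked m times -> m x n
col_copied = np.tile(col, (1, n))            # m x 1 repeated n times -> m x n

# Broadcasting gives the same result as operating on the explicit copies.
print(np.array_equal(M * row, M * row_copied))   # True
print(np.array_equal(M + col, M + col_copied))   # True

# A real number against an m x 1 column behaves like m copies of the number.
print(np.array_equal(col + 100, col + np.full((m, 1), 100.0)))   # True
```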

And then you perform the operation, such as addition in this example, element-wise. And something similar also works for row vectors. The fully general version of broadcasting can do even a little bit more than this. If you're interested, you can read the documentation for NumPy and look at broadcasting in that documentation. That gives an even slightly more general definition of broadcasting. But the ones on the slide are the main forms of broadcasting that you end up needing to use when you implement a neural network. Before we wrap up, just one last comment, which is for those of you that are used to programming in either MATLAB or Octave: if you've ever used the MATLAB or Octave function bsxfun in neural network programming, bsxfun does something similar, not quite the same.

然后执行运算，比如说这个例子中逐元素做加法，类似的东西也适用于行向量，所以广播的一般版本，还可以做到更多，如果你有兴趣可以阅读 NumPy 的文档，并在文档里搜索 broadcasting，这也许是，更广义的广播，幻灯片上的这些是，在你实现神经网络算法时，主要用到的广播形式，在我们结束之前最后讲一句，对于你们，习惯用 MATLAB 或 Octave 编程的同学，如果你曾经使用 MATLAB 或 Ocatvae 函数 bsxfun，在神经网络编程中，bsxfun 做的事情很类似但不完全相同。
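
As a hedged illustration of the "slightly more general" behaviour, the sketch below combines a column vector with a row vector, so that both operands are stretched at once. The shapes and values are my own example; the compatibility rule in the comments (dimensions are compared from the last axis backwards and must be equal or 1) is my summary of the NumPy broadcasting documentation.

```python
import numpy as np

col = np.arange(4).reshape(4, 1)   # shape (4, 1): [[0], [1], [2], [3]]
row = np.arange(3).reshape(1, 3)   # shape (1, 3): [[0, 1, 2]]

# Dimensions are compared from the last axis backwards: 1 vs 3, then 4 vs 1.
# Each pair is compatible because one of the two sizes is 1.
print(np.broadcast(col, row).shape)   # (4, 3)

table = col + row                     # both operands are stretched
print(table)
# [[0 1 2]
#  [1 2 3]
#  [2 3 4]
#  [3 4 5]]

# Incompatible shapes raise an error instead of guessing.
try:
    np.ones((3, 4)) + np.ones(3)      # last axes are 4 and 3: neither is 1
    incompatible = False
except ValueError:
    incompatible = True
print(incompatible)                   # True
```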

But it is often used for a similar purpose to what we use broadcasting in Python for. This is really only for very advanced MATLAB and Octave users; if you've not heard of it, don't worry about it. You don't need to know it when you're coding up neural networks in Python. So, that was broadcasting in Python. I hope that when you do the programming homework, broadcasting will allow you not only to make your code run faster, but also to get what you want done with fewer lines of code. Before you dive into the programming exercise, I want to share with you just one more set of ideas, which is some tips and tricks that I've found reduce the number of bugs in my Python code, and that I hope will help you too. So with that, let's talk about that in the next video.

但它通常用于类似的目的，就像我们在 Python 中使用广播一样，但这函数真的只是，非常厉害的 MATLAB 和 Octave 用户才会用到，如果你还没听说过不用担心，你不需要知道那些，用 Python 编码神经网络就好了，这就是 Python 中的广播，我希望你们在做编程作业时，广播能够让你的，代码运行速度更快，也希望能帮到你，写更少的代码来实现你的目标，在你进行编程练习之前，我想和大家分享一套想法，就是我发现的一些技巧，可以减少 Python 代码中的错误数量，我希望也会帮助你，所以我们在下一个视频中谈谈这个。

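PS: You are welcome to follow the WeChat official account 「SelfImprovementLab」, which focuses on deep learning, machine learning, and artificial intelligence, and from time to time runs group check-in activities for early rising, reading, exercise, English, and other habits.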
