Coursera | Andrew Ng (01-week-2-2.7&2.8): Computation Graph & Derivatives with a Computation Graph

ZJ_Improve

Article link: https://blog.csdn.net/JUNJUN_ZHAO/article/details/78897414

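This series adds personal study notes and supplementary derivations to selected points of the original course. If you find mistakes, corrections are welcome. After taking Andrew Ng's course, I wrote it up as text to make it easier to look up and review. Because I am still studying English, the series is written mainly in English, and I suggest readers also rely mainly on the English, with Chinese as an aid, as preparation for reading academic papers in related fields later on. - ZJ

Coursera course | deeplearning.ai | NetEase Cloud Classroom

Please credit the author and source when reposting: ZJ, WeChat official account 「SelfImprovementLab」

Zhihu: https://zhuanlan.zhihu.com/c_147249273

CSDN: http://blog.csdn.net/JUNJUN_ZHAO/article/details/78897414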

2.7 Computation Graph

(Subtitles source: NetEase Cloud Classroom)

I've said previously that the computations of a neural network are organized in terms of a forward pass, or forward propagation step, in which we compute the output of the neural network, followed by a backward pass, or backpropagation step, which we use to compute gradients or derivatives. The computation graph explains why it is organized this way. In this video we'll go through an example to illustrate the computation graph. Let's use a simpler example than logistic regression or a full-blown neural network.

Let's say we're trying to compute a function $J$ of three variables $a$, $b$, and $c$, and let's say that function is $J(a, b, c) = 3(a + bc)$. Computing this function actually has three distinct steps. The first is to compute $b$ times $c$, and let's say we store that in a variable called $u$, so $u = bc$. Then you compute $v = a + u$ (the subtitles say "a times u", which is a slip). Finally, your output is $J = 3v$. This is the final function $J$ you're trying to compute. We can take these three steps and draw them in a computation graph as follows. Let's say I draw the three variables $a$, $b$, and $c$ here. The first thing we did was compute $u = bc$.

I'm going to put a rectangular box around that, and the inputs to it are $b$ and $c$. Then you have $v = a + u$, whose inputs are $u$, which we just computed, together with $a$. Finally we have $J = 3v$. As a concrete example, if $a = 5$, $b = 3$, and $c = 2$, then $u = bc$ is 6, $v = a + u$ is $5 + 6 = 11$, and $J$ is three times $v$, so $J = 33$. You can verify that this is $3 \times (5 + 3 \times 2)$, and if you expand that out you do get 33 as the value of $J$. The computation graph comes in handy when there is some distinguished or special output variable, such as $J$ in this case, that you want to optimize. In the case of logistic regression, $J$ is of course the cost function that we're trying to minimize. What we've seen in this little example is that, through a left-to-right pass, you can compute the value of $J$.
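The left-to-right pass through the three nodes $u = bc$, $v = a + u$, $J = 3v$ can be sketched in Python as follows. This is only an illustrative sketch: the function name `forward` is a hypothetical choice and does not come from the course.

```python
def forward(a, b, c):
    """Left-to-right pass through the graph for J = 3 * (a + b * c)."""
    u = b * c   # first node: inputs b and c
    v = a + u   # second node: inputs a and u
    J = 3 * v   # output node: input v
    return u, v, J


u, v, J = forward(5, 3, 2)
print(u, v, J)  # 6 11 33
```

Keeping the intermediate values `u` and `v` around, rather than computing `3 * (a + b * c)` in one expression, mirrors the boxes in the graph and is what the backward pass later reuses.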

What we'll see in the next couple of slides is that, in order to compute derivatives, a right-to-left pass like this, going in the opposite direction to the blue arrows, is the most natural way. To recap: the computation graph organizes a computation with this blue-arrow, left-to-right computation. Let's defer to the next video how you do the backward, red-arrow, right-to-left computation of the derivatives. Let's go on to the next video.

2.8 Derivatives with a Computation Graph

In the last video, we worked through an example of using a computation graph to compute the function $J$. Now let's take a cleaned-up version of that computation graph and show how you can use it to figure out derivative calculations for $J$. So, here's a computation graph. Let's say you want to compute the derivative of $J$ with respect to $v$. What is that? It says: if we were to take this value of $v$ and change it a little bit, how would the value of $J$ change? $J$ is defined as $3v$, and right now $v = 11$. If we bump $v$ up a little bit to 11.001, then $J$, which is $3v$ and currently 33, gets bumped up to 33.003. Here we've increased $v$ by 0.001, and the net result is that $J$ goes up three times as much. So the derivative of $J$ with respect to $v$ is equal to 3, because the increase in $J$ is three times the increase in $v$.
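The "bump $v$ by 0.001 and see how much $J$ moves" argument can be reproduced numerically. The sketch below uses a simple forward difference; the helper name `J_of_v` and the variable `eps` are assumptions made for illustration.

```python
def J_of_v(v):
    """Output node of the graph: J = 3 * v."""
    return 3 * v


v = 11.0
eps = 0.001  # the small bump used in the lecture

# Change in J divided by change in v approximates dJ/dv.
slope = (J_of_v(v + eps) - J_of_v(v)) / eps
print(round(slope, 6))  # 3.0
```

Because $J$ is linear in $v$, the ratio is 3 for any bump size, up to floating-point rounding.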

In fact, this is very analogous to the example from the previous video, where we had $f(a) = 3a$ and derived that $df(a)/da$, which in slightly simplified, slightly sloppy notation you can read as $df/da$, was equal to 3. Here, instead, we have $J = 3v$, so $dJ/dv = 3$, with $J$ playing the role of $f$ and $v$ playing the role of $a$ in that earlier example. In the terminology of backpropagation, what we've seen is that if you want to compute the derivative of the final output variable, which is usually the variable you care most about, with respect to $v$, then we've done one step of backpropagation: one step backward in this graph.

Now let's look at another example. What is $dJ/da$? In other words, if we bump up the value of $a$, how does that affect the value of $J$? Let's go through the example. Right now $a = 5$, so let's bump it up to 5.001. The net impact is that $v$, which is $a + u$ and was previously 11, increases to 11.001, and as we've already seen above, $J$ then gets bumped up to 33.003. So if you increase $a$ by 0.001, $J$ increases by 0.003. By "increase $a$" I mean that if you take this value 5 and plug in a new value, the change to $a$ propagates to the right of the computation graph, so that $J$ ends up being 33.003. The increase in $J$ is three times the increase in $a$, which means this derivative is equal to 3.

One way to break this down is to say that if you change $a$, that changes $v$, and through changing $v$, that changes $J$. So when you nudge the value of $a$ up a little bit, first you end up increasing $v$. How much does $v$ increase? By an amount determined by $dv/da$. Then the change in $v$ causes the value of $J$ to increase as well. In calculus this is called the chain rule: if $a$ affects $v$, which affects $J$, then the amount that $J$ changes when you nudge $a$ is the product of how much $v$ changes when you nudge $a$ and how much $J$ changes when you nudge $v$, that is, $\dfrac{dJ}{da} = \dfrac{dJ}{dv} \cdot \dfrac{dv}{da}$. What we saw from this calculation is that if you increase $a$ by 0.001, $v$ changes by the same amount, so $dv/da = 1$.

So if you plug in what we worked out previously, $\dfrac{dJ}{dv} = 3$ and $\dfrac{dv}{da} = 1$, the product, $3 \times 1$, gives you the correct value: $\dfrac{dJ}{da} = 3$. This little illustration shows how having computed $\dfrac{dJ}{dv}$, the derivative with respect to this variable, helps you compute $\dfrac{dJ}{da}$. That's another step of this backward calculation.
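To see that the chain-rule product agrees with nudging $a$ directly, here is a small numerical sketch. The helper names `v_of` and `J_of` are hypothetical, and the derivatives are approximated with forward differences rather than computed symbolically.

```python
def v_of(a, u):
    """Middle node: v = a + u."""
    return a + u


def J_of(v):
    """Output node: J = 3 * v."""
    return 3 * v


a, u = 5.0, 6.0  # values from the running example (u = b * c = 6)
eps = 0.001

v = v_of(a, u)
dJ_dv = (J_of(v + eps) - J_of(v)) / eps        # about 3
dv_da = (v_of(a + eps, u) - v_of(a, u)) / eps  # about 1

# Chain rule: dJ/da = dJ/dv * dv/da.
dJ_da_chain = dJ_dv * dv_da

# Direct check: nudge a and push the change through the whole graph.
dJ_da_direct = (J_of(v_of(a + eps, u)) - J_of(v_of(a, u))) / eps

print(round(dJ_da_chain, 6), round(dJ_da_direct, 6))
```

Both numbers come out at 3 up to rounding, which is the point of the illustration: the stored value of $dJ/dv$ is reused instead of re-deriving everything from $J$.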

I want to introduce one more notational convention. When you're writing code to implement backpropagation, there will usually be some final output variable that you really care about or want to optimize. In this case, that final output variable is $J$; it's the last node in your computation graph. So a lot of computations will be trying to compute the derivative of that final output variable. So we want $d$ of this final output variable with respect to some other variable; let me just call that variable $var$.

So a lot of the computations you do will be to compute the derivative of the final output variable, $J$ in this case, with respect to various intermediate variables such as $a$, $b$, $c$, $u$, and $v$. When you implement this in software, what do you call the variable? One thing you could do in Python is write a very long variable name that spells out $dFinalOutputVar/dvar$, but that's a very long name. We could call it $dJ/dvar$. But you're always taking the derivative of $J$, of this final output variable, so repeating it in every name is redundant.

I'm going to introduce a new notation: in the code you write, we'll just use the variable name $dvar$ to represent that quantity. So $dvar$ in your code represents the derivative of the final output variable you care about, such as $J$ (or sometimes the loss $L$), with respect to the various intermediate quantities you're computing in your code. For this quantity here, you use $dv$ in your code to denote its value, so $dv$ is equal to 3. Likewise, your code represents $dJ/da$ as $da$, which we also figured out to be equal to 3. So we've done backpropagation partway through this computation graph.
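In code, the naming convention looks roughly like this: a variable called `dvar` holds $dJ/dvar$. The sketch below only covers the two derivatives worked out so far, and the literal values are taken from the running example.

```python
# Forward values from the running example.
a, b, c = 5, 3, 2
u = b * c   # 6
v = a + u   # 11
J = 3 * v   # 33

# Convention: "dvar" is shorthand for dJ/dvar.
dv = 3        # dJ/dv, because J = 3 * v
da = dv * 1   # dJ/da = dJ/dv * dv/da, and dv/da = 1 because v = a + u

print(dv, da)  # 3 3
```

Dropping the `dJ_` prefix works because every derivative in the backward pass is taken of the same final output variable.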

Let's go through the rest of this example on the next slide, with a cleaned-up copy of the computation graph. To recap, what we've done so far is go backward here and figure out that $dv = 3$. Again, $dv$ is just a variable name in the code; it really means $dJ/dv$. We also figured out that $da = 3$, and again, $da$ is the variable name in your code for the value of $dJ/da$. I've somewhat hand-waved how we went backward along these two edges. Now let's keep computing derivatives. Let's look at the value $u$. What is $dJ/du$?

Through a similar calculation to what we did before, we start off with $u = 6$. If you bump $u$ up to 6.001, then $v$, which was previously 11, goes up to 11.001, and so $J$ goes from 33 to 33.003. The increase in $J$ is three times the increase in $u$, so $dJ/du = 3$. The analysis for $u$ is very similar to the analysis we did for $a$: it is computed as $\dfrac{dJ}{dv} \cdot \dfrac{dv}{du}$, where the first factor we had already figured out to be 3 and the second turns out to be 1. So with one more step of backpropagation, we end up computing that $du$ is also equal to 3, where $du$ is of course just $dJ/du$.

Now let's step through one last example in detail. What is $\dfrac{dJ}{db}$? Imagine you are allowed to change the value of $b$, and you want to tweak $b$ a little bit in order to minimize or maximize the value of $J$. What is the derivative, the slope of this function $J$, when you change the value of $b$ a little bit? It turns out that, using the chain rule from calculus, this can be written as the product of two things: $\dfrac{dJ}{du} \cdot \dfrac{du}{db}$. The reasoning is that if you change $b$ a little bit, say from 3 to 3.001, the way it affects $J$ is that it first affects $u$. So how much does it affect $u$?

Well, $u$ is defined as $b \cdot c$, so it goes from 6 when $b = 3$ to 6.002 when $b = 3.001$, because $c = 2$ in our example. This tells us that $\dfrac{du}{db} = 2$: when you bump $b$ up by 0.001, $u$ increases by twice as much. So now we know that $u$ has gone up twice as much as $b$ has. What is $\dfrac{dJ}{du}$? We've already figured out that it equals 3, so multiplying these two parts, we find that $\dfrac{dJ}{db} = 6$.
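Written out as a single line, the calculation for $b$ is the chain rule applied along the path $b \to u \to v \to J$, with the values $a = 5$, $b = 3$, $c = 2$ from the running example:

$$
\frac{dJ}{db} = \frac{dJ}{du} \cdot \frac{du}{db} = \left(\frac{dJ}{dv} \cdot \frac{dv}{du}\right) \cdot c = (3 \cdot 1) \cdot 2 = 6
$$

As a sanity check, $J(5, 3.001, 2) - J(5, 3, 2) = 33.006 - 33 = 0.006 = 6 \times 0.001$.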

Here's the reasoning for the second part of the argument: we want to know, when $u$ goes up by 0.002, how that affects $J$. The fact that $\dfrac{dJ}{du} = 3$ tells us that when $u$ goes up by 0.002, $J$ goes up three times as much, so $J$ should go up by 0.006. If you check the math in detail, you'll find that if $b$ becomes 3.001, then $u$ becomes 6.002, $v$ becomes 11.002 (that's $a + u$, or $5 + u$), and $J$, which is $3v$, comes out to 33.006. That's how you get $\dfrac{dJ}{db} = 6$. To fill that in as we go backward: $db = 6$, where $db$ is really the Python variable name for $\dfrac{dJ}{db}$. I won't go through the last example in great detail, but it turns out that if you also compute $\dfrac{dJ}{dc}$, it is $\dfrac{dJ}{du} \cdot \dfrac{du}{dc}$, which is $3 \times 3 = 9$, since $\dfrac{du}{dc} = b = 3$. So through this last step, it is possible to derive that $dc$ is equal to 9.
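Putting the pieces together, a full right-to-left pass for this graph might look like the sketch below. The function names `forward`, `backward`, and `numeric_grad` are hypothetical, and the finite-difference check is an addition for verification, not something shown in the lecture.

```python
def forward(a, b, c):
    """Left-to-right pass: returns the intermediate values and J."""
    u = b * c
    v = a + u
    J = 3 * v
    return u, v, J


def backward(a, b, c):
    """Right-to-left pass: each dvar holds dJ/dvar."""
    dv = 3.0        # J = 3 * v
    da = dv * 1.0   # v = a + u, so dv/da = 1
    du = dv * 1.0   # v = a + u, so dv/du = 1
    db = du * c     # u = b * c, so du/db = c
    dc = du * b     # u = b * c, so du/dc = b
    return {"da": da, "db": db, "dc": dc, "du": du, "dv": dv}


def numeric_grad(a, b, c, eps=1e-6):
    """Forward-difference estimates of dJ/da, dJ/db, dJ/dc."""
    base = forward(a, b, c)[2]
    return {
        "da": (forward(a + eps, b, c)[2] - base) / eps,
        "db": (forward(a, b + eps, c)[2] - base) / eps,
        "dc": (forward(a, b, c + eps)[2] - base) / eps,
    }


grads = backward(5.0, 3.0, 2.0)
check = numeric_grad(5.0, 3.0, 2.0)
for name in ("da", "db", "dc"):
    print(name, grads[name], round(check[name], 4))
```

Note the order inside `backward`: `dv` is computed first, then reused for `da` and `du`, and `du` is reused for `db` and `dc`. That reuse is why the right-to-left order is the efficient one.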

So the key takeaway from this video, from this example, is that when computing all of these derivatives, the most efficient way to do so is through a right-to-left computation following the direction of the red arrows. In particular, we first compute the derivative with respect to $v$, which then becomes useful for computing the derivatives with respect to $a$ and with respect to $u$. The derivative with respect to $u$, in turn, becomes useful for computing the derivatives with respect to $b$ and with respect to $c$. So that was the computation graph: a forward, or left-to-right, calculation to compute a cost function such as $J$ that you might want to optimize, and a backward, or right-to-left, calculation to compute derivatives.

If you're not familiar with calculus or the chain rule, I know some of those details went by really quickly. If you didn't follow all of them, don't worry about it. In the next video we'll go over this again in the context of logistic regression, and show you exactly what you need to do to implement the computations for the derivatives of the logistic regression model.

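PS: You are welcome to follow the WeChat official account 「SelfImprovementLab」, which focuses on deep learning, machine learning, and artificial intelligence, and from time to time organizes check-in groups for early rising, reading, exercise, English, and other activities.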
