Neural Networks - Examples and Intuitions II

Abstract: This article is the original video transcript of Lesson 70, "Examples and Intuitions II", from Chapter 9, "Neural Networks: Learning", of Andrew Ng's Machine Learning course. I transcribed it while studying the videos and lightly edited it to make it more concise and easier to read, so that it can be consulted later. I am now sharing it with everyone. If there are any errors, criticism and corrections are welcome and sincerely appreciated. I also hope it is helpful to your studies.
————————————————
In this video, I’d like to keep working through our example to show how a neural network can compute nonlinear hypotheses.

In the last video, we saw how a neural network can be used to compute the functions x_{1} AND x_{2} and x_{1} OR x_{2} when x_{1} and x_{2} are binary, that is, when they take on values 0 and 1. We can also have a network compute negation, that is, the function NOT x_{1}. Let me just write down the weights associated with this network. We have only one input feature x_{1} in this case, plus the bias unit +1. And if I associate this with the weights +10 and -20, then my hypothesis is computing h_\theta (x)=g(10-20x_1). So when x_{1} is equal to 0, my hypothesis will be computing g(10-20*0), which is g(10), and so that’s approximately 1. And when x_{1} equals 1, this will be g(-10), which is approximately equal to 0. And if you look at what these values are, that’s essentially the NOT x_{1} function. So to include negations, the general idea is to put a large negative weight in front of the variable you want to negate; the -20 multiplied by x_{1} is the general idea of how you end up negating x_{1}. And so, in an example that I hope you will figure out yourself, if you want to compute a function like (NOT x_1) AND (NOT x_2), part of that would probably be putting large negative weights in front of x_{1} and x_{2}, and it should be feasible to get a neural network with just one output unit to compute this as well. This logical function (NOT x_1) AND (NOT x_2) is going to be equal to 1 if and only if x_1=x_2=0: NOT x_{1} means x_{1} must be 0, and NOT x_{2} means x_{2} must be 0 as well. And hopefully, you should be able to figure out how to make a small neural network that computes this logical function too.
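(Note: as a quick check of these single sigmoid units, here is a minimal Python sketch, assuming the standard logistic activation g. The weights +10, -20 for NOT x_1 are the ones above, and the weights 10, -20, -20 for (NOT x_1) AND (NOT x_2) are the ones used for the cyan network in the next paragraph; the helper names sigmoid and unit are just illustrative.)

```python
import numpy as np

def sigmoid(z):
    # Logistic activation g(z) = 1 / (1 + e^(-z))
    return 1.0 / (1.0 + np.exp(-z))

def unit(weights, *inputs):
    # One sigmoid unit: prepend the +1 bias input, then compute g(theta' * x)
    x = np.array([1.0, *inputs])
    return sigmoid(weights @ x)

# NOT x1: weights +10, -20 give h(x) = g(10 - 20*x1)
not_weights = np.array([10.0, -20.0])
for x1 in (0, 1):
    print(f"NOT {x1} ~= {unit(not_weights, x1):.3f}")

# (NOT x1) AND (NOT x2): weights 10, -20, -20 (the cyan network used below)
nor_weights = np.array([10.0, -20.0, -20.0])
for x1 in (0, 1):
    for x2 in (0, 1):
        print(f"(NOT {x1}) AND (NOT {x2}) ~= {unit(nor_weights, x1, x2):.3f}")
```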

Now, let’s take the 3 pieces that we have and put them together into a network: the network for computing x_1 AND x_2, the network for computing (NOT x_1) AND (NOT x_2), and one last network for computing x_1 OR x_2. We should be able to put these 3 pieces together to compute the x_1 XNOR x_2 function. And just to remind you, if the axes are x_1 and x_2, the function we want to compute has negative examples here and here, and positive examples there and there. So, clearly, we’ll need a nonlinear decision boundary in order to separate the positive and negative examples. Let’s draw the network. I’m going to take my inputs +1, x_1, x_2, and create my first hidden unit here. I’m going to call it a^{(2)}_1, because it’s my first hidden unit. And I’m going to copy over the weights from the red network, the x_1 AND x_2 network: -30, 20, 20. Next, let me create a second hidden unit, which I’m going to call a^{(2)}_2, the second hidden unit of layer two. And I’m going to copy over the cyan network in the middle, so I have the weights 10, -20, -20. Now, let’s pull up some of the truth table values. For the red network, we know it computes x_1 AND x_2, so a^{(2)}_1 will be approximately 0, 0, 0, 1, depending on the values of x_1 and x_2. And a^{(2)}_2, the cyan network, computes (NOT x_1) AND (NOT x_2), which outputs 1, 0, 0, 0 for the four combinations of x_1 and x_2. Finally, I’m going to create my output node, the output unit a^{(3)}_1. This is what will output h_\theta (x), and I’m going to copy over the OR network for it. I’ll also need a +1 bias unit here, so I draw that in, and I copy over the weights from the green network: -10, 20, 20. We saw earlier that this computes the OR function. So, let’s fill in the truth table entries. The first entry is 0 OR 1, which is 1. The next is 0 OR 0, which is 0; then 0 OR 0, which is 0; and 1 OR 0, which is 1. And thus, h_\theta (x) is equal to 1 either when both x_1 and x_2 are 0, or when x_1 and x_2 are both 1. Concretely, h_\theta (x) outputs 1 exactly at these 2 locations, and it outputs 0 otherwise. And thus, with this neural network, which has an input layer, one hidden layer and one output layer, we end up with a nonlinear decision boundary that computes the XNOR function. The more general intuition is that in the input layer we just have the raw inputs; then we have a hidden layer, which computes slightly more complex functions of the inputs, as shown here; and then by adding yet another layer, we end up with an even more complex nonlinear decision boundary. This is a sort of intuition about why neural networks can compute pretty complicated functions: when you have multiple layers, the second layer computes relatively simple functions of the inputs, the third layer can build on that to compute even more complex functions, and the layer after that can compute functions that are more complex still.
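(Note: here is a minimal Python sketch of the assembled XNOR network under the weights quoted above, with Theta1 holding the red AND unit and the cyan (NOT x_1) AND (NOT x_2) unit, and Theta2 the green OR unit; the function name xnor_net is just illustrative.)

```python
import numpy as np

def sigmoid(z):
    # Logistic activation g(z) = 1 / (1 + e^(-z))
    return 1.0 / (1.0 + np.exp(-z))

# Layer 2 weights: row 1 is the red AND network (-30, 20, 20),
# row 2 is the cyan (NOT x1) AND (NOT x2) network (10, -20, -20).
Theta1 = np.array([[-30.0,  20.0,  20.0],
                   [ 10.0, -20.0, -20.0]])

# Layer 3 weights: the green OR network (-10, 20, 20).
Theta2 = np.array([-10.0, 20.0, 20.0])

def xnor_net(x1, x2):
    a1 = np.array([1.0, x1, x2])                  # input layer with the +1 bias unit
    a2 = np.insert(sigmoid(Theta1 @ a1), 0, 1.0)  # hidden layer a^(2), plus its bias unit
    return sigmoid(Theta2 @ a2)                   # output unit a^(3)_1 = h_theta(x)

for x1 in (0, 1):
    for x2 in (0, 1):
        print(f"x1={x1}, x2={x2}  ->  h(x) ~= {xnor_net(x1, x2):.3f}")
# Prints values near 1 for (0,0) and (1,1), near 0 otherwise -- the XNOR truth table.
```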

To wrap up this video, I want to show you a fun example of an application of a neural network that captures this intuition of the deeper layers computing more complex features.

I want to show you a video that I got from a good friend of mine, Yann LeCun. Yann is a professor at New York University, NYU, and he was one of the early pioneers of neural network research; he’s something of a legend in the field, and his ideas are used in all sorts of products and applications throughout the world now. So, I want to show you a video from some of his early work in which he was using a neural network to recognize handwriting, to do handwritten digit recognition. You might remember that at the start of this class, I said that one of the early successes of neural networks was using them to read zip codes, to help route mail, that is, to read postal codes. So, this is one of the attempts, one of the algorithms, used to address that problem. In the video, this area here is the input area that shows a handwritten character being presented to the network. This column here shows a visualization of the features computed by the first hidden layer of the network, and so this visualization shows the different features, different edges and lines and so on, that it detects. This is the visualization of the next hidden layer; it’s kind of hard to see, hard to understand, the deeper hidden layers. And that’s the visualization of what the next hidden layer is computing. You’ll probably have a hard time seeing what’s going on much beyond the first hidden layer. But finally, all of these learned features get fed to the output layer, and shown over here is the final answer, the final predicted value for what handwritten digit the neural network thinks is being shown. So, let’s take a look at the video. I hope you enjoyed it, and that it gave you some intuition about the sorts of pretty complicated functions neural networks can learn, in which the network takes as input an image, just the raw pixels, and the first hidden layer computes some set of features, the next hidden layer computes even more complex features, and these features can then be used by essentially the final layer of logistic regression classifiers to make accurate predictions about what number the network is seeing.

<end>
