Introduction to Deep Learning - Why is deep learning taking off?

These are my notes from the class "Introduction to Deep Learning", section "Why is deep learning just now taking off?", by Andrew Ng.


If the basic technical ideas behind deep learning and neural networks have been around for decades, why are they only now taking off? In this class, let's go over some of the main drivers behind the rise of deep learning. This will help you better spot the best opportunities within your own organization to apply these techniques.

Let's plot a figure where the horizontal axis is the amount of labeled data (x, y) we have for a task, and the vertical axis is the performance of the learning algorithm, such as the accuracy of a spam classifier or an ad-click predictor, or the accuracy of a neural network at figuring out the positions of other cars for a self-driving car. It turns out that if you plot the performance of a traditional learning algorithm, like a support vector machine or logistic regression, as a function of the amount of data you have, you get a curve that looks like the red line: performance improves for a while as you add more data, but after a while it pretty much plateaus, as if the algorithm didn't know what to do with a huge amount of data. What happened in our society over the last 20 years is that, for a lot of applications, we accumulated far more data than traditional learning algorithms could effectively take advantage of. For neural networks, it turns out that (a rough sketch of this picture follows the list below):

  • If you train a small neural network, then its performance may look like the yellow line
  • If you train a medium-sized neural network, its performance would be a little better, like the cyan line
  • If you train a very large neural network, then its performance usually keeps getting better and better, like the green line
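
Below is a minimal sketch of that picture using matplotlib. The curve shapes, plateau heights, and scales are made-up numbers chosen only to reproduce the qualitative behavior described above, not measurements from any real experiment:

```python
import numpy as np
import matplotlib.pyplot as plt

# Amount of labeled data (arbitrary units, log-spaced to cover small and huge datasets).
m = np.logspace(0, 6, 200)

# Illustrative saturating curves: performance = max_perf * (1 - exp(-m / scale)).
# The plateau heights (max_perf) and scales are assumptions for illustration only.
def perf(max_perf, scale):
    return max_perf * (1.0 - np.exp(-m / scale))

plt.plot(m, perf(0.70, 1e2), "r", label="traditional algorithm (e.g. SVM, logistic regression)")
plt.plot(m, perf(0.78, 1e3), "y", label="small neural network")
plt.plot(m, perf(0.86, 1e4), "c", label="medium neural network")
plt.plot(m, perf(0.95, 1e5), "g", label="large neural network")

plt.xscale("log")
plt.xlabel("amount of labeled data (x, y)")
plt.ylabel("performance")
plt.legend(loc="lower right")
plt.title("Why deep learning is taking off (illustrative)")
plt.show()
```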

So, a couple of observations:

  • If you want to hit the very high level of performance marked by the black dot, then you need two things. First, you often need to be able to train a big neural network in order to take advantage of the huge amount of data. Second, you need a lot of data. So we often say that scale has driven deep-learning progress. By "scale" I mean the size of the neural network as well as the scale of the data. Today, one of the most reliable ways to get better performance from a neural network is often either to train a bigger network or to throw more data at it. That only works up to a point, because eventually you run out of data, or the network is so big that it takes too long to train. But just improving scale has taken us a long way in the world of deep learning.
  • In the regime of smaller training sets, the relative ordering of the algorithms is actually not very well defined. If you don't have a lot of training data, it is often your skill and other details of the algorithm that determine the performance, so it's quite possible that an SVM could do better than a neural network there (see the sketch after this list).
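
As a small illustration of that second point, here is a sketch using scikit-learn on an arbitrary synthetic dataset (the dataset, model sizes, and hyperparameters are assumptions, and the exact accuracies will vary with the random seed):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier

# A small synthetic dataset: with only ~200 labeled examples, the relative
# ordering of algorithms is not well defined and depends on tuning details.
X, y = make_classification(n_samples=200, n_features=20, n_informative=5,
                           random_state=0)

svm = SVC(kernel="rbf", C=1.0)                 # traditional algorithm
mlp = MLPClassifier(hidden_layer_sizes=(32,),  # small neural network
                    max_iter=2000, random_state=0)

print("SVM accuracy:", cross_val_score(svm, X, y, cv=5).mean())
print("MLP accuracy:", cross_val_score(mlp, X, y, cv=5).mean())
```

Depending on the seed and the hyperparameters, either model may come out ahead on a dataset this small; it is only at large scale that big networks reliably pull away.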

In the early days of the modern rise of deep learning, it was scale of data and scale of computation (faster CPUs/GPUs, etc.) that enabled us to make a lot of progress. In the last several years, we have also seen tremendous algorithmic innovation that made neural networks run much faster. One of the huge breakthroughs has been switching from the sigmoid activation function to the ReLU function. There have been quite a lot of examples like this where changing the algorithm allows the code to run much faster and allows us to train bigger neural networks.
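
A minimal numpy sketch of why that switch matters (this is an illustration, not the lecture's actual code): the sigmoid's gradient shrinks toward zero when its input is far from zero, so gradient-descent updates become tiny there, while ReLU's gradient stays at 1 for every positive input:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_grad(z):
    s = sigmoid(z)
    return s * (1.0 - s)              # saturates: near 0 when |z| is large

def relu(z):
    return np.maximum(0.0, z)

def relu_grad(z):
    return np.where(z > 0, 1.0, 0.0)  # stays 1 for every positive input

z = np.array([-10.0, -2.0, 0.0, 2.0, 10.0])
print("z           :", z)
print("sigmoid'(z) :", np.round(sigmoid_grad(z), 5))
print("relu'(z)    :", relu_grad(z))
```

With nearly-zero gradients in the sigmoid's flat regions, learning slows to a crawl, which is exactly why moving to ReLU helped make training large networks with gradient descent so much faster.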

The other reason fast computation is important is that the process of training a neural network is very iterative. Often you have an idea for a neural network architecture, so you implement your idea in code. Then you run an experiment, which tells you how well the network does. Then you may need to go back and change the details of the network, and you go around this circle over and over. When your neural network takes a long time to train, it just takes a long time to go around this cycle. There's a huge difference in your productivity building effective neural networks when you can have an idea, try it, and see whether it worked in ten minutes or within a day, versus having to train your network for a month. Fast computation has really helped speed up the rate at which you can get experimental results back. This has helped both practitioners of neural networks and researchers in deep learning. It has also been a huge boon to the entire deep-learning research community, which has been incredible at inventing new algorithms and making non-stop progress on that front.

So, these are some of the forces powering the rise of deep learning. The good news is that these forces are still working powerfully to make deep learning even better. Take data: society is still throwing off more and more digital data. Or take computation: with the rise of specialized hardware such as GPUs, faster networking, and many other kinds of hardware, I'm quite confident that our ability to build very large neural networks will keep on getting better. And take algorithms: we can hope the deep-learning research community will continue to be phenomenal at innovating on the algorithmic front. Because of all this, I think we can be optimistic that deep learning will keep on getting better for many years to come.

<end>
