Coursera | Andrew Ng (01-week-1-1.4): Why is Deep Learning taking off?

This series only adds personal study notes and supplementary derivations on top of the original course material; corrections and feedback are welcome. After working through Andrew Ng's course, I organized it into text to make review easier. Since I have been studying English, this series is primarily in English, and I suggest readers also work mainly from the English, with Chinese as a supplement, as groundwork for reading academic papers in related fields later on. - ZJ

Coursera course | deeplearning.ai | NetEase Cloud Classroom


Please credit the author and source when reposting: WeChat public account 「SelfImprovementLab」

Zhihu: https://zhuanlan.zhihu.com/c_147249273

CSDN: http://blog.csdn.net/JUNJUN_ZHAO/article/details/78799587


Why is Deep Learning taking off?

If the basic technical ideas behind deep learning and neural networks have been around for decades, why are they only just now taking off? In this video, let's go over some of the main drivers behind the rise of deep learning, because I think this will help you spot the best opportunities within your own organization to apply these techniques.

Over the last few years a lot of people have asked me, "Andrew, why is deep learning suddenly working so well?", and when I answer the question, this is usually the picture I draw for them. Let's say we plot a figure where, on the horizontal axis, we plot the amount of data we have for a task, and on the vertical axis we plot the performance of our learning algorithm: for example, the accuracy of our spam classifier or our ad click predictor, or the accuracy of the neural net that figures out the positions of other cars for our self-driving car.

[Figure: "taking off" plot, performance as a function of the amount of data]

It turns out that if you plot the performance of a traditional learning algorithm, like a support vector machine or logistic regression, as a function of the amount of data you have, you might get a curve where the performance improves for a while as you add more data, but after a while the performance pretty much plateaus into a horizontal line; these algorithms didn't know what to do with huge amounts of data.

What happened in our society over the last 20 years is that, for a lot of problems, we went from having a relatively small amount of data to having a fairly large amount of data, and all of this was thanks to the digitization of society, where so much human activity is now in the digital realm. We spend so much time on computers, on websites, and on mobile apps, and activities on digital devices create data. Thanks to the rise of inexpensive cameras built into our cell phones, accelerometers, and all sorts of sensors in the Internet of Things, we have been collecting more and more data.

So over the last 20 years, for a lot of applications we accumulated a lot more data, more than traditional learning algorithms were able to effectively take advantage of. With neural networks, it turns out that if you train a small neural net, the performance curve looks a bit better; if you train a somewhat larger, medium-sized neural net, the performance is a little better still; and if you train a very large neural net, the performance often just keeps getting better and better. A couple of observations: first, if you want to hit this very high level of performance, you need two things.

First, you often need to be able to train a big enough neural network in order to take advantage of the huge amount of data, and second, you need to be far out on the x-axis: you do need a lot of data. So we often say that scale has been driving deep learning progress, and by "scale" I mean both the size of the neural network, meaning a network with a lot of hidden units, a lot of parameters, and a lot of connections, and the scale of the data.
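To make the picture concrete, here is a minimal plotting sketch. It is not from the course; the curve shapes and numbers are illustrative assumptions chosen only to mimic the qualitative figure, where a traditional algorithm plateaus while progressively larger neural networks keep improving as the data grows.

```python
import numpy as np
import matplotlib.pyplot as plt

# Amount of labeled data m, in arbitrary units.
m = np.linspace(0, 100, 500)

# Illustrative saturating curves: performance rises with data, then levels off.
# The ceilings and rates are made-up numbers, only meant to mimic the slide.
def curve(m, ceiling, rate):
    return ceiling * (1 - np.exp(-rate * m))

plt.plot(m, curve(m, 0.60, 0.30), label="traditional algorithm (SVM, logistic regression)")
plt.plot(m, curve(m, 0.75, 0.10), label="small neural net")
plt.plot(m, curve(m, 0.85, 0.06), label="medium neural net")
plt.plot(m, curve(m, 0.95, 0.04), label="large neural net")

plt.xlabel("amount of labeled data (m)")
plt.ylabel("performance")
plt.legend()
plt.show()
```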

In fact, today one of the most reliable ways to get better performance from a neural network is often to either train a bigger network or throw more data at it, and that only works up to a point, because eventually you run out of data, or eventually your network is so big that it takes too long to train. But just improving scale has taken us a long way in the world of deep learning. To make this diagram a bit more technically precise, let me add a few things: I wrote the amount of data on the x-axis; technically, this is the amount of labeled data, where by labeled data I mean training examples for which we have both the input x and the label y.

I want to introduce a little bit of notation that we'll use later in this course: we're going to use the lowercase letter m to denote the size of the training set, that is, the number of training examples; that's what is on the horizontal axis. A couple of other details about this figure: in the regime of smaller training sets, the relative ordering of the algorithms is actually not very well defined, so if you don't have a lot of training data, performance is often determined by your skill at hand-engineering features.
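In code, this notation maps directly onto the shape of the training set. A minimal sketch, with made-up toy data:

```python
import numpy as np

# A toy labeled training set: each row of X is one input x, y holds the labels.
X = np.array([[0.5, 1.2],
              [1.1, 0.3],
              [0.9, 2.0]])   # shape (m, number of features)
y = np.array([0, 1, 1])      # shape (m,)

m = X.shape[0]   # m = number of training examples
print(m)         # 3
```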

So it's quite possible that, if someone training an SVM is more motivated to hand-engineer features, then in this small-training-set regime the SVM could do better than someone training an even larger neural net. In the region to the left of the figure, the relative ordering of the algorithms is not that well defined, and performance depends much more on your skill at hand-engineering features and on other minor details of the algorithms. It's only in the big-data regime, with very large training sets, the very large m region on the right, that we consistently see large neural nets dominating the other approaches. So if any of your friends ask you why neural nets are taking off, I would encourage you to draw this picture for them as well.

I will say that in the early days of the modern rise of deep learning, it was scale of data and scale of computation, just our ability to train very large neural networks on a CPU or GPU, that enabled us to make a lot of progress. But increasingly, especially in the last several years, we've seen tremendous algorithmic innovation as well, so I don't want to understate that. Interestingly, many of the algorithmic innovations have been about trying to make neural networks run much faster. As a concrete example, one of the huge breakthroughs in neural networks has been switching from the sigmoid function to the ReLU function, which we talked about briefly in an earlier video.

[Figure: sigmoid and ReLU activation functions]

If you don't understand the details of what I'm about to say, don't worry about it. It turns out that one of the problems of using sigmoid functions in machine learning is that in these regions the slope of the function, the gradient, is nearly zero, so learning becomes really slow: when you implement gradient descent and the gradient is nearly zero, the parameters change very slowly. Whereas by changing what's called the activation function, having the neural network use the function called the rectified linear unit, or ReLU,

the gradient is equal to one for all positive values of the input, so the gradient is much less likely to gradually shrink to zero; the slope of the line is zero on the left, for negative inputs. It turns out that just by switching from the sigmoid function to the ReLU function, the algorithm called gradient descent works much faster. This is an example of a relatively simple algorithmic innovation, but ultimately its impact was to speed up computation. There are quite a lot of examples like this, where we change the algorithm because it allows the code to run much faster, and this lets us train bigger neural networks, or do so within a reasonable amount of time, even when we have a large network with a lot of data. The other reason fast computation is important is that the process of training a neural network is often very iterative: you have an idea for a neural network architecture, and then you implement your idea in code.
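To see the saturation point numerically, here is a small numpy sketch (not course code) comparing the two gradients:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def sigmoid_grad(z):
    s = sigmoid(z)
    return s * (1.0 - s)          # at most 0.25, and nearly 0 for large |z|

def relu_grad(z):
    return (z > 0).astype(float)  # exactly 1 for z > 0, 0 for z < 0

z = np.array([-10.0, -1.0, 1.0, 10.0])
print(sigmoid_grad(z))  # [4.5e-05 1.97e-01 1.97e-01 4.5e-05] -- saturates at the tails
print(relu_grad(z))     # [0. 0. 1. 1.] -- stays 1 for all positive inputs
```

With sigmoid, gradient descent updates shrink toward zero whenever activations sit in the flat tails, whereas with ReLU the gradient stays at 1 for positive inputs, so learning does not slow down there.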

Implementing your idea lets you run an experiment, which tells you how well your neural network does, and then, by looking at the result, you go back and change the details of your neural network, and you go around this cycle over and over. When your neural network takes a long time to train, it takes a long time to go around this cycle, and there's a huge difference in your productivity building effective neural networks when you can have an idea, try it, and see whether it works in ten minutes, or maybe within a day, versus having to train your neural network for a month, which sometimes happens. When you get a result back in ten minutes or in a day, you can try a lot more ideas, and you're much more likely to discover a neural network that works well for your application. So faster computation has really helped

in terms of speeding up the rate at which you can get an experimental result back, and this has helped both practitioners of neural networks and researchers working in deep learning iterate much faster and improve their ideas much faster. All of this has also been a huge boon to the entire deep learning research community, which has been incredible at inventing new algorithms and making nonstop progress on that front. So these are some of the forces powering the rise of deep learning, but the good news is that these forces are still working powerfully to make deep learning even better.

Take data: society is still producing more and more digital data. Or take computation: with the rise of specialized hardware like GPUs, faster networking, and many other types of hardware, I'm actually quite confident that our ability to build very large neural networks, from a computational point of view, will keep on getting better. And take algorithms: the deep learning research community is continuously phenomenal at innovating on the algorithms front. Because of this, I think we can be optimistic that deep learning will keep on getting better for many years to come. So with that, let's go on to the last video of this section, where we'll talk a little bit more about what you'll learn in this course.


PS: Scan the code to follow the WeChat public account 「SelfImprovementLab」, which focuses on deep learning, machine learning, and artificial intelligence, and occasionally organizes group check-in activities around early rising, reading, exercise, English, and more.
