Coursera | Andrew Ng (03-week1-1.8): Why human-level performance?

This series only adds personal study notes and supplementary derivations on top of the original course material; if there are errors, corrections are welcome. After studying Andrew Ng's course, I transcribed it into text to make review and look-up easier. Since I have been studying English, the series is presented primarily in English, and readers are encouraged to follow the English as well, as preparation for reading academic papers in this field later on. - ZJ

Coursera course | deeplearning.ai | NetEase Cloud Classroom


Please credit the author and source when reposting: ZJ, WeChat public account "SelfImprovementLab"

Zhihu: https://zhuanlan.zhihu.com/c_147249273

CSDN: http://blog.csdn.net/junjun_zhao/article/details/79150678


1.8 Why human-level performance?

(Subtitle source: NetEase Cloud Classroom)


In the last few years, a lot more machine learning teams have been talking about comparing machine learning systems to human-level performance. Why is this? I think there are two main reasons. First, because of advances in deep learning, machine learning algorithms are suddenly working much better, so in a lot of application areas it has become much more feasible for machine learning algorithms to actually become competitive with human-level performance. Second, it turns out that the workflow of designing and building a machine learning system is much more efficient when you're trying to do something that humans can also do. So in those settings, it becomes natural to talk about comparing to, or trying to mimic, human-level performance.


Let's see a couple of examples of what this means. I've seen on a lot of machine learning tasks that, as you work on a problem over time (so the x-axis is time, which could be many months or even many years over which some team or research community works on the problem), progress tends to be relatively rapid as you approach human-level performance. But then after a while, the algorithm surpasses human-level performance, and progress in accuracy actually slows down. Maybe it keeps getting better; even after surpassing human-level performance it can still improve, but the slope of how rapidly accuracy goes up often flattens. The hope is that it achieves some theoretical optimum level of performance: over time, as you keep training the algorithm, maybe with bigger and bigger models on more and more data, performance approaches but never surpasses some theoretical limit, which is called the Bayes optimal error.



So think of Bayes optimal error as the best possible error: there is simply no way for any function mapping from x to y to surpass a certain level of accuracy. For example, in speech recognition, if x is audio clips, some audio is just so noisy that it is impossible to tell what the correct transcription is, so the perfect accuracy may not be 100%. Or for cat recognition, maybe some images are so blurry that it is impossible for anyone or anything to tell whether or not there's a cat in the picture, so again the perfect level of accuracy may not be 100%. Bayes optimal error (or Bayesian optimal error, or sometimes Bayes error for short) is the error of the very best theoretical function mapping from x to y, and it can never be surpassed. So it should be no surprise that, no matter how many years you work on a problem, you can never surpass the Bayes optimal error (the purple line). It turns out that progress is often quite fast until you surpass human-level performance, and sometimes slows down afterwards. I think there are two reasons why progress often slows down once you surpass human-level performance. One reason is that, for many tasks, human-level performance is not that far from the Bayes optimal error: people are very good at looking at images and telling if there's a cat, or at listening to audio and transcribing it. So by the time you surpass human-level performance, there may not be much headroom left to improve. The second reason is that, so long as your performance is worse than human-level performance, there are certain tools you can use to improve performance that are harder to use once you've surpassed it.
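To make the idea of an unbeatable error floor concrete, here is a small simulated sketch (the task, the 5% noise rate, and the sample size are all invented for illustration): labels follow a simple rule, but each label is flipped with probability 0.05. Even the best possible classifier, which is the true rule itself, still gets about 5% of examples wrong, so the Bayes error here is 5%:

```python
# Toy illustration (all numbers invented) of Bayes optimal error: if labels
# are inherently noisy, even the best possible classifier cannot beat the
# noise floor.
import random

random.seed(0)
FLIP_PROB = 0.05  # irreducible label noise -> Bayes error of 5%

data = []
for _ in range(100_000):
    x = random.uniform(-1, 1)
    y = (x > 0)                      # the underlying true rule
    if random.random() < FLIP_PROB:  # noise no model can explain
        y = not y
    data.append((x, y))

# The Bayes-optimal classifier for this data is exactly the rule "x > 0";
# its error still cannot go below FLIP_PROB.
errors = sum(1 for x, y in data if (x > 0) != y)
print(f"error of the optimal rule: {errors / len(data):.3f}")
```

Any other classifier can only do worse than this rule on average, which is what "no function mapping from x to y can surpass a certain level of accuracy" means.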



So here's what I mean. Humans tend to be very good at tasks involving natural data: looking at pictures and recognizing things, listening to audio, or reading language. For tasks that humans are good at, so long as your machine learning algorithm is still worse than humans, you can get labeled data from humans: you can ask or hire people to label examples for you, so that you have more data to feed your learning algorithm. Something we'll talk about next week is manual error analysis: so long as humans still perform better than your algorithm, you can ask people to look at the examples your algorithm is getting wrong, and try to gain insight into why a person got it right but the algorithm got it wrong. We'll see next week that this helps improve your algorithm's performance. You can also get a better analysis of bias and variance, which we'll talk about in a little bit. So long as your algorithm is still doing worse than humans, you have these important tactics for improving it; once your algorithm is doing better than humans, these three tactics are harder to apply. This is maybe another reason why comparing to human-level performance is helpful, especially on tasks that humans do well, and why machine learning algorithms tend to be really good at replicating tasks that people can do, catching up to and maybe slightly surpassing human-level performance. In particular, even once you know what bias and variance are, knowing how well humans can do on a task helps you understand better how much you should try to reduce bias and how much you should try to reduce variance. I want to show you an example of this in the next video.
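The bias/variance point at the end can be sketched with hypothetical error numbers, treating human-level error as a proxy for Bayes error (the detailed recipe is covered in the next videos; this is just a sketch of the comparison):

```python
# Minimal sketch (error values are hypothetical): use human-level error as a
# proxy for Bayes error to decide whether bias or variance is the bigger problem.

def diagnose(human_error, train_error, dev_error):
    """Return 'bias' or 'variance', whichever gap is larger."""
    avoidable_bias = train_error - human_error  # training error vs. (proxy) Bayes error
    variance = dev_error - train_error          # generalization gap
    return "bias" if avoidable_bias > variance else "variance"

# Training error far above human level -> focus on reducing bias.
print(diagnose(human_error=0.01, train_error=0.08, dev_error=0.10))  # bias
# Small bias gap but a large train/dev gap -> focus on reducing variance.
print(diagnose(human_error=0.01, train_error=0.02, dev_error=0.10))  # variance
```

Without the human-level number, a 8% training error tells you little; knowing humans achieve roughly 1% is what reveals the large avoidable-bias gap.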







PS: You are welcome to scan the QR code and follow the public account "SelfImprovementLab", which focuses on deep learning, machine learning, and artificial intelligence, with occasional group check-in activities for early rising, reading, exercise, English, and more.
