Facebook AI Director Yann LeCun on His Quest to Unleash Deep Learning and Make Machines Smarter

Original post, 2015-03-01 11:35:26



Explaining Deep Learning … in Eight Words

IEEE Spectrum: We read about Deep Learning in the news a lot these days. What’s your least favorite definition of the term that you see in these stories?

  • Yann LeCun: My least favorite description is, “It works just like the brain.” I don’t like people saying this because, while Deep Learning gets an inspiration from biology, it’s very, very far from what the brain actually does. And describing it like the brain gives a bit of the aura of magic to it, which is dangerous. It leads to hype; people claim things that are not true. AI has gone through a number of AI winters because people claimed things they couldn’t deliver.

Spectrum: So if you were a reporter covering a Deep Learning announcement, and had just eight words to describe it, which is usually all a newspaper reporter might get, what would you say?

  • LeCun: I need to think about this. [Long pause.] I think it would be “machines that learn to represent the world.” That’s eight words. Perhaps another way to put it would be “end-to-end machine learning.” Wait, it’s only five words and I need to kind of unpack this. [Pause.] It’s the idea that every component, every stage in a learning machine can be trained.

Spectrum: Your editor is not going to like that.

  • LeCun: Yeah, the public wouldn’t understand what I meant. Oh, okay. Here’s another way. You could think of Deep Learning as the building of learning machines, say pattern recognition systems or whatever, by assembling lots of modules or elements that all train the same way. So there is a single principle to train everything. But again, that’s a lot more than eight words.

Spectrum: What can a Deep Learning system do that other machine learning systems can’t do?

  • LeCun: That may be a better question. Previous systems, which I guess we could call “shallow learning systems,” were limited in the complexity of the functions they could compute. So if you want a shallow learning algorithm like a “linear classifier” to recognize images, you will need to feed it with a suitable “vector of features” extracted from the image. But designing a feature extractor “by hand” is very difficult and time consuming.
    An alternative is to use a more flexible classifier, such as a “support vector machine” or a two-layer neural network fed directly with the pixels of the image. The problem is that it’s not going to be able to recognize objects to any degree of accuracy, unless you make it so gigantically big that it becomes impractical.
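The "hand-designed feature extractor plus linear classifier" setup LeCun describes can be sketched roughly as follows. This is only an illustration and not something from the interview: the two features, the weights, the labels, and the tiny 2x2 "images" are all made up for the example.

```python
# Illustrative sketch: a hand-designed feature extractor feeding a linear classifier.
# The features, weights, labels and images below are hypothetical.

def extract_features(image):
    """Hand-designed features: mean brightness and left/right contrast."""
    pixels = [p for row in image for p in row]
    mean_brightness = sum(pixels) / len(pixels)
    left = sum(row[0] for row in image)
    right = sum(row[-1] for row in image)
    contrast = abs(left - right) / len(image)
    return [mean_brightness, contrast]

def linear_classifier(features, weights, bias):
    """A linear classifier: threshold a weighted sum of the features."""
    score = sum(w * f for w, f in zip(weights, features)) + bias
    return "edge" if score > 0 else "flat"

# Hypothetical weights; in practice these would be learned from data.
weights, bias = [0.0, 1.0], -0.25

flat_image = [[0.5, 0.5], [0.5, 0.5]]
edge_image = [[0.0, 1.0], [0.0, 1.0]]

print(linear_classifier(extract_features(flat_image), weights, bias))
print(linear_classifier(extract_features(edge_image), weights, bias))
```

The classifier itself is trivial; all the difficulty sits in `extract_features`, which is the part that has to be redesigned by hand for every new problem.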

Spectrum: It doesn’t sound like a very easy explanation. And that’s why reporters trying to describe Deep Learning end up saying…

  • LeCun: …that it’s like the brain.


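  • LeCun argues that Deep Learning achieves better results than an ordinary linear classifier, and uses image recognition to illustrate the point. Deep Learning models are not very interpretable, however, and a shallow learning algorithm is sometimes easier to interpret. Shallow learning algorithms are also well established: practical tools for them already exist and cover basic needs.
  • Deep Learning is related to neural networks, but it is still far removed from how the human brain works.
  • This is only a brief account of what machine learning is; I will expand on it later.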

A Black Box With 500 Million Knobs

Spectrum: Part of the problem is that machine learning is a surprisingly inaccessible area to people not working in the field. Plenty of educated lay people understand semi-technical computing topics, like, say, the PageRank algorithm that Google uses. But I’d bet that only professionals know anything detailed about linear classifiers or vector machines. Is that because the field is inherently complicated?

  • LeCun: Actually, I think the basics of machine learning are quite simple to understand. I’ve explained this to high-school students and school teachers without putting too many of them to sleep.

  • A pattern recognition system is like a black box with a camera at one end, a green light and a red light on top, and a whole bunch of knobs on the front. The learning algorithm tries to adjust the knobs so that when, say, a dog is in front of the camera, the red light turns on, and when a car is put in front of the camera, the green light turns on. You show a dog to the machine. If the red light is bright, don’t do anything. If it’s dim, tweak the knobs so that the light gets brighter. If the green light turns on, tweak the knobs so that it gets dimmer. Then show a car, and tweak the knobs so that the red light gets dimmer and the green light gets brighter. If you show many examples of the cars and dogs, and you keep adjusting the knobs just a little bit each time, eventually the machine will get the right answer every time.

  • The interesting thing is that it may also correctly classify cars and dogs it has never seen before. The trick is to figure out in which direction to tweak each knob and by how much without actually fiddling with them. This involves computing a “gradient,” which for each knob indicates how the light changes when the knob is tweaked.

  • Now, imagine a box with 500 million knobs, 1,000 light bulbs, and 10 million images to train it with. That’s what a typical Deep Learning system is.
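The knob-tweaking procedure can be sketched with a very small box: three knobs and one red light, with the green light taken to be one minus the red one. This is a minimal sketch under assumptions that are not in the interview: the inputs are invented three-number "images" (the last number is a constant 1), the light is a logistic function of the weighted sum, and the error measure is cross-entropy, for which the gradient per knob is `(red - target) * x_i`.

```python
import math

def red_light(knobs, x):
    """Brightness of the red ("dog") light in [0, 1]; green is 1 - red."""
    z = sum(k * xi for k, xi in zip(knobs, x))
    return 1.0 / (1.0 + math.exp(-z))

def gradient(knobs, x, target):
    """For each knob, how the cross-entropy error changes when it is turned."""
    red = red_light(knobs, x)
    return [(red - target) * xi for xi in x]

def train(examples, n_knobs, step=0.5, epochs=200):
    """Show every example many times, nudging each knob a little each time."""
    knobs = [0.0] * n_knobs
    for _ in range(epochs):
        for x, target in examples:
            g = gradient(knobs, x, target)
            knobs = [k - step * gi for k, gi in zip(knobs, g)]
    return knobs

# Hypothetical training set: target 1.0 means "dog" (red), 0.0 means "car" (green).
examples = [
    ([0.9, 0.1, 1.0], 1.0),
    ([0.8, 0.2, 1.0], 1.0),
    ([0.1, 0.9, 1.0], 0.0),
    ([0.2, 0.8, 1.0], 0.0),
]
knobs = train(examples, n_knobs=3)

# Inputs the box has never seen before.
unseen_dog = [0.7, 0.3, 1.0]
unseen_car = [0.3, 0.7, 1.0]
print(red_light(knobs, unseen_dog) > 0.5, red_light(knobs, unseen_car) < 0.5)
```

The gradient is computed from a formula rather than by physically turning each knob and watching the light, which is what makes the same idea workable with hundreds of millions of knobs.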

Spectrum: I assume that you use the term “shallow learning” somewhat tongue-in-cheek; I doubt people who work with linear classifiers consider their work “shallow.” Doesn’t the expression “Deep Learning” have an element of PR to it, since it implies that what is “deep” is what is being learned, when in fact the “deep” part is just the number of steps in the system?

  • LeCun: Yes, it is a bit facetious, but it reflects something real: shallow learning systems have one or two layers, while deep learning systems typically have five to 20 layers. It is not the learning that is shallow or deep, but the architecture that is being trained.


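  • This part explains the concept of Deep Learning. The reporter seems to know the subject well: many people really do not know what machine learning or Deep Learning is, or even how an algorithm like PageRank works.
  • LeCun also points out that a shallow learning algorithm is not itself shallow; it is the architecture being trained that is shallow.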

The Pursuit of Beautiful Ideas (Some Hacking Required)

Spectrum: The standard Yann LeCun biography says that you were exploring new approaches to neural networks at a time when they had fallen out of favor. What made you ignore the conventional wisdom and keep at it?

  • LeCun: I have always been enamored of the idea of being able to train an entire system from end to end. You hit the system with essentially raw input, and because the system has multiple layers, each layer will eventually figure out how to transform the representations produced by the previous layer so that the last layer produces the answer. This idea—that you should integrate learning from end to end so that the machine learns good representations of the data—is what I have been obsessed with for over 30 years.
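A toy version of end-to-end training might look like the sketch below: two stacked stages, each with a single weight, where the error at the output is passed back through every stage so that both weights are adjusted by the same rule. The network shape (`tanh` followed by a scaling), the squared-error loss, and all numbers are assumptions chosen for illustration, not anything LeCun specifies.

```python
import math

def forward(w1, w2, x):
    """Two stacked stages: hidden = tanh(w1 * x), output = w2 * hidden."""
    hidden = math.tanh(w1 * x)
    return hidden, w2 * hidden

def loss(w1, w2, x, target):
    """Squared error between the final output and the desired answer."""
    _, out = forward(w1, w2, x)
    return 0.5 * (out - target) ** 2

def backprop(w1, w2, x, target):
    """Chain rule: send the output error back through each stage in turn."""
    hidden, out = forward(w1, w2, x)
    d_out = out - target                         # dLoss/d(output)
    grad_w2 = d_out * hidden                     # gradient for the last stage
    d_hidden = d_out * w2                        # error handed to the first stage
    grad_w1 = d_hidden * (1.0 - hidden ** 2) * x # tanh'(z) = 1 - tanh(z)^2
    return grad_w1, grad_w2

def train(x, target, w1=0.5, w2=0.5, step=0.1, iters=500):
    """Adjust both stages together, using only the raw input and the target."""
    for _ in range(iters):
        g1, g2 = backprop(w1, w2, x, target)
        w1, w2 = w1 - step * g1, w2 - step * g2
    return w1, w2

x, target = 1.0, 0.8
w1, w2 = train(x, target)
print(loss(0.5, 0.5, x, target), loss(w1, w2, x, target))
```

Nothing tells the first stage what its intermediate output should be; it settles on whatever representation lets the last stage produce the right answer.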

Spectrum: Is the work you do “hacking,” or is it science? Do you just try things until they work, or do you start with a theoretical insight?

  • LeCun: It’s very much an interplay between intuitive insights, theoretical modeling, practical implementations, empirical studies, and scientific analyses. The insight is creative thinking, the modeling is mathematics, the implementation is engineering and sheer hacking, the empirical study and the analysis are actual science. What I am most fond of are beautiful and simple theoretical ideas that can be translated into something that works.

  • I have very little patience for people who do theory about a particular thing simply because it’s easy, particularly if they dismiss other methods that actually work empirically, just because the theory is too difficult. There is a bit of that in the machine learning community. In fact, to some extent, the “Neural Net Winter” during the late 1990s and early 2000s was a consequence of that philosophy; that you had to have ironclad theory, and the empirical results didn’t count. It’s a very bad way to approach an engineering problem.

  • But there are dangers in the purely empirical approach too. For example, the speech recognition community has traditionally been very empirical, in the sense that the only stuff that’s being paid attention to is how well you are doing on certain benchmarks. And that stifles creativity, because to get to the level where you can beat other teams that have been at it for years, you need to go underground for four or five years, building your own infrastructure. That’s very difficult and very risky, and so nobody does it. And so to some extent with the speech recognition community, the progress has been continuous but very incremental, at least until the emergence of Deep Learning in the last few years.

Spectrum: You seem to take pains to distance your work from neuroscience and biology. For example, you talk about “convolutional nets,” and not “convolutional neural nets.” And you talk about “units” in your algorithms, and not “neurons.”

  • LeCun: That’s true. Some aspects of our models are inspired by neuroscience, but many components are not at all inspired by neuroscience, and instead come from theory, intuition, or empirical exploration. Our models do not aspire to be models of the brain, and we don’t make claims of neural relevance. But at the same time, I’m not afraid to say that the architecture of convolutional nets is inspired by some basic knowledge of the visual cortex. There are people who indirectly get inspiration from neuroscience, but who will not admit it. I admit it. It’s very helpful. But I’m very careful not to use words that could lead to hype. Because there is a huge amount of hype in this area. Which is very dangerous.


  • I cannot help venting about the first question: mainstream academia dropped neural networks back then and has now taken them up again, which plainly shows how it chases whatever pays off. Thinking from first principles is enough. “I think it’s important to reason from first principles rather than by analogy. The normal way we conduct our lives is we reason by analogy. [With analogy] we are doing this because it’s like something else that was done, or it is like what other people are doing. [With first principles] you boil things down to the most fundamental truths…and then reason up from there.”



Convolutional neural network, excerpted from Deep Learning by Yann LeCun

