What's Wrong with Deep Learning?

Yann LeCun

 

Deep learning methods have had a profound impact on a number of areas in recent years, including natural image understanding and speech recognition. Other areas seem on the verge of being similarly impacted, notably natural language processing, biomedical image analysis, and the analysis of sequential signals in a variety of application domains. But deep learning systems, as they exist today, have many limitations.

First, they lack mechanisms for reasoning, search, and inference. Complex and/or ambiguous inputs require deliberate reasoning to arrive at a consistent interpretation. Producing structured outputs, such as a long text, or a label map for image segmentation, requires sophisticated search and inference algorithms to satisfy complex sets of constraints. One approach to this problem is to marry deep learning with structured prediction (an idea first presented at CVPR 1997). While several deep learning systems augmented with structured prediction modules trained end to end have been proposed for OCR, body pose estimation, and semantic segmentation, new concepts are needed for tasks that require more complex reasoning.
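
Structured prediction on top of a neural net can be made concrete with a small sketch. A classic instance is Viterbi decoding: a network produces a score for each label at each step, pairwise transition scores encode the constraints, and dynamic programming finds the globally best label sequence. The scores below are toy values invented for illustration, not the output of any real model:

```python
import numpy as np

def viterbi_decode(emissions, transitions):
    """Find the highest-scoring label sequence given per-step label
    scores (e.g. from a neural net) and pairwise transition scores."""
    T, K = emissions.shape           # T steps, K labels
    score = emissions[0].copy()      # best score ending in each label
    back = np.zeros((T, K), dtype=int)
    for t in range(1, T):
        # total[i, j]: best path ending in label i, then moving to j
        total = score[:, None] + transitions + emissions[t][None, :]
        back[t] = total.argmax(axis=0)
        score = total.max(axis=0)
    # trace the best path backwards from the best final label
    path = [int(score.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return path[::-1]

# Toy example: 3 steps, 2 labels; transitions discourage label changes.
emissions = np.array([[2.0, 0.0], [0.1, 0.0], [0.0, 2.0]])
transitions = np.array([[0.5, -0.5], [-0.5, 0.5]])
print(viterbi_decode(emissions, transitions))  # [0, 0, 1]
```

The transition matrix is where the "complex sets of constraints" live; in an end-to-end system both the emission scores and the transitions are learned jointly.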

Second, they lack short-term memory. Many tasks in natural language understanding, such as question answering, require a way to temporarily store isolated facts. Correctly interpreting events in a video and being able to answer questions about it requires remembering abstract representations of what happens in the video. Deep learning systems, including recurrent nets, are notoriously inefficient at storing temporary memories. This has led researchers to propose neural net systems augmented with separate memory modules, such as LSTM, Memory Networks, Neural Turing Machines, and Stack-Augmented RNNs. While these proposals are interesting, new ideas are needed.
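
The memory modules mentioned above share one core mechanism: content-based addressing, in which the network reads from a set of memory slots by soft attention rather than by index. A minimal NumPy sketch of one such read (the keys, values, and query are toy numbers chosen for illustration, not any published model's parameters):

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def memory_read(query, keys, values):
    """Soft content-based addressing: attend over memory slots by
    key similarity and return a weighted sum of the stored values."""
    weights = softmax(keys @ query)   # one attention weight per slot
    return weights @ values, weights

# Toy memory with 3 slots of 2-d keys and 2-d values.
keys = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]])
values = np.array([[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]])
query = np.array([4.0, 0.0])          # most similar to slot 0's key
read, weights = memory_read(query, keys, values)
print(weights.argmax())               # slot 0 gets the largest weight
```

Because the read is a differentiable weighted sum, gradients flow back into both the query and the memory contents, which is what lets such modules be trained end to end.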

Lastly, they lack the ability to perform unsupervised learning. Animals and humans learn most of the structure of the perceptual world in an unsupervised manner. While the interest of the ML community in neural nets was revived in the mid-2000s by progress in unsupervised learning, the vast majority of practical applications of deep learning have used purely supervised learning. There is little doubt that future progress in computer vision will require breakthroughs in unsupervised learning, particularly for video understanding. But what principles should unsupervised learning be based on?
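
As one concrete, if very old, instance of learning structure without labels, principal component analysis recovers the directions of greatest variance in unlabelled data. The sketch below uses synthetic data (the direction and noise scale are invented for illustration) and is of course far from the breakthroughs the article calls for, but it shows the basic setting: structure is extracted with no labels anywhere:

```python
import numpy as np

rng = np.random.default_rng(0)

# Unlabelled data: points near a 1-d line embedded in 3-d space.
direction = np.array([1.0, 2.0, -1.0])
t = rng.normal(size=(200, 1))
X = t * direction + 0.05 * rng.normal(size=(200, 3))

# PCA via SVD: find the direction that explains the most variance.
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
explained = S**2 / (S**2).sum()       # variance ratio per component
print(float(explained[0]) > 0.95)     # one direction dominates: True
```

The top right-singular vector `Vt[0]` aligns (up to sign) with the hidden generating direction, which the algorithm was never told about.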

Preliminary work in each of these areas paves the way for future progress in image and video understanding.

 

