Authors: Yann LeCun, Yoshua Bengio, Geoffrey Hinton
Date: May 2015
Type: Review
Source: Nature (journal)
Section summaries and some sentence excerpts
Abstract
Deep learning allows models to learn representations of data automatically. It has brought breakthroughs in processing images, video, speech, and audio.
Introduction
- Many applications that used to rely on traditional machine learning techniques are now turning to deep learning.
- Conventional machine learning techniques were limited in their ability to process raw data. Designing a feature extractor that transforms raw data into a feature vector requires careful engineering and considerable domain expertise.
- For classification tasks, higher layers of representation amplify important aspects of the input and suppress irrelevant variations.
- Deep learning is making major advances in solving problems that have resisted the best attempts of the artificial intelligence community for many years.
- Deep learning can easily take advantage of increases in the amount of available computation and data.
Supervised learning
- With multiple non-linear layers, say a depth of 5 to 20, a system can implement extremely intricate functions of its inputs that are simultaneously sensitive to minute details while insensitive to irrelevant variations.
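A minimal sketch of such a stack of non-linear layers (numpy only; the depth of 5, the layer width, and the random weights are illustrative assumptions, not anything from the paper):

```python
import numpy as np

def relu(x):
    # Non-linearity applied after each linear layer.
    return np.maximum(0.0, x)

rng = np.random.default_rng(0)
depth, width = 5, 64  # illustrative depth and width
layers = [(rng.standard_normal((width, width)) * 0.1, np.zeros(width))
          for _ in range(depth)]

def forward(x):
    # Each layer is a linear map followed by a non-linearity;
    # composing many of them yields intricate input-output functions.
    for W, b in layers:
        x = relu(W @ x + b)
    return x

y = forward(rng.standard_normal(width))
```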
Backpropagation to train multilayer architectures
- As long as the modules are relatively smooth functions of their inputs and internal weights, one can compute gradients using the backpropagation procedure (see the sketch after this list).
- In the late 1990s, neural nets and backpropagation were widely forsaken. It was commonly thought that simple gradient descent would get trapped in poor local minima.
- For small data sets, unsupervised pre-training helps to prevent overfitting.
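A minimal numpy sketch of the backpropagation idea for one hidden layer; the squared-error loss, sigmoid units, single training example, and learning rate are illustrative assumptions, not the paper's exact setup:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(3)   # one input example
t = np.array([1.0])          # its target
W1 = rng.standard_normal((4, 3))
W2 = rng.standard_normal((1, 4))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for step in range(100):
    # Forward pass through two smooth modules.
    h = sigmoid(W1 @ x)
    y = sigmoid(W2 @ h)
    loss = 0.5 * np.sum((y - t) ** 2)

    # Backward pass: chain rule from the loss down to each weight matrix.
    dy = (y - t) * y * (1 - y)      # gradient at the output pre-activation
    dW2 = np.outer(dy, h)
    dh = (W2.T @ dy) * h * (1 - h)  # propagate the gradient into the hidden layer
    dW1 = np.outer(dh, x)

    # Plain gradient-descent update.
    W1 -= 0.5 * dW1
    W2 -= 0.5 * dW2
```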
Convolutional neural networks
- Four key ideas behind ConvNets take advantage of the properties of natural signals: local connections, shared weights, pooling, and the use of many layers (a minimal sketch of the first three appears after this list).
- First, in images, local groups of values are often highly correlated.
- Second, the local statistics of images and other signals are invariant to location.
- Pooling reduces the dimension of the representation and makes it insensitive to irrelevant variations such as position, illumination, and distortion.
- The convolutional and pooling layers in ConvNets are directly inspired by the classic notions of simple cells and complex cells in visual neuroscience.
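A minimal numpy sketch of local connections, shared weights, and pooling; the image size, kernel size, and pooling window are arbitrary choices for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
image = rng.standard_normal((8, 8))
kernel = rng.standard_normal((3, 3))  # one shared weight set for all positions

def conv2d(img, k):
    # Local connections: each output unit sees only a 3x3 patch.
    # Shared weights: the same kernel is applied at every location.
    H, W = img.shape[0] - 2, img.shape[1] - 2
    out = np.empty((H, W))
    for i in range(H):
        for j in range(W):
            out[i, j] = np.sum(img[i:i+3, j:j+3] * k)
    return out

def max_pool(x, s=2):
    # Pooling: take the max over small windows, making the representation
    # insensitive to small shifts of the input.
    H, W = x.shape[0] // s, x.shape[1] // s
    return x[:H*s, :W*s].reshape(H, s, W, s).max(axis=(1, 3))

feature_map = np.maximum(0.0, conv2d(image, kernel))  # convolution + ReLU
pooled = max_pool(feature_map)
```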
Image understanding with deep convolutional networks
- Despite successes in the 2000s, ConvNets were forsaken by the mainstream computer-vision and machine-learning communities until the ImageNet competition in 2012, when deep convolutional networks were applied to a huge data set and greatly surpassed the best competing approaches.
- This success came from the efficient use of GPUs, ReLUs, a new regularization technique called dropout, and data augmentation (a small sketch of ReLU and dropout follows).
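A small sketch of ReLU and dropout as commonly implemented (the "inverted" rescaling variant and the drop probability of 0.5 are standard choices, assumed here rather than taken from the paper):

```python
import numpy as np

rng = np.random.default_rng(0)
h = rng.standard_normal(1000)  # hypothetical hidden-layer activations

def dropout(x, p=0.5, train=True):
    # Dropout: randomly zero units during training and rescale the rest,
    # so the network cannot rely on any single feature.
    if not train:
        return x
    mask = rng.random(x.shape) >= p
    return x * mask / (1.0 - p)

activated = np.maximum(0.0, h)   # ReLU: cheap to compute, avoids saturation
regularized = dropout(activated)
```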
Distributed representations and language processing
- The standard approaches to statistical modelling of language, such as N-grams, did not exploit distributed representations.
- Neural language models can use semantic relations among sequences of words because they associate each word with a vector of real-valued features, and semantically related words end up close to each other in that vector space (a toy illustration follows).
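A toy illustration of that vector space with hand-picked 3-dimensional vectors; real models learn these embeddings from data, and the words and numbers here are purely hypothetical:

```python
import numpy as np

# Hypothetical word embeddings, chosen so that related words are nearby.
emb = {
    "king":  np.array([0.90, 0.80, 0.10]),
    "queen": np.array([0.85, 0.82, 0.15]),
    "apple": np.array([0.10, 0.20, 0.90]),
}

def cosine(a, b):
    # Cosine similarity: close to 1 for nearby directions in the space.
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

print(cosine(emb["king"], emb["queen"]))  # high: semantically related
print(cosine(emb["king"], emb["apple"]))  # low: unrelated
```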
Recurrent neural networks
- LSTM networks use special hidden units to remember inputs for a long time (a minimal cell sketch follows this list).
- Neural Turing Machine: the network is augmented by a 'tape-like' memory that the RNN can choose to read from or write to.
- Memory networks: a regular network is augmented by a kind of associative memory.
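A minimal sketch of one LSTM cell step in numpy; the hidden size is illustrative and bias terms are omitted for brevity, so this is a simplified form of the standard cell rather than the paper's exact formulation:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4  # hidden size (illustrative)
# One weight matrix per gate, each acting on [h, x] concatenated.
Wf, Wi, Wo, Wc = (rng.standard_normal((n, 2 * n)) * 0.1 for _ in range(4))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h, c):
    z = np.concatenate([h, x])
    f = sigmoid(Wf @ z)              # forget gate: what to keep in memory
    i = sigmoid(Wi @ z)              # input gate: what new information to store
    o = sigmoid(Wo @ z)              # output gate: what to expose as output
    c = f * c + i * np.tanh(Wc @ z)  # cell state carries information across steps
    h = o * np.tanh(c)
    return h, c

h = c = np.zeros(n)
for x in rng.standard_normal((10, n)):  # process a sequence of 10 inputs
    h, c = lstm_step(x, h, c)
```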
The future of deep learning
- Unsupervised learning: human and animal learning is largely unsupervised; we discover the structure of the world by observing it, not by being told the name of every object.
- Imitating human vision: systems trained end to end that combine ConvNets with RNNs which use reinforcement learning to decide where to look (an attention mechanism?).
- Natural language understanding: use RNNs to understand sentences or whole documents much better by learning strategies for selectively attending to one part at a time.