Author: G. E. Hinton, R. R. Salakhutdinov
Date: July 2006
Type: report
Source: Science (journal)
1 Problems
Biggest problem
How to reduce the dimensionality of data.
Other problem
Gradient descent can be used to fine-tune an "autoencoder" only if the initial weights are close to a good solution.
2 Previous work
- Principal components analysis (PCA)
  - A linear method.
  - Gives much worse reconstructions than the proposed autoencoder.
- Latent semantic analysis (LSA), a well-known document retrieval method based on PCA.
- Locally linear embedding (LLE), a nonlinear dimensionality reduction algorithm.
3 The authors' solution
They devise a nonlinear generalization of PCA that uses an adaptive, multilayer “encoder” network to convert high-dimensional data into a low-dimensional code, and a similar “decoder” network to recover the data from the code.
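A minimal sketch of this encoder/decoder idea, assuming PyTorch (an assumption; this is not the authors' code). The layer sizes follow the 784-1000-500-250-30 autoencoder the paper reports for MNIST; in the paper the weights are first initialized by RBM pre-training rather than trained from scratch.

```python
import torch
import torch.nn as nn

class Autoencoder(nn.Module):
    def __init__(self):
        super().__init__()
        # Encoder: 784 pixels -> 30-dimensional code (linear code layer).
        self.encoder = nn.Sequential(
            nn.Linear(784, 1000), nn.Sigmoid(),
            nn.Linear(1000, 500), nn.Sigmoid(),
            nn.Linear(500, 250), nn.Sigmoid(),
            nn.Linear(250, 30),
        )
        # Decoder: mirror image of the encoder, reconstructs the pixels.
        self.decoder = nn.Sequential(
            nn.Linear(30, 250), nn.Sigmoid(),
            nn.Linear(250, 500), nn.Sigmoid(),
            nn.Linear(500, 1000), nn.Sigmoid(),
            nn.Linear(1000, 784), nn.Sigmoid(),
        )

    def forward(self, x):
        code = self.encoder(x)           # low-dimensional representation
        return self.decoder(code), code  # reconstruction and code

# Example: reconstruct a batch of flattened 28x28 images.
model = Autoencoder()
recon, code = model(torch.rand(8, 784))
```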
Hypotheses
None
Methodology
- Restricted Boltzmann Machine (RBM), trained with single-step contrastive divergence; see the sketch below.
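Each layer pair is pre-trained as an RBM using single-step contrastive divergence (CD-1), then the stack is "unrolled" into the encoder/decoder and fine-tuned with backpropagation. Below is a minimal NumPy sketch of one CD-1 update for a binary RBM; the variable names and learning rate are illustrative assumptions, not from the paper.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_update(v0, W, b, c, lr=0.1, rng=np.random.default_rng(0)):
    """One CD-1 step. v0: (batch, n_visible) binary data.
    W: (n_visible, n_hidden); b: visible bias; c: hidden bias."""
    # Positive phase: sample hidden units given the data.
    ph0 = sigmoid(v0 @ W + c)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    # Negative phase: one step of alternating Gibbs sampling.
    pv1 = sigmoid(h0 @ W.T + b)
    ph1 = sigmoid(pv1 @ W + c)
    # Approximate the log-likelihood gradient by the difference between
    # data-driven and reconstruction-driven statistics.
    n = v0.shape[0]
    W += lr * (v0.T @ ph0 - pv1.T @ ph1) / n
    b += lr * (v0 - pv1).mean(axis=0)
    c += lr * (ph0 - ph1).mean(axis=0)
    return W, b, c
```

After one RBM is trained, its hidden-unit activations serve as the training data for the next RBM in the stack; the learned weights then initialize the corresponding encoder layer and its transposed decoder layer before fine-tuning.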
Experiment
- On a synthetic data set of "curves" images.
- On the MNIST hand-written digits data set.
- On the Olivetti face data set.
- On documents: 804,414 newswire stories.
Data sets
- A synthetic data set containing images of "curves".
- The MNIST hand-written digits data set.
- The Olivetti face data set.
- A corpus of 804,414 newswire stories.
Are the data sets sufficient?
Yes.
Advantages
- The autoencoder outperforms PCA in three reconstruction tasks: synthetic "curves" images, MNIST hand-written digits, and Olivetti faces.
- It also outperformed latent semantic analysis (LSA) and locally linear embedding in the document encoding task.
Weakness
The authors did not point out any weaknesses themselves; possible ones include:
- It requires layer-by-layer pre-training before fine-tuning.
Application
- Data dimensionality reduction
- Classification (e.g., on top of the learned codes; see the sketch below)
- Regression
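For example, the low-dimensional codes from a trained encoder can feed a standard classifier. A hypothetical usage sketch assuming scikit-learn, with random arrays standing in for real encoder outputs and labels:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
codes = rng.normal(size=(1000, 30))      # stand-in for encoder outputs
labels = rng.integers(0, 10, size=1000)  # stand-in for digit labels

# Fit a simple classifier on the 30-dimensional codes.
clf = LogisticRegression(max_iter=1000).fit(codes, labels)
print("training accuracy:", clf.score(codes, labels))
```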
4 What is the authors' next step?
The authors did not propose a next step.
Do you agree with the authors?
Yes.
5 What do other researchers say about this work?
- [2015 - Nature - Mnih et al. - Human-level control through deep reinforcement learning]
  "Notably, recent advances in deep neural networks [9-11], in which several layers of nodes are used to build up progressively more abstract representations of the data, have made it possible for artificial neural networks to learn concepts such as object categories directly from raw sensory data."
- [2014 - Journal of Machine Learning Research - Srivastava et al. - Dropout: A Simple Way to Prevent Neural Networks from Overfitting]
  "Neural networks can be pretrained using stacks of RBMs (Hinton and Salakhutdinov, 2006), autoencoders (Vincent et al., 2010) or Deep Boltzmann Machines (Salakhutdinov and Hinton, 2009)."
- [2015 - ICLR - Kingma & Ba - Adam: A Method for Stochastic Optimization]
  "SGD proved itself as an efficient and effective optimization method that was central in many machine learning success stories, such as recent advances in deep learning (Deng et al., 2013; Krizhevsky et al., 2012; Hinton & Salakhutdinov, 2006; Hinton et al., 2012a; Graves et al., 2013)."