In the previous tutorial, I discussed the use of deep networks to classify nonlinear data. In addition to their ability to handle nonlinear data, deep networks have a special strength that sets them apart from traditional machine learning models: their flexibility. We can modify them in many ways to suit our tasks. In the following, I will discuss the three most common modifications:
Unsupervised learning and data compression via autoencoders, which require modifications in the loss function,
Translational invariance via convolutional neural networks, which require modifications in the network architecture,
Variable-sized sequence prediction via recurrent neural networks, which require modifications in the network architecture.
The flexibility of neural networks is a very powerful property. In many cases, these changes lead to great improvements in accuracy compared to the basic models that we discussed in the previous tutorial.
In the last part of the tutorial, I will also explain how to parallelize the training of neural networks. This is an important topic because parallelization has played an important role in the current deep learning movement.
2 Autoencoders
One of the first important results in Deep Learning since the early 2000s was the use of Deep Belief Networks [15] to pretrain deep networks. This approach is based on the observation that random initialization is a bad idea, and that pretraining each layer with an unsupervised learning algorithm can allow for better initial weights. Examples of such unsupervised algorithms are Deep Belief Networks, which are based on Restricted Boltzmann Machines, and Deep Autoencoders, which are based on Autoencoders. Although the first breakthrough result is related to Deep Belief Networks, similar gains were later obtained with Autoencoders [4]. In the following section, I will only describe the Autoencoder algorithm because it is simpler to understand.