Semi-Supervised Learning
In the paradigm of semi-supervised learning, both unlabeled examples from P(x) and labeled examples from P(x, y) are used to estimate P(y | x) or predict y from x.
- In the context of deep learning, semi-supervised learning usually refers to learning a representation h = f(x). The goal is to learn a representation so that examples from the same class have similar representations.
- Unsupervised learning can provide useful cues for how to group examples in representation space. Examples that cluster tightly in the input space should be mapped to similar representations.
- A long-standing variant of this approach is the application of principal components analysis as a preprocessing step before applying a classifier to the projected data.
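A minimal sketch of this variant in Python with scikit-learn (the dataset, the number of labeled examples, and the number of components are all illustrative): the unsupervised PCA step is fit on every input, labeled or not, while the supervised classifier sees only the projections of the labeled examples.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA
from sklearn.linear_model import LogisticRegression

X, y = load_digits(return_X_y=True)

# Pretend only a small subset of the examples is labeled.
rng = np.random.default_rng(0)
labeled = rng.choice(len(X), size=100, replace=False)

# The unsupervised step uses all inputs, since it never looks at y...
pca = PCA(n_components=32).fit(X)

# ...while the classifier is trained only on the projections
# of the labeled examples.
clf = LogisticRegression(max_iter=1000).fit(pca.transform(X[labeled]), y[labeled])
print(clf.score(pca.transform(X), y))
```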
Instead of having separate unsupervised and supervised components in the model, one can construct models in which a generative model of either P(x) or P(x, y) shares parameters with a discriminative model of P(y | x).
One can then trade off the supervised criterion − log P(y | x) against the unsupervised or generative one (such as − log P(x) or − log P(x, y)).
The generative criterion then expresses a particular form of prior belief about the solution to the supervised learning problem, namely that the structure of P(x) is connected to the structure of P(y | x) in a way that is captured by the shared parametrization.
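A minimal sketch of such a shared parametrization, assuming PyTorch: an encoder is shared between a classifier head and a reconstruction head, cross-entropy implements − log P(y | x), and squared reconstruction error stands in for − log P(x) (exact for a fixed-variance Gaussian decoder). The weight `lam`, the architecture, and all sizes are illustrative.

```python
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Linear(784, 128), nn.ReLU())  # shared h = f(x)
classifier = nn.Linear(128, 10)    # discriminative head for P(y | x)
decoder = nn.Linear(128, 784)      # generative head standing in for P(x)

params = (list(encoder.parameters()) + list(classifier.parameters())
          + list(decoder.parameters()))
opt = torch.optim.Adam(params, lr=1e-3)
lam = 0.5  # trades off the generative criterion against the supervised one

def step(x_labeled, y, x_unlabeled):
    # Supervised criterion − log P(y | x), computed on labeled data only.
    sup = nn.functional.cross_entropy(classifier(encoder(x_labeled)), y)
    # Generative criterion, computed on labeled and unlabeled data alike;
    # squared reconstruction error is a proxy for − log P(x).
    x_all = torch.cat([x_labeled, x_unlabeled])
    unsup = nn.functional.mse_loss(decoder(encoder(x_all)), x_all)
    loss = sup + lam * unsup
    opt.zero_grad(); loss.backward(); opt.step()
    return loss.item()

# Smoke test on random data of the assumed shapes.
x_l = torch.randn(32, 784); y = torch.randint(0, 10, (32,))
x_u = torch.randn(128, 784)
print(step(x_l, y, x_u))
```

Because the encoder appears in both terms, gradients from the unlabeled data shape the same parameters that the classifier depends on, which is how the shared parametrization encodes the prior belief described above.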