1. Why introduce the HCRF?
It is well known that models which include latent, or hidden-state, structure may be more expressive than fully observable models, and can often find relevant substructure in a given domain.
However, fully observable models are limited in that they cannot capture intermediate structure with hidden-state variables.
In contrast, an HCRF directly models the distribution P(c, h | x), where c is a category label and h is a set of intermediate hidden variables modeled as a Markov random field globally conditioned on the observation x. The model parameters θ are trained discriminatively to maximize P(c | x).
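The construction above can be sketched as follows, in the standard conditional-exponential-family form (θ for the parameters and Φ for a joint feature map are notational assumptions, not fixed by the text):

```latex
P(c, h \mid x; \theta) = \frac{\exp\big(\theta \cdot \Phi(c, h, x)\big)}{\sum_{c', h'} \exp\big(\theta \cdot \Phi(c', h', x)\big)},
\qquad
P(c \mid x; \theta) = \sum_{h} P(c, h \mid x; \theta).
```

Marginalizing the hidden variables h out of the joint yields the class posterior P(c | x; θ) that the discriminative training objective maximizes.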
The main limitation of latent generative approaches is that they require a model of the local features given the underlying hidden variables, and they generally assume that the observations are independent.
Again, a significant difference between their approach and ours is that we do not perform a pre-selection of discriminative parts, but rather incorporate such a step during training.
In our approach category labels are observed, but an additional layer of subordinate labels is learned. These intermediate hidden variables model the latent structure of the input domain; our model
defines the joint probability of a class label and hidden state labels conditioned on the observations, with
dependencies between the hidden variables expressed by an undirected graph.
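As a concrete illustration of this joint distribution, the toy sketch below brute-forces a tiny chain-structured HCRF: it enumerates every hidden-state assignment, scores each (class, hidden-chain) pair with node, edge, and class-compatibility potentials, and recovers P(c | x) by marginalizing out the chain. All dimensions and weight values here are hypothetical, chosen only to make the example run; a real HCRF would learn the parameters and use dynamic programming rather than enumeration.

```python
import itertools
import math

# Toy chain HCRF (illustrative only): T hidden variables h_t over H states,
# a class label c over C classes, conditioned on 1-D observations x.
T, H, C = 3, 2, 2
x = [0.5, -1.0, 2.0]

# Hypothetical parameters: observation weights w[h], transition weights
# t[h][h'], and class-hidden compatibility weights v[c][h].
w = [0.3, -0.2]
t = [[0.1, -0.4], [-0.4, 0.1]]
v = [[1.0, -1.0], [-1.0, 1.0]]

def score(c, h):
    """Linear score theta . Phi(c, h, x) for one full assignment."""
    s = sum(w[h[i]] * x[i] for i in range(T))        # node (observation) terms
    s += sum(t[h[i]][h[i + 1]] for i in range(T - 1))  # chain edge terms
    s += sum(v[c][h[i]] for i in range(T))           # class-hidden terms
    return s

# Partition function: sum of exp-scores over all (c, h) assignments.
Z = sum(math.exp(score(c, h))
        for c in range(C)
        for h in itertools.product(range(H), repeat=T))

def p_class(c):
    """P(c | x), obtained by marginalizing out the hidden chain h."""
    return sum(math.exp(score(c, h))
               for h in itertools.product(range(H), repeat=T)) / Z

probs = [p_class(c) for c in range(C)]
print(probs)
```

Because the hidden variables form a chain, the enumeration over H**T assignments could be replaced by belief propagation in O(T * H^2) time; exhaustive summation is used here only to keep the marginalization explicit.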