An Introduction to Conditional Random Fields

weixin_33712881

Original link: http://www.cnblogs.com/kevinGaoblog/p/3880687.html

1. Structured prediction methods are essentially a combination of classification and graphical modeling.

2. They combine the ability of graphical models to compactly model multivariate data with the ability of classification methods to perform prediction using large sets of input features.

3. In a sequence-labeling task over text, the input x is divided into feature vectors {x_0, x_1, ..., x_T}. Each x_s contains various information about the word at position s, such as its identity, orthographic features such as prefixes and suffixes, membership in domain-specific lexicons, and information in semantic databases such as WordNet.
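
As a rough illustration of note 3, the sketch below builds a sparse feature vector x_s for the word at position s. The function name `word_features`, the feature names, and the tiny `TOY_CITY_LEXICON` are all hypothetical choices made for this example; a real system would use its own lexicons and could add lookups in resources such as WordNet.

```python
# Hypothetical stand-in for a domain-specific lexicon (not from the source text).
TOY_CITY_LEXICON = {"paris", "london", "tokyo"}


def word_features(words, s):
    """Return a sparse feature dict for the word at position s.

    The features overlap on purpose (identity, prefix, suffix, casing,
    lexicon membership), which is the kind of rich input a
    discriminative model can use.
    """
    w = words[s]
    lower = w.lower()
    feats = {
        "identity=" + lower: 1.0,      # word identity
        "prefix3=" + lower[:3]: 1.0,   # orthographic prefix
        "suffix3=" + lower[-3:]: 1.0,  # orthographic suffix
    }
    if w[:1].isupper():
        feats["is_capitalized"] = 1.0
    if lower in TOY_CITY_LEXICON:
        feats["in_city_lexicon"] = 1.0
    return feats


sentence = ["Alice", "visited", "Paris"]
x = [word_features(sentence, s) for s in range(len(sentence))]
```

Each entry of `x` plays the role of one x_s; only features that fire are stored, which keeps the representation compact even when the full feature set is very large.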

4. CRFs are essentially a way of combining the advantages of discriminative classification and graphical modeling, combining the ability to compactly model multivariate outputs y with the ability to leverage a large number of input features x for prediction.

5. The difference between generative models and CRFs is thus exactly analogous to the difference between the naive Bayes and logistic regression classifiers. Indeed, the multinomial logistic regression model can be seen as the simplest kind of CRF, in which there is only one output variable.
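
To make the "simplest kind of CRF" in note 5 concrete, here is a minimal sketch of multinomial logistic regression written in the normalized-exponential form p(y|x) = exp(b_y + w_y · x) / Z(x). The function name `softmax_regression` and the dict-based parameter layout are assumptions made for this example, not something the source prescribes.

```python
import math


def softmax_regression(weights, bias, x):
    """Return p(y | x) for every label y.

    weights: dict label -> list of per-feature weights
    bias:    dict label -> scalar bias
    x:       list of feature values
    """
    scores = {
        y: bias[y] + sum(w * xi for w, xi in zip(weights[y], x))
        for y in weights
    }
    # Subtract the max score before exponentiating for numerical stability.
    m = max(scores.values())
    z = sum(math.exp(s - m) for s in scores.values())  # Z(x), up to the shift
    return {y: math.exp(s - m) / z for y, s in scores.items()}


probs = softmax_regression(
    weights={"A": [1.0, 0.0], "B": [0.0, 1.0]},
    bias={"A": 0.0, "B": 0.0},
    x=[2.0, 0.0],
)
```

The normalizer Z(x) sums over the values of a single output variable here; in a chain CRF the same role is played by a sum over whole label sequences.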

6. The insight of the graphical modeling perspective is that a distribution over very many variables can often be represented as a product of local functions that each depend on a much smaller subset of variables. This factorization turns out to have a close connection to certain conditional independence relationships among the variables, and both types of information are easily summarized by a graph. Indeed, this relationship between factorization, conditional independence, and graph structure comprises much of the power of the graphical modeling framework: the conditional independence viewpoint is most useful for designing models, and the factorization viewpoint is most useful for designing inference algorithms.
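
The claim in note 6 that factorization helps with designing inference algorithms can be sketched on a small chain. Assume a distribution over labels y_0, ..., y_{T-1} proportional to a product of local factors, with one unary factor per position and one pairwise factor per adjacent pair. The factor tables and function names below are made up for illustration. The normalizer can be computed by enumerating every sequence, or by a forward recursion that pushes the sums inside the product; both give the same value, but the second scales linearly in the sequence length.

```python
import itertools


def brute_force_Z(unary, pairwise):
    """Sum the product of local factors over every label sequence (K**T terms)."""
    T, K = len(unary), len(unary[0])
    total = 0.0
    for ys in itertools.product(range(K), repeat=T):
        score = unary[0][ys[0]]
        for t in range(1, T):
            score *= pairwise[ys[t - 1]][ys[t]] * unary[t][ys[t]]
        total += score
    return total


def forward_Z(unary, pairwise):
    """Same normalizer via the forward recursion (O(T * K**2) work)."""
    K = len(unary[0])
    alpha = list(unary[0])
    for t in range(1, len(unary)):
        alpha = [
            unary[t][j] * sum(alpha[i] * pairwise[i][j] for i in range(K))
            for j in range(K)
        ]
    return sum(alpha)


# Toy factor tables: 3 positions, 2 labels (values chosen arbitrarily).
unary = [[1.0, 2.0], [3.0, 1.0], [2.0, 2.0]]
pairwise = [[2.0, 1.0], [1.0, 3.0]]
```

Calling `brute_force_Z(unary, pairwise)` and `forward_Z(unary, pairwise)` should agree up to floating-point error; the recursion is valid only because each factor touches a small subset of the variables.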

7. The principal advantage of discriminative modeling is that it is better suited to including rich, overlapping features.

8. In principle, it may not be clear why these approaches should be so different, because we can always convert between the two using Bayes' rule. For example, in the naive Bayes model, it is easy to convert the joint p(y)p(x|y) into a conditional distribution p(y|x). Indeed, this conditional has the same form as the logistic regression model (2.9). And if we managed to obtain a “true” generative model for the data, that is, a distribution p*(y,x) = p*(y)p*(x|y) from which the data were actually sampled, then we could simply compute the true p*(y|x), which is exactly the target of the discriminative approach. But it is precisely because we never have the true distribution that the two approaches are different in practice. Estimating p(y)p(x|y) first, and then computing the resulting p(y|x) (the generative approach), yields a different estimate than estimating p(y|x) directly. In other words, generative and discriminative models both aim to estimate p(y|x), but they get there in different ways.
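
The conversion described in note 8 can be checked numerically. The sketch below assumes a naive Bayes model with binary (Bernoulli) features, which is one common variant and is chosen here only for illustration; all function and parameter names are hypothetical. Applying Bayes' rule to p(y)p(x|y) gives a posterior, and rewriting the same parameters as weights w_yj = log(theta_yj / (1 - theta_yj)) and biases b_y = log p(y) + sum_j log(1 - theta_yj) gives a logistic regression model with the identical posterior.

```python
import math


def nb_posterior(prior, theta, x):
    """p(y | x) from the joint p(y) * prod_j p(x_j | y), via Bayes' rule."""
    joint = {}
    for y in prior:
        p = prior[y]
        for th, xj in zip(theta[y], x):
            p *= th if xj else (1.0 - th)
        joint[y] = p
    z = sum(joint.values())
    return {y: p / z for y, p in joint.items()}


def nb_as_logistic(prior, theta):
    """Rewrite Bernoulli naive Bayes parameters as logistic-regression parameters."""
    weights = {y: [math.log(th / (1.0 - th)) for th in theta[y]] for y in prior}
    bias = {
        y: math.log(prior[y]) + sum(math.log(1.0 - th) for th in theta[y])
        for y in prior
    }
    return weights, bias


def logistic_posterior(weights, bias, x):
    """p(y | x) = exp(b_y + w_y . x) / Z(x)."""
    scores = {
        y: bias[y] + sum(w * xj for w, xj in zip(weights[y], x)) for y in weights
    }
    z = sum(math.exp(s) for s in scores.values())
    return {y: math.exp(s) / z for y, s in scores.items()}


# Toy parameters (arbitrary values): two classes, three binary features.
prior = {"pos": 0.4, "neg": 0.6}
theta = {"pos": [0.9, 0.2, 0.5], "neg": [0.3, 0.7, 0.5]}
weights, bias = nb_as_logistic(prior, theta)
```

For any binary x, `nb_posterior(prior, theta, x)` and `logistic_posterior(weights, bias, x)` should match up to rounding. The two approaches differ in practice because naive Bayes fits `prior` and `theta` to the joint distribution, while logistic regression fits `weights` and `bias` to the conditional directly, so the fitted parameters generally differ.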

Reposted from: https://www.cnblogs.com/kevinGaoblog/p/3880687.html
