Pointwise Mutual Information

weixin_34112181

Pointwise mutual information (PMI), or point mutual information, is a measure of association used in information theory and statistics.

The PMI of a pair of outcomes x and y belonging to discrete random variables X and Y quantifies the discrepancy between the probability of their coincidence given their joint distribution and their individual distributions, assuming independence.

Source: <http://en.wikipedia.org/wiki/Pointwise_mutual_information>
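Written out, the standard definition matching the description above is as follows. The symbols p(x, y), p(x) and p(y) denote the joint and marginal probabilities; the base of the logarithm is a free choice (base 2 gives bits).

```latex
\operatorname{pmi}(x; y)
  \equiv \log \frac{p(x, y)}{p(x)\,p(y)}
  = \log \frac{p(x \mid y)}{p(x)}
  = \log \frac{p(y \mid x)}{p(y)}
```

Under independence p(x, y) = p(x) p(y), so the ratio is 1 and the PMI is 0.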

The mutual information (MI) of the random variables X and Y is the expected value of the PMI over all possible outcomes (with respect to the joint distribution p(x, y)).

Source: <http://en.wikipedia.org/wiki/Pointwise_mutual_information>

http://www.eecis.udel.edu/~trnka/CISC889-11S/lectures/philip-pmi.pdf

Information-theoretic approach to finding collocations

– Measures how much one word tells us about the other, i.e. how much information we gain
– Can be negative or positive

Problems with PMI

• Bad with sparse data
– Suppose some words occur only once, but appear together
– They get a very high PMI score
– Consider our word clouds: a high PMI score does not necessarily indicate that a bigram is important

Source: <http://en.wikipedia.org/wiki/Pointwise_mutual_information>
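The sparse-data problem can be reproduced on a toy corpus. The sketch below is only an illustration: the corpus is made up, the helper name `bigram_pmi` is hypothetical, and probabilities are plain maximum-likelihood estimates from counts with no smoothing.

```python
import math
from collections import Counter


def bigram_pmi(tokens):
    """Return {(w1, w2): PMI in bits} for every adjacent word pair.

    Probabilities are relative frequencies (an assumption of this sketch).
    """
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))
    n_uni = len(tokens)       # number of word positions
    n_bi = len(tokens) - 1    # number of adjacent pairs
    scores = {}
    for (w1, w2), count in bigrams.items():
        p_xy = count / n_bi
        p_x = unigrams[w1] / n_uni
        p_y = unigrams[w2] / n_uni
        scores[(w1, w2)] = math.log2(p_xy / (p_x * p_y))
    return scores


# Made-up corpus: "of the" is frequent, "zebra crossing" occurs exactly once.
tokens = ("of the cat of the dog of the bird of the fish "
          "zebra crossing of the end").split()
scores = bigram_pmi(tokens)

# Print bigrams from highest to lowest PMI.
for pair, score in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(pair, round(score, 3))
```

On this corpus the one-off pair "zebra crossing" outranks the frequent pair "of the", and so does the accidental pair "fish zebra", which is the behaviour the slide warns about.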

Pointwise mutual information is derived from mutual information.

Source: <http://en.wikipedia.org/wiki/Pointwise_mutual_information>

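Finally, pmi(x; y) will increase if p(x|y) is fixed but p(x) decreases.

This is a weakness of PMI. If x and y are tightly associated and always occur together, p(x|y) = 1, so the score depends only on p(x): the rarer x is, the higher the score. Likewise, if p(y|x) = 1 (perfect co-occurrence), the score is determined entirely by how frequent the variables are, and a pair that occurs only once scores highest. PMI therefore favors rare, low-frequency events.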

• Bad with word dependence
– Suppose two words are perfectly dependent on each other
– Whenever one occurs, the other occurs
– I(x, y) = log (1 / P(y))
– So the rarer the word is, the higher the PMI is
– A high PMI score doesn't mean high word dependence (it could just mean rarer words)
– Possible fix: threshold on word frequencies

Source: <http://en.wikipedia.org/wiki/Pointwise_mutual_information>
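The perfect-dependence case can be checked numerically. In this minimal sketch the function name `pmi_perfect_dependence` is hypothetical, and the setup assumes p(x, y) = p(x) = p(y), i.e. the two words always occur together.

```python
import math


def pmi_perfect_dependence(p_word):
    """PMI in bits when two words always co-occur.

    Assumes p(x, y) = p(x) = p(y) = p_word, so the ratio
    p(x, y) / (p(x) p(y)) reduces to 1 / p_word.
    """
    p_xy = p_word
    return math.log2(p_xy / (p_word * p_word))


# The dependence is equally perfect in all three cases,
# yet the score grows as the words become rarer.
for p in (0.1, 0.01, 0.001):
    print(p, pmi_perfect_dependence(p), math.log2(1 / p))
```

The two printed columns agree, matching I(x, y) = log (1 / P(y)): the score reflects rarity, not a stronger dependence.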

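PMI can be viewed as mutual information evaluated at a single point, that is, for one pair of outcomes.

Consider mutual information:

I(X; Y) = Σ_x Σ_y p(x, y) log( p(x, y) / (p(x) p(y)) )

Source: <http://en.wikipedia.org/wiki/Mutual_information>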

PMI can take positive or negative values, but is zero if X and Y are independent. It is maximized when X and Y are perfectly associated, yielding the following bounds:

−∞ ≤ pmi(x; y) ≤ min[−log p(x), −log p(y)]

Source: <http://en.wikipedia.org/wiki/Pointwise_mutual_information>

Example

x	y	p(x, y)
0	0	0.1
0	1	0.7
1	0	0.15
1	1	0.05

Using this table we can marginalize to get the following additional table for the individual distributions:

	p(x)	p(y)
0	0.8	0.25
1	0.2	0.75

With this example, we can compute four values for pmi(x; y). Using base-2 logarithms:

pmi(x=0;y=0)	−1
pmi(x=0;y=1)	0.222392421
pmi(x=1;y=0)	1.584962501
pmi(x=1;y=1)	−1.584962501

(For reference, the mutual information I(X; Y) would then be 0.214170945.)

Source: <http://en.wikipedia.org/wiki/Pointwise_mutual_information>
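The numbers in this example can be reproduced with a few lines of Python. This is a sketch under the assumption that the joint table is stored as a dictionary; the variable names are illustrative.

```python
import math

# Joint distribution p(x, y) from the example table.
joint = {(0, 0): 0.1, (0, 1): 0.7, (1, 0): 0.15, (1, 1): 0.05}

# Marginalize to obtain p(x) and p(y).
p_x = {}
p_y = {}
for (x, y), p in joint.items():
    p_x[x] = p_x.get(x, 0.0) + p
    p_y[y] = p_y.get(y, 0.0) + p

# PMI in bits for each of the four outcome pairs.
pmi = {(x, y): math.log2(p / (p_x[x] * p_y[y]))
       for (x, y), p in joint.items()}

# Mutual information is the expectation of PMI under the joint distribution.
mi = sum(p * pmi[xy] for xy, p in joint.items())

for xy in sorted(pmi):
    print(xy, round(pmi[xy], 9))
print("MI:", round(mi, 9))
```

The last step is the relationship stated earlier: MI is the PMI averaged over all outcome pairs, weighted by p(x, y).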

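Similarities to mutual information

PMI satisfies many of the same relationships as mutual information. In particular:

pmi(x; y) = h(x) + h(y) − h(x, y) = h(x) − h(x|y) = h(y) − h(y|x)

where h(x) is the self-information, −log2 p(x).

Source: <http://en.wikipedia.org/wiki/Pointwise_mutual_information>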

Normalized PMI (NPMI)

Pointwise mutual information can be normalized to the range [-1, +1], giving -1 (in the limit) for outcomes that never occur together, 0 for independence, and +1 for complete co-occurrence.
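A minimal sketch of the normalization, assuming the usual definition npmi(x; y) = pmi(x; y) / (−log p(x, y)), i.e. PMI divided by the joint self-information. The function name `npmi` and the handling of the edge cases are choices made for this illustration.

```python
import math


def npmi(p_xy, p_x, p_y):
    """Normalized PMI, assuming npmi = pmi / (-log p(x, y)).

    p_xy == 0 returns the limiting value -1. p_xy == 1 is rejected
    because numerator and denominator are both zero there.
    """
    if not 0.0 <= p_xy < 1.0:
        raise ValueError("p_xy must satisfy 0 <= p_xy < 1")
    if p_xy == 0.0:
        return -1.0
    pmi = math.log(p_xy / (p_x * p_y))
    return pmi / -math.log(p_xy)


print(npmi(0.0, 0.5, 0.5))    # never occur together
print(npmi(0.25, 0.5, 0.5))   # independent: p(x, y) = p(x) p(y)
print(npmi(0.2, 0.2, 0.2))    # complete co-occurrence: p(x, y) = p(x) = p(y)
```

Because the logarithm base cancels in the ratio, the result is the same for natural and base-2 logarithms.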