http://en.wikipedia.org/wiki/Fisher's_exact_test
Hypergeometric distribution:
 | drawn | not drawn | total |
white | k | m-k | m |
black | n-k | N+k-m-n | N-m |
totals | n | N-n | N |
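Among N items, m are white. Event A = {draw n of the N items without replacement and get exactly k white ones}. The probability of A:
Denominator: the number of ways to choose n items out of N, choose(N,n).
Numerator: the number of ways for the n drawn items to contain k white and n-k black, choose(m,k)*choose(N-m,n-k).
P(X=k) = choose(m,k)*choose(N-m,n-k)/choose(N,n)
P(X=k1)+P(X=k2)+...=1, where k1, k2, ... run over all possible values of k.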
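The range of "all possible k" is easy to get wrong, so here is a minimal Python sketch of the probability and its support. The function names `hypergeom_pmf` and `support` and the values N=10, m=7, n=5 are made up for illustration; they are not from these notes. Exact fractions are used so the sum can be checked without rounding error.

```python
from fractions import Fraction
from math import comb


def support(N, m, n):
    """All k with non-zero probability: at least n-(N-m) white, at most min(n, m)."""
    return range(max(0, n - (N - m)), min(n, m) + 1)


def hypergeom_pmf(k, N, m, n):
    """P(X = k): exactly k white among n draws without replacement
    from N items, m of which are white."""
    if k not in support(N, m, n):
        return Fraction(0)
    return Fraction(comb(m, k) * comb(N - m, n - k), comb(N, n))


# Illustrative values (assumed): only 3 black items exist, so a draw of 5
# must contain at least 2 white ones and k cannot be 0 or 1.
N, m, n = 10, 7, 5
probs = {k: hypergeom_pmf(k, N, m, n) for k in support(N, m, n)}
total = sum(probs.values())  # exact sum over the support

for k, p in probs.items():
    print(k, p)
print("total", total)
```

With these values the support is k = 2..5 rather than 0..5, which is why the sum has to run over the possible k values only.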
Fisher Exact Test:
Among n people, a+b are dieting. Event A = {draw a+c of the n people, and exactly a of them are dieting}. The probability of A:
 | men | women | total |
dieting | a | b | a + b |
not dieting | c | d | c + d |
totals | a + c | b + d | n |
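My own understanding: treat n as the population size and a+c as the sample size. Drawing a sample of that size from the population can be done in choose(n,a+c) ways.
The sample contains a dieting people, which can happen in choose(a+b,a) ways, and c non-dieting people, which can happen in choose(c+d,c) ways.
Denominator: the number of ways to choose a+c out of n, choose(n,a+c).
Numerator: the number of ways for the a+c drawn to contain a dieting and (a+c)-a = c not dieting, choose(a+b,a)*choose(c+d,c).
p = choose(a+b,a)*choose(c+d,c)/choose(n,a+c)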
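The numerator and denominator counts above can be turned into a short Python sketch. The function name `fisher_table_probability` is hypothetical, and the counts a=1, b=9, c=11, d=3 are illustrative values chosen here, not data from these notes. The sketch computes only the probability of one observed table with its margins fixed; it does not sum over more extreme tables to form a p-value.

```python
from math import comb


def fisher_table_probability(a, b, c, d):
    """Probability of the 2x2 table [[a, b], [c, d]] given its margins:
    choose(a+b, a) * choose(c+d, c) / choose(n, a+c), with n = a+b+c+d."""
    n = a + b + c + d
    return comb(a + b, a) * comb(c + d, c) / comb(n, a + c)


# Illustrative counts (assumed): 1 dieting man, 9 dieting women,
# 11 non-dieting men, 3 non-dieting women.
p = fisher_table_probability(1, 9, 11, 3)
print(f"{p:.6f}")
```

Swapping the roles of rows and columns (sampling a+b people and counting men among them) gives the same probability, since both forms reduce to (a+b)!(c+d)!(a+c)!(b+d)! / (a! b! c! d! n!).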
2.
Exact Tests
The Hypergeometric Distribution
To understand Fisher's exact test, it helps to first review the hypergeometric distribution. Here is a typical example. A box of chocolates contains 20 pieces (N=20). Eight of them are known to be caramels (M=8), and the remaining 12 pieces are nuts (N-M=12). If a person selects 4 pieces at random (sample size n=4), what is the distribution of the number of caramels in the sample?
n=4: sample size
M=8: total number of caramels
N=20: total number of chocolates
The number of caramels in the sample of 4 can range from 0 to 4, and the probabilities of getting 0, 1, 2, 3, or 4 caramels are:
No. in sample | Probability | Calculation |
0 | 0.1022 | choose(8,0)*choose(12,4)/choose(20,4) |
1 | 0.3633 | choose(8,1)*choose(12,3)/choose(20,4) |
2 | 0.3814 | choose(8,2)*choose(12,2)/choose(20,4) |
3 | 0.1387 | choose(8,3)*choose(12,1)/choose(20,4) |
4 | 0.0144 | choose(8,4)*choose(12,0)/choose(20,4) |
Total | 1.0000 | |
The probabilities must sum to 1, as the total shows.
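The chocolate example can be reproduced with a few lines of Python, as a quick check on the arithmetic. `math.comb` plays the role of choose() here; the variable name `rows` is just a convenience for this sketch.

```python
from math import comb

# Values from the example: 20 chocolates, 8 caramels, sample of 4.
N, M, n = 20, 8, 4

# P(k caramels in the sample) for k = 0..4
rows = [(k, comb(M, k) * comb(N - M, n - k) / comb(N, n)) for k in range(n + 1)]

for k, p in rows:
    print(f"{k} | {p:.4f}")
print(f"Total | {sum(p for _, p in rows):.4f}")
```

The denominator choose(20,4) is 4845, and the five numerators 495, 1760, 1848, 672, and 70 add up to exactly that, which is why the probabilities sum to 1.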