Simple, fast implementation of Fisher’s exact test. For example, for the following table:
| | Having the property | Not having the property |
|---|---|---|
| Selected | 12 | 5 |
| Not selected | 29 | 2 |
Suppose we want to know whether the proportion of items having the property differs between the selected and non-selected groups. Fisher’s exact test answers this question.
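To make the test concrete, here is a minimal standard-library sketch that evaluates the table above directly from the hypergeometric distribution. The helper names `hypergeom_pmf` and `fisher_tails` are illustrative assumptions, not part of any package, and the two-tailed rule used here (sum every table no more likely than the observed one) is one common convention.

```python
from math import comb

def hypergeom_pmf(a, row1, col1, total):
    """Probability of a 2x2 table whose top-left cell is a, with margins fixed."""
    return comb(col1, a) * comb(total - col1, row1 - a) / comb(total, row1)

def fisher_tails(a, b, c, d):
    """Left-, right-, and two-tailed p-values for the table [[a, b], [c, d]]."""
    row1, col1, total = a + b, a + c, a + b + c + d
    lo = max(0, row1 + col1 - total)   # smallest feasible top-left cell
    hi = min(row1, col1)               # largest feasible top-left cell
    probs = {k: hypergeom_pmf(k, row1, col1, total) for k in range(lo, hi + 1)}
    observed = probs[a]
    left = sum(p for k, p in probs.items() if k <= a)
    right = sum(p for k, p in probs.items() if k >= a)
    # Two-tailed: every table at most as likely as the observed one
    # (small relative tolerance guards against floating-point ties).
    two = sum(p for p in probs.values() if p <= observed * (1 + 1e-7))
    return left, right, two

left, right, two = fisher_tails(12, 5, 29, 2)
print(f"left={left:.5f} right={right:.5f} two={two:.5f}")
```

For this table the left tail is about 0.045, the right tail about 0.995, and the two-tailed value about 0.080, so the selected group shows, if anything, a lower share of items with the property, which is borderline at the usual 0.05 level only in the one-sided test.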
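# Assumption: pvalue comes from the `fisher` package, whose result exposes .right_tail
from fisher import pvalue

def fish_test(sample_hit, pop_hit, sample_count, root_count):
    # sample_hit: number of genes in the sample that belong to this term
    # pop_hit: number of genes in the whole species that belong to this term
    # sample_count: number of genes in the sample
    # root_count: number of genes of the species under the bp/cc/mf root
    sample_hit = int(sample_hit)
    pop_hit = int(pop_hit)
    sample_count = int(sample_count)
    root_count = int(root_count)
    sample_nhit = sample_count - sample_hit
    pop_nhit = root_count - pop_hit
    # 2x2 table: in-term genes (inside / outside the sample),
    # then out-of-term genes (inside / outside the sample)
    n1, n2, n3, n4 = (sample_hit, pop_hit - sample_hit,
                      sample_nhit, pop_nhit - sample_nhit)
    # Right tail: probability of seeing at least this many hits (enrichment)
    p = abs(pvalue(n1, n2, n3, n4).right_tail)
    return p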
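As an illustration of how the four arguments of `fish_test` turn into a 2x2 table, the sketch below rebuilds the cells with the same arithmetic and computes a right-tail probability using only the standard library. The helpers `enrichment_cells` and `right_tail` are hypothetical names introduced here. The input numbers are taken from the ABC transporters row of the results table (Count 16, Pop Hit 45, List_Total 98, Background Genes 7057), on the assumption that those columns correspond to `sample_hit`, `pop_hit`, `sample_count`, and `root_count`.

```python
from math import comb

def enrichment_cells(sample_hit, pop_hit, sample_count, root_count):
    """Build the four cells with the same arithmetic as fish_test."""
    sample_nhit = sample_count - sample_hit
    pop_nhit = root_count - pop_hit
    return (sample_hit, pop_hit - sample_hit,
            sample_nhit, pop_nhit - sample_nhit)

def right_tail(a, b, c, d):
    """P(top-left cell >= a) under the hypergeometric null with fixed margins."""
    row1, col1, total = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    numerator = sum(comb(col1, k) * comb(total - col1, row1 - k)
                    for k in range(a, upper + 1))
    return numerator / comb(total, row1)

cells = enrichment_cells(16, 45, 98, 7057)
print(cells)               # the four cells always sum to root_count
print(right_tail(*cells))  # a very small value indicates strong enrichment
```

Because the sample contains 16 of the 45 pathway genes while only 98 of 7057 background genes were sampled, the right-tail probability is extremely small.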
| Index | Pathway Name | Pathway ID | Pvalue | Pvalue_adjusted | Genes | Count | Pop Hit | List_Total | Background Genes | Class |
|---|---|---|---|---|---|---|---|---|---|---|
| 1 | ABC transporters | hsa02010 | 2.50e-19 | 4.71e-17 | ABCA6\|1.00 ABCC8\|1.00 ABCG2\|1.00 ABCG8\|1.00 ABCB5\|1.00 ABCB6\|1.00 ABCC9\|1.00 ABCC11\|1.00 ABCA1\|1.00 ABCA7\|1.00 ABCA9\|1.00 ABCA12\|1.00 ABCB8\|1.00 ABCB9\|1.00 ABCG4\|1.00 ABCG5\|1.00 | 16 | 45 | 98 | 7057 | Environmental Information Processing |
| 2 | Fatty acid metabolism | hsa01212 | 3.09e-11 | 2.91e-09 | ACADSB\|1.00 SCD\|1.00 ACOX1\|1.00 ACSL3\|1.00 ACSL4\|1.00 ACSL1\|1.00 ACSL5\|1.00 ACACA\|1.00 ACADL\|1.00 ACADM\|1.00 ACSBG1\|1.00 | 11 | 48 | 98 | 7057 | Metabolism |
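The source does not say how `Pvalue_adjusted` is computed. In the two rows shown, the ratio of adjusted to raw p-value is about 188 for rank 1 and about 94 for rank 2, which is consistent with a Benjamini-Hochberg correction over roughly 188 tested pathways. Under that assumption, a minimal sketch of the adjustment looks like this (the function name `benjamini_hochberg` and the example p-values are illustrative only):

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending p
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical raw p-values, not taken from the results table.
print(benjamini_hochberg([0.01, 0.04, 0.03, 0.5]))
```

Each raw p-value is scaled by the number of tests divided by its rank, then capped by the adjusted value of the next larger p-value.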