BIOS14:Statistic Exercise 4

最新推荐文章于 2021-09-10 23:13:43 发布

Chix1996

最新推荐文章于 2021-09-10 23:13:43 发布

阅读量233

点赞数

分类专栏： R 生信

本文链接：https://blog.csdn.net/qq_43063824/article/details/104272859

版权

生信同时被 2 个专栏收录

14 篇文章 0 订阅

订阅专栏

8 篇文章 0 订阅

订阅专栏

a) Is there any difference between the sexes regarding the propensity to jump off cliffs and flap the ears? (ignore color for the moment) (2p)

I will the hypothesis that elephants of two sexes are equally likely to jump off cliffs and flap the ears with Frequency tests.

Firstly, data import and exploration

# data import and exploration
elejump = read.csv('elejump.csv')
str(elejump)
# reshape the data
f_yes = sum(elejump$sex=='female' & elejump$jumpflap == 1)
f_no = sum(elejump$sex=='female' & elejump$jumpflap == 0)
m_yes = sum(elejump$sex=='male' & elejump$jumpflap == 1)
m_no = sum(elejump$sex=='male' & elejump$jumpflap == 0)
g_yes = sum(elejump$colour=='grey' & elejump$jumpflap == 1)
g_no = sum(elejump$colour=='grey' & elejump$jumpflap == 0)
p_yes = sum(elejump$colour=='pink' & elejump$jumpflap == 1)
p_no = sum(elejump$colour=='pink' & elejump$jumpflap == 0)
f_grey = sum(elejump$sex == 'female' & elejump$colour == 'grey')
f_pink = sum(elejump$sex == 'female' & elejump$colour == 'pink')
m_grey = sum(elejump$sex == 'male' & elejump$colour == 'grey')
m_pink = sum(elejump$sex == 'male' & elejump$colour == 'pink')

sex_jump = data.frame(female = c(f_yes, f_no), male = c(m_yes, m_no), row.names = c('Yes', 'No'))
color_jump = data.frame(grey = c(g_yes, g_no), pink = c(p_yes, p_no), row.names = c('Yes', 'No'))
sex_col = data.frame(female = c(f_grey, f_pink), male = c(m_grey, m_pink), row.names = c('grey', 'pink'))

barplot(as.matrix(sex_jump), beside = T, legend = T, ylim = c(0, 25))

在这里插入图片描述

Check assumptions of frequency tests.

The first two assumptions are “input data must be counts” and “Categories must be mutually exclusive” which are obviously fulfilled.

So I need to test “For chi-square tests and G-tests the expected counts per category must be > 5”.

#Check assumptions
fdata = sex_jump
fexp = matrix(NA, nrow(fdata), ncol(fdata))
for(i in 1:nrow(fdata)) {
  for(j in 1:ncol(fdata)) {
    fexp[i,j] = (((sum(fdata[i,]))/sum(fdata))*((sum(fdata[,j]))/sum(fdata))*(sum(fdata)))
  }
}
> fexp
     [,1] [,2]
[1,]   10   15
[2,]   10   15

#Chi-squared test
> chisq.test(sex_jump)

	Pearson's Chi-squared test with Yates' continuity correction

data:  sex_jump
X-squared = 4.0833, df = 1, p-value = 0.04331

There seems to be a significant association between sex and tendency to jump off cliffs and flap the ears.

#Fisher's exact test
> fisher.test(sex_jump)

	Fisher's Exact Test for Count Data

data:  sex_jump
p-value = 0.04208
alternative hypothesis: true odds ratio is not equal to 1
95 percent confidence interval:
 0.0608841 0.9595294
sample estimates:
odds ratio 
 0.2556412

The likely to jump off cliffs and flap the ears is about 4 times for male elephants compared to female elephants.

#G-test
> GTest(sex_jump)

	Log likelihood ratio (G-test) test of independence without
	correction

data:  sex_jump
G = 5.4507, X-squared df = 1, p-value = 0.01956

Also gives a significant output.

#Illustrate results
barplot(as.matrix(sex_jump), xlab="Elephants sexes", names=c("Female","Male"), ylab="Counts", beside = T, legend = c('Jump', 'Not jump'), ylim = c(0, 25))

在这里插入图片描述

Conclusion, Male elephant is about 4 times likely than female elephants to jump off cliffs and flap the ears.

b) Ignoring sex, does color seem to influence the jumping behavior? (2p)

Same as part a), also use frequency tests to test that there is no difference between different elephants color to ump off cliffs and flap the ears.

# plot data
barplot(as.matrix(color_jump), beside = T, legend = T, ylim = c(0, 25))

在这里插入图片描述

#Check assumptions
fdata = color_jump
fexp = matrix(NA, nrow(fdata), ncol(fdata))
for(i in 1:nrow(fdata)) {
  for(j in 1:ncol(fdata)) {
    fexp[i,j] = (((sum(fdata[i,]))/sum(fdata))*((sum(fdata[,j]))/sum(fdata))*(sum(fdata)))
  }
}
> fexp
     [,1] [,2]
[1,]   12   13
[2,]   12   13

#Chi-squared test
> chisq.test(color_jump)

	Pearson's Chi-squared test with Yates' continuity correction

data:  color_jump
X-squared = 28.926, df = 1, p-value = 7.519e-08

#Fisher's exact test
> fisher.test(color_jump)

	Fisher's Exact Test for Count Data

data:  color_jump
p-value = 1.148e-08
alternative hypothesis: true odds ratio is not equal to 1
95 percent confidence interval:
 0.001097566 0.095029572
sample estimates:
odds ratio 
0.01417599 

# G-test
> GTest(color_jump)

	Log likelihood ratio (G-test) test of independence without
	correction

data:  color_jump
G = 36.95, X-squared df = 1, p-value = 1.212e-09

G-test and T-test both give the significant outputs, and from Fisher’s exact test, we get the conclusion that pink elephants is about 70 times likely than grey elephants to do the jumping behavior.

#Illustrate results
barplot(as.matrix(color_jump), xlab="Elephants' skin color", names=c("Grey","Pink"), ylab="Counts", beside = T, legend = c('Jump', 'Not jump'), ylim = c(0, 30))

在这里插入图片描述

c) Is there any difference in color between the sexes? (2p)

Assume there is no difference in color between sexes. Test it with frequency test.

# lot data
barplot(as.matrix(sex_col), beside = T, legend = T, ylim = c(0, 25))

在这里插入图片描述

#Check assumptions
> fdata = sex_col
> fexp = matrix(NA, nrow(fdata), ncol(fdata))
> for(i in 1:nrow(fdata)) {
+   for(j in 1:ncol(fdata)) {
+     fexp[i,j] = (((sum(fdata[i,]))/sum(fdata))*((sum(fdata[,j]))/sum(fdata))*(sum(fdata)))
+   }
+ }
> fexp
     [,1] [,2]
[1,]  9.6 14.4
[2,] 10.4 15.6
#Chi-squared test
> chisq.test(sex_col)

	Pearson's Chi-squared test with Yates' continuity correction

data:  sex_col
X-squared = 5.0781, df = 1, p-value = 0.02423

#Fisher's exact test
> fisher.test(sex_col)

	Fisher's Exact Test for Count Data

data:  sex_col
p-value = 0.01997
alternative hypothesis: true odds ratio is not equal to 1
95 percent confidence interval:
  1.191479 19.192502
sample estimates:
odds ratio 
  4.512379 

#G-test
> GTest(sex_col)

	Log likelihood ratio (G-test) test of independence without
	correction

data:  sex_col
G = 6.6093, X-squared df = 1, p-value = 0.01014

All tests return significant results, the skin color is relayed with sexes.

#Illustrate results
barplot(as.matrix(sex_col), xlab="Elephants' sexes", names=c("Female","male"), ylab="Counts", beside = T, legend = c('Grey', 'Pink'), ylim = c(0, 30), col =c('Grey', 'Pink'))

在这里插入图片描述

d) What is the result if both sex and color are taken into account when predicting jumping behavior? (4p)

# plot data
f_g_yes = sum(elejump$sex == 'female' & elejump$colour == 'grey' & elejump$jumpflap == 1)
f_g_no = sum(elejump$sex == 'female' & elejump$colour == 'grey' & elejump$jumpflap == 0)
f_p_yes = sum(elejump$sex == 'female' & elejump$colour == 'pink' & elejump$jumpflap == 1)
f_p_no = sum(elejump$sex == 'female' & elejump$colour == 'pink' & elejump$jumpflap == 0)
m_g_yes = sum(elejump$sex == 'male' & elejump$colour == 'grey' & elejump$jumpflap == 1)
m_g_no = sum(elejump$sex == 'male' & elejump$colour == 'grey' & elejump$jumpflap == 0)
m_p_yes = sum(elejump$sex == 'male' & elejump$colour == 'pink' & elejump$jumpflap == 1)
m_p_no = sum(elejump$sex == 'male' & elejump$colour == 'pink' & elejump$jumpflap == 0)

Jump = data.frame(
    Sex = c('Female','Female','Female','Female','Male','Male','Male','Male'), 
    Color = c('Grey','Grey','Pink','Pink','Grey','Grey','Pink','Pink'), 
    Jump = c('Yes','No','Yes','No','Yes','No','Yes','No'), 
    Counts = c(f_g_yes,f_g_no,f_p_yes,f_p_no,m_g_yes,m_g_no,m_p_yes,m_p_no))
ggplot(Jump, aes(x=Jump, y=Counts, fill=Color)) +
  geom_bar(stat="identity", position="dodge") +
  facet_wrap(~Sex) +
  theme_bw()

在这里插入图片描述

> # Construct full model
> full.model = glm(jumpflap~sex+colour, data = elejump, family=binomial)
> # Model selection
> output = dredge(full.model, rank=AIC)
Fixed term is "(Intercept)"
> output
Global model call: glm(formula = jumpflap ~ sex + colour, family = binomial, data = elejump)
---
Model selection table 
  (Intrc) color sex df  logLik  AIC delta weight
2 -2.3980     +      2 -16.182 36.4  0.00  0.707
4 -2.6260     +   +  3 -16.064 38.1  1.76  0.293
3 -0.8473         +  2 -31.932 67.9 31.50  0.000
1  0.0000            1 -34.657 71.3 34.95  0.000
Models ranked by AIC(x)

#Check assumptions
#Independent data
dwtest(fit)
#Residuals do not deviate from predicted distribution
par(mfrow=c(2,2))
plot(fit)
par(mfrow=c(1,1))

在这里插入图片描述

simulationOutput = simulateResiduals(fittedModel=fit, n=250)
plot(simulationOutput)
testDispersion(simulationOutput)

在这里插入图片描述

> #Get results of binomial GLM
> Anova(fit, type=3)
Analysis of Deviance Table (Type III tests)

Response: jumpflap
       LR Chisq Df Pr(>Chisq)    
colour    36.95  1  1.212e-09 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> summary(fit)

Call:
glm(formula = jumpflap ~ colour, family = binomial, data = elejump)

Deviance Residuals: 
     Min        1Q    Median        3Q       Max  
-2.07821  -0.41716   0.03901   0.49518   2.22931  

Coefficients:
            Estimate Std. Error z value Pr(>|z|)    
(Intercept)  -2.3979     0.7385  -3.247  0.00117 ** 
colourpink    4.4348     0.9603   4.618 3.88e-06 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 69.315  on 49  degrees of freedom
Residual deviance: 32.365  on 48  degrees of freedom
AIC: 36.365

Number of Fisher Scoring iterations: 5

From the outcomes, we can see that the jump behavior is only related to color when both sex and color are taken into consideration. According to part c), sex is highly related to color, so when both of them is considered, only color is related to jump behavior. And the conclusion is pink elephant is more likely to jump off cliffs and flap the ears.

#Illustrate results
elejump$pred = predict(fit, elejump, type="response")
elejump$colour = factor(elejump$colour, levels(elejump$colour)[c(1,2)])
mean.ele = aggregate(list(exc=elejump$pred), by=list(colour=elejump$colour), mean, na.rm=TRUE)
barplot(exc~colour, data=mean.ele, names=c("Grey","Pink"), xlab="Sex", ylab="Probability of jump behaviour", ylim=c(0, 1), col="red")

在这里插入图片描述

Chix1996

关注

0
点赞
踩
0

收藏

觉得还不错? 一键收藏
0
评论
BIOS14:Statistic Exercise 4

a) Is there any difference between the sexes regarding the propensity to jump off cliffs and flap the ears? (ignore color for the moment) (2p)I will the hypothesis that elephants of two sexes are equ...
复制链接

扫一扫