BIOS14:Statistic Exercise 4

8 篇文章 0 订阅

a) Is there any difference between the sexes regarding the propensity to jump off cliffs and flap the ears? (ignore color for the moment) (2p)

I will the hypothesis that elephants of two sexes are equally likely to jump off cliffs and flap the ears with Frequency tests.

Firstly, data import and exploration

# data import and exploration
elejump = read.csv('elejump.csv')
str(elejump)
# reshape the data
f_yes = sum(elejump$sex=='female' & elejump$jumpflap == 1)
f_no = sum(elejump$sex=='female' & elejump$jumpflap == 0)
m_yes = sum(elejump$sex=='male' & elejump$jumpflap == 1)
m_no = sum(elejump$sex=='male' & elejump$jumpflap == 0)
g_yes = sum(elejump$colour=='grey' & elejump$jumpflap == 1)
g_no = sum(elejump$colour=='grey' & elejump$jumpflap == 0)
p_yes = sum(elejump$colour=='pink' & elejump$jumpflap == 1)
p_no = sum(elejump$colour=='pink' & elejump$jumpflap == 0)
f_grey = sum(elejump$sex == 'female' & elejump$colour == 'grey')
f_pink = sum(elejump$sex == 'female' & elejump$colour == 'pink')
m_grey = sum(elejump$sex == 'male' & elejump$colour == 'grey')
m_pink = sum(elejump$sex == 'male' & elejump$colour == 'pink')

sex_jump = data.frame(female = c(f_yes, f_no), male = c(m_yes, m_no), row.names = c('Yes', 'No'))
color_jump = data.frame(grey = c(g_yes, g_no), pink = c(p_yes, p_no), row.names = c('Yes', 'No'))
sex_col = data.frame(female = c(f_grey, f_pink), male = c(m_grey, m_pink), row.names = c('grey', 'pink'))

barplot(as.matrix(sex_jump), beside = T, legend = T, ylim = c(0, 25))

在这里插入图片描述

Check assumptions of frequency tests.

The first two assumptions are “input data must be counts” and “Categories must be mutually exclusive” which are obviously fulfilled.

So I need to test “For chi-square tests and G-tests the expected counts per category must be > 5”.

#Check assumptions
fdata = sex_jump
fexp = matrix(NA, nrow(fdata), ncol(fdata))
for(i in 1:nrow(fdata)) {
  for(j in 1:ncol(fdata)) {
    fexp[i,j] = (((sum(fdata[i,]))/sum(fdata))*((sum(fdata[,j]))/sum(fdata))*(sum(fdata)))
  }
}
> fexp
     [,1] [,2]
[1,]   10   15
[2,]   10   15
#Chi-squared test
> chisq.test(sex_jump)

	Pearson's Chi-squared test with Yates' continuity correction

data:  sex_jump
X-squared = 4.0833, df = 1, p-value = 0.04331

There seems to be a significant association between sex and tendency to jump off cliffs and flap the ears.

#Fisher's exact test
> fisher.test(sex_jump)

	Fisher's Exact Test for Count Data

data:  sex_jump
p-value = 0.04208
alternative hypothesis: true odds ratio is not equal to 1
95 percent confidence interval:
 0.0608841 0.9595294
sample estimates:
odds ratio 
 0.2556412 

The likely to jump off cliffs and flap the ears is about 4 times for male elephants compared to female elephants.

#G-test
> GTest(sex_jump)

	Log likelihood ratio (G-test) test of independence without
	correction

data:  sex_jump
G = 5.4507, X-squared df = 1, p-value = 0.01956

Also gives a significant output.

#Illustrate results
barplot(as.matrix(sex_jump), xlab="Elephants sexes", names=c("Female","Male"), ylab="Counts", beside = T, legend = c('Jump', 'Not jump'), ylim = c(0, 25))

在这里插入图片描述

Conclusion, Male elephant is about 4 times likely than female elephants to jump off cliffs and flap the ears.

b) Ignoring sex, does color seem to influence the jumping behavior? (2p)

Same as part a), also use frequency tests to test that there is no difference between different elephants color to ump off cliffs and flap the ears.

# plot data
barplot(as.matrix(color_jump), beside = T, legend = T, ylim = c(0, 25))

在这里插入图片描述

#Check assumptions
fdata = color_jump
fexp = matrix(NA, nrow(fdata), ncol(fdata))
for(i in 1:nrow(fdata)) {
  for(j in 1:ncol(fdata)) {
    fexp[i,j] = (((sum(fdata[i,]))/sum(fdata))*((sum(fdata[,j]))/sum(fdata))*(sum(fdata)))
  }
}
> fexp
     [,1] [,2]
[1,]   12   13
[2,]   12   13
#Chi-squared test
> chisq.test(color_jump)

	Pearson's Chi-squared test with Yates' continuity correction

data:  color_jump
X-squared = 28.926, df = 1, p-value = 7.519e-08

#Fisher's exact test
> fisher.test(color_jump)

	Fisher's Exact Test for Count Data

data:  color_jump
p-value = 1.148e-08
alternative hypothesis: true odds ratio is not equal to 1
95 percent confidence interval:
 0.001097566 0.095029572
sample estimates:
odds ratio 
0.01417599 

# G-test
> GTest(color_jump)

	Log likelihood ratio (G-test) test of independence without
	correction

data:  color_jump
G = 36.95, X-squared df = 1, p-value = 1.212e-09

G-test and T-test both give the significant outputs, and from Fisher’s exact test, we get the conclusion that pink elephants is about 70 times likely than grey elephants to do the jumping behavior.

#Illustrate results
barplot(as.matrix(color_jump), xlab="Elephants' skin color", names=c("Grey","Pink"), ylab="Counts", beside = T, legend = c('Jump', 'Not jump'), ylim = c(0, 30))

在这里插入图片描述

c) Is there any difference in color between the sexes? (2p)

Assume there is no difference in color between sexes. Test it with frequency test.

# lot data
barplot(as.matrix(sex_col), beside = T, legend = T, ylim = c(0, 25))

在这里插入图片描述

#Check assumptions
> fdata = sex_col
> fexp = matrix(NA, nrow(fdata), ncol(fdata))
> for(i in 1:nrow(fdata)) {
+   for(j in 1:ncol(fdata)) {
+     fexp[i,j] = (((sum(fdata[i,]))/sum(fdata))*((sum(fdata[,j]))/sum(fdata))*(sum(fdata)))
+   }
+ }
> fexp
     [,1] [,2]
[1,]  9.6 14.4
[2,] 10.4 15.6
#Chi-squared test
> chisq.test(sex_col)

	Pearson's Chi-squared test with Yates' continuity correction

data:  sex_col
X-squared = 5.0781, df = 1, p-value = 0.02423

#Fisher's exact test
> fisher.test(sex_col)

	Fisher's Exact Test for Count Data

data:  sex_col
p-value = 0.01997
alternative hypothesis: true odds ratio is not equal to 1
95 percent confidence interval:
  1.191479 19.192502
sample estimates:
odds ratio 
  4.512379 

#G-test
> GTest(sex_col)

	Log likelihood ratio (G-test) test of independence without
	correction

data:  sex_col
G = 6.6093, X-squared df = 1, p-value = 0.01014

All tests return significant results, the skin color is relayed with sexes.

#Illustrate results
barplot(as.matrix(sex_col), xlab="Elephants' sexes", names=c("Female","male"), ylab="Counts", beside = T, legend = c('Grey', 'Pink'), ylim = c(0, 30), col =c('Grey', 'Pink'))

在这里插入图片描述

d) What is the result if both sex and color are taken into account when predicting jumping behavior? (4p)

# plot data
f_g_yes = sum(elejump$sex == 'female' & elejump$colour == 'grey' & elejump$jumpflap == 1)
f_g_no = sum(elejump$sex == 'female' & elejump$colour == 'grey' & elejump$jumpflap == 0)
f_p_yes = sum(elejump$sex == 'female' & elejump$colour == 'pink' & elejump$jumpflap == 1)
f_p_no = sum(elejump$sex == 'female' & elejump$colour == 'pink' & elejump$jumpflap == 0)
m_g_yes = sum(elejump$sex == 'male' & elejump$colour == 'grey' & elejump$jumpflap == 1)
m_g_no = sum(elejump$sex == 'male' & elejump$colour == 'grey' & elejump$jumpflap == 0)
m_p_yes = sum(elejump$sex == 'male' & elejump$colour == 'pink' & elejump$jumpflap == 1)
m_p_no = sum(elejump$sex == 'male' & elejump$colour == 'pink' & elejump$jumpflap == 0)

Jump = data.frame(
    Sex = c('Female','Female','Female','Female','Male','Male','Male','Male'), 
    Color = c('Grey','Grey','Pink','Pink','Grey','Grey','Pink','Pink'), 
    Jump = c('Yes','No','Yes','No','Yes','No','Yes','No'), 
    Counts = c(f_g_yes,f_g_no,f_p_yes,f_p_no,m_g_yes,m_g_no,m_p_yes,m_p_no))
ggplot(Jump, aes(x=Jump, y=Counts, fill=Color)) +
  geom_bar(stat="identity", position="dodge") +
  facet_wrap(~Sex) +
  theme_bw()

在这里插入图片描述

> # Construct full model
> full.model = glm(jumpflap~sex+colour, data = elejump, family=binomial)
> # Model selection
> output = dredge(full.model, rank=AIC)
Fixed term is "(Intercept)"
> output
Global model call: glm(formula = jumpflap ~ sex + colour, family = binomial, data = elejump)
---
Model selection table 
  (Intrc) color sex df  logLik  AIC delta weight
2 -2.3980     +      2 -16.182 36.4  0.00  0.707
4 -2.6260     +   +  3 -16.064 38.1  1.76  0.293
3 -0.8473         +  2 -31.932 67.9 31.50  0.000
1  0.0000            1 -34.657 71.3 34.95  0.000
Models ranked by AIC(x) 

#Check assumptions
#Independent data
dwtest(fit)
#Residuals do not deviate from predicted distribution
par(mfrow=c(2,2))
plot(fit)
par(mfrow=c(1,1))

在这里插入图片描述

simulationOutput = simulateResiduals(fittedModel=fit, n=250)
plot(simulationOutput)
testDispersion(simulationOutput)

在这里插入图片描述

> #Get results of binomial GLM
> Anova(fit, type=3)
Analysis of Deviance Table (Type III tests)

Response: jumpflap
       LR Chisq Df Pr(>Chisq)    
colour    36.95  1  1.212e-09 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> summary(fit)

Call:
glm(formula = jumpflap ~ colour, family = binomial, data = elejump)

Deviance Residuals: 
     Min        1Q    Median        3Q       Max  
-2.07821  -0.41716   0.03901   0.49518   2.22931  

Coefficients:
            Estimate Std. Error z value Pr(>|z|)    
(Intercept)  -2.3979     0.7385  -3.247  0.00117 ** 
colourpink    4.4348     0.9603   4.618 3.88e-06 ***
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 69.315  on 49  degrees of freedom
Residual deviance: 32.365  on 48  degrees of freedom
AIC: 36.365

Number of Fisher Scoring iterations: 5

From the outcomes, we can see that the jump behavior is only related to color when both sex and color are taken into consideration. According to part c), sex is highly related to color, so when both of them is considered, only color is related to jump behavior. And the conclusion is pink elephant is more likely to jump off cliffs and flap the ears.

#Illustrate results
elejump$pred = predict(fit, elejump, type="response")
elejump$colour = factor(elejump$colour, levels(elejump$colour)[c(1,2)])
mean.ele = aggregate(list(exc=elejump$pred), by=list(colour=elejump$colour), mean, na.rm=TRUE)
barplot(exc~colour, data=mean.ele, names=c("Grey","Pink"), xlab="Sex", ylab="Probability of jump behaviour", ylim=c(0, 1), col="red")

在这里插入图片描述

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值