http://www.john-uebersax.com/stat/tetra.htm 提到可用两种方法做四分格相关系数的显著性检验,这里用到的是第一种方法,也就是利用标准差的检验.
<The Tetrachoric Correlation and its Asymptotic Standard Error>一文讲到四分格相关系数的标准差为s:
tetrachoric.se <- function(a, b, c, d) {
n <- a + b + c + d
z1 <- qnorm((a + c)/n)
z2 <- qnorm((a + b)/n)
r <- psych::tetrachoric(matrix(c(a, b, c, d), 2, 2))$rho
f1 <- pnorm((z1 - r * z2)/sqrt(1 - r * r)) - 0.5
f2 <- pnorm((z2 - r * z1)/sqrt(1 - r * r)) - 0.5
t <- (a + d) * (b + c)/4 +
(a + c) * (b + d) * f2 * f2 +
(a + b) * (c + d) * f1 * f1 +
2 * (a * d - b * c) * f1 * f2 +
(c * d - b * a) * f2 +
(b * d - a * c) * f1
s <- n^(-3/2) * sqrt(t)/
mnormt::dmnorm(c(z1, z2), varcov = matrix(c(1, r, r, 1), 2, 2))
c(n, z1, z2, r, f1, f2, t, s)
}
但是给出的假定相关系数为0时的标准差表达式有误,具体为下列代码中的s1,修正为s2:
tetrachoric.se0 <- function(a, b, c, d) {
n <- a + b + c + d
z1 <- qnorm((a + c)/n)
z2 <- qnorm((a + b)/n)
r <- 0
f1 <- pnorm((z1 - r * z2)/sqrt(1 - r * r)) - 0.5
f2 <- pnorm((z2 - r * z1)/sqrt(1 - r * r)) - 0.5
t <- (a + d) * (b + c)/4 +
(a + c) * (b + d) * f2 * f2 +
(a + b) * (c + d) * f1 * f1 +
2 * (a * d - b * c) * f1 * f2 +
(c * d - b * a) * f2 +
(b * d - a * c) * f1
s <- n^(-3/2) * sqrt(t)/
mnormt::dmnorm(c(z1, z2), varcov = matrix(c(1, r, r, 1), 2, 2))
s1 <- n^(-5/2) * sqrt((a + b) * (a + c) * (b + d) * (b + c))/
mnormt::dmnorm(c(z1, z2), varcov = matrix(c(1, 0, 0, 1), 2, 2))
s2 <- n^(-5/2) * sqrt((a + b) * (a + c) * (b + d) * (d + c))/
mnormt::dmnorm(c(z1, z2), varcov = matrix(c(1, 0, 0, 1), 2, 2))
c(n, z1, z2, r, f1, f2, t, s, s1, s2)
}
其中s为直接将相关系数设为0之后求得的标准差, s1为文章提供的表达式求得(估计印刷错误),s2为修正之后的值,与s的值相同.
有了相关系数为0的标准差之后,就可以检验相关系数的显著性了,p值为:
2 * pnorm(abs(r)/s2, lower.tail = F)
写一个完整的求四分格相关系数显著性的R语言代码:
tetrachoric.sig <- function(a, b, c, d) {
n <- a + b + c + d
z1 <- qnorm((a + c)/n)
z2 <- qnorm((a + b)/n)
r <- psych::tetrachoric(matrix(c(a, b, c, d), 2, 2))$rho
se <- n^(-5/2) * sqrt((a + b) * (a + c) * (b + d) * (d + c))/
mnormt::dmnorm(c(z1, z2), varcov = matrix(c(1, 0, 0, 1), 2, 2))
p <- 2 * pnorm(abs(r)/se, lower.tail = F)
list(rho = r, p.value = p)
}
tetrachoric.sig <- function(m) {
n <- sum(m)
z1 <- qnorm(sum(m[1, ])/n)
z2 <- qnorm(sum(m[, 1])/n)
r <- psych::tetrachoric(m)$rho
se <- n^(-5/2) * sqrt(prod(rowSums(m), colSums(m)))/
mnormt::dmnorm(c(z1, z2), varcov = matrix(c(1, 0, 0, 1), 2, 2))
p <- 2 * pnorm(abs(r)/se, lower.tail = F)
list(rho = r, p.value = p)
}