Tilde in R ~
The tilde symbol (~) is used within the formulas of statistical models in R: it defines the relationship between the dependent variable and the independent variables. The left side of the tilde specifies the target variable (dependent variable or outcome) and the right side specifies the predictor variables (independent variables). (Source: https://www.geeksforgeeks.org/use-of-tilde-in-r/)
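As a minimal sketch of what the tilde produces, the snippet below builds a formula without fitting any model. The variable names `bweight`, `sex`, and `age` are placeholders chosen for illustration; no data with those names needs to exist, because creating a formula does not evaluate its variables.

```r
# A formula is an ordinary R object; creating one evaluates nothing.
f <- bweight ~ sex + age   # hypothetical variable names

class(f)      # "formula"
all.vars(f)   # "bweight" "sex" "age"
f[[2]]        # left-hand side (outcome): bweight
f[[3]]        # right-hand side (predictors): sex + age
```

Modelling functions such as t.test() or lm() receive this object and look the variables up in the data they are given.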
Below source: https://bookdown.org/danieljcarter/r4steph/two-sample-t-test.html
Two-sample t-test
We can also conduct a two-sample t-test to determine if the mean population birthweight in boys is the same as the mean population birthweight in girls. The syntax here is slightly different as it uses R’s formula interface. A formula is indicated by the presence of a tilde (~), and the tilde is shorthand for ‘estimate’. So the formula in the code chunk below says: estimate birthweight from sex. This is slightly counter-intuitive for the t-test but will make more sense when applied more generally under a regression framework later on.
Before the t-test, the var.test() command can be used to conduct an F test of whether the equality of variance assumption holds; the t-test itself is then run with var.equal = TRUE.
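The notes do not include the var.test() call itself, so here is a rough sketch of how it might look. It runs on simulated data, not on the bab9 dataset: the data frame `toy`, its values, and the 0.05 cut-off are all assumptions made for illustration.

```r
# Simulated stand-in for the birthweight data (NOT the bab9 dataset).
set.seed(1)
toy <- data.frame(
  sex     = factor(rep(c("male", "female"), each = 50)),
  bweight = c(rnorm(50, mean = 3200, sd = 500),
              rnorm(50, mean = 3050, sd = 500))
)

# F test of equal variances, using the same formula interface as t.test().
vt <- var.test(bweight ~ sex, data = toy)
vt$p.value   # a large p-value gives no evidence against equal variances

# Student's t-test if equal variances look plausible, Welch's test otherwise.
# The 0.05 cut-off is an illustrative choice, not a rule from the source.
t.test(bweight ~ sex, data = toy, var.equal = vt$p.value > 0.05)
```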
```r
#--- Run the two-sample t-test
# %$% is the exposition pipe from magrittr; it exposes the columns of bab9.
# Assumes the bab9 data frame from the source is already loaded.
library(magrittr)
bab9 %$% t.test(bweight ~ sex, var.equal = TRUE)
##
##  Two Sample t-test
##
## data:  bweight by sex
## t = 3, df = 600, p-value = 0.001
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##   66.6 267.7
## sample estimates:
##   mean in group male mean in group female
##                 3211                 3044
```
Bonferroni correction for multiple tests
Source: https://book.phylolab.net/binf8441/lab7.html
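```r
# data: column 1 holds the group label (0 or 1),
# columns 2 to 5 hold the four measurements to be tested.
numtest <- 4
pvalue <- numeric(numtest)
for (i in 1:numtest) {
  # Compare column i + 1 between the two groups defined by column 1
  pvalue[i] <- t.test(data[, i + 1] ~ data[, 1])$p.value
}
print("the Bonferroni adjusted pvalues")
pvalue * numtest
```
Each of the four measurement columns is modelled from the first column, which holds the group label (0 or 1). In other words, the rows are split into two groups according to the first column and the group means are compared. Because var.equal is not set, t.test() runs the Welch test, as the header of the output shows. Output of a single test from the loop:
```
> t.test(data[,i+1] ~ data[,1])

	Welch Two Sample t-test

data:  data[, i + 1] by data[, 1]
t = -2.9682, df = 17.128, p-value = 0.008566
alternative hypothesis: true difference in means between group 0 and group 1 is not equal to 0
95 percent confidence interval:
 -2.6167353 -0.4430667
sample estimates:
mean in group 0 mean in group 1
      0.3970631       1.9269641
```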
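Multiplying each p-value by the number of tests, as the loop above does, can give values greater than 1. Base R's p.adjust() applies the same Bonferroni multiplication but caps the result at 1. A small sketch with made-up p-values (the numbers in `p_raw` are hypothetical, not results from the lab data):

```r
# Hypothetical raw p-values from four tests (illustration only).
p_raw <- c(0.008, 0.04, 0.30, 0.60)

# Manual Bonferroni: multiply by the number of tests (can exceed 1).
p_manual <- p_raw * length(p_raw)

# Built-in version: same multiplication, but capped at 1.
p_bonf <- p.adjust(p_raw, method = "bonferroni")

p_manual   # 0.032 0.160 1.200 2.400
p_bonf     # 0.032 0.160 1.000 1.000
```

With the vector from the loop, this would be p.adjust(pvalue, method = "bonferroni").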