Hypothesis Testing with R, and Understanding P-values and Confidence Intervals
Hypothesis Testing with R
Dataset description
Using the Galton dataset, we test how the heights of sons and daughters relate to the heights of their mothers.
library("AzureML")
ws <- workspace()
galton <- download.datasets(ws, "GaltonFamilies.csv")
head(galton)
The first 6 rows of the data and the columns:
dim(galton)
The data has 939 rows; the second value returned by dim(galton) is the number of columns (attributes).
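One way to read the goal stated above is as a correlation test between a mother's height and her child's height, run separately for sons and daughters. The sketch below illustrates that idea on synthetic data, so it runs without the AzureML workspace. The column names mother, gender and childHeight, the data frame toy, and the simulated numbers are all assumptions made for illustration; they are not taken from the downloaded file, and this is not necessarily the test used later in this document.

```r
# Synthetic stand-in for the Galton data.
# Column names (mother, gender, childHeight) are assumptions for illustration.
set.seed(42)
n <- 200
mother <- rnorm(n, mean = 64, sd = 2.3)
gender <- sample(c("male", "female"), n, replace = TRUE)
# Child height depends weakly on the mother's height, plus an offset for sons.
childHeight <- 35 + 0.45 * mother + ifelse(gender == "male", 5, 0) + rnorm(n, sd = 2)
toy <- data.frame(mother, gender, childHeight)

# Split the children by gender.
sons <- toy[toy$gender == "male", ]
daughters <- toy[toy$gender == "female", ]

# Test H0: the true correlation is zero, within each group (Pearson by default).
test.sons <- cor.test(sons$mother, sons$childHeight)
test.daughters <- cor.test(daughters$mother, daughters$childHeight)

# Each result carries the estimate, the p-value and a 95% confidence interval.
print(test.sons$estimate)
print(test.sons$p.value)
print(test.sons$conf.int)
print(test.daughters$estimate)
print(test.daughters$p.value)
print(test.daughters$conf.int)
```

A small p-value suggests the observed correlation would be unlikely if the true correlation were zero, and the confidence interval gives a range of correlation values compatible with the sample. On the real data, the same calls would apply after replacing toy with galton and substituting whatever column names head(galton) actually shows.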
Data visualization
Plot histograms to show the height relationship between mothers and sons, and between mothers and daughters.
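## Histogram of one column (col is a column name given as a string),
## with bin width bw and x-axis limits fixed to [min, max]
hist.plot = function(df, col, bw, max, min){
ggplot(df, aes_string(col)) + geom_histogram(binwidth = bw) + xlim(min, max)
}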
hist.family = function(df, col1, col2, num.bin = 30){
require(ggplot2)
require(gridExtra)
## compute bin width
max = max(c(df[, col1], df[, co