OLS: Ordinary Least Squares
-
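OLS regression predicts a quantitative dependent variable from a weighted sum of predictor variables, where the weights are parameters estimated from the data.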
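To make "weights estimated from the data" concrete, here is a minimal sketch that computes the least-squares weights by solving the normal equations directly and compares them with lm(). It assumes the built-in women dataset used later in these notes; the names X, y, beta_hat and fit_check are illustrative only.

```r
# Design matrix: a column of ones (intercept) plus the predictor
X <- cbind(Intercept = 1, height = women$height)
y <- women$weight

# Solve the normal equations (X'X) b = X'y for the weights b
beta_hat <- solve(crossprod(X), crossprod(X, y))

# lm() should give the same estimates (it uses a QR decomposition internally)
fit_check <- lm(weight ~ height, data = women)
print(cbind(normal_eq = drop(beta_hat), lm = coef(fit_check)))
```

In practice lm() is the better choice; the explicit solve is only here to show where the weights come from.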
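Assumptions about the data:
Normality: for fixed values of the independent variables, the dependent variable is normally distributed.
Independence: the Yi values are independent of one another.
Linearity: the dependent variable is linearly related to the independent variables.
Homoscedasticity: the variance of the dependent variable does not change with the levels of the independent variables. This is also called constant variance.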
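One common way to inspect these assumptions is through the standard diagnostic plots of a fitted model. The sketch below assumes the women dataset and a hypothetical model object named fit_diag. Independence usually has to be judged from how the data were collected, so it is not covered here.

```r
fit_diag <- lm(weight ~ height, data = women)

# Four standard diagnostic plots in a 2 x 2 grid:
# residuals vs fitted (linearity), normal Q-Q (normality),
# scale-location (constant variance), residuals vs leverage (influence)
op <- par(mfrow = c(2, 2))
plot(fit_diag)
par(op)  # restore the previous plotting layout

# Shapiro-Wilk test on the residuals as a rough numeric check of normality
sw <- shapiro.test(residuals(fit_diag))
print(sw)
```

With only 15 observations the test has little power, so the plots are usually the more informative part.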
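A regression model with one dependent variable and one independent variable is called simple linear regression.
When there is only one predictor variable but powers of that variable are also included (for example, X, X^2, X^3), it is called polynomial regression.
When there is more than one predictor variable, it is called multiple linear regression.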
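The three cases differ only in the right-hand side of the model formula. The sketch below shows them side by side. It uses the built-in mtcars dataset purely for illustration; the dataset and the variable choices are assumptions, not part of the original notes.

```r
# Simple linear regression: one predictor
simple_fit <- lm(mpg ~ wt, data = mtcars)

# Polynomial regression: one predictor plus its square
# I() protects ^ so it means arithmetic power, not formula syntax
poly_fit <- lm(mpg ~ wt + I(wt^2), data = mtcars)

# Multiple linear regression: more than one predictor
multi_fit <- lm(mpg ~ wt + hp, data = mtcars)

# Number of estimated coefficients in each model (intercept included)
n_coef <- sapply(list(simple = simple_fit, polynomial = poly_fit,
                      multiple = multi_fit),
                 function(m) length(coef(m)))
print(n_coef)
```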
Simple linear regression
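# Regress weight on height using the built-in women dataset
fit <- lm(weight ~ height, data = women)
summary(fit)
# Observed values, fitted (predicted) values, and residuals (observed minus fitted)
women$weight
fitted(fit)
residuals(fit)
# Scatter plot of the data with the fitted regression line
plot(women$height, women$weight, main = "Women Age 30-39", xlab = "Height (in inches)", ylab = "Weight (in pounds)")
abline(fit)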
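The three vectors printed above are tied together: each observed weight equals its fitted value plus its residual. The sketch below checks that identity and then uses predict() on two hypothetical heights (60 and 70 inches, chosen only for illustration). The object names are assumptions.

```r
fit_simple <- lm(weight ~ height, data = women)

# Estimated intercept and slope
print(coef(fit_simple))

# Observed = fitted + residual, for every row
recon <- unname(fitted(fit_simple) + residuals(fit_simple))
print(all.equal(recon, women$weight))

# Predicted weights for new, hypothetical heights
new_heights <- data.frame(height = c(60, 70))
pred <- predict(fit_simple, newdata = new_heights)
print(pred)
```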
Polynomial regression
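# Add a quadratic term; I() makes ^ mean arithmetic power inside the formula
fit2 <- lm(weight ~ height + I(height^2), data = women)
summary(fit2)
plot(women$height, women$weight, main = "Women Age 30-39",
     xlab = "Height (in inches)", ylab = "Weight (in lbs)")
# Overlay the fitted quadratic curve
lines(women$height, fitted(fit2))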
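To judge whether the quadratic term is worth keeping, the two nested models can be compared. This is a minimal sketch, assuming the women dataset; fit_lin and fit_quad are illustrative names that refit the two models so the block runs on its own.

```r
fit_lin  <- lm(weight ~ height, data = women)
fit_quad <- lm(weight ~ height + I(height^2), data = women)

# F test for nested models: does adding height^2 reduce the residual error?
cmp <- anova(fit_lin, fit_quad)
print(cmp)

# Adjusted R-squared penalizes the extra parameter
adj_r2 <- c(linear    = summary(fit_lin)$adj.r.squared,
            quadratic = summary(fit_quad)$adj.r.squared)
print(adj_r2)
```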
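Linear model: $Y \sim \log(x_1) + \sin(x_2)$. A model like this still counts as linear because it is linear in its parameters, even though the predictors are transformed. The same reasoning applies to polynomial regression.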
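A small simulation can illustrate that lm() handles transformed predictors without any special treatment. Everything in this sketch is made up for illustration: the sample size, the true coefficients (1, 2, 3), the noise level, and the variable names.

```r
set.seed(1)
n  <- 200
x1 <- runif(n, 1, 10)        # positive values so log() is defined
x2 <- runif(n, 0, 2 * pi)

# Simulated response: linear in the coefficients 1, 2, 3
y <- 1 + 2 * log(x1) + 3 * sin(x2) + rnorm(n, sd = 0.1)

# Transformations can be written directly in the formula
fit_trans <- lm(y ~ log(x1) + sin(x2))
print(coef(fit_trans))
```

The estimates should land close to the coefficients used to generate the data, which is what "linear in the parameters" buys.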
In general, an nth-degree polynomial produces a curve with up to n-1 bends.
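The bend count follows from the derivative of an nth-degree polynomial having degree n-1, so the slope can change sign at most n-1 times. The sketch below counts slope sign changes numerically for two hypothetical cubics; the polynomials, the grid, and the helper count_bends are assumptions chosen for illustration.

```r
# Count how often the slope changes sign along a curve sampled on a grid
count_bends <- function(y) {
  slope_sign <- sign(diff(y))
  sum(diff(slope_sign) != 0)
}

x <- seq(-2, 2, by = 0.01)

# Cubic with two turning points (at x = -1 and x = 1)
bends_a <- count_bends(x^3 - 3 * x)

# Cubic with no turning points: n - 1 is an upper bound, not a guarantee
bends_b <- count_bends(x^3)

print(c(bends_a = bends_a, bends_b = bends_b))
```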
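install.packages("car")  # only needed once; also installs its dependency carData
library(carData)
library(car)
# Scatter plot with a linear fit and a smoothed (loess) fit
scatterplot(weight ~ height, data = women, spread = FALSE,
            lty.smooth = 2, pch = 19, main = "Women Age 30-39", xlab = "Height (inches)",
            ylab = "Weight (lbs.)")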
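The spread = FALSE option suppresses the spread and asymmetry information, which is based on the root-mean-square of the positive and negative residuals around the smooth curve. The lty.smooth = 2 option draws the loess fit as a dashed line. The pch = 19 option plots the points as filled circles (the default is open circles). These argument names follow the older car interface; recent versions of car take smoother settings through a smooth = list(...) argument, so check ?scatterplot for the installed version.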
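If car is not available, a roughly similar picture can be drawn with base R alone. Note the swap: this sketch uses lowess() from the stats package rather than car's loess smoother, so the dashed curve may differ slightly from the one scatterplot() draws. The name smooth_fit is illustrative.

```r
# Filled points, as with pch = 19 in the car call
plot(women$height, women$weight, pch = 19, main = "Women Age 30-39",
     xlab = "Height (inches)", ylab = "Weight (lbs.)")

# Solid line: ordinary least-squares fit
abline(lm(weight ~ height, data = women))

# Dashed line: locally weighted smoother, as with lty.smooth = 2
smooth_fit <- lowess(women$height, women$weight)
lines(smooth_fit, lty = 2)
```

A gap between the solid and dashed lines at the ends of the height range is the visual hint that a curved model may fit better.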
states <- as.data.frame(state.x77[, c("Murder", "Populati