First, construct the data points in R:
> y <- c(5, 7, 9, 11, 16, 20)
> x <- c(1, 2, 3, 4, 7, 9)
> plot(x,y)
The resulting scatter plot is shown below.
Use lsfit(x, y) to compute the slope, intercept, and residuals of the regression line.
The input is as follows:
> lsfit(x,y)
The output is:
$coefficients
Intercept X
3.338028 1.845070
$residuals
[1] -0.18309859 -0.02816901 0.12676056 0.28169014 -0.25352113 0.05633803
$intercept
[1] TRUE
$qr
$qt
[1] -27.7608838 12.6939415 0.1787319 0.3366821 -0.1894673 0.1264331
$qr
Intercept X
[1,] -2.4494897 -10.61445555
[2,] 0.4082483 6.87992248
[3,] 0.4082483 0.05334462
[4,] 0.4082483 -0.09200586
[5,] 0.4082483 -0.52805728
[6,] 0.4082483 -0.81875823
$qraux
[1] 1.408248 1.198695
$rank
[1] 2
$pivot
[1] 1 2
$tol
[1] 1e-07
attr(,"class")
[1] "qr"
Here, $residuals denotes the residuals, where
residual = actual value - predicted value.
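The residuals above can be reproduced by hand from the fitted coefficients. As a cross-check outside R, here is a minimal Python sketch (plain standard library; the intercept and slope values are taken from the $coefficients output above):

```python
# Reproduce the lsfit residuals: residual = actual y - predicted y
x = [1, 2, 3, 4, 7, 9]
y = [5, 7, 9, 11, 16, 20]
intercept, slope = 3.338028, 1.845070  # from the $coefficients output above

predicted = [intercept + slope * xi for xi in x]
residuals = [yi - pi for yi, pi in zip(y, predicted)]
print([round(r, 6) for r in residuals])
# e.g. first residual: 5 - (3.338028 + 1.845070 * 1) ≈ -0.183098
```

The values agree with the $residuals component up to the rounding of the printed coefficients.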
Draw the regression line:
> abline(lsfit(x,y))
The regression fits the data well, as shown below:
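lsfit solves the ordinary least-squares problem via a QR decomposition (hence the $qr component in its output), but for simple regression the coefficients also have closed forms: slope = Sxy / Sxx and intercept = ȳ − slope·x̄. A short Python sketch verifying the reported values from these formulas:

```python
# Closed-form simple linear regression:
#   slope = Sxy / Sxx,  intercept = ybar - slope * xbar
x = [1, 2, 3, 4, 7, 9]
y = [5, 7, 9, 11, 16, 20]
n = len(x)
xbar = sum(x) / n
ybar = sum(y) / n
Sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
Sxx = sum((xi - xbar) ** 2 for xi in x)
slope = Sxy / Sxx                # ≈ 1.845070
intercept = ybar - slope * xbar  # ≈ 3.338028
print(round(slope, 6), round(intercept, 6))
```

Both numbers match the lsfit $coefficients output.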
Method 2:
Use the lm function for a more detailed regression analysis. The code and plot are as follows:
> x <- c(1, 2, 3, 4, 7, 9)
> y <- c(5, 7, 9, 11, 16, 20)
> lm(y~x) -> xy
> summary(xy)
Call:
lm(formula = y ~ x)
Residuals:
1 2 3 4 5 6
-0.18310 -0.02817 0.12676 0.28169 -0.25352 0.05634
Coefficients:
Estimate Std. Error t value Pr(>|t|)
(Intercept) 3.33803 0.16665 20.03 3.67e-05 ***
x 1.84507 0.03227 57.17 5.60e-07 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.222 on 4 degrees of freedom
Multiple R-squared: 0.9988, Adjusted R-squared: 0.9985
F-statistic: 3269 on 1 and 4 DF, p-value: 5.604e-07
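The summary statistics can likewise be recomputed from the residuals: R² = 1 − SSE/Syy, the residual standard error is √(SSE/(n−2)), and with a single predictor F = (R²/(1−R²))·(n−2). A small Python cross-check, assuming the same data:

```python
# Recompute R-squared, residual standard error, and the F-statistic for y ~ x
import math

x = [1, 2, 3, 4, 7, 9]
y = [5, 7, 9, 11, 16, 20]
n = len(x)
xbar, ybar = sum(x) / n, sum(y) / n
slope = (sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
         / sum((xi - xbar) ** 2 for xi in x))
intercept = ybar - slope * xbar

sse = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))  # residual SS
syy = sum((yi - ybar) ** 2 for yi in y)                                  # total SS
r2 = 1 - sse / syy                  # ≈ 0.9988 (Multiple R-squared)
rse = math.sqrt(sse / (n - 2))      # ≈ 0.222  (residual standard error on 4 df)
f_stat = (r2 / (1 - r2)) * (n - 2)  # ≈ 3269   (F-statistic on 1 and 4 DF)
print(round(r2, 4), round(rse, 3), round(f_stat))
```

These agree with the summary(xy) output above.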
> plot(x,y)
> abline( lm(y~x) )
The code above is adapted from Mai Hao's book 《机器学习实践指南》 (A Practical Guide to Machine Learning).