Machine Learning: Linear Models Assignment


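1. The following data show how the heat released by cement (y) relates to its composition (x1 to x4). Find the linear dependence of y on x1, x2, x3 and x4.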

y        x1    x2    x3    x4
78.5      7    26     6    60
74.3      1    29    15    52
104.3    11    56     8    20
87.6     11    31     8    47
95.9      7    52     6    33
109.2    11    55     9    22
102.7     3    71    17     6
72.5      1    31    22    44
93.1      2    54    18    22
115.9    21    47     4    26
83.8      1    40    23    34
113.3    11    66     9    12
109.4    10    68     8    12

Save the data above, including the header row, in an Excel file named data.xlsx.

Python code:

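import pandas as pd
import statsmodels.api as sm

# Read the spreadsheet; the first row is treated as the header
data = pd.read_excel('data.xlsx')
data.columns = ['y', 'x1', 'x2', 'x3', 'x4']
# Independent variables, with a constant column added for the intercept
x = sm.add_constant(data.iloc[:, 1:])
# Dependent variable
y = data['y']
# Build the ordinary least squares model
model = sm.OLS(y, x)
# Fit the model
result = model.fit()
# Print the model summary
print(result.summary())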

Output:

                            OLS Regression Results                            
==============================================================================
Dep. Variable:                      y   R-squared:                       0.982
Model:                            OLS   Adj. R-squared:                  0.974
Method:                 Least Squares   F-statistic:                     111.5
Date:                Tue, 11 Oct 2022   Prob (F-statistic):           4.76e-07
Time:                        21:24:03   Log-Likelihood:                -26.918
No. Observations:                  13   AIC:                             63.84
Df Residuals:                       8   BIC:                             66.66
Df Model:                           4                                         
Covariance Type:            nonrobust                                         
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const         62.4054     70.071      0.891      0.399     -99.179     223.989
x1             1.5511      0.745      2.083      0.071      -0.166       3.269
x2             0.5102      0.724      0.705      0.501      -1.159       2.179
x3             0.1019      0.755      0.135      0.896      -1.638       1.842
x4            -0.1441      0.709     -0.203      0.844      -1.779       1.491
==============================================================================
Omnibus:                        0.165   Durbin-Watson:                   2.053
Prob(Omnibus):                  0.921   Jarque-Bera (JB):                0.320
Skew:                           0.201   Prob(JB):                        0.852
Kurtosis:                       2.345   Cond. No.                     6.06e+03
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
[2] The condition number is large, 6.06e+03. This might indicate that there are
strong multicollinearity or other numerical problems.

From the results above, we obtain the regression model: $y = 62.4054 + 1.5511x_1 + 0.5102x_2 + 0.1019x_3 - 0.1441x_4$
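The coefficients can be cross-checked without statsmodels or the Excel file. The following is a minimal sketch that assumes the cement table from problem 1 is typed in directly as an array and solves the same least-squares problem with NumPy; the variable names are illustrative and not part of the original solution.

```python
import numpy as np

# Cement data transcribed from the table in problem 1: columns y, x1, x2, x3, x4.
data = np.array([
    [78.5, 7, 26, 6, 60],
    [74.3, 1, 29, 15, 52],
    [104.3, 11, 56, 8, 20],
    [87.6, 11, 31, 8, 47],
    [95.9, 7, 52, 6, 33],
    [109.2, 11, 55, 9, 22],
    [102.7, 3, 71, 17, 6],
    [72.5, 1, 31, 22, 44],
    [93.1, 2, 54, 18, 22],
    [115.9, 21, 47, 4, 26],
    [83.8, 1, 40, 23, 34],
    [113.3, 11, 66, 9, 12],
    [109.4, 10, 68, 8, 12],
])

y = data[:, 0]
# Prepend a column of ones so the first coefficient is the intercept.
X = np.column_stack([np.ones(len(y)), data[:, 1:]])

# Ordinary least squares via NumPy's least-squares solver.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

# Coefficient of determination: 1 - SS_res / SS_tot.
residuals = y - X @ beta
r_squared = 1 - residuals @ residuals / np.sum((y - y.mean()) ** 2)

print(np.round(beta, 4))    # intercept, then x1..x4
print(round(r_squared, 3))
```

The printed values should agree with the coef column and the R-squared entry of the summary, up to rounding.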

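The output also shows that Prob (F-statistic) is 4.76e-07, which is close to zero, so the regression as a whole is significant: y has a significant linear relationship with x1, x2, x3 and x4 taken together. R-squared is 0.982, meaning the model explains about 98.2% of the variance in y. Note, however, that none of the individual coefficients is significant at the 5% level (every P>|t| value exceeds 0.05), and the large condition number (6.06e+03) suggests multicollinearity among the predictors.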
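Note [2] in the output warns of possible multicollinearity. One common way to examine this is the variance inflation factor (VIF), which is 1 / (1 - R²) from regressing each predictor on the others. The sketch below is an illustration only, assuming the predictor columns of the cement table are typed in directly; the helper name `vif` is hypothetical, and the usual "VIF above about 10" threshold is a rule of thumb rather than a strict test.

```python
import numpy as np

# Predictors x1..x4 transcribed from the cement table in problem 1.
X = np.array([
    [7, 26, 6, 60],
    [1, 29, 15, 52],
    [11, 56, 8, 20],
    [11, 31, 8, 47],
    [7, 52, 6, 33],
    [11, 55, 9, 22],
    [3, 71, 17, 6],
    [1, 31, 22, 44],
    [2, 54, 18, 22],
    [21, 47, 4, 26],
    [1, 40, 23, 34],
    [11, 66, 9, 12],
    [10, 68, 8, 12],
], dtype=float)


def vif(X, j):
    """VIF of column j: 1 / (1 - R^2) from regressing column j on the other columns."""
    target = X[:, j]
    others = np.delete(X, j, axis=1)
    # Include an intercept in the auxiliary regression.
    A = np.column_stack([np.ones(len(target)), others])
    coef, *_ = np.linalg.lstsq(A, target, rcond=None)
    resid = target - A @ coef
    r2 = 1 - resid @ resid / np.sum((target - target.mean()) ** 2)
    return 1 / (1 - r2)


vifs = [vif(X, j) for j in range(X.shape[1])]
for name, value in zip(["x1", "x2", "x3", "x4"], vifs):
    print(f"{name}: VIF = {value:.1f}")
```

Large VIF values here would be consistent with the wide confidence intervals on the individual coefficients in the summary.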

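2. Research has found that a student's spending on books and extracurricular reading (y) is related to the student's years of education (x1) and family income level (x2). Survey data for 18 students are given in the table below. Find the regression model.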

y         x1    x2
450.5      4    171.2
507.7      4    174.2
613.9      5    204.3
563.4      4    218.7
501.5      4    219.4
781.5      7    240.4
541.8      4    273.5
611.1      5    294.8
1222.1    10    330.2
793.2      7    333.1
660.8      5    366
792.7      6    350.9
580.8      4    357.9
612.7      5    359
890.8      7    371.9
1121       9    435.3
1094.2     8    523.9
1253      10    604.1

Save the data above, including the header row, in an Excel file named data.xlsx.

Python code:

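import pandas as pd
import statsmodels.api as sm

# Read the spreadsheet; the first row is treated as the header
data = pd.read_excel('data.xlsx')
data.columns = ['y', 'x1', 'x2']
# Independent variables, with a constant column added for the intercept
x = sm.add_constant(data.iloc[:, 1:])
# Dependent variable
y = data['y']
# Build the ordinary least squares model
model = sm.OLS(y, x)
# Fit the model
result = model.fit()
# Print the model summary
print(result.summary())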

Output:

                            OLS Regression Results                            
==============================================================================
Dep. Variable:                      y   R-squared:                       0.980
Model:                            OLS   Adj. R-squared:                  0.977
Method:                 Least Squares   F-statistic:                     362.4
Date:                Tue, 11 Oct 2022   Prob (F-statistic):           2.00e-13
Time:                        22:22:18   Log-Likelihood:                -89.942
No. Observations:                  18   AIC:                             185.9
Df Residuals:                      15   BIC:                             188.6
Df Model:                           2                                         
Covariance Type:            nonrobust                                         
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const         -0.9756     30.322     -0.032      0.975     -65.606      63.655
x1           104.3146      6.409     16.276      0.000      90.654     117.975
x2             0.4022      0.116      3.457      0.004       0.154       0.650
==============================================================================
Omnibus:                        0.776   Durbin-Watson:                   2.561
Prob(Omnibus):                  0.678   Jarque-Bera (JB):                0.728
Skew:                          -0.230   Prob(JB):                        0.695
Kurtosis:                       2.128   Cond. No.                     1.13e+03
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
[2] The condition number is large, 1.13e+03. This might indicate that there are
strong multicollinearity or other numerical problems.

From the results above, we obtain the regression model: $y = -0.9756 + 104.3146x_1 + 0.4022x_2$
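To show how the fitted equation would be used, here is a small sketch that plugs values into it. The coefficients are copied from the summary above; the function name `predict_spending` and the example student (6 years of education, family income of 300 in the table's units) are hypothetical and chosen only for illustration.

```python
# Coefficients copied from the coef column of the summary above.
INTERCEPT = -0.9756
B_EDUCATION = 104.3146   # coefficient of x1 (years of education)
B_INCOME = 0.4022        # coefficient of x2 (family income level)


def predict_spending(x1, x2):
    """Predicted spending y for education years x1 and family income x2."""
    return INTERCEPT + B_EDUCATION * x1 + B_INCOME * x2


# Hypothetical student, not a row from the survey table.
example = predict_spending(6, 300.0)
print(round(example, 2))
```

Predictions like this are most trustworthy for inputs inside the range covered by the survey data (x1 from 4 to 10, x2 from about 171 to 604).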

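The output also shows that Prob (F-statistic) is 2.00e-13, which is close to zero, so the regression as a whole is significant: y has a significant linear relationship with x1 and x2. R-squared is 0.980, meaning the model explains about 98.0% of the variance in y. Both x1 and x2 are also individually significant (P>|t| of 0.000 and 0.004), while the intercept is not (P>|t| = 0.975).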
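As a rough consistency check, the F-statistic can be recovered from R-squared with the standard identity for a model with an intercept. The symbols n (number of observations) and k (number of predictors) are introduced here for illustration; their values, n = 18 and k = 2, are read from the No. Observations and Df Model entries of the summary.

$$F = \frac{R^2 / k}{(1 - R^2) / (n - k - 1)} = \frac{0.980 / 2}{0.020 / 15} \approx 367.5$$

This is close to the reported 362.4; the gap is consistent with R-squared having been rounded to three decimals before being plugged in.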
