python – Multiple linear regression in pandas statsmodels: ValueError

Latest recommended article published on 2024-07-16 15:41:33

孙柔嘉


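Tags: python, multiple linear regression, pandas

Copyright notice: This is an original article by the blogger, licensed under CC 4.0 BY-SA. Please include the original source link and this notice when reposting.

Article link: https://blog.csdn.net/weixin_32938207/article/details/113641354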


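This article covers a ValueError encountered when running a multiple linear regression with Python's statsmodels library. The error comes from swapping the dependent and independent variables. The fix is to pass the dependent variable as y and the independent variables as X. In the example, 'W' from the NBA dataset is the dependent variable, and 'PTS' and 'oppPTS' are the independent variables. After correcting the code and adding a constant term with sm.add_constant, the model fits and the OLS regression results report an R-squared of 0.942, indicating a good fit.

Summary generated by CSDN using AI.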

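When using sm.OLS(y, X), y is the dependent variable and X holds the independent variables.

In the formula W ~ PTS + oppPTS, W is the dependent variable, and PTS and oppPTS are the independent variables.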

So use

y = NBA['W']
X = NBA[['PTS', 'oppPTS']]

instead of

X = NBA['W']
y = NBA[['PTS', 'oppPTS']]

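The complete corrected script:

import pandas as pd
import statsmodels.api as sm

NBA = pd.read_csv("NBA_train.csv")

# Dependent variable: passed first to sm.OLS
y = NBA['W']
# Independent variables: passed second
X = NBA[['PTS', 'oppPTS']]
# sm.OLS does not add an intercept by itself, so add a constant column
X = sm.add_constant(X)

model11 = sm.OLS(y, X).fit()
print(model11.summary())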

Output:

                            OLS Regression Results
==============================================================================
Dep. Variable:                      W   R-squared:                       0.942
Model:                            OLS   Adj. R-squared:                  0.942
Method:                 Least Squares   F-statistic:                     6799.
Date:                Sat, 21 Mar 2015   Prob (F-statistic):               0.00
Time:                        14:58:05   Log-Likelihood:                -2118.0
No. Observations:                 835   AIC:                             4242.
Df Residuals:                     832   BIC:                             4256.
Df Model:                           2
Covariance Type:            nonrobust
==============================================================================
                 coef    std err          t      P>|t|      [95.0% Conf. Int.]
------------------------------------------------------------------------------
const         41.3048      1.610     25.652      0.000        38.144    44.465
PTS            0.0326      0.000    109.600      0.000         0.032     0.033
oppPTS        -0.0326      0.000   -110.951      0.000        -0.033    -0.032
==============================================================================
Omnibus:                        1.026   Durbin-Watson:                   2.238
Prob(Omnibus):                  0.599   Jarque-Bera (JB):                0.984
Skew:                           0.084   Prob(JB):                        0.612
Kurtosis:                       3.009   Cond. No.                     1.80e+05
==============================================================================

Warnings:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
[2] The condition number is large, 1.8e+05. This might indicate that there are
strong multicollinearity or other numerical problems.
