Computing the variance inflation factor in Python: variance inflation factors in ridge regression

Latest recommended article published on 2023-02-24 16:42:25

weixin_39926739


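I am running ridge regression on some collinear data. One way to identify a stable fit is the ridge trace, which I was able to produce thanks to a good example on scikit-learn. Another is to compute the variance inflation factor (VIF) of each variable as k increases. Once the VIFs drop below 5, the fit is considered satisfactory. Statsmodels has code for VIFs, but it is written for OLS regression. I have tried to adapt it to handle ridge regression.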
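To make the quantity concrete, here is a minimal sketch of one commonly cited definition of the ridge VIF: the diagonal of (R + kI)^-1 R (R + kI)^-1, where R is the correlation matrix of the predictors. That this is the definition used by the textbook is an assumption; the post does not confirm it. The function names `ridge_vif` and `ols_vif` and the synthetic data are hypothetical and exist only for illustration. At k = 0 the formula reduces to the usual OLS VIF, 1 / (1 - R_j^2).

```python
import numpy as np


def ridge_vif(X, k):
    """VIFs for ridge parameter k, computed on the correlation scale.

    Assumed definition: diagonal of (R + kI)^-1 R (R + kI)^-1,
    where R is the correlation matrix of the predictors.
    """
    R = np.corrcoef(X, rowvar=False)
    A = np.linalg.inv(R + k * np.eye(R.shape[0]))
    return np.diag(A @ R @ A)


def ols_vif(X):
    """Classic VIF_j = 1 / (1 - R_j^2) from auxiliary OLS regressions."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0)
    out = []
    for j in range(Z.shape[1]):
        others = np.delete(Z, j, axis=1)
        beta, *_ = np.linalg.lstsq(others, Z[:, j], rcond=None)
        resid = Z[:, j] - others @ beta
        r2 = 1.0 - (resid @ resid) / (Z[:, j] @ Z[:, j])
        out.append(1.0 / (1.0 - r2))
    return np.array(out)


# Synthetic collinear data (made up for illustration, not the textbook data)
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = x1 + 0.1 * rng.normal(size=200)   # nearly a copy of x1
x3 = rng.normal(size=200)
X_demo = np.column_stack([x1, x2, x3])

vif_k0 = ridge_vif(X_demo, 0.0)    # matches the OLS VIFs
vif_k = ridge_vif(X_demo, 0.05)    # smaller for every predictor
```

Under this definition every VIF decreases as k grows, which is the behavior the "VIF below 5" rule relies on.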

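I am checking my results against the example in Chapter 10 of Regression Analysis by Example, 5th edition. My code produces the correct results for k = 0.000, but not for any larger k. SAS code for the example is available, but I am not a SAS user and I do not know how its implementation differs from that of scikit-learn (and/or statsmodels).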
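One possible source of disagreement for k > 0 is the scale of the penalty. scikit-learn's `Ridge` minimizes ||y - Xw||^2 + alpha * ||w||^2 on whatever scale X is passed in, and `StandardScaler` divides by the population standard deviation, so for standardized predictors Z the cross-product Z'Z equals n times the correlation matrix. If the textbook's k is applied on the correlation scale (an assumption, not confirmed by the post), the equivalent penalty is alpha = n * k. The sketch below checks this with plain NumPy on made-up data; all variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50
X_raw = rng.normal(size=(n, 3))
X_raw[:, 1] += 0.9 * X_raw[:, 0]          # induce some collinearity
y_raw = X_raw @ np.array([1.0, -2.0, 0.5]) + rng.normal(size=n)

# Standardize with the population standard deviation (ddof=0),
# which is what StandardScaler uses.
Z = (X_raw - X_raw.mean(axis=0)) / X_raw.std(axis=0)
yc = y_raw - y_raw.mean()

R = Z.T @ Z / n                 # correlation matrix of the predictors
k = 0.05                        # ridge parameter on the correlation scale

# Ridge on the correlation scale: (R + kI) b = Z'y / n
b_corr = np.linalg.solve(R + k * np.eye(3), Z.T @ yc / n)

# Penalized least squares on Z with penalty alpha: (Z'Z + alpha I) b = Z'y
alpha = n * k
b_alpha = np.linalg.solve(Z.T @ Z + alpha * np.eye(3), Z.T @ yc)

# Using alpha = k directly applies a much weaker penalty
b_same = np.linalg.solve(Z.T @ Z + k * np.eye(3), Z.T @ yc)
```

Here `b_corr` and `b_alpha` coincide while `b_same` does not, so passing the textbook's k values straight to `alpha` would only agree at k = 0.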

I have been stuck on this for several days, so any help would be greatly appreciated.

#http://www.ats.ucla.edu/stat/sas/examples/chp/chp_ch10.htm

import numpy as np
import pandas as pd
import matplotlib.pyplot as pl
from sklearn import linear_model, preprocessing

example = pd.read_csv('by_example_import.csv')
example.dropna(inplace=True)

scaler = preprocessing.StandardScaler().fit(example)
# Note: transform() returns a new array. The result is not assigned here,
# so `example` itself is left unscaled.
scaler.transform(example)

X = example.drop(['year', 'import'], axis=1)
#c_matrix = X.corr()
y = example['import']
#w, v = np.linalg.eig(c_matrix)

###############################################################################
# Compute paths
alphas = [0.000, 0.001, 0.003, 0.005, 0.007, 0.009, 0.010, 0.012, 0.014, 0.016, 0.018,
          0.020, 0.022, 0.024, 0.026, 0.028, 0.030, 0.040, 0.050, 0.060, 0.070, 0.080,
          0.090, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.5, 2.0]

clf = linear_model.Ridge(fit_intercept=False)
clf2 = linear_model.Ridge(fit_intercept=False)
coefs = []
vif_list = [[] for x in range(X.shape[1])]

for a in alphas:
    # Ridge fit of the response on all predictors
    clf.set_params(alpha=a)
    clf.fit(X, y)
    coefs.append(clf.coef_)
    # Auxiliary ridge regression of each predictor on the others
    for j, data in enumerate(X.columns):
        cols = [col for col in X.columns if col not in [data]]
        Z = X[cols]
        yy = X.iloc[:, j]
        clf2.set_params(alpha=a)
        clf2.fit(Z, yy)
        r_squared_j = clf2.score(Z, yy)
        vif = 1. / (1. - r_squared_j)
        print(r_squared_j)
        vif_list[j].append(vif)

pd.DataFrame(vif_list, columns=alphas).T
pd.DataFrame(coefs, index=alphas)

###############################################################################
# Display results
ax = pl.gca()
ax.set_prop_cycle(color=['b', 'r', 'g', 'c', 'k', 'y', 'm'])
ax.plot(alphas, coefs)
# ridge_cv is not defined anywhere in this snippet, so this line is commented out
#pl.vlines(ridge_cv.alpha_, np.min(coefs), np.max(coefs), linestyle='dashdot')
pl.xlabel('alpha')
pl.ylabel('weights')
pl.title('Ridge coefficients as a function of the regularization')
pl.axis('tight')
pl.show()
