Task04 - Regression Analysis 3: Error Diagnostics for the Linear Regression Model

These notes are from the DataWhale September GitModel course; link:
https://github.com/Git-Model/Modeling-Universe/Data Analysis and Statistical Modeling/

I. Solving for the OLS Coefficients

The matrix solution of the OLS estimator:

$$\hat{\beta}=(X'X)^{-1}X'y$$

The standard error of the estimated coefficient $\hat{\beta}_j$ in matrix form:

$$\operatorname{se}(\hat{\beta}_j)=\hat{\sigma}\sqrt{\left[(X'X)^{-1}\right]_{j+1,\,j+1}}$$
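As a cross-check of these two formulas, the sketch below computes them directly with numpy on simulated data and compares the results against statsmodels (the variable names are ours, for illustration only):

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
X = sm.add_constant(rng.normal(size=(100, 2)))  # design matrix with intercept
y = X @ np.array([1.0, 2.0, -0.5]) + rng.normal(size=100)

# beta_hat = (X'X)^{-1} X'y, via a linear solve for numerical stability
beta_hat = np.linalg.solve(X.T @ X, X.T @ y)

# se(beta_hat_j) = sigma_hat * sqrt of the j-th diagonal entry of (X'X)^{-1}
resid = y - X @ beta_hat
sigma2_hat = resid @ resid / (len(y) - X.shape[1])
se = np.sqrt(sigma2_hat * np.diag(np.linalg.inv(X.T @ X)))

fit = sm.OLS(y, X).fit()
print(np.allclose(beta_hat, fit.params), np.allclose(se, fit.bse))  # True True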

 

1. Components of the variance of an estimated coefficient

$$\operatorname{Var}(\hat{\beta}_1)=\frac{\sigma^2}{\sum_{i=1}^{n}\hat{r}_{i1}^2}=\frac{\sigma^2}{SST_1(1-R_1^2)}$$

  • The error variance $\sigma^2$
    The variance of the random error can be understood as the noise in the dataset: the more noise, the greater the uncertainty of the estimate and the larger its variance, which makes intuitive sense. For a given dataset, however, the error variance is essentially fixed, so this is generally not the factor we work on.

  • The total variation of the regressor $x_1$, $SST_1$
    The larger a regressor's total variation, the more dispersed its values and the more reliable the estimate. In particular, a larger random sample tends to have larger total variation, which shrinks the estimator's variance and improves estimation precision.

  • The degree of linear association among regressors, $R_1^2$
    Here $R_1^2$ is the R-squared from regressing $x_1$ on the other regressors. The closer $R_1^2$ is to 1, the stronger the linear relationship between $x_1$ and the other regressors, and the larger the estimation variance. This near-linear dependence is called multicollinearity.
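To make the decomposition concrete, the sketch below simulates two correlated regressors and checks that the standard error statsmodels reports for $\hat{\beta}_1$ equals $\sqrt{\hat{\sigma}^2/(SST_1(1-R_1^2))}$ (the data and names are ours, for illustration):

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 500
x1 = rng.normal(size=n)
x2 = 0.6 * x1 + rng.normal(size=n)            # regressor correlated with x1
y = 1 + 2 * x1 + 3 * x2 + rng.normal(size=n)

fit = sm.OLS(y, sm.add_constant(np.column_stack([x1, x2]))).fit()

# Rebuild Var(beta1_hat) = sigma^2 / (SST1 * (1 - R1^2)), with sigma^2
# replaced by its estimate from the main regression
sst1 = ((x1 - x1.mean()) ** 2).sum()
r1_sq = sm.OLS(x1, sm.add_constant(x2)).fit().rsquared
se_manual = np.sqrt(fit.mse_resid / (sst1 * (1 - r1_sq)))
print(se_manual, fit.bse[1])  # the two values agree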

 

2. Ways to reduce estimation variance and improve precision
  • Use a more sensible sampling scheme to reduce noise in the data.
  • Increase the sample size.
  • Screen candidate variables by their degree of linear correlation, and avoid including too many variables.

 

3. Multicollinearity
  • Measuring multicollinearity:
    The severity of collinearity is usually measured with the variance inflation factor (VIF):
    $$VIF_{x_1}=\frac{1}{1-R_1^2}$$
    A $VIF > 10$ means the collinearity is severe and steps should be taken to reduce it.
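A minimal sketch of computing VIFs with statsmodels (variance_inflation_factor lives in statsmodels.stats.outliers_influence; the simulated data and names are ours):

import numpy as np
import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
df = pd.DataFrame({'x1': x1,
                   'x2': 0.9 * x1 + 0.1 * rng.normal(size=200),  # nearly collinear with x1
                   'x3': rng.normal(size=200)})

X = sm.add_constant(df)  # compute VIFs with the constant included
vifs = pd.Series([variance_inflation_factor(X.values, i) for i in range(X.shape[1])],
                 index=X.columns)
print(vifs)  # x1 and x2 show large VIFs; x3 stays near 1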

  • Reducing multicollinearity:
    a. Inspect the pairwise Pearson correlations and, guided by the specific business context, drop variables with high correlations;
    b. Reduce the number of regressors. In general, the more regressors in the model, the larger the R-squared (recall from Task02 and Task03 that R-squared measures how much of the dependent variable the regressors explain, and a larger value is usually considered better). But because of collinearity, more regressors is by no means always better.

4. Common mistakes in model building
  • (1) Omitting a relevant variable
    An omitted variable will usually bias the coefficient estimates of the model we actually fit, so the estimates are no longer accurate on average, except in two cases:
    a. The omitted variable's coefficient is 0, i.e. it has no effect on the dependent variable.
    b. The omitted variable is uncorrelated with the other variables under analysis.

  • (2) Including an irrelevant variable
    a. Adding an irrelevant variable does not affect the unbiasedness of OLS, which rests on assumptions MLR.1-MLR.4.
    b. Although such a variable contributes nothing to explaining the dependent variable, once it is collinear with the other regressors it inflates the variance of the other coefficient estimates.

5. A methodology for variable selection

First, be clear about how variable screening affects the model:

  • Removing variables from the model may bias the remaining coefficient estimates, but it reduces their variance: larger bias, smaller variance.
  • Adding variables to the model does not bias the coefficient estimates, but it increases their variance: smaller bias, larger variance.

Broadly, there are two approaches to deciding which variables to include:

  • Start with a few variables and add the ones we consider important step by step, using t-test significance to decide whether each stays; if we suspect quadratic or interaction effects, add those terms and judge them with a joint F-test, continuing until the model's explanatory power is reasonably high.
  • Start with many variables and remove the insignificant ones step by step.

The first approach is forward modeling: guided by statistical tests (t-tests, F-tests), variables or derived terms (quadratics, interactions) are added one at a time to raise the model's goodness of fit (R-squared).
The second approach is backward modeling: many variables are included at once, and those with poor explanatory power are removed step by step according to the significance of their coefficients (a minimal sketch of this procedure appears after the bullets below).

  • Which of the two to use depends on the situation. If the dataset has a great many variables, I lean toward the first approach, adding the variables we consider important step by step; if there are not many variables, the second approach is worth considering.
  • There are also automated selection methods, such as forward stepwise and backward stepwise selection. These automated methods focus on predictive power rather than on what each variable means for the model; they suit machine-learning prediction tasks rather than regression analysis, whose central purpose is explanation and inference.
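A minimal sketch of backward elimination by p-value, assuming a pandas DataFrame df whose columns are the target and the candidate regressors (the function name and the 0.05 threshold are ours, chosen for illustration):

import statsmodels.api as sm

def backward_eliminate(df, target, alpha=0.05):
    # Repeatedly drop the regressor with the largest p-value above alpha
    remaining = [c for c in df.columns if c != target]
    while remaining:
        X = sm.add_constant(df[remaining])
        fit = sm.OLS(df[target], X).fit()
        pvals = fit.pvalues.drop('const')
        worst = pvals.idxmax()
        if pvals[worst] <= alpha:
            break  # every remaining regressor is significant
        remaining.remove(worst)
    return fit, remaining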

The second remark above raises a dilemma that comes up often in modeling: a variable's contribution to the model's explanatory power (goodness of fit, i.e. model explanation) versus its meaning for the business (business explanation).

Which to favor usually depends on the purpose of the modeling:

  • If prediction is the priority, then a variable that improves the model's goodness of fit $R^2$ once added should be kept.
  • If the purpose is explanation and inference, then the variable's business meaning is paramount, as in an attribution analysis of housing-price movements.

II. On Testing the Normality Assumption

Once the degrees of freedom of the t distribution reach about 30, it is already close to the normal distribution, and in large samples the sample size pushes the degrees of freedom to a very large value, so using the t distribution for testing is in practice unproblematic.
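A quick numerical illustration with scipy: the 97.5% quantile of the t distribution approaches the normal quantile 1.96 as the degrees of freedom grow.

from scipy import stats

for df in (5, 10, 30, 100, 1000):
    print(df, round(float(stats.t.ppf(0.975, df)), 4))
print('normal', round(float(stats.norm.ppf(0.975)), 4))  # 1.96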

 

III. The Homoskedasticity Assumption
  • Model misspecification (which violates MLR.1) and non-normality of the random error have relatively mild effects on the model; violating the homoskedasticity assumption has more serious consequences:

    • Since MLR.1-MLR.4 still hold, the coefficient estimates remain unbiased.
    • Under heteroskedasticity, OLS is no longer the best linear unbiased estimator.
    • Under heteroskedasticity, the variance of the coefficient estimates is no longer computed correctly, which distorts the standard errors and in turn undermines the validity of the t-tests and F-tests.
  • The two consequences of heteroskedasticity, in short: OLS is no longer best, and the t-tests and F-tests are no longer reliable.

 

IV. Testing for Heteroskedasticity

A heteroskedasticity test addresses two questions:
(1) is the variance "constant";
(2) is the variance unrelated to the regressors.

There are two common tests of whether the variance is related to the regressors:

1. The Breusch-Pagan (BP) test

The essence of the BP test is to check whether the error variance is related to the first-order terms of all the regressors, the null being no relation.

 

2. The White test
  • The White test builds on the BP test by also examining the squares and cross products of all the regressors.
  • Although the White test covers a wider range of problems, the number of regressors in its auxiliary regression grows quadratically with the number of variables, so it uses up far more degrees of freedom than the BP test; when there are many regressors, the White test may be less suitable than the BP test.
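Both tests are available in statsmodels.stats.diagnostic, and both return (LM statistic, LM p-value, F statistic, F p-value). A minimal self-contained sketch on simulated heteroskedastic data (Homework 2 below wraps the same calls into helper functions):

import numpy as np
import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan, het_white

rng = np.random.default_rng(0)
x = rng.uniform(1, 10, size=200)
y = 1 + 2 * x + rng.normal(scale=x)  # error sd grows with x: heteroskedastic
fit = sm.OLS(y, sm.add_constant(x)).fit()

print('BP   :', het_breuschpagan(fit.resid, fit.model.exog))
print('White:', het_white(fit.resid, fit.model.exog))  # small p-values expected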

 

V. Generalized Least Squares (GLS)
1. Weighted Least Squares (WLS)

Suppose the form of the heteroskedasticity is known up to a constant, i.e.
$$\operatorname{Var}(u \mid x)=\sigma^2 h(x)$$
Then if we multiply both sides of the original model by $\frac{1}{\sqrt{h(x)}}$, i.e.
$$\frac{y}{\sqrt{h(x)}}=\frac{\beta_0^*}{\sqrt{h(x)}}+\beta_1^*\frac{x_1}{\sqrt{h(x)}}+\cdots+\beta_k^*\frac{x_k}{\sqrt{h(x)}}+\frac{u}{\sqrt{h(x)}}$$

and treat the regressors and random error carrying the factor $\frac{1}{\sqrt{h(x)}}$ as new variables and a new error term, the problem becomes homoskedastic and OLS can be applied.
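In statsmodels this transformation is handled by the weights argument of WLS, which expects an array of $1/h(x)$; a minimal sketch with $h(x)=x^2$ on simulated data (names are ours):

import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)
df = pd.DataFrame({'x': rng.uniform(1, 10, size=200)})
df['y'] = 1 + 2 * df.x + rng.normal(scale=df.x)  # Var(u|x) proportional to x^2

# h(x) = x^2, so pass weights = 1 / h(x)
wls_fit = sm.formula.wls('y ~ x', weights=1 / df.x**2, data=df).fit()
print(wls_fit.params)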

 

2. Feasible GLS (FGLS)

FGLS assumes $h(x)$ takes the form
$$h(x)=\exp(\delta_0+\delta_1 x_1+\cdots+\delta_k x_k)$$
so the conditional variance satisfies
$$\operatorname{Var}(u\mid x)=E(u^2\mid x)=\sigma^2\exp(\delta_0+\delta_1 x_1+\cdots+\delta_k x_k)$$
and therefore
$$E(u^2\mid x)=\sigma^2\exp(\delta_0+\delta_1 x_1+\cdots+\delta_k x_k)\ \Rightarrow\ \log(u^2)\sim x_1+\cdots+x_k$$
In other words, we first fit by OLS a linear regression with $\log(\hat{u}^2)$ as the dependent variable, then exponentiate the fitted model to obtain an estimate of $h(x)$.

This yields the following steps:

  1. Regress $y \sim x_1+\cdots+x_k$ and obtain the residuals $\hat{u}$.
  2. Regress $\log(\hat{u}^2) \sim x_1+\cdots+x_k$ and obtain the fitted values $\hat{g}$.
  3. Compute $\hat{h}=\exp(\hat{g})$.
  4. Using $1/\hat{h}$ as the weights, estimate $y \sim x_1+\cdots+x_k$ by WLS.
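The four steps map directly onto statsmodels formula calls; a minimal sketch on simulated data (column names are ours, chosen for illustration):

import numpy as np
import pandas as pd
import statsmodels.api as sm

rng = np.random.default_rng(0)
df = pd.DataFrame({'x': rng.uniform(1, 10, size=200)})
df['y'] = 1 + 2 * df.x + rng.normal(scale=np.exp(0.2 * df.x))  # log-linear variance

# Step 1: OLS on the original model, keep the residuals
ols_fit = sm.formula.ols('y ~ x', data=df).fit()

# Step 2: auxiliary regression of log(residual^2) on the regressors
df['log_u2'] = np.log(ols_fit.resid**2)
g_hat = sm.formula.ols('log_u2 ~ x', data=df).fit().fittedvalues

# Step 3: exponentiate the fitted values to estimate h(x)
h_hat = np.exp(g_hat)

# Step 4: WLS with weights 1 / h_hat
fgls_fit = sm.formula.wls('y ~ x', weights=1 / h_hat, data=df).fit()
print(fgls_fit.params)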

 

VI. Homework
1. Homework 1: a model with quadratic/log terms

We want to investigate which factors are related to a baby's birth weight. The dataset is bwght2.dta, and the variables used in this exercise are:

Dependent variable:
· bwght: the baby's birth weight

Independent variables:
· npvis: number of the mother's prenatal visits
· mage: mother's age

Work through the following in Python:
(1) Estimate the equation by OLS.
Print the regression report and answer: is the quadratic term of npvis significant? Does npvis have a significant effect on the dependent variable?

import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.anova import anova_lm

# Read in the data
bwght2=pd.read_stata('./data/bwght2.dta')
bwght2.describe().transpose()
# Fit the OLS model first
bwght2_lm_ols=sm.formula.ols('bwght~npvis+mage',data=bwght2).fit()
print(bwght2_lm_ols.summary())
# Output:
						OLS Regression Results                            
==============================================================================
Dep. Variable:                  bwght   R-squared:                       0.011
Model:                            OLS   Adj. R-squared:                  0.010
Method:                 Least Squares   F-statistic:                     9.669
Date:                Sun, 25 Sep 2022   Prob (F-statistic):           6.66e-05
Time:                        10:59:24   Log-Likelihood:                -13718.
No. Observations:                1764   AIC:                         2.744e+04
Df Residuals:                    1761   BIC:                         2.746e+04
Df Model:                           2                                         
Covariance Type:            nonrobust                                         
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept   3121.5979     92.997     33.567      0.000    2939.202    3303.993
npvis         15.3753      3.756      4.093      0.000       8.008      22.743
mage           3.4197      2.895      1.181      0.238      -2.259       9.098
==============================================================================
Omnibus:                      165.443   Durbin-Watson:                   1.914
Prob(Omnibus):                  0.000   Jarque-Bera (JB):              375.741
Skew:                          -0.564   Prob(JB):                     2.56e-82
Kurtosis:                       4.960   Cond. No.                         217.
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.


# Add a quadratic term in npvis
bwght2_lm_ols_2=sm.formula.ols('bwght~npvis+I(npvis**2)+mage',data=bwght2).fit()
print(bwght2_lm_ols_2.summary())
							 OLS Regression Results                            
==============================================================================
Dep. Variable:                  bwght   R-squared:                       0.014
Model:                            OLS   Adj. R-squared:                  0.012
Method:                 Least Squares   F-statistic:                     8.372
Date:                Sun, 25 Sep 2022   Prob (F-statistic):           1.59e-05
Time:                        12:47:10   Log-Likelihood:                -13715.
No. Observations:                1764   AIC:                         2.744e+04
Df Residuals:                    1760   BIC:                         2.746e+04
Df Model:                           3                                         
Covariance Type:            nonrobust                                         
=================================================================================
                    coef    std err          t      P>|t|      [0.025      0.975]
---------------------------------------------------------------------------------
Intercept      2989.5587    108.029     27.674      0.000    2777.681    3201.437
npvis            38.9402     10.538      3.695      0.000      18.271      59.609
I(npvis ** 2)    -0.8193      0.342     -2.393      0.017      -1.491      -0.148
mage              2.7406      2.905      0.943      0.346      -2.957       8.439
==============================================================================
Omnibus:                      151.579   Durbin-Watson:                   1.920
Prob(Omnibus):                  0.000   Jarque-Bera (JB):              330.833
Skew:                          -0.533   Prob(JB):                     1.45e-72
Kurtosis:                       4.834   Cond. No.                     1.49e+03
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
[2] The condition number is large, 1.49e+03. This might indicate that there are
strong multicollinearity or other numerical problems.


# Restricted model for the anova F test (npvis terms dropped, mage coefficient fixed at 1)
bwght2_lm_ols_1=sm.formula.ols('I(bwght-mage)~1',data=bwght2).fit()
print(bwght2_lm_ols_1.summary())
 	                    OLS Regression Results                            
==============================================================================
Dep. Variable:        I(bwght - mage)   R-squared:                       0.000
Model:                            OLS   Adj. R-squared:                  0.000
Method:                 Least Squares   F-statistic:                       nan
Date:                Sun, 25 Sep 2022   Prob (F-statistic):                nan
Time:                        12:53:22   Log-Likelihood:                -14245.
No. Observations:                1832   AIC:                         2.849e+04
Df Residuals:                    1831   BIC:                         2.850e+04
Df Model:                           0                                         
Covariance Type:            nonrobust                                         
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept   3371.5644     13.467    250.362      0.000    3345.153    3397.976
==============================================================================
Omnibus:                      193.634   Durbin-Watson:                   1.948
Prob(Omnibus):                  0.000   Jarque-Bera (JB):              484.108
Skew:                          -0.599   Prob(JB):                    7.54e-106
Kurtosis:                       5.215   Cond. No.                         1.00
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.

# F test via anova
anova_lm(bwght2_lm_ols_1,bwght2_lm_ols_2)

The result:
[anova_lm comparison table omitted]
Conclusion: from the regression report, the quadratic term of npvis is significant at the 5% level (p = 0.017), so npvis has a significant, non-linear effect on the dependent variable.

 

2. Homework 2: a heteroskedastic model

Following example 13 (the relationship between the number of supervisors and the number of workers), we explore the heteroskedasticity problem further.

Work through the following in Python.

(1) Assume the variance takes the form $\operatorname{Var}(u \mid x)=\sigma^2 x^2$ and fit by WLS; then compare the residual plots of the WLS and OLS fits and answer: has the heteroskedasticity been removed?

import numpy as np
import matplotlib.pyplot as plt

# Load the dataset
data=pd.read_table('./data/P176.txt')
data.head()
# Fit OLS, and WLS with weights 1/X^2
data_lm=sm.formula.ols('Y~X',data=data).fit()
data_lm_wls=sm.formula.wls('Y~X',weights=1/data.X**2,data=data).fit()
# Note: weights takes an array, not an "expression". If the variance function
# is h(x), pass an array of 1/h(x).
print(data_lm.summary().tables[1])
print(data_lm_wls.summary().tables[1])
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept     14.4481      9.562      1.511      0.143      -5.245      34.141
X              0.1054      0.011      9.303      0.000       0.082       0.129
==============================================================================
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept      3.8033      4.570      0.832      0.413      -5.608      13.215
X              0.1210      0.009     13.445      0.000       0.102       0.140
==============================================================================

# Scatter of the OLS residuals against X
fig=plt.figure(figsize=(13,6))
ax1=fig.add_subplot(1,2,1)
ax1.scatter(data.X,data_lm.resid)
ax1.set_xlabel('X')
ax1.set_ylabel('resid_ols')
ax1.set_title('resid_ols | X')

# Scatter of the WLS residuals against X
ax2=fig.add_subplot(1,2,2)
ax2.scatter(data.X,data_lm_wls.resid)
ax2.set_xlabel('X')
ax2.set_ylabel('resid_wls')
ax2.set_title('resid_wls | X')

[Figure: resid_ols | X (left) and resid_wls | X (right)]

from statsmodels.stats.diagnostic import het_white
from statsmodels.stats.diagnostic import het_breuschpagan
# In statsmodels the White test is called exactly like the BP test

def white_test(res, X):
    result = het_white(res, X)
    return pd.Series(result[0:4],
                     index=['white_lm_statistic','white_lm_pval',
                            'white_F_statistic','white_F_pval'])

# Helper that returns the BP test results as a labeled Series
def bp_test(res, X):
    result = het_breuschpagan(res, X)
    return pd.Series(result[0:4],
                     index=['bp_lm_statistic','bp_lm_pval',
                            'bp_F_statistic','bp_F_pval'])

# data_lm_wls=sm.formula.wls('Y~X',weights=1/data.X**2,data=data).fit()
data_lm_wls_reg=sm.formula.wls('Y~X',weights=1/data.X**2,data=data)
# Run the BP test
print(bp_test(data_lm_wls.resid,data_lm_wls_reg.exog))
print('----------------------------------')
# Run the White test
print(white_test(data_lm_wls.resid,data_lm_wls_reg.exog))
# Output:
bp_lm_statistic    12.769063
bp_lm_pval          0.000352
bp_F_statistic     22.431874
bp_F_pval           0.000074
dtype: float64
----------------------------------
white_lm_statistic    1.869561e+01
white_lm_pval         8.715662e-05
white_F_statistic     2.701550e+01
white_F_pval          7.166864e-07
dtype: float64
  • Conclusion: compared with the OLS fit, the WLS fit has not removed the heteroskedasticity; it only produces smaller coefficient standard errors than OLS.

 

(2) Re-estimate the model by FGLS, look at the residual plot, and answer: has the heteroskedasticity been removed?

data['resid']=data_lm.resid

# Auxiliary regression of log(residual^2) on X
data_lm_log_2=sm.formula.ols('np.log(resid**2)~X',data=data).fit()
h_hat=np.exp(data_lm_log_2.fittedvalues)

# WLS fit with the estimated weights 1/h_hat
data_lm_wls_2=sm.formula.wls('Y~X',weights=1/h_hat,data=data).fit()
print(data_lm_wls_2.summary())

# Output:
                            WLS Regression Results                            
==============================================================================
Dep. Variable:                      Y   R-squared:                       0.873
Model:                            WLS   Adj. R-squared:                  0.868
Method:                 Least Squares   F-statistic:                     171.6
Date:                Sun, 25 Sep 2022   Prob (F-statistic):           1.07e-12
Time:                        14:18:58   Log-Likelihood:                -110.15
No. Observations:                  27   AIC:                             224.3
Df Residuals:                      25   BIC:                             226.9
Df Model:                           1                                         
Covariance Type:            nonrobust                                         
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept      3.1257      5.074      0.616      0.543      -7.325      13.576
X              0.1238      0.009     13.098      0.000       0.104       0.143
==============================================================================
Omnibus:                        5.129   Durbin-Watson:                   2.275
Prob(Omnibus):                  0.077   Jarque-Bera (JB):                1.886
Skew:                           0.224   Prob(JB):                        0.390
Kurtosis:                       1.785   Cond. No.                     1.21e+03
==============================================================================

Notes:
[1] Standard Errors assume that the covariance matrix of the errors is correctly specified.
[2] The condition number is large, 1.21e+03. This might indicate that there are
strong multicollinearity or other numerical problems.

Inspect the scatter plots:

# Scatter of the OLS residuals against X
fig=plt.figure(figsize=(13,6))
ax1=fig.add_subplot(1,2,1)
ax1.scatter(data.X,data_lm.resid)
ax1.set_xlabel('X')
ax1.set_ylabel('resid_ols')
ax1.set_title('resid_ols | X')
# Scatter of the FGLS (WLS) residuals against X
ax2=fig.add_subplot(1,2,2)
ax2.scatter(data.X,data_lm_wls_2.resid)
ax2.set_xlabel('X')
ax2.set_ylabel('resid_wls')
ax2.set_title('resid_wls | X')

[Figure: resid_ols | X (left) and resid_wls | X (right) for the FGLS fit]

print(data_lm.summary().tables[1])
print(data_lm_wls_2.summary().tables[1])

==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept     14.4481      9.562      1.511      0.143      -5.245      34.141
X              0.1054      0.011      9.303      0.000       0.082       0.129
==============================================================================
==============================================================================
                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
Intercept      3.1257      5.074      0.616      0.543      -7.325      13.576
X              0.1238      0.009     13.098      0.000       0.104       0.143
==============================================================================
  • Conclusion: compared with the OLS fit, the FGLS fit still has not removed the heteroskedasticity either; it, too, only yields smaller coefficient standard errors than OLS.

 
(3) Plot log(Y) against X, examine how the variance behaves, and describe what you see; based on the scatter plot, venture a model you think is right.

# Scatter of Y against X, and of log(Y) against X
fig=plt.figure(figsize=(13,6))
ax1=fig.add_subplot(1,2,1)
ax1.scatter(data.X,data.Y)
ax1.set_xlabel('X')
ax1.set_ylabel('Y')
ax1.set_title('Y | X')

ax2=fig.add_subplot(1,2,2)
ax2.scatter(data.X,np.log(data.Y))
ax2.set_xlabel('X')
ax2.set_ylabel('log(Y)')
ax2.set_title('log(Y) | X')

[Figure: Y | X (left) and log(Y) | X (right)]

  • With the dependent variable in logs, the dispersion of the sample points improves as X grows, and the variance shrinks noticeably.
  • So a log transform of the dependent variable is worth considering, with a model along the lines of:
    $$\log(Y)=\beta_0+\beta_1 X$$

 
(4) Consider the new model
$$\log(Y)=\beta_0+\beta_1 X+\beta_2 X^2+u$$
Estimate it by OLS, plot the residual scatter, and describe what you find.

data_lm_new=sm.formula.ols('np.log(Y)~X+I(X**2)',data=data).fit()
fig=plt.figure(figsize=(13,6))
ax1=fig.add_subplot(1,1,1)
ax1.scatter(data.X,data_lm_new.resid)
ax1.set_xlabel('X')
ax1.set_ylabel('resid_ols')
ax1.set_title('resid_ols | X')

[Figure: resid_ols | X for the log-quadratic model]

  • In the reworked model the residuals stay roughly within [-0.3, 0.3], and the heteroskedasticity is essentially gone.
# Run the White test
data_lm_new_reg=sm.formula.ols('np.log(Y)~X+I(X**2)',data=data)
print(white_test(data_lm_new.resid,data_lm_new_reg.exog))

white_lm_statistic    9.615766
white_lm_pval         0.047422
white_F_statistic     3.042223
white_F_pval          0.038767
dtype: float64
  • The White test adds nuance: with p-values of about 0.047 (LM) and 0.039 (F), we cannot reject the null of homoskedasticity at the 0.01 significance level, though at the 0.05 level the evidence of remaining heteroskedasticity is marginal.

 
(5) Drawing on the four questions above, give your views on correcting heteroskedasticity in a model.

  • Correcting heteroskedasticity starts from the distribution of the data: use scatter plots of the dependent variable against the regressors, and of the residuals against the regressors, and look for a clear widening of the spread of the points. After a first visual judgment, confirm it with the BP and White tests.
  • Then revise the model, mainly through transformations of the variables, such as adding quadratic terms, log terms, or interaction terms, guided by the substance of the business problem.
    Taking log terms as an example, they mainly suit the following situation:

For variables with large values over wide ranges (price-type variables, population variables, and so on), a log transform is appropriate: wages, salaries, sales, firm market value, population, number of employees, etc.

  • The employee-count variable in this exercise fits that situation exactly, and the results of the transformed model further confirm that the transformation was the right one.