Linear Regression Model
Model Representation
$y_i = x_i^T \beta + \epsilon_i$
$Y = (y_1, \cdots, y_N)^T$
$X = (x_1, \cdots, x_N)^T \in \mathbb{R}^{N \times p}$
$\epsilon = (\epsilon_1, \cdots, \epsilon_N)^T \in \mathbb{R}^N$
$Y = X\beta + \epsilon$
Let $x_{i1} = 1$ for all $i$ (the first column of $X$ is the all-ones vector), so the model includes an intercept.
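The matrix form $Y = X\beta + \epsilon$ above can be simulated directly; a minimal numpy sketch, where all dimensions and parameter values are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
N, p = 100, 3  # illustrative sample size and number of columns (including intercept)

# Design matrix whose first column is all ones (the intercept term x_{i1} = 1)
X = np.column_stack([np.ones(N), rng.normal(size=(N, p - 1))])
beta = np.array([1.0, 2.0, -0.5])  # illustrative coefficients
sigma = 0.3

# Y = X beta + eps with eps ~ N(0, sigma^2 I)  (assumption A5)
eps = rng.normal(scale=sigma, size=N)
Y = X @ beta + eps
```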
Possible inputs:
- quantitative inputs and their transformations: polynomial terms, log terms
- qualitative inputs: encoded as dummy variables
- interactions between two variables (cross terms)
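A sketch of how such a design matrix might be assembled with numpy; the data, level encoding, and choice of which columns to include are all illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 8
x = rng.uniform(1, 5, size=N)      # a quantitative input
g = rng.integers(0, 3, size=N)     # a qualitative input with levels 0, 1, 2

# Quantitative transformations: polynomial and log terms
poly = x ** 2
logx = np.log(x)

# Qualitative input: dummy variables for levels 1 and 2 (level 0 is the baseline)
dummies = (g[:, None] == np.arange(1, 3)).astype(float)

# Interaction (cross term) between x and the level-1 dummy
interaction = x * dummies[:, 0]

# Assemble: intercept, x, x^2, log(x), two dummies, one interaction
X = np.column_stack([np.ones(N), x, poly, logx, dummies, interaction])
```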
Model Assumptions
A1. The relationship between $y$ and $x$ is linear.
A2. $X$ is a non-stochastic matrix and $\mathrm{rank}(X) = p$.
A3. $E\epsilon = 0$, which implies $EY = X\beta$.
A4. $\mathrm{cov}(\epsilon) = \sigma^2 I$.
A5. $\epsilon \sim N(0, \sigma^2 I)$.
$X$ may also be a random matrix, in which case the assumptions are restated conditionally on $X$:
A1. The relationship between $y$ and $x$ is linear.
A2. $\mathrm{rank}(X) = p$ with probability 1.
A3. $E(\epsilon \mid X) = 0$, which implies $E(Y \mid X) = X\beta$.
A4. $\mathrm{cov}(\epsilon \mid X) = \sigma^2 I$.
A5. $\epsilon \mid X \sim N(0, \sigma^2 I)$.
A sufficient condition for A2: $\lambda_{\min}(X^T X) \rightarrow \infty$ a.s.
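The growth of $\lambda_{\min}(X^T X)$ can be illustrated numerically; a small sketch assuming i.i.d. standard Gaussian rows (an illustrative design, not part of the notes):

```python
import numpy as np

rng = np.random.default_rng(2)
p = 3

def lambda_min(N):
    # Smallest eigenvalue of X^T X for an N x p matrix with i.i.d. N(0,1) rows
    X = rng.normal(size=(N, p))
    return np.linalg.eigvalsh(X.T @ X)[0]

# With i.i.d. rows, lambda_min(X^T X) grows roughly linearly in N,
# illustrating the sufficient condition for A2
for N in (100, 1000, 10000):
    print(N, lambda_min(N))
```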
Model Estimation
Least squares estimation: $\hat\beta$ minimizes $\|Y - X\beta\|^2$; under A2, $\hat\beta = (X^T X)^{-1} X^T Y$.
Comment: if $X$ is not of full rank, then $X^T X$ is not invertible. This is equivalent to $X$ containing an uninformative (redundant) column; the column space of $X$ is unchanged, so the fitted values remain well defined.
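A sketch of both cases on simulated data: the full-rank normal-equation solution, and the pseudoinverse (minimum-norm) solution when a duplicated column makes $X^T X$ singular. The fitted values coincide because the column space is unchanged.

```python
import numpy as np

rng = np.random.default_rng(3)
N = 50
X = np.column_stack([np.ones(N), rng.normal(size=(N, 2))])
beta = np.array([1.0, 2.0, -0.5])
Y = X @ beta + rng.normal(scale=0.1, size=N)

# Full-rank case: solve the normal equations (X^T X) beta_hat = X^T Y
beta_hat = np.linalg.solve(X.T @ X, X.T @ Y)

# Rank-deficient case: duplicate a column (an "uninformative" column).
# X^T X is now singular, but the column space is unchanged, so the
# pseudoinverse (minimum-norm solution) recovers the same fitted values.
X_def = np.column_stack([X, X[:, 1]])
beta_pinv = np.linalg.pinv(X_def) @ Y

fitted_full = X @ beta_hat
fitted_def = X_def @ beta_pinv
```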
Statistical Inference
BIC can consistently select the correct model.
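One way to see BIC at work, sketched on simulated nested models; the Gaussian BIC formula $N \log(\mathrm{RSS}/N) + k \log N$ and all data here are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(4)
N = 200
X_full = np.column_stack([np.ones(N), rng.normal(size=(N, 4))])
# The true model uses only the first 3 columns; the last 2 are noise columns
beta_true = np.array([1.0, 2.0, -1.0, 0.0, 0.0])
Y = X_full @ beta_true + rng.normal(scale=0.5, size=N)

def bic(X, Y):
    # Gaussian BIC up to constants: N * log(RSS / N) + k * log(N)
    beta_hat, *_ = np.linalg.lstsq(X, Y, rcond=None)
    rss = np.sum((Y - X @ beta_hat) ** 2)
    return N * np.log(rss / N) + X.shape[1] * np.log(N)

# Nested candidate models using the first k columns
scores = {k: bic(X_full[:, :k], Y) for k in range(1, 6)}
best = min(scores, key=scores.get)
print(scores)
# With a clear signal, BIC typically picks k = 3, the true model size
```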
Shrinkage Methods
- Ridge regression gives an estimator with smaller variance (at the cost of some bias).
- The lasso shrinks coefficients exactly to 0, so it also performs variable selection.
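A sketch of both shrinkage effects, assuming an orthonormal design so that ridge and the lasso have closed forms (ridge scales each OLS coefficient by $1/(1+\lambda)$; the lasso soft-thresholds); the data and tuning values are illustrative:

```python
import numpy as np

rng = np.random.default_rng(5)
N, p = 100, 5
Z = rng.normal(size=(N, p))
Q, _ = np.linalg.qr(Z)                 # orthonormal columns: Q^T Q = I
beta_true = np.array([3.0, -2.0, 0.0, 0.0, 0.0])
Y = Q @ beta_true + rng.normal(scale=0.5, size=N)

beta_ols = Q.T @ Y                     # OLS under an orthonormal design

# Ridge: (Q^T Q + lam I)^{-1} Q^T Y = beta_ols / (1 + lam),
# so every coefficient is shrunk toward 0 (smaller variance, some bias)
lam = 1.0
beta_ridge = np.linalg.solve(Q.T @ Q + lam * np.eye(p), Q.T @ Y)

# Lasso under an orthonormal design: soft-thresholding.
# Coefficients with |beta_ols| below t are set exactly to 0 (variable selection)
t = 0.8
beta_lasso = np.sign(beta_ols) * np.maximum(np.abs(beta_ols) - t, 0.0)
```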