Model selection
$d$ = degree of polynomial
$$d=1,\quad h_{\theta}(x)=\theta_0+\theta_1 x$$
$$d=2,\quad h_{\theta}(x)=\theta_0+\theta_1 x+\theta_2 x^{2}$$
$$d=3,\quad h_{\theta}(x)=\theta_0+\theta_1 x+\theta_2 x^{2}+\theta_3 x^{3}$$
$$d=10,\quad h_{\theta}(x)=\theta_0+\theta_1 x+\theta_2 x^{2}+\theta_3 x^{3}+\cdots+\theta_{10}x^{10}$$
Then fit each model to obtain its parameters $\Theta^{(d)}$, compute $J_{test}(\Theta^{(d)})$ for each one, and choose the degree $d$ with the lowest test error.
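But a problem remains: because $d$ was itself chosen using the test set, $J_{test}(\Theta^{(d)})$ is likely to be an optimistic estimate of the error on new examples.
- To get around this problem, split the data set into 3 pieces (training set 60%, cross validation set 20%, test set 20%). Fit each model on the training set, choose $d$ by the cross validation error, and report the error of the chosen model on the test set.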
$$J_{train}(\theta)=\frac{1}{2m}\sum_{i=1}^{m}\left(h_{\theta}(x^{(i)})-y^{(i)}\right)^2$$
$$J_{cv}(\theta)=\frac{1}{2m_{cv}}\sum_{i=1}^{m_{cv}}\left(h_{\theta}(x^{(i)})-y^{(i)}\right)^2$$
$$J_{test}(\theta)=\frac{1}{2m_{test}}\sum_{i=1}^{m_{test}}\left(h_{\theta}(x^{(i)})-y^{(i)}\right)^2$$
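The selection procedure can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: the data is a synthetic noisy quadratic, the fit is plain least squares without regularization, and the helper names (`design_matrix`, `fit_theta`, `cost`, `select_degree`) are made up for this example.

```python
import numpy as np

def design_matrix(x, d):
    """Columns 1, x, x^2, ..., x^d for a 1-D input array x."""
    return np.vander(x, d + 1, increasing=True)

def fit_theta(x, y, d):
    """Least-squares fit of a degree-d polynomial (no regularization)."""
    theta, *_ = np.linalg.lstsq(design_matrix(x, d), y, rcond=None)
    return theta

def cost(theta, x, y):
    """J(theta) = 1/(2m) * sum of squared errors on the given set."""
    d = len(theta) - 1
    residual = design_matrix(x, d) @ theta - y
    return float(residual @ residual) / (2 * len(y))

def select_degree(x, y, degrees=range(1, 11), seed=0):
    """Choose d by the lowest J_cv, then report J_test for that d only."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(x))
    n_train = int(0.6 * len(x))
    n_cv = int(0.2 * len(x))
    # 60% training, 20% cross validation, remaining 20% test
    train, cv, test = np.split(idx, [n_train, n_train + n_cv])

    thetas, j_cv = {}, {}
    for d in degrees:
        thetas[d] = fit_theta(x[train], y[train], d)   # fit on training set
        j_cv[d] = cost(thetas[d], x[cv], y[cv])        # evaluate on CV set

    best_d = min(j_cv, key=j_cv.get)
    j_test = cost(thetas[best_d], x[test], y[test])    # test set used once
    return best_d, j_cv, j_test

# Synthetic data (an assumption for illustration): a noisy quadratic.
rng = np.random.default_rng(42)
x = rng.uniform(-1.0, 1.0, size=200)
y = 1.0 + 2.0 * x - 3.0 * x**2 + rng.normal(scale=0.1, size=200)

best_d, j_cv, j_test = select_degree(x, y)
print("chosen d:", best_d)
print("J_test for chosen d:", j_test)
```

The test set is touched only after $d$ has been fixed, so $J_{test}$ here was not used to make any choice about the model.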
Diagnosing bias vs. variance
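Regularization and bias/variance
This part discusses how bias and variance interact with, and are affected by, the regularization of your learning algorithm. A large $\lambda$ pushes the model toward high bias (underfitting), while a small $\lambda$ pushes it toward high variance (overfitting).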
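One way to see the effect of $\lambda$ is to sweep over several values and compare the training and cross validation errors. The sketch below assumes a degree-8 polynomial, synthetic data, and the regularized normal equation with $\theta_0$ left unpenalized; the names `poly_features`, `fit_regularized` and `cost` are hypothetical helpers, not part of any library.

```python
import numpy as np

def poly_features(x, d):
    """Columns 1, x, ..., x^d for a 1-D input array x."""
    return np.vander(x, d + 1, increasing=True)

def fit_regularized(X, y, lam):
    """Solve (X^T X + lam * L) theta = X^T y, where L is the identity
    with a zero in the top-left corner so theta_0 is not penalized."""
    penalty = lam * np.eye(X.shape[1])
    penalty[0, 0] = 0.0
    return np.linalg.solve(X.T @ X + penalty, X.T @ y)

def cost(theta, X, y):
    """Squared-error cost 1/(2m) * sum(residual^2), without the penalty term."""
    r = X @ theta - y
    return float(r @ r) / (2 * len(y))

# Synthetic data (an assumption for illustration).
rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=100)
y = np.sin(3.0 * x) + rng.normal(scale=0.2, size=100)

X = poly_features(x, 8)
X_train, y_train = X[:60], y[:60]      # 60% training
X_cv, y_cv = X[60:80], y[60:80]        # 20% cross validation
# the last 20% would be the test set; it is not needed to choose lambda

lambdas = [0.0] + [0.01 * 2**k for k in range(11)]  # 0, 0.01, 0.02, ..., 10.24
j_train, j_cv = [], []
for lam in lambdas:
    theta = fit_regularized(X_train, y_train, lam)
    j_train.append(cost(theta, X_train, y_train))
    j_cv.append(cost(theta, X_cv, y_cv))

best_lam = lambdas[int(np.argmin(j_cv))]
print("lambda with lowest J_cv:", best_lam)
```

Note that $J_{train}$ and $J_{cv}$ are computed without the penalty term, so the errors stay comparable across different values of $\lambda$.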
Learning curves
If a learning algorithm is suffering from high bias, getting more training data will not (by itself) help much
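A learning curve makes this visible by tracking $J_{train}$ and $J_{cv}$ as the number of training examples $m$ grows. The sketch below is a hedged illustration of the high-bias case: it deliberately fits a straight line to synthetic quadratic data, and the helper names (`fit_line`, `cost`, `learning_curve`) are assumptions of this example.

```python
import numpy as np

def fit_line(x, y):
    """Least-squares fit of h(x) = theta_0 + theta_1 * x."""
    X = np.column_stack([np.ones_like(x), x])
    theta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return theta

def cost(theta, x, y):
    """J(theta) = 1/(2m) * sum of squared errors."""
    r = theta[0] + theta[1] * x - y
    return float(r @ r) / (2 * len(y))

def learning_curve(x_train, y_train, x_cv, y_cv, sizes):
    """For each m, train on the first m examples and record
    (m, J_train on those m examples, J_cv on the full CV set)."""
    rows = []
    for m in sizes:
        theta = fit_line(x_train[:m], y_train[:m])
        rows.append((m,
                     cost(theta, x_train[:m], y_train[:m]),
                     cost(theta, x_cv, y_cv)))
    return rows

# Synthetic data (an assumption): a quadratic target, so a line underfits.
rng = np.random.default_rng(1)
x = rng.uniform(-1.0, 1.0, size=400)
y = x**2 + rng.normal(scale=0.1, size=400)

curve = learning_curve(x[:300], y[:300], x[300:], y[300:],
                       sizes=[5, 10, 20, 50, 100, 200, 300])
for m, jt, jc in curve:
    print(f"m={m:3d}  J_train={jt:.4f}  J_cv={jc:.4f}")
```

In a high-bias setting like this one, both errors are expected to level off at a value well above the noise level, which is why adding more examples alone does not help much.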
- Get more training examples (fixes high variance)
- Try smaller sets of features (fixes high variance)
- Try getting additional features (fixes high bias)
- Try adding polynomial features ($x_1^2, x_2^2, x_1x_2$, etc.) (fixes high bias)
- Try decreasing $\lambda$ (fixes high bias)
- Try increasing $\lambda$ (fixes high variance)