Improvements and Diagnostics on Algorithms
1. How to Evaluate A Hypothesis
Split the data set into 2 parts: training set + test set
If $J_{test}(\theta)$ is high while $J(\theta)$ is low, then overfitting occurs.
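Linear Regression Test Error:
Same as $J(\theta)$, computed on the test set:
$J_{test}(\theta) = \frac{1}{2m_{test}} \sum_{i=1}^{m_{test}} (h_\theta(x_{test}^{(i)}) - y_{test}^{(i)})^2$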
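Logistic Regression Test Error:
$err(h_\theta(x), y) = 1$ if ($h_\theta(x) \geq 0.5$ and $y = 0$) or ($h_\theta(x) < 0.5$ and $y = 1$), otherwise $0$
then
$\text{Test Error} = \frac{1}{m_{test}} \sum_{i=1}^{m_{test}} err(h_\theta(x_{test}^{(i)}), y_{test}^{(i)})$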
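The evaluation procedure in this section can be sketched in Python as follows. This is a minimal illustration, not a reference implementation: the helper names (`split_train_test`, `squared_error_cost`, `misclassification_error`), the 70/30 split, and the toy data are all assumptions made for the example, and the logistic case assumes the usual 0/1 misclassification error with a 0.5 threshold.

```python
import numpy as np

def split_train_test(X, y, test_fraction=0.3, seed=0):
    """Shuffle the examples, then split them into a training and a test part."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(y))
    n_test = int(round(test_fraction * len(y)))
    test_idx, train_idx = idx[:n_test], idx[n_test:]
    return X[train_idx], y[train_idx], X[test_idx], y[test_idx]

def squared_error_cost(theta, X, y):
    """J(theta) = 1/(2m) * sum((X @ theta - y)^2), without a regularization term."""
    residual = X @ theta - y
    return float(residual @ residual) / (2 * len(y))

def misclassification_error(probabilities, y, threshold=0.5):
    """Fraction of examples whose thresholded prediction differs from the label."""
    predictions = (np.asarray(probabilities) >= threshold).astype(int)
    return float(np.mean(predictions != np.asarray(y)))

# Toy data (an assumption for this sketch): y = 1 + 2x plus a little noise.
rng = np.random.default_rng(1)
x = rng.uniform(-1, 1, size=100)
X = np.column_stack([np.ones_like(x), x])
y = 1 + 2 * x + rng.normal(scale=0.1, size=100)

X_train, y_train, X_test, y_test = split_train_test(X, y)
theta, *_ = np.linalg.lstsq(X_train, y_train, rcond=None)  # fit on the training set only
j_train = squared_error_cost(theta, X_train, y_train)
j_test = squared_error_cost(theta, X_test, y_test)
print(f"J_train={j_train:.4f}  J_test={j_test:.4f}")
```

A test error far above the training error would point to overfitting; on this toy data a straight line fits well, so the two should come out close.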
2. Model Selection
Split the data set into 3 parts: training set + cross validation (CV) set + test set
1) Optimize the parameters in Θ using the training set for each polynomial degree.
2) Find the polynomial degree d with the least error using the cross validation set.
3) Estimate the generalization error using the test set with $J_{test}(\Theta^{(d)})$, where $\Theta^{(d)}$ is the theta from the polynomial of degree $d$ with the lowest cross validation error.
In practice, the CV set and test set should be picked at random!
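The three steps above can be sketched as follows. This is only an illustration under stated assumptions: the helper names (`half_mse`, `select_degree`), the 60/20/20 split, the candidate degrees 1 to 6, and the synthetic quadratic data are all made up for the example, and `np.polyfit` stands in for whatever optimizer is used to fit each polynomial.

```python
import numpy as np

def half_mse(coeffs, x, y):
    """Squared-error cost 1/(2m) * sum((p(x) - y)^2) for polynomial coefficients."""
    residual = np.polyval(coeffs, x) - y
    return float(np.mean(residual ** 2)) / 2

def select_degree(x_train, y_train, x_cv, y_cv, degrees):
    """Fit one polynomial per degree on the training set, pick the best on the CV set."""
    fits = {d: np.polyfit(x_train, y_train, d) for d in degrees}      # step 1
    cv_errors = {d: half_mse(fits[d], x_cv, y_cv) for d in degrees}   # step 2
    best_d = min(cv_errors, key=cv_errors.get)
    return best_d, fits[best_d], cv_errors

# Synthetic data (an assumption for this sketch): a noisy quadratic.
rng = np.random.default_rng(0)
x = rng.uniform(-2, 2, size=300)
y = x ** 2 - x + rng.normal(scale=0.2, size=300)

# Random 60/20/20 split into training, CV and test indices.
idx = rng.permutation(300)
train, cv, test = idx[:180], idx[180:240], idx[240:]

best_d, theta_d, cv_errors = select_degree(x[train], y[train], x[cv], y[cv], range(1, 7))
j_test = half_mse(theta_d, x[test], y[test])                          # step 3
print(f"selected d={best_d}, J_test={j_test:.4f}")
```

The test set is touched only once, after the degree has been chosen, so that `j_test` is not biased by the selection of `d`.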
3. Diagnosing Bias & Variance
Training error decreases as $d$ (the polynomial degree) increases.
Validation error first decreases, then increases as $d$ becomes bigger.
High Bias: $J_{CV}(\theta) \approx J_{train}(\theta)$, and both are high
High Variance: $J_{CV}(\theta)$ is high, $J_{train}(\theta)$ is low
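As a rough illustration, the two rules above can be written as a small helper. The function name `diagnose` and both thresholds (`acceptable_error` for what counts as "high", `gap_factor` for what counts as "much bigger") are hypothetical choices for this sketch; the notes do not specify numeric cutoffs, and sensible values depend on the problem.

```python
def diagnose(j_train, j_cv, acceptable_error, gap_factor=2.0):
    """Label a (J_train, J_CV) pair using the rules of thumb from this section.

    acceptable_error and gap_factor are illustrative thresholds, not fixed constants.
    """
    train_is_high = j_train > acceptable_error
    cv_is_high = j_cv > acceptable_error
    large_gap = j_cv > gap_factor * j_train

    if train_is_high and not large_gap:
        return "high bias"            # J_CV ~ J_train, both high
    if cv_is_high and not train_is_high:
        return "high variance"        # J_CV high, J_train low
    if train_is_high and large_gap:
        return "high bias and high variance"
    return "ok"

print(diagnose(j_train=0.9, j_cv=1.0, acceptable_error=0.1))
print(diagnose(j_train=0.01, j_cv=0.8, acceptable_error=0.1))
```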
4. Choosing $\lambda$ When Doing Regularization
Try a range of values for $\lambda$, doubling it each time ($\lambda := \lambda * 2$). Pick the one with the least $J_{CV}(\theta)$ and check its test error.
High Bias: $J_{CV}(\theta) \approx J_{train}(\theta)$ is high, $\lambda$ is big
High Variance: $J_{CV}(\theta)$ is high, $J_{train}(\theta)$ is low, $\lambda$ is small
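A minimal sketch of this search, assuming regularized linear regression solved with the normal equation. The helper names (`fit_regularized`, `cost`, `poly_features`), the starting value 0.01, the degree-8 features, and the synthetic data are assumptions for the example. Note that $\lambda$ is used only when fitting; $J_{train}$, $J_{CV}$ and $J_{test}$ are computed without the regularization term.

```python
import numpy as np

def fit_regularized(X, y, lam):
    """Regularized normal equation; the bias column (column 0) is not penalized."""
    penalty = lam * np.eye(X.shape[1])
    penalty[0, 0] = 0.0
    return np.linalg.solve(X.T @ X + penalty, X.T @ y)

def cost(theta, X, y):
    """Unregularized squared-error cost 1/(2m) * sum((X @ theta - y)^2)."""
    residual = X @ theta - y
    return float(residual @ residual) / (2 * len(y))

def poly_features(x, degree):
    """Columns 1, x, x^2, ..., x^degree."""
    return np.column_stack([x ** p for p in range(degree + 1)])

# Synthetic data (an assumption for this sketch), already in random order.
rng = np.random.default_rng(0)
x = rng.uniform(-1, 1, size=90)
y = np.sin(3 * x) + rng.normal(scale=0.2, size=90)
X = poly_features(x, 8)
X_train, y_train = X[:30], y[:30]
X_cv, y_cv = X[30:60], y[30:60]
X_test, y_test = X[60:], y[60:]

# Candidate values: 0, then 0.01 doubled repeatedly up to 10.24.
lambdas = [0.0] + [0.01 * 2 ** k for k in range(11)]

results = []
for lam in lambdas:
    theta = fit_regularized(X_train, y_train, lam)
    results.append((cost(theta, X_cv, y_cv), lam, theta))

best_cv, best_lam, best_theta = min(results, key=lambda r: r[0])
j_test = cost(best_theta, X_test, y_test)
print(f"best lambda={best_lam}, J_cv={best_cv:.4f}, J_test={j_test:.4f}")
```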
5. Learning Curves
x-axis is $m$ (the number of training examples), y-axis is error
High Bias:
If bias is high, adding more training data won’t help.
High Variance:
If variance is high, adding more training data may help.
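A learning curve can be computed by training on the first $m$ examples for growing $m$ and recording both errors. The sketch below deliberately sets up a high-bias case (a straight line fitted to quadratic data); the helper names, the chosen sizes, and the data are assumptions for the example.

```python
import numpy as np

def cost(theta, X, y):
    """Squared-error cost 1/(2m) * sum((X @ theta - y)^2)."""
    residual = X @ theta - y
    return float(residual @ residual) / (2 * len(y))

def learning_curve(X_train, y_train, X_cv, y_cv, sizes):
    """For each m: fit on the first m training examples, then return
    (m, J_train on those m examples, J_CV on the full CV set)."""
    curve = []
    for m in sizes:
        theta, *_ = np.linalg.lstsq(X_train[:m], y_train[:m], rcond=None)
        curve.append((m,
                      cost(theta, X_train[:m], y_train[:m]),
                      cost(theta, X_cv, y_cv)))
    return curve

# High-bias setup (an assumption for this sketch): linear model, quadratic data.
rng = np.random.default_rng(0)
x = rng.uniform(-2, 2, size=400)
y = x ** 2 + rng.normal(scale=0.1, size=400)
X = np.column_stack([np.ones_like(x), x])

curve = learning_curve(X[:300], y[:300], X[300:], y[300:], sizes=[5, 20, 80, 300])
for m, j_train, j_cv in curve:
    print(f"m={m:3d}  J_train={j_train:.3f}  J_cv={j_cv:.3f}")
```

In a high-bias case like this one, both errors stay high even at the largest `m`, which is why more data does not help. Swapping in a high-degree polynomial model would instead show a large gap between the two curves at small `m` that tends to narrow as `m` grows.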
6. Solutions for Bias & Variance
High Bias:
-more features;
-more polynomial features;
-decreasing $\lambda$
High Variance:
-more examples;
-fewer features;
-increasing $\lambda$
7. Bias & Variance for Neural Networks
Small Network: High Bias
Big Network: High Variance; use $\lambda$ (regularization) to reduce it
8. Error Metrics: Precision & Recall
Set y=1 for the rare class.
- Precision: Of all y=1 predictions, how many are correctly detected?
- Recall: Of all the rare cases, how many are correctly detected?
How to compare precision and recall? Use the F score.
F score = $\frac{2PR}{P+R}$
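A short worked example of these metrics in plain Python. The function name `precision_recall_f` and the sample labels are made up for illustration; returning 0.0 when a denominator is zero is a convention chosen for this sketch, not something the notes specify.

```python
def precision_recall_f(y_true, y_pred):
    """Compute precision, recall and F score for binary labels, with y=1 the rare class."""
    pairs = list(zip(y_true, y_pred))
    true_pos = sum(1 for t, p in pairs if t == 1 and p == 1)
    false_pos = sum(1 for t, p in pairs if t == 0 and p == 1)
    false_neg = sum(1 for t, p in pairs if t == 1 and p == 0)

    # Precision: of all y=1 predictions, the fraction that are truly y=1.
    precision = true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0
    # Recall: of all true y=1 cases, the fraction that were detected.
    recall = true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0
    # F score = 2PR / (P + R); defined as 0 here when P + R = 0.
    f_score = (2 * precision * recall / (precision + recall)
               if precision + recall else 0.0)
    return precision, recall, f_score

y_true = [1, 0, 0, 1, 0, 0, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 0, 0, 1, 0, 0]
p, r, f = precision_recall_f(y_true, y_pred)
print(f"precision={p:.3f} recall={r:.3f} F={f:.3f}")

# A classifier that always predicts y=0 has 70% accuracy here but an F score of 0.
print(precision_recall_f(y_true, [0] * len(y_true)))
```

The second call shows why accuracy alone is misleading with rare classes: always predicting the common class looks good on accuracy, while the F score exposes that no rare case was found.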