Week 2
| Midterm exam | (Midterm exam)² | Final exam |
| --- | --- | --- |
| 89 | 7921 | 96 |
| 72 | 5184 | 74 |
| 94 | 8836 | 87 |
| 69 | 4761 | 78 |
You'd like to use polynomial regression to predict a student's final exam score from their midterm exam score. Concretely, suppose you want to fit a model of the form $h_\theta(x) = \theta_0 + \theta_1 x_1 + \theta_2 x_2$, where $x_1$ is the midterm score and $x_2$ is (midterm score)². Further, you plan to use both feature scaling (dividing by the "max-min", or range, of a feature) and mean normalization.
What is the normalized feature $x_2^{(2)}$? (Hint: midterm = 72, final = 74 is training example 2.) Please round off your answer to two decimal places and enter it in the text box below.
Explanation: mean normalization
Replace $x_i$ with $x_i - \mu_i$ so that the features have approximately zero mean. Do not apply this to $x_0 = 1$.
Mean normalization combined with scaling by the range:
$$ x_i := \dfrac{x_i - \mathrm{avg}}{\mathrm{max} - \mathrm{min}} $$
avg = (7921 + 5184 + 8836 + 4761) / 4 = 6675.5
answer = (5184 - 6675.5) / (8836 - 4761) = -1491.5 / 4075 ≈ -0.37
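The arithmetic above can be checked with a short Python sketch. The function name `normalize` and the variable names are illustrative choices, not part of the original exercise; the data comes from the table.

```python
# Midterm scores from the table; x2 is the squared midterm score.
midterm = [89, 72, 94, 69]
midterm_sq = [m ** 2 for m in midterm]  # [7921, 5184, 8836, 4761]


def normalize(values):
    """Mean-normalize a feature and scale it by its range (max - min)."""
    avg = sum(values) / len(values)
    value_range = max(values) - min(values)
    return [(v - avg) / value_range for v in values]


x2_normalized = normalize(midterm_sq)

# Training example 2 (midterm = 72) sits at index 1 in a zero-based list.
answer = round(x2_normalized[1], 2)
print(answer)  # -0.37
```

After normalization the feature values sum to zero (up to floating-point error), which is the "approximately zero mean" property the explanation refers to.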
2. Which of the following are reasons for using feature scaling?
It speeds up gradient descent by making it require fewer iterations to get to a good solution.
Explanation: Feature scaling speeds up gradient descent by avoiding many extra iterations that are required when one or more features take on much larger values than the rest.
The cost function J(θ) for linear regression has no local optima.
The magnitude of the feature values is insignificant in terms of computational cost.
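The claim that feature scaling lets gradient descent reach a good solution in fewer iterations can be illustrated with a small experiment. This is a minimal sketch, not part of the original quiz: it fits the simpler model h(x) = θ0 + θ1·x on the midterm and final scores from the table, and the learning rates (1e-4 for the raw feature, 1.0 for the scaled one) and the iteration count of 200 are hand-picked assumptions chosen so that both runs remain stable.

```python
midterm = [89, 72, 94, 69]
final = [96, 74, 87, 78]


def normalize(values):
    """Mean-normalize a feature and scale it by its range (max - min)."""
    avg = sum(values) / len(values)
    value_range = max(values) - min(values)
    return [(v - avg) / value_range for v in values]


def gradient_descent_cost(xs, ys, learning_rate, iterations):
    """Run batch gradient descent for h(x) = theta0 + theta1 * x.

    Starts from theta = (0, 0) and returns the final cost
    J = (1 / 2m) * sum of squared errors.
    """
    m = len(xs)
    theta0, theta1 = 0.0, 0.0
    for _ in range(iterations):
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        # Simultaneous update of both parameters.
        theta0 -= learning_rate * grad0
        theta1 -= learning_rate * grad1
    errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
    return sum(e * e for e in errors) / (2 * m)


# Same number of iterations; only the feature scale (and a matching
# learning rate) differs between the two runs.
cost_raw = gradient_descent_cost(midterm, final, 1e-4, 200)
cost_scaled = gradient_descent_cost(normalize(midterm), final, 1.0, 200)

print("cost with raw feature:   ", cost_raw)
print("cost with scaled feature:", cost_scaled)
```

With the raw feature, the intercept column (all ones) and the midterm column (values near 80) differ greatly in magnitude, so a learning rate small enough to stay stable makes progress on θ0 very slow. With the scaled feature, the same number of iterations ends at a noticeably lower cost.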