4.1 Multiple features (variables)
notation:
$n$ = number of features
$x^{(i)}$ = input (features) of the $i$th training example
$x_j^{(i)}$ = value of feature $j$ in the $i$th training example
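To make the indexing concrete, here is a minimal Python/NumPy sketch; the training set below is hypothetical (house size and bedroom counts are made-up values):

```python
import numpy as np

# Hypothetical training set: m = 3 examples, n = 2 features
# (columns: size in square feet, number of bedrooms; values are made up).
data = np.array([
    [2104, 3],   # x^{(1)}
    [1416, 2],   # x^{(2)}
    [1534, 3],   # x^{(3)}
])

n = data.shape[1]      # n = number of features -> 2
x_2 = data[1]          # x^{(2)}: all features of the 2nd training example
x_1_2 = data[1, 0]     # x_1^{(2)}: value of feature 1 in the 2nd example
print(n, x_2, x_1_2)   # 2 [1416    2] 1416
```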
hypothesis:
previously:
$$h_{\theta}(x) = \theta_0 + \theta_1 x$$
Now:
$$h_{\theta}(x) = \theta_0 + \theta_1 x_1 + \theta_2 x_2 + \cdots + \theta_n x_n = \theta^T x$$
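As a quick illustration, a minimal Python sketch evaluating this hypothesis for one example (the parameter values and features are made up for the example):

```python
import numpy as np

theta = np.array([80.0, 0.1, 25.0])  # hypothetical theta_0, theta_1, theta_2
x = np.array([2104.0, 3.0])          # one example with n = 2 features

# h_theta(x) = theta_0 + theta_1 * x_1 + theta_2 * x_2
h = theta[0] + np.dot(theta[1:], x)
print(h)  # ~365.4
```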
For convenience of notation, define $x_0 = 1$. Then

$$x = \begin{pmatrix} x_0 \\ x_1 \\ x_2 \\ \vdots \\ x_n \end{pmatrix} \in \mathbb{R}^{n+1} \qquad \theta = \begin{pmatrix} \theta_0 \\ \theta_1 \\ \theta_2 \\ \vdots \\ \theta_n \end{pmatrix} \in \mathbb{R}^{n+1}$$
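With $x_0 = 1$ prepended, the prediction collapses to a single inner product $\theta^T x$; a sketch continuing the made-up numbers above:

```python
import numpy as np

theta = np.array([80.0, 0.1, 25.0])  # theta in R^{n+1}
x = np.array([1.0, 2104.0, 3.0])     # x_0 = 1 prepended, so x in R^{n+1}

h = theta @ x                        # h_theta(x) = theta^T x
print(h)                             # ~365.4, same as before
```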
conclusion: a hypothesis of this form, with one parameter per feature, is called multivariate linear regression.
4.2 Gradient Descent for multiple variables
hypothesis: