Machine Learning Week 1
Classifying learning algorithms
Supervised learning:
"right answers" given
i) Regression: predict continuous-valued outputs
ii) Classification: predict discrete-valued outputs (0 or 1, or more classes)
Unsupervised learning:
Clustering (algorithm)
Linear Regression
Hypothesis Function
The Hypothesis Function:
hθ(x) = θ0 + θ1·x
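A minimal sketch of the hypothesis in Python (the names `theta0`, `theta1` are my own, not from the course):

```python
def h(theta0, theta1, x):
    # Univariate linear-regression hypothesis: h_theta(x) = theta0 + theta1 * x
    return theta0 + theta1 * x
```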
In detail:
How to choose the parameters:
Cost Function
Cost Function:
J(θ0, θ1) = 1/(2m) · Σ_{i=1}^{m} (hθ(x(i)) − y(i))²
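The squared-error cost can be sketched directly from the formula (a plain-Python sketch; names are my own):

```python
def cost(theta0, theta1, xs, ys):
    # J(theta0, theta1) = 1/(2m) * sum_i (h(x_i) - y_i)^2
    m = len(xs)
    return sum((theta0 + theta1 * x - y) ** 2 for x, y in zip(xs, ys)) / (2 * m)
```

When the hypothesis fits the data exactly, the cost is 0.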
3D surface plots of the cost function:
Using contour plots to represent the 3D surface:
Gradient Descent
Gradient Descent:
θj := θj − α · ∂/∂θj J(θ0, θ1)   (for j = 0 and j = 1)
Especially Gradient Descent for Linear Regression:
repeat until convergence: {
θ0 := θ0 − α · (1/m) Σ_{i=1}^{m} (hθ(x(i)) − y(i))
θ1 := θ1 − α · (1/m) Σ_{i=1}^{m} (hθ(x(i)) − y(i)) · x(i)
}
Simultaneously update the parameters:
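The update loop above can be sketched as follows; the key detail is computing both gradients before changing either parameter (a plain-Python sketch, default values are my own):

```python
def gradient_descent(xs, ys, alpha=0.1, iters=5000):
    # Batch gradient descent for univariate linear regression.
    theta0, theta1 = 0.0, 0.0
    m = len(xs)
    for _ in range(iters):
        errors = [theta0 + theta1 * x - y for x, y in zip(xs, ys)]
        grad0 = sum(errors) / m
        grad1 = sum(e * x for e, x in zip(errors, xs)) / m
        # Simultaneous update: both gradients are computed first,
        # then both parameters are updated together.
        theta0 -= alpha * grad0
        theta1 -= alpha * grad1
    return theta0, theta1
```

On data generated by y = 2x this recovers θ0 ≈ 0, θ1 ≈ 2.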
Gradient descent with one variable:
Why the learning rate shouldn't be too big or too small:
But you can keep the learning rate fixed; the steps automatically get smaller as the gradient shrinks near the minimum:
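A tiny demonstration of that point on the one-dimensional cost J(θ) = θ² (my own toy example, not from the course): the learning rate α stays fixed, yet each step is smaller than the last because the gradient itself shrinks.

```python
def descend_steps(alpha=0.3, theta=4.0, n=5):
    # Minimize J(theta) = theta^2 with a *fixed* learning rate.
    steps = []
    for _ in range(n):
        grad = 2 * theta          # dJ/dtheta
        step = alpha * grad
        steps.append(abs(step))   # record the step size taken
        theta -= step
    return steps
```

The recorded step sizes are strictly decreasing even though α never changes.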
with 2 parameters:
Linear regression's cost function is always bowl-shaped (convex):
So there are no local optima, only a single global optimum.