9:38 2014-10-12
start Stanford OpenClassroom, Machine Learning
video I
9:38 2014-10-12
T == Task,
P == Performance measure,
E == Experience
10:13 2014-10-12
supervised learning, unsupervised learning
10:15 2014-10-12
How to use these machine learning tools in practice?
10:16 2014-10-12
supervised learning introduction
10:18 2014-10-12
supervised learning: "right answer" given
10:21 2014-10-12
supervised learning: house price prediction
unsupervised learning:
10:26 2014-10-12
this is a "classification problem", in contrast to a
"regression problem"
10:27 2014-10-12
feature extraction
10:29 2014-10-12
(age, tumor size)
// example of supervised learning, "tumor classification"
10:29 2014-10-12
infinite number of features, so that your learning
algorithm has a lot of features/attributes to consider
10:31 2014-10-12
how do you deal with an infinite number of features?
10:32 2014-10-12
SVM == Support Vector Machine
10:32 2014-10-12
regression problem // prediction
classification problem
10:35 2014-10-12
unsupervised learning: clustering algorithm
10:38 2014-10-12
Google News: example of a "clustering algorithm"
10:39 2014-10-12
what a clustering algorithm does is group different
individuals into different clusters.
10:40 2014-10-12
market segmentation // "clustering algorithm"
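Clustering, as used for market segmentation, can be sketched with a minimal k-means in NumPy. The lecture does not specify an algorithm here; the function name, initialization scheme, and toy "customer" data below are all my own assumptions.

```python
import numpy as np

def kmeans(X, k, iters=50):
    """Minimal k-means sketch: assign each point to its nearest centroid,
    then move each centroid to the mean of the points assigned to it."""
    # deterministic init: spread the initial centroids across the data
    idx = np.linspace(0, len(X) - 1, k).astype(int)
    centroids = X[idx].copy()
    for _ in range(iters):
        # distance from every point to every centroid
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):
                centroids[j] = X[labels == j].mean(axis=0)
    return labels, centroids

rng = np.random.default_rng(0)
# two well-separated groups of "customers" (toy market-segmentation data)
X = np.vstack([rng.normal(0.0, 0.1, (20, 2)), rng.normal(5.0, 0.1, (20, 2))])
labels, centroids = kmeans(X, 2)
```

With data this well separated, each group ends up in its own cluster.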
10:41 2014-10-12
cocktail party problem
10:44 2014-10-12
to separate these 2 audio sources
10:45 2014-10-12
prototype using MATLAB, then port to C++ or Java...
10:52 2014-10-12
prototyping language: MATLAB
10:53 2014-10-12
supervised learning: with labels (tagged data)
unsupervised learning: without labels
----------------------------------------------------
/
14:34 2014-10-13
supervised learning,
14:34 2014-10-13
the linear regression model
15:01 2014-10-13
training set, training example
15:05 2014-10-13
cost function
18:15 2014-10-13
I'm going to minimize this cost function
18:15 2014-10-13
least square cost function
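The least-squares cost for linear regression is J(θ) = (1/2m) Σᵢ (h_θ(x⁽ⁱ⁾) − y⁽ⁱ⁾)². A minimal NumPy sketch; the function and variable names, and the toy data, are my own, not from the lecture:

```python
import numpy as np

def cost(theta, X, y):
    """Least-squares cost J(theta) = (1/2m) * sum((X @ theta - y)^2)."""
    m = len(y)
    residuals = X @ theta - y
    return residuals @ residuals / (2 * m)

# first column of ones stands in for the intercept term
X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])
y = np.array([1.0, 3.0, 5.0])

# theta = [1, 2] fits y = 1 + 2x exactly, so the cost is zero
J_fit = cost(np.array([1.0, 2.0]), X, y)   # → 0.0
J_zero = cost(np.array([0.0, 0.0]), X, y)  # nonzero for a bad fit
```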
18:16 2014-10-13
How to minimize this cost function J(θ)?
using "gradient descent"
18:25 2014-10-13
α, the learning rate, controls the step size
18:48 2014-10-13
the significance of α:
* if α too small, gradient descent can be slow
* if α too large, can overshoot minimum, gradient
descent can diverge
18:52 2014-10-13
Batch Gradient Descent
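Batch gradient descent uses the whole training set to compute the gradient on every step. A minimal NumPy sketch for least squares; the names, the choice of α, and the toy data are my own:

```python
import numpy as np

def gradient_descent(X, y, alpha=0.1, iters=1000):
    """Batch gradient descent for least squares: every update uses
    the gradient over the entire training set."""
    m, n = X.shape
    theta = np.zeros(n)
    for _ in range(iters):
        grad = X.T @ (X @ theta - y) / m   # gradient of J(theta)
        theta -= alpha * grad              # step of size alpha downhill
    return theta

X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])  # ones column = intercept
y = np.array([1.0, 3.0, 5.0])
theta = gradient_descent(X, y)
# theta ≈ [1, 2], recovering y = 1 + 2x
```

With α too large the update would overshoot and diverge, exactly as the note above warns; with α too small it would need far more iterations.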
19:39 2014-10-13
vectorized implementation
19:44 2014-10-13
using "vectorized implementation" to replace loop
19:48 2014-10-13
what plays the role of y = kx + b in linear regression? // least squares
it's essentially the "hypothesis": h(x)
///
8:23 2014-10-14 Tuesday
feature scaling
8:24 2014-10-14
make sure features are on similar scale
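One common way to put features on a similar scale is mean normalization: subtract each feature's mean and divide by its standard deviation. A sketch; the function name and the toy housing data are my own:

```python
import numpy as np

def scale_features(X):
    """Mean normalization: subtract the mean and divide by the standard
    deviation so every feature ends up on a similar scale."""
    mu = X.mean(axis=0)
    sigma = X.std(axis=0)
    return (X - mu) / sigma, mu, sigma

# house sizes (sq ft) and bedroom counts live on very different scales
X = np.array([[2104.0, 3.0], [1600.0, 3.0], [2400.0, 4.0], [1416.0, 2.0]])
X_scaled, mu, sigma = scale_features(X)
# each scaled column now has mean ~0 and standard deviation ~1
```

The saved `mu` and `sigma` would be reused to scale new inputs the same way at prediction time.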
8:27 2014-10-14
learning rate: how to make sure gradient descent
works properly
8:46 2014-10-14
make sure J(θ) decreases on every iteration?
8:50 2014-10-14
automatic convergence test
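One simple form of automatic convergence test: declare convergence when J(θ) decreases by less than some small ε in a single iteration. The helper name, the threshold, and the sample cost history below are my own assumptions:

```python
def converged(J_history, eps=1e-3):
    """Automatic convergence test: stop once J decreases by less
    than eps in one iteration."""
    return len(J_history) >= 2 and J_history[-2] - J_history[-1] < eps

# cost values recorded after each gradient-descent iteration
J_history = [10.0, 4.0, 2.0, 1.9995]
# the last decrease is 0.0005 < 1e-3, so this would stop here
```

In practice plotting J(θ) against the iteration number (as the notes suggest) is often more informative than any single threshold.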
8:56 2014-10-14
try a bunch of different learning rates (α)?
9:10 2014-10-14
polynomial regression
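Polynomial regression is just linear regression on expanded features [1, x, x², ..., x^d]. A sketch using NumPy's least-squares solver; the function name and toy quadratic data are my own:

```python
import numpy as np

def poly_features(x, degree):
    """Expand a 1-D input into columns [1, x, x^2, ..., x^degree];
    linear regression on these columns fits a polynomial in x."""
    return np.vstack([x**d for d in range(degree + 1)]).T

x = np.array([0.0, 1.0, 2.0, 3.0])
y = 1 + 2 * x + 3 * x**2            # an exact quadratic, so degree 2 fits it
X = poly_features(x, 2)
theta, *_ = np.linalg.lstsq(X, y, rcond=None)
# theta ≈ [1, 2, 3]
```

Because x² dwarfs x for large inputs, feature scaling (above) matters even more once polynomial features are involved.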
9:40 2014-10-14
normal equations
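The normal equation θ = (XᵀX)⁻¹Xᵀy gives the least-squares solution in closed form, with no learning rate and no iteration. A sketch on toy data of my own choosing:

```python
import numpy as np

X = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])  # ones column = intercept
y = np.array([1.0, 3.0, 5.0])

# solve (X^T X) theta = X^T y; np.linalg.solve avoids forming an
# explicit inverse, which is cheaper and numerically safer
theta = np.linalg.solve(X.T @ X, X.T @ y)
# theta ≈ [1, 2], the same answer gradient descent converges to
```

The solve fails when XᵀX is singular, which is exactly the redundant (linearly dependent) features case noted below.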
10:03 2014-10-14
redundant features (linearly dependent features)
------------------------------------------------------