These are my study notes for Andrew Ng's machine learning course series. Course video:
https://study.163.com/course/introduction.htm?courseId=1004570029#/courseDetail?tab=1
This section covers a new supervised learning classification algorithm: Support Vector Machines (SVM).
50. Support Vector Machines: Optimization objective
Approximate the logistic regression cost with piecewise-linear functions to obtain the SVM cost functions, defined as cost_1(z) and cost_0(z); see the mathematical definition:
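A minimal sketch of the two cost functions, assuming the standard hinge form with unit slope. The notes only fix where each cost becomes flat (z >= 1 for cost_1, z <= -1 for cost_0); the exact slope used here is an assumption.

```python
def cost_1(z):
    """Cost for a positive example (y = 1): zero once z >= 1, linear below that."""
    return max(0.0, 1.0 - z)


def cost_0(z):
    """Cost for a negative example (y = 0): zero once z <= -1, linear above that."""
    return max(0.0, 1.0 + z)


# z = theta^T x. A positive example is only "free" when z >= 1, not merely z >= 0.
for z in (-2.0, -1.0, 0.0, 1.0, 2.0):
    print(z, cost_1(z), cost_0(z))
```

Note that both costs are still positive at z = 0, which is what pushes the SVM to ask for more than a bare sign match.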
51. Support Vector Machines: Large margin intuition
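Suppose C is very large; the optimizer then drives the first part of the objective (the sum of the cost terms) to 0.
What remains is to minimize (1/2) * sum_j theta_j^2 subject to (s.t.) the constraints theta^T x >= 1 if y = 1 and theta^T x <= -1 if y = 0.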
you get a very interesting decision boundary.
The distance between the decision boundary and the nearest examples is the margin of the support vector machine. It gives the SVM a certain robustness,
because it tries to separate the data with as large a margin as possible.
SVM will select the black line:
When you use the large margin algorithm, your learning algorithm can be sensitive to outliers.
Adding a single outlier changes the boundary to the magenta one:
But if C were reasonably small, you would still get the black line:
So the regularization parameter C controls how strongly the SVM reacts to outliers.
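The role of C in this section can be sketched by evaluating the objective directly. This is an illustrative sketch, not the course's code: the function name `svm_objective`, the hinge-form costs, and the toy data are all assumptions.

```python
def svm_objective(theta, X, y, C):
    """C * (sum of per-example costs) + 0.5 * sum_{j>=1} theta_j^2.

    theta[0] is the intercept and is not regularized.
    Each row of X holds the features without the leading 1.
    """
    total_cost = 0.0
    for x_i, y_i in zip(X, y):
        z = theta[0] + sum(t * x for t, x in zip(theta[1:], x_i))
        # Hinge-form costs (assumed): cost_1 for y = 1, cost_0 for y = 0.
        total_cost += max(0.0, 1.0 - z) if y_i == 1 else max(0.0, 1.0 + z)
    regularizer = 0.5 * sum(t * t for t in theta[1:])
    return C * total_cost + regularizer


# Toy data (hypothetical): one positive and one negative example.
X = [[2.0, 0.0], [-2.0, 0.0]]
y = [1, 0]

# theta meets the constraints (z = 1 and z = -1), so the first part is 0
# and only the regularizer is left, whatever C is.
print(svm_objective([0.0, 0.5, 0.0], X, y, C=1000.0))

# A smaller theta violates the constraints; a large C makes that very expensive.
print(svm_objective([0.0, 0.25, 0.0], X, y, C=1000.0))
print(svm_objective([0.0, 0.25, 0.0], X, y, C=0.01))
```

With a large C, any violation dominates the objective, so a single outlier can force the boundary to move. With a small C, the same violation is cheap and the regularizer has more say.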
52. Support Vector Machines: The mathematics behind large margin classification
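Starting from the geometric way of computing the vector inner product, derive the geometric form of the SVM decision boundary. This shows that reaching the optimization objective requires the norm of theta to be as small as possible and p as large as possible, which is exactly a large margin.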
Pythagorean theorem (Pythagoras' theorem)
Square root
p = length of the projection of the vector v onto the vector u
p is signed: it can be positive or negative
Inner product
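One way to compute the inner product of two vectors:
Take the length of one vector u (computed with the Pythagorean theorem) and the projection of the other vector v onto u; the inner product is u^T v = p * ||u||.
Geometric meaning: p is the signed length of the projection of v onto u, so it can be positive or negative; ||u|| is the length, or norm, of u.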
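The identity u^T v = p * ||u|| can be checked numerically. This is a small sketch; the helper names `dot`, `norm`, and `signed_projection` and the example vectors are assumptions made for illustration.

```python
import math


def dot(u, v):
    """Inner product u^T v computed component by component."""
    return sum(a * b for a, b in zip(u, v))


def norm(u):
    """Euclidean length ||u||, i.e. the Pythagorean theorem in n dimensions."""
    return math.sqrt(dot(u, u))


def signed_projection(v, u):
    """Signed length p of the projection of v onto u."""
    return dot(u, v) / norm(u)


u = [3.0, 4.0]   # ||u|| = 5
v = [-2.0, 1.0]  # points more than 90 degrees away from u
p = signed_projection(v, u)

print(p)            # negative, because the angle between u and v exceeds 90 degrees
print(p * norm(u))  # matches dot(u, v) up to rounding
print(dot(u, v))
```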
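Why the SVM always finds the large margin: the smaller p is, the larger ||theta|| has to be, but the optimization objective is to make ||theta|| as small as possible. So in the end p has to be as large as possible.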
Whether θ0 = 0 or θ0 ≠ 0, this is what the SVM is trying to do when it has this optimization objective and C is very large.
This support vector machine is still finding the large margin separator between the positive and negative examples.
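The argument of this section can be written out compactly. In this sketch p^{(i)} denotes the signed projection of x^{(i)} onto theta, and θ0 = 0 is assumed for simplicity:

```latex
\theta^{T} x^{(i)} = p^{(i)} \, \|\theta\|

\min_{\theta} \; \frac{1}{2} \sum_{j=1}^{n} \theta_j^{2} = \frac{1}{2} \|\theta\|^{2}
\quad \text{s.t.} \quad
\begin{cases}
p^{(i)} \, \|\theta\| \ge 1 & \text{if } y^{(i)} = 1 \\
p^{(i)} \, \|\theta\| \le -1 & \text{if } y^{(i)} = 0
\end{cases}
```

If the projections p^{(i)} are small in absolute value, the constraints can only hold with a large ||theta||, which the objective penalizes. Minimizing ||theta|| therefore favors a boundary with large |p^{(i)}|, that is, a large margin.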
53. Support Vector Machines – Kernels I
Use kernels to transform the SVM's features in order to build complex nonlinear classifiers.
sigma: σ
kernel
how to find a better choice of the features?
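One answer is to define each feature as the similarity between x and a landmark point. Below is a minimal sketch, assuming the Gaussian kernel with bandwidth σ; the function name `gaussian_kernel` and the landmark values are hypothetical.

```python
import math


def gaussian_kernel(x, landmark, sigma):
    """Similarity exp(-||x - l||^2 / (2 * sigma^2)) between x and a landmark l."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, landmark))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


landmark = [3.0, 5.0]  # hypothetical landmark

# Close to the landmark the feature is near 1; far away it is near 0.
print(gaussian_kernel([3.0, 5.0], landmark, sigma=1.0))
print(gaussian_kernel([10.0, 10.0], landmark, sigma=1.0))

# A larger sigma makes the similarity fall off more slowly with distance.
print(gaussian_kernel([4.0, 6.0], landmark, sigma=1.0))
print(gaussian_kernel([4.0, 6.0], landmark, sigma=3.0))
```

Each landmark yields one new feature, so a handful of landmarks turns the original x into a feature vector that a linear boundary can separate in a nonlinear way.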
54. Support Vector Machines - Using an SVM