Machine Learning In Action
MrTriste
Machine Learning & Data Mining
Machine Learning In Action - Chapter11 - Association analysis
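Chapter11 - Association analysis
The support of an itemset is defined as the proportion of records in the dataset that contain that itemset. Confidence is defined for an association rule such as {diapers} -> {wine}; the confidence of this rule is defined as support({diapers, wine}) / support({diapers}). Finding frequent itemsets: the Apriori principle says that if an itemset is frequent, then all of its subsets are also frequent …
Original · 2017-08-07 23:09:20 · 229 views · 0 comments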
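The two definitions in the excerpt above can be sketched in a few lines of Python. This is only an illustration: the function names `support` and `confidence` and the toy transactions are assumptions, not code from the post.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = frozenset(itemset)
    hits = sum(1 for t in transactions if itemset <= set(t))
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """support(antecedent ∪ consequent) / support(antecedent)."""
    both = frozenset(antecedent) | frozenset(consequent)
    return support(transactions, both) / support(transactions, antecedent)


# Hypothetical toy data, made up for this sketch.
transactions = [
    {"diapers", "wine", "milk"},
    {"diapers", "wine"},
    {"diapers", "bread"},
    {"milk", "bread"},
]

print(support(transactions, {"diapers"}))               # 3 of 4 records
print(confidence(transactions, {"diapers"}, {"wine"}))  # 0.5 / 0.75
```

On this data the Apriori principle is visible directly: {diapers, wine} appears in 2 of 4 records, so its subsets {diapers} and {wine} must each appear in at least as many.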
Machine Learning In Action - Chapter 8 Linear regression
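Chapter8 - regression
Linear regression: find a vector of regression coefficients w and compute predictions with y = xw. The question is how to use the existing dataset to find the most suitable w. A common approach is to find the w with the smallest error. If the errors are simply added up, positive and negative values cancel out, so the squared error is used instead, i.e. $\sum_{i=1}^m (y_i - x_i^T w)^2$. Replacing the summation with matrix notation gives $(Y - Xw)^T (Y - Xw)$ …
Original · 2017-08-03 19:54:50 · 355 views · 0 comments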
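A small sketch of why the squared error is preferred over the plain sum of errors. The helper names and the toy data are assumptions made for this illustration, and the code only evaluates candidate w vectors; it does not solve for the best one.

```python
def predict(x, w):
    """Prediction x^T w for one row x."""
    return sum(xj * wj for xj, wj in zip(x, w))


def raw_error(X, y, w):
    """Plain sum of errors: positive and negative terms can cancel."""
    return sum(yi - predict(xi, w) for xi, yi in zip(X, y))


def squared_error(X, y, w):
    """Sum over i of (y_i - x_i^T w)^2."""
    return sum((yi - predict(xi, w)) ** 2 for xi, yi in zip(X, y))


# Hypothetical data lying exactly on y = 1 + 2 * x; the first column is a constant 1.
X = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]
y = [1.0, 3.0, 5.0]

w_exact = [1.0, 2.0]  # reproduces every y_i
w_bad = [3.0, 0.0]    # always predicts 3, so the errors are -2, 0, +2

print(raw_error(X, y, w_bad))        # the errors cancel
print(squared_error(X, y, w_bad))    # the squared errors do not
print(squared_error(X, y, w_exact))
```

The poor fit `w_bad` looks perfect under the plain sum because -2 and +2 cancel, while the squared error separates it clearly from `w_exact`.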
Machine Learning In Action - Chapter 7 AdaBoost
Chapter7 - AdaBoost
Bagging and Boosting
- Bagging: the data is taken from the original dataset S times to make S new datasets. The datasets are the same size as the original. Each dataset is built by ran…
Original · 2017-08-03 20:15:38 · 249 views · 0 comments
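A minimal sketch of building S datasets of the original size. The excerpt is cut off before it says how each dataset is drawn, so sampling with replacement (the usual bagging procedure) is an assumption here, as are the function name `bootstrap_datasets` and the `seed` parameter.

```python
import random


def bootstrap_datasets(data, S, seed=0):
    """Build S new datasets, each the same size as `data`.

    Assumption: examples are drawn at random with replacement,
    so a dataset may repeat some examples and omit others.
    """
    rng = random.Random(seed)
    n = len(data)
    return [[rng.choice(data) for _ in range(n)] for _ in range(S)]


original = ["a", "b", "c", "d", "e"]  # hypothetical examples
for dataset in bootstrap_datasets(original, S=3):
    print(dataset)
```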
Machine Learning In Action - Chapter 5 Logistic Regression
Chapter5 - Logistic Regression
For the logistic regression classifier we’ll take our features and multiply each one by a weight and then add them up. This result will be put into the sigmoid, and we’l…
Original · 2017-08-03 20:21:02 · 283 views · 0 comments
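The weighted sum and sigmoid described above, as a short sketch. The excerpt is truncated before it says what happens to the sigmoid output, so the 0.5 threshold in `classify` is an assumption, and the weights are made-up values rather than trained ones.

```python
import math


def sigmoid(z):
    """Map any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def classify(features, weights):
    """Multiply each feature by its weight, add them up, apply the sigmoid.

    Assumption: outputs above 0.5 are labeled 1, everything else 0.
    """
    z = sum(f * w for f, w in zip(features, weights))
    return 1 if sigmoid(z) > 0.5 else 0


print(sigmoid(0.0))                      # the midpoint of the curve
print(classify([1.0, 2.0], [0.5, 0.5]))  # weighted sum is 1.5
print(classify([1.0, 2.0], [-1.0, 0.0])) # weighted sum is -1.0
```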
Machine Learning In Action - Chapter 4 naïve Bayes
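Chapter 4 - naïve Bayes
Using p(x|c) to obtain p(c|x): $p(c|x) = \frac{p(x|c)p(c)}{p(x)}$
Principle: consider a dataset of N-dimensional feature vectors w = (w1, w2, …, wN) with k classes c1, c2, c3, …, ck. To find the class of an instance (x, y), compute the probability that (x, y) belongs to each of the k classes: p(ci|w)=p(w|ci)p(ci)p…
Original · 2017-08-03 20:24:43 · 265 views · 0 comments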
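A small numeric sketch of the rule p(c|x) = p(x|c)p(c)/p(x) for several classes. The function name `posteriors` and the probabilities are assumptions for illustration; p(x) is obtained by summing p(x|c)p(c) over the classes, which presumes the classes are mutually exclusive and cover all cases.

```python
def posteriors(priors, likelihoods):
    """Return p(c|x) for every class c.

    priors[c] is p(c) and likelihoods[c] is p(x|c) for one fixed observation x.
    """
    joint = {c: likelihoods[c] * priors[c] for c in priors}
    evidence = sum(joint.values())  # p(x), by the law of total probability
    return {c: joint[c] / evidence for c in joint}


# Hypothetical numbers for two classes.
priors = {"c1": 0.5, "c2": 0.5}
likelihoods = {"c1": 0.8, "c2": 0.2}

result = posteriors(priors, likelihoods)
predicted = max(result, key=result.get)  # pick the most probable class
print(result, predicted)
```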
Machine Learning In Action - Chapter 3 Decision Tree
Chapter 3 - Decision Tree
The kNN algorithm in chapter 2 did a great job of classifying, but it didn’t lead to any major insights about the data. One of the best things about decision trees is that…
Original · 2017-08-03 20:26:09 · 284 views · 0 comments
Machine Learning In Action - Chapter 2 KNN
Chapter 2 - KNN
KNN pseudocode:
For every point in our dataset:
    calculate the distance between inX and the current point
sort the distances in increasing order
take k items with lowest distances to inX
…
Original · 2017-08-03 20:27:59 · 300 views · 0 comments
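The pseudocode above can be turned into a short runnable sketch. The excerpt stops after the "take k items" step, so the majority vote at the end is the usual kNN completion and is an assumption here, as are the function name `knn_classify`, the Euclidean distance, and the toy data.

```python
import math
from collections import Counter


def knn_classify(inX, dataset, labels, k):
    """Label inX by a majority vote among its k nearest points."""
    # Distance between inX and every point in the dataset (Euclidean, by assumption).
    distances = [(math.dist(inX, point), label)
                 for point, label in zip(dataset, labels)]
    # Sort the distances in increasing order.
    distances.sort(key=lambda pair: pair[0])
    # Take the k items with the lowest distances to inX.
    nearest = [label for _, label in distances[:k]]
    # Assumed final step: return the most common label among them.
    return Counter(nearest).most_common(1)[0][0]


# Hypothetical toy data.
dataset = [(1.0, 1.1), (1.0, 1.0), (0.0, 0.0), (0.0, 0.1)]
labels = ["A", "A", "B", "B"]

print(knn_classify((0.0, 0.0), dataset, labels, k=3))
print(knn_classify((1.0, 1.0), dataset, labels, k=3))
```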
Machine Learning In Action - Chapter 9 Tree-based regression
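Chapter9 - Tree-based regression
CART stands for classification and regression tree. As the name says, it actually comes in two kinds: classification trees and regression trees. The decision tree covered in Chapter 3 is the ID3 decision tree, which uses information gain as its feature-selection criterion. How does a CART tree differ from the trees discussed earlier? 1. When the earlier tree-building algorithm splits on a feature, it divides the dataset into many parts, one for each value the feature takes, so that…
Original · 2017-08-07 00:53:35 · 242 views · 0 comments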
Machine Learning In Action - Chapter 10 k-means clustering
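Chapter10 - k-means clustering
k-means algorithm flow:
Create k points as the starting centroids (often chosen at random)
While the cluster assignment of any point changes:
    For each data point in the dataset:
        For each centroid:
            compute the distance between the centroid and the data point
        assign the data point to the closest cluster
    For each cluster, compute the mean of all points in the cluster and use the mean as the centroid
Python implementation: def ran…
Original · 2017-08-07 00:54:50 · 217 views · 0 comments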
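The flow above can be sketched in plain Python as follows. This is not the post's implementation, which is cut off in the excerpt: the function name `kmeans`, the choice of k data points as starting centroids, the `seed` and `max_iter` parameters, and the toy points are all assumptions.

```python
import math
import random


def kmeans(points, k, seed=0, max_iter=100):
    """Return (centroids, assignment) for a list of equal-length tuples."""
    rng = random.Random(seed)
    # Starting centroids: k distinct data points chosen at random (an assumption).
    centroids = rng.sample(points, k)
    assignment = [None] * len(points)
    for _ in range(max_iter):
        changed = False
        # Assign every point to the cluster of its closest centroid.
        for i, p in enumerate(points):
            nearest = min(range(k), key=lambda j: math.dist(p, centroids[j]))
            if assignment[i] != nearest:
                assignment[i] = nearest
                changed = True
        if not changed:
            break  # no assignment changed, so the loop is done
        # Move each centroid to the mean of the points in its cluster.
        for j in range(k):
            members = [p for p, a in zip(points, assignment) if a == j]
            if members:  # an empty cluster keeps its previous centroid
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, assignment


# Hypothetical data: two well-separated groups of three points each.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centroids, assignment = kmeans(points, k=2)
print(centroids)
print(assignment)
```

The cluster indices themselves are arbitrary; what matters is which points end up sharing an index.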