An overview of machine learning

CBF

There are two ways to organize the study of machine learning algorithms: (1) group them by learning style; (2) group them by similarity in form or function.

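Grouping by learning style:

Supervised learning: typical problems are classification and regression; example algorithms include logistic regression (LR) and the back-propagation (BP) neural network.
Unsupervised learning: typical problems are association rule learning and clustering; example algorithms include Apriori and k-means.
Semi-supervised learning: the problems are still mainly classification and regression.
Reinforcement learning.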

Grouping by algorithm similarity:

Regression

Ordinary Least Squares
Logistic Regression
Stepwise Regression
Multivariate Adaptive Regression Splines (MARS)
Locally Estimated Scatterplot Smoothing (LOESS)
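
As a small illustration of the first entry in this group, ordinary least squares for a single feature can be sketched in a few lines. The function name `fit_ols` and the toy data are assumptions made for this example, not part of the original list.

```python
def fit_ols(xs, ys):
    """Fit y = slope * x + intercept by minimizing the sum of squared errors."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Closed-form solution for one feature: covariance over variance.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Toy data lying exactly on y = 2x + 1 (made up for illustration).
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
slope, intercept = fit_ols(xs, ys)
print(slope, intercept)
```

On noisy data the same formula returns the line with the smallest squared residuals; the multi-feature case needs a linear solve instead of this scalar shortcut.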

Instance-based Methods

k-Nearest Neighbour (kNN)
Learning Vector Quantization (LVQ)
Self-Organizing Map (SOM)
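
Instance-based methods keep the training examples and compare new inputs against them. A minimal k-nearest-neighbour sketch, assuming Euclidean distance and majority voting (the name `knn_predict` and the sample points are hypothetical):

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Label `query` by majority vote among its k closest training points.

    `train` is a list of (point, label) pairs; points are coordinate tuples.
    """
    neighbours = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]


# Two small, well-separated groups (made up for illustration).
train = [
    ((0.0, 0.0), "a"), ((0.1, 0.2), "a"), ((0.2, 0.1), "a"),
    ((1.0, 1.0), "b"), ((0.9, 1.1), "b"), ((1.1, 0.9), "b"),
]
label_near_origin = knn_predict(train, (0.05, 0.05))
label_near_one = knn_predict(train, (1.0, 0.95))
print(label_near_origin, label_near_one)
```

There is no training step here: all the work happens at prediction time, which is the defining trait of this group.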

Regularization Methods

Ridge Regression
Least Absolute Shrinkage and Selection Operator (LASSO)
Elastic Net
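
Regularization methods add a penalty on model complexity to the fitting objective. The sketch below shows the effect for ridge regression in the simplest possible setting, one feature and no intercept, where the minimizer of sum((y - w*x)^2) + lam * w^2 has a closed form. The function name and the data are assumptions for illustration.

```python
def ridge_slope(xs, ys, lam):
    """Minimize sum((y - w*x)^2) + lam * w^2 for one feature, no intercept.

    Setting the derivative to zero gives w = sum(x*y) / (sum(x*x) + lam).
    """
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + lam)


# Toy data lying exactly on y = 2x (made up for illustration).
xs = [1.0, 2.0, 3.0]
ys = [2.0, 4.0, 6.0]
w_ols = ridge_slope(xs, ys, 0.0)     # no penalty: plain least squares
w_ridge = ridge_slope(xs, ys, 14.0)  # penalty shrinks the weight toward zero
print(w_ols, w_ridge)
```

With `lam = 0` the result is the least-squares slope; increasing `lam` pulls the weight toward zero, which is the shrinkage that LASSO and Elastic Net also apply with different penalty terms.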

Decision Tree Learning

Classification and Regression Tree (CART)
Iterative Dichotomiser 3 (ID3)
C4.5
Chi-squared Automatic Interaction Detection (CHAID)
Decision Stump
Random Forest
Multivariate Adaptive Regression Splines (MARS)
Gradient Boosting Machines (GBM)
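
The decision stump is the smallest member of this group: a tree with a single split. A minimal sketch for one numeric feature and 0/1 labels follows; it assumes at least two distinct feature values, and the names `fit_stump` and `stump_predict` are hypothetical.

```python
def fit_stump(xs, ys):
    """Return (threshold, left_label, right_label) with the fewest training errors.

    The stump predicts left_label when x <= threshold, otherwise right_label.
    Assumes 0/1 labels and at least two distinct values in xs.
    """
    values = sorted(set(xs))
    # Candidate thresholds: midpoints between consecutive distinct values.
    thresholds = [(a + b) / 2 for a, b in zip(values, values[1:])]
    best = None
    for t in thresholds:
        for left, right in ((0, 1), (1, 0)):
            errors = sum((left if x <= t else right) != y for x, y in zip(xs, ys))
            if best is None or errors < best[0]:
                best = (errors, t, left, right)
    return best[1:]


def stump_predict(stump, x):
    threshold, left, right = stump
    return left if x <= threshold else right


# Toy data with a clean gap between the classes (made up for illustration).
xs = [1.0, 2.0, 3.0, 6.0, 7.0, 8.0]
ys = [0, 0, 0, 1, 1, 1]
stump = fit_stump(xs, ys)
print(stump)
```

Algorithms such as CART repeat this kind of split search recursively on each side of the threshold, using an impurity measure rather than the raw error count used here.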

Bayesian

Naive Bayes
Averaged One-Dependence Estimators (AODE)
Bayesian Belief Network (BBN)
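
A hedged sketch of Naive Bayes for categorical features may help make the idea concrete. It assumes add-one (Laplace) smoothing, which is one common choice rather than the only one, and the function names and the tiny weather-style dataset are invented for this example.

```python
import math
from collections import Counter, defaultdict


def train_nb(rows, labels):
    """Count class frequencies and per-class feature-value frequencies."""
    class_counts = Counter(labels)
    feature_counts = defaultdict(Counter)  # (label, feature index) -> value counts
    vocab = defaultdict(set)               # feature index -> set of seen values
    for row, label in zip(rows, labels):
        for i, value in enumerate(row):
            feature_counts[(label, i)][value] += 1
            vocab[i].add(value)
    return class_counts, feature_counts, vocab


def predict_nb(model, row):
    """Pick the class maximizing log P(class) + sum of log P(value | class)."""
    class_counts, feature_counts, vocab = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)
        for i, value in enumerate(row):
            # Add-one smoothing keeps unseen values from zeroing the product.
            numerator = feature_counts[(label, i)][value] + 1
            denominator = count + len(vocab[i])
            score += math.log(numerator / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


rows = [("sunny", "hot"), ("sunny", "mild"), ("rainy", "mild"), ("rainy", "cool")]
labels = ["no", "no", "yes", "yes"]
model = train_nb(rows, labels)
print(predict_nb(model, ("sunny", "hot")), predict_nb(model, ("rainy", "cool")))
```

The "naive" part is the assumption that features are independent given the class, which is why the per-feature log probabilities are simply summed.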

Kernel Methods

Map the input data into a higher-dimensional space so that classification problems become easier to solve:
1. Support Vector Machines (SVM)
2. Radial Basis Function (RBF)
3. Linear Discriminant Analysis (LDA)
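
The idea of mapping data to a higher dimension can be shown with an explicit feature map. Real kernel methods usually avoid computing the map and work with kernel functions instead; the sketch below uses an explicit map only for illustration, and the function names, weights, and data are assumptions.

```python
def lift(x):
    """Map a 1-D input to 2-D by appending its square."""
    return (x, x * x)


def linear_rule(z, w=(0.0, 1.0), b=-2.5):
    """A hand-picked linear classifier in the lifted space."""
    return 1 if w[0] * z[0] + w[1] * z[1] + b > 0 else 0


# In 1-D the classes are interleaved (outer points are class 1, inner points
# class 0), so no single threshold on x separates them.
points = [-2.0, -1.0, 1.0, 2.0]
labels = [1, 0, 0, 1]

# After lifting, the second coordinate is 4, 1, 1, 4 and a straight line works.
predictions = [linear_rule(lift(x)) for x in points]
print(predictions)
```

The weights here are chosen by hand; an SVM would learn them, and would typically do so through a kernel rather than through `lift` directly.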

Clustering Methods

k-Means
Expectation Maximisation (EM)
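
A minimal sketch of k-means, assuming Euclidean distance, caller-supplied initial centroids, and a fixed number of iterations (real implementations usually choose initial centroids randomly and stop on convergence). The function name and data are hypothetical.

```python
import math


def kmeans(points, centroids, iterations=10):
    """Alternate between assigning points to the nearest centroid and
    moving each centroid to the mean of its assigned points."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: math.dist(p, centroids[i]))
            clusters[nearest].append(p)
        updated = []
        for centroid, members in zip(centroids, clusters):
            if members:
                mean = tuple(sum(coord) / len(members) for coord in zip(*members))
                updated.append(mean)
            else:
                updated.append(centroid)  # keep an empty cluster's centroid
        centroids = updated
    return centroids, clusters


points = [(0.0, 0.0), (0.0, 2.0), (2.0, 0.0),
          (10.0, 10.0), (10.0, 12.0), (12.0, 10.0)]
centroids, clusters = kmeans(points, [(0.0, 0.0), (10.0, 10.0)])
print(centroids)
```

Expectation Maximisation follows the same alternate-and-update pattern but replaces the hard assignment with soft, probabilistic memberships.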

Association Rule Learning

Apriori algorithm
Eclat algorithm
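
The core of Apriori is level-wise search for frequent itemsets: count candidates of size k, keep the frequent ones, and build size k+1 candidates only from them. The sketch below covers that part only (no rule generation); the function name, the absolute-count support threshold, and the shopping-basket data are assumptions.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: count} for itemsets found in >= min_support transactions."""
    transactions = [frozenset(t) for t in transactions]
    items = {item for t in transactions for item in t}
    current = [frozenset([item]) for item in items]
    frequent = {}
    k = 1
    while current:
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        k += 1
        # Join step: unions of frequent itemsets that have exactly k items.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-item subset of a candidate must be frequent.
        current = [c for c in candidates
                   if all(frozenset(s) in level for s in combinations(c, k - 1))]
    return frequent


transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]
frequent = apriori(transactions, min_support=2)
print(sorted((sorted(s), n) for s, n in frequent.items()))
```

The prune step is what makes the search tractable: an itemset cannot be frequent if any of its subsets is infrequent, so most large candidates are never counted.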

Artificial Neural Networks

Perceptron
Back-Propagation
Hopfield Network
Self-Organizing Map (SOM)
Learning Vector Quantization (LVQ)
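
The perceptron, the first entry above, can be sketched directly from its update rule: adjust the weights by the prediction error times the input. The example assumes two inputs, a step activation, and the logical AND function as training data; all names are hypothetical.

```python
def train_perceptron(samples, epochs=20, lr=1.0):
    """Learn weights and bias with the classic perceptron update rule."""
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for (x1, x2), target in samples:
            output = 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0
            error = target - output
            # Weights move only when the prediction is wrong (error != 0).
            w[0] += lr * error * x1
            w[1] += lr * error * x2
            b += lr * error
    return w, b


def perceptron_predict(w, b, x1, x2):
    return 1 if w[0] * x1 + w[1] * x2 + b > 0 else 0


# Logical AND is linearly separable, so the perceptron can learn it.
and_samples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w, b = train_perceptron(and_samples)
and_predictions = [perceptron_predict(w, b, x1, x2) for (x1, x2), _ in and_samples]
print(and_predictions)
```

A single perceptron cannot learn functions that are not linearly separable, such as XOR; back-propagation addresses that by training networks with hidden layers.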

Deep Learning

A very popular family of methods in recent years, many of which rely on semi-supervised learning:
1. Restricted Boltzmann Machine (RBM)
2. Deep Belief Networks (DBN)
3. Convolutional Network
4. Stacked Auto-encoders

Dimensionality Reduction

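Learn the structure of the sample data in an unsupervised way and summarize it with less information, which can then be used for supervised learning:
1. Principal Component Analysis (PCA)
2. Partial Least Squares Regression (PLS)
3. Sammon Mapping
4. Multidimensional Scaling (MDS)
5. Projection Pursuit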
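
For two-dimensional data the first principal component can be found without a linear algebra library, because the leading eigenvector of a symmetric 2x2 covariance matrix has a closed-form angle. The sketch below relies on that shortcut, so it is an illustration for the 2-D case only; the function name and data are assumptions.

```python
import math


def first_principal_component(points):
    """Return the direction of greatest variance and each point's score on it."""
    n = len(points)
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    # Entries of the covariance matrix [[a, b], [b, c]].
    a = sum((p[0] - mean_x) ** 2 for p in points) / n
    c = sum((p[1] - mean_y) ** 2 for p in points) / n
    b = sum((p[0] - mean_x) * (p[1] - mean_y) for p in points) / n
    # Angle of the eigenvector belonging to the larger eigenvalue.
    theta = 0.5 * math.atan2(2 * b, a - c)
    direction = (math.cos(theta), math.sin(theta))
    # Projecting onto the direction reduces each 2-D point to one number.
    scores = [(p[0] - mean_x) * direction[0] + (p[1] - mean_y) * direction[1]
              for p in points]
    return direction, scores


# Points on the line y = x: all variance lies along the diagonal.
points = [(1.0, 1.0), (2.0, 2.0), (3.0, 3.0), (4.0, 4.0)]
direction, scores = first_principal_component(points)
print(direction, scores)
```

The one-dimensional `scores` are the reduced representation that could then be fed to a supervised learner.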

Ensemble Methods

Several weak learners are trained separately, and their predictions are then combined in some way:
1. Boosting
2. Bootstrapped Aggregation (Bagging)
3. AdaBoost
4. Stacked Generalization (blending)
5. Gradient Boosting Machines (GBM)
6. Random Forest
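
The combining step can be illustrated with plain majority voting. In the sketch below the three weak rules are written by hand rather than trained, purely to show that a vote can be right where each voter is sometimes wrong; bagging or boosting would learn the rules from data. All names and values are hypothetical.

```python
def weak_rule_1(x):
    return 1 if x > 2.5 else 0             # wrong on x = 3


def weak_rule_2(x):
    return 1 if x > 4.5 else 0             # wrong on x = 4


def weak_rule_3(x):
    return 1 if 3.5 < x < 5.5 else 0       # wrong on x = 6


def majority_vote(rules, x):
    """Predict 1 when more than half of the rules predict 1."""
    votes = sum(rule(x) for rule in rules)
    return 1 if votes * 2 > len(rules) else 0


def accuracy(predict, xs, ys):
    return sum(predict(x) == y for x, y in zip(xs, ys)) / len(xs)


xs = [1, 2, 3, 4, 5, 6]
ys = [0, 0, 0, 1, 1, 1]
rules = [weak_rule_1, weak_rule_2, weak_rule_3]

single_accuracies = [accuracy(rule, xs, ys) for rule in rules]
ensemble_accuracy = accuracy(lambda x: majority_vote(rules, x), xs, ys)
print(single_accuracies, ensemble_accuracy)
```

The vote helps here because the three rules make their mistakes on different inputs; an ensemble of learners that all fail on the same inputs gains nothing from voting.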

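Learning machine learning algorithms can be a headache. There are plenty of papers, books, and websites to consult, and they offer either terse mathematical descriptions or step-by-step textual walkthroughs. With some luck you may find pseudocode, and with a lot of luck you may even be told how to install an implementation. Luck is not a long-term plan, though, and truly thorough guides to an algorithm are rare. What to do?
One seasoned practitioner, drawing on years of experience, points to a way out: tease the full picture out of partial descriptions and cross-check multiple sources.
These partial descriptions are not picked at random. They are the key sources for an algorithm: the primary source in which it was first described (the original paper), plus secondary accounts in surveys and reference books. These sources often contain code implementations, and studying them closely pays off quickly.
One way to study algorithms is to read widely first to get the whole picture and then implement them in code. That works, but the algorithm is the root and the code is the branch, so understanding the algorithm itself comes first. Building that understanding takes time, patience, and focus, which is hard. The same practitioner therefore suggests an aid: a structured approach to learning algorithms.
Before studying an algorithm, list the questions you want answered, including: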

What are the standard name and abbreviations used for the algorithm?
What is the information processing strategy of the algorithm?
What is the objective or goal of the algorithm?
What metaphors or analogies are commonly used to describe the behavior of the algorithm?
What is the pseudocode or flowchart description of the algorithm?
What are the heuristics or rules of thumb for using the algorithm?
What classes of problem is the algorithm well suited to?
What are useful resources for learning more about the algorithm?
What are the primary references or resources in which the algorithm was first described?
In day-to-day study, file each discussion of the same algorithm under the framework above.
The appeal of this method is that you do not need to be an expert in the field beforehand. Each time you meet an algorithm, file it into the framework, and your understanding grows steadily over time.
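
One possible way to keep such a framework is a small record type with one field per question. This is only a sketch of a note-taking aid; the class name, field names, and the sample entry are assumptions made for this example.

```python
from dataclasses import dataclass, field, fields


@dataclass
class AlgorithmNotes:
    """One record per algorithm, with a field for each question in the list."""
    name: str
    abbreviations: list = field(default_factory=list)
    strategy: str = ""
    objective: str = ""
    metaphors: str = ""
    pseudocode: str = ""
    heuristics: list = field(default_factory=list)
    suited_problems: list = field(default_factory=list)
    resources: list = field(default_factory=list)
    primary_references: list = field(default_factory=list)

    def unanswered(self):
        """Names of the questions that still have no notes."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


notes = AlgorithmNotes(name="k-Nearest Neighbour", abbreviations=["kNN"])
notes.objective = "Predict a label from the labels of the closest stored examples."
print(notes.unanswered())
```

Each new paper or tutorial then only needs to fill in whichever fields it covers, and `unanswered()` shows what is still missing for that algorithm.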

References:
1. A Tour of Machine Learning Algorithms
2. How to Learn a Machine Learning Algorithm
3. Learning resources:
 (1) List of Machine Learning Algorithms
 (2) Machine Learning Algorithms Category
 (3) CRAN Task View: Machine Learning & Statistical Learning
 (4) Top 10 Algorithms in Data Mining