Brief History of Machine Learning

Since the early days of science, technology and AI, scientists following Blaise Pascal and Von Leibniz have pondered a machine as intellectually capable as humans, and famous writers like Jules Verne imagined such artificial beings long before they were technically conceivable.

Machine Learning is one of the important branches of AI and a very hot subject in both research and industry. Companies and universities devote many resources to advancing it. Recent advances in the field have produced very solid results on different tasks, comparable to human performance (e.g. 98.98% on traffic-sign recognition, above human-level accuracy).

Here I would like to share a crude timeline of Machine Learning and flag some of the milestones, by no means a complete list. In addition, you should prepend "to the best of my knowledge" to every claim in the text.

The first step toward prevalent ML was proposed by Hebb in 1949 [1], based on a neuropsychological learning formulation now known as Hebbian Learning. In his words:

Let us assume that the persistence or repetition of a reverberatory activity (or "trace") tends to induce lasting cellular changes that add to its stability. … When an axon of cell A is near enough to excite a cell B and repeatedly or persistently takes part in firing it, some growth process or metabolic change takes place in one or both cells such that A's efficiency, as one of the cells firing B, is increased. [1]
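To make Hebb's idea concrete, here is a minimal sketch of the Hebbian update in the form it is usually taught today: the connection between two units grows in proportion to their joint activity. The function name, learning rate and toy activities are my own illustrative choices, not from Hebb's text.

```python
import numpy as np

def hebbian_update(w, pre, post, lr=0.1):
    """Hebb's rule: strengthen each connection by the product of pre- and post-synaptic activity."""
    return w + lr * np.outer(post, pre)

# Toy run: one output unit ("cell B") repeatedly co-active with two input units ("cell A" side).
w = np.zeros((1, 2))
pre = np.array([1.0, 1.0])
post = np.array([1.0])
for _ in range(5):
    w = hebbian_update(w, pre, post)
print(w)  # weights grow with every co-activation, i.e. the "lasting cellular changes"
```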

In 1952, Arthur Samuel at IBM developed a program that learned to play Checkers, observing board positions and improving its later moves from experience.

With that program Samuel refuted the common assertion that machines cannot go beyond their written code and learn patterns the way humans do. He coined the term "machine learning," which he defined as:

a field of study that gives computers the ability to learn without being explicitly programmed.

In 1957, Rosenblatt [2] proposed the Perceptron, a second model with a neuroscientific background and one much closer to today's ML models. He introduced it as follows:

The perceptron is designed to illustrate some of the fundamental properties of intelligent systems in general, without becoming too deeply enmeshed in the special, and frequently unknown, conditions which hold for particular biological organisms.[2]

Three years later, Widrow [4] introduced the Delta Learning rule, which was then used as a practical procedure for Perceptron training; it is also known as the Least Squares problem. The combination of these two ideas gives a good linear classifier. However, the excitement around the Perceptron was dampened by Minsky [3] in 1969, who pointed out the famous XOR problem and the inability of Perceptrons to handle such linearly inseparable data distributions.
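To make the linear-separability issue concrete, here is a rough sketch of the perceptron update rule on two toy problems; all names and hyper-parameters are illustrative choices of mine. It fits OR, which is linearly separable, but cannot fit XOR, which is exactly Minsky's point.

```python
import numpy as np

def train_perceptron(X, y, epochs=20, lr=1.0):
    """Rosenblatt-style perceptron: nudge the weights whenever a sample is misclassified."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        for xi, target in zip(X, y):
            pred = 1 if xi @ w + b > 0 else 0
            error = target - pred          # delta-style correction term
            w += lr * error * xi
            b += lr * error
    return w, b

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y_or = np.array([0, 1, 1, 1])    # linearly separable
y_xor = np.array([0, 1, 1, 0])   # not linearly separable

for name, y in [("OR", y_or), ("XOR", y_xor)]:
    w, b = train_perceptron(X, y)
    preds = (X @ w + b > 0).astype(int)
    print(name, "accuracy:", (preds == y).mean())   # OR reaches 1.0; XOR cannot
```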

There was not much further effort until the Multi-Layer Perceptron (MLP) was suggested by Werbos [6] together with the NN-specific Backpropagation (BP) algorithm, although the BP idea had been proposed earlier by Linnainmaa [5] under the name "reverse mode of automatic differentiation". MLPs trained with practical BP were later presented by Rumelhart, Hinton and Williams [7] and by Hecht-Nielsen [8], and BP remains the key ingredient of today's NN architectures.
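As a hedged illustration of why the MLP + BP combination mattered, the sketch below (my own layer sizes, learning rate and sigmoid/squared-error choices, not the referenced papers' setups) trains a one-hidden-layer network with backpropagation on the XOR task that defeats a single perceptron.

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)   # XOR targets

# One hidden layer of 4 sigmoid units; weights initialized at random.
W1 = rng.normal(size=(2, 4)); b1 = np.zeros(4)
W2 = rng.normal(size=(4, 1)); b2 = np.zeros(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 0.5
for _ in range(20000):
    # Forward pass.
    h = sigmoid(X @ W1 + b1)
    out = sigmoid(h @ W2 + b2)
    # Backward pass: propagate the squared-error gradient layer by layer.
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    W2 -= lr * h.T @ d_out;  b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * X.T @ d_h;    b1 -= lr * d_h.sum(axis=0)

print(np.round(out, 2))  # usually converges close to [0, 1, 1, 0], i.e. the XOR mapping
```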

In 1986, J. R. Quinlan [9] proposed Decision Trees, more specifically the ID3 algorithm, sparking another mainstream of ML: simple rule-based models whose inference is easy to interpret, in contrast to the still black-box NN models.

After ID3, many different alternatives and improvements have been explored by the community (e.g. ID4, Regression Trees, CART …), and it is still one of the active topics in ML.
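The heart of ID3 is choosing, at each node, the attribute with the largest information gain (entropy reduction). A minimal sketch of that computation on an invented toy dataset (names and values are purely illustrative):

```python
import numpy as np

def entropy(labels):
    """Shannon entropy of a label array."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return -(p * np.log2(p)).sum()

def information_gain(feature, labels):
    """Entropy reduction obtained by splitting `labels` on the values of `feature`."""
    total = entropy(labels)
    for value in np.unique(feature):
        mask = feature == value
        total -= mask.mean() * entropy(labels[mask])
    return total

# Toy data: ID3 would pick the feature with the highest information gain as the root split.
outlook = np.array(["sunny", "sunny", "rain", "rain"])
windy   = np.array(["no", "yes", "no", "yes"])
play    = np.array(["no", "no", "yes", "yes"])
print(information_gain(outlook, play))  # 1.0 bit: outlook separates the classes perfectly
print(information_gain(windy, play))    # 0.0 bit: windy tells us nothing here
```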

One of the most important breakthroughs of the period was Support Vector Machines (SVM), proposed by Cortes and Vapnik [10] in 1995. With the kernel trick mapping data into non-linear feature spaces, SVM became a strong competitor to NN models on many tasks.
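For flavor, a small usage sketch of a kernelized SVM; it relies on scikit-learn rather than anything from the original text, and the kernel and regularization parameters are arbitrary illustrative choices.

```python
from sklearn.svm import SVC

# XOR-like data that no linear separator can classify; an RBF-kernel SVM handles it
# because the kernel trick implicitly maps the points into a non-linear feature space.
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [0, 1, 1, 0]

clf = SVC(kernel="rbf", gamma=2.0, C=10.0)
clf.fit(X, y)
print(clf.predict(X))  # expected: [0 1 1 0]
```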

A little before, another solid ML model was proposed by Freund and Schapire in 1997: Adaboost, a boosted ensemble of weak classifiers with roots in the PAC (Probably Approximately Correct) learning framework. They introduce it as follows:

The model we study can be interpreted as a broad, abstract extension of the well-studied on-line prediction model to a general decision-theoretic setting…[11]
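A compact sketch of the AdaBoost reweighting loop (illustrative, not the authors' code): decision stumps serve as weak learners, and instances the current stump misclassifies receive more weight in the next round. The dataset, round count and the small epsilon are invented for the example.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=200, n_features=5, random_state=0)
y = np.where(y == 1, 1, -1)                  # AdaBoost works with {-1, +1} labels

n_rounds = 20
weights = np.full(len(y), 1.0 / len(y))      # start with uniform instance weights
stumps, alphas = [], []

for _ in range(n_rounds):
    stump = DecisionTreeClassifier(max_depth=1).fit(X, y, sample_weight=weights)
    pred = stump.predict(X)
    err = weights[pred != y].sum()           # weighted error of this weak learner
    alpha = 0.5 * np.log((1 - err) / (err + 1e-12))
    weights *= np.exp(-alpha * y * pred)     # upweight the examples it got wrong
    weights /= weights.sum()
    stumps.append(stump); alphas.append(alpha)

# Final prediction: sign of the alpha-weighted vote of all stumps.
vote = sum(a * s.predict(X) for a, s in zip(alphas, stumps))
print("training accuracy:", (np.sign(vote) == y).mean())
```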

Another ensemble model was explored by Breiman [12] in 2001: Random Forests (RF). In his words:

Random forests are a combination of tree predictors such that each tree depends on the values of a random vector sampled independently and with the same distribution for all trees in the forest. [12]
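A minimal sketch of the two sources of randomness Breiman combines: each tree is grown on a bootstrap sample of the data and considers only a random subset of features at each split. The example leans on scikit-learn and an invented synthetic dataset; parameter values are illustrative.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=500, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# bootstrap=True resamples the data per tree; max_features limits the features tried per split.
rf = RandomForestClassifier(n_estimators=100, max_features="sqrt", bootstrap=True, random_state=0)
rf.fit(X_tr, y_tr)
print("test accuracy:", rf.score(X_te, y_te))
```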

As we come closer to today, a new era of NN called Deep Learning commenced around 2005: NN models with many successive layers, built on the conjunction of many past and present ideas [13 – 26].

With the combination of all those ideas and others not listed here, NN models are able to beat the state of the art at very different tasks such as Object Recognition, Speech Recognition, NLP, etc. However, this absolutely does not mean that other ML streams are finished. Even as Deep Learning success stories grow rapidly, there are many criticisms directed at the training cost of these models and at the tuning of their exogenous hyper-parameters. Moreover, SVM is still used more commonly owing to its simplicity (a claim that may well spark debate).

Before finishing, I need to touch on one more relatively young ML trend. With the growth of the WWW and Social Media, a new term, BigData, emerged and strongly affected ML research. Because of the scale of BigData problems, many powerful ML algorithms became impractical, so researchers turned to a family of simpler models dubbed Bandit Algorithms [27 – 38] (more formally framed as Online Learning) that keep learning cheap and adaptable for large-scale problems.
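To give the flavor of these simple, large-scale-friendly methods, here is a rough epsilon-greedy multi-armed bandit sketch; the arm payoff probabilities and epsilon are invented, and the incremental mean update is the online-learning ingredient.

```python
import random

# Epsilon-greedy bandit: mostly exploit the best arm seen so far, occasionally explore.
true_reward_prob = [0.2, 0.5, 0.7]           # hidden per-arm payoff probabilities (invented)
counts = [0, 0, 0]
values = [0.0, 0.0, 0.0]                     # running mean reward per arm
epsilon = 0.1
random.seed(0)

for step in range(10_000):
    if random.random() < epsilon:
        arm = random.randrange(len(values))                      # explore
    else:
        arm = max(range(len(values)), key=lambda a: values[a])   # exploit
    reward = 1.0 if random.random() < true_reward_prob[arm] else 0.0
    counts[arm] += 1
    values[arm] += (reward - values[arm]) / counts[arm]          # incremental (online) mean update

print(counts)   # most pulls concentrate on the best arm (index 2)
print(values)   # estimates approach the true probabilities
```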

I would like to conclude this infant sketch of ML history here. If you find something wrong (you probably will), please do not hesitate to point it out.

[1] Hebb, D. O. The Organization of Behavior. New York: Wiley & Sons, 1949.

[2] Rosenblatt, Frank. “The perceptron: a probabilistic model for information storage and organization in the brain.” Psychological Review 65.6 (1958): 386–408.

[3] Minsky, Marvin, and Seymour Papert. “Perceptrons.” (1969).

[4] Widrow, B., and M. E. Hoff. “Adaptive switching circuits.” (1960): 96–104.

[5] S. Linnainmaa. The representation of the cumulative rounding error of an algorithm as a Taylor expansion of the local rounding errors. Master’s thesis, University of Helsinki, 1970.

[6] P. J. Werbos. Applications of advances in nonlinear sensitivity analysis. In Proceedings of the 10th IFIP Conference, 1981.

[7]  Rumelhart, David E., Geoffrey E. Hinton, and Ronald J. Williams.  Learning internal representations by error propagation

[8] Hecht-Nielsen, Robert. “Theory of the backpropagation neural network.” International Joint Conference on Neural Networks (IJCNN). IEEE, 1989.

[9] Quinlan, J. Ross. “Induction of decision trees.” Machine Learning 1.1 (1986): 81–106.

[10] Cortes, Corinna, and Vladimir Vapnik. “Support-vector networks.” Machine Learning 20.3 (1995): 273–297.

[11] Freund, Yoav, Robert Schapire, and N. Abe. “A short introduction to boosting.” Journal of the Japanese Society for Artificial Intelligence 14.5 (1999): 771–780.

[12] Breiman, Leo. “Random forests.” Machine Learning 45.1 (2001): 5–32.

[13] Hinton, Geoffrey E., Simon Osindero, and Yee-Whye Teh. “A fast learning algorithm for deep belief nets.” Neural Computation 18.7 (2006): 1527–1554.

[14] Bengio, Lamblin, Popovici, Larochelle. “Greedy Layer-Wise Training of Deep Networks”, NIPS 2006.

[15] Ranzato, Poultney, Chopra, LeCun. “Efficient Learning of Sparse Representations with an Energy-Based Model”, NIPS 2006.

[16] Olshausen, B. A., and Field, D. J. “Sparse coding with an overcomplete basis set: a strategy employed by V1?” Vision Research 37.23 (1997): 3311–3325.

[17] Vincent, P., H. Larochelle, Y. Bengio, and P.-A. Manzagol. “Extracting and Composing Robust Features with Denoising Autoencoders”, ICML 2008.

[18]  Fukushima, K. (1980). Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position. Biological Cybernetics, 36, 193–202.

[19] LeCun, Yann, et al. “Gradient-based learning applied to document recognition.” Proceedings of the IEEE 86.11 (1998): 2278–2324.

[20]  LeCun, Yann, and Yoshua Bengio. “Convolutional networks for images, speech, and time series.”  The handbook of brain theory and neural networks

[21]  Zeiler, Matthew D., et al. “Deconvolutional networks.”  Computer Vision and Pattern Recognition (CVPR), 2010 IEEE Conference on

[22] S. Vishwanathan, N. Schraudolph, M. Schmidt, and K. Murphy. Accelerated training of conditional random fields with stochastic meta-descent. In International Conference on Machine Learning (ICML ’06), 2006.

[23] Nocedal, J. (1980). “Updating Quasi-Newton Matrices with Limited Storage.” Mathematics of Computation 35 (151): 773–782. doi:10.1090/S0025-5718-1980-0572855-

[24] S. Yun and K.-C. Toh, “A coordinate gradient descent method for l1- regularized convex minimization,” Computational Optimizations and Applications, vol. 48, no. 2, pp. 273–307, 2011.

[25] Goodfellow, I., Warde-Farley, D., et al. “Maxout networks.” arXiv preprint arXiv …

[26] Wan, L., Zeiler, M., et al. “Regularization of neural networks using DropConnect.” Proc. …

[27] Alekh Agarwal, Olivier Chapelle, Miroslav Dudik, John Langford. “A Reliable Effective Terascale Linear Learning System.”

[28] M. Hoffman, D. Blei, F. Bach. “Online Learning for Latent Dirichlet Allocation.”

[29] Alina Beygelzimer, Daniel Hsu, John Langford, Tong Zhang. “Agnostic Active Learning Without Constraints.”

[30] John Duchi, Elad Hazan, Yoram Singer. “Adaptive Subgradient Methods for Online Learning and Stochastic Optimization.”

[31] H. Brendan McMahan, Matthew Streeter. “Adaptive Bound Optimization for Online Convex Optimization.”

[32] Nikos Karampatziakis, John Langford. “Importance Weight Aware Gradient Updates.”

[33] Kilian Weinberger, Anirban Dasgupta, John Langford, Alex Smola, Josh Attenberg. “Feature Hashing for Large Scale Multitask Learning.”

[34] Qinfeng Shi, James Petterson, Gideon Dror, John Langford, Alex Smola, S. V. N. Vishwanathan. “Hash Kernels for Structured Data.”

[35] John Langford, Lihong Li, Tong Zhang. “Sparse Online Learning via Truncated Gradient.”

[36] Leon Bottou. “Stochastic Gradient Descent.”

[37] Avrim Blum, Adam Kalai, John Langford. “Beating the Holdout: Bounds for K-Fold and Progressive Cross-Validation.”

[38] Nocedal, J.

[39] D. H. Ballard. Modular learning in neural networks. In AAAI, pages 279–284, 1987.

[40] S. Hochreiter. Untersuchungen zu dynamischen neuronalen Netzen. Diploma thesis, Institut für Informatik, Technische Universität München, 1991.
