5 free e-books for machine learning mastery

Original article: 5 free e-books for machine learning mastery
Author: Serdar Yegulalp | Translator: 赖信涛 | Editor: 仲培艺

There are few subjects in computing as fascinating, or intimidating, as machine learning. Let's face it -- you can't master machine learning in a weekend, and at the very least it requires a good grasp of the underlying mathematical principles.

That said, if you have the math chops, you'll want to augment your use of machine learning frameworks (there are plenty to pick from) with a good understanding of the theory behind them.


Here are five high-quality, free-to-read texts that provide introductions to and explanations of machine learning's ins and outs. Some have code examples, but most focus on formulas and theory; in principle, they can be applied to any number of languages, frameworks, or problems.

A Course in Machine Learning

The gist: A highly readable text designed to provide an extremely beginner-friendly approach to the topic. The book is a work in progress -- some sections are still marked TODO -- but what it lacks in completeness, it makes up in sheer accessibility.

Target audience: Anyone with a good grasp of calculus, probability, and linear algebra. No expertise in any specific language is required.

Code content: Some pseudocode; the majority of what's presented is concepts and formulas.

The Elements of Statistical Learning

The gist: A 500-plus-page text that covers what the authors describe as "learning from data," the processes of employing statistics that are the underpinnings for machine learning. It's been through two editions and 10 printings since 2001, for good reason -- it covers a massive amount of territory and isn't limited to any one field.

Target audience: Those who already have a good foundation in math and statistics and don't need a lot of hand-holding to translate their math skills into good code.

Code content: None. This isn't a software development text; this is about foundational concepts around machine learning.

  • A Course in Machine Learning (source: Hal Daumé III)
  • The Elements of Statistical Learning, 2nd Ed. (source: Stanford University)
  • Bayesian Reasoning and Machine Learning (source: David Barber)
  • Gaussian Processes for Machine Learning (source: Gaussian Processes for Machine...)
  • Machine Learning (source: InTech)

Bayesian Reasoning and Machine Learning

The gist: Bayesian methods are behind everything from spam filters to pattern recognition, so they constitute a major field of study for machine-learning mavens. This text walks through all the major aspects of Bayesian statistics, and how they apply to common scenarios in machine learning.

Target audience: Anyone with a good grasp of calculus, probability, and linear algebra.

Code content: Lots! Each chapter contains both pseudocode and links to a toolkit of actual code demos. Note that the code is written for the commercial MATLAB environment rather than Python or R, although GNU Octave can work as an open source substitute.
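To make the spam-filter example concrete, here is a minimal sketch of Bayes' rule applied to a single word. It is not taken from the book (whose demos target MATLAB/Octave); the function name `posterior_spam` and all the probabilities are hypothetical, chosen only for illustration.

```python
def posterior_spam(p_spam, p_word_given_spam, p_word_given_ham):
    """Return P(spam | word) using Bayes' rule.

    p_spam:             prior probability that a message is spam
    p_word_given_spam:  probability the word appears in a spam message
    p_word_given_ham:   probability the word appears in a non-spam message
    """
    p_ham = 1.0 - p_spam
    # Total probability of seeing the word at all (the "evidence").
    evidence = p_word_given_spam * p_spam + p_word_given_ham * p_ham
    return p_word_given_spam * p_spam / evidence


# Hypothetical numbers: 20% of mail is spam, and the word shows up
# in 50% of spam messages but only 5% of legitimate ones.
p = posterior_spam(p_spam=0.2, p_word_given_spam=0.5, p_word_given_ham=0.05)
print(f"P(spam | word) = {p:.3f}")
```

A real filter would combine evidence from many words, but the single-word case already shows how a modest prior can be revised sharply by one informative observation.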

Gaussian Processes for Machine Learning

The gist: Gaussian processes are part of the family of analyses used by Bayesian methods. This text focuses on how Gaussian concepts can be used in common machine learning methods like classification, regression, and model training.

Target audience: Roughly the same as "Bayesian Reasoning and Machine Learning."

Code content: Most of the code featured in the book is pseudocode, but like "Bayesian Reasoning and Machine Learning," the appendices include examples for MATLAB/Octave.
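For readers who want a feel for Gaussian process regression before opening the book, the following is a rough, dependency-free sketch in Python. It is not the book's code: it assumes a zero-mean prior, a squared-exponential kernel with unit length scale, and a tiny noise term, and the helper names (`rbf`, `solve`, `gp_predict`) are made up for this illustration.

```python
import math


def rbf(a, b, length_scale=1.0):
    """Squared-exponential kernel for scalar inputs."""
    return math.exp(-0.5 * ((a - b) / length_scale) ** 2)


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def gp_predict(x_train, y_train, x_new, noise=1e-6):
    """Posterior mean and variance of a zero-mean GP at x_new."""
    n = len(x_train)
    # Kernel matrix of the training inputs, with noise on the diagonal.
    K = [[rbf(x_train[i], x_train[j]) + (noise if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    k_star = [rbf(x, x_new) for x in x_train]
    alpha = solve(K, y_train)  # K^{-1} y
    v = solve(K, k_star)       # K^{-1} k_*
    mean = sum(k * a for k, a in zip(k_star, alpha))
    var = rbf(x_new, x_new) - sum(k * w for k, w in zip(k_star, v))
    return mean, var


x_train = [-1.0, 0.0, 1.0]
y_train = [math.sin(x) for x in x_train]
mean, var = gp_predict(x_train, y_train, 0.5)
print(f"mean={mean:.3f}, var={var:.3f}")
```

The predicted variance shrinks near the training inputs and returns to the prior variance far away from them, which is the behavior that makes Gaussian processes attractive for regression with uncertainty estimates. Production code would normally use a Cholesky factorization from a numerical library instead of the hand-rolled solver shown here.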

Machine Learning

The gist: A collection of essays on different and highly specific aspects of machine learning. Some are more general and philosophical; others are focused on specific problem domains, such as "Machine Learning Methods for Spoken Dialogue Simulation and Optimization."

Target audience: Intended for lay readers as well as the more technically inclined.

Code content: Virtually none, although formulas abound. Read for flavor.



