How Machines Learn: A Practical Guide

by Karlijn Willems

You may have heard about machine learning from interesting applications like spam filtering, optical character recognition, and computer vision.

Getting started with machine learning is a long process that involves going through several resources. There are books for newbies, academic papers, guided exercises, and standalone projects. It’s easy to lose track of what you need to learn among all these options.

So in today’s post, I’ll list seven steps (and 50+ resources) that can help you get started in this exciting field of Computer Science, and ramp up toward becoming a machine learning hero.

Note that this list of resources is not exhaustive and is meant to get you started. There are many more resources around.

1. Get the necessary background knowledge

You might remember from DataCamp’s Learn Data Science infographic that mathematics and statistics are key to starting machine learning (ML). The foundations might seem quite easy because it’s just three topics. But don’t forget that these are in fact three broad topics.

There are two things that are very important to keep in mind here:

  • First, you’ll definitely want some further guidance on what exactly you need to cover to get started.

  • Second, these are the foundations of your further learning. Don’t be scared to take your time. Get the knowledge on which you’ll build everything.

The first point is simple: it’s a good idea to cover linear algebra and statistics. These two are the bare minimum that one should understand. But while you’re at it, you should also try to cover topics such as optimization and advanced calculus. They will come in handy when you’re getting deeper into ML.

Here are some pointers on where to get started if you are starting from zero:

If you’re more into books, consider the following:

However, in most cases, you’ll start off already knowing some things about statistics and mathematics. Or maybe you have already gone through all the theory resources listed above.

In these cases, it’s a good idea to recap and assess your knowledge honestly. Are there any areas that you need to revise or are you good for now?

If you’re all set, it’s time to go ahead and apply all that knowledge with R or Python. As a general guideline, it’s a good idea to pick one and get started with that language. Later, you can still add the other programming language to your skill set.

Why is all this programming knowledge necessary?

Well, you’ll see that the courses listed above (or those you have taken in school or university) will provide you with a more theoretical (and not applied) introduction to mathematics and statistics topics. However, ML is very applied and you’ll need to be able to apply all the topics you have learned. So it’s a good idea to go over the materials again, but this time in an applied way.
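
To make "applied" concrete, here is a minimal sketch of revisiting two statistics basics (sample standard deviation and Pearson correlation) in Python. The data and the `pearson` helper are made up for illustration and are not part of any resource mentioned here; only the standard library is used.

```python
import math
import statistics

# Hypothetical toy data: hours studied and exam scores for five students.
hours = [1.0, 2.0, 3.0, 4.0, 5.0]
scores = [52.0, 58.0, 65.0, 70.0, 80.0]


def pearson(xs, ys):
    """Pearson correlation, computed directly from its textbook definition."""
    mean_x = statistics.mean(xs)
    mean_y = statistics.mean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


mean_score = statistics.mean(scores)
std_score = statistics.stdev(scores)  # sample standard deviation (n - 1)
r = pearson(hours, scores)

print(f"mean={mean_score:.1f} std={std_score:.2f} r={r:.3f}")
```

Writing the formula out once by hand, and then comparing it with what a library gives you, is a quick way to check that the theory has actually stuck.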

If you want to master the basics of R and Python, consider the following courses:

When you have nailed down the basics, check out DataCamp’s blog on the 40+ Python Statistics For Data Science Resources. This post offers 40+ resources on the statistics topics you need to know to get started with data science (and by extension also ML).

Also make sure you check out this SciPy tutorial on vectors and arrays and this workshop on Scientific Computing with Python.

To get hands-on with Python and calculus, you can check out the SymPy package.

2. Don’t be scared to invest in the “theory” of ML

A lot of people don’t make the effort to go through some more theoretical material because it’s “dry” or “boring.” But going through the theory and really investing your time in it is essential and invaluable in the long run. You’ll better understand new advancements in machine learning, and you’ll be able to link back to your background knowledge. This will help you stay motivated.

Additionally, the theory doesn’t need to be boring. As you read in the introduction, there are so many materials that will make it easier for you to get into it.

Books are one of the best ways to absorb the theoretical knowledge. They force you to stop and think once in a while. Of course, reading books is a very static thing to do and it might not agree with your learning style. Nonetheless, try out the following books and see if it might be something for you:

  • The Machine Learning textbook by Tom Mitchell might be old, but it’s gold. This book goes over the most important topics in machine learning in a well-explained, step-by-step way.

  • Machine Learning: The Art and Science of Algorithms that Make Sense of Data (you can see the slides of the book here): this book is great for beginners. There are many real-life applications discussed, which you might find lacking in Tom Mitchell’s book.

  • Machine Learning Yearning: this book by Andrew Ng is not yet complete, but it’s bound to be an excellent reference for those who are learning ML.

  • Algorithms and Data Structures by Jurg Nievergelt and Klaus Hinrichs

  • Also check out Data Mining for the Masses by Matthew North. You’ll find that this book guides you through some of the most difficult topics.

  • Introduction to Machine Learning by Alex Smola and S.V.N. Vishwanathan.

Videos / MOOCs are awesome for those who learn by watching and listening. There are a lot of MOOCs and videos out there, but it can also be hard to find your way through all those materials. Below is a list of the most notable ones:

At this point, it’s important for you to go over the separate techniques and grasp the whole picture. This starts with understanding key concepts: the distinction between supervised and unsupervised learning, classification and regression, and so on. Manual (written) exercises can come in handy. They can help you understand how algorithms work and how you should go about them. You’ll most often find these written exercises in courses from universities. Check out this ML course by Portland State University.
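
If the supervised/unsupervised distinction still feels abstract, a tiny sketch like the one below can help. It is only an illustration under stated assumptions: the data points, the function names, and the naive initialisation are all made up here, and the two algorithms (a 1-nearest-neighbour classifier and a two-cluster k-means) are simply convenient examples of each family.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b))


# Supervised learning: every training example comes with a label.
labelled = [
    ((1.0, 1.0), "small"),
    ((1.5, 2.0), "small"),
    ((8.0, 8.0), "large"),
    ((9.0, 9.5), "large"),
]


def nearest_neighbour(train, point):
    """Classify a point by copying the label of the closest training example."""
    _, label = min(train, key=lambda pair: sq_dist(pair[0], point))
    return label


# Unsupervised learning: the same points, but the labels are thrown away.
points = [features for features, _ in labelled]


def two_means(data, iterations=10):
    """Tiny k-means with k=2; returns a cluster index (0 or 1) per point."""
    centres = [data[0], data[-1]]  # naive initialisation, an assumption
    assignment = []
    for _ in range(iterations):
        # Assign each point to its closest centre.
        assignment = [
            min((0, 1), key=lambda k: sq_dist(p, centres[k])) for p in data
        ]
        # Move each centre to the mean of its members.
        for k in (0, 1):
            members = [p for p, a in zip(data, assignment) if a == k]
            if members:
                centres[k] = tuple(sum(c) / len(members) for c in zip(*members))
    return assignment


print(nearest_neighbour(labelled, (2.0, 1.0)))
print(two_means(points))
```

The classifier needs the labels to make a prediction; the clustering routine finds the two groups without ever seeing them.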

3. Get hands-on

Knowing the theory and understanding the algorithms by reading and watching is all good. But you also need to surpass this stage and get started with some exercises. You’ll learn to implement these algorithms and apply the theory that you’ve learned.

First, you have tutorials which introduce you to the basics of machine learning in Python and R. The best way is, of course, to go for interactive tutorials:

Also check out the following tutorials, which are static and will require you to work in an IDE:

Besides the tutorials, there are also courses. Taking courses will help you apply the concepts that you’ve learned in a focused way. Experienced instructors will help you. Here are some interactive courses for Python and ML:

  • Supervised Learning with scikit-learn: you’ll learn how to build predictive models, tune their parameters, and predict how well they will perform on unseen data, all while using real-world datasets and scikit-learn.

  • Unsupervised Learning in Python: shows you how to cluster, transform, visualize, and extract insights from unlabeled datasets. At the end of the course, you’ll build a recommender system.

  • Deep Learning in Python: you’ll gain hands-on, practical knowledge of how to use deep learning with Keras 2.0, the latest version of a cutting-edge library for deep learning in Python.

  • Applied Machine Learning in Python: introduces the learner to applied ML and focuses more on the techniques and methods than on the statistics behind these methods.

For those who are learning ML with R, there are also these interactive courses:

  • Introduction to Machine Learning gives you a broad overview of the discipline’s most common techniques and applications. You’ll gain more insight into the assessment and training of different ML models. The rest of the course focuses on an introduction to three of the most basic ML tasks: classification, regression, and clustering.

  • R: Unsupervised Learning provides a basic introduction to clustering and dimensionality reduction in R from an ML perspective. This allows you to get from data to insights as quickly as possible.

  • Practical Machine Learning covers the basic components of building and applying prediction functions with an emphasis on practical applications.

Lastly, there are also books that go over ML topics in a very applied way. If you’re looking to learn with the help of text and an IDE, check out these books:

4. Practice

Practice is even more important than getting hands-on and revising the material with Python. This step was probably the hardest one for me. Check out how other people have implemented ML algorithms when you have done some exercises. Then, get started on your own projects that illustrate your understanding of ML algorithms and theories.

One of the most straightforward ways to practice is to scale the exercises up a little: take on a bigger exercise that requires you to do more data cleaning and feature engineering.
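
As a rough idea of what those two chores look like, here is a minimal sketch in plain Python. The records, the column names, and both helper functions are hypothetical; real projects usually rely on a data-frame library, but the steps are the same.

```python
import statistics

# Hypothetical raw records; None marks a missing value.
raw = [
    {"area_m2": 50.0, "rooms": 2, "price": 150000.0},
    {"area_m2": None, "rooms": 3, "price": 210000.0},
    {"area_m2": 90.0, "rooms": 4, "price": 270000.0},
]


def impute_mean(rows, key):
    """Data cleaning: fill missing values of `key` with the mean of the rest."""
    observed = [row[key] for row in rows if row[key] is not None]
    fill = statistics.mean(observed)
    return [{**row, key: fill if row[key] is None else row[key]} for row in rows]


def add_area_per_room(rows):
    """Feature engineering: derive a new column from two existing ones."""
    return [{**row, "area_per_room": row["area_m2"] / row["rooms"]} for row in rows]


clean = add_area_per_room(impute_mean(raw, "area_m2"))

for row in clean:
    print(row)
```

Both helpers return new lists instead of modifying `raw`, so you can always compare the cleaned data against the original.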

Tip: don’t forget that there are handy resources to help you out when you’re practicing. Check out these data science cheat sheets.

5. Projects

Doing small exercises is good. But in the end, you’ll want to make a project in which you can demonstrate your understanding of the ML algorithms with which you’ve been working.

The best exercise is to implement your own ML algorithm. You can read more about why you should do this exercise and what you can learn from it in the following pages:

Next, you can check out the following posts and repositories. They’ll give you some inspiration from others and will show how they have implemented ML algorithms.
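
To give a feel for what implementing your own algorithm can look like, here is a minimal sketch of simple linear regression trained with batch gradient descent. It is not taken from any of the posts or repositories mentioned; the function name, learning rate, epoch count, and toy data are all assumptions picked for illustration.

```python
def fit_line(xs, ys, learning_rate=0.01, epochs=5000):
    """Fit y = w * x + b by minimising mean squared error with gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        # Prediction error for every training example.
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Gradients of the mean squared error with respect to w and b.
        grad_w = (2.0 / n) * sum(e * x for e, x in zip(errors, xs))
        grad_b = (2.0 / n) * sum(errors)
        # Step against the gradient.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Toy data generated from y = 2x + 1, so the fit should recover w ≈ 2, b ≈ 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

w, b = fit_line(xs, ys)
print(f"w={w:.3f} b={b:.3f}")
```

Once a version like this works, try changing the learning rate or adding noise to the data and see how the fit responds. That kind of tinkering is where most of the learning happens.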

6. Don’t stop

Learning ML is something that should never stop. As many will confirm, there are always new things to learn, even when you’ve been working in this area for a decade.

There are, for example, ML trends such as deep learning which are very popular right now. You might also focus on other topics that aren’t central at this point but which might be in the future. Check out this interesting question and the answers if you want to know more.

Papers may not be the first thing that springs to mind when you’re worried about mastering the basics. But they are your way to get up to date with the latest research. Papers are not for those who are just starting out. They are definitely a good fit for those who are more advanced.

Other technologies are also something to consider. But don’t worry about them when you’re just starting out. You can, for example, focus on adding Python or R (depending on which one you already know) to your skill set. You can look through this post to find interesting resources.

If you also want to move towards big data, you could consider looking into Spark. Here are some interesting resources:

Other programming languages, such as Java, JavaScript, C, and C++, are gaining importance in ML. In the long run, you can also consider adding one of these languages to your to-do list. You can use these blog posts to guide your choice:

7. Make use of all the material that is out there

Machine learning is a difficult topic which can make you lose your motivation at some point. Or maybe you feel you need a change. In such cases, remember that there’s a lot of material on which you can fall back. Check out the following resources:

Podcasts are a great resource for continuing your journey into ML and staying up to date with the latest developments in the field:

There are, of course, many more podcasts, but this list is just to get you started!

Documentation and package source code are two ways to get deeper into the implementation of the ML algorithms. Check out some of these repositories:

  • Scikit-Learn: well-known Python ML package

  • Keras: Deep learning package for Python

  • caret: very popular R package for Classification and Regression Training

Visualizations are one of the newest and trendiest ways to get into the theory of ML. They’re fantastic for beginners, but also very interesting for more advanced learners. The following visualizations will intrigue you and will help you gain a better understanding of the workings of ML:

You Can Get Started Now

Now it’s up to you. Learning ML is a continuous process, so the sooner you get started, the better. You now have all of the tools you need to get started. Good luck, and make sure to let us know how you’re progressing.

This post is based on an answer I gave to the Quora question How Does A Total Beginner Start To Learn Machine Learning.

Translated from: https://www.freecodecamp.org/news/how-machines-learn-a-practical-guide-203aae23cafb/
