Becoming a Machine Learning Engineer: Become the Rafael Nadal of Machine Learning

by Sudharsan Asaithambi

Become the Rafael Nadal of Machine Learning

A year ago, I was a newbie to the world of Machine Learning. I used to get overwhelmed by small decisions, like choosing the language to code in, choosing the right online courses, or choosing the correct algorithms.

So I set out to make it easier for folks to get into Machine Learning.

I’ll assume that many of us are starting from scratch on our Machine Learning journey. Let’s find out how current professionals in the field reached their destination, and how we can emulate them on our journey.

I will illustrate how you can learn Data Science and Machine Learning by drawing a parallel with how Rafael Nadal learned to play tennis.

Commit Yourself (Stage 1)

Nadal had sports talent all around him in his family. Inspired by them, he began his tennis journey at the age of 3.

For anyone starting out in Machine Learning, it’s important to surround yourself with people who are also learning, teaching and practicing Machine Learning.

Learning the ropes is not easy if you do it alone. So commit yourself to learning Machine Learning, and find data science communities to help make your entry less painful.

Learn the Ecosystem (Stage 2)

Rafael Nadal learned not only the rules of tennis, but also the surrounding ‘ecosystem’.

He learned about the different types of rackets, balls, and court surfaces. He learned how scoring works in tennis. He enrolled in tennis coaching.

Discover the Machine Learning ecosystem

Data Science is a field that has embraced and made full use of open source platforms. While data analysis can be conducted in a number of languages, the choice of tools can make or break a project.

Data Science libraries are flourishing in the Python and R ecosystems. An infographic comparing Python and R for data analysis can help you choose between them.

Whichever language you choose, Jupyter Notebook and RStudio make our lives much easier. They allow us to visualize data while manipulating it. The Jupyter Notebook documentation describes its features in more detail.

Kaggle, Analytics Vidhya, MachineLearningMastery and KDnuggets are some of the active communities where data scientists all over the world enrich each other’s learning.

Machine Learning has been democratized by online courses, or MOOCs, from Coursera, edX and others, where we learn from amazing professors at world-class universities. Curated lists of the top data science MOOCs are a good place to start.

Cement the Foundation (Stage 3)

Rafael Nadal learned the basic shots

Nadal’s coach taught him the forehand and backhand shots. These are the main foundation of tennis. With these basic shots, Rafael could play a match competently.

Learn to manipulate data

Data scientists, according to interviews and expert estimates, spend 50 percent to 80 percent of their time mired in the mundane labor of collecting and preparing unruly digital data, before it can be explored for useful nuggets. - Steve Lohr, The New York Times

‘Data Crunching’ is the soul of the whole Machine Learning workflow. To help with this process, the pandas library in Python and data frames in R let you manipulate and analyze data. They provide data structures for relational or labeled data.
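
As a rough illustration of what this kind of data crunching looks like, here is a minimal pandas sketch. The match records, the column names, and the choice to fill a missing value with the median are all assumptions made up for this example; real data will need its own cleaning decisions.

```python
import pandas as pd

# Hypothetical match records, invented purely for illustration.
matches = pd.DataFrame(
    {
        "player": ["A", "A", "B", "B", "C"],
        "surface": ["clay", "grass", "clay", "hard", "clay"],
        "aces": [5, 9, None, 7, 3],
        "won": [True, False, True, True, False],
    }
)

# Cleaning: fill the missing ace count with the column median.
matches["aces"] = matches["aces"].fillna(matches["aces"].median())

# Filtering: keep only the clay-court matches.
clay = matches[matches["surface"] == "clay"]

# Aggregating: win rate and mean aces for each surface.
summary = matches.groupby("surface").agg(
    win_rate=("won", "mean"),
    mean_aces=("aces", "mean"),
)
print(summary)
```

Cleaning, filtering and aggregating are the three moves you will probably repeat most often, whatever the dataset.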

Data science is more than just building machine learning models. It’s also about explaining the models and using them to drive decisions. In the journey from analysis to data-driven outcomes, data visualization plays an important role in presenting data in a powerful and credible way.

The Matplotlib library in Python and ggplot2 in R offer complete 2D graphics support, with the flexibility to create high-quality data visualizations.
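
To make this concrete, here is a small Matplotlib sketch. The numbers are made up, and the labels and figure size are arbitrary choices for the example rather than recommendations.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so no display is required
import matplotlib.pyplot as plt

# Hypothetical data, invented for illustration.
hours_practiced = [1, 2, 3, 4, 5, 6]
model_accuracy = [0.52, 0.58, 0.61, 0.70, 0.74, 0.81]

fig, ax = plt.subplots(figsize=(6, 4))
ax.plot(hours_practiced, model_accuracy, marker="o", label="validation accuracy")
ax.set_xlabel("Hours practiced per week")
ax.set_ylabel("Accuracy")
ax.set_title("Practice versus accuracy (made-up numbers)")
ax.legend()

# Render to an in-memory PNG; pass a filename to savefig to write to disk instead.
buffer = io.BytesIO()
fig.savefig(buffer, format="png", dpi=150)
png_bytes = buffer.getvalue()
```

Inside a Jupyter notebook you would normally drop the `matplotlib.use("Agg")` line so that the figure is displayed inline under the cell.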

These are some of the libraries you will be spending most of your time on when conducting the analysis.

Practice day in and day out (Stage 4)

Rafael Nadal, when asked how much he trained:

“I train four hours a day, 210 days a year. If we add to that I play around 80 matches per year, each one lasting an average of two hours. That is 1000 hours playing tennis per year — and that is without counting the training days during tournaments.”

Learn Machine Learning algorithms and practice them

Once the foundation is set, you get to implement Machine Learning algorithms to make predictions and do all the cool stuff.

The scikit-learn library in Python and the caret and e1071 packages in R provide a range of supervised and unsupervised learning algorithms via a consistent interface.

These libraries let you apply an algorithm without worrying about its inner workings or nitty-gritty details.
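
The consistent interface is easiest to see in code. The sketch below uses scikit-learn's bundled Iris dataset; the two models, the split ratio and the random seeds are arbitrary choices for illustration, not tuned settings.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Load a small built-in dataset and hold out a quarter of it for testing.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Two different algorithms, driven through the same fit/score calls.
models = {
    "logistic regression": LogisticRegression(max_iter=1000),
    "decision tree": DecisionTreeClassifier(random_state=0),
}

scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)                 # learn from the training split
    scores[name] = model.score(X_test, y_test)  # accuracy on the held-out split
    print(f"{name}: {scores[name]:.2f}")
```

Swapping in another estimator usually means changing only the entry in `models`, which is what makes it cheap to try several algorithms on the same problem.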

Apply these machine learning algorithms to the use cases you find around you. These could come from your work, or you can practice in Kaggle competitions, where data scientists from around the world compete to build models that solve a given problem.

At the same time, work to understand the inner workings of one algorithm after another. Start with Linear Regression, the ‘Hello World!’ of Machine Learning, then move on to Logistic Regression, Decision Trees, and Support Vector Machines. This will require you to brush up on your statistics and linear algebra.
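
One way to look inside Linear Regression is to fit it by hand. The sketch below uses plain gradient descent on the mean squared error and compares the result with NumPy's closed-form least-squares fit. The synthetic data, the learning rate and the number of steps are assumptions chosen so that this small example converges.

```python
import numpy as np

# Synthetic data, invented for illustration: y = 3x + 2 plus Gaussian noise.
rng = np.random.default_rng(0)
x = rng.uniform(0, 10, size=200)
y = 3.0 * x + 2.0 + rng.normal(0.0, 1.0, size=200)

w, b = 0.0, 0.0        # slope and intercept, both starting at zero
learning_rate = 0.01   # assumed step size, small enough to be stable here

for _ in range(5000):
    error = (w * x + b) - y
    # Gradients of the mean squared error with respect to w and b.
    grad_w = 2.0 * np.mean(error * x)
    grad_b = 2.0 * np.mean(error)
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

# Closed-form least-squares fit for comparison (slope first, then intercept).
w_ls, b_ls = np.polyfit(x, y, 1)
print(f"gradient descent: w={w:.3f}, b={b:.3f}")
print(f"least squares:    w={w_ls:.3f}, b={b_ls:.3f}")
```

Both approaches should land on nearly the same line. Seeing why they agree is a good first exercise in the statistics and linear algebra behind the algorithm.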

Coursera co-founder Andrew Ng, a pioneer in AI, has developed a Machine Learning course that gives you a good starting point for understanding the inner workings of Machine Learning algorithms.

Learn the advanced skills (Stage 5)

Rafael Nadal learned to play advanced shots

While concentrating on the fundamentals, Nadal was also introduced to the advanced shots: the ones that only professionals who play tennis day in and day out are able to pull off.

Learn complex Machine Learning algorithms and Deep Learning architectures

While Machine Learning was established as a field long ago, the recent hype and media attention are primarily due to Machine Learning applications in AI fields like Computer Vision, Speech Recognition, and Language Processing. Many of these have been pioneered by tech giants like Google, Facebook, and Microsoft.

These recent advances can be credited to the progress made in cheap computation, the availability of large scale data, and the development of novel Deep Learning architectures.

To work in Deep Learning, you will need to learn how to process unstructured data, be it free text, images, or sound.
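
Processing unstructured data mostly means turning it into numbers a model can consume. As a minimal sketch for free text, the example below builds bag-of-words count vectors using only the standard library. Note that this is a classic pre-Deep-Learning technique shown for intuition, not what modern architectures use internally, and the sentences and the `tokenize` and `vectorize` helpers are made up for this example.

```python
import re
from collections import Counter

# Hypothetical documents, invented for illustration.
documents = [
    "Nadal wins on clay",
    "Clay suits Nadal",
    "Deep learning reads text",
]

def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())

# The vocabulary is every distinct token, in a fixed (sorted) order.
vocabulary = sorted({token for doc in documents for token in tokenize(doc)})

def vectorize(text):
    """Count how often each vocabulary word occurs in the text."""
    counts = Counter(tokenize(text))
    return [counts[word] for word in vocabulary]

vectors = [vectorize(doc) for doc in documents]
for doc, vector in zip(documents, vectors):
    print(vector, doc)
```

Each document becomes a fixed-length list of counts, which is the kind of numeric input that learning algorithms expect.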

You will learn to use platforms like TensorFlow or Torch, which let us apply Deep Learning without worrying about low-level hardware requirements. You will also learn Reinforcement Learning, which has made possible modern AI wonders like AlphaGo Zero.

Take your first step towards learning Machine Learning now!

1. Install Anaconda and use Jupyter to write Python

Go through some Python tutorials and learn the language’s fundamental data structures and syntax.

2. Surround yourself with Data Science. Create an account at:

● Kaggle, and check out the kernels written by top data scientists. Kaggle helps you establish a standard workflow that you can apply to any Data Science problem.

● Analytics Vidhya. This website is a go-to place for many data scientists. It boasts 4 million unique visitors per month and has a very active community.

● Check out the PyData channel on YouTube. PyData is a conference organized by the open source community to keep analysts up to date with the latest developments in Data Science. This gives you

● Use podcasts to learn about the latest tools and technology in AI. Podcasts are a great way to make use of the time you spend on daily chores, whether jogging, arranging your closet, or commuting. If you are new to podcasts, download the Podcast Addict app onto your phone.

Machine Learning (Software Engineering Daily): Every week Jeff interviews people from the heart of Data Science. It gives you a rare early peek into what’s going on in Silicon Valley, helping you pick up new techniques and technologies. It gives you many new ideas to apply in your work. I can’t recommend it enough.

● Medium

Follow some of the Machine Learning publications here on Medium:

● Go to Coursera and edX, and check out the various Machine Learning courses available.

I will end this post with this quote by Robin Sharma:

Every Pro was Once an Amateur.
Every Expert was Once a Beginner.
So Dream Big.
And Start Now.

Please comment below to tell us why you are planning to start your Machine Learning journey, and how you plan to do so.

And for all you Machine Learning pros, give us the nuances of what works and what doesn’t. Please comment below on how you started your Machine Learning journey and what expedited and hindered your learning process.

Source: https://www.freecodecamp.org/news/baby-steps-to-learn-machine-learning-from-a-tennis-fan-d4171f51c23f/
