

Asking Machine Learning/AI hires to have fancy degrees is outdated. Here’s why.

Should your machine learning hire have a PhD? Do you need a PhD to work in ML?

I see PhD and Masters degrees listed as requirements in ML job descriptions all the time. The very first Google Jobs result I opened for “Machine Learning Engineer” required:

Ph.D. in Data Science, Machine Learning, Statistics, Operations Research or related field

M.S. in related field with 5+ years experience applying data science techniques to real business problems.


Requiring a PhD or even a Master’s degree is a really unusual ask for software engineers. We don’t expect this of developers working on networking or security or systems architecture or app development. So what makes machine learning so special?

Some might say that ML is uniquely complex and math-y, the domain of scientists rather than software developer hacks (ouch!).


I don’t buy it. ML may be tricky, but so are cryptography and distributed systems and graphics and tons of other topics in Computer Science. Yet we don’t require developers to have PhDs to work on those things. I think it’s something else:

We often forget that machine learning is really new for most of us in tech. Just five years ago, my colleagues were still talking about deep learning as a dubious bet. Tools were difficult to use. Beefy hardware designed for ML was hard to get. Model quality was far from where it is today.

As a result, most of us didn’t learn about ML in school. Five years ago, Princeton, my alma mater, offered only about three ML/AI classes. Machine learning was definitely not a “standard” part of your typical Computer Science curriculum, and you could easily graduate without learning much about it. Online resources were scant. Then, when AI suddenly became the new hotness, lots of folks rushed back into Master’s programs to fill the newly-relevant gap in their education.

With a shortage of ML talent, it’s no wonder that if you did want to hire someone with experience in the field, that person would probably be an academic.


Meanwhile, engineers began learning ML on the job. Even at Google, one of the world’s largest employers of PhD AI researchers, tons of engineers who work on ML products have limited prior experience with the technology. They learn through online resources, internal courses (like Google’s Machine Learning Crash Course), or by taking on small chunks of projects and learning as they go.

Five years is an eon in tech, and the data science landscape has changed. It’s much easier to learn machine learning outside of the classroom today than it used to be, and our toolset has become significantly more user-friendly (see PyTorch, TensorFlow 2.0, Keras). With an enormous and growing ecosystem of online resources to draw on, the determined developer can give herself a hefty ML education without ever spending a dime.

The eligible Data Scientist/Machine Learning engineer hiring pool has changed, too. In 2019, the data science competition site Kaggle surveyed about 4,000 data scientists. They found that while 52% of respondents had Master’s degrees, only 19% had PhDs. Meanwhile, the majority of respondents had only 3–5 years of experience, and they skewed young (between 25 and 29 years old). There is a sizable and growing chunk of ambitious, self-taught data scientists just entering the job market.

We take for granted the fact that self-taught software engineers can be very talented (a recruiter for Google recently told me she interviewed a high schooler). Tech recruiters learned long ago that if they only hired MIT grads, they’d hire no one at all. So now we need to make a similar perspective shift in the way we hire data scientists.

This also means we need to start evaluating ML job applicants the way we evaluate software engineers. Instead of focusing on credentials, we need to spend more time building effective interviews that allow candidates to show off their skills no matter how or where they learned them. Building new criteria to hire ML engineers won’t be an easy task, but for employers, the payoff will be worth it.

Originally published at https://daleonai.com on August 26, 2020.

