

Editor’s Note: Automated Machine Learning is rapidly making the benefits of AI available to companies without fleets of data scientists and Machine Learning Engineers. In this post, Wee Hyong Tok, Principal Data Science Manager at Microsoft, explains how Automated ML works and how it can empower businesses to do more.

By using machine learning fueled by data, businesses can engage their customers, optimize operations, empower employees, and fundamentally transform their products. Automated machine learning can act as a catalyst, helping businesses realize that value faster.

Automated machine learning enables teams to spontaneously create machine learning pipelines and train machine learning models. This helps both nonexperts and experts in the organization work toward the goal of building innovative machine learning solutions. In this report, you will learn how automated machine learning can jump-start adoption of machine learning in your organization.

What Is Automated Machine Learning?


As described by AutoML.org, automated machine learning “provides methods and processes to make machine learning available for non–machine learning experts, to improve efficiency of machine learning, and to accelerate research on machine learning.” In short, automated machine learning empowers non–machine learning experts to get started with machine learning quickly by automating many of the tasks that data scientists perform each day.

Figure 1 shows some of the common tasks handled by automated machine learning. These tasks include the selection of machine learning algorithms, feature engineering, and selecting the best way to tune the machine learning algorithm (commonly known as hyperparameter tuning).

The diagram draws these tasks as a circle because the process is iterative: automated machine learning tries different combinations of algorithms, hyperparameter values, and other choices, evaluates each resulting pipeline against the desired performance metric (e.g., accuracy, precision, AUC, or F1 score), and delivers a ranked list of machine learning pipelines.


Figure 1. Common tasks handled by automated machine learning
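
The iterative search described here can be sketched in a few lines of Python. This is only an illustration, assuming scikit-learn is available; the synthetic dataset, the `search_space` dictionary, and the `rank_pipelines` helper are hypothetical names chosen for this example and do not reflect how any particular automated machine learning product works internally.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Assumption: a small synthetic binary-classification dataset stands in
# for real business data.
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=5, random_state=0
)

# Hypothetical search space: each candidate algorithm is paired with the
# hyperparameter values to try for it.
search_space = {
    "logistic_regression": (
        LogisticRegression(max_iter=1000),
        {"model__C": [0.1, 1.0, 10.0]},
    ),
    "decision_tree": (
        DecisionTreeClassifier(random_state=0),
        {"model__max_depth": [3, 5, None]},
    ),
    "random_forest": (
        RandomForestClassifier(random_state=0),
        {"model__n_estimators": [25, 50]},
    ),
}


def rank_pipelines(X, y, metric="f1"):
    """Try every algorithm/hyperparameter combination and rank the results."""
    leaderboard = []
    for name, (estimator, grid) in search_space.items():
        # A fixed scaling step stands in for automated data preparation.
        pipeline = Pipeline([("scale", StandardScaler()), ("model", estimator)])
        # Cross-validated search over this algorithm's hyperparameter grid.
        search = GridSearchCV(pipeline, grid, scoring=metric, cv=3)
        search.fit(X, y)
        leaderboard.append((name, search.best_params_, search.best_score_))
    # Best-scoring pipeline first.
    return sorted(leaderboard, key=lambda row: row[2], reverse=True)


leaderboard = rank_pipelines(X, y)
for name, params, score in leaderboard:
    print(f"{name}: f1={score:.3f} with {params}")
```

Changing the `metric` argument (for example to `"accuracy"` or `"roc_auc"`) re-ranks the same candidates against a different performance metric, which mirrors the choice of metric mentioned above.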


As businesses explore the use of automated machine learning, it is natural to consider the risks and liabilities of using automatically selected machine learning pipelines and models. For example, will the machine learning pipelines and models be black boxes, which are hard to maintain and interpret? In some regulated industries, it is also important to explain predicted results.

The machine learning pipelines and models that are identified by automated machine learning should be evaluated and reviewed by the same company processes that govern the deployment of AI solutions in the company, ensuring compliance with company policies as well as regulations for the specific industry. To build trust, many of the existing techniques for model interpretability can be used in both the training and inference phases.
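
As one illustration of an existing interpretability technique, the sketch below applies permutation importance to a trained model. The text does not name a specific technique, so this choice is an assumption, as are the synthetic dataset and the random forest model; it assumes scikit-learn is available.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Assumption: synthetic data with 3 informative features out of 6.
X, y = make_classification(
    n_samples=400, n_features=6, n_informative=3, n_redundant=0, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Stand-in for a model selected by automated machine learning.
model = RandomForestClassifier(n_estimators=50, random_state=0)
model.fit(X_train, y_train)

# Shuffle one feature at a time on held-out data and measure how much the
# model's score drops; a larger drop suggests the model relies on it more.
result = permutation_importance(
    model, X_test, y_test, n_repeats=10, random_state=0
)

ranked = sorted(
    enumerate(result.importances_mean), key=lambda pair: pair[1], reverse=True
)
for index, importance in ranked:
    print(f"feature_{index}: mean score drop {importance:.3f}")
```

Because the technique only needs predictions from a fitted model, it can be applied to a pipeline regardless of which algorithm was selected, which is useful when the reviewer did not choose the algorithm by hand.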

As data scientists explore the use of automated machine learning, a common question that one might ask is: are there any reasons data scientists should not use automated machine learning? The answer is: it depends.

Data scientists who prefer maximum flexibility in performing feature engineering (and who prefer choosing and tuning the machine learning algorithms) might prefer to develop the code without the aid of automated machine learning. In some cases, it may be necessary to use specialized machine learning algorithms not provided by automated machine learning.

Three Reasons Why Automated Machine Learning Matters to Businesses

Automated machine learning empowers businesses to achieve more by providing a head start for anyone who wants to start building AI solutions. As Google’s Jeff Dean pointed out in his talk at the 2019 International Conference on Machine Learning (ICML), “Millions of organizations worldwide have machine problems (most don’t even realize this yet) but only tens or hundreds of thousands of people trained to solve these problems.”

Reducing the time to market for AI solutions matters


By reducing the time taken to develop and train machine learning models, AI solutions can be prototyped faster, demonstrate proof of value quickly, and be hardened and improved as they make their way to production deployment.


Empowering data scientists to achieve more matters


Automated machine learning not only empowers nonexperts but also benefits experienced data scientists. Data science teams will be more productive and efficient, as automated machine learning helps data scientists quickly work through various machine learning algorithms and tuning solutions. This enables data scientists to quickly achieve a good baseline and get a head start toward the best models.

Making machine learning simpler for everyone in the organization matters


Automated machine learning helps make all machine learning simpler for everyone in the business (from data scientists to analysts to anyone in the organization interested in using machine learning to solve business problems). It intelligently figures out the best ways to prepare the data for machine learning, select the machine learning algorithms, tune the hyperparameter values for the algorithms, and optimize the end-to-end machine learning pipeline for the performance metrics that you want to use for model evaluation.

Wee Hyong is a Principal Data Science Manager with the AI Platform team at Microsoft. He leads the Engineering and Data Science team for the AI for Earth program. Wee Hyong has worn many hats in his career — developer, program/product manager, data scientist, researcher, and strategist, and his track record of leading successful engineering and data science teams has given him unique super powers to be a trusted AI advisor to many customers.







