Uber Open Sourced This Framework for Debugging Machine Learning Models

Original article: https://medium.com/dataseries/uber-open-sourced-this-framework-for-debugging-machine-learning-models-c4a769728c3b

Machine learning programs differ from traditional software applications in that their structure keeps changing as the model builds more knowledge. As a result, debugging and interpreting machine learning models is one of the most challenging aspects of real-world artificial intelligence (AI) solutions. Debugging, interpretation, and diagnosis are active areas of focus for organizations building machine learning solutions at scale. Last year, Uber unveiled Manifold, a framework that uses visual analysis techniques to support interpretation, debugging, and comparison of machine learning models. Manifold brings together several advanced ideas from machine learning interpretability to address some of the fundamental challenges of visually debugging machine learning models.

The challenge of debugging and interpreting machine learning models is nothing new, and the industry has produced several tools and frameworks in this area. However, most of the existing stacks focus on evaluating a candidate model using performance metrics such as log loss, area under the curve (AUC), and mean absolute error (MAE), which, although useful, offer little insight into the underlying reasons for the model's performance. Another common challenge is that most machine learning debugging tools are constrained to specific types of models (e.g., regression or classification) and are very difficult to generalize across broader machine learning architectures. Consequently, data scientists spend tremendous amounts of time trying different model configurations until they reach the performance they are after.
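To make the limitation of aggregate metrics concrete, here is a minimal sketch in plain Python. The labels and predicted probabilities are made up for illustration, and the helper names are assumptions rather than part of Manifold. It computes log loss, MAE, and AUC, then shows that the per-instance losses carry information the averages hide.

```python
import math

def per_instance_log_loss(y_true, p_pred, eps=1e-15):
    """Binary cross-entropy for each instance (lower is better)."""
    losses = []
    for y, p in zip(y_true, p_pred):
        p = min(max(p, eps), 1 - eps)  # clip to avoid log(0)
        losses.append(-(y * math.log(p) + (1 - y) * math.log(1 - p)))
    return losses

def mean_absolute_error(y_true, y_pred):
    return sum(abs(y - p) for y, p in zip(y_true, y_pred)) / len(y_true)

def auc(y_true, scores):
    """AUC as the chance that a random positive outranks a random negative."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical labels and predicted probabilities.
y_true = [1, 1, 1, 0, 0, 0]
p_pred = [0.9, 0.8, 0.1, 0.2, 0.1, 0.3]

losses = per_instance_log_loss(y_true, p_pred)
mean_loss = sum(losses) / len(losses)
mae_value = mean_absolute_error(y_true, p_pred)
auc_value = auc(y_true, p_pred)

# The single worst instance dominates the average, which the summary hides.
worst_index = max(range(len(losses)), key=losses.__getitem__)
print(f"log loss={mean_loss:.3f}  MAE={mae_value:.3f}  AUC={auc_value:.3f}")
print(f"worst instance: index {worst_index}, loss {losses[worst_index]:.3f}")
```

The three summary numbers describe how well the model does overall, but only the per-instance view points to which data the model struggles with.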

Entering Manifold

A company like Uber operates hundreds of machine learning models across dozens of teams. As a result, debugging and interpretability of those models become a key part of the machine learning pipeline. With Manifold, the Uber engineering team wanted to accomplish some very tangible goals:

· Debug code errors in a machine learning model.

· Understand the strengths and weaknesses of one model, both in isolation and in comparison with other models.

· Compare and ensemble different models.

· Incorporate insights gathered through inspection and performance analysis into model iterations.

To accomplish those goals, Manifold segments the machine learning analysis process into three main phases: inspection, explanation, and refinement.

· Inspection: In the first part of the analysis process, the user designs a model and compares its outcome with that of existing models. During this phase, the user compares typical performance metrics, such as accuracy, precision/recall, and the receiver operating characteristic (ROC) curve, to get a coarse-grained sense of whether the new model outperforms the existing ones.

· Explanation: This phase attempts to explain the hypotheses formulated in the previous phase. It relies on comparative analysis to explain some of the symptoms of the specific models.

· Refinement: In this phase, the user attempts to verify the explanations from the previous phase by encoding the knowledge extracted from them into the model and testing its performance.

The three steps of the machine learning analysis process materialize in a simple user interface that streamlines the debugging of machine learning models. The Manifold user interface consists of two main views:

1) Performance Comparison View: Provides a visual comparison between model pairs using a small-multiples design, along with a local feature interpreter view.

2) Feature Attribution View: Reveals a feature-wise comparison between user-defined subsets and provides a similarity measure of feature distributions.
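A feature-wise comparison with a similarity measure can be sketched as follows. This is an illustration under stated assumptions, not Manifold's implementation: it uses KL divergence over smoothed histograms as one possible measure, and the feature names and values are hypothetical.

```python
import math

def histogram(values, lo, hi, n_bins):
    """Normalized histogram with add-one smoothing so no bin has zero mass."""
    width = (hi - lo) / n_bins or 1.0  # fall back to 1.0 if all values are equal
    counts = [1.0] * n_bins
    for v in values:
        idx = min(int((v - lo) / width), n_bins - 1)  # the max value goes in the last bin
        counts[idx] += 1
    total = sum(counts)
    return [c / total for c in counts]

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same bins."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))

def rank_features(subset_a, subset_b, n_bins=4):
    """Rank features by how differently they are distributed in the two subsets."""
    scores = {}
    for name in subset_a:
        combined = subset_a[name] + subset_b[name]
        lo, hi = min(combined), max(combined)
        p = histogram(subset_a[name], lo, hi, n_bins)
        q = histogram(subset_b[name], lo, hi, n_bins)
        scores[name] = kl_divergence(p, q)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Hypothetical subsets: instances the model handles well vs. poorly.
well_predicted = {"trip_distance": [1, 2, 2, 3], "hour": [8, 9, 17, 18]}
badly_predicted = {"trip_distance": [8, 9, 9, 10], "hour": [8, 9, 17, 18]}

ranked = rank_features(well_predicted, badly_predicted)
for name, score in ranked:
    print(f"{name}: divergence={score:.3f}")
```

Features at the top of the ranking are the ones whose distributions differ most between the two subsets, which makes them the first candidates to inspect.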

Figure source: https://arxiv.org/pdf/1808.00196.pdf

Users can debug machine learning models in Manifold in three main steps:

1) Compare: First, given a dataset with the output from one or more ML models, Manifold compares and highlights performance differences across models or data subsets.

2) Slice: This step lets users select data subsets of interest based on model performance for further inspection.

3) Attribute: Manifold then highlights feature distribution differences between the selected data subsets, helping users find the reasons behind the performance outcomes.
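The compare, slice, and attribute steps above can be sketched on a toy dataset. Everything here is hypothetical (the rows, the model names, the feature names, and the error threshold), and the attribution step uses a crude difference of feature means rather than whatever Manifold computes internally.

```python
# Hypothetical delivery-time predictions (minutes) from two models.
rows = [
    {"distance_km": 2.0, "is_rush_hour": 0, "actual": 20, "model_a": 21, "model_b": 22},
    {"distance_km": 3.0, "is_rush_hour": 0, "actual": 25, "model_a": 24, "model_b": 26},
    {"distance_km": 2.5, "is_rush_hour": 1, "actual": 40, "model_a": 30, "model_b": 38},
    {"distance_km": 4.0, "is_rush_hour": 0, "actual": 30, "model_a": 31, "model_b": 29},
    {"distance_km": 3.5, "is_rush_hour": 1, "actual": 45, "model_a": 33, "model_b": 44},
    {"distance_km": 1.5, "is_rush_hour": 0, "actual": 15, "model_a": 15, "model_b": 16},
]
models = ["model_a", "model_b"]
features = ["distance_km", "is_rush_hour"]

def mean(values):
    return sum(values) / len(values)

# 1) Compare: per-instance absolute error and its average for each model.
errors = {m: [abs(r[m] - r["actual"]) for r in rows] for m in models}
mae = {m: mean(e) for m, e in errors.items()}

# 2) Slice: split the data by how badly model_a performs on each instance.
threshold = 5.0  # assumed cutoff, chosen by eye for this toy data
hard = [r for r, e in zip(rows, errors["model_a"]) if e > threshold]
easy = [r for r, e in zip(rows, errors["model_a"]) if e <= threshold]

# 3) Attribute: which features differ most between the two slices?
gaps = {f: mean([r[f] for r in hard]) - mean([r[f] for r in easy]) for f in features}

print("MAE per model:", mae)
print("instances in hard slice:", len(hard))
print("feature mean gaps (hard minus easy):", gaps)
```

In this toy data the hard slice consists entirely of rush-hour orders, which suggests where model_a is missing signal.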

The Manifold Architecture

From an architecture standpoint, the Manifold workflow takes a group of machine learning models as input and produces different data segments based on feature engineering. The feature segments are then processed by a group of encoders that produce a set of new features with intrinsic structures that the original models did not capture. These features help users iterate on new models and obtain better performance.

The workflow described above is implemented in a simple architecture built on three main components: data source, backend, and frontend. Functionally, the Manifold architecture is based on three main modules:

· Data transformer: adapts data formats from other internal services (e.g., Michelangelo) into Manifold's internal data representation format.

· Computation engine: runs clustering and other data-intensive computations.

· Front-end components: the UI of the Manifold visual analytics system (its Python package uses a built-in version of the JavaScript front-end components).
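As a rough idea of the kind of clustering the computation engine is described as running, the sketch below groups instances by their per-model loss using a small, deterministic k-means. It is an assumption-laden illustration: the loss values are invented, the initialization scheme is a simplification, and nothing here is taken from Manifold's code.

```python
def sq_dist(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, k, n_iter=20):
    """Plain Lloyd's algorithm with a deterministic start (assumes len(points) >= k)."""
    ordered = sorted(points)
    n = len(ordered)
    # Start from the midpoints of k equal-sized chunks of the sorted points.
    centroids = [ordered[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assign each point to its nearest centroid.
        labels = [min(range(k), key=lambda c: sq_dist(p, centroids[c])) for p in points]
        new_centroids = []
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                new_centroids.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centroids.append(centroids[c])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids

# Each point is (loss under model A, loss under model B) for one instance.
performance = [(0.1, 0.1), (0.2, 0.1), (0.15, 0.2), (1.9, 0.3), (2.1, 0.2), (2.0, 0.25)]
labels, centroids = kmeans(performance, k=2)
print("cluster labels:", labels)
print("centroids:", centroids)
```

Clustering in performance space, rather than feature space, groups together the instances on which the models behave alike.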

One of the key capabilities of Manifold is its integration with Uber's core machine learning platform, Michelangelo. To achieve that, the Uber engineering team relied on JavaScript-based computation frameworks such as TensorFlow.js, which remove the need for expensive computation hardware. For more computation-intensive processes, Manifold provides a Python interface built on Pandas and Scikit-Learn.

Manifold in Action at Uber

Uber has adopted Manifold across all its data science teams. Recently, the Uber Eats team used Manifold to evaluate a new model that predicts order delivery times. During the implementation, the team integrated an extra set of features that they thought could improve the performance of the existing model. However, after the first tests, they noticed that the model's performance was barely affected. Were the data scientists wrong to incorporate the new features?

Using Manifold, the Uber team visualized the original model (green) and the model with the new features (orange). In the resulting visualization, the test dataset was automatically segmented into four clusters based on performance similarity among data points. For Clusters 0, 1, and 2, the model with additional features provided no performance improvement. However, the performance of the new model (the one with extra features) was slightly better in Cluster 3, as indicated by a log-loss distribution shifted to the left. The results indicate that the extra features help in Cluster 3, which covers some very specific use cases that were hard to assess in the other clusters.
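The pattern in this example, an aggregate metric that barely moves while one segment improves, can be reproduced with a small sketch. The cluster assignments and loss values below are invented for illustration and are not Uber's data.

```python
from collections import defaultdict

def mean_loss_by_cluster(cluster_ids, losses):
    """Average loss within each cluster."""
    totals, counts = defaultdict(float), defaultdict(int)
    for c, loss in zip(cluster_ids, losses):
        totals[c] += loss
        counts[c] += 1
    return {c: totals[c] / counts[c] for c in sorted(totals)}

# Hypothetical per-instance log losses for the original and the new model.
cluster_ids = [0, 0, 1, 1, 2, 2, 3, 3]
original = [0.20, 0.22, 0.40, 0.42, 0.60, 0.62, 1.10, 1.20]
with_new_features = [0.20, 0.22, 0.40, 0.42, 0.60, 0.62, 0.80, 0.90]

base = mean_loss_by_cluster(cluster_ids, original)
cand = mean_loss_by_cluster(cluster_ids, with_new_features)

# Negative delta means the new model has lower (better) log loss in that cluster.
delta = {c: cand[c] - base[c] for c in base}
overall_delta = sum(with_new_features) / len(with_new_features) - sum(original) / len(original)
most_improved = min(delta, key=delta.get)

print(f"overall change in log loss: {overall_delta:+.3f}")
for c, d in delta.items():
    print(f"cluster {c}: {d:+.3f}")
print("most improved cluster:", most_improved)
```

Looking only at the overall change would understate the effect, because the improvement is concentrated in a single cluster and diluted by the unchanged ones.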

Manifold represents an important step towards improving the debuggability and interpretability of machine learning models. Beyond the open-source release, some of the ideas outlined in the research paper can be incorporated into other machine learning tools and frameworks to improve the lifecycle of machine learning solutions.

Translated from: https://medium.com/dataseries/uber-open-sourced-this-framework-for-debugging-machine-learning-models-c4a769728c3b
