Google Relies on These Open Source Indicators to Build Fair Machine Learning Systems

I recently started a new newsletter focused on AI education. TheSequence is a no-BS (meaning no hype, no news, etc.) AI-focused newsletter that takes 5 minutes to read. The goal is to keep you up to date with machine learning projects, research papers, and concepts. Please give it a try by subscribing below:

Ethics is one of the disciplines that must accompany the evolution of artificial intelligence (AI) systems. Building AI agents that produce ethical outcomes is one of the foundational challenges of the next decade of machine learning systems. Among the different aspects of ethical systems, fairness is one that deserves particular attention. A fair machine learning model is one whose outcomes don't favor any particular group based on a specific bias. Conceptually, the idea of fair machine learning systems seems incredibly intuitive, but how can we materialize it technically? After all, fairness is a concept of ethics that regularly involves subjective opinions. Measuring fairness in machine learning models requires a quantitative definition of it. A few days ago, Google took some initial steps to address this challenge with the release of Fairness Indicators for TensorFlow.

The idea of quantifying fairness in a machine learning model is far from trivial. Bias can manifest itself across all aspects of the machine learning lifecycle, compounding its impact on the final outcome of the model. To detect this unequal impact, evaluation over individual slices, or groups of users, is crucial, as overall metrics can obscure poor performance for certain groups.

Source: https://github.com/tensorflow/fairness-indicators

Evaluating the impact of a model on different slices of users is an interesting approach to assessing fairness. It is important to note that fairness cannot be achieved solely through metrics and measurement; high performance, even across slices, does not necessarily prove that a system is fair. However, this approach is a good starting point for identifying gaps in the performance of the model across relevant groups of users.

Fairness Indicators

Fairness Indicators is a suite of tools that enables computation and visualization of commonly identified fairness metrics for classification models. It is worth highlighting that this is not the first attempt to provide fairness evaluation metrics for machine learning models, but previous attempts have had trouble scaling when applied to large datasets. The current architecture of the Fairness Indicators tool suite allows it to evaluate models and datasets of any size.

From a functional standpoint, Fairness Indicators provides a core set of capabilities for evaluating fairness in machine learning models:

  • Evaluate the distribution of datasets
  • Evaluate model performance, sliced across defined groups of users
  • Feel confident about your results with confidence intervals and evals at multiple thresholds
  • Dive deep into individual slices to explore root causes and opportunities for improvement

To enable the aforementioned capabilities, Fairness Indicators computes confidence intervals, which can surface statistically significant disparities, and performs evaluation over multiple thresholds. In the UI, it is possible to toggle the baseline slice and investigate the performance of various other metrics. The user can also add their own metrics for visualization, specific to their use case.

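To make this concrete, the snippet below is a minimal sketch of how those pieces are typically wired together in TensorFlow Model Analysis: the FairnessIndicators metric is configured with several decision thresholds, and slicing specs define the baseline and the user groups to compare. The feature name ('gender'), label key, and file paths are hypothetical placeholders rather than anything prescribed by the release.

```python
import tensorflow_model_analysis as tfma

# Sketch of an evaluation config: compute fairness metrics at several
# thresholds and slice the results by a (hypothetical) 'gender' feature.
eval_config = tfma.EvalConfig(
    model_specs=[tfma.ModelSpec(label_key='label')],
    metrics_specs=[
        tfma.MetricsSpec(metrics=[
            tfma.MetricConfig(
                class_name='FairnessIndicators',
                config='{"thresholds": [0.25, 0.5, 0.75]}'),
        ]),
    ],
    slicing_specs=[
        tfma.SlicingSpec(),                         # overall (baseline) slice
        tfma.SlicingSpec(feature_keys=['gender']),  # per-group slices
    ],
)

# Run the evaluation over a TFRecord file of examples (paths are placeholders).
eval_result = tfma.run_model_analysis(
    eval_shared_model=tfma.default_eval_shared_model(
        eval_saved_model_path='/path/to/saved_model',
        eval_config=eval_config),
    eval_config=eval_config,
    data_location='/path/to/eval_examples.tfrecord',
    output_path='/path/to/fairness_output')
```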

The current version of Fairness Indicators is optimized for the TensorFlow stack. Specifically, the release includes the following components:

  • TensorFlow Data Validation (TFDV) [analyze the distribution of your dataset; a short usage sketch follows this list]

  • TensorFlow Model Analysis (TFMA) [analyze model performance]

  • Fairness Indicators [an addition to TFMA that adds fairness metrics and the ability to easily compare performance across slices]

  • The What-If Tool (WIT) [an interactive visual interface designed to probe your models better]
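As a small illustration of the first item, the dataset-distribution step can be exercised on its own with TFDV. This is only a sketch; the TFRecord path is a placeholder.

```python
import tensorflow_data_validation as tfdv

# Compute summary statistics over an evaluation dataset (path is a placeholder).
stats = tfdv.generate_statistics_from_tfrecord(
    data_location='/path/to/eval_examples.tfrecord')

# Visualize feature distributions (e.g., to spot under-represented groups)
# and infer a schema that downstream components can validate against.
tfdv.visualize_statistics(stats)
schema = tfdv.infer_schema(stats)
```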

One of the great capabilities of the Fairness Indicators release is the integration with the What-If Tool (WIT). Clicking on a bar in the Fairness Indicators graph will load those specific data points into the WIT widget for further inspection, comparison, and counterfactual analysis. This is particularly useful for large datasets, where Fairness Indicators can be used to identify problematic slices before the WIT is used for a deeper analysis.

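In a notebook, that workflow looks roughly like the sketch below: the Fairness Indicators widget is rendered from an evaluation result, and the What-If Tool widget is then pointed at the examples from a slice that looks problematic. Here eval_result, examples_from_slice, and predict_fn are hypothetical stand-ins for objects you would already have.

```python
from tensorflow_model_analysis.addons.fairness.view import widget_view
from witwidget.notebook.visualization import WitConfigBuilder, WitWidget

# Render the Fairness Indicators UI for a previously computed evaluation result.
widget_view.render_fairness_indicator(eval_result)

# Inspect a suspicious slice more closely in the What-If Tool.
# examples_from_slice is a list of tf.Example protos and predict_fn is any
# callable mapping examples to model scores (both placeholders here).
config_builder = (WitConfigBuilder(examples_from_slice)
                  .set_custom_predict_fn(predict_fn))
WitWidget(config_builder, height=600)
```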

There are several ways in which machine learning developers can leverage Fairness Indicators today.

If using TensorFlow models and tools, such as TFX:

  • Access Fairness Indicators as part of the Evaluator component in TFX (a short sketch follows this list)

  • Access Fairness Indicators in TensorBoard when evaluating other real-time metrics
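A minimal sketch of the TFX route, assuming a recent TFX version and a pipeline whose ExampleGen and Trainer components already exist (example_gen and trainer below are hypothetical handles to them), and reusing an EvalConfig with the FairnessIndicators metric like the one shown earlier:

```python
from tfx.components import Evaluator

# The Evaluator component runs TFMA inside the pipeline; with a
# FairnessIndicators metric in eval_config, fairness results are written
# alongside the rest of the evaluation output.
evaluator = Evaluator(
    examples=example_gen.outputs['examples'],
    model=trainer.outputs['model'],
    eval_config=eval_config)
```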

If not using existing TensorFlow tools:

  • Download the Fairness Indicators pip package, and use TensorFlow Model Analysis as a standalone tool (see the sketch below)
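For the standalone route, the sketch below assumes the evaluation itself was run with TFMA as in the earlier snippet; the output path is a placeholder.

```python
# Install the standalone package first, e.g.: pip install fairness-indicators
import tensorflow_model_analysis as tfma
from tensorflow_model_analysis.addons.fairness.view import widget_view

# Load a previously written TFMA evaluation result and render the
# Fairness Indicators comparison UI in the notebook.
eval_result = tfma.load_eval_result(output_path='/path/to/fairness_output')
widget_view.render_fairness_indicator(eval_result)
```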

For non-TensorFlow models:

  • Use Model Agnostic TFMA to compute Fairness Indicators based on the output of any model (see the sketch below).
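A minimal sketch of the model-agnostic idea, assuming a TFMA version that supports evaluating precomputed predictions: scores from any model are written into the evaluation data and referenced via prediction_key, so no TensorFlow model is loaded at evaluation time. Key names and paths are placeholders.

```python
import tensorflow_model_analysis as tfma

# The evaluation data already contains a 'prediction' feature produced by an
# arbitrary (non-TensorFlow) model, so no eval_shared_model is passed.
eval_config = tfma.EvalConfig(
    model_specs=[tfma.ModelSpec(prediction_key='prediction', label_key='label')],
    metrics_specs=[
        tfma.MetricsSpec(metrics=[
            tfma.MetricConfig(
                class_name='FairnessIndicators',
                config='{"thresholds": [0.25, 0.5, 0.75]}'),
        ]),
    ],
    slicing_specs=[tfma.SlicingSpec(feature_keys=['gender'])],
)

eval_result = tfma.run_model_analysis(
    eval_config=eval_config,
    data_location='/path/to/scored_examples.tfrecord',
    output_path='/path/to/fairness_output')
```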

Fairness Indicators is an initial approach to evaluating fairness in machine learning models. Future efforts might include new metrics as well as remediation strategies and recommendations to address unfairness in specific areas. However, for an initial release, Fairness Indicators provides a very complete set of tools to help developers build fairer machine learning systems.

Translated from: https://medium.com/dataseries/google-relies-on-these-open-source-indicators-to-build-fair-machine-learning-systems-141a8383236d
