
A Testing Perspective of Unwanted AI Bias


Bias refers to prejudice in favor of or against one thing, person, or group compared with another, usually in a way considered to be unfair.


The World is Filled with Bias

A quick search on bias reveals a list of nearly 200 cognitive biases that psychologists have classified based on human beliefs, decisions, behaviors, social interactions, and memory patterns. Certainly, recent events stemming from racial inequality and social injustice are raising greater awareness of the biases that exist in the world today. Many would argue that our social and economic system is not designed to be fair and is even engineered in a way that marginalizes specific groups and benefits others. However, before we can improve such a system, we first have to be able to identify and measure where and to what degree it is unfairly biased.


Since the world is filled with bias, it follows that any data we collect from it contains biases. If we then take that data and use it to train AI, the machines will reflect those biases. So how then do we start to engineer AI-based systems that are fair and inclusive? Is it even practical to remove bias from AI-based systems, or is it too daunting of a task? In this article, we explore the world of AI bias and take a look at it through the eyes of someone tasked with testing the system. More specifically, we describe a set of techniques and tools for preventing and detecting unwanted bias in AI-based systems and quantifying the risk associated with it.


Not All Bias is Created Equally

While there is definitely some irony in this heading, one of the first things to recognize when designing AI-based systems is that there will be bias, but not all bias necessarily results in unfairness. In fact, if you examine the definition of bias carefully, the phrase “usually in a way considered to be unfair” implies that although bias generally carries a negative connotation, it isn’t always a bad thing. Consider any popular search engine or recommendation system. Such systems typically use AI to predict user preferences. Such predictions can be viewed as a bias in favor of or against some items over others. However, if the problem domain or target audience calls for such a distinction, it represents desired system behavior as opposed to unwanted bias. For example, it is acceptable for a movie recommendation system for toddlers to only display movies rated for children ages 1–3. However, it would not be acceptable for that system to only recommend movies preferred by male toddlers when the viewers could also be female. To avoid confusion, we typically refer to the latter as unwanted or undesired bias.

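To make this distinction testable, here is a minimal sketch (not from the original article) of how one might quantify unwanted bias in such a recommender: measure how often content is recommended to each viewer group and flag large gaps. The impression log, the group labels, and the 0.8 "four-fifths" threshold below are purely illustrative assumptions.

```python
# Minimal illustrative sketch: measuring group-level recommendation rates.
# The log, group labels, and 0.8 threshold are hypothetical assumptions.

# Hypothetical impression log: (viewer_group, was_recommended)
log = [
    ("male", True), ("male", True), ("male", False), ("male", True),
    ("female", True), ("female", False), ("female", False), ("female", False),
]

def recommendation_rate(records, group):
    """Fraction of impressions for `group` that resulted in a recommendation."""
    outcomes = [rec for g, rec in records if g == group]
    return sum(outcomes) / len(outcomes)

rate_m = recommendation_rate(log, "male")
rate_f = recommendation_rate(log, "female")

# Disparate-impact style ratio: values well below 1.0 indicate that one
# group is favored; 0.8 echoes the commonly cited "four-fifths" rule.
ratio = min(rate_m, rate_f) / max(rate_m, rate_f)
print(f"male={rate_m:.2f}  female={rate_f:.2f}  ratio={ratio:.2f}")
if ratio < 0.8:
    print("Potential unwanted bias: recommendation rates differ across groups.")
```

Whether such a gap is acceptable depends on the problem domain: filtering by age rating for toddlers is desired behavior, while a gap driven purely by the viewer's gender would be unwanted bias.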

The AI Bias Cycle

A recent survey on bias and fairness in machine learning by researchers at the University of Southern California’s Information Sciences Institute defines several categories of bias in data, algorithms, and user interactions. These categories are summed up in the cycle depicted in Figure 1 and can be described as follows:


Figure 1. The Bias Cycle in AI and Machine Learning Systems
  1. Data Bias: The cycle starts with the collection of real-world data that is inherently biased due to cultural, historical, temporal, and other reasons. Sourced data is then sampled for a given application, which can introduce further bias depending on the sampling method and size (a simple check for this kind of sampling drift is sketched after this list).


  2. Algorithmic Bias: The design of the training algorithm itself or the way it is used can also result in bias. These are systematic and repeatable errors that cause unfair outcomes such as privileging one set of users over others. Examples include popularity, ranking, evaluation, and emergent bias.


  3. User Interaction Bias: Both the user interface and user can be the source of bias in the system. As such, care should be taken in how user input, output, and feedback loops are designed, presented, and managed. User interactions typically produce new or updated data that contains further bias, and the cycle repeats.

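As a companion to the data-bias step above, here is a minimal sketch, assuming the demographic make-up of the source population is known, of how a tester might flag sampling-related drift in a training set. The population shares, sample counts, and 20% tolerance are illustrative assumptions, not values from the article.

```python
# Minimal illustrative sketch: flagging sampling (data) bias by comparing
# group shares in a training sample against the known source population.
# Population shares, sample counts, and the 20% tolerance are assumptions.

population_share = {"group_a": 0.50, "group_b": 0.30, "group_c": 0.20}
sample_counts = {"group_a": 620, "group_b": 310, "group_c": 70}

total = sum(sample_counts.values())

for group, expected in population_share.items():
    observed = sample_counts.get(group, 0) / total
    # Relative drift of the sampled share from the population share.
    drift = abs(observed - expected) / expected
    status = "OK" if drift <= 0.20 else "CHECK: over/under-represented"
    print(f"{group}: population={expected:.2f} sample={observed:.2f} "
          f"drift={drift:.0%} -> {status}")
```

A check like this only covers one link in the cycle; algorithmic bias and user-interaction bias call for their own tests on model outputs and on the feedback data those outputs generate.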

Resources on Bias in AI

Interested in learning more about unwanted AI bias and the bias cycle? Check out these video resources by Ricardo Baeza-Yates, Director of Graduate Data Science Programs at Northeastern University and former CTO of NTENT. In the first video, Baeza-Yates does a great job of introducing bias and explaining the bias cycle in l
