Bias in Natural Language Processing (NLP): A Dangerous but Fixable Problem

This article looks at bias in natural language processing (NLP), a potentially dangerous problem that can nevertheless be corrected through a variety of strategies. Translated from an article in the data science field, it emphasizes the importance of addressing bias in NLP.

Natural language processing (NLP) is one of the biggest areas of machine learning research, and although current linguistic machine learning models achieve numerically high performance on many language-understanding tasks, they often lack optimization for reducing implicit biases.

Let’s start from the beginning.

What is bias in machine learning models? Essentially, it’s when machine learning algorithms express implicit biases that often go undetected during testing, because most papers evaluate their models only for raw accuracy. Take, for example, the following instances of deep learning models expressing gender bias. According to our deep learning models,

  • “He is doctor” has a higher likelihood than “She is doctor.” [Source] (A short probing sketch follows this list.)

  • Man is to woman as computer programmer is to homemaker. [Source]

  • Sentences with female nouns are more indicative of anger. [Source]

  • Translating “He is a nurse. She is a doctor” into Hungarian and back to English results in “She is a nurse. He is a doctor.” [Source]
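
To see how such a likelihood gap can be measured, here is a minimal probing sketch using the Hugging Face transformers fill-mask pipeline. The choice of bert-base-uncased is an illustrative assumption, not the model used in the cited papers, so the exact scores will differ.

```python
from transformers import pipeline

# Illustrative bias probe: ask a masked language model how much
# probability it assigns to "he" vs. "she" in the same occupational
# context. (The model choice here is an assumption for illustration.)
fill = pipeline("fill-mask", model="bert-base-uncased")

results = fill("[MASK] is a doctor.", targets=["he", "she"])
for r in results:
    print(f"{r['token_str']}: {r['score']:.4f}")

# A noticeably higher score for "he" reproduces the kind of
# gender-skewed likelihood described in the first bullet above.
```

A raw-accuracy benchmark would never surface this gap, which is exactly why such biases pass testing undetected.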

In these examples, the algorithm is essentially expressing stereotypes, which differs from an example such as “man is to woman as king is to queen” because king and queen have a literal gender definition: kings are defined to be male and queens are defined to be female. Computer programmers and homemakers, by contrast, are not defined by gender, so pairing them with one is a learned stereotype rather than a fact of language. The word-embedding sketch below reproduces this contrast.
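
The distinction is easy to reproduce with pretrained word vectors. Below is a minimal sketch of the analogy arithmetic using gensim's downloader API with the word2vec Google News embeddings; the embedding choice is an assumption here, though it matches the kind of vectors on which the analogy results above were reported.

```python
import gensim.downloader as api

# Pretrained word2vec vectors trained on Google News (~1.6 GB download).
vectors = api.load("word2vec-google-news-300")

# Definitional analogy: man is to king as woman is to ? -> "queen",
# because queen is literally defined as female.
print(vectors.most_similar(positive=["woman", "king"],
                           negative=["man"], topn=1))

# Stereotyped analogy: man is to computer_programmer as woman is to ?
# On these vectors this famously returns "homemaker": a stereotype,
# since neither occupation has a gendered definition.
print(vectors.most_similar(positive=["woman", "computer_programmer"],
                           negative=["man"], topn=1))
```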
