We introduce the LAMA (LAnguage Model Analysis) probe to test the factual and commonsense
knowledge in language models:
It provides a set of knowledge sources, each composed of a corpus of facts. Facts are either subject-relation-object triples or question-answer pairs.
We evaluate each model based on how highly it ranks the ground truth token against every other word in a fixed candidate vocabulary.
Our assumption is that models which rank ground truth tokens high for these cloze statements have more factual knowledge.
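This ranking-based evaluation can be sketched as follows. The scoring function here is a stand-in: `scores` would come from a language model's distribution over the candidate vocabulary for the masked position, which is not shown.

```python
def rank_of_truth(scores, vocab, truth):
    """Return the 1-based rank of the ground-truth token when all
    candidate-vocabulary words are sorted by model score (highest first)."""
    order = sorted(vocab, key=lambda w: scores[w], reverse=True)
    return order.index(truth) + 1

# Toy example: hypothetical log-probabilities for the cloze statement
# "Dante was born in [MASK]." over a three-word candidate vocabulary.
vocab = ["Florence", "Paris", "Rome"]
scores = {"Florence": -0.2, "Paris": -3.1, "Rome": -1.5}
rank = rank_of_truth(scores, vocab, "Florence")  # -> 1
```

A model that places the ground truth at rank 1 for many such statements is, under this assumption, judged to hold more factual knowledge.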
4.1 Knowledge Sources
We cover a variety of sources of factual and commonsense knowledge. For each source, we describe the origin of fact triples (or question-answer pairs), how we transform them into cloze templates, and to what extent aligned texts exist in Wikipedia that are known to express a particular fact. We use the latter information in supervised baselines that extract knowledge representations directly from the aligned text.
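The transformation into cloze templates can be sketched as below. The template strings and relation names are illustrative placeholders, not the actual templates used by the probe.

```python
# Hypothetical relation-to-template mapping; [X] is the subject slot
# and [Y] the object slot to be masked.
TEMPLATES = {
    "place_of_birth": "[X] was born in [Y].",
    "capital_of": "[X] is the capital of [Y].",
}

def to_cloze(subject, relation, mask_token="[MASK]"):
    """Turn a subject-relation pair into a cloze statement by filling
    the subject slot and masking the object slot."""
    template = TEMPLATES[relation]
    return template.replace("[X]", subject).replace("[Y]", mask_token)

to_cloze("Dante", "place_of_birth")  # -> "Dante was born in [MASK]."
```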
4.2 Models
4.3 Baselines
Freq: For a subject and relation pair......
RE: For the relation-based knowledge source......
DrQA: For open-domain question answering......
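As a rough illustration of a frequency-style baseline, the sketch below always ranks objects by how often they appear with the given relation in training data. The training triples and exact counting scheme are assumptions for the example, not the probe's actual data.

```python
from collections import Counter, defaultdict

def fit_freq_baseline(train_triples):
    """Map each relation to its objects sorted by training frequency,
    so the most frequent object is predicted first for any subject."""
    counts = defaultdict(Counter)
    for subj, rel, obj in train_triples:
        counts[rel][obj] += 1
    return {rel: [o for o, _ in c.most_common()] for rel, c in counts.items()}

# Illustrative training triples.
train = [("Dante", "place_of_birth", "Florence"),
         ("Galileo", "place_of_birth", "Pisa"),
         ("Petrarch", "place_of_birth", "Arezzo"),
         ("Machiavelli", "place_of_birth", "Florence")]
predictions = fit_freq_baseline(train)
predictions["place_of_birth"][0]  # -> "Florence"
```

Such a baseline ignores the subject entirely, which is exactly what makes it a useful lower bound for relation-conditioned prediction.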
4.4 Metrics
We consider rank-based metrics and compute results per relation along with mean values across all relations. To account for multiple valid objects for a subject-relation pair (i.e., for N-M relations), we follow Bordes et al. (2013) and, when ranking at test time, remove from the candidates all other valid objects in the training data other than the one we test. We use the mean precision at k (P@k). For a given fact, this value is 1 if the object is ranked among the top k results, and 0 otherwise.
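The filtered P@k computation described above can be sketched as follows, assuming a ranked candidate list and a set of other known-valid objects from training:

```python
def precision_at_k(ranked_candidates, truth, other_valid, k):
    """Return 1 if the test object is in the top k after removing all
    other valid objects (Bordes et al. 2013 filtered setting), else 0."""
    filtered = [c for c in ranked_candidates
                if c == truth or c not in other_valid]
    return 1 if truth in filtered[:k] else 0

ranked = ["Rome", "Florence", "Paris", "Pisa"]
# "Rome" is another valid object from training, so it is filtered out
# and no longer blocks "Florence" from the top-1 position.
precision_at_k(ranked, "Florence", {"Rome"}, k=1)  # -> 1
```

Mean P@k is then the average of this indicator over all facts, reported per relation and averaged across relations.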