Prompt-Learning for Fine-Grained Entity Typing

Article link: https://blog.csdn.net/weixin_43913077/article/details/120227838

Contents

Input Side
- Prompt Definition
  - hard-encoding
  - soft-encoding
Answer Mapping
Training
Self-Supervised Prompt-Learning for Zero-Shot
References

Input Side

Prompt Definition

hard-encoding

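For tasks that are relatively well defined and short, the prompt template is written by hand.
[Image: example of a hand-written template]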
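A minimal sketch of a hand-written (hard-encoded) template, reusing the wording of the "Steve Jobs" example that appears later in this post. The function name `build_hard_prompt` and its parameters are hypothetical and only serve the illustration.

```python
def build_hard_prompt(sentence: str, entity: str, mask_token: str = "[MASK]") -> str:
    """Append a hand-written template to the input sentence.

    The template wording follows the post's own example; other
    phrasings are possible (assumption: one template per input).
    """
    return f"{sentence} In this sentence, {entity} is a {mask_token}."


prompt = build_hard_prompt("Steve Jobs founded Apple.", "Steve Jobs")
print(prompt)
```

A masked language model would then be asked to fill in the `[MASK]` position of this string.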

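soft-encoding

The prompt used is shown below, where [P] is a separator and [P1], ..., [Pl] are randomly initialized vectors. Intuitively, after training, the vectors represented by [P1], ..., [Pl] end up close to that of [MASK].
[Image: soft-encoding template]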
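The following sketch shows how such a soft template might be assembled. Because the template image is missing, the token order (sentence, [P], entity, [P1] ... [Pl], [MASK]) is an assumption, as are the function names, the vector dimension, and the Gaussian initialization. In a real model the soft vectors would be trainable parameters fed into the embedding layer.

```python
import random


def build_soft_prompt(sentence_tokens, entity_tokens, num_soft=2):
    """Return the token sequence: x [P] entity [P1] ... [Pl] [MASK] (assumed order)."""
    soft = [f"[P{i}]" for i in range(1, num_soft + 1)]
    return sentence_tokens + ["[P]"] + entity_tokens + soft + ["[MASK]"]


def init_soft_embeddings(num_soft, dim, seed=0, std=0.02):
    """Randomly initialize one vector per soft token [Pi] (hypothetical init scheme)."""
    rng = random.Random(seed)
    return {
        f"[P{i}]": [rng.gauss(0.0, std) for _ in range(dim)]
        for i in range(1, num_soft + 1)
    }


tokens = build_soft_prompt(
    ["Steve", "Jobs", "founded", "Apple", "."], ["Steve", "Jobs"], num_soft=2
)
soft_vectors = init_soft_embeddings(num_soft=2, dim=4)
print(tokens)
```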

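Answer Mapping

An entity may belong to several types, and these types can be hierarchical, for example LOCATION and CITY. For such entities, each label is mapped to its corresponding label words in the vocabulary. For example: LOCATION/CITY $\rightarrow v = \{location, city\}$
In addition, for each label, words related to that label (looked up with https://relatedwords.org) are also added to its label words. For example, the related words for city are metropolis, town, municipality, urban, suburb, municipal, megalopolis, civilization, downtown, country.

The label words for city are therefore location, city, metropolis, town, municipality, urban, suburb, municipal, megalopolis, civilization, downtown, country.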
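The construction above can be sketched as follows. The `PARENT` and `RELATED` dictionaries are hypothetical stand-ins: the hierarchy holds only the two labels from the example, and the related words are copied from the list above rather than fetched from relatedwords.org.

```python
# Hypothetical type hierarchy: each label maps to its parent (None for a root label).
PARENT = {"location": None, "city": "location"}

# Related words for "city", copied from the list in this post.
RELATED = {
    "city": [
        "metropolis", "town", "municipality", "urban", "suburb",
        "municipal", "megalopolis", "civilization", "downtown", "country",
    ],
}


def label_words(label):
    """Collect the labels on the path from the root, then the label's related words."""
    path = []
    node = label
    while node is not None:
        path.append(node)
        node = PARENT[node]
    path.reverse()  # root first, e.g. location, city
    return path + RELATED.get(label, [])


print(label_words("city"))
```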

Finally, the probability of a label is given by the weighted sum over these label words:
[Image: label probability formula]
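Since the formula image is missing, here is a hedged sketch of one way to compute such a weighted sum. The function name, the dictionary of `[MASK]` probabilities, and the choice to normalize by the sum of the weights are all assumptions; the weights themselves could be fixed or learned.

```python
def label_probability(mask_probs, words, weights=None):
    """Weighted average of the [MASK] probabilities of a label's words.

    mask_probs: dict mapping a vocabulary word to its predicted probability
                at the [MASK] position (hypothetical input format).
    words:      the label words of one label.
    weights:    one weight per label word; uniform if omitted (assumption).
    """
    if weights is None:
        weights = [1.0] * len(words)
    total = sum(weights)
    weighted = sum(w * mask_probs.get(word, 0.0) for w, word in zip(weights, words))
    return weighted / total


mask_probs = {"location": 0.1, "city": 0.4, "town": 0.2}
score = label_probability(mask_probs, ["location", "city", "town"], weights=[1.0, 2.0, 1.0])
print(score)
```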

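Training

[Image: training objective]
φ denotes the parameters of the template, θ the parameters of the pre-trained model, and the loss is cross-entropy.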
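As the objective image is missing, a cross-entropy objective consistent with this description could be written as below. This is a reconstruction under assumptions, not the authors' exact formula: $\mathcal{D}$ is an assumed symbol for the labeled training set, and $P(y \mid x; \theta, \phi)$ is the label probability from the answer mapping.

$$
\mathcal{L}(\theta, \phi) = -\sum_{(x, y) \in \mathcal{D}} \log P(y \mid x; \theta, \phi)
$$

Both θ and φ appear in the loss, so the template parameters and the pre-trained model can be updated jointly.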

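Self-Supervised Prompt-Learning for Zero-Shot

Everything above assumes a training set is available, so it does not apply to the zero-shot setting, where there is no training data.

The authors observe that, for a sentence such as:
Steve Jobs founded Apple. In this sentence, Steve Jobs is a [MASK]
the probability of predicting person for Steve Jobs is far higher than that of predicting location. They take this as evidence that the knowledge in the pre-trained model already carries type information.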
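The authors further argue that the same entity has similar types across different sentences. For example, "Steve Jobs" may be typed as entrepreneur, designer, or philanthropist in different sentences.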
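To make "similar types across sentences" concrete, one could compare the type distributions predicted for the same entity in two sentences. The sketch below uses Jensen-Shannon divergence as the measure, which is an assumption here since this part of the post does not name one. The example distributions are made up for illustration.

```python
import math


def kl(p, q):
    """KL divergence KL(p || q); terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def js_divergence(p, q):
    """Jensen-Shannon divergence: symmetric, 0 for identical distributions."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


# Made-up distributions over (entrepreneur, designer, location) for one entity
# as predicted from two different sentences.
sentence_a = [0.6, 0.3, 0.1]
sentence_b = [0.5, 0.4, 0.1]
print(js_divergence(sentence_a, sentence_b))
```

A small divergence means the two sentences yield similar type predictions for the entity; identical distributions give 0, and completely disjoint ones give log 2.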