[Paper Notes] Pre-train, Prompt, and Predict

Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing

Prompt Template Engineering

Prompt shape

  • Cloze prompts (e.g., "I love this movie, it is a [Z] movie"): for tasks that are solved using masked LMs
  • Prefix prompts (e.g., "I love this movie. What's the sentiment of the review? [Z]"): for generation tasks

For tasks with multiple inputs, such as text-pair classification, prompt templates must contain slots for two or more inputs, [X1] and [X2]; a minimal template-filling sketch follows.
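As a concrete illustration of these slot conventions, here is a minimal Python sketch; the template strings and the `fill` helper are invented for illustration, and only the [X]/[Z] slot notation comes from the survey.

```python
# Minimal sketch of the two prompt shapes using plain string templates.
# [Z] marks the answer slot; the templates and helper are illustrative only.

CLOZE = "{x} It is a [Z] movie."                        # answer slot mid-text (masked LMs)
PREFIX = "{x} What's the sentiment of the review? [Z]"  # answer slot at the end (generation)
PAIR = "{x1} ? [Z] , {x2}"                              # two-input template, e.g. for NLI

def fill(template: str, **inputs) -> str:
    """Instantiate a prompt template with the task inputs."""
    return template.format(**inputs)

print(fill(CLOZE, x="I love this movie."))
# I love this movie. It is a [Z] movie.
print(fill(PAIR, x1="A man is sleeping.", x2="A person is resting."))
# A man is sleeping. ? [Z] , A person is resting.
```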

Manual Template Engineering

Automated Template Learning

Discrete Prompts (hard prompts)
  • Prompt Mining: mine a corpus for the middle words that bridge input x and output y, yielding templates of the form "[X] middle words [Z]"
  • Prompt Paraphrasing: paraphrase an existing seed prompt into a set of candidate prompts
  • Gradient-based Search
  • Prompt Generation: treat prompt construction itself as a text generation task
  • Prompt Scoring
Continuous Prompts (soft prompts) (not fully clear to me)
  • Prefix Tuning: keep the language model (LM) parameters frozen and prepend a sequence of task-specific trainable vectors to the model input (see the sketch after this list)
  • Tuning Initialized with Discrete Prompts: initialize the continuous prompt with a discrete prompt found by search
  • Hard-Soft Prompt Hybrid Tuning
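A minimal PyTorch sketch of the prefix/soft-prompt idea, assuming a HuggingFace-style LM that accepts `inputs_embeds`; the class name, prompt length, and initialization scale are illustrative choices, not any paper's exact recipe.

```python
import torch
import torch.nn as nn

class SoftPrompt(nn.Module):
    """Prepend trainable prompt vectors to a frozen LM (illustrative sketch)."""
    def __init__(self, lm: nn.Module, n_prompt_tokens: int = 20, d_model: int = 768):
        super().__init__()
        self.lm = lm
        for p in self.lm.parameters():   # keep the pre-trained LM fixed
            p.requires_grad = False
        # the only trainable parameters: a short sequence of prompt vectors
        self.prompt = nn.Parameter(torch.randn(n_prompt_tokens, d_model) * 0.02)

    def forward(self, input_embeds: torch.Tensor):
        # prepend the same soft prompt to every sequence in the batch
        batch = input_embeds.size(0)
        prompt = self.prompt.unsqueeze(0).expand(batch, -1, -1)
        # assumes a HuggingFace-style LM that accepts inputs_embeds
        return self.lm(inputs_embeds=torch.cat([prompt, input_embeds], dim=1))
```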

Prompt Answer Engineering

Answer Shape

  • Tokens
  • Span
  • Sentence

Answer Space Design Methods

Manual Design
  • Unconstrained Spaces: the answer z is directly the output y (no mapping needed)
  • Constrained Spaces: calculate the probability of an output among multiple choices (see the scoring sketch after this list)
Discrete Answer Search
  • Answer Paraphrasing
  • Prune-then-Search (e.g., prune to the top-k candidate answers, then search)
  • Label Decomposition: decompose each relation label into its constituent words and use them as an answer.
Continuous Answer Search (?)
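For the constrained-space case above, a hedged sketch of scoring a small verbalizer set with a masked LM; the model name and the label-word mapping are illustrative choices, not taken from the paper.

```python
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
mlm = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

# verbalizer: class label y -> answer word z (an illustrative mapping)
answer_space = {"positive": "great", "negative": "terrible"}
prompt = f"I love this movie. It was a {tok.mask_token} movie."

enc = tok(prompt, return_tensors="pt")
mask_pos = (enc.input_ids == tok.mask_token_id).nonzero()[0, 1]
with torch.no_grad():
    logits = mlm(**enc).logits[0, mask_pos]

# score each constrained choice by its logit at the mask position
scores = {label: logits[tok.convert_tokens_to_ids(word)].item()
          for label, word in answer_space.items()}
print(max(scores, key=scores.get))   # label whose answer word the LM prefers
```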

Multi-Prompt

Prompt Ensembling


  • Uniform averaging: select the K prompts that achieve the highest accuracy on the training set, then average the log probabilities obtained from those top-K prompts to compute the probability of a single token at the [Z] position (sketched after this list)
  • Weighted averaging
  • Majority voting
  • Knowledge distillation: after ensembling multiple prompts/models, the knowledge of the ensemble can be distilled into a single model
  • Prompt ensembling for text generation: generate the output based on the ensembled probability of the next word in the answer sequence
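A minimal sketch of uniform averaging, assuming the same masked-LM scoring setup as in the answer-engineering sketch above; the three templates and the answer words are invented for illustration.

```python
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
mlm = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

def answer_logprob(template: str, x: str, answer: str) -> float:
    """Log-probability of `answer` at the mask position of one filled template."""
    enc = tok(template.format(x=x, mask=tok.mask_token), return_tensors="pt")
    pos = (enc.input_ids == tok.mask_token_id).nonzero()[0, 1]
    with torch.no_grad():
        logp = mlm(**enc).logits[0, pos].log_softmax(-1)
    return logp[tok.convert_tokens_to_ids(answer)].item()

templates = ["{x} It was {mask}.", "{x} A really {mask} movie.", "{x} Overall {mask}."]
review = "I love this movie."
# uniform average of the K per-prompt log-probabilities for each answer word
scores = {w: sum(answer_logprob(t, review, w) for t in templates) / len(templates)
          for w in ("great", "terrible")}
print(max(scores, key=scores.get))
```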

Prompt Augmentation


  • Sample Selection: how do we choose the most effective examples?
  • Sample Ordering: how do we properly order the chosen examples? (a selection/ordering sketch follows this list)
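A hedged sketch of how selection and ordering might be wired together; the nearest-neighbour selection heuristic, the ordering rule, and the similarity function are illustrative stand-ins, not the survey's prescription.

```python
from typing import Callable, List, Tuple

def build_few_shot_prompt(
    x: str,
    pool: List[Tuple[str, str]],              # candidate (input, answer) demonstrations
    similarity: Callable[[str, str], float],  # any text-similarity function
    k: int = 3,
) -> str:
    """Build an augmented prompt from k answered demonstrations (sketch)."""
    # Sample Selection: pick the k pool examples most similar to the query x
    chosen = sorted(pool, key=lambda ex: similarity(x, ex[0]), reverse=True)[:k]
    # Sample Ordering: least-similar first, so the closest demonstration sits
    # right next to the query (one heuristic among several in the literature)
    chosen.reverse()
    demos = "\n".join(f"Review: {xi} Sentiment: {zi}" for xi, zi in chosen)
    return f"{demos}\nReview: {x} Sentiment:"

# toy usage with a trivial word-overlap similarity (illustrative only)
overlap = lambda a, b: len(set(a.lower().split()) & set(b.lower().split()))
pool = [("Great acting.", "positive"), ("Dull plot.", "negative"),
        ("I loved it.", "positive"), ("Waste of time.", "negative")]
print(build_few_shot_prompt("I loved the acting.", pool, overlap, k=2))
```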

Prompt Composition

Compose a complete prompt from multiple sub-prompts, e.g., per-entity sub-prompts combined into one prompt for relation extraction.

Prompt Decomposition

Break a holistic prompt for one input into several sub-prompts that are answered separately, e.g., predicting the type of each span in named entity recognition.

Training Strategies for Prompting Methods

Promptless Fine-tuning

BERT, RoBERTa

parameters of the pre-trained LM will be updated via gradients induced from downstream training samples

Disadvantages: LMs may overfit or not learn stably on smaller datasets.

Tuning-free Prompting

LAMA, GPT-3

directly generate answers based only on a prompt, without changing the parameters of the pre-trained LM

Disadvantages: requires heavy engineering of the prompt

Fixed-LM Prompt Tuning

Prefix-Tuning and Prompt-Tuning

The language model's parameters are left unchanged; a prompt is added and extra trainable parameters are introduced in the prompt portion. Only the prompt parameters are trained (see the sketch below).
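A minimal sketch of this setup, assuming a HuggingFace masked LM: only the soft-prompt tensor goes into the optimizer, so the frozen LM never receives updates.

```python
import torch
from transformers import AutoModelForMaskedLM

lm = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
for p in lm.parameters():
    p.requires_grad = False              # the LM stays fixed throughout training

# trainable soft-prompt vectors (length 20 is an illustrative choice)
prompt = torch.nn.Parameter(torch.randn(20, lm.config.hidden_size) * 0.02)
optimizer = torch.optim.AdamW([prompt], lr=1e-3)   # optimize the prompt only

# during training, `prompt` would be prepended to the input embeddings
# (as in the SoftPrompt sketch above) before each forward pass
```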

Disadvantages: Not applicable in zero-shot scenarios. While effective in few-shot scenarios, representation power is limited in large-data settings. Prompt engineering through choice of hyperparameters or seed prompts is necessary. Prompts are usually not human-interpretable or manipulable.

Fixed-prompt LM Tuning

PET-TC, PET-Gen, and LM-BFF

The language model's parameters are trained while the prompt parameters stay fixed, the opposite of the previous method.

Disadvantages: Template or answer engineering are still required, although perhaps not as much as without prompting. LMs fine-tuned on one downstream task may not be effective on another one.

Prompt+LM Tuning

PADA and P-Tuning

All parameters (both prompt and LM) are fine-tuned.

Disadvantages: Requires training and storing all parameters of the models. May overfit to small datasets.

Meta-Application

  • Domain Adaptation

    adapting a model from one domain (e.g., news text) to another (e.g., social media text)

  • Debiasing

    perform self-diagnosis and self-debiasing based on biased or debiased instructions

  • Dataset Construction

Challenges

Selection of Pre-trained LMs

Prompt Design

  • Prompt design for tasks beyond classification and generation, such as information extraction and text analysis

  • Prompting with Structured Information

  • Entanglement of Template and Answer

    How to simultaneously search or learn for the best combination of template and answer

Prompt Answer Engineering

  • Many-class Classification Tasks

    When there are too many classes, how to select an appropriate answer space

  • Long-answer Classification Tasks

    how to best decode multiple tokens using LMs

  • Multiple Answers for Generation Tasks

    How to better guide the learning process with multiple references

Selection of Tuning Strategy

Multiple Prompt Learning

Theoretical and Empirical Analysis of Prompting

Transferability of Prompts

Combination of Different Paradigms

Calibration of Prompting Methods
