[Paper Notes] Pre-train, Prompt, and Predict

Pre-train, Prompt, and Predict: A Systematic Survey of Prompting Methods in Natural Language Processing

Prompt Template Engineering

Prompt shape

  • Cloze prompts (e.g., "I love this movie, it is a [Z] movie"): for tasks that are solved using masked LMs
  • Prefix prompts (e.g., "I love this movie. What's the sentiment of the review? [Z]"): for generation tasks

For tasks with multiple inputs, such as text-pair classification, prompt templates must contain slots for two or more inputs, [X1] and [X2]; a minimal template-filling sketch follows.
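As a concrete illustration of these slot conventions, here is a minimal Python sketch; the template strings and the `fill` helper are invented for illustration, and only the [X]/[Z] slot notation comes from the survey.

```python
# Minimal sketch of the two prompt shapes using plain string templates.
# [Z] marks the answer slot; the templates and helper are illustrative only.

CLOZE = "{x} It is a [Z] movie."                        # answer slot mid-text (masked LMs)
PREFIX = "{x} What's the sentiment of the review? [Z]"  # answer slot at the end (generation)
PAIR = "{x1} ? [Z] , {x2}"                              # two-input template, e.g. for NLI

def fill(template: str, **inputs) -> str:
    """Instantiate a prompt template with the task inputs."""
    return template.format(**inputs)

print(fill(CLOZE, x="I love this movie."))
# I love this movie. It is a [Z] movie.
print(fill(PAIR, x1="A man is sleeping.", x2="A person is resting."))
# A man is sleeping. ? [Z] , A person is resting.
```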

Manual Template Engineering

Automated Template Learning

Discrete Prompts (hard prompts)
  • Prompt Mining: mine a corpus for the middle words that bridge input x and output y, yielding templates of the form "[X] middle words [Z]"
  • Prompt Paraphrasing: paraphrase an existing seed prompt into a set of candidate prompts
  • Gradient-based Search
  • Prompt Generation: treat prompt construction itself as a text generation task
  • Prompt Scoring
Continuous Prompts (soft prompts) (not fully clear to me)
  • Prefix Tuning: keep the language model (LM) parameters frozen and prepend a sequence of task-specific trainable vectors to the model input (see the sketch after this list)
  • Tuning Initialized with Discrete Prompts: initialize the continuous prompt with a discrete prompt found by search
  • Hard-Soft Prompt Hybrid Tuning
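A minimal PyTorch sketch of the prefix/soft-prompt idea, assuming a HuggingFace-style LM that accepts `inputs_embeds`; the class name, prompt length, and initialization scale are illustrative choices, not any paper's exact recipe.

```python
import torch
import torch.nn as nn

class SoftPrompt(nn.Module):
    """Prepend trainable prompt vectors to a frozen LM (illustrative sketch)."""
    def __init__(self, lm: nn.Module, n_prompt_tokens: int = 20, d_model: int = 768):
        super().__init__()
        self.lm = lm
        for p in self.lm.parameters():   # keep the pre-trained LM fixed
            p.requires_grad = False
        # the only trainable parameters: a short sequence of prompt vectors
        self.prompt = nn.Parameter(torch.randn(n_prompt_tokens, d_model) * 0.02)

    def forward(self, input_embeds: torch.Tensor):
        # prepend the same soft prompt to every sequence in the batch
        batch = input_embeds.size(0)
        prompt = self.prompt.unsqueeze(0).expand(batch, -1, -1)
        # assumes a HuggingFace-style LM that accepts inputs_embeds
        return self.lm(inputs_embeds=torch.cat([prompt, input_embeds], dim=1))
```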

Prompt Answer Engineering

Answer Shape

  • Tokens
  • Span
  • Sentence

Answer Space Design Methods

Manual Design
  • Unconstrained Spaces: the answer z is directly the output y (no mapping needed)
  • Constrained Spaces: calculate the probability of an output among multiple choices (see the scoring sketch after this list)
Discrete Answer Search
  • Answer Paraphrasing
  • Prune-then-Search (e.g., prune to the top-k candidate answers, then search)
  • Label Decomposition: decompose each relation label into its constituent words and use them as an answer.
Continuous Answer Search (?)
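For the constrained-space case above, a hedged sketch of scoring a small verbalizer set with a masked LM; the model name and the label-word mapping are illustrative choices, not taken from the paper.

```python
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
mlm = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

# verbalizer: class label y -> answer word z (an illustrative mapping)
answer_space = {"positive": "great", "negative": "terrible"}
prompt = f"I love this movie. It was a {tok.mask_token} movie."

enc = tok(prompt, return_tensors="pt")
mask_pos = (enc.input_ids == tok.mask_token_id).nonzero()[0, 1]
with torch.no_grad():
    logits = mlm(**enc).logits[0, mask_pos]

# score each constrained choice by its logit at the mask position
scores = {label: logits[tok.convert_tokens_to_ids(word)].item()
          for label, word in answer_space.items()}
print(max(scores, key=scores.get))   # label whose answer word the LM prefers
```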

Multi-Prompt

Prompt Ensembling


  • Uniform averaging: select the K prompts that achieve the highest accuracy on the training set, then average the log probabilities obtained from those top-K prompts to compute the probability of a single token at the [Z] position (sketched after this list)
  • Weighted averaging
  • Majority voting
  • Knowledge distillation: after ensembling multiple prompts/models, the knowledge of the ensemble can be distilled into a single model
  • Prompt ensembling for text generation: generate the output based on the ensembled probability of the next word in the answer sequence
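A minimal sketch of uniform averaging, assuming the same masked-LM scoring setup as in the answer-engineering sketch above; the three templates and the answer words are invented for illustration.

```python
import torch
from transformers import AutoTokenizer, AutoModelForMaskedLM

tok = AutoTokenizer.from_pretrained("bert-base-uncased")
mlm = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")

def answer_logprob(template: str, x: str, answer: str) -> float:
    """Log-probability of `answer` at the mask position of one filled template."""
    enc = tok(template.format(x=x, mask=tok.mask_token), return_tensors="pt")
    pos = (enc.input_ids == tok.mask_token_id).nonzero()[0, 1]
    with torch.no_grad():
        logp = mlm(**enc).logits[0, pos].log_softmax(-1)
    return logp[tok.convert_tokens_to_ids(answer)].item()

templates = ["{x} It was {mask}.", "{x} A really {mask} movie.", "{x} Overall {mask}."]
review = "I love this movie."
# uniform average of the K per-prompt log-probabilities for each answer word
scores = {w: sum(answer_logprob(t, review, w) for t in templates) / len(templates)
          for w in ("great", "terrible")}
print(max(scores, key=scores.get))
```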

Prompt Augmentation


  • Sample Selection: how do we choose the most effective examples?
  • Sample Ordering: how do we properly order the chosen examples? (a selection/ordering sketch follows this list)
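A hedged sketch of how selection and ordering might be wired together; the nearest-neighbour selection heuristic, the ordering rule, and the similarity function are illustrative stand-ins, not the survey's prescription.

```python
from typing import Callable, List, Tuple

def build_few_shot_prompt(
    x: str,
    pool: List[Tuple[str, str]],              # candidate (input, answer) demonstrations
    similarity: Callable[[str, str], float],  # any text-similarity function
    k: int = 3,
) -> str:
    """Build an augmented prompt from k answered demonstrations (sketch)."""
    # Sample Selection: pick the k pool examples most similar to the query x
    chosen = sorted(pool, key=lambda ex: similarity(x, ex[0]), reverse=True)[:k]
    # Sample Ordering: least-similar first, so the closest demonstration sits
    # right next to the query (one heuristic among several in the literature)
    chosen.reverse()
    demos = "\n".join(f"Review: {xi} Sentiment: {zi}" for xi, zi in chosen)
    return f"{demos}\nReview: {x} Sentiment:"

# toy usage with a trivial word-overlap similarity (illustrative only)
overlap = lambda a, b: len(set(a.lower().split()) & set(b.lower().split()))
pool = [("Great acting.", "positive"), ("Dull plot.", "negative"),
        ("I loved it.", "positive"), ("Waste of time.", "negative")]
print(build_few_shot_prompt("I loved the acting.", pool, overlap, k=2))
```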

Prompt Composition

Compose a complete prompt from multiple sub-prompts, e.g., per-entity sub-prompts combined into one prompt for relation extraction.

Prompt Decomposition

Break a holistic prompt for one input into several sub-prompts that are answered separately, e.g., predicting the type of each span in named entity recognition.

Training Strategies for Prompting Methods

Promptless Fine-tuning

BERT, RoBERTa

parameters of the pre-trained LM will be updated via gradients induced from downstream training samples

Disadvantages: LMs may overfit or not learn stably on smaller datasets.

Tuning-free Prompting

LAMA, GPT-3

directly generate answers based only on a prompt, without changing the parameters of the pre-trained LM

Disadvantages: requires heavy engineering of the prompt

Fixed-LM Prompt Tuning

Prefix-Tuning and Prompt-Tuning

The language model's parameters are left unchanged; a prompt is added and extra trainable parameters are introduced in the prompt portion. Only the prompt parameters are trained (see the sketch below).
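A minimal sketch of this setup, assuming a HuggingFace masked LM: only the soft-prompt tensor goes into the optimizer, so the frozen LM never receives updates.

```python
import torch
from transformers import AutoModelForMaskedLM

lm = AutoModelForMaskedLM.from_pretrained("bert-base-uncased")
for p in lm.parameters():
    p.requires_grad = False              # the LM stays fixed throughout training

# trainable soft-prompt vectors (length 20 is an illustrative choice)
prompt = torch.nn.Parameter(torch.randn(20, lm.config.hidden_size) * 0.02)
optimizer = torch.optim.AdamW([prompt], lr=1e-3)   # optimize the prompt only

# during training, `prompt` would be prepended to the input embeddings
# (as in the SoftPrompt sketch above) before each forward pass
```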

Disadvantages: Not applicable in zero-shot scenarios. While effective in few-shot scenarios, representation power is limited in large-data settings. Prompt engineering through choice of hyperparameters or seed prompts is necessary. Prompts are usually not human-interpretable or manipulable.

Fixed-prompt LM Tuning

PET-TC, PET-Gen, and LM-BFF

The language model's parameters are trained while the prompt parameters stay fixed, the opposite of the previous method.

Disadvantages: Template or answer engineering are still required, although perhaps not as much as without prompting. LMs fine-tuned on one downstream task may not be effective on another one.

Prompt+LM Tuning

PADA and P-Tuning

All parameters (both prompt and LM) are fine-tuned.

Disadvantages: Requires training and storing all parameters of the models. May overfit to small datasets.

Meta-Application

  • Domain Adaptation

    adapting a model from one domain (e.g., news text) to another (e.g., social media text)

  • Debiasing

    perform self-diagnosis and self-debiasing based on biased or debiased instructions

  • Dataset Construction

Challenges

Selection of Pre-trained LMs

Prompt Design

  • Prompt design for tasks beyond classification and generation, such as information extraction and text analysis

  • Prompting with Structured Information

  • Entanglement of Template and Answer

    How to simultaneously search or learn for the best combination of template and answer

Prompt Answer Engineering

  • Many-class Classification Tasks

    When there are too many classes, how to select an appropriate answer space

  • Long-answer Classification Tasks

    how to best decode multiple tokens using LMs

  • Multiple Answers for Generation Tasks

    How to better guide the learning process with multiple references

Selection of Tuning Strategy

Multiple Prompt Learning

Theoretical and Empirical Analysis of Prompting

Transferability of Prompts

Combination of Different Paradigms

Calibration of Prompting Methods
