Scaling instruction-finetuned models
This paper introduces FLAN (Fine-tuned LAnguage Net), an instruction finetuning method, and presents the results of its application. The study demonstrates that by fine-tuning the 540B PaLM model on 1,836 tasks while incorporating chain-of-thought (CoT) reasoning data, FLAN achieves improvements in generalization, human usability, and zero-shot reasoning over the base model. The paper also provides detailed information on how each of these aspects was evaluated.
Here is the image from the lecture slides that illustrates the fine-tuning tasks and datasets used to train FLAN. The task selection expands on previous work by incorporating dialogue and program synthesis tasks from Muffin and integrating them with new chain-of-thought reasoning tasks. It also includes subsets of other task collections, such as T0 and Natural Instructions v2. Some tasks were held out during training and later used to evaluate the model's performance on unseen tasks.
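To make the data side concrete, here is a minimal sketch (not the paper's actual pipeline; the function and template wording are hypothetical) of how a single task example might be rendered into an instruction-tuning input/target pair, with and without a chain-of-thought rationale:

```python
from typing import Optional


def format_example(instruction: str, answer: str,
                   rationale: Optional[str] = None) -> dict:
    """Build an input/target pair for instruction finetuning.

    When a rationale is supplied, the prompt asks the model to reason
    step by step and the target includes the rationale before the
    answer, mirroring how CoT finetuning data is typically formatted.
    (Illustrative only; templates here are assumptions.)
    """
    if rationale is None:
        prompt = f"{instruction}\nAnswer:"
        target = answer
    else:
        prompt = f"{instruction}\nLet's think step by step."
        target = f"{rationale} So the answer is {answer}."
    return {"input": prompt, "target": target}


# Plain instruction example
plain = format_example("What is 7 + 5?", "12")

# Chain-of-thought example: same task, rationale included in the target
cot = format_example("What is 7 + 5?", "12",
                     rationale="7 plus 5 equals 12.")
```

Mixing both styles in the finetuning set is what lets the model answer directly when prompted plainly, yet produce step-by-step reasoning when asked, which is the behavior the paper evaluates.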