AI&大模型 | llama2微调手册查看ing-CSDN博客

本文链接：https://blog.csdn.net/weixin_43236007/article/details/135160139

本文介绍了Llama2模型，它是经过2万亿标记训练和额外人类注释增强的版本，具有4096长度的上下文和改进的Transformer架构。文章详细讲述了模型的训练过程，包括预训练、监督微调和人类反馈强化学习(RLHF)的应用。

摘要生成于 C知道，由 DeepSeek-R1 满血版支持，前往体验 >

提示词说明

<s>[INST] <<SYS>>
{{ system_prompt }}
<</SYS>>

{{ user_message }} [/INST]

其中，

<s> ，<\s>，<<SYS>>，<</SYS>>，[INST]，以及[/INST]是特殊token，

标记着prompt中各个部分的构成。

每一组<s>和</s>之间是一个相对完整的单元，可以理解为一个对话轮次（如果直接给一个文本作为输入，也可以看到模型的输入结果分别是以这两个BOS和EOS token作为结尾的）。

[INST]和[/INST]用于区分在当前这一轮的对话（历史）中，用户输入的部分与模型返回的部分。位于[INST]之后，/[INST]之前的文本，是用户在. 这一轮次(<s></s>包含的文本)对话中所输入的query，而/[INST]之后的文本，是模型针对这一query所作出的回答。

在对话中的第一组单元，可以提供整个对话的背景信息，并以<<SYS>>和<</SYS>>作为特殊标记，位于它们之间的，是对话的背景信息,类似instruction。

{{ system_prompt }}部分是整个对话中的通用前缀，一般用来给模型提供一个身份，作为对话的大背景。

{{ user_message }}部分是用户所提供的信息，可以理解为多轮对话中其中一轮对话的内容。

<s>[INST] <<SYS>>
You are a helpful, respectful and honest assistant. Always answer as helpfully as 
possible, while being safe.  Your answers should not include any harmful, unethical, 
racist, sexist, toxic, dangerous, or illegal content. Please ensure that your 
responses are socially unbiased and positive in nature.

If a question does not make any sense, or is not factually coherent, explain why instead of answering something not correct. If you don't know the answer to a question, please don't share false information.
<</SYS>>

There's a llama in my garden   What should I do? [/INST]

对于多轮，有这个例子

<s>[INST] <<SYS>>

You are are a helpful... bla bla.. assistant

<</SYS>>

Hi there! [/INST] Hello! How can I help you today? </s><s>[INST] What is a neutron 
star? [/INST] A neutron star is a ... </s><s> [INST] Okay cool, thank you! [/INST]