LLaMA2 Fine-Tuning in Practice: Preparing Data for LLaMA2 Fine-Tuning (CSDN blog)

Article link: https://blog.csdn.net/rensihui/article/details/131975334

LLaMA2-SFT

LLaMA2-SFT: Llama-2-7B fine-tuning (transformers), LoRA (peft), and inference

GitHub address

prompt

# data_point is one training record: a dict with 'instruction' and, optionally, 'input'.
# This version uses the full default Llama-2 system prompt; {0} is filled in by str.format.
text_1 = (
    "[INST] <<SYS>>\n    "
    "You are a helpful, respectful and honest assistant. "
    "Always answer as helpfully as possible, while being safe."
    " Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, "
    "or illegal content. Please ensure that your responses are socially unbiased and positive in nature.\n\n"
    "    If a question does not make any sense, or is not factually coherent, "
    "explain why instead of answering something not correct. "
    "If you don't know the answer to a question, please don't share false information.\n"
    "<</SYS>>\n\n{0} [/INST] "
).format(
    data_point.get('instruction', '').strip() + "\t" + data_point.get('input', '').strip())
    
    
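We shorten it to:
# Plain strings, not f-strings: with an f prefix, {0} is evaluated as the literal 0
# before .format() runs, so the instruction text would never be inserted.
text_1 = ("[INST] <<SYS>>\n    You are a helpful, respectful and honest assistant.<</SYS>>"
          "\n\n{0} [/INST] ").format(
    data_point.get('instruction', '').strip() + "\t" + data_point.get('input', '').strip())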
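
The shortened template can be wrapped in a small helper so that it is easy to check on a single record. This is only a minimal sketch: the names `build_prompt` and `SYSTEM_PROMPT` and the sample record are made up for illustration and are not taken from the repository. The layout (system prompt inside `<<SYS>>`, instruction and input joined by a tab) follows the snippet above.

```python
# Hypothetical helper; the name and signature are assumptions, not the repository's API.
SYSTEM_PROMPT = "You are a helpful, respectful and honest assistant."


def build_prompt(data_point: dict, system_prompt: str = SYSTEM_PROMPT) -> str:
    """Build a Llama-2 chat style prompt from one training record."""
    instruction = (data_point.get("instruction") or "").strip()
    extra_input = (data_point.get("input") or "").strip()
    # Same layout as the post: instruction and input are joined by a tab.
    user_text = instruction + "\t" + extra_input
    # Plain concatenation avoids any brace handling in the user text.
    return "[INST] <<SYS>>\n    " + system_prompt + "<</SYS>>\n\n" + user_text + " [/INST] "


# Made-up sample record, for illustration only.
example = {"instruction": "Translate to French:", "input": "Good morning"}
prompt = build_prompt(example)
print(repr(prompt))
```

Printing the `repr` makes the newlines and the tab visible, which is a quick way to confirm that the instruction text, rather than a literal `0`, ends up before `[/INST]`.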

Pitfalls encountered (as of 2023-07-27)

1. LLaMA2 weights must not be stored in fp16 (use fp32 or bf16 instead); otherwise values overflow.
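
The range difference behind this pitfall can be illustrated with the standard library alone. The sketch below is not the repository's code: `fits_in_fp16` and `to_bf16` are hypothetical helpers, and bf16 is emulated by truncating the float32 encoding to its top 16 bits (real hardware typically rounds instead). fp16 tops out at 65504, while bf16 keeps the 8-bit exponent of fp32 and trades away precision instead of range.

```python
import struct

FP16_MAX = 65504.0  # largest finite IEEE 754 half-precision value


def fits_in_fp16(value: float) -> bool:
    """Return True if value can be packed as a finite half-precision float."""
    try:
        struct.pack("<e", value)  # 'e' is the IEEE 754 binary16 format
        return True
    except OverflowError:
        return False


def to_bf16(value: float) -> float:
    """Emulate bfloat16 by keeping the top 16 bits of the float32 encoding.

    Assumption: plain truncation, no rounding; this is for illustration only.
    """
    top_two_bytes = struct.pack(">f", value)[:2]
    return struct.unpack(">f", top_two_bytes + b"\x00\x00")[0]


big = 70000.0
print(fits_in_fp16(FP16_MAX))  # at the fp16 limit
print(fits_in_fp16(big))       # beyond the fp16 limit
print(to_bf16(big))            # still finite in bf16, with reduced precision
```

When loading the model with transformers, the dtype is usually chosen through the `torch_dtype` argument of `from_pretrained` (for example `torch.bfloat16`); whether the repository's config.py does it this way is not stated in the post.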

Fine-tuning example

Path: llama2_sft/ft_llama2

Config: llama2_sft/ft_llama2/config.py
Training: python train.py
Inference: python predict.py
Evaluation: python evaluation.py
API service: python post_api.py