Learn LoRA Fine-Tuning of the GLM-4-9B-Chat Model in One Article (Part 2)!

Latest recommended article published on 2024-08-31 23:13:29

longfei.li

Tags: artificial intelligence, neural networks

Article link: https://blog.csdn.net/qq_25893567/article/details/141730385

Introduction

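The previous article covered the steps and key parameters for fine-tuning the "GLM-4-9B-Chat" model with Zhipu's open-source GLM-4 code. The fine-tuned result cannot be deployed as a standalone model as-is, though: the fine-tuned weights must first be merged into the base model before deployment and inference. This article shows how to validate the fine-tuned model, then how to merge and deploy it.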

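If you have not read the previous article yet, you may want to go through it first: Learn LoRA Fine-Tuning of the GLM-4-9B-Chat Model in One Article (Part 1)!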

Model Validation

After fine-tuning as described in the previous article, you will see an "output" folder like the following:

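[Image: contents of the output folder]

The checkpoint-xxx folders hold the weights saved during fine-tuning. Note that each checkpoint-xxx continues training from the weights of the previous one; for example, checkpoint-150 is obtained by training further from checkpoint-100.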
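To see which checkpoints are available, a minimal sketch like the one below lists them in step order. It assumes the folders follow the `checkpoint-<step>` naming described above; the helper name `list_checkpoints` is hypothetical and not part of the GLM-4 repository.

```python
import re
from pathlib import Path


def list_checkpoints(output_dir):
    """Return (step, path) pairs for checkpoint-<step> folders, sorted by step."""
    pattern = re.compile(r"^checkpoint-(\d+)$")
    found = []
    for child in Path(output_dir).iterdir():
        match = pattern.match(child.name)
        # Only keep directories whose name is exactly checkpoint-<number>
        if child.is_dir() and match:
            found.append((int(match.group(1)), child))
    # Sort numerically so checkpoint-1000 comes after checkpoint-200
    return sorted(found, key=lambda item: item[0])
```

Calling `list_checkpoints("output")` returns the checkpoints from the earliest to the latest training step, which makes it easy to pick one for validation.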

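Any of these checkpoints can be used for validation; here we test with the weights from checkpoint-200. Zhipu also provides ready-made validation code, so the following command is all that is needed: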

python inference.py output/checkpoint-200/

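Here "output/checkpoint-200/" is the location of the fine-tuned weights; change it as needed. One more point: if the base model has been moved since fine-tuning, edit the "adapter_config.json" file in the checkpoint-xxx folder before validating:

[Image: the adapter_config.json file]

Set the value of "base_model_name_or_path" to the actual location of the base model on the server. After a successful run, the script prints the inference output.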
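The adapter_config.json edit described above can also be scripted. The following is a minimal sketch using only the standard library; the helper name `set_base_model_path` and the example path are hypothetical placeholders, and the only assumption about the file is that it is JSON with a "base_model_name_or_path" key, as described above.

```python
import json
from pathlib import Path


def set_base_model_path(checkpoint_dir, new_base_path):
    """Point adapter_config.json at the base model's current location.

    Returns the previous value so the change can be logged or reverted.
    """
    config_file = Path(checkpoint_dir) / "adapter_config.json"
    config = json.loads(config_file.read_text(encoding="utf-8"))
    old_path = config.get("base_model_name_or_path")
    config["base_model_name_or_path"] = new_base_path
    # Write the file back, leaving all other keys untouched
    config_file.write_text(
        json.dumps(config, indent=2, ensure_ascii=False), encoding="utf-8"
    )
    return old_path
```

For example, `set_base_model_path("output/checkpoint-200", "/path/to/base-model")` would update the checkpoint used in the command above, with `/path/to/base-model` replaced by the real location.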

Model Merging

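To deploy the validated model, the fine-tuned weights still have to be merged into the base model. The merge mainly relies on the PEFT library's "merge_and_unload()" method; once merged, the new weights and model files are saved to disk. The merge code is as follows: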

import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Path to the base model
base_model_path = "/root/shared-storage/glm-4"
# Path to the fine-tuned checkpoint
checkpoint_path = "/root/workspace/GLM-4-main/finetune_demo/output/checkpoint-2500"
# Output path for the merged model
output_path = "/root/workspace/GLM-4-main/finetune_demo/merge_output"


def apply_lora(model_name_or_path, output_path, lora_path):
    print(f"Loading the base model from {model_name_or_path}")
    base = AutoModelForCausalLM.from_pretrained(
        model_name_or_path, torch_dtype=torch.float16, low_cpu_mem_usage=True, trust_remote_code=True
    )

    print(f"Loading the LoRA adapter from {lora_path}")
    lora_model = PeftModel.from_pretrained(
        base,
        lora_path,
        torch_dtype=torch.float16,
    )

    # Merge the LoRA weights into the base model and drop the adapter wrappers
    print("Applying the LoRA")
    model = lora_model.merge_and_unload()

    print(f"Saving the target model to {output_path}")
    model.save_pretrained(output_path)

    print(f"Loading the tokenizer from {model_name_or_path}")
    base_tokenizer = AutoTokenizer.from_pretrained(
        model_name_or_path, use_fast=True, padding_side="left", trust_remote_code=True
    )

    print(f"Saving the tokenizer to {output_path}")
    base_tokenizer.save_pretrained(output_path)
    print(f"Updated model and tokenizer saved to {output_path}")


# Call the function to merge and save the model
apply_lora(base_model_path, output_path, checkpoint_path)
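To make the merge step more concrete, here is a toy sketch of the arithmetic that standard LoRA merging performs for one linear layer: the low-rank update is scaled and added to the base weight, W' = W + (lora_alpha / r) * B A. This is a conceptual illustration in plain Python, not PEFT's actual implementation; the function names and all numbers are made up for the example.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def merge_lora(weight, lora_a, lora_b, lora_alpha, r):
    """Return weight + (lora_alpha / r) * (lora_b @ lora_a)."""
    scale = lora_alpha / r
    delta = matmul(lora_b, lora_a)
    return [
        [w + scale * d for w, d in zip(w_row, d_row)]
        for w_row, d_row in zip(weight, delta)
    ]


# Toy 2x2 layer with a rank-1 adapter (illustrative values only)
weight = [[1.0, 0.0], [0.0, 1.0]]
lora_a = [[1.0, 2.0]]      # shape (r, in_features) = (1, 2)
lora_b = [[0.5], [0.25]]   # shape (out_features, r) = (2, 1)

merged = merge_lora(weight, lora_a, lora_b, lora_alpha=2, r=1)
print(merged)  # [[2.0, 2.0], [0.5, 2.0]]
```

After merging, the layer is an ordinary weight matrix again, which is why the saved model in the output path no longer needs the adapter files at inference time.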