Deploying Qwen-7B-Int4 on an AutoDL server and fine-tuning it (notes for my own reference)

weixin_73891211

Last modified: 2024-06-21 16:04:40

First published: 2024-06-18 20:14:36

Article link: https://blog.csdn.net/weixin_73891211/article/details/139780616


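This post records what I learned from the Bilibili video "【实战】通义千问1.8B大模型微调，实现天气预报功能" (hands-on fine-tuning of the Qwen 1.8B model to implement a weather forecast feature). If you believe this post infringes your rights, please contact me and I will revise or delete it.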

Initial environment: RTX 3090 GPU with 24 GB of VRAM, PyTorch 2.1.2 (Ubuntu 22.04)

GitHub repository for Qwen-7B-Int4: QwenLM/Qwen, the official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.

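I recommend downloading the repository to your local machine, uploading it to Aliyun Drive, and then pulling it into the AutoDL instance. See the AutoDL help documentation for the exact steps.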

Install the prerequisite libraries:

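# Be sure to run this line before the pip installs below, otherwise you will hit odd errors
conda install mpi4py  # (installing it with pip failed for me, and I have not found a working fix)

# The Tsinghua mirror is recommended for the installs below because it downloads faster, but it is optional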

pip install transformers -i https://pypi.tuna.tsinghua.edu.cn/simple/
pip install modelscope
pip install tiktoken
pip install transformers_stream_generator
pip install accelerate
pip install optimum
pip install auto-gptq
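pip install deepspeed
pip install peft  # imported later in this post (AutoPeftModelForCausalLM) and needed for QLoRA finetuning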
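
Before moving on, it can help to confirm that every library is actually importable in the active environment. The following is a minimal sketch using only the standard library; the list REQUIRED_MODULES and the helper find_missing are names made up for this illustration, and the list assumes the import names match the packages above (auto-gptq is imported as auto_gptq).

```python
import importlib.util

# Import names for the packages installed above (assumed; pip names can differ,
# e.g. the pip package "auto-gptq" is imported as "auto_gptq").
REQUIRED_MODULES = [
    "mpi4py", "transformers", "modelscope", "tiktoken",
    "transformers_stream_generator", "accelerate", "optimum",
    "auto_gptq", "deepspeed", "peft",
]


def find_missing(modules):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


missing = find_missing(REQUIRED_MODULES)
print("missing modules:", missing if missing else "none")
```

If anything is listed as missing, install it before loading the model.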

Once the libraries are installed, download the Qwen-7B-Chat-Int4 model locally and load it

from modelscope import snapshot_download
from transformers import AutoModelForCausalLM, AutoTokenizer

# Downloading model checkpoint to a local dir model_dir
# model_dir = snapshot_download('qwen/Qwen-7B')
# model_dir = snapshot_download('qwen/Qwen-7B-Chat')
# model_dir = snapshot_download('qwen/Qwen-14B')
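# '/path' is a placeholder for your own download directory; the ft.sh script later in this post
# expects the model under /root/.cache/modelscope/hub, so keep the two paths consistent
model_dir = snapshot_download('qwen/Qwen-7B-Chat-Int4', cache_dir='/path')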
# print(model_dir)

# Loading local checkpoints
# trust_remote_code is still set as True since we still load codes from local dir instead of transformers
tokenizer = AutoTokenizer.from_pretrained(model_dir, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_dir,
    device_map="auto",
    trust_remote_code=True
).eval()

Run inference

response, history = model.chat(tokenizer, "你好", history=None)
print(response)
# 你好！很高兴见到你。有什么我可以帮助你的吗？ (Hello! Nice to meet you. How can I help you?)

prompt_template='''
首先,你要分析出用户在问什么

交互格式如下:

问题:用户的原始问题,会先告诉你人名、研究方向、职务三类信息中的一种,可能想要知道人名、研究方向、职务三类信息中的一种.
回答:提炼出用户已经告诉你的信息种类和想要知道的信息种类,按以下格式返回:
    已知:值是人名、研究方向、职务中的一种
    想要知道:值是人名、研究方向、职务中的一种

现在开始.

问题:%s
回答:
'''
Q = '我想知道谁是院长'
Q_list=['我想知道谁是院长','请告诉我刘振丙的职务','刘振丙在研究什么','谁在研究人工智能']
for Q in Q_list:
    prompt=prompt_template%(Q,)
    resp,history=model.chat(tokenizer,prompt,history=None)
    print('Q:%s\nA:%s\n'%(Q,resp))
'''Q:我想知道谁是院长
A:已知：职务是“院长”
想知道：人名

Q:请告诉我刘振丙的职务
A:已知：职务
 想要知道：人名

Q:刘振丙在研究什么
A:已知:职务 是 研究员
想知道:研究方向

Q:谁在研究人工智能
A:已知：人名；想知道：研究方向'''

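As the output shows, the model does not follow our instructions well and hallucinates. The template asks it to report which kind of information the user has already given and which kind the user wants, out of person name (人名), research direction (研究方向), and position (职务). For "请告诉我刘振丙的职务" (please tell me Liu Zhenbing's position), it reports the known information as 职务 and the wanted information as 人名, which is backwards. Because this is a 7B-parameter model, the hallucination is not severe; in my tests the 1.8B-parameter model gave answers that did not address the question at all. We therefore generate a large number of samples and train the model on them so that it learns to analyze questions the way we require.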
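
Checking the replies by eye gets tedious, so here is a minimal sketch of a parser that pulls the two categories out of a reply. It assumes the reply loosely follows the "已知 ... 想要知道 ..." pattern seen above; CATEGORIES and parse_answer are hypothetical names introduced only for this illustration, and 工作 is included because the training answers later in this post use that word for position.

```python
import re

# Category words that can appear in a reply (工作 is the label used in the training data).
CATEGORIES = ("人名", "研究方向", "职务", "工作")


def parse_answer(resp):
    """Return (known, wanted) categories found in a reply; a field is None if absent."""
    cat = "|".join(CATEGORIES)
    known = re.search(r"已知[:：\s]*(" + cat + ")", resp)
    wanted = re.search(r"想(?:要)?知道[:：\s]*(" + cat + ")", resp)
    return (known.group(1) if known else None,
            wanted.group(1) if wanted else None)


# Reply quoted from the run above for the question 请告诉我刘振丙的职务.
reply = "已知：职务\n 想要知道：人名"
expected = ("人名", "职务")  # the person is known, the position is wanted
print(parse_answer(reply), "correct" if parse_answer(reply) == expected else "wrong")
```

For this reply the parsed pair is reversed relative to what the question actually asks, which is the error described above.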

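Simple prompt engineering (this is the same template used in the inference test above, and it is reused when generating the training data):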

prompt_template='''
首先,你要分析出用户在问什么

交互格式如下:

问题:用户的原始问题,会先告诉你人名、研究方向、职务三类信息中的一种,可能想要知道人名、研究方向、职务三类信息中的一种.
回答:提炼出用户已经告诉你的信息种类和想要知道的信息种类,按以下格式返回:
    已知:值是人名、研究方向、职务中的一种
    想要知道:值是人名、研究方向、职务中的一种

现在开始.

问题:%s
回答:
'''
Q = '我想知道谁是院长'

The official Qwen-7B GitHub documentation already specifies the standard format for SFT data:

SFT data format required by Qwen:

[
    {
        "id": "identity_0",
        "conversations":[
            {
                "from": "user",
                "value": "你好"
            },
            {
                "from": "assistant",
                "value": "我是一个语言模型,名字叫通义千问."
            }
        ]
    }
]
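
Since a malformed record only shows up once training starts, a quick structural check can save time. The sketch below is my own illustration, not part of the Qwen repo: validate_sft is a hypothetical helper, and the rule that turns alternate between user and assistant is an assumption based on the example above.

```python
def validate_sft(records):
    """Return a list of problems found in the records; an empty list means they look fine."""
    if not isinstance(records, list):
        return ["top level must be a list of records"]
    problems = []
    for idx, rec in enumerate(records):
        if not isinstance(rec, dict) or "id" not in rec:
            problems.append(f"record {idx}: missing 'id'")
            continue
        conv = rec.get("conversations")
        if not isinstance(conv, list) or not conv:
            problems.append(f"record {idx}: 'conversations' must be a non-empty list")
            continue
        for turn_idx, turn in enumerate(conv):
            # Assumption: turns alternate user, assistant, user, ... as in the example.
            expected = "user" if turn_idx % 2 == 0 else "assistant"
            if not isinstance(turn, dict) or turn.get("from") != expected:
                problems.append(f"record {idx}, turn {turn_idx}: expected 'from' == '{expected}'")
            elif not isinstance(turn.get("value"), str):
                problems.append(f"record {idx}, turn {turn_idx}: 'value' must be a string")
    return problems


sample = [{
    "id": "identity_0",
    "conversations": [
        {"from": "user", "value": "你好"},
        {"from": "assistant", "value": "我是一个语言模型,名字叫通义千问."},
    ],
}]
print(validate_sft(sample))  # an empty list means no problems were found
```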

Generate 1,000 random SFT samples from the dataset and save them in the /root directory

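import json
import random

# name_lst, job_lst and search_lst hold the names, positions and research directions from your own dataset.
# The dataset is not shown in this post, so define these three lists before running the loop below.
# prompt_template is the template defined earlier.
# Keys are (type of information wanted, type of information known)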
Q_dic = {
    ('job','name'):'{name}的工作是什么',
    ('name','job'):'谁是{job}',
    ('search','name'):'{name}在研究什么',
    ('name','search'):'谁在研究{search}'
}
Q_lst = [('job','name'), ('name','job'), ('search','name'), ('name','search')]
dic = {'name':'人名', 'job':'工作', 'search':'研究方向'}

train_data = []
for i in range(1000):
    name = name_lst[random.randint(0, len(name_lst)-1)]
    job = job_lst[random.randint(0, len(job_lst)-1)]
    search = search_lst[random.randint(0, len(search_lst)-1)]

    Q=Q_lst[random.randint(0, len(Q_lst)-1)]
    question = Q_dic[Q].format(name=name, job=job, search=search)  # the question
    answer = f'已知{dic[Q[1]]},\n想要知道{dic[Q[0]]}'  # the answer: known type, then wanted type

    example={
        "id": f"identity_{i}",
        "conversations":[
            {
                "from": "user",
                "value": prompt_template%(question,),
            },
            {
                "from": "assistant",
                "value": answer,
            }
        ]
    }  # build one sample
    train_data.append(example)

with open('train.txt', 'w',encoding='utf-8') as fp:
    fp.write(json.dumps(train_data))
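
After writing the file, it is worth a quick look at how the four question types are distributed, since a heavily skewed draw could bias the finetuned model. This is a minimal sketch under the assumption that each record has exactly the two turns built above; load_train_file and answer_distribution are hypothetical helper names.

```python
import json
from collections import Counter


def load_train_file(path):
    """Load the JSON list of SFT records written by the generation script."""
    with open(path, encoding="utf-8") as fp:
        return json.load(fp)


def answer_distribution(train_data):
    """Count how often each assistant answer appears across the records."""
    return Counter(rec["conversations"][1]["value"] for rec in train_data)


# Usage, once train.txt exists in the working directory:
# print(answer_distribution(load_train_file("train.txt")))
```

With four question types drawn uniformly, each answer should appear in roughly a quarter of the 1,000 samples.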

Write an ft.sh script and save it in the Qwen-main folder

#!/bin/bash

bash finetune/finetune_qlora_single_gpu.sh -m /root/.cache/modelscope/hub/qwen/Qwen-7B-Chat-Int4 -d /root/train.txt
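
Because the run takes a long time, a wrong path is an expensive mistake. The following is a rough preflight sketch that checks the two paths passed to the script above; preflight is a hypothetical helper written for this post, not something provided by the Qwen repo.

```python
import json
import os


def preflight(model_dir, data_path):
    """Collect human-readable problems with the model directory and the training file."""
    problems = []
    if not os.path.isdir(model_dir):
        problems.append(f"model directory not found: {model_dir}")
    if not os.path.isfile(data_path):
        problems.append(f"training file not found: {data_path}")
        return problems
    try:
        with open(data_path, encoding="utf-8") as fp:
            data = json.load(fp)
    except json.JSONDecodeError as exc:
        problems.append(f"training file is not valid JSON: {exc}")
        return problems
    if not isinstance(data, list) or not data:
        problems.append("training file must contain a non-empty JSON list")
    return problems


# Paths taken from ft.sh above; adjust them if your layout differs.
issues = preflight("/root/.cache/modelscope/hub/qwen/Qwen-7B-Chat-Int4", "/root/train.txt")
print("\n".join(issues) if issues else "paths look fine")
```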

Switch to the terminal and run

cd Qwen-main
bash ft.sh

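Finetuning takes roughly an hour, and the result is saved to the Qwen-main/output_qwen folder (/root/Qwen-main/output_qwen in this setup). To change the save location, go to the Qwen-main/finetune folder, open finetune_qlora_single_gpu.sh, and change the line --output_dir output_qwen \ to a directory name of your choice.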

Then load the fine-tuned model and run inference

from peft import AutoPeftModelForCausalLM
from transformers import AutoModelForCausalLM, AutoTokenizer
model_dir = '/root/Qwen-main/output_qwen'

tokenizer = AutoTokenizer.from_pretrained(model_dir, trust_remote_code=True)
model = AutoPeftModelForCausalLM.from_pretrained(
    model_dir, # path to the output directory
    device_map="auto",
    trust_remote_code=True
).eval()

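# prompt_template is the same template defined earlier; redefine it if this runs in a fresh session
model.generation_config.top_p=0 # keep only the most probable token at each step, to reduce hallucination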
Q_list=['我想知道谁是院长','请告诉我刘振丙的职务','刘振丙在研究什么','谁在研究人工智能']
for Q in Q_list:
    prompt=prompt_template%(Q,)
    resp,history=model.chat(tokenizer,prompt,history=None)
    print('Q:%s\nA:%s\n'%(Q,resp))
'''
Q:我想知道谁是院长
A:已知工作,
想要知道人名

Q:请告诉我刘振丙的职务
A:已知人名,
想要知道工作

Q:刘振丙在研究什么
A:已知人名,
想要知道研究方向

Q:谁在研究人工智能
A:已知研究方向,
想要知道人名
'''

The output shows that the model now reliably analyzes questions the way we asked, so the finetuning succeeded.
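
As a side note on the top_p=0 setting used in the inference code: the toy function below illustrates why it leaves only the most probable token. It is a simplified pure-Python illustration of nucleus (top-p) filtering, not the code that transformers actually runs; top_p_filter and the min_tokens_to_keep default of 1 are assumptions made for this sketch.

```python
def top_p_filter(probs, top_p, min_tokens_to_keep=1):
    """Return the indices kept by nucleus (top-p) filtering, most probable first."""
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for idx in order:
        # Stop once enough tokens are kept and their total probability reaches top_p.
        if len(kept) >= min_tokens_to_keep and cumulative >= top_p:
            break
        kept.append(idx)
        cumulative += probs[idx]
    return kept


probs = [0.1, 0.6, 0.3]            # toy next-token distribution
print(top_p_filter(probs, 0.8))    # keeps the two most probable tokens
print(top_p_filter(probs, 0))      # keeps only the single most probable token
```

With only one candidate left at every step, sampling behaves like greedy decoding, so the same prompt yields the same reply each time.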
