I. Preparation
1. Download the Qwen code
git clone https://github.com/QwenLM/Qwen.git
2. Change into the Qwen directory
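The clone in step 1 creates a directory named Qwen by default:
cd Qwen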
3. Download the model files
git clone https://www.modelscope.cn/qwen/Qwen-1_8B-Chat-Int4.git
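(ModelScope model repositories store the large weight files with Git LFS; if the clone only fetches small pointer files, install git-lfs and run git lfs pull inside the model directory.)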
II. Model inference
1. Load the model locally with ModelScope
from modelscope import snapshot_download
from transformers import AutoModelForCausalLM, AutoTokenizer
# Downloading model checkpoint to a local dir model_dir
# model_dir = snapshot_download('qwen/Qwen-7B')
# model_dir = snapshot_download('qwen/Qwen-7B-Chat')
# model_dir = snapshot_download('qwen/Qwen-14B')
model_dir = snapshot_download('qwen/Qwen-1_8B-Chat-Int4')
# Loading local checkpoints
# trust_remote_code must stay True: the model code is loaded from the local dir, not from transformers
tokenizer = AutoTokenizer.from_pretrained(model_dir, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_dir,
    device_map="auto",
    trust_remote_code=True
).eval()
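If you already cloned the checkpoint with git in Part I, you can skip snapshot_download and point model_dir at that folder instead; the path below assumes the clone sits in the current directory:
model_dir = "./Qwen-1_8B-Chat-Int4"  # assumed local path from the earlier git clone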
2. Run inference
from modelscope import AutoModelForCausalLM, AutoTokenizer
from modelscope import GenerationConfig
# Alternative models include: "qwen/Qwen-7B-Chat", "qwen/Qwen-14B-Chat"
tokenizer = AutoTokenizer.from_pretrained("qwen/Qwen-1_8B-Chat-Int4", trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained("qwen/Qwen-1_8B-Chat-Int4", device_map="auto", trust_remote_code=True, fp16=True).eval()
model.generation_config = GenerationConfig.from_pretrained("Qwen/Qwen-1_8B-Chat-Int4", trust_remote_code=True) # 可指定不同的生成长度、top_p等相关超参
response, history = model.chat(tokenizer, "你好", history=None)
print(response)
response, history = model.chat(tokenizer, "轻微的脑梗有什么症状?", history=history)
print(response)
response, history = model.chat(tokenizer, "平时预防需要注意些什么", history=history)
print(response)
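The hyperparameters mentioned in the comment above can be overridden on the loaded config before calling chat; the values below are illustrative, not recommendations:
model.generation_config.top_p = 0.8           # nucleus-sampling threshold
model.generation_config.max_new_tokens = 512  # cap on newly generated tokens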
III. Fine-tuning (Q-LoRA)
1. Prepare the data file zy.json (swap in a different dataset to suit your needs; the resulting model behavior will differ; see the format sketch below)
(If the fine-tuning dataset changes, all subsequent steps must be re-run.)
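As a reference for the expected shape, here is a minimal sketch that writes a zy.json in the conversation format used by Qwen's finetune script; the id and dialogue text are illustrative placeholders:
import json

# One training record in Qwen's fine-tuning format: an id plus a list of
# alternating user/assistant turns. The content below is placeholder data.
sample = [
    {
        "id": "identity_0",
        "conversations": [
            {"from": "user", "value": "怎样预防骨质疏松?"},
            {"from": "assistant", "value": "(your target answer here)"}
        ]
    }
]

with open("zy.json", "w", encoding="utf-8") as f:
    json.dump(sample, f, ensure_ascii=False, indent=2)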
2. Fine-tune with the provided shell script, located at finetune/finetune_qlora_single_gpu.sh
(1) Set the model and data paths in the script:
--model_name_or_path /mnt/workspace/Qwen-main/Qwen-1_8B-Chat-Int4 \
--data_path /mnt/workspace/Qwen-main/zy.json \
(2) Run the fine-tune: bash finetune/finetune_qlora_single_gpu.sh (the script simply invokes python finetune.py with the arguments above)
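When the run finishes, the adapter and tokenizer files are written to the output directory configured in the script (output_qwen here, which is the path loaded in step 3 below).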
Fine-tuning complete (training-log screenshot omitted).
3. Load the fine-tuned model
from peft import AutoPeftModelForCausalLM
model = AutoPeftModelForCausalLM.from_pretrained(
    'output_qwen',  # path to the fine-tune output directory
    device_map="auto",
    trust_remote_code=True
).eval()
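Note that a Q-LoRA adapter trained on an Int4 base cannot be merged back into the quantized weights, so the checkpoint is always loaded this way, with the adapter applied on top of the base model.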
4. Test the model
from transformers import AutoTokenizer
from peft import AutoPeftModelForCausalLM
tokenizer = AutoTokenizer.from_pretrained("/mnt/workspace/Qwen-main/output_qwen", trust_remote_code=True)
# A Q-LoRA run saves an adapter rather than a full model, so the checkpoint is
# loaded with AutoPeftModelForCausalLM (as in step 3), not AutoModelForCausalLM.
model = AutoPeftModelForCausalLM.from_pretrained(
    "/mnt/workspace/Qwen-main/output_qwen",
    device_map="auto",
    trust_remote_code=True
).eval()
response, history = model.chat(tokenizer, "怎样预防骨质疏松?", history=None)
print(response)
Successful output (screenshot omitted).
Appendix:
Comparison of the fine-tuned model answering in a classical Chinese (古文) style
Before fine-tuning: (screenshot omitted)
After fine-tuning: (screenshot omitted)
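To reproduce such a before/after comparison yourself, a minimal sketch, assuming the base checkpoint id and the output_qwen adapter path used above:
from modelscope import AutoModelForCausalLM, AutoTokenizer
from peft import AutoPeftModelForCausalLM

question = "怎样预防骨质疏松?"  # "How can osteoporosis be prevented?"
tokenizer = AutoTokenizer.from_pretrained("qwen/Qwen-1_8B-Chat-Int4", trust_remote_code=True)

# Base model, before fine-tuning.
base = AutoModelForCausalLM.from_pretrained(
    "qwen/Qwen-1_8B-Chat-Int4", device_map="auto", trust_remote_code=True
).eval()
before, _ = base.chat(tokenizer, question, history=None)

# Fine-tuned adapter applied on top of the same base.
tuned = AutoPeftModelForCausalLM.from_pretrained(
    "output_qwen", device_map="auto", trust_remote_code=True
).eval()
after, _ = tuned.chat(tokenizer, question, history=None)

print("Before fine-tuning:", before)
print("After fine-tuning:", after)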