Inference
In this stage we use the trained model to run inference and produce predictions.
Batch inference script
The evaluation data is stored as JSONL in the format shown below: each line carries an id, a question prompt in which every <image> placeholder corresponds to one frame, and the gold answer gold_ans. The batch inference script reads these files and lets the model predict an answer for each sample. Sample data and the script follow:
{"id": 1, "question": "The following is a conversation between a curious human and AI assistant. The assistant gives helpful, detailed, and polite answers to the user's questions. \nHuman: <image>\nHuman: <image>\nHuman: <image>\nHuman: <image>\nHuman: <image>\nHuman: <image>\nHuman: <image>\nHuman: <image>\nHuman: <image>\nHuman: 袁华接下来要说什么话?\nAI: ", "gold_ans": "没有一点儿三好学生优秀团干部的样子"}
{"id": 2, "question": "The following is a conversation between a curious human and AI assistant. The assistant gives helpful, detailed, and polite answers to the user's questions. \nHuman: <image>\nHuman: <image>\nHuman: <image>\nHuman: <image>\nHuman: <image>\nHuman: <image>\nHuman: <image>\nHuman: <image>\nHuman: <image>\nHuman: 洪思聪接下来要说什么话?\nAI: ", "gold_ans": "听说你们蝙蝠三百六十度全方位感应反应超快啊"}
...
import json
import os

import torch

from pipeline.interface import do_generate
from lora_inference import get_lora_model

if __name__ == '__main__':
    device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

    print('Loading model...')
    ckpt_dir = '/root/zsk/mplug-owl/output/sft_v0.1_ft_grad_ckpt/checkpoint-300'
    model, tokenizer, processor = get_lora_model(pretrained_ckpt=ckpt_dir, use_bf16=True)
    print('Model loaded.')
    model = model.to(device)

    eval_types = ['action', 'mood', 'reason', 'relationship', 'speech']
    eval_file_dir = '/root/zsk/mplug-owl/artificial_test'
    for eval_type in eval_types:
        with open(f'{eval_file_dir}/{eval_type}.jsonl', 'r') as f:
            lines = f.readlines()

        output_file = f'/root/zsk/mplug-owl/our_result1/our_results_{eval_type}.jsonl'
        with open(output_file, 'w', encoding='utf-8') as out_f:
            for line in lines:
                line = line.strip()
                if not line:
                    continue
                data = json.loads(line)
                prompts = data['question']

                # Collect all .jpg frames for this sample from its image folder.
                image_dir = f'{eval_file_dir}/{eval_type}/{data["id"]}'
                image_list = []
                for image_file in os.listdir(image_dir):
                    if image_file.endswith('.jpg'):
                        image_list.append(f'{image_dir}/{image_file}')

                print("\n##############################################")
                print(f'Processing id={data["id"]}, {eval_type}, {len(image_list)} images.\nPrompt={prompts}')
                sentence = do_generate([prompts], image_list, model, tokenizer, processor,
                                       use_bf16=True, max_length=2048, top_k=5,
                                       do_sample=False)
                print("----------------------------------------------")
                print(f'Generated sentence: {sentence}')
                result = {
                    'id': data['id'],
                    'question': prompts,
                    'answer': sentence
                }
                print("##############################################\n")
                out_f.write(json.dumps(result, ensure_ascii=False) + '\n')
                out_f.flush()
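Because do_generate interleaves the image files with the <image> placeholders in the prompt, a mismatch between the two silently corrupts the input. Below is a minimal pre-flight sketch, not part of the pipeline above, that counts the placeholders in each question and compares them with the .jpg files found for that sample; the helper name check_eval_files is hypothetical, while the paths and field names reuse those from the script.

import json
import os

def check_eval_files(eval_file_dir, eval_types):
    """Hypothetical helper: report samples whose frame count differs from the <image> count."""
    for eval_type in eval_types:
        with open(f'{eval_file_dir}/{eval_type}.jsonl', 'r', encoding='utf-8') as f:
            for line in f:
                line = line.strip()
                if not line:
                    continue
                data = json.loads(line)
                n_placeholders = data['question'].count('<image>')
                image_dir = f'{eval_file_dir}/{eval_type}/{data["id"]}'
                n_images = len([name for name in os.listdir(image_dir)
                                if name.endswith('.jpg')])
                if n_placeholders != n_images:
                    print(f'[{eval_type}] id={data["id"]}: '
                          f'{n_placeholders} <image> tokens vs {n_images} jpg files')

check_eval_files('/root/zsk/mplug-owl/artificial_test',
                 ['action', 'mood', 'reason', 'relationship', 'speech'])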
LoRA adaptation
Because the model weights were produced by LoRA training, the stock get_model method has to be replaced with get_lora_model; otherwise the keys in the checkpoint do not match those of the base model. The idea is to first build an initial model that matches the LoRA architecture, and then load the weights with
model.load_state_dict(torch.load(pretrained_ckpt))
The code is as follows:
# Imports assumed from the mPLUG-Owl repo layout plus peft/transformers;
# adjust the module paths if your local layout differs.
import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoTokenizer
from mplug_owl.modeling_mplug_owl import MplugOwlForConditionalGeneration
from mplug_owl.processing_mplug_owl import MplugOwlImageProcessor, MplugOwlProcessor


def get_lora_model(pretrained_ckpt, use_bf16=False):
    print(pretrained_ckpt)
    # Build the base model first, so that the LoRA wrapper below
    # produces a state dict whose keys match the fine-tuned checkpoint.
    model = MplugOwlForConditionalGeneration.from_pretrained(
        pretrained_ckpt,
        torch_dtype=torch.bfloat16 if use_bf16 else torch.half,
    )

    # Wrap the language model's query_key_value projections with LoRA,
    # using the same hyperparameters as during training.
    peft_config = LoraConfig(
        target_modules=r'.*language_model.*\.query_key_value',
        inference_mode=True,
        r=8,
        lora_alpha=32,
        lora_dropout=0
    )
    model = get_peft_model(model, peft_config)

    print('start load lora model')
    model.load_state_dict(torch.load(pretrained_ckpt))
    print('loaded pretrained lora model')
    # model.print_trainable_parameters()

    # Freeze all parameters; the model is only used for inference.
    for param in model.parameters():
        param.requires_grad = False

    image_processor = MplugOwlImageProcessor.from_pretrained(pretrained_ckpt)
    tokenizer = AutoTokenizer.from_pretrained(pretrained_ckpt)
    processor = MplugOwlProcessor(image_processor, tokenizer)
    return model, tokenizer, processor
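If the checkpoint keys still do not line up with the wrapped model (for example because the LoraConfig differs from the one used during training), loading with strict=False makes the mismatch visible instead of raising. The sketch below is a small diagnostic built on the standard return value of torch.nn.Module.load_state_dict; report_key_mismatch is a hypothetical helper, and ckpt_path is assumed to be the same checkpoint file passed to torch.load above.

import torch

def report_key_mismatch(model, ckpt_path):
    """Hypothetical helper: load with strict=False and print any keys that do not match."""
    state_dict = torch.load(ckpt_path, map_location='cpu')
    result = model.load_state_dict(state_dict, strict=False)
    print(f'missing keys   : {len(result.missing_keys)}')
    print(f'unexpected keys: {len(result.unexpected_keys)}')
    for key in result.unexpected_keys[:10]:
        print('  unexpected:', key)
    for key in result.missing_keys[:10]:
        print('  missing   :', key)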
Recorded results
Some of the inference results and recorded screenshots are shown below.
{"id": 1, "question": "The following is a conversation between a curious human and AI assistant. The assistant gives helpful, detailed, and polite answers to the user's questions. \nHuman: <image>\nHuman: <image>\nHuman: <image>\nHuman: <image>\nHuman: <image>\nHuman: <image>\nHuman: <image>\nHuman: <image>\nHuman: <image>\nHuman: 韩琛接下来会做出怎样的行为?\nAI: ", "answer": "韩琛在办公室里坐着,手里拿着一本书,看起来很专注.他可能正在阅读或思考一些工作相关的问题,或者正在准备参加会议或与同事讨论某个项目或议题.由于他坐在椅子上,他可能正在等待会议开始或结束,或者正在休息一会儿。"}
{"id": 2, "question": "The following is a conversation between a curious human and AI assistant. The assistant gives helpful, detailed, and polite answers to the user's questions. \nHuman: <image>\nHuman: <image>\nHuman: <image>\nHuman: <image>\nHuman: <image>\nHuman: <image>\nHuman: <image>\nHuman: <image>\nHuman: <image>\nHuman: 沙复明接下来会做出怎样的行为?\nAI: ", "answer": "根据图片,沙复明正在与他的母亲交谈,母亲似乎在听他讲述一些事情."}
...
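To spot-check the predictions against the gold answers, the result file and the evaluation file can be joined by id and printed side by side. The sketch below does only that; load_jsonl is a hypothetical helper, the paths reuse those from the batch script, and no scoring metric is implied.

import json

def load_jsonl(path):
    """Hypothetical helper: index a JSONL file by its 'id' field."""
    with open(path, 'r', encoding='utf-8') as f:
        return {item['id']: item
                for item in (json.loads(line) for line in f if line.strip())}

eval_type = 'action'
gold = load_jsonl(f'/root/zsk/mplug-owl/artificial_test/{eval_type}.jsonl')
pred = load_jsonl(f'/root/zsk/mplug-owl/our_result1/our_results_{eval_type}.jsonl')

for sample_id, item in gold.items():
    answer = pred.get(sample_id, {}).get('answer', '<no prediction>')
    print(f'id={sample_id}')
    print('  gold:', item['gold_ans'])
    print('  pred:', answer)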