[InternLM LLM Practical Camp] InternVL Fine-Tuning Practice Challenge Task

lzl2040

Published 2024-08-19 18:09:52


Article link: https://blog.csdn.net/qq_41234663/article/details/141304348


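Task

Follow the tutorial document and video to fine-tune the model with QLoRA, reproduce the fine-tuning result, and get the model to successfully tell the joke behind a meme image.
Try LoRA, or adjust the xtuner config (for example the LoRA rank or the learning rate), observe how the model's loss changes, and record the results after the adjustment. (Optional; using LoRA or adjusting the config, either one is enough.)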
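
Before changing the LoRA rank, it helps to see how the rank drives the number of trainable parameters. The following is a minimal sketch based on the standard LoRA formulation, where a frozen weight of shape (d_out x d_in) gets two small trainable matrices A (rank x d_in) and B (d_out x rank). The layer size 2048 is a hypothetical value chosen for illustration, not a figure taken from InternVL2-2B.

```python
def lora_trainable_params(d_in: int, d_out: int, rank: int) -> int:
    """Parameters added by one LoRA adapter: A is (rank x d_in), B is (d_out x rank)."""
    return rank * (d_in + d_out)


def full_params(d_in: int, d_out: int) -> int:
    """Parameters of the frozen dense weight that the adapter is attached to."""
    return d_in * d_out


if __name__ == "__main__":
    d = 2048  # hypothetical hidden size, for illustration only
    for rank in (8, 64, 128):
        added = lora_trainable_params(d, d, rank)
        ratio = added / full_params(d, d)
        print(f"rank={rank:4d}  trainable={added:8d}  ratio={ratio:.2%}")
```

The trainable parameter count grows linearly with the rank, so doubling the rank doubles the adapter size for every adapted layer.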

Environment Setup

I use InternVL-2B for the experiment. First, create a symlink to the model files:

ln -s /root/share/new_models/OpenGVLab/InternVL2-2B /root/model/

Next, set up the fine-tuning environment. I reuse the xtuner environment I created earlier:

conda activate xtuner
apt install libaio-dev
pip install lmdeploy==0.5.3
git clone -b v0.1.23  https://github.com/InternLM/XTuner
cd XTuner
pip install -e '.[deepspeed]'

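transformers and streamlit were already installed earlier, so I skip them here. Note that the xtuner version this time is v0.1.23, which includes InternVL support.

The dataset is zhongshsh/CLoT-Oogiri-GO from Hugging Face. It has already been deduplicated, and only the Chinese data is kept. Oogiri-GO is a multimodal, multilingual humor dataset: roughly, each sample is an image plus a humorous line related to that image.

Create a symlink to the dataset: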

ln -s /root/share/new_models/datasets/CLoT_cn_2000 /root/Project/InternLM/datasets/

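InternVL Inference

I first try running InternVL inference on an image from this dataset. I picked this one:
[Figure: the selected image]
The corresponding description in the dataset is:

To make inference easier, copy the image to the current directory, InternLM: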

cp datasets/CLoT_cn_2000/ex_images/MjxjVcrFf9TFLbr2BKR4Py1L5qAic8K4VzEQAsTph0ztWe9vj3d8DGDdAC3tJV0aiaOrSBcsKpBIXIAh6O1CDXcA.jpg MjxjVcrFf9TFLbr2BKR4Py1L5qAic8K4VzEQAsTph0ztWe9vj3d8DGDdAC3tJV0aiaOrSBcsKpBIXIAh6O1CDXcA.jpg

Then create the inference script test_internvl.py with the following content:

from lmdeploy import pipeline
from lmdeploy.vl import load_image

pipe = pipeline('/root/model/InternVL2-2B')

image = load_image('/root/Project/InternLM/MjxjVcrFf9TFLbr2BKR4Py1L5qAic8K4VzEQAsTph0ztWe9vj3d8DGDdAC3tJV0aiaOrSBcsKpBIXIAh6O1CDXcA.jpg')
response = pipe(('请你根据这张图片，讲一个脑洞大开的梗', image))
print(response.text)

The result is:
[Figure: screenshot of the model's output]
The description is not funny.

InternVL Fine-Tuning Guide

The dataset format is:



# For efficient training, make sure the data is in this format:
{
    "id": "000000033471",
    "image": ["coco/train2017/000000033471.jpg"], # for text-only samples, this field is None or absent
    "conversations": [
      {
        "from": "human",
        "value": "<image>\nWhat are the colors of the bus in the image?"
      },
      {
        "from": "gpt",
        "value": "The bus in the image is white and red."
      }
    ]
}
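
Since a malformed record can surface only late in training, a quick check of the annotation file before starting can save time. Below is a minimal sketch of such a check, written against the format shown above. The function name validate_record is hypothetical, and the rule that the number of <image> tags in the human turns should match the number of image paths is my assumption inferred from the example, not something confirmed by XTuner.

```python
def validate_record(record: dict) -> list:
    """Return a list of problems found in one training record (empty if it looks fine)."""
    problems = []
    if "id" not in record:
        problems.append("missing 'id'")

    # Text-only samples may omit 'image' or set it to None.
    images = record.get("image")
    if images is None:
        image_list = []
    elif isinstance(images, str):
        image_list = [images]
    elif isinstance(images, list):
        image_list = images
    else:
        problems.append("'image' must be a string, a list, or None")
        image_list = []

    conversations = record.get("conversations")
    if not isinstance(conversations, list) or not conversations:
        problems.append("'conversations' must be a non-empty list")
        return problems

    image_tags = 0
    for index, turn in enumerate(conversations):
        expected_role = "human" if index % 2 == 0 else "gpt"
        if turn.get("from") != expected_role:
            problems.append(f"turn {index}: expected role '{expected_role}'")
        value = turn.get("value")
        if not isinstance(value, str):
            problems.append(f"turn {index}: 'value' must be a string")
        elif turn.get("from") == "human":
            image_tags += value.count("<image>")

    # Assumption: one <image> tag per image path.
    if image_tags != len(image_list):
        problems.append(f"{len(image_list)} image(s) but {image_tags} <image> tag(s)")
    return problems
```

To scan the whole dataset, load the annotation file with json.load and call validate_record on each entry, printing any record whose result is non-empty.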

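Then use xtuner list-cfg to view the available configs. The InternVL configs are:
[Figure: screenshot of the InternVL configs listed by xtuner list-cfg]
I choose the internvl_v2_internlm2_2b_qlora_finetune config, which uses QLoRA fine-tuning, and copy it to the current folder with the following command: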

xtuner copy-cfg internvl_v2_internlm2_2b_qlora_finetune ./

Then edit the config file. Only the model path and the data paths need to change:

path = '/root/model/InternVL2-2B'

# Data
data_root = '/root/Project/InternLM/datasets/CLoT_cn_2000/'
data_path = data_root + 'ex_cn.json'
image_folder = data_root
prompt_template = PROMPT_TEMPLATE.internlm2_chat
max_length = 6656

batch_size = 2 # reduced
accumulative_counts = 2

max_epochs = 3
lr = 2e-5
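
To get a feel for what these values mean for training length, here is a rough sketch that derives the effective batch size and iteration counts. The sample count of 2000 is an assumption taken from the dataset directory name CLoT_cn_2000, and the function assumes that one iteration consumes one micro-batch per GPU and that the last partial batch is not dropped; neither is confirmed by the training log.

```python
import math


def training_schedule(num_samples: int, batch_size: int,
                      accumulative_counts: int, max_epochs: int,
                      num_gpus: int = 1) -> dict:
    """Estimate batch and iteration counts from the config values."""
    # Samples contributing to one optimizer update.
    effective_batch = batch_size * accumulative_counts * num_gpus
    # One iteration = one micro-batch per GPU (assumed).
    iters_per_epoch = math.ceil(num_samples / (batch_size * num_gpus))
    total_iters = iters_per_epoch * max_epochs
    # Gradients are accumulated over `accumulative_counts` iterations per update.
    optimizer_updates = math.ceil(total_iters / accumulative_counts)
    return {
        "effective_batch": effective_batch,
        "iters_per_epoch": iters_per_epoch,
        "total_iters": total_iters,
        "optimizer_updates": optimizer_updates,
    }


if __name__ == "__main__":
    # 2000 samples is an assumption based on the dataset directory name.
    print(training_schedule(num_samples=2000, batch_size=2,
                            accumulative_counts=2, max_epochs=3))
```

If batch_size is lowered further to fit memory, raising accumulative_counts by the same factor keeps the effective batch size unchanged.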

Then start fine-tuning with the command NPROC_PER_NODE=1 xtuner train internvl_v2_internlm2_2b_qlora_finetune_copy.py --deepspeed deepspeed_zero1.

I used a 30% A100 allocation with batch_size set to 2. I also installed flash-attn, using the following command:

pip install flash-attn --no-build-isolation

The training log looks like this:
[Figure: screenshot of the training log]
Then merge the weights with the following command:

python3 XTuner/xtuner/configs/internvl/v1_5/convert_to_official.py ./internvl_v2_internlm2_2b_qlora_finetune_copy.py /root/Project/InternLM/code/work_dirs/internvl_v2_internlm2_2b_qlora_finetune_copy/iter_1000.pth /root/Project/InternLM/InternVL2-2B/
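
The checkpoint passed to the conversion script is one of the iter_*.pth entries in the work directory. As a small convenience, the sketch below picks the one with the highest iteration number. The helper name latest_checkpoint is hypothetical, and it assumes checkpoints follow the iter_<number>.pth naming seen in the command above.

```python
import re
from pathlib import Path


def latest_checkpoint(work_dir):
    """Return the iter_<N>.pth entry with the largest N, or None if there is none."""
    pattern = re.compile(r"iter_(\d+)\.pth$")
    candidates = []
    for path in Path(work_dir).glob("iter_*.pth"):
        match = pattern.match(path.name)
        if match:
            candidates.append((int(match.group(1)), path))
    if not candidates:
        return None
    # Compare by iteration number only.
    return max(candidates, key=lambda item: item[0])[1]


if __name__ == "__main__":
    work_dir = "/root/Project/InternLM/code/work_dirs/internvl_v2_internlm2_2b_qlora_finetune_copy"
    print(latest_checkpoint(work_dir))
```

Sorting by the parsed integer rather than by file name matters here, because a plain string sort would rank iter_500.pth after iter_1000.pth.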

The resulting model directory is:
[Figure: screenshot of the merged model directory]
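
To check whether fine-tuning helped, the same lmdeploy calls used in test_internvl.py can be pointed at the merged model directory. This is a minimal sketch under that assumption: it reuses only the pipeline and load_image calls from the earlier script, the function name tell_joke is hypothetical, and I have not verified here that the merged directory loads without further changes.

```python
import sys

# Output directory passed to convert_to_official.py earlier.
FINETUNED_MODEL = "/root/Project/InternLM/InternVL2-2B/"
# Same prompt as in test_internvl.py, so the outputs are comparable.
PROMPT = "请你根据这张图片，讲一个脑洞大开的梗"


def tell_joke(model_dir: str, image_path: str, prompt: str = PROMPT) -> str:
    """Run one image through the model and return the generated text."""
    # Imported lazily so the module can be imported without lmdeploy installed.
    from lmdeploy import pipeline
    from lmdeploy.vl import load_image

    pipe = pipeline(model_dir)
    image = load_image(image_path)
    response = pipe((prompt, image))
    return response.text


if __name__ == "__main__":
    # Usage: python test_finetuned.py <path-to-image>
    print(tell_joke(FINETUNED_MODEL, sys.argv[1]))
```

Running the script on the same image as before gives a direct before/after comparison with the base model's output.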