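1. InternVL Deployment and Fine-Tuning in Practice
Following the tutorial, we use the preprocessed zhongshsh/CLoT-Oogiri-GO dataset from Hugging Face and pick one image as a test case for inference.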
@misc{zhong2023clot,
  title={Let's Think Outside the Box: Exploring Leap-of-Thought in Large Language Models with Creative Humor Generation},
  author={Zhong, Shanshan and Huang, Zhongzhan and Gao, Shanghua and Wen, Wushao and Lin, Liang and Zitnik, Marinka and Zhou, Pan},
  journal={arXiv preprint arXiv:2312.02439},
  year={2023}
}
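The inference step itself is not spelled out in these notes. A minimal sketch, assuming the tutorial runs the model through LMDeploy's `pipeline` API on a local InternVL2 checkpoint, might look like this. The function name `run_case` and all paths and prompts passed to it are hypothetical.

```python
def run_case(model_dir: str, image_path: str, prompt: str) -> str:
    """Run one image plus one text prompt through a vision-language model.

    Assumes lmdeploy is installed and model_dir points at a local
    InternVL2 checkpoint (both are assumptions, not stated in the notes).
    """
    # Imported lazily so this file can be loaded without lmdeploy present.
    from lmdeploy import pipeline
    from lmdeploy.vl import load_image

    pipe = pipeline(model_dir)        # builds the inference engine for the model
    image = load_image(image_path)    # accepts a local path or a URL
    response = pipe((prompt, image))  # a (text, image) tuple is one multimodal query
    return response.text
```

Calling `run_case` with the checkpoint directory, the chosen case image, and a humor-generation prompt would return the model's text answer for that single case.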
As the output shows, the inference quality is not very good, so the model needs to be fine-tuned.
2. InternVL Fine-Tuning Guide
2.1 Preparing the dataset
The dataset format is:
# For efficient training, make sure the data has the following format:
{
  "id": "000000033471",
  "image": ["coco/train2017/000000033471.jpg"],  # for text-only samples, this field is None or absent
  "conversations": [
    {
      "from": "human",
      "value": "<image>\nWhat are the colors of the bus in the image?"
    },
    {
      "from": "gpt",
      "value": "The bus in the image is white and red."
    }
  ]
}
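To make the format concrete, here is a minimal sketch of building and sanity-checking one record. The helper names `make_record` and `validate_record` are hypothetical, and the rule that the number of `<image>` placeholders must equal the number of image paths is an assumption inferred from the example, not something the notes state.

```python
import json
from typing import Any, Dict, List, Optional


def make_record(sample_id: str, question: str, answer: str,
                image_paths: Optional[List[str]] = None) -> Dict[str, Any]:
    """Build one training record in the conversation format shown above."""
    images = list(image_paths or [])
    # One "<image>\n" placeholder per image (assumed convention).
    prefix = "<image>\n" * len(images)
    record: Dict[str, Any] = {
        "id": sample_id,
        "conversations": [
            {"from": "human", "value": prefix + question},
            {"from": "gpt", "value": answer},
        ],
    }
    if images:  # text-only records simply omit the "image" field
        record["image"] = images
    return record


def validate_record(record: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the record looks fine."""
    problems: List[str] = []
    if not isinstance(record.get("id"), str):
        problems.append("id must be a string")
    turns = record.get("conversations") or []
    if not turns:
        problems.append("conversations must be a non-empty list")
    for i, turn in enumerate(turns):
        # Turns are expected to alternate human, gpt, human, gpt, ...
        expected = "human" if i % 2 == 0 else "gpt"
        if turn.get("from") != expected:
            problems.append(f"turn {i}: expected from={expected!r}")
    images = record.get("image") or []
    placeholders = sum(turn.get("value", "").count("<image>")
                       for turn in turns if turn.get("from") == "human")
    if placeholders != len(images):
        problems.append(
            f"{placeholders} <image> placeholders for {len(images)} images")
    return problems


record = make_record(
    "000000033471",
    "What are the colors of the bus in the image?",
    "The bus in the image is white and red.",
    ["coco/train2017/000000033471.jpg"],
)
print(json.dumps(record, ensure_ascii=False, indent=2))
```

Running `validate_record` over every entry before training is a cheap way to catch missing placeholders or swapped speaker roles early.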
2.2 Configuring the fine-tuning parameters
Edit /root/InternLM/code/XTuner/xtuner/configs/internvl/v2/internvl_v2_internlm2_2b_qlora_finetune.py
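The notes do not list which values were changed. As a rough sketch, assuming the config follows the usual XTuner layout of module-level variables, the fields that typically need attention are the model path, the dataset paths, and the training schedule. Every path and number below is a placeholder, and the variable names should be checked against the actual file before editing.

```python
# Model: local checkpoint directory (placeholder path)
path = "/path/to/InternVL2-2B"

# Data: annotation file and image root (placeholder paths)
data_root = "/path/to/dataset/"
data_path = data_root + "train.json"  # records in the format from section 2.1
image_folder = data_root              # image paths in the JSON are resolved against this
max_length = 4096                     # illustrative maximum sequence length

# Schedule and optimizer (illustrative values, not the ones used in these notes)
batch_size = 4            # per device
accumulative_counts = 2   # gradient accumulation steps
max_epochs = 1
lr = 1e-5

# Checkpointing (illustrative values)
save_steps = 1000
save_total_limit = 1

# Samples contributing to each optimizer step on one device
effective_batch_per_device = batch_size * accumulative_counts
```

Lowering `batch_size` while raising `accumulative_counts` keeps the effective batch size unchanged while reducing peak GPU memory, which is often the practical constraint for QLoRA runs on a single card.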
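2.3 Merging weights and converting the model
Merge the weights with the official script, then run inference again. The result on the original case is shown below, followed by the results on other images.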