InternLM (书生) Large Model Practical Camp, Season 3: Multimodal Deployment and Fine-Tuning Practice

Latest recommended article published on 2024-09-30 15:42:36

m0_60917580

Article tags: front-end server, linux

Article link: https://blog.csdn.net/m0_60917580/article/details/141974722

1. InternVL Deployment and Fine-Tuning Practice

Following the tutorial, we use the preprocessed zhongshsh/CLoT-Oogiri-GO dataset from Hugging Face and pick one image as a test case for inference.

@misc{zhong2023clot,
  title={Let's Think Outside the Box: Exploring Leap-of-Thought in Large Language Models with Creative Humor Generation},
  author={Zhong, Shanshan and Huang, Zhongzhan and Gao, Shanghua and Wen, Wushao and Lin, Liang and Zitnik, Marinka and Zhou, Pan},
  journal={arXiv preprint arXiv:2312.02439},
  year={2023}
}

The inference result is clearly not very good, so the model needs to be fine-tuned.

2. InternVL Fine-Tuning Guide

2.1 Preparing the Dataset

The dataset format is:

# For efficient training, make sure the data is in this format:
{
    "id": "000000033471",
    "image": ["coco/train2017/000000033471.jpg"], # for text-only samples, this field is None or absent
    "conversations": [
      {
        "from": "human",
        "value": "<image>\nWhat are the colors of the bus in the image?"
      },
      {
        "from": "gpt",
        "value": "The bus in the image is white and red."
      }
    ]
}
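
As a rough illustration of the format above, here is a minimal Python sketch that builds one record and runs a few sanity checks on it. The helper names `make_record` and `check_record` are hypothetical and not part of InternVL or any training toolkit. The checks themselves (turns alternating between "human" and "gpt", one `<image>` placeholder per image path) are assumptions inferred from the example, not rules stated by the tutorial.

```python
import json


def make_record(sample_id, image_paths, question, answer):
    """Build one training record shaped like the example above (hypothetical helper)."""
    record = {"id": sample_id}
    if image_paths:
        # Assumption: one "<image>" placeholder per image, placed before the question.
        record["image"] = list(image_paths)
        question = "<image>\n" * len(image_paths) + question
    # For text-only samples the "image" field is simply left out.
    record["conversations"] = [
        {"from": "human", "value": question},
        {"from": "gpt", "value": answer},
    ]
    return record


def check_record(record):
    """Return a list of problems found; an empty list means these checks passed."""
    problems = []
    turns = record.get("conversations") or []
    if not turns:
        problems.append("record has no conversations")
    for i, turn in enumerate(turns):
        # Assumption: turns alternate, starting with the human side.
        expected = "human" if i % 2 == 0 else "gpt"
        if turn.get("from") != expected:
            problems.append(f"turn {i}: expected {expected!r}, got {turn.get('from')!r}")
    n_images = len(record.get("image") or [])
    n_tags = sum(turn.get("value", "").count("<image>") for turn in turns)
    if n_images != n_tags:
        problems.append(f"{n_images} image path(s) but {n_tags} <image> placeholder(s)")
    return problems


record = make_record(
    "000000033471",
    ["coco/train2017/000000033471.jpg"],
    "What are the colors of the bus in the image?",
    "The bus in the image is white and red.",
)
problems = check_record(record)
print(json.dumps([record], ensure_ascii=False, indent=2))
print("problems:", problems)
```

Running a check like this over the whole dataset before training is a cheap way to catch records whose image list and placeholders have drifted apart during preprocessing.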