Open-source project:
GitHub - Meituan-AutoML/MobileVLM: Strong and Open Vision Language Assistant for Mobile Devices
The model is downloaded automatically; by default it lands in the Hugging Face cache:
C:\Users\xxx\.cache\huggingface\hub
models--mtgv--MobileLLaMA-1.4B-Chat
It runs on a GTX 1060 GPU.
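Before downloading the weights, it is worth confirming that PyTorch can actually see the GPU. A minimal check (the 6 GB figure for the GTX 1060 is typical, not guaranteed for every variant):

```python
import torch

# Quick sanity check: is a CUDA device visible, and how much VRAM does it have?
# A GTX 1060 typically reports about 6 GB, which is enough for the 1.4B chat
# model in float16.
if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print(props.name, round(props.total_memory / 1024**3, 1), "GB")
else:
    print("CUDA not available; the model will run on CPU")
```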
LLaMA (Meta)
In July 2023, Meta released LLaMA 2 as an open model with a commercial-use license.
MobileVLM V2
from scripts.inference import inference_once
# model_path = "mtgv/MobileVLM-1.7B" # MobileVLM
model_path = "mtgv/MobileVLM_V2-1.7B" # MobileVLM V2
image_file = "assets/samples/demo.jpg"
prompt_str = "Who is the author of this book?\nAnswer the question using a single word or phrase."
# (or) What is the title of this book?
# (or) Is this book related to Education & Teaching?
args = type('Args', (), {
    "model_path": model_path,
    "image_file": image_file,
    "prompt": prompt_str,
    "conv_mode": "v1",
    "temperature": 0,
    "top_p": None,
    "num_beams": 1,
    "max_new_tokens": 512,
    "load_8bit": False,
    "load_4bit": False,
})()
inference_once(args)
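If VRAM is tight (e.g. on a GTX 1060), the same Args pattern can request 8-bit weights through the load_8bit flag the script already exposes. A sketch, assuming the bitsandbytes dependency is installed; whether it helps on a given setup needs to be verified:

```python
# Same Args pattern as above, but with 8-bit loading enabled to roughly
# halve VRAM use. Assumption: the inference script's load_8bit path works
# on this machine (it relies on bitsandbytes).
args = type('Args', (), {
    "model_path": "mtgv/MobileVLM_V2-1.7B",
    "image_file": "assets/samples/demo.jpg",
    "prompt": "Who is the author of this book?",
    "conv_mode": "v1",
    "temperature": 0,
    "top_p": None,
    "num_beams": 1,
    "max_new_tokens": 512,
    "load_8bit": True,   # quantize weights to 8-bit
    "load_4bit": False,
})()
```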
MobileLLaMA-1.4B debugging
import torch
from transformers import LlamaTokenizer, LlamaForCausalLM
model_path = 'mtgv/MobileLLaMA-1.4B-Chat'
tokenizer = LlamaTokenizer.from_pretrained(model_path)
model = LlamaForCausalLM.from_pretrained(
    model_path, torch_dtype=torch.float16, device_map='auto',
)
prompt = 'Q: What is the largest animal?\nA:'
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.cuda()
generation_output = model.generate(
    input_ids=input_ids, max_new_tokens=32
)
print(tokenizer.decode(generation_output[0]))
The original version raised:
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument index in method wrapper_CUDA__index_select)
The fix: the input_ids tensor must be moved to the GPU before generation, hence the input_ids.cuda() call in the snippet above.
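A more portable variant of this fix is to move the inputs to whatever device the model's weights actually landed on, instead of hard-coding .cuda() (which fails on CPU-only machines). A minimal sketch, using a stand-in nn.Linear in place of LlamaForCausalLM:

```python
import torch
from torch import nn

# Stand-in for the real model: any nn.Module works for demonstrating
# the device-matching pattern.
model = nn.Linear(4, 4)

# Move the inputs to wherever the model's parameters live. With
# device_map='auto' this may be cuda:0; on a CPU-only machine it is cpu,
# where .cuda() would raise instead.
device = next(model.parameters()).device
input_ids = torch.ones(1, 4).to(device)

# Inputs and weights are now guaranteed to be on the same device,
# avoiding the "Expected all tensors to be on the same device" error.
```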