Open-source project:
GitHub - Meituan-AutoML/MobileVLM: Strong and Open Vision Language Assistant for Mobile Devices
The model is downloaded automatically; by default it lands in the Hugging Face cache:
C:\Users\xxx\.cache\huggingface\hub
models--mtgv--MobileLLaMA-1.4B-Chat
It runs on a GTX 1060 GPU.
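Before downloading the weights, it is worth confirming that PyTorch can actually see the GPU. A minimal check (the 6 GB figure for the GTX 1060 is typical, not guaranteed for every variant):

```python
import torch

# Quick sanity check: is a CUDA device visible, and how much VRAM does it have?
# A GTX 1060 typically reports about 6 GB, which is enough for the 1.4B chat
# model in float16.
if torch.cuda.is_available():
    props = torch.cuda.get_device_properties(0)
    print(props.name, round(props.total_memory / 1024**3, 1), "GB")
else:
    print("CUDA not available; the model will run on CPU")
```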
LLaMA (Meta)
In July 2023, Meta released LLaMA 2 as an open model with a commercial-use license.
MobileVLM V2
from scripts.inference import inference_once
# model_path = "mtgv/MobileVLM-1.7B" # MobileVLM
model_path = "mtgv/MobileVLM_V2-1.7B" # MobileVLM V2
image_file = "assets/samples/demo.jpg"
prompt_str = "Who is the author of this book?\nAnswer the question using a single word or phrase."
# (or) What is the title of this book?
# (or) Is this book related to Education & Teaching?
args = type('Args', (), {
    "model_path": model_path,
    "image_file": image_file,
    "prompt": prompt_str,
    "conv_mode": "v1",
    "temperature": 0,
    "top_p": None,
    "num_beams": 1,
    "max_new_tokens": 512,
    "load_8bit": False,
    "load_4bit": False,
})()
inference_once(args)
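If VRAM is tight (e.g. on a GTX 1060), the same Args pattern can request 8-bit weights through the load_8bit flag the script already exposes. A sketch, assuming the bitsandbytes dependency is installed; whether it helps on a given setup needs to be verified:

```python
# Same Args pattern as above, but with 8-bit loading enabled to roughly
# halve VRAM use. Assumption: the inference script's load_8bit path works
# on this machine (it relies on bitsandbytes).
args = type('Args', (), {
    "model_path": "mtgv/MobileVLM_V2-1.7B",
    "image_file": "assets/samples/demo.jpg",
    "prompt": "Who is the author of this book?",
    "conv_mode": "v1",
    "temperature": 0,
    "top_p": None,
    "num_beams": 1,
    "max_new_tokens": 512,
    "load_8bit": True,   # quantize weights to 8-bit
    "load_4bit": False,
})()
```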
MobileLLaMA-1.4B debugging
import torch
from transformers import LlamaTokenizer, LlamaForCausalLM
model_path = 'mtgv/MobileLLaMA-1.4B-Chat'
tokenizer = LlamaTokenizer.from_pretrained(model_path)
model = LlamaForCausalLM.from_pretrained(
    model_path, torch_dtype=torch.float16, device_map='auto',
)
prompt = 'Q: What is the largest animal?\nA:'
input_ids = tokenizer(prompt, return_tensors="pt").input_ids.cuda()
generation_output = model.generate(
    input_ids=input_ids, max_new_tokens=32
)
print(tokenizer.decode(generation_output[0]))
The original version raised:
RuntimeError: Expected all tensors to be on the same device, but found at least two devices, cuda:0 and cpu! (when checking argument for argument index in method wrapper_CUDA__index_select)
The fix: the input_ids tensor must be moved to the GPU before generation, hence the input_ids.cuda() call in the snippet above.
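A more portable variant of this fix is to move the inputs to whatever device the model's weights actually landed on, instead of hard-coding .cuda() (which fails on CPU-only machines). A minimal sketch, using a stand-in nn.Linear in place of LlamaForCausalLM:

```python
import torch
from torch import nn

# Stand-in for the real model: any nn.Module works for demonstrating
# the device-matching pattern.
model = nn.Linear(4, 4)

# Move the inputs to wherever the model's parameters live. With
# device_map='auto' this may be cuda:0; on a CPU-only machine it is cpu,
# where .cuda() would raise instead.
device = next(model.parameters()).device
input_ids = torch.ones(1, 4).to(device)

# Inputs and weights are now guaranteed to be on the same device,
# avoiding the "Expected all tensors to be on the same device" error.
```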