Colab + ngrok: accessing a multimodal large model from your local machine

Article link: https://blog.csdn.net/m0_57057282/article/details/142786769

allenai/Molmo-7B-D-0924
1) Prepare the Colab environment. I used an L4 GPU here.
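Before installing anything, it can help to confirm that the runtime really has a GPU attached. The following is a minimal sketch, assuming the nvidia-smi tool is on the PATH of the Colab runtime; the helper name gpu_summary is my own and not part of any library.

```python
import shutil
import subprocess


def gpu_summary():
    """Return the GPU name and total memory reported by nvidia-smi, or None."""
    if shutil.which("nvidia-smi") is None:
        # No NVIDIA tooling found, e.g. a CPU-only runtime.
        return None
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return None
    return result.stdout.strip() or None


print(gpu_summary() or "No GPU detected: switch the runtime type to a GPU such as L4")
```

If the message about a missing GPU is printed, change the runtime type in Colab before continuing.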

2) Install the required Python libraries

!pip install transformers Pillow requests einops

!pip install 'accelerate>=0.26.0' bitsandbytes


!pip install --no-deps accelerate bitsandbytes
!pip install chainlit
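After the installs finish, a quick check of what actually ended up in the environment can save some debugging later. This is a small sketch using only the standard library; the helper name installed_versions and the REQUIRED list are illustrative and simply mirror the pip commands above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# Packages taken from the pip commands in this step.
REQUIRED = ["transformers", "Pillow", "requests", "einops",
            "accelerate", "bitsandbytes", "chainlit"]

for name, ver in installed_versions(REQUIRED).items():
    print(f"{name}: {ver or 'NOT INSTALLED'}")
```

Any package reported as NOT INSTALLED should be installed again before moving on to the next step.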

3) Put the code below into a new Python file named ui.py, then run !chainlit run ui.py --host 0.0.0.0 --port 5000 (run it after ngrok has been started)

import chainlit as cl
from transformers import AutoModelForCausalLM, AutoProcessor, GenerationConfig
from PIL import Image
import torch

# Load the processor
processor = AutoProcessor.from_pretrained(
    'allenai/Molmo-7B-D-0924',
    trust_remote_code=True,
    torch_dtype='auto',
    device_map='auto'
)

# Load the model
model = AutoModelForCausalLM.from_pretrained(
    'allenai/Molmo-7B-D-0924',
    trust_remote_code=True,
    torch_dtype='auto',
    device_map='auto'
)

@cl.on_chat_start
async def start():
    await cl.Message("Welcome to the image analysis app! Please upload an image, then enter your question or describe what you need.").send()

@cl.on_message
async def main(message: cl.Message):
    if not message.elements:
        await cl.Message("Please upload an image first, then enter your question.").send()
        return

    image = message.elements[0]
    if not image.mime.startswith("image"):
        await cl.Message("Please upload a valid image file.").send()
        return

    user_prompt = message.content
    if not user_prompt:
        user_prompt = "Describe this image."

    await process_image(image.path, user_prompt)

async def process_image(image_path, user_prompt):
    # Process the image
    inputs = processor.process(
        images=[Image.open(image_path)],
        text=user_prompt
    )

    # Move inputs to the correct device and make a batch of size 1
    inputs = {k: v.to(model.device).unsqueeze(0) for k, v in inputs.items()}

    # Generate output
    output = model.generate_from_batch(
        inputs,
        GenerationConfig(max_new_tokens=200, stop_strings="<|endoftext|>"),
        tokenizer=processor.tokenizer
    )

    # Get generated tokens and decode them to text
    generated_tokens = output[0, inputs['input_ids'].size(1):]
    generated_text = processor.tokenizer.decode(generated_tokens, skip_special_tokens=True)

    # Send the generated text as a message
    await cl.Message(content=generated_text).send()

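# Start this app with: chainlit run ui.py --host 0.0.0.0 --port 5000
# (Chainlit loads this file itself, so no __main__ entry point is needed.)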
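The !chainlit run command keeps its notebook cell busy for as long as the server is running. If you would rather keep the notebook free, one option is to start the server as a background process from Python. This is only a sketch: the helper names chainlit_command and start_in_background and the log file name chainlit.log are my own choices, and the command line is the same one used in this step.

```python
import subprocess


def chainlit_command(script="ui.py", host="0.0.0.0", port=5000):
    """Build the command line: chainlit run ui.py --host 0.0.0.0 --port 5000."""
    return ["chainlit", "run", script, "--host", host, "--port", str(port)]


def start_in_background(command, log_path="chainlit.log"):
    """Start the command without blocking; stdout and stderr go to log_path."""
    log_file = open(log_path, "w")
    return subprocess.Popen(command, stdout=log_file, stderr=subprocess.STDOUT)
```

Typical usage would be server = start_in_background(chainlit_command()), followed later by server.terminate() to stop it. The server output, including model loading progress and errors, ends up in the log file instead of the cell output.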

In Colab you can use ; to run several commands on one line, but note that the ; needs a space on both sides.

For example: !ls ; touch a.py

4) Run ngrok (run this first, then run the chainlit command above)

!pip install pyngrok
from pyngrok import ngrok
ngrok.set_auth_token("")  # paste your authtoken here; you get it by logging in to the ngrok website
# Start the ngrok tunnel to local port 5000
tunnel = ngrok.connect(5000)
print("Public URL:", tunnel.public_url)
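Because the tunnel is opened before the Chainlit server is up, the public link will most likely show an ngrok error page until something is listening on port 5000, and loading the model can take a while. A small polling helper makes that wait explicit. This is a sketch using only the standard library; the name wait_for_port and its default timeouts are my own assumptions.

```python
import socket
import time


def wait_for_port(port, host="127.0.0.1", timeout=120.0, interval=1.0):
    """Return True once host:port accepts TCP connections, False after timeout seconds."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means a server is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Connection refused or timed out: wait a little and retry.
            time.sleep(interval)
    return False
```

This is mainly useful when the server runs as a background process: call wait_for_port(5000) and only open the public link once it returns True.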

5) Open the public link printed above in your local browser, for example:

https://a87f-107.ngrok-free.app/

It works.