BLIP使用教程


原理篇: BLIP2-图像文本预训练论文解读

准备

如果无网络需提前下载相关模型
安装torch、transformers

pip install torch trtransformers

测试

测试blip基于图片生成文本描述能力(Caption);
caption分为两种:
有条件:在输入图片基础上,增加文本描述提示
无条件:仅输入图片。
以下是测试以blip-image-captioning-base模型进行

示例一

示例图如下:
在这里插入图片描述
运行结果如下:
有条件:a photography of a woman and her dog on the beach. 其中a photography of为输入文本提示
无条件:a woman sitting on the beach with her dog

示例二:

以网上下载musk图片示例,
在这里插入图片描述
运行结果如下:
有条件:a photography of a man in a suit and tie
无条件:a man in a suit and tie is surrounded by reporters

several people作为文本提示,输出结果:
several people are standing in the street

the man is interviewed作为文本提示,输出结果:
the man is interviewed by the media

结论

可对图片进行大致概括,但无法输出细节信息。比如:图中为马斯克、背景中人也没没有进行描述。

源代码

# coding=utf-8
import torch
from PIL import Image
from transformers import BlipProcessor, BlipForConditionalGeneration


def blip(img_path):
    # processor = BlipProcessor.from_pretrained("Salesforce/blip-image-captioning-base")
    # model = BlipForConditionalGeneration.from_pretrained("Salesforce/blip-image-captioning-base")

    processor = BlipProcessor.from_pretrained("./pretrained_models/blip-image-captioning-base")
    model = BlipForConditionalGeneration.from_pretrained("./pretrained_models/blip-image-captioning-base", torch_dtype=torch.float16).to("cuda")

    raw_image = Image.open(img_path).convert('RGB')

    # conditional image captioning
    text = "several people"
    inputs = processor(raw_image, text, return_tensors="pt").to("cuda", torch.float16)

    out = model.generate(**inputs)
    print(processor.decode(out[0], skip_special_tokens=True))
    # >>> a photography of a woman and her dog

    # unconditional image captioning
    inputs = processor(raw_image, return_tensors="pt").to("cuda", torch.float16)

    out = model.generate(**inputs)
    print(processor.decode(out[0], skip_special_tokens=True))

def main():
    img_path = "./example/blip/musk.jpg"
    blip(img_path)

if __name__ == "__main__":
    main()
  • 4
    点赞
  • 17
    收藏
    觉得还不错? 一键收藏
  • 1
    评论
评论 1
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值