Stable Diffusion - Interrogate CLIP - TypeError: Argument interpolation should be a InterpolationMode or a corresponding Pillow integer constant
While learning Stable Diffusion WebUI (v1.8.0), I tried to recover the prompt behind a generated image using Interrogate CLIP on the img2img tab, and ran into the following error:
load checkpoint from D:\DiskH\aiwork\sd-webui-aki-v4\models\BLIP\model_base_caption_capfilt_large.pth
None
*** Error interrogating
Traceback (most recent call last):
  File "D:\DiskH\aiwork\sd-webui-aki-v4\modules\interrogate.py", line 195, in interrogate
    caption = self.generate_caption(pil_image)
  File "D:\DiskH\aiwork\sd-webui-aki-v4\modules\interrogate.py", line 175, in generate_caption
    gpu_image = transforms.Compose([
  File "D:\DiskH\aiwork\sd-webui-aki-v4\py310\lib\site-packages\torchvision\transforms\transforms.py", line 95, in __call__
    img = t(img)
  File "D:\DiskH\aiwork\sd-webui-aki-v4\py310\lib\site-packages\torch\nn\modules\module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
  File "D:\DiskH\aiwork\sd-webui-aki-v4\py310\lib\site-packages\torch\nn\modules\module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
  File "D:\DiskH\aiwork\sd-webui-aki-v4\py310\lib\site-packages\torchvision\transforms\transforms.py", line 354, in forward
    return F.resize(img, self.size, self.interpolation, self.max_size, self.antialias)
  File "D:\DiskH\aiwork\sd-webui-aki-v4\py310\lib\site-packages\torchvision\transforms\functional.py", line 439, in resize
    raise TypeError(
TypeError: Argument interpolation should be a InterpolationMode or a corresponding Pillow integer constant
---
I dug through a pile of documentation and upgraded Pillow to the latest version, but neither solved the problem. In the end I had no choice but to read the code myself.
sd-webui-aki-v4\modules\interrogate.py
def generate_caption(self, pil_image):
    gpu_image = transforms.Compose([
        transforms.Resize((blip_image_eval_size, blip_image_eval_size), interpolation=InterpolationMode.BILINEAR),
        transforms.ToTensor(),
        transforms.Normalize((0.48145466, 0.4578275, 0.40821073), (0.26862954, 0.26130258, 0.27577711))
    ])(pil_image).unsqueeze(0).type(self.dtype).to(devices.device_interrogate)

    with torch.no_grad():
        caption = self.blip_model.generate(gpu_image, sample=False, num_beams=shared.opts.interrogate_clip_num_beams, min_length=shared.opts.interrogate_clip_min_length, max_length=shared.opts.interrogate_clip_max_length)

    return caption[0]
Add the following import at the top of the file, and reference the enum through it in generate_caption:
import torchvision.transforms.functional as F

def generate_caption(self, pil_image):
    gpu_image = transforms.Compose([
        transforms.Resize((blip_image_eval_size, blip_image_eval_size), interpolation=F.InterpolationMode.BILINEAR),
        transforms.ToTensor(),
        transforms.Normalize((0.48145466, 0.4578275, 0.40821073), (0.26862954, 0.26130258, 0.27577711))
    ])(pil_image).unsqueeze(0).type(self.dtype).to(devices.device_interrogate)

    with torch.no_grad():
        caption = self.blip_model.generate(gpu_image, sample=False, num_beams=shared.opts.interrogate_clip_num_beams, min_length=shared.opts.interrogate_clip_min_length, max_length=shared.opts.interrogate_clip_max_length)

    return caption[0]
After this change, the problem was solved.
Wishing everyone a happy life!