MiniCPM-o is the latest series of on-device multimodal LLMs, upgraded from MiniCPM-V. The models accept images, video, text, and audio as input in an end-to-end fashion and generate high-quality text and speech output.
Code: https://github.com/OpenBMB/MiniCPM-o
Model: https://huggingface.co/openbmb/MiniCPM-o-2_6
1. Set up the environment.
# Create a virtual environment
conda create -n minicpm-o python=3.10
conda activate minicpm-o
# Install PyTorch
conda install pytorch==2.3.1 torchvision==0.18.1 torchaudio==2.3.1 pytorch-cuda=12.1 -c pytorch -c nvidia
# Clone the repo and install its dependencies
git clone https://github.com/OpenBMB/MiniCPM-o.git
cd MiniCPM-o
pip install -r requirements_o2.6.txt
cd web_demos/minicpm-o_2.6
# Launch the web demo
python chatbot_web_demo_o2.6.py
2. Test the results.
The OCR performance is impressive.
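For a quick sanity check of the install (and to reproduce the OCR test), here is a minimal single-image chat sketch in the style of the Hugging Face model card; the image path and prompt are placeholders, and the exact chat() arguments should be confirmed against the model card. Note that on transformers <= 4.44.2 this load hits the flash-attn import error discussed in the next section.

import torch
from PIL import Image
from transformers import AutoModel, AutoTokenizer

model = AutoModel.from_pretrained(
    'openbmb/MiniCPM-o-2_6',
    trust_remote_code=True,
    attn_implementation='sdpa',  # avoids the flash-attn code path
    torch_dtype=torch.bfloat16,
    init_vision=True,
    init_audio=True,
    init_tts=True,
)
model = model.eval().cuda()
tokenizer = AutoTokenizer.from_pretrained('openbmb/MiniCPM-o-2_6', trust_remote_code=True)

# OCR-style prompt: ask the model to transcribe the text in an image
image = Image.open('test_ocr.png').convert('RGB')  # placeholder path
msgs = [{'role': 'user', 'content': [image, 'Please transcribe all text in this image.']}]
print(model.chat(msgs=msgs, tokenizer=tokenizer))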
How to avoid the hard dependency on flash-attn2
If you do not pass attn_implementation='flash_attention_2' at model initialization, inference does not depend on the flash-attn library at all. However, the dynamic-import mechanism in transformers had a bug in versions up to 4.44.2 that triggers the error below anyway. The bug was fixed in transformers==4.45.0 (see the corresponding PR), so upgrading transformers is the simplest fix.
ImportError: This modeling file requires the following packages that were not found in your environment: flash_attn. Run `pip install flash_attn`
For transformers <= 4.44.2, the workaround is to patch get_imports when initializing the model, with the code below:
import os
import re
from typing import List, Union
from unittest.mock import patch

import torch
from transformers import AutoModel
def fixed_get_imports(filename: Union[str, os.PathLike]) -> List[str]:
    """
    Extracts all the libraries (not relative imports) that are imported in a file.

    Args:
        filename (`str` or `os.PathLike`): The module file to inspect.

    Returns:
        `List[str]`: The list of all packages required to use the input module.
    """
    with open(filename, "r", encoding="utf-8") as f:
        content = f.read()

    # Filter out try/except blocks so custom code can guard optional imports
    content = re.sub(r"\s*try\s*:\s*.*?\s*except\s*.*?:", "", content, flags=re.MULTILINE | re.DOTALL)
    # Filter out imports guarded by is_flash_attn_2_available() to avoid import
    # issues in environments without flash-attn
    content = re.sub(
        r"if is_flash_attn[a-zA-Z0-9_]+available\(\):\s*(from flash_attn\s*.*\s*)+", "", content, flags=re.MULTILINE
    )
    # Imports of the form `import xxx`
    imports = re.findall(r"^\s*import\s+(\S+)\s*$", content, flags=re.MULTILINE)
    # Imports of the form `from xxx import yyy`
    imports += re.findall(r"^\s*from\s+(\S+)\s+import", content, flags=re.MULTILINE)
    # Only keep the top-level module name
    imports = [imp.split(".")[0] for imp in imports if not imp.startswith(".")]
    return list(set(imports))
# Temporarily swap in the fixed get_imports while loading the model
with patch("transformers.dynamic_module_utils.get_imports", fixed_get_imports):
    model = AutoModel.from_pretrained(
        'openbmb/MiniCPM-o-2_6',
        torch_dtype=torch.bfloat16,
        trust_remote_code=True,
        attn_implementation='sdpa',
        init_vision=True,
        init_audio=True,
        init_tts=True,
    )
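To confirm the workaround took effect, you can check which attention backend the loaded model is using; _attn_implementation is an internal transformers attribute, so treat this as a debugging aid only.

# 'sdpa' confirms inference will not touch flash-attn
print(model.config._attn_implementation)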
Int4 inference works the same way: patch get_imports before model initialization and you can skip installing flash-attn entirely.
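A minimal sketch for the int4 case, reusing the same patch; the repo id openbmb/MiniCPM-o-2_6-int4 and the AutoModel loading path are assumptions here, so check the int4 model card for the exact requirements (e.g. extra quantization dependencies).

# Same workaround with a quantized checkpoint; repo id is an assumption
with patch("transformers.dynamic_module_utils.get_imports", fixed_get_imports):
    model_int4 = AutoModel.from_pretrained(
        'openbmb/MiniCPM-o-2_6-int4',  # assumed int4 repo id
        trust_remote_code=True,
        attn_implementation='sdpa',
        init_vision=True,
        init_audio=True,
        init_tts=True,
    )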