Deploying AI with the ModelScope Community

有嚼劲的iD

Tags: artificial intelligence

Article link: https://blog.csdn.net/xzh_id/article/details/139908943

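Launching an instance

Launching the Tongyi Qianwen (Qwen) 7B model

I used the ModelScope Library provided by Alibaba. After a lengthy dependency installation, the model ran successfully.

However, ModelScope instances are not stable enough and cannot be called over the network, which made development difficult in many ways, so a local deployment was still necessary.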

Local installation

Run the following commands:

git clone https://github.com/Dao-AILab/flash-attention
cd flash-attention && pip install .
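
Before moving on, it can help to confirm that the build actually produced an importable package. The following is a minimal sketch, assuming the package installs under the module name `flash_attn` and that `torch` is already present; the helper name `is_installed` is made up for illustration.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found without importing it."""
    return importlib.util.find_spec(module_name) is not None


# Report which of the expected packages are visible to this interpreter.
for name in ("torch", "flash_attn"):
    status = "found" if is_installed(name) else "missing"
    print(f"{name}: {status}")
```

If `flash_attn` is reported as missing, the model may still load without it, but that depends on the model code and is not verified here.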

Once the local installation succeeds, run the example code:

from modelscope import AutoModelForCausalLM, AutoTokenizer
from modelscope import GenerationConfig

# Note: The default behavior now has injection attack prevention off.
tokenizer = AutoTokenizer.from_pretrained("qwen/Qwen-7B-Chat", trust_remote_code=True)

# use auto mode, automatically select precision based on the device.
model = AutoModelForCausalLM.from_pretrained("qwen/Qwen-7B-Chat", device_map="auto", trust_remote_code=True).eval()

# 1st dialogue turn
response, history = model.chat(tokenizer, "你好", history=None)
print(response)
# Sample output: 你好！很高兴为你提供帮助。

Although the model ran successfully, inference was slow, and from time to time the process ran out of RAM and the run failed.
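
A rough back-of-the-envelope estimate helps explain the memory pressure. The sketch below assumes a round figure of 7 billion parameters and counts only the weights; activations, the KV cache, and framework overhead come on top. The function name `weight_memory_gib` and the precision table are illustrative, not part of any library.

```python
def weight_memory_gib(n_params: float, bytes_per_param: float) -> float:
    """Approximate memory needed just to hold the model weights, in GiB."""
    return n_params * bytes_per_param / 2**30


N_PARAMS = 7e9  # assumed round figure for a "7B" model

# Bytes per parameter for a few common storage precisions.
PRECISIONS = {"fp32": 4.0, "fp16/bf16": 2.0, "int8": 1.0, "int4": 0.5}

for name, nbytes in PRECISIONS.items():
    print(f"{name:>9}: {weight_memory_gib(N_PARAMS, nbytes):6.1f} GiB")
```

Under these assumptions, half-precision weights alone need roughly 13 GiB, which is already close to or beyond the free memory on many desktop machines.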

To call the local model service through an API, a Flask program has to be written.
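
A minimal sketch of what such a Flask wrapper might look like follows. It is not the author's code: the `/chat` route, the JSON field names `prompt` and `history`, and the helpers `handle_chat`, `create_app`, and `load_qwen_chat_fn` are all assumptions made for illustration. The request handling is kept in a plain function so it can be exercised without loading the model.

```python
from typing import Callable, Optional, Tuple

# A chat function takes (prompt, history) and returns (response, new_history).
ChatFn = Callable[[str, Optional[list]], Tuple[str, list]]


def handle_chat(payload: dict, chat_fn: ChatFn) -> Tuple[dict, int]:
    """Validate a JSON payload and run one chat turn; returns (body, HTTP status)."""
    prompt = payload.get("prompt") if isinstance(payload, dict) else None
    if not isinstance(prompt, str) or not prompt.strip():
        return {"error": "field 'prompt' must be a non-empty string"}, 400
    history = payload.get("history")  # None starts a new conversation
    response, new_history = chat_fn(prompt, history)
    return {"response": response, "history": new_history}, 200


def create_app(chat_fn: ChatFn):
    """Build a Flask app exposing POST /chat (hypothetical route name)."""
    # Imported here so that handle_chat stays usable without Flask installed.
    from flask import Flask, jsonify, request

    app = Flask(__name__)

    @app.route("/chat", methods=["POST"])
    def chat():
        payload = request.get_json(silent=True) or {}
        body, status = handle_chat(payload, chat_fn)
        return jsonify(body), status

    return app


def load_qwen_chat_fn() -> ChatFn:
    """Load Qwen-7B-Chat as in the example code and wrap model.chat."""
    from modelscope import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("qwen/Qwen-7B-Chat", trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(
        "qwen/Qwen-7B-Chat", device_map="auto", trust_remote_code=True
    ).eval()
    return lambda prompt, history: model.chat(tokenizer, prompt, history=history)
```

To serve the real model, one would call `create_app(load_qwen_chat_fn()).run(host="127.0.0.1", port=5000)` and POST JSON such as `{"prompt": "你好"}` to `/chat`. Returning the history to the client and accepting it back on the next request keeps the server stateless, at the cost of larger payloads.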

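This approach is not very flexible: it amounts to cloning the Tongyi Qianwen GitHub repository and using it directly. After trying it, I abandoned this approach and decided to look at more highly integrated tools to assist development.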