ModelScope + LangChain, LlamaIndex, vLLM, Xinference

Article link: https://blog.csdn.net/lovechris00/article/details/136842332

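This article introduces ModelScope, an open-source AI model community, and its relationship to DashScope, the commercial counterpart that offers models as a service. It also covers how to use ModelScope models through LangChain, LlamaIndex, and vLLM, and how the XINFERENCE_MODEL_SRC variable is used when deploying models with Xinference.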

For background on ModelScope, see this article:
https://blog.csdn.net/lovechris00/article/details/127877735

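Hugging Face (hf) is often hard to reach. You can download models through hf-mirror instead, but some frameworks cannot load a local model as freely as transformers can.
In those cases it is easier to use ModelScope, which many frameworks already support quite well.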

The relationship between ModelScope and DashScope

Some frameworks ask for a DashScope key when you use ModelScope models. How are the two related?
According to this article: https://developer.aliyun.com/article/1377012

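ModelScope is an open-source technical community and, by its positioning, is not expected to generate revenue. DashScope can be seen as ModelScope's "twin brother": the two share the same underlying architecture.
The difference is that many developers on ModelScope fine-tune from a model's checkpoint, while DashScope mainly serves model providers (such as Baichuan, Zhipu AI, and Stability.AI), offering fine-tuning and inference pipeline services to downstream vendors through APIs.

ModelScope and DashScope are two sides of the same coin, and both are part of MaaS (Model as a Service).
Relatively small models take the open-source route, while relatively large models take the commercial route.
For example, Zhipu AI's ChatGLM-6B model was open-sourced on ModelScope and has already built up a sizable user base and influence.
In the future, its 13B, 50B, and 130B models will be commercialized through DashScope.
Alibaba Cloud's Tongyi Qianwen follows the same pattern: the Qwen-7B model is open source, while the Qwen-50B model may later be commercialized through DashScope as an API.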

LangChain

Providers - More - ModelScope
https://python.langchain.com/docs/integrations/providers/modelscope
Chinese: https://python.langchain.com.cn/docs/ecosystem/integrations/modelscope

!pip install modelscope

Load the ModelScope Embedding class


from langchain_community.embeddings import ModelScopeEmbeddings

model_id = "damo/nlp_corom_sentence-embedding_english-base"

embeddings = ModelScopeEmbeddings(model_id=model_id)

text = "This is a test document."

query_result = embeddings.embed_query(text)

doc_results = embeddings.embed_documents(["foo"])
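
The values returned by `embed_query` and `embed_documents` are plain lists of floats, so they can be compared directly. Below is a minimal sketch of cosine similarity for that purpose. The helper name `cosine_similarity` is illustrative and not part of LangChain or ModelScope, and the toy vectors stand in for real embeddings.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors (hypothetical helper)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen in this sketch for zero vectors
    return dot / (norm_a * norm_b)


# Toy vectors standing in for real embeddings.
same_direction = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
```

With the objects from the snippet above, `cosine_similarity(query_result, doc_results[0])` would score the query against the first document.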

LlamaIndex

ModelScope LLMS
https://docs.llamaindex.ai/en/stable/examples/llm/modelscope.html

Install the package

!pip install llama-index-llms-modelscope

Basic Usage

from llama_index.llms.modelscope import ModelScopeLLM

llm = ModelScopeLLM(model_name="qwen/Qwen1.5-7B-Chat", model_revision="master")

rsp = llm.complete("Hello, who are you?")
print(rsp)

Use Message request

from llama_index.core.base.llms.types import MessageRole, ChatMessage

messages = [
    ChatMessage(
        role=MessageRole.SYSTEM, content="You are a helpful assistant."
    ),
    ChatMessage(role=MessageRole.USER, content="How to make cake?"),
]
resp = llm.chat(messages)
print(resp)

vLLM

vLLM quickstart
https://docs.vllm.ai/en/stable/getting_started/quickstart.html
The ModelScope community teams up with FastChat & vLLM for a streamlined LLM deployment experience
https://www.modelscope.cn/headlines/article/302

By default, vLLM downloads models from Hugging Face. To download from ModelScope instead, set the following environment variable:

export VLLM_USE_MODELSCOPE=True

You can also set it in code (to be safe, do this before importing vllm):

import os
os.environ['VLLM_USE_MODELSCOPE'] = 'True'
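
If a script should respect a value that was already exported in the shell, the assignment can be wrapped in a small helper. This is only a sketch: `enable_modelscope` is a hypothetical name, not a vLLM API, and it assumes the helper is called before `vllm` is imported.

```python
import os
from typing import MutableMapping


def enable_modelscope(env: MutableMapping[str, str] = os.environ) -> str:
    """Set VLLM_USE_MODELSCOPE unless a value is already present (hypothetical helper)."""
    # setdefault keeps an explicit earlier choice such as VLLM_USE_MODELSCOPE=False.
    return env.setdefault("VLLM_USE_MODELSCOPE", "True")


# Demonstrated on plain dicts so the real process environment is untouched.
fresh_env: dict = {}
effective = enable_modelscope(fresh_env)

preset_env = {"VLLM_USE_MODELSCOPE": "False"}
preserved = enable_modelscope(preset_env)
```

Calling `enable_modelscope()` with no argument applies the same logic to the real `os.environ`.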

Test

from vllm import LLM, SamplingParams
llm = LLM(model="qwen/Qwen1.5-1.8B")

prompts = [
    "Hello, my name is",
    "today is a sunny day,",
    "The capital of France is",
    "The future of AI is",
]
sampling_params = SamplingParams(temperature=0.8, top_p=0.95, stop=["<|endoftext|>"])
outputs = llm.generate(prompts, sampling_params)

# print the output
for output in outputs:
    prompt = output.prompt
    generated_text = output.outputs[0].text
    print(f"Prompt: {prompt!r}, Generated text: {generated_text!r}")

Xinference

Start it like this:

XINFERENCE_MODEL_SRC=modelscope xinference-local --host 0.0.0.0 --port 9997

Models downloaded this way are stored under ~/.cache/modelscope/hub/, for example:
~/.cache/modelscope/hub/qwen/Qwen-7B-Chat
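
To check whether a model has already been downloaded, the expected directory can be derived from the model ID. The sketch below assumes the cache layout shown in the example above, which may differ across ModelScope versions. `expected_model_dir` is a hypothetical helper, not part of Xinference or ModelScope.

```python
from pathlib import Path

# Cache root taken from the example above; other versions may use a different layout.
MODELSCOPE_HUB = Path.home() / ".cache" / "modelscope" / "hub"


def expected_model_dir(model_id: str, hub_root: Path = MODELSCOPE_HUB) -> Path:
    """Return where a model such as 'qwen/Qwen-7B-Chat' is expected to be cached."""
    return hub_root.joinpath(*model_id.split("/"))


model_dir = expected_model_dir("qwen/Qwen-7B-Chat")
print(model_dir, "exists" if model_dir.is_dir() else "not downloaded yet")
```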

Reference: https://inference.readthedocs.io/zh-cn/latest/getting_started/using_xinference.html#using-xinference

伊织, 2024-03-19 (Tue)