The APIs of Qwen, Zhipu, Moonshot, and other LLM providers are currently OpenAI-compatible, so you can configure them by following the OpenAI-Compatible Endpoints tutorial:
https://docs.litellm.ai/docs/providers/openai_compatible
If you go with the Custom API Server (Custom Format) approach instead, calls do go through, but tokens are not written to the usage records:
https://docs.litellm.ai/docs/providers/custom_llm_server
It may also be that I was doing something wrong; feedback is welcome.
Configuring models
Option 1:
In the admin web UI, when adding a model, select OpenAI-Compatible...
Option 2: write a config file
Put the following into config.yaml; for each api_key, fill in the key you applied for on that platform.
# general_settings: master_key
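# e.g. to require an admin key for the proxy (a sketch; the key value is hypothetical):
# general_settings:
#   master_key: sk-1234   # clients would then pass this value as their api_key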
model_list:
  - model_name: mistralai--Mistral-Nemo-Instruct-2407
    litellm_params:
      model: huggingface/mistralai/Mistral-Nemo-Instruct-2407
      api_key: os.environ/HUGGINGFACE_API_KEY
  - model_name: "my-custom-model"
    litellm_params:
      model: "my-custom-llm/my-model"
  - model_name: "glm-4"
    litellm_params:
      model: "openai/glm-4"
      api_key: "6eeeb...abPJyrc8e"
      api_base: "https://open.bigmodel.cn/api/paas/v4/"
  - model_name: "qwen-plus"
    litellm_params:
      model: "openai/qwen-plus"
      api_key: 'sk-40d1c7...1d4'
      api_base: "https://dashscope.aliyuncs.com/compatible-mode/v1"
  - model_name: "moonshot-v1-8k"
    litellm_params:
      model: "openai/moonshot-v1-8k"
      api_key: 'sk-d7YcsMzyml...SzPCzQMB5e'
      api_base: "https://api.moonshot.cn/v1"
Running the service
litellm --config /Users/.../config.yaml --detailed_debug
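Once the proxy is up, a quick way to confirm the config loaded is to list the registered models through it (a minimal sketch; assumes the proxy runs on port 4000 and accepts sk-1234 as the key):
import openai

client = openai.OpenAI(api_key="sk-1234", base_url="http://0.0.0.0:4000")
# should print the model_name entries from config.yaml
print([m.id for m in client.models.list().data])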
Testing with the openai client
import openai

client = openai.OpenAI(
    api_key="sk-1234",  # pass the litellm proxy key, or a virtual key if you use them
    base_url="http://0.0.0.0:4000"  # litellm proxy base url
)

# any model_name registered in config.yaml works; the last assignment takes effect
# model_name = "glm-4"
# model_name = "qwen-plus"
model_name = "moonshot-v1-8k"

response = client.chat.completions.create(
    model=model_name,
    messages=[
        {"role": "user", "content": "你吃了吗?"}  # "Have you eaten?"
    ],
)
print(response)
Output:
ChatCompletion(
    id='chatcmpl-66e4e28a9dce9c9fee5e5960',
    choices=[
        Choice(
            finish_reason='stop',
            index=0,
            logprobs=None,
            message=ChatCompletionMessage(
                content='作为一个人工智能助手,我没有生理需求,所以不需要吃饭。但是,我很高兴为您提供帮助。请问有什么问题我可以帮您解答吗?',
                refusal=None,
                role='assistant',
                function_call=None,
                tool_calls=None
            )
        )
    ],
    created=1726276234,
    model='moonshot-v1-8k',
    object='chat.completion',
    service_tier=None,
    system_fingerprint=None,
    usage=CompletionUsage(
        completion_tokens=28,
        prompt_tokens=11,
        total_tokens=39,
        completion_tokens_details=None
    )
)
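Streaming also works through the proxy with the standard OpenAI streaming API; a small sketch reusing the client and model_name from above:
stream = client.chat.completions.create(
    model=model_name,
    messages=[{"role": "user", "content": "你吃了吗?"}],
    stream=True,
)
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="")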
Configuring embedding models
Official docs: https://docs.litellm.ai/docs/embedding/supported_embedding
Reference: DashScope embedding models' OpenAI compatibility
https://help.aliyun.com/zh/dashscope/developer-reference/openai-embedding-interface
Here I add the following to config.yaml:
- model_name: "text-embedding-v1"
litellm_params:
model: "openai/text-embedding-v1"
api_key: 'sk-40d1c7...11d4'
api_base: "https://dashscope.aliyuncs.com/compatible-mode/v1"
- model_name: "zhipu--Embedding-3"
litellm_params:
model: "openai/Embedding-3"
api_key: "6eeeb...yrc8e"
api_base: "https://open.bigmodel.cn/api/paas/v4/"
Note: Zhipu's official model page gives the endpoint as https://open.bigmodel.cn/api/paas/v4/embeddings, but the trailing embeddings must be dropped when configuring api_base here; otherwise requests fail with NotFoundError ... 'path': '/v4/embeddings/embeddings'.
Calling via the litellm SDK
from litellm import embedding

api_base = "http://0.0.0.0:4000/"
# Regardless of whether the configured model_name carries an openai prefix,
# you must prepend openai/ here:
# model_name = 'openai/zhipu--Embedding-3'  # also works
model_name = 'openai/text-embedding-v1'
response = embedding(model=model_name, api_base=api_base, input=["good morning from litellm"], api_key='sk-1234')
Output:
EmbeddingResponse(
    model='Embedding-3',
    data=[
        {'embedding': [-0.019012451, 0.001613617, -0.0066719055, -0.0015325546, -0.013648987, 0.009605408, 0.0064048767, 0.02810669, 0.0038928986, 0.021697998, 0.016098022, 0.01928711, -0.0015888214, -0.0029773712, 0.011268616, 0.020355225, 0.011779785, -0.013755798, -0.023406982, 0.034942627, 0.010437012, ..., 0.0014047623, -0.026107788, 0.01939392, 0.011260986, -0.048828125, -0.011276245, 0.01448822, 0.0005726814, -0.011695862, 0.012634277, 0.011489868, -0.021652222, 0.02947998, 0.0013313293, 0.040405273, -0.022705078, 0.04095459, -0.02406311, -0.004421234],
         'index': 0, 'object': 'embedding'}
    ],
    object='list',
    usage=Usage(
        completion_tokens=0, prompt_tokens=9,
        total_tokens=9, completion_tokens_details=None
    )
)
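As the output shows, the data entries are plain dicts here, so the vector and token usage can be read out like this (a sketch continuing from the embedding call above):
vec = response.data[0]['embedding']
print(len(vec))                     # embedding dimension
print(response.usage.total_tokens)  # 9 in the run above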
Embeddings via the openai client
from openai import OpenAI

llm_base_url = 'http://localhost:4000/'
llm_api_key = 'sk-1234'

# embed_model_name = 'openai/text-embedding-v1'  # Ali (DashScope)
embed_model_name = 'openai/Embedding-3'  # Zhipu

client = OpenAI(
    base_url=llm_base_url,
    api_key=llm_api_key,
)

response = client.embeddings.create(
    input="Your text goes here",
    model=embed_model_name
)
print(response)
embedding_data = response.data[0].embedding
print(len(embedding_data))
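The openai embeddings endpoint also accepts a list of inputs, and the proxy passes the batch through; a sketch reusing the client above:
batch = client.embeddings.create(
    input=["first sentence", "second sentence"],
    model=embed_model_name
)
print(len(batch.data))  # one embedding per input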
Using with llama_index
I tried the AzureOpenAI approach given in the LiteLLM docs, as well as the OpenAI-like wrapper from llama_index; neither worked well. Later I found that llama_index ships a dedicated LiteLLM integration:
https://docs.llamaindex.ai/en/stable/examples/llm/litellm/
The steps are as follows:
1. Install
pip install llama-index
pip install llama-index-llms-litellm
pip install llama-index-embeddings-litellm
2. Model configuration in litellm
...
  - model_name: "zhipu--glm-4"
    litellm_params:
      model: "openai/glm-4"
      api_key: "6ee...yrc8e"
      api_base: "https://open.bigmodel.cn/api/paas/v4/"
  - model_name: "text-embedding-v1"
    litellm_params:
      model: "openai/text-embedding-v1"
      api_key: 'sk-40...11d4'
      api_base: "https://dashscope.aliyuncs.com/compatible-mode/v1"
...
I list this config again because the model name used by external callers depends heavily on these entries, and the official docs are not clear on this point.
3. Calling
from llama_index.llms.litellm import LiteLLM
from llama_index.embeddings.litellm import LiteLLMEmbedding
from llama_index.core.llms import ChatMessage
from llama_index.core import Settings

litellm_key = "sk-1234"
litellm_base_url = 'http://localhost:4000/'

# model_name = 'openai/glm-4'  # does not work
model_name = 'openai/zhipu--glm-4'
llm = LiteLLM(
    model=model_name,
    api_key=litellm_key,
    api_base=litellm_base_url
)
message = ChatMessage(role="user", content="Hey! how's it going?")
chat_response = llm.chat([message])

# embedding
embed_model_name = 'openai/text-embedding-v1'
embed_model = LiteLLMEmbedding(
    model_name=embed_model_name,
    api_key=litellm_key,
    api_base=litellm_base_url
)
embed_data = embed_model.get_text_embedding('hello')

# set globally
Settings.llm = llm
Settings.embed_model = embed_model
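With the global Settings in place, downstream llama_index components pick up both models automatically. A minimal sketch (the document text here is made up for illustration):
from llama_index.core import VectorStoreIndex, Document

# from_documents uses Settings.embed_model; the query engine uses Settings.llm
index = VectorStoreIndex.from_documents(
    [Document(text="LiteLLM proxies many LLM providers behind one OpenAI-style API.")]
)
print(index.as_query_engine().query("What does LiteLLM do?"))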
Using with LangChain
Reference: https://docs.litellm.ai/docs/proxy/user_keys
import os

from langchain.chat_models import ChatOpenAI
from langchain.schema import HumanMessage, SystemMessage

os.environ["OPENAI_API_KEY"] = "sk-1234"

# any of these four model names works; the last assignment takes effect
# model_name = 'openai/qwen-plus'
# model_name = 'qwen-plus'
# model_name = 'openai/GLM-4-Flash'
model_name = "zhipu--GLM-4-Flash"

chat = ChatOpenAI(
    openai_api_base="http://0.0.0.0:4000",
    model=model_name,
    temperature=0.1,
    extra_body={
        "metadata": {  # optional metadata forwarded to litellm for logging/tracing
            "generation_name": "ishaan-generation-langchain-client",
            "generation_id": "langchain-client-gen-id22",
            "trace_id": "langchain-client-trace-id22",
            "trace_user_id": "langchain-client-user-id2"
        }
    }
)
messages = [
    SystemMessage(content="你是一个有用的生活小助手"),  # "You are a helpful everyday-life assistant"
    HumanMessage(content="今晚吃什么"),  # "What should I have for dinner tonight?"
]
response = chat(messages)
print('-- ', response)
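Streaming works with the same ChatOpenAI instance too (a sketch; .stream comes from LangChain's Runnable interface, so it assumes a reasonably recent langchain version):
for chunk in chat.stream(messages):
    print(chunk.content, end="")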
2024-09-14 (a buy-five-get-one-free Saturday)