Using LocalAI to Run a Local AI Model Behind an OpenAI-Compatible REST API

Latest recommended article published on 2024-08-28 08:10:09

qq_37836323

Tags: python

Article link: https://blog.csdn.net/qq_29929123/article/details/140684468

In this article, we show how to use LocalAI to set up a local REST API service that is compatible with the OpenAI API, and how to use LlamaIndex to talk directly to the LocalAI server.

Setting Up LocalAI

First, let's set up LocalAI locally.

git clone git@github.com:mudler/LocalAI.git
cd LocalAI
git checkout tags/v1.40.0

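Next, start the LocalAI server and download the lunademo model. Running docker compose up builds the LocalAI container locally, which can take some time. v1.40.0 ships prebuilt Docker images for several platforms, but not for all of them, so this tutorial builds locally to cover more cases.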

docker compose up --detach
curl http://localhost:8080/models/apply -H "Content-Type: application/json" -d '{
    "id": "model-gallery@lunademo"
}'

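Use the job ID printed in the output to monitor the model download. Depending on your download speed, this may take a few minutes. Run the following command (123abc is a placeholder for your job ID):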

curl -s http://localhost:8080/models/jobs/123abc
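Instead of re-running the curl command by hand, the same check can be looped from Python. The following is a minimal sketch, not part of LocalAI itself: the helper names fetch_job_status and wait_for_job are hypothetical, and the code assumes the job status JSON carries a boolean "processed" field and an "error" field that is empty on success. Verify both field names against your server's actual output before relying on them.

```python
import json
import time
import urllib.request

BASE_URL = "http://localhost:8080"  # same host and port as the curl commands


def fetch_job_status(job_id, base_url=BASE_URL):
    """GET /models/jobs/<job_id> and return the decoded JSON body."""
    url = f"{base_url}/models/jobs/{job_id}"
    with urllib.request.urlopen(url, timeout=30) as resp:
        return json.load(resp)


def wait_for_job(job_id, fetch=fetch_job_status, poll_seconds=5.0, max_polls=240):
    """Poll a model job until it completes, fails, or the poll budget runs out.

    Assumption: the status JSON has a boolean "processed" field and an
    "error" field that is empty or null on success.
    """
    for _ in range(max_polls):
        status = fetch(job_id)
        if status.get("error"):
            raise RuntimeError(f"model job {job_id} failed: {status['error']}")
        if status.get("processed"):
            return status
        time.sleep(poll_seconds)
    raise TimeoutError(f"model job {job_id} did not finish after {max_polls} polls")


# Usage, with the placeholder job ID from the curl example:
#     print(wait_for_job("123abc"))
```

The fetch parameter exists so the polling logic can be exercised without a running server, for example by passing a function that returns canned status dictionaries.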

List the downloaded models:

curl http://localhost:8080/v1/models

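Manual Interaction

Once the server is running, we can test it outside LlamaIndex. The chat call itself may take several minutes, depending on the model and the compute hardware.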

curl -X POST http://localhost:8080/v1/chat/completions -H "Content-Type: application/json" -d '{
    "model": "lunademo",
    "messages": [{"role": "user", "content": "How are you?"}],
    "temperature": 0.9
}'

Example output:

{
    "created":123,
    "object":"chat.completion",
    "id":"abc123",
    "model":"lunademo",
    "choices":[
        {
            "index":0,
            "finish_reason":"stop",
            "message":{
                "role":"assistant",
                "content":"I'm doing well, thank you. How about yourself?... (truncated)"
            }
        }
    ],
    "usage":{
        "prompt_tokens":0,
        "completion_tokens":0,
        "total_tokens":0
    }
}
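The same request can be sent from Python with only the standard library. This is a rough sketch that mirrors the curl call and pulls the assistant text out of a response shaped like the example output. The helper names build_chat_payload, post_chat, and first_reply are hypothetical, and the 600-second timeout is an assumption chosen because the call may take several minutes.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080"  # same host and port as the curl command


def build_chat_payload(prompt, model="lunademo", temperature=0.9):
    """Build the JSON body used in the curl example for a single user message."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "temperature": temperature,
    }


def post_chat(payload, base_url=BASE_URL, timeout=600):
    """POST the payload to /v1/chat/completions and return the decoded JSON."""
    request = urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)


def first_reply(response):
    """Return the assistant text of the first choice in a chat completion."""
    return response["choices"][0]["message"]["content"]


# Usage, assuming the server is running and lunademo is loaded:
#     print(first_reply(post_chat(build_chat_payload("How are you?"))))
```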

LlamaIndex Interaction

Next, let's write some code that interacts with LocalAI through LlamaIndex.

%pip install llama-index-llms-openai-like

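from llama_index.core.llms import LOCALAI_DEFAULTS, ChatMessage
from llama_index.llms.openai_like import OpenAILike

# Generous timeout: a chat call with lunademo on a Mac M1 can take minutes.
MAC_M1_LUNADEMO_CONSERVATIVE_TIMEOUT = 10 * 60  # sec

# LOCALAI_DEFAULTS supplies the connection settings for the local server.
model = OpenAILike(
    **LOCALAI_DEFAULTS,
    model="lunademo",
    is_chat_model=True,  # use the chat completions endpoint
    timeout=MAC_M1_LUNADEMO_CONSERVATIVE_TIMEOUT,
)
# ChatMessage defaults to the "user" role when none is given.
response = model.chat(messages=[ChatMessage(content="How are you?")])
print(response)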

Example output:

assistant: I'm doing well, thank you. How about yourself?

Do you have any questions or concerns regarding your health?... (truncated)

References

LocalAI GitHub repository

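Common Errors and Solutions

Docker image build fails: make sure Docker is installed on your system and is up to date.
API request times out: loading the model can take a long time, so increase the timeout accordingly.
No response or an error response: check that the LocalAI server started correctly and that the model was downloaded and loaded successfully.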
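The last check can be partly automated. Below is a small sketch, assuming the /v1/models endpoint returns an OpenAI-style list of the form {"data": [{"id": ...}, ...]}. The helper names list_models and diagnose are hypothetical. The code only tells apart an unreachable server, a missing model, and a healthy setup, and it does not confirm that the model is actually loaded into memory.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080"  # same host and port used throughout this article


def list_models(base_url=BASE_URL):
    """GET /v1/models and return the decoded JSON body."""
    with urllib.request.urlopen(f"{base_url}/v1/models", timeout=30) as resp:
        return json.load(resp)


def diagnose(fetch=list_models, model_name="lunademo"):
    """Return a short diagnosis string for the local server.

    Assumption: the response follows the OpenAI list shape, with model
    names stored under data[i]["id"].
    """
    try:
        payload = fetch()
    except OSError as exc:
        # URLError, refused connections, and socket timeouts all derive from OSError.
        return f"server unreachable: {exc}"
    ids = [entry.get("id") for entry in payload.get("data", [])]
    if model_name not in ids:
        return f"server is up, but model {model_name!r} is not listed (found: {ids})"
    return "ok"


# Usage:
#     print(diagnose())
```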
