[LocalAI] (9): Running LocalAI locally on CPU with four models at once: an embedding model, the qwen-1.5-0.5b model, an image generation model, and a speech-to-text model

Article link: https://blog.csdn.net/freewebsys/article/details/138370316

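This article introduces LocalAI, a local inference service compatible with the OpenAI API that can run a variety of models on CPU and GPU hardware. It walks through starting the local image with Docker, testing each endpoint, and replacing the model download sources.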


1. About the LocalAI project

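LocalAI is a REST API for local inference that is compatible with the OpenAI API specification.
It lets you run LLMs (and more) locally on consumer-grade hardware, and it supports multiple model families compatible with the ggml format. Both CPU and GPU hardware are supported.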

Project site:
https://localai.io/


2. Where to find the launch instructions

https://gitee.com/fly-llm/localai-run-llm/blob/master/DockerREADME.md

The AIO CPU image is described as follows:

Use this image with CPU-only.
Please keep using only C++ backends so the base image is as small as possible (without CUDA, cuDNN, python, etc).

3. Starting the local image with Docker


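git clone https://gitee.com/fly-llm/localai-run-llm.git
# Assumption: the aio/ folder mounted below lives at the repository root.
cd localai-run-llm

# Run with debug logging enabled: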
docker run -p 8080:8080 -e DEBUG=true --name local-ai -it \
-v `pwd`/aio:/aio -v `pwd`/models:/build/models localai/localai:latest-aio-cpu
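
On first start the AIO image has to download all of its models, so the endpoints may not answer for a while. The following is a minimal sketch of a readiness wait, assuming the `-p 8080:8080` mapping above and LocalAI's `/readyz` health endpoint. The function names, attempt count, and delay are illustrative choices, not something taken from the repository.

```python
import time
import urllib.error
import urllib.request
from typing import Callable

# Assumption: readiness endpoint on the port published by `-p 8080:8080`.
READY_URL = "http://localhost:8080/readyz"


def http_probe(url: str = READY_URL) -> bool:
    """Return True if the URL answers with HTTP 200, False on any error."""
    try:
        with urllib.request.urlopen(url, timeout=5) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or a non-2xx status all mean "not ready yet".
        return False


def wait_until_ready(probe: Callable[[], bool] = http_probe,
                     attempts: int = 60,
                     delay_s: float = 5.0) -> bool:
    """Call `probe` until it returns True or `attempts` is exhausted."""
    for attempt in range(attempts):
        if probe():
            return True
        if attempt < attempts - 1:
            time.sleep(delay_s)
    return False
```

Calling `wait_until_ready()` before running the tests below avoids confusing "connection refused" errors while the models are still downloading.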

4. Model 1: embedding, testing the endpoint


curl -X 'POST' http://localhost:8080/v1/embeddings \
 -H "Content-Type: application/json" \
 -d '{
  "input": "测试embeddings",
  "model": "text-embedding-ada-002"
}'
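
The same call can be made from Python, which makes it easier to do something with the vector afterwards. This is a rough sketch using only the standard library; it assumes the server is reachable at `http://localhost:8080` and that the response follows the OpenAI layout (`data[0].embedding`). The helper names are made up for this example.

```python
import json
import math
import urllib.request

# Assumption: port mapping from the docker run command above.
BASE_URL = "http://localhost:8080"


def embedding_request(text, model="text-embedding-ada-002"):
    """Build the POST request for /v1/embeddings without sending it."""
    body = json.dumps({"input": text, "model": model}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/v1/embeddings",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_embedding(text, model="text-embedding-ada-002"):
    """Send the request and return the embedding vector as a list of floats."""
    with urllib.request.urlopen(embedding_request(text, model), timeout=60) as resp:
        payload = json.load(resp)
    # OpenAI-style response: {"data": [{"embedding": [...]}], ...}
    return payload["data"][0]["embedding"]


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0
```

For example, `cosine_similarity(fetch_embedding("北京景点"), fetch_embedding("北京旅游"))` gives a quick sanity check that related sentences score higher than unrelated ones.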

5. Model 2: LLM, testing the endpoint


curl -X 'POST' 'http://localhost:8080/v1/chat/completions' \
-H 'Content-Type: application/json' -d '{
    "model": "qwen-1.5-0.5b-chat",
    "messages": [
        {
            "role": "user",
            "content": "北京景点?"
        }
    ],
    "temperature": 1
}'
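
Because the endpoint follows the OpenAI chat format, a few lines of Python are enough to wrap it. The sketch below assumes the same host and port as the curl command and the usual OpenAI response shape (`choices[0].message.content`). The function names and the generous timeout (CPU inference on a small machine can be slow) are my own choices.

```python
import json
import urllib.request

# Assumption: same host/port as the curl example.
CHAT_URL = "http://localhost:8080/v1/chat/completions"


def build_chat_payload(question, model="qwen-1.5-0.5b-chat", temperature=1.0):
    """Return the JSON body for a single-turn chat request."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": question}],
        "temperature": temperature,
    }


def extract_reply(response):
    """Pull the assistant text out of an OpenAI-style chat response dict."""
    return response["choices"][0]["message"]["content"]


def ask(question, **kwargs):
    """POST the question to the local server and return the reply text."""
    body = json.dumps(build_chat_payload(question, **kwargs)).encode("utf-8")
    req = urllib.request.Request(
        CHAT_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=300) as resp:
        return extract_reply(json.load(resp))
```

With the container running, `print(ask("北京景点?"))` should print the model's answer.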

6. Model 3: stablediffusion, testing the endpoint


## Generate an image. Model name: stablediffusion

curl http://localhost:8080/v1/images/generations -H "Content-Type: application/json" -d '{
  "prompt": "floating hair, portrait, ((loli)), ((one girl)), cute face, hidden hands, asymmetrical bangs, beautiful detailed eyes, eye shadow, hair ornament, ribbons, bowties, buttons, pleated skirt, (((masterpiece))), ((best quality)), colorful|((part of the head)), ((((mutated hands and fingers)))), deformed, blurry, bad anatomy, disfigured, poorly drawn face, mutation, mutated, extra limb, ugly, poorly drawn hands, missing limb, blurry, floating limbs, disconnected limbs, malformed hands, blur, out of focus, long neck, long body, Octane renderer, lowres, bad anatomy, bad hands, text",
  "size": "256x256"
}'
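
The long prompt in the curl command is really two lists joined by a `|` character. As far as I understand the LocalAI image generation docs, the text before `|` is the positive prompt and the text after it is the negative prompt. Here is a small sketch that builds such a request body from two lists; the helper names are hypothetical, and the `data[0].url` response field is an assumption based on the OpenAI image response format.

```python
import json
import urllib.request

# Assumption: same host/port as the curl example.
IMAGE_URL = "http://localhost:8080/v1/images/generations"


def build_image_payload(positive, negative=(), size="256x256"):
    """Join positive and negative tags into one 'positive|negative' prompt."""
    prompt = ", ".join(positive)
    if negative:
        prompt += "|" + ", ".join(negative)
    return {"prompt": prompt, "size": size}


def generate_image(payload):
    """POST the payload and return the URL of the first generated image."""
    req = urllib.request.Request(
        IMAGE_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=600) as resp:
        result = json.load(resp)
    # Assumed OpenAI-style response: {"data": [{"url": "..."}]}
    return result["data"][0]["url"]
```

Keeping the two lists separate in code makes it easier to tweak the negative prompt without touching the rest.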

7. Model 4: speech-to-text


curl http://localhost:8080/v1/audio/transcriptions -H "Content-Type: multipart/form-data" -F file="@$PWD/voice-test.mp3" -F model="whisper-1"
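
Unlike the JSON endpoints, this one takes a multipart form upload, which is what curl's `-F` flags produce. For reference, here is a standard-library sketch of the same upload. It assumes the `file` and `model` field names from the curl command and a response containing a `text` field; `encode_multipart` and `transcribe` are illustrative helpers, not part of any LocalAI client.

```python
import json
import urllib.request
import uuid
from pathlib import Path

# Assumption: same host/port as the curl example.
TRANSCRIBE_URL = "http://localhost:8080/v1/audio/transcriptions"


def encode_multipart(fields, file_field, filename, file_bytes):
    """Return (body, content_type) for a multipart/form-data upload."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts += [
            f"--{boundary}".encode(),
            f'Content-Disposition: form-data; name="{name}"'.encode(),
            b"",
            value.encode("utf-8"),
        ]
    parts += [
        f"--{boundary}".encode(),
        f'Content-Disposition: form-data; name="{file_field}"; filename="{filename}"'.encode(),
        b"Content-Type: application/octet-stream",
        b"",
        file_bytes,
        f"--{boundary}--".encode(),
        b"",
    ]
    return b"\r\n".join(parts), f"multipart/form-data; boundary={boundary}"


def transcribe(path, model="whisper-1"):
    """Upload an audio file and return the transcribed text."""
    audio = Path(path)
    body, content_type = encode_multipart(
        {"model": model}, "file", audio.name, audio.read_bytes()
    )
    req = urllib.request.Request(
        TRANSCRIBE_URL, data=body,
        headers={"Content-Type": content_type}, method="POST",
    )
    with urllib.request.urlopen(req, timeout=600) as resp:
        return json.load(resp)["text"]
```

`transcribe("voice-test.mp3")` mirrors the curl call above.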

8. Model 5: TTS, testing the endpoint (I did not get this one working)

curl http://localhost:8080/tts -H "Content-Type: application/json" -d '{
    "model":"voice-en-us-amy-low",
    "input": "Hi, this is a test."
}'

9. Model download addresses in the aio folder

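I replaced the aio folder inside the image and changed every model download URL to point at a mirror hosted in mainland China.
The download addresses now use ModelScope.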

download_files:
- filename: "bge-base-zh-v1.5-ggml-model-q4_0.bin"
  sha256: "da4d976e3988977ec4d9fde6653a8fe954b71a0c502c30eda6f84234556cde54"
  uri: "https://www.modelscope.cn/api/v1/models/flyiot/bge-base-zh-v1.5-ggml/repo?Revision=master&FilePath=ggml-model-q4_0.bin"
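
Each `download_files` entry carries a `sha256` value, so a downloaded file can be checked against it by hand if a model refuses to load. A minimal sketch follows; the function names are illustrative, and the path where the file ends up on your machine is an assumption you will need to adjust.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large model files do not need to fit in memory."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_sha256):
    """Return True if the file's SHA-256 matches the value from the YAML entry."""
    return sha256_of(path) == expected_sha256.strip().lower()
```

For the entry above this would be something like `verify_download("models/bge-base-zh-v1.5-ggml-model-q4_0.bin", "da4d976e3988977ec4d9fde6653a8fe954b71a0c502c30eda6f84234556cde54")`, assuming the file landed in the mounted `models` directory.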