Link to this article: https://blog.csdn.net/mmlihaio/article/details/143667163

# Running MLX Models Locally with Ease: A Complete Guide to the MLXPipeline Class

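## Introduction
The MLX community publishes more than 150 open-source models on the Hugging Face Model Hub. These models can be run locally through the `MLXPipeline` class (part of `langchain_community`) or called through their hosted inference endpoints. This article walks through loading and running a model locally with `MLXPipeline`, with practical code examples along the way.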

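## Main content

### Installing the required packages
To use the `MLXPipeline` class, install the following Python packages. The examples in this article also import from `langchain_community` and `langchain_core`, so `langchain-community` is included in the command. In a Jupyter notebook, run the same command as `%pip install ...` instead.
```bash
pip install --upgrade --quiet mlx-lm transformers huggingface_hub langchain-community
```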

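Before loading a model, it can help to confirm that the packages are importable in the current environment. The following is a minimal sketch using only the standard library. The helper name `missing_packages` and the list of module names are illustrative assumptions drawn from the packages this article installs and imports; they are not part of any library's API.

```python
import importlib.util

# Import names (not pip names) of the packages this article relies on.
# This list is an assumption based on the article's install command and imports.
REQUIRED_MODULES = ["mlx_lm", "transformers", "huggingface_hub", "langchain_community"]


def missing_packages(modules):
    """Return the module names from `modules` that cannot be found."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```

Running the script prints either the module names that still need installing or a confirmation line.
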
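### Loading the model

`MLXPipeline` can load a model in two ways: from a model ID, or from a model and tokenizer that have already been loaded with `mlx_lm`.

Loading from a model ID: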

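```python
from langchain_community.llms.mlx_pipeline import MLXPipeline

# Load the model by its Hugging Face model ID (downloaded on first use).
# pipeline_kwargs are generation settings: at most 10 new tokens, temperature 0.1.
pipe = MLXPipeline.from_model_id(
    "mlx-community/quantized-gemma-2b-it",
    pipeline_kwargs={"max_tokens": 10, "temp": 0.1},
)
```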

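The `temp` entry in `pipeline_kwargs` is a sampling temperature. As a rough illustration of why a low value such as 0.1 makes generation close to deterministic, the sketch below applies temperature scaling to a made-up set of token scores in plain Python. The function name `softmax_with_temperature` and the example scores are hypothetical, and this is a generic illustration rather than mlx-lm's actual sampling code.

```python
import math


def softmax_with_temperature(logits, temp):
    """Convert raw scores to probabilities after dividing each score by `temp`."""
    if temp <= 0:
        raise ValueError("temp must be positive")
    scaled = [x / temp for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical scores for three candidate tokens.
logits = [2.0, 1.0, 0.5]

for temp in (1.0, 0.1):
    probs = softmax_with_temperature(logits, temp)
    print(temp, [round(p, 3) for p in probs])
```

With the lower temperature, nearly all of the probability mass lands on the highest-scoring token, which is why small `temp` values tend to give more repeatable answers.
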
Loading from an already-loaded model and tokenizer:

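```python
from langchain_community.llms.mlx_pipeline import MLXPipeline
from mlx_lm import load

# Load the model and tokenizer with mlx_lm, then wrap them in an MLXPipeline.
model, tokenizer = load("mlx-community/quantized-gemma-2b-it")
pipe = MLXPipeline(model=model, tokenizer=tokenizer)
```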

### Creating a chain

With the model loaded, you can compose it with a prompt template into a chain and run inference.

```python
from langchain_core.prompts import PromptTemplate

# {question} is filled in when the chain is invoked.
template = """Question: {question}
Answer: Let's think step by step."""
prompt = PromptTemplate.from_template(template)

# Pipe the formatted prompt into the model loaded earlier.
chain = prompt | pipe

question = "What is electroencephalography?"

print(chain.invoke({"question": question}))
```

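To make the `prompt | pipe` expression less opaque, here is a minimal plain-Python sketch of how a `|` operator can compose two steps so that the output of the first feeds the second. The `Step` class and the `fake_model` stand-in are hypothetical and exist only for illustration. LangChain's real runnables are considerably more capable, and no MLX model is involved here.

```python
class Step:
    """A callable wrapper that supports `a | b` composition."""

    def __init__(self, func):
        self.func = func

    def invoke(self, value):
        return self.func(value)

    def __or__(self, other):
        # Run self first, then feed its output into `other`.
        return Step(lambda value: other.invoke(self.invoke(value)))


TEMPLATE = """Question: {question}
Answer: Let's think step by step."""

# Step 1: fill the template from a dict of variables.
prompt = Step(lambda variables: TEMPLATE.format(**variables))


# Step 2: a stand-in for the model that only reports the prompt length.
def fake_model(text):
    return f"[model received {len(text)} characters]"


pipe = Step(fake_model)

chain = prompt | pipe
print(chain.invoke({"question": "What is electroencephalography?"}))
```

The design point is that `chain.invoke(...)` takes the same dict the prompt step expects, and each step only needs to agree with its neighbor on the value passed between them.
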
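## Code example

The following complete example shows the whole flow: installing the libraries, loading the model, and creating a simple chain: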

```python
# Install the required packages first (run in a shell, or use %pip in a notebook):
#   pip install --upgrade --quiet mlx-lm transformers huggingface_hub langchain-community

# Import the required modules
from langchain_community.llms.mlx_pipeline import MLXPipeline
from langchain_core.prompts import PromptTemplate

# Load the model
pipe = MLXPipeline.from_model_id(
    "mlx-community/quantized-gemma-2b-it",
    pipeline_kwargs={"max_tokens": 10, "temp": 0.1},
)

# Create the prompt template
template = """Question: {question}
Answer: Let's think step by step."""
prompt = PromptTemplate.from_template(template)

# Build the chain
chain = prompt | pipe

# Ask a question and print the answer
question = "What is electroencephalography?"
print(chain.invoke({"question": question}))
```