Three Ways to Run Llama 3.2 Locally

AI学习不迷路

Published 2024-10-10 10:05:20

Tags: llama, artificial intelligence, natural language processing, large language models, AI large models, language models, LLM

Article link: https://blog.csdn.net/qkh1234567/article/details/142813946

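Large language models (LLMs) have transformed the AI field, and smaller models are on the rise as well. As a result, it is now feasible to run capable LLMs even on older PCs and smartphones. To give you a starting point, we will explore three different ways to interact with Llama 3.2 locally.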

Prerequisites

Before we dive in, make sure you have:

Installed and started Ollama
Pulled the Llama 3.2 model (run ollama pull llama3.2 in a terminal)

Now let's explore the three methods!

Method 1: Using the Ollama Python Package

Ollama's Python package offers a simple way to interact with Llama 3.2 from a Python script or a Jupyter notebook.

import ollama


response = ollama.chat(
    model="llama3.2",
    messages=[
        {
            "role": "user",
            "content": "Tell me an interesting fact about elephants",
        },
    ],
)
print(response["message"]["content"])

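This approach works well for simple synchronous interactions. If you want to receive the response as a stream from asynchronous code, Ollama provides AsyncClient: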

import asyncio
from ollama import AsyncClient


async def chat():
    message = {
        "role": "user",
        "content": "Tell me an interesting fact about elephants"
    }
    async for part in await AsyncClient().chat(
        model="llama3.2", messages=[message], stream=True
    ):
        print(part["message"]["content"], end="", flush=True)


# Run the async function
asyncio.run(chat())
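
Streaming does not require async code. As a minimal sketch, assuming the ollama package is installed and the Ollama server is running, the synchronous ollama.chat call also accepts stream=True and then yields partial responses one at a time. The helper names collect_stream and stream_chat are illustrative and not part of the library.

```python
from typing import Iterable, Mapping


def collect_stream(parts: Iterable[Mapping]) -> str:
    """Print each streamed chunk as it arrives and return the full reply."""
    pieces = []
    for part in parts:
        text = part["message"]["content"]
        print(text, end="", flush=True)
        pieces.append(text)
    print()  # finish the line once the stream ends
    return "".join(pieces)


def stream_chat(prompt: str, model: str = "llama3.2") -> str:
    """Send one user message and stream the reply synchronously."""
    import ollama  # imported lazily so collect_stream works without the package

    stream = ollama.chat(
        model=model,
        messages=[{"role": "user", "content": prompt}],
        stream=True,  # the call now returns an iterator of partial responses
    )
    return collect_stream(stream)
```

Calling stream_chat("Tell me an interesting fact about elephants") prints the reply as it is generated and returns the complete text, which is handy when the answer also needs to be stored or post-processed.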

Method 2: Using the Ollama API

For those who prefer to work with the API directly, or who want to integrate Llama 3.2 into a non-Python application, Ollama exposes a simple HTTP API.

curl http://localhost:11434/api/chat -d '{
    "model": "llama3.2",
    "messages": [
        {
            "role": "user",
            "content": "What are God Particles?"
        }
    ],
    "stream": false
}'

This approach lets you interact with Llama 3.2 from any language or tool that can send HTTP requests.
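
To illustrate that any HTTP client will do, here is a rough Python equivalent of the curl command that uses only the standard library. It assumes the server is listening on the default address from the curl example. The function names build_chat_payload and chat_once are made up for this sketch.

```python
import json
import urllib.request

# Default local endpoint, taken from the curl example
OLLAMA_CHAT_URL = "http://localhost:11434/api/chat"


def build_chat_payload(prompt, model="llama3.2", stream=False):
    """Build the JSON body for /api/chat (same shape as the curl example)."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": stream,
    }


def chat_once(prompt, model="llama3.2", url=OLLAMA_CHAT_URL, timeout=120):
    """POST a single non-streaming chat request and return the reply text."""
    body = json.dumps(build_chat_payload(prompt, model)).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        reply = json.load(response)
    # With "stream": false the server answers with one JSON object whose
    # "message" field holds the assistant's reply.
    return reply["message"]["content"]
```

With Ollama running, print(chat_once("What are God Particles?")) should print the model's answer. Keeping the payload builder separate makes it easy to inspect or log the request body without contacting the server.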

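Method 3: Building Advanced Applications with LangChain

For more complex applications, especially those involving document analysis and retrieval, LangChain integrates smoothly with Ollama and Llama 3.2.

The following snippet shows how to load documents, create embeddings, and run a similarity search: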

from langchain_community.document_loaders import DirectoryLoader, UnstructuredWordDocumentLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.llms import Ollama
from langchain_community.vectorstores import Chroma


# Load the documents
loader = DirectoryLoader('/path/to/documents', glob="**/*.docx", loader_cls=UnstructuredWordDocumentLoader)
documents = loader.load()


# Split the documents into chunks
text_splitter = RecursiveCharacterTextSplitter(chunk_size=500, chunk_overlap=0)
splits = text_splitter.split_documents(documents)


# Create the embeddings and the vector store
# (assumes the embedding model was pulled first: ollama pull nomic-embed-text)
embeddings = OllamaEmbeddings(model="nomic-embed-text")
vectorstore = Chroma.from_documents(documents=splits, embedding=embeddings)


# Initialize Llama 3.2
llm = Ollama(model="llama3.2", base_url="http://localhost:11434")


# Run a similarity search and generate an answer
query = "What was the main accomplishment of Thomas Jefferson?"
similar_docs = vectorstore.similarity_search(query)
context = "\n".join(doc.page_content for doc in similar_docs)
# invoke() replaces the deprecated direct call llm(...)
response = llm.invoke(f"Context: {context}\nQuestion: {query}\nAnswer:")
print(response)

This approach lets you build applications that use Llama 3.2's language understanding to interpret and reason over large amounts of text.
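
To make the retrieval step more concrete, here is a toy sketch of what a similarity search does conceptually: it ranks stored vectors by how close they are to the query vector and returns the best matches. The three-dimensional vectors and the sample chunks are invented for illustration, and cosine similarity is an assumption made here. Real embeddings come from the embedding model, and the distance metric Chroma actually uses depends on its configuration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k=2):
    """Return the indices of the k vectors most similar to query_vec."""
    ranked = sorted(
        range(len(doc_vecs)),
        key=lambda i: cosine_similarity(query_vec, doc_vecs[i]),
        reverse=True,
    )
    return ranked[:k]


# Hypothetical chunks with hand-made "embeddings" (toy data, not model output)
chunks = [
    "Jefferson drafted the Declaration of Independence.",
    "Elephants are the largest land animals.",
    "Jefferson founded the University of Virginia.",
]
vectors = [[0.9, 0.1, 0.0], [0.0, 0.2, 0.9], [0.8, 0.3, 0.1]]
query_vector = [1.0, 0.2, 0.0]  # stands in for the embedded question

best = top_k(query_vector, vectors, k=2)
context = "\n".join(chunks[i] for i in best)
print(context)
```

The two Jefferson chunks rank highest because their toy vectors point in nearly the same direction as the query vector. Only those chunks would be joined into the context passed to the model.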

Start Generating Prompts!

Once the model has downloaded, you can run it from the terminal with the following command:

$ ollama run llama3.2:3b-instruct-q8_0

You are now ready to generate some image prompts. Say I want a thumbnail for my blog post, so I run the following prompt:

Generate a random image prompt that I can use as a thumbnail for my article.

Here is the result:

Here's a random image prompt for you:


"A misty, moonlit forest with an old, gnarled tree in the center, its branches twisted and tangled like a giant's fingers. In the background, a subtle glow emanates from a faint, ethereal light that seems to be seeping from the very earth itself."


Feel free to use this prompt as is or modify it to fit your article's theme and style!