使用Ollama从头构建Embedding和RAG系统

最新推荐文章于 2024-08-05 08:50:01 发布

落难Coder

最新推荐文章于 2024-08-05 08:50:01 发布

阅读量5.3k

点赞数 3

分类专栏： LLMs 文章标签： embedding 大模型

本文链接：https://blog.csdn.net/u014297502/article/details/137401706

版权

LLMs 专栏收录该内容

12 篇文章 2 订阅

订阅专栏

本文介绍了如何使用RAG（检索增强生成）技术，通过Ollama库操作文档，利用嵌入向量描述抽象概念，实现对文档内容的高效检索，以找到与查询最相关的部分，并演示了如何构建脚本以在本地处理自定义文档。

摘要由CSDN通过智能技术生成

检索增强生成（RAG）赋予大型语言模型新的能力，使其能够与任何大小的文档或数据集进行互动。接下来，请跟随我一起了解如何解析和操作文档，探讨如何利用嵌入向量来描述抽象概念，实现一种简单而强大的方法，以找出文档中与特定查询最相关的部分，并最终构建一个脚本，使本地托管的大型语言模型能够处理您自己的文档。

创建环境

# set up environment
ollama pull nomic-embed-text
python -m venv .venv
source .venv/bin/activate
python -m pip install ollama numpy

RAG+Ollama

import ollama
import time
import os
import json
import numpy as np
from numpy.linalg import norm


# open a file and return paragraphs
def parse_file(filename):
    with open(filename, encoding="utf-8-sig") as f:
        paragraphs = []
        buffer = []
        for line in f.readlines():
            line = line.strip()
            if line:
                buffer.append(line)
            elif len(buffer):
                paragraphs.append((" ").join(buffer))
                buffer = []
        if len(buffer):
            paragraphs.append((" ").join(buffer))
        return paragraphs


def save_embeddings(filename, embeddings):
    # create dir if it doesn't exist
    if not os.path.exists("embeddings"):
        os.makedirs("embeddings")
    # dump embeddings to json
    with open(f"embeddings/{filename}.json", "w") as f:
        json.dump(embeddings, f)


def load_embeddings(filename):
    # check if file exists
    if not os.path.exists(f"embeddings/{filename}.json"):
        return False
    # load embeddings from json
    with open(f"embeddings/{filename}.json", "r") as f:
        return json.load(f)


def get_embeddings(filename, modelname, chunks):
    # check if embeddings are already saved
    if (embeddings := load_embeddings(filename)) is not False:
        return embeddings
    # get embeddings from ollama
    embeddings = [
        ollama.embeddings(model=modelname, prompt=chunk)["embedding"]
        for chunk in chunks
    ]
    # save embeddings
    save_embeddings(filename, embeddings)
    return embeddings


# find cosine similarity of every chunk to a given embedding
def find_most_similar(needle, haystack):
    needle_norm = norm(needle)
    similarity_scores = [
        np.dot(needle, item) / (needle_norm * norm(item)) for item in haystack
    ]
    return sorted(zip(similarity_scores, range(len(haystack))), reverse=True)


def main():
    SYSTEM_PROMPT = """You are a helpful reading assistant who answers questions 
        based on snippets of text provided in context. Answer only using the context provided, 
        being as concise as possible. If you're unsure, just say that you don't know.
        Context:
    """
    # open file
    filename = "peter-pan.txt"
    paragraphs = parse_file(filename)

    embeddings = get_embeddings(filename, "nomic-embed-text", paragraphs)

    prompt = input("what do you want to know? -> ")
    # strongly recommended that all embeddings are generated by the same model (don't mix and match)
    prompt_embedding = ollama.embeddings(model="nomic-embed-text", prompt=prompt)["embedding"]
    # find most similar to each other
    most_similar_chunks = find_most_similar(prompt_embedding, embeddings)[:5]

    response = ollama.chat(
        model="mistral",
        messages=[
            {
                "role": "system",
                "content": SYSTEM_PROMPT
                + "\n".join(paragraphs[item[1]] for item in most_similar_chunks),
            },
            {"role": "user", "content": prompt},
        ],
    )
    print("\n\n")
    print(response["message"]["content"])


if __name__ == "__main__":
    main()

落难Coder

关注

3
点赞
踩
16

收藏

觉得还不错? 一键收藏
打赏
0
评论
使用Ollama从头构建Embedding和RAG系统

检索增强生成（RAG）赋予大型语言模型新的能力，使其能够与任何大小的文档或数据集进行互动。接下来，请跟随我一起了解如何解析和操作文档，探讨如何利用嵌入向量来描述抽象概念，实现一种简单而强大的方法，以找出文档中与特定查询最相关的部分，并最终构建一个脚本，使本地托管的大型语言模型能够处理您自己的文档。
复制链接

扫一扫