Building Your Own RAG Knowledge Base with LlamaIndex

  1. Create a virtual environment

    1. Create the base environment for running InternLM, named llamaindex
      conda create -n llamaindex python=3.10
    2. List the existing environments
      conda env list
    3. Activate the newly created environment
      conda activate llamaindex
    4. Install the core libraries pytorch, torchvision, torchaudio, and pytorch-cuda from the pytorch and nvidia channels (pinning exact versions is recommended)
      conda install pytorch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2 pytorch-cuda=11.7 -c pytorch -c nvidia
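      A minimal sanity check that the CUDA build of PyTorch is active (the expected version numbers match the pins above):
      import torch

      print(torch.__version__)           # expect 2.0.1
      print(torch.cuda.is_available())   # expect True with a CUDA 11.7-compatible driver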
  2. Install LlamaIndex

    1. In the activated virtual environment, install LlamaIndex and its companion packages
      pip install llama-index==0.10.38 llama-index-llms-huggingface==0.2.0 "transformers[torch]==4.41.1" "huggingface_hub[inference]==0.23.1" huggingface_hub==0.23.1 sentence-transformers==2.7.0 sentencepiece==0.2.0
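      To confirm the pinned packages imported cleanly, a quick check (assumes the 0.10.x package layout, where the version string lives in llama_index.core):
      import llama_index.core
      import transformers

      print(llama_index.core.__version__)  # expect 0.10.38
      print(transformers.__version__)      # expect 4.41.1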
  3. Download the Sentence Transformer model

    1. For easier management, create two directories under the root directory
      mkdir llamaindex_demo
      mkdir model
    2. Then create the download script in the llamaindex_demo directory
      touch llamaindex_demo/download_hf.py
    3. Write the following into download_hf.py
      import os

      # Point the Hugging Face tooling at the hf-mirror.com mirror
      os.environ['HF_ENDPOINT'] = 'https://hf-mirror.com'

      # Download the Sentence Transformer embedding model
      os.system('huggingface-cli download --resume-download sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2 --local-dir /root/model/sentence-transformer')

    4. Run the download script
      python download_hf.py
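      To verify the download, a minimal sketch that loads the model from the local directory and embeds one sentence:
      from sentence_transformers import SentenceTransformer

      model = SentenceTransformer("/root/model/sentence-transformer")
      embedding = model.encode("你好,世界")
      print(embedding.shape)  # paraphrase-multilingual-MiniLM-L12-v2 outputs 384-dim vectors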
    5. If the step above did not pull in the NLTK resources, download them manually
      cd /root
      git clone https://gitee.com/yzy0612/nltk_data.git --branch gh-pages
      cd nltk_data
      mv packages/* ./
      cd tokenizers
      unzip punkt.zip
      cd ../taggers
      unzip averaged_perceptron_tagger.zip
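      To confirm NLTK can locate the unpacked resources (running as root, NLTK searches /root/nltk_data by default; nltk.data.find raises LookupError when a resource is missing):
      import nltk

      print(nltk.data.find("tokenizers/punkt"))
      print(nltk.data.find("taggers/averaged_perceptron_tagger"))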
  4. LlamaIndex HuggingFaceLLM

    1. Download the internlm2-chat-1_8b model. Note it is a model, not a pip package, so pip install internlm2-chat-1_8b will not work; fetch it from Hugging Face instead (e.g. huggingface-cli download internlm/internlm2-chat-1_8b --local-dir /root/model/internlm2-chat-1_8b), or reuse a shared local copy as in the next step.
    2. If the model already exists locally, you can symlink it instead of downloading: ln -s <model-path> <destination-path>, for example
      cd ~/model
      ln -s /root/share/new_models/Shanghai_AI_Laboratory/internlm2-chat-1_8b/ ./
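      A quick way to confirm the linked directory is usable, loading only the tokenizer (fast; trust_remote_code is needed because InternLM2 ships custom model code):
      from transformers import AutoTokenizer

      tokenizer = AutoTokenizer.from_pretrained(
          "/root/model/internlm2-chat-1_8b", trust_remote_code=True
      )
      print(type(tokenizer).__name__)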
    3. Create the model-run script
      touch ~/llamaindex_demo/llamaindex_internlm.py
    4. Edit llamaindex_internlm.py as follows
      from llama_index.llms.huggingface import HuggingFaceLLM
      from llama_index.core.llms import ChatMessage

      # Load the local InternLM2 model; trust_remote_code is required because
      # the model repository ships its own modeling code.
      llm = HuggingFaceLLM(
          model_name="/root/model/internlm2-chat-1_8b",
          tokenizer_name="/root/model/internlm2-chat-1_8b",
          model_kwargs={"trust_remote_code": True},
          tokenizer_kwargs={"trust_remote_code": True}
      )

      rsp = llm.chat(messages=[ChatMessage(content="xtuner是什么?")])
      print(rsp)
    5. Run the model
      python llamaindex_internlm.py
      Without retrieval, the bare model typically gives a vague or inaccurate answer about xtuner; that is what the RAG setup in the next section addresses.
  5. LlamaIndex RAG

    1. Install the LlamaIndex embedding dependencies

      pip install llama-index-embeddings-huggingface llama-index-embeddings-instructor

    2. If the step above fails, install the package version suggested by the error message (e.g. pip install huggingface-hub==0.23.5)

    3. Build the knowledge base: create a data directory and move the README from the xtuner repo into it
      cd ~/llamaindex_demo
      mkdir data
      cd data
      git clone https://github.com/InternLM/xtuner.git
      mv xtuner/README_zh-CN.md ./

    4. Create the RAG script
      touch ~/llamaindex_demo/llamaindex_RAG.py
    5. Contents of llamaindex_RAG.py
      from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, Settings
      from llama_index.embeddings.huggingface import HuggingFaceEmbedding
      from llama_index.llms.huggingface import HuggingFaceLLM

      # Use the local Sentence Transformer as the embedding model
      embed_model = HuggingFaceEmbedding(
          model_name="/root/model/sentence-transformer"
      )
      Settings.embed_model = embed_model

      # Use the local InternLM2 model as the LLM
      llm = HuggingFaceLLM(
          model_name="/root/model/internlm2-chat-1_8b",
          tokenizer_name="/root/model/internlm2-chat-1_8b",
          model_kwargs={"trust_remote_code": True},
          tokenizer_kwargs={"trust_remote_code": True}
      )
      Settings.llm = llm

      # Load the knowledge base and build an in-memory vector index over it
      documents = SimpleDirectoryReader("/root/llamaindex_demo/data").load_data()
      index = VectorStoreIndex.from_documents(documents)
      query_engine = index.as_query_engine()
      response = query_engine.query("xtuner是什么?")

      print(response)
    6. Run it
      python llamaindex_RAG.py
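      Retrieval behavior can be tuned when building the query engine; a minimal sketch reusing the index from llamaindex_RAG.py (the similarity_top_k value is illustrative, not from the original):
      # Retrieve the top-3 most similar chunks instead of the default
      query_engine = index.as_query_engine(similarity_top_k=3)
      response = query_engine.query("xtuner是什么?")
      print(response)
      # Inspect which chunks were retrieved and their similarity scores
      for node_with_score in response.source_nodes:
          print(node_with_score.score, node_with_score.node.get_content()[:80])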
  6. Chat in the browser

    1. Install the serving dependency
      pip install streamlit==1.36.0
    2. Create the app script app.py

      import streamlit as st
      from llama_index.core import VectorStoreIndex, SimpleDirectoryReader, Settings
      from llama_index.embeddings.huggingface import HuggingFaceEmbedding
      from llama_index.llms.huggingface import HuggingFaceLLM

      st.set_page_config(page_title="llama_index_demo", page_icon="🦜🔗")
      st.title("llama_index_demo")

      # Initialize the models
      @st.cache_resource
      def init_models():
          embed_model = HuggingFaceEmbedding(
              model_name="/root/model/sentence-transformer"
          )
          Settings.embed_model = embed_model

          llm = HuggingFaceLLM(
              model_name="/root/model/internlm2-chat-1_8b",
              tokenizer_name="/root/model/internlm2-chat-1_8b",
              model_kwargs={"trust_remote_code": True},
              tokenizer_kwargs={"trust_remote_code": True}
          )
          Settings.llm = llm

          documents = SimpleDirectoryReader("/root/llamaindex_demo/data").load_data()
          index = VectorStoreIndex.from_documents(documents)
          query_engine = index.as_query_engine()

          return query_engine

      # Check whether the models need to be initialized
      if 'query_engine' not in st.session_state:
          st.session_state['query_engine'] = init_models()

      def greet2(question):
          response = st.session_state['query_engine'].query(question)
          return response

      # Store LLM generated responses
      if "messages" not in st.session_state.keys():
          st.session_state.messages = [{"role": "assistant", "content": "你好,我是你的助手,有什么我可以帮助你的吗?"}]

      # Display or clear chat messages
      for message in st.session_state.messages:
          with st.chat_message(message["role"]):
              st.write(message["content"])

      def clear_chat_history():
          st.session_state.messages = [{"role": "assistant", "content": "你好,我是你的助手,有什么我可以帮助你的吗?"}]

      st.sidebar.button('Clear Chat History', on_click=clear_chat_history)

      # Function for generating a llama_index response
      def generate_llama_index_response(prompt_input):
          return greet2(prompt_input)

      # User-provided prompt
      if prompt := st.chat_input():
          st.session_state.messages.append({"role": "user", "content": prompt})
          with st.chat_message("user"):
              st.write(prompt)

      # Generate a new response if the last message is not from the assistant
      if st.session_state.messages[-1]["role"] != "assistant":
          with st.chat_message("assistant"):
              with st.spinner("Thinking..."):
                  response = generate_llama_index_response(prompt)
                  placeholder = st.empty()
                  placeholder.markdown(response)
          message = {"role": "assistant", "content": response}
          st.session_state.messages.append(message)

    3. Run it

      streamlit run app.py
    4. Streamlit listens on port 8501 by default (http://localhost:8501); if that port is occupied it falls back to the next free one, which is how addresses like http://localhost:8503 come about
    5. Final result