(25-5-02) An Automatic Question-Answering System Based on a Local Knowledge Base (LangChain + ChatGLM + ModelScope/Huggingface Deployment): Implementing the Web Q&A System (2)

Article link: https://blog.csdn.net/asd343442/article/details/139424477

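(6) The method get_knowledge_based_answer answers the user's question from known information. It takes the user's query, any web search content, and tuning parameters such as top_k, the history length, temperature, and top_p. From these inputs it builds a prompt template for the language model with slots for the known content and the question (plus the web search content, when there is any). It then loads the previously built FAISS vector store from faiss_index, combines a retriever over that store with the language model in a RetrievalQA chain, and returns the chain's result: a knowledge-based answer together with the source documents that were retrieved.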

    def get_knowledge_based_answer(self,
                                   query,
                                   web_content,
                                   top_k: int = 6,
                                   history_len: int = 3,
                                   temperature: float = 0.01,
                                   top_p: float = 0.1,
                                   history=[]):
        self.llm.temperature = temperature
        self.llm.top_p = top_p
        self.history_len = history_len
        self.top_k = top_k
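        # The prompts stay in Chinese because they instruct the model to answer
        # in Chinese. Both say: answer concisely and professionally from the
        # known information, admit when it is insufficient, and do not fabricate.
        if web_content:
            # Escape braces so web text is not parsed as template variables.
            safe_web_content = web_content.replace("{", "{{").replace("}", "}}")
            prompt_template = f"""基于以下已知信息，简洁和专业的来回答用户的问题。
                                如果无法从中得到答案，请说 "根据已知信息无法回答该问题" 或 "没有提供足够的相关信息"，不允许在答案中添加编造成分，答案请使用中文。
                                已知网络检索内容：{safe_web_content}""" + """
                                已知内容:
                                {context}
                                问题:
                                {question}"""
        else:
            prompt_template = """基于以下已知信息，请简洁并专业地回答用户的问题。
                如果无法从中得到答案，请说 "根据已知信息无法回答该问题" 或 "没有提供足够的相关信息"。不允许在答案中添加编造成分。另外，答案请使用中文。

                已知内容:
                {context}

                问题:
                {question}"""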
        prompt = PromptTemplate(template=prompt_template,
                                input_variables=["context", "question"])
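        # Keep only the last history_len turns; the guard matters because
        # history[-0:] would return the whole list instead of an empty one.
        self.llm.history = history[
            -self.history_len:] if self.history_len > 0 else []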
        vector_store = FAISS.load_local('faiss_index', self.embeddings)

        knowledge_chain = RetrievalQA.from_llm(
            llm=self.llm,
            retriever=vector_store.as_retriever(
                search_kwargs={"k": self.top_k}),
            prompt=prompt)
        knowledge_chain.combine_documents_chain.document_prompt = PromptTemplate(
            input_variables=["page_content"], template="{page_content}")

        knowledge_chain.return_source_documents = True

        result = knowledge_chain({"query": query})
        return result
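
Because return_source_documents is set to True, the dictionary returned by the chain should carry the answer under "result" and the retrieved chunks under "source_documents". The following is a minimal sketch of how a caller might render that dictionary. FakeDocument, format_answer, and the sample data are hypothetical names introduced only for illustration (FakeDocument stands in for a LangChain Document), and the "source" metadata key is an assumption about what the loader recorded.

```python
from dataclasses import dataclass, field


@dataclass
class FakeDocument:
    """Hypothetical stand-in for a LangChain Document (page_content + metadata)."""
    page_content: str
    metadata: dict = field(default_factory=dict)


def format_answer(result: dict, max_chars: int = 40) -> str:
    """Render the answer followed by a numbered list of source snippets."""
    lines = [result["result"]]
    for i, doc in enumerate(result.get("source_documents", []), start=1):
        # "source" is assumed to hold the file path; fall back if it is absent.
        source = doc.metadata.get("source", "unknown")
        snippet = doc.page_content[:max_chars]
        lines.append(f"[{i}] {source}: {snippet}")
    return "\n".join(lines)


# Made-up sample shaped like the chain output described above.
sample = {
    "query": "What is FAISS?",
    "result": "FAISS is a library for vector similarity search.",
    "source_documents": [
        FakeDocument(
            "FAISS is a library for efficient similarity search.",
            {"source": "notes.md"},
        ),
    ],
}
formatted = format_answer(sample)
print(formatted)
```

Showing the sources next to the answer lets the user check whether the reply is actually grounded in the knowledge base.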

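(7) The method load_file loads a file, processes it according to its type, and returns the loaded documents. It takes the file path as input and branches on the file extension:

If the file is Markdown (.md), it is loaded with UnstructuredFileLoader in "elements" mode and returned without further splitting.
If the file is a PDF (.pdf), it is loaded and then split with the Chinese text splitter in PDF mode (ChineseTextSplitter(pdf=True)).
If the file is plain text (.txt), it is loaded with UTF-8 encoding and then split with ChineseTextSplitter(pdf=False).
If the file type is not recognized, the file is treated as unstructured text: it is loaded in "elements" mode and split with ChineseTextSplitter(pdf=False).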

    def load_file(self, filepath):
        if filepath.lower().endswith(".md"):
            loader = UnstructuredFileLoader(filepath, mode="elements")
            docs = loader.load()
        elif filepath.lower().endswith(".pdf"):
            loader = UnstructuredFileLoader(filepath)
            textsplitter = ChineseTextSplitter(pdf=True)
            docs = loader.load_and_split(textsplitter)
        elif filepath.lower().endswith(".txt"):
            loader = UnstructuredFileLoader(filepath, encoding='utf8')
            textsplitter = ChineseTextSplitter(pdf=False)
            docs = loader.load_and_split(textsplitter)
        else:
            loader = UnstructuredFileLoader(filepath, mode="elements")
            textsplitter = ChineseTextSplitter(pdf=False)
            docs = loader.load_and_split(text_splitter=textsplitter)
        return docs
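
The extension-based branching above can also be expressed as a lookup table, which keeps the rules in one place when more file types are added. This is only a sketch under stated assumptions: LOAD_STRATEGIES, DEFAULT_STRATEGY, and pick_strategy are hypothetical names, the table records just the decision (loader mode and splitter kind) rather than calling any loader, and "single" is assumed to be the loader's default mode when none is passed.

```python
from pathlib import Path

# Hypothetical strategy table: extension -> (loader mode, splitter kind).
# A splitter kind of None means the loaded elements are returned unsplit.
LOAD_STRATEGIES = {
    ".md": ("elements", None),
    ".pdf": ("single", "pdf"),
    ".txt": ("single", "text"),
}
# Unrecognized types are treated as unstructured text.
DEFAULT_STRATEGY = ("elements", "text")


def pick_strategy(filepath: str) -> tuple:
    """Return the (loader mode, splitter kind) pair for a file path."""
    suffix = Path(filepath).suffix.lower()  # case-insensitive, like .lower().endswith()
    return LOAD_STRATEGIES.get(suffix, DEFAULT_STRATEGY)


for name in ["README.MD", "paper.pdf", "notes.txt", "slides.docx"]:
    print(name, pick_strategy(name))
```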

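(8) The function update_status records a status message in the conversation history. It appends the new status to the history as a [None, status] pair, prints the status, and returns the updated history.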

def update_status(history, status):
    history = history + [[None, status]]
    print(status)
    return history
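
A short usage sketch may help here. Because the function builds the new history with list concatenation, it returns a new list and leaves the caller's list unchanged. The [user, bot] pair layout with None in the user slot looks like the format a chat UI component would expect, but that reading is an assumption; the sample messages are made up.

```python
def update_status(history, status):
    # Concatenation creates a new list; the input list is not mutated.
    history = history + [[None, status]]
    print(status)
    return history


original = [["hi", "hello"]]
updated = update_status(original, "File uploaded")

print(original)  # still holds only the first exchange
print(updated)   # first exchange plus the status entry
```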

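(9) The following code defines a function named init_model that initializes the language model. A module-level instance of the KnowledgeBasedChatLLM class, named knowladge_based_chat_llm (spelled this way in the code), is created first. Inside the function, it prints "开始加载模型配置" ("Starting to load the model configuration"), loads the model configuration, and prints "模型配置加载成功" ("Model configuration loaded successfully"). It then calls the language model's _call method with "你好" ("Hello") as the prompt to confirm that the model really works, and returns a message saying the initial model has been loaded and the conversation can begin. If any step raises an exception, it prints the exception message and returns a prompt telling the user that the model failed to load and that they should select a model again and click the "重新加载模型" ("Reload model") button.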

knowladge_based_chat_llm = KnowledgeBasedChatLLM()
def init_model():
    try:
        print("开始加载模型配置")
        knowladge_based_chat_llm.init_model_config()
        print("模型配置加载成功")
        knowladge_based_chat_llm.llm._call("你好")
        return """初始模型已成功加载，可以开始对话"""
    except Exception as e:
        print(f"加载模型出错: {e}")  # print the exception message
        return """模型未成功加载，请重新选择模型后点击"重新加载模型"按钮"""
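
The load-then-smoke-test pattern can be tried without a real model. The sketch below is an illustration under stated assumptions, not the project's code: StubChatLLM and its smoke_test method are hypothetical stand-ins for KnowledgeBasedChatLLM and the _call check, the returned messages are English paraphrases, and traceback.print_exc() is swapped in for the plain print so that the full stack trace is shown rather than only the exception message.

```python
import traceback


class StubChatLLM:
    """Hypothetical stand-in for KnowledgeBasedChatLLM, used only in this sketch."""

    def __init__(self, fail: bool = False):
        self.fail = fail

    def init_model_config(self):
        if self.fail:
            raise RuntimeError("weights not found")

    def smoke_test(self, prompt: str) -> str:
        return f"echo: {prompt}"


def init_model(chat_llm) -> str:
    """Load the configuration, run one cheap call, and report the outcome."""
    try:
        chat_llm.init_model_config()
        chat_llm.smoke_test("hello")  # one cheap call shows the model responds
        return "Model loaded; ready to chat"
    except Exception:
        traceback.print_exc()  # full stack trace on stderr, not just the message
        return "Model failed to load; pick another model and reload"


ok_message = init_model(StubChatLLM())
error_message = init_model(StubChatLLM(fail=True))
print(ok_message)
print(error_message)
```

Returning a message instead of re-raising keeps the web page responsive: the UI can display the text and let the user choose another model.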

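(10) The function clear_session returns an empty string and None. A function like this is typically used to clear or reset the session state, for example by emptying the input box and resetting the chat history.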

def clear_session():
    return '', None


To be continued