LLM RAG: How Large Models Use Long Context to Build Cutting-Edge RAG LLMs (Part 2): Zhipu GLM Long in Practice

AI Agent开发

Last modified 2024-09-12 10:59:44

Tags: artificial intelligence, large models, AI large models, AI, RAG, LLM, learning

First published 2024-08-26 12:57:15

Article link: https://blog.csdn.net/m0_56255097/article/details/141561288

Zhipu GLM Long in Practice

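GLM-4-Long, a long-text model with a 1-million-token context window, has arrived. It offers a promising direction for integrating RAG systems with long-context LLMs.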

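A 1-million-token context is roughly the length of two copies of Dream of the Red Chamber or 125 research papers. It greatly improves the model's ability to understand context and widens the range of practical applications for large models.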

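GLM-4-Long belongs to the GLM-4 family of general-purpose large models and is designed for very long texts and memory-intensive tasks. This article introduces GLM-4-Long, the newest long-text model on the Zhipu BigModel open platform, and shows how to use it to be more productive in daily life and at work.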

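This walkthrough shows how GLM-4-Long handles long text through one task: converting a lecture transcript into notes. We often need to read long documents, which can be time-consuming. A student who skipped a class or is facing a deadline, for example, may need to grasp the key points of a long text quickly. In such cases, summarizing a long document like a lecture transcript is very useful.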

First, set the environment variable and initialize the ZhipuAI client.

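import os
from zhipuai import ZhipuAI

# Replace the placeholder with your own key. ZhipuAI() is called without
# arguments here, so it picks the key up from ZHIPUAI_API_KEY.
os.environ["ZHIPUAI_API_KEY"] = "your api key"
client = ZhipuAI()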
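
Hard-coding the key in a script makes it easy to leak. As a minimal sketch, assuming the key has already been exported in the shell, a small helper can fail early with a clear message when the variable is missing. The helper name `require_env` and the variable `DEMO_API_KEY` are hypothetical and used only so the example runs anywhere; in the real script the name would be `ZHIPUAI_API_KEY`.

```python
import os


def require_env(name: str) -> str:
    """Return the value of an environment variable, or raise if it is unset or blank."""
    value = os.environ.get(name, "").strip()
    if not value:
        raise RuntimeError(f"environment variable {name} is not set")
    return value


# Demo with a throwaway variable so the sketch does not need a real key.
os.environ["DEMO_API_KEY"] = "demo-value"
print(require_env("DEMO_API_KEY"))
```

The returned value could then be passed explicitly, for example as `ZhipuAI(api_key=require_env("ZHIPUAI_API_KEY"))`.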

Next, open the lecture transcript file and record its word count.

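lecture_transcript_path = "data/lecture_transcript.txt"

# Read the whole transcript; an explicit encoding avoids platform-dependent defaults.
with open(lecture_transcript_path, "r", encoding="utf-8") as file:
    lecture_text = file.read()

# Word count of this particular transcript, used below to size the chunks.
WORD_COUNT = 20899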
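
The word count above is hard-coded for this particular transcript. As a minimal sketch, assuming that whitespace-separated tokens are an acceptable definition of a "word" (the same definition the chunking code uses via `sentence.split()`), the count could be derived from the text itself. The helper name `count_words` is hypothetical and not part of the original post.

```python
def count_words(text: str) -> int:
    """Count whitespace-separated tokens in the text."""
    return len(text.split())


sample = "Long context models read whole documents.\nChunking still helps."
print(count_words(sample))
```

With the real file, `WORD_COUNT = count_words(lecture_text)` would keep the chunk size in step with whatever transcript is loaded.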

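Now install the NLP library spaCy and load en_core_web_sm, a small English model used here to preprocess the file. We then define two functions: one splits the text into a list of sentences, and the other groups those sentences into chunks with a specified maximum length (one fiftieth of the total word count).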

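import spacy

# Setup (run once): pip install spacy && python -m spacy download en_core_web_sm
nlp = spacy.load("en_core_web_sm")

def preprocess_text(text):
    # Split the text into sentences with spaCy's sentence segmenter.
    doc = nlp(text)
    sentences = [sent.text for sent in doc.sents]
    return sentences

def chunk_text(sentences, max_chunk_size=WORD_COUNT // 50):
    # Greedily pack whole sentences into chunks of at most max_chunk_size words.
    chunks = []
    current_chunk = []
    current_length = 0

    for sentence in sentences:
        sentence_length = len(sentence.split())
        # Start a new chunk when adding this sentence would exceed the limit.
        # Checking current_chunk avoids emitting an empty chunk when a single
        # sentence is longer than max_chunk_size.
        if current_chunk and current_length + sentence_length > max_chunk_size:
            chunks.append(" ".join(current_chunk))
            current_chunk = []
            current_length = 0
        current_chunk.append(sentence)
        current_length += sentence_length

    # Flush the last, partially filled chunk.
    if current_chunk:
        chunks.append(" ".join(current_chunk))

    return chunks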
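
To see how this greedy packing behaves without installing spaCy, here is a small self-contained sketch. It assumes a crude regex sentence splitter as a stand-in for spaCy's segmenter, so the sentence boundaries will be less accurate on real text. The names `split_sentences` and `pack_sentences` are hypothetical.

```python
import re


def split_sentences(text: str) -> list[str]:
    """Crude stand-in for spaCy: split after '.', '!' or '?' followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def pack_sentences(sentences: list[str], max_words: int) -> list[str]:
    """Greedily pack whole sentences into chunks of at most max_words words.

    A single sentence longer than max_words becomes its own oversized chunk.
    """
    chunks, current, length = [], [], 0
    for sentence in sentences:
        n = len(sentence.split())
        # Close the current chunk if this sentence would push it over the limit.
        if current and length + n > max_words:
            chunks.append(" ".join(current))
            current, length = [], 0
        current.append(sentence)
        length += n
    if current:
        chunks.append(" ".join(current))
    return chunks


text = "One two three. Four five. Six seven eight nine. Ten."
chunks = pack_sentences(split_sentences(text), max_words=5)
for chunk in chunks:
    print(len(chunk.split()), chunk)
```

Sentences are never split, so a chunk can exceed the limit only when one sentence alone is longer than `max_words`.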

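We use GLM-4-Long to condense the lecture transcript into notes in two rounds, which avoids the information loss of a single one-shot summary. How much each round condenses is set by a "summary ratio": a value between 0 and 1 giving the length of the output notes relative to the input text. I use 0.2 for both rounds here, so in each round 1,000 words of input yield roughly 200 words of notes.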

def summarize_chunk(chunk, summary_ratio):
    # Ask the model to condense one chunk to about summary_ratio of its length.
    # The :.0% format renders 0.2 as "20%".
    response = client.chat.completions.create(
        model="glm-4-long",
        messages=[
            {
                "role": "system",
                "content": f"You are an assistant that reads a long lecture transcript and summarizes it to a short and concise note-taking format. The summary should be around {summary_ratio:.0%} of the original length."
            },
            {
                "role": "user",
                "content": chunk
            },
        ],
        top_p=0.7,
        temperature=0.9
    )
    summarized_text = response.choices[0].message.content
    return summarized_text

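def summarize_text(text, summary_ratio):
    # Split into sentences, pack them into chunks, then summarize chunk by chunk.
    sentences = preprocess_text(text)

    # The chunk size is fixed at one fiftieth of the original transcript's word
    # count, in both rounds.
    max_chunk_size = WORD_COUNT // 50
    chunks = chunk_text(sentences, max_chunk_size)

    summarized_chunks = []
    for chunk in chunks:
        summarized_chunk = summarize_chunk(chunk, summary_ratio)
        # Skip empty responses.
        if summarized_chunk:
            summarized_chunks.append(summarized_chunk)

    summarized_text = " ".join(summarized_chunks)

    return summarized_text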

first_summary_ratio = 0.2
first_summarized_text = summarize_text(lecture_text, first_summary_ratio)

second_summary_ratio = 0.2
final_summarized_text = summarize_text(first_summarized_text, second_summary_ratio)
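
The two ratios compound: 0.2 × 0.2 = 0.04, so the final notes aim for about 4% of the transcript. The sketch below works out the target length after each round for the 20,899-word transcript. These are targets only, since the model is asked for "around" the ratio and will not hit it exactly. The function name `target_lengths` is hypothetical.

```python
def target_lengths(word_count: int, ratios: list[float]) -> list[int]:
    """Return the target word count after each summarization round."""
    lengths = []
    current = float(word_count)
    for ratio in ratios:
        if not 0 < ratio <= 1:
            raise ValueError("each ratio must be in (0, 1]")
        current *= ratio
        lengths.append(round(current))
    return lengths


print(target_lengths(20899, [0.2, 0.2]))  # prints [4180, 836]
```

With chunks of about 417 words (20899 // 50), that suggests roughly 50 model calls in the first round and about 10 in the second; the actual counts depend on sentence boundaries and on how closely the model follows the ratio.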

Finally, ask the model once more to convert the lecture notes into Markdown, producing clearly formatted notes, and save them locally for quick reading.

# Convert the final summary into structured Markdown.
markdown_notes = client.chat.completions.create(
    model="glm-4-long",
    messages=[
        {
            "role": "system",
            "content": "Convert the summary to markdown format. Organize information into headings and subheadings, with no big paragraphs and no more than 5 bullet points under a subheading.",
        },
        {
            "role": "user",
            "content": final_summarized_text,
        }
    ],
    top_p=0.7,
    temperature=0.9
)

# Write the notes with an explicit encoding so non-ASCII characters are saved correctly.
with open("data/summarized_notes.md", "w", encoding="utf-8") as file:
    file.write(markdown_notes.choices[0].message.content)
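
The system prompt asks for no more than 5 bullet points under a subheading, but a model may not always comply. As a rough sketch, the saved notes can be checked offline. It assumes headings are lines starting with `#` and bullets are lines starting with `- `, `* ` or `+ `; nested bullets count toward the nearest heading above them, and `#` lines inside code fences would be misread as headings. The function name `find_overlong_sections` is hypothetical.

```python
def find_overlong_sections(markdown: str, max_bullets: int = 5) -> list[tuple[str, int]]:
    """Return (heading, bullet_count) for every section with more than max_bullets bullets."""
    current = ["(no heading)", 0]  # [heading text, bullet count]
    sections = [current]
    for line in markdown.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            # A new heading opens a new section.
            current = [stripped.lstrip("#").strip(), 0]
            sections.append(current)
        elif stripped.startswith(("- ", "* ", "+ ")):
            current[1] += 1
    return [(heading, count) for heading, count in sections if count > max_bullets]


sample = "# Topic\n## Part A\n- a\n- b\n## Part B\n" + "\n".join(
    f"- item {i}" for i in range(6)
)
print(find_overlong_sections(sample))  # prints [('Part B', 6)]
```

An empty result means every section respects the limit; otherwise the flagged sections could be sent back to the model or trimmed by hand.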