Notes on Pitfalls and Fixes While Learning RAG

Latest recommended article published on 2024-05-28 18:51:57

酸菜鱼_2323


Article link: https://blog.csdn.net/qq_40623047/article/details/138704125


I read https://zhuanlan.zhihu.com/p/675509396 and https://zhuanlan.zhihu.com/p/668082024 to learn about RAG, then used langchain to build a simple RAG question-answering example.

Problem 1: import langchain raises an error

pydantic.errors.PydanticUserError: If you use `@root_validator` with pre=False (the default) you MUST specify `skip_on_failure=True`. Note that `@root_validator` is deprecated and should be replaced with `@model_validator`.
 
For further information visit https://errors.pydantic.dev/2.4/u/root-validator-pre-skip

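Solution: https://blog.csdn.net/liaoningxinmin/article/details/134590965
Python version 3.8.1, langchain 0.0.27

The error comes from pydantic 2.x (the link in the message points to the 2.4 docs), which this old langchain release does not appear to support. Downgrading the pydantic library to version 1.10.13 fixed it.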
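Before downgrading, it can help to confirm which pydantic version is actually installed. The following is a minimal sketch, not part of the original fix: the helper names `is_pydantic_v2` and `installed_pydantic_version` are made up for illustration, and the code relies only on the standard-library `importlib.metadata` (available from Python 3.8).

```python
from importlib import metadata


def is_pydantic_v2(version: str) -> bool:
    """Return True if the version string has major version 2 or higher."""
    major = int(version.split(".")[0])
    return major >= 2


def installed_pydantic_version():
    """Return the installed pydantic version string, or None if it is not installed."""
    try:
        return metadata.version("pydantic")
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    found = installed_pydantic_version()
    if found is None:
        print("pydantic is not installed")
    elif is_pydantic_v2(found):
        # The pinned version is the one used in this post.
        print(f"pydantic {found} found; to pin it: pip install pydantic==1.10.13")
    else:
        print(f"pydantic {found} is a 1.x release")
```

If the script reports a 2.x release together with an old langchain, that combination matches the error shown above.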

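Problem 2: from langchain.document_loaders import TextLoader fails because the module cannot be found in langchain
Solution:
The installed langchain version was too old, and running pip install --upgrade langchain on its own did not raise it.
The root cause was that the Python version was too old. After upgrading Python to 3.8.1, updating langchain again pulled in the newer release and the import worked.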
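The version check behind this fix can be sketched as follows. The 3.8.1 threshold is the one reported in this post rather than something verified here, and `meets_minimum` is a hypothetical helper name.

```python
import sys

# Minimum Python version that worked in this post (assumption taken from the text above).
MIN_PYTHON = (3, 8, 1)


def meets_minimum(version_info, minimum=MIN_PYTHON) -> bool:
    """Compare the first three components of a version tuple against the minimum."""
    return tuple(version_info[:3]) >= minimum


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    if meets_minimum(sys.version_info):
        print(f"Python {current} is new enough; run: pip install --upgrade langchain")
    else:
        print(f"Python {current} is older than 3.8.1; upgrade Python first")
```

Running this with the interpreter that pip installs into shows whether upgrading langchain can succeed at all in that environment.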
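Problem 3: documents = text_splitter.split_documents(documents) did not split the document; the result had the same length as the input
Solution:
The function usage may have changed with a newer langchain version. I changed the code following https://blog.csdn.net/qq_44894943/article/details/137018519 and
https://blog.csdn.net/weixin_44238683/article/details/137914108
(CharacterTextSplitter splits on a separator string, '\n\n' by default.)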

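# Split the document
from langchain.text_splitter import CharacterTextSplitter

# `document` is assumed to be the list of Document objects returned by the loader, e.g. TextLoader(...).load()
# Create the splitter
text_splitter = CharacterTextSplitter(
    separator="\n",  # separator to split on (the default is "\n\n")
    chunk_size=128,  # maximum size of each chunk, measured by length_function (here: 128 characters)
    chunk_overlap=0,  # overlap between adjacent chunks (here: none)
    length_function=len,
    is_separator_regex=False,  # whether to treat separator as a regular expression
    add_start_index=True,  # store each chunk's start offset in metadata["start_index"]
)
# Split the first loaded document, carrying its metadata over to every chunk
documents = text_splitter.create_documents([document[0].page_content], metadatas=[document[0].metadata])
documents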
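
One plausible reason the original call returned a single chunk is that the default separator never occurred in the text. The sketch below is a simplified, pure-Python stand-in for separator-based splitting, not langchain's implementation; `split_by_separator` and the sample text are invented for illustration. It cuts the text at the separator and then merges pieces up to `chunk_size`, so a text with only single newlines yields one piece under the default '\n\n' and nothing can be split.

```python
def split_by_separator(text, separator="\n\n", chunk_size=128):
    """Cut text at `separator`, then greedily merge pieces up to `chunk_size` characters.

    Simplified illustration only: a single piece longer than chunk_size is kept whole.
    """
    pieces = [p for p in text.split(separator) if p]
    chunks, current = [], ""
    for piece in pieces:
        candidate = piece if not current else current + separator + piece
        if current and len(candidate) > chunk_size:
            # Adding this piece would overflow the chunk, so close the current one.
            chunks.append(current)
            current = piece
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


# Hypothetical sample: ten 40-character lines joined by single newlines.
sample = "\n".join("x" * 40 for _ in range(10))

print(len(split_by_separator(sample)))                  # default "\n\n" never matches
print(len(split_by_separator(sample, separator="\n")))  # splits at line breaks
```

With the default separator the sample stays in one oversized chunk, while separator="\n" produces several chunks that each fit within chunk_size, which mirrors the change made in the listing above.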
