How to Implement Full Text Search in Milvus?

Latest recommended article published on 2025-03-10 07:32:16

晨欣

Tags: milvus, full text search, python

Article link: https://blog.csdn.net/weixin_41338279/article/details/144353482

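In the previous two articles ("Milvus Python library pymilvus: common Collection operations explained (Part 1)" and "Milvus Python library pymilvus: common Collection operations explained (Part 2)"), we looked at hybrid vector search in Milvus, built on dense vectors and sparse vectors. This article focuses on search based on sparse vectors, that is, full text search.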

What is full text search?

The following is quoted from the official Milvus documentation:

Full text search is a feature that retrieves documents containing specific terms or phrases in text datasets, then ranking the results based on relevance. This feature overcomes semantic search limitations, which might overlook precise terms, ensuring you receive the most accurate and contextually relevant results. Additionally, it simplifies vector searches by accepting raw text input, automatically converting your text data into sparse embeddings without the need to manually generate vector embeddings.

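In short, full text search is a retrieval method based on exact keyword matching. Vector search over embeddings produced by deep learning models is better suited to semantic matching, whereas full text search performs better when results must match specific keywords exactly. Combining the two into a hybrid search that draws on the strengths of each is also a highly recommended option. (If you are interested, see "Hybrid search with the BGE-M3 model and the Milvus vector database".)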
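To make the idea of keyword-based relevance ranking concrete, here is a minimal pure-Python sketch of BM25, the scoring algorithm that the Milvus documentation names as the basis of its full text search. It is a toy illustration only and not Milvus's implementation: the helper names `tokenize` and `bm25_scores`, the sample documents, and the parameter values `k1=1.2` and `b=0.75` are all assumptions chosen for this example.

```python
import math
from collections import Counter


def tokenize(text):
    # Naive whitespace tokenizer (assumption); a real analyzer also
    # handles punctuation, stop words, and language-specific rules.
    return text.lower().split()


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score every document in `docs` against `query` with BM25."""
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(t) for t in tokenized) / n_docs
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))

    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in tokenize(query):
            if term not in tf:
                continue  # only exact term matches contribute to the score
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


docs = [
    "milvus supports full text search with sparse vectors",
    "dense vectors capture the semantic meaning of a sentence",
    "hybrid search combines dense and sparse retrieval",
]
scores = bm25_scores("full text search", docs)
# Document indices ordered from most to least relevant.
ranking = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
print(ranking, scores)
```

The document that contains all three query terms ranks first, the one that shares only "search" gets a smaller positive score, and the one with no matching term scores zero. A purely keyword-based method cannot see that the second document is semantically related, which is the gap that dense vector search fills in a hybrid setup.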