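A RAG system retrieves documents based on the user's query, so a well-formed query makes it more likely that the right results are recalled. Query analysis therefore rewrites the input query so that its meaning is more complete or its format is clearer.
There are many query rewriting strategies: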
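- Query decomposition: if a question contains several independent sub-questions, split it into those sub-questions and run each one separately. For example: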
system = """You are an expert at converting user questions into database queries. \
You have access to a database of tutorial videos about a software library for building LLM-powered applications. \
Perform query decomposition. Given a user question, break it down into distinct sub questions that \
you need to answer in order to answer the original question.
If there are acronyms or words you are not familiar with, do not try to rephrase them."""
query_analyzer.invoke(
    {
        "question": "how to use multi-modal models in a chain and turn chain into a rest api"
    }
)
[SubQuery(sub_query='How to use multi-modal models in a chain?'),
SubQuery(sub_query='How to turn a chain into a REST API?')]
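The "run each sub-question separately" step can be sketched without any LLM call. This is a minimal illustration, not the original pipeline: the `Doc` type, the toy `CORPUS`, the word-overlap `retrieve` function, and `retrieve_for_sub_queries` are all hypothetical stand-ins, and the sub-queries are simply copied from the output above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    doc_id: str
    text: str


# Toy corpus standing in for the tutorial-video database (hypothetical data).
CORPUS = [
    Doc("v1", "Using multi-modal models in a chain"),
    Doc("v2", "Turning a chain into a REST API"),
    Doc("v3", "Getting started with prompt templates"),
]


def retrieve(query: str, k: int = 1) -> list[Doc]:
    """Hypothetical retriever: rank documents by word overlap with the query."""
    words = set(query.lower().replace("?", "").split())
    scored = [(len(words & set(d.text.lower().split())), d) for d in CORPUS]
    scored = [(s, d) for s, d in scored if s > 0]  # drop documents with no overlap
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [d for _, d in scored[:k]]


def retrieve_for_sub_queries(sub_queries: list[str], k: int = 1) -> list[Doc]:
    """Run each sub-query independently, then merge results without duplicates."""
    seen: set[str] = set()
    merged: list[Doc] = []
    for sub_query in sub_queries:
        for doc in retrieve(sub_query, k):
            if doc.doc_id not in seen:
                seen.add(doc.doc_id)
                merged.append(doc)
    return merged


# Sub-queries taken from the decomposition output shown above.
sub_queries = [
    "How to use multi-modal models in a chain?",
    "How to turn a chain into a REST API?",
]
results = retrieve_for_sub_queries(sub_queries)
print([d.doc_id for d in results])
```

Each sub-query hits a different document here, which a single combined query might not rank as cleanly. In a real system the overlap scorer would be replaced by the actual retriever (vector search, BM25, and so on).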
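- Query expansion: if the retrieval method is sensitive to how the query is worded, generate several differently phrased versions of the query and retrieve with each of them, which improves recall of relevant content.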
system = """You are an expert at converting user questions into database queries. \
You have access to a database of tutorial videos about a software library for building LLM-powered applications. \
Perform query expansion. If there are multiple common ways of phrasing a user question \
or common synonyms for key words in the question, make sure to return multiple versions \
of t