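`LightRAG` is a retrieval-augmented generation (RAG) system released by a research team at the University of Hong Kong. It combines a graph-structured index with a dual-level retrieval mechanism, with the aim of making retrieval for large language models both more accurate and more efficient. `LightRAG` captures the dependencies between entities and handles both specific and abstract queries, so that the user gets answers that are relevant and rich in detail.
Core features
1. Graph-structured indexing:
- `LightRAG` uses a graph index to capture and represent the dependencies between entities. Because the index mirrors how pieces of information relate to each other, retrieval can follow those relations instead of relying on text similarity alone.
2. Dual-level retrieval:
- `LightRAG` retrieves at two levels. Low-level retrieval targets specific entities together with their attributes and relations; high-level retrieval targets broader topics and themes that span many entities. Using both lets it find relevant information in large datasets for either kind of question.
3. Comprehensive understanding of information:
- With the graph index and dual-level retrieval, `LightRAG` can handle specific queries as well as abstract ones. Whether the user's question is precise or vague, it aims to return a response that is both relevant and rich.
4. Fast adaptation to new data:
- `LightRAG` adapts quickly to new data. It uses an incremental update algorithm that integrates new data without rebuilding the whole knowledge base, so it stays efficient and accurate as the underlying information changes.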
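To make the two retrieval levels concrete, here is a toy sketch. It is not LightRAG's implementation: the `ToyGraph` class, the `low_level` and `high_level` helpers, and the sample entities are all made up for illustration, and the real system matches keywords through vector search rather than exact string comparison.

```python
from dataclasses import dataclass, field


@dataclass
class ToyGraph:
    # entity name -> short description (hypothetical sample data)
    entities: dict = field(default_factory=dict)
    # (source, target) -> set of theme keywords attached to the relation
    relations: dict = field(default_factory=dict)


def low_level(graph, keywords):
    """Entity-centric lookup: entities whose name matches a specific keyword."""
    wanted = {k.lower() for k in keywords}
    return sorted(name for name in graph.entities if name.lower() in wanted)


def high_level(graph, keywords):
    """Theme-centric lookup: relations tagged with a broad keyword."""
    wanted = {k.lower() for k in keywords}
    return sorted(pair for pair, themes in graph.relations.items() if themes & wanted)


graph = ToyGraph(
    entities={
        "AKShare": "Python data library",
        "SSE": "Shanghai Stock Exchange",
        "SZSE": "Shenzhen Stock Exchange",
    },
    relations={
        ("AKShare", "SSE"): {"market overview"},
        ("AKShare", "SZSE"): {"market overview", "securities statistics"},
    },
)

specific = low_level(graph, ["SSE"])            # a question about one entity
broad = high_level(graph, ["market overview"])  # a question about a theme
print(specific)
print(broad)
```

A specific question lands on individual nodes, while a thematic question lands on the relations that carry that theme; a combined mode would simply use both result sets.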
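Technical details
1. Graph-structured indexing:
- The graph index represents entities as nodes and the relations between them as edges, which is how it captures dependencies between entities. This improves retrieval accuracy and gives the model more structure to reason over.
2. Dual-level retrieval:
- Low-level retrieval: looks up specific entities and the relations attached to them, for questions about concrete details.
- High-level retrieval: looks up broader topics and themes across entities, for more abstract questions. Combining the two levels keeps retrieval efficient and accurate on large amounts of data.
3. Incremental update algorithm:
- `LightRAG` uses an incremental update algorithm that integrates new data promptly without rebuilding the whole knowledge base. This lets the system adapt quickly to new information in a changing environment while staying efficient and accurate.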
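The incremental update idea can be sketched as a set union: the graph extracted from a new document is merged into the existing one, so nothing already indexed is recomputed. This is a simplified illustration under my own assumptions (a plain dict-and-set graph and a hypothetical `merge_graph` helper), not the library's actual code.

```python
def merge_graph(base, delta):
    """Merge the nodes and edges of `delta` into `base` in place (no rebuild).

    Both graphs are plain dicts: {"nodes": {name: description}, "edges": set of pairs}.
    Returns how many nodes and edges were actually new.
    """
    added_nodes = 0
    for name, desc in delta["nodes"].items():
        if name not in base["nodes"]:
            base["nodes"][name] = desc
            added_nodes += 1
        elif desc not in base["nodes"][name]:
            # Known entity with a new description: keep both.
            base["nodes"][name] += " | " + desc
    new_edges = delta["edges"] - base["edges"]
    base["edges"] |= new_edges
    return added_nodes, len(new_edges)


# Existing index built from a first document (hypothetical sample data).
index = {
    "nodes": {"AKShare": "data library", "SSE": "Shanghai Stock Exchange"},
    "edges": {("AKShare", "SSE")},
}
# Graph extracted from a second document.
update = {
    "nodes": {"SSE": "stock exchange in Shanghai", "SZSE": "Shenzhen Stock Exchange"},
    "edges": {("AKShare", "SSE"), ("AKShare", "SZSE")},
}

stats = merge_graph(index, update)
print(stats, sorted(index["nodes"]))
```

Only the one new node and the one new edge are added; the overlapping parts of the two graphs are deduplicated instead of being rebuilt.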
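Use cases
1. Search engines:
- In a search engine, `LightRAG` can improve the relevance and richness of search results and so give a better user experience.
- With the graph index and dual-level retrieval, `LightRAG` can interpret the user's query intent more accurately and return results that better match the need.
2. Intelligent customer service:
- In a customer service system, `LightRAG` can respond quickly to user inquiries and provide accurate, friendly service.
- Because it understands information comprehensively, `LightRAG` can handle many types of query, from concrete questions to abstract concepts.
3. Knowledge management systems:
- In a knowledge management system, `LightRAG` can manage and retrieve knowledge efficiently, helping users find what they need quickly.
- With the incremental update algorithm, `LightRAG` can integrate new knowledge promptly and keep the system up to date.
4. Research assistance:
- In research work, `LightRAG` can help researchers quickly find relevant literature and data.
- With the graph index, `LightRAG` can capture the links between papers and give broader research support.
In short, `LightRAG` is a retrieval-augmented generation system that combines a graph-structured index with dual-level retrieval to improve the accuracy and efficiency of retrieval for large language models. Whether in search engines, customer service, knowledge management, or research assistance, it is meant to provide good retrieval and generation for a wide range of needs.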
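There is plenty of material online describing the project in detail, so I will not repeat it here. See also https://arxiv.org/abs/2410.05779 and https://sites.google.com/view/chaoh.
Now let's start our journey: running LightRAG against a local ollama server.
LightRAG already ships an example that uses the local ollama LLM interface, in ./examples/lightrag_ollama_demo.py. Below we walk through that code step by step: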
I. Import the required libraries
import os
import logging
from lightrag import LightRAG, QueryParam
from lightrag.llm import ollama_model_complete, ollama_embedding
from lightrag.utils import EmbeddingFunc
# nest_asyncio allows nested asyncio event loops (for example in a notebook)
import nest_asyncio
# Apply nest_asyncio
nest_asyncio.apply()
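The demo applies `nest_asyncio` without saying why. My understanding is that the synchronous `insert` and `query` calls drive an asyncio event loop internally, which fails when a loop is already running, as it is inside a Jupyter cell. The standard-library sketch below reproduces that failure without LightRAG; the `inner` and `outer` coroutines are stand-ins I made up for the example.

```python
import asyncio


async def inner():
    return "done"


async def outer():
    # Simulates a synchronous wrapper being called while a loop is already
    # running, which is the situation inside a notebook cell.
    loop = asyncio.get_running_loop()
    coro = inner()
    try:
        return loop.run_until_complete(coro)
    except RuntimeError as exc:
        coro.close()  # avoid a "coroutine was never awaited" warning
        return f"RuntimeError: {exc}"


result = asyncio.run(outer())
print(result)
```

`nest_asyncio.apply()` patches asyncio so that this nested `run_until_complete` call is allowed. In a plain script with no running loop the patch should not be needed.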
II. Create the working directory
WORKING_DIR = "./dickens"
logging.basicConfig(format="%(levelname)s:%(message)s", level=logging.INFO)
if not os.path.exists(WORKING_DIR):
    os.mkdir(WORKING_DIR)
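III. Configure the RAG parameters and start reading the file
Fill in the information below according to the address and model names of your local ollama service. Two models are used here: mistral is the LLM, and nomic-embed-text is the embedding model, which converts the document into embedding vectors.
Note: book.txt, the target file for our RAG, is provided at the end of the article.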
rag = LightRAG(
    working_dir=WORKING_DIR,
    llm_model_func=ollama_model_complete,
    llm_model_name="mistral:latest",
    llm_model_max_async=4,
    llm_model_max_token_size=32768,
    llm_model_kwargs={"host": "http://localhost:11434", "options": {"num_ctx": 32768}},
    embedding_func=EmbeddingFunc(
        embedding_dim=768,
        max_token_size=8192,
        func=lambda texts: ollama_embedding(
            texts, embed_model="nomic-embed-text", host="http://localhost:11434"
        ),
    ),
)
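#!curl https://raw.githubusercontent.com/gusye1234/nano-graphrag/main/tests/mock_data.txt > ./book.txt
i = 0
with open("./book.txt", "r", encoding="utf-8") as f:
    rag.insert(f.read())
# Python has no ++ operator: the original `++i` is just +(+i), so this prints 0.
# The whole file is inserted in one call, so there is nothing to count.
# ("完成行数" means "lines completed".)
print("完成行数:", i)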
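Before running the insert, it can save time to confirm that the ollama server is reachable and that both models have been pulled. The sketch below relies on ollama's `GET /api/tags` endpoint, which lists local models; the helper names `fetch_tags` and `missing_models` are my own, and treating a bare name as equivalent to its `:latest` tag is an assumption.

```python
import json
import urllib.request


def missing_models(tags_payload, required):
    """Return the required model names that are absent from an /api/tags payload."""
    installed = {m.get("name", "") for m in tags_payload.get("models", [])}

    def present(name):
        # Assumption: "nomic-embed-text" and "nomic-embed-text:latest" are the same model.
        return name in installed or f"{name}:latest" in installed

    return [name for name in required if not present(name)]


def fetch_tags(host="http://localhost:11434"):
    """Query a running ollama server for its local models (needs the server up)."""
    with urllib.request.urlopen(f"{host}/api/tags", timeout=5) as resp:
        return json.load(resp)


# Offline demonstration with a hand-written payload in the same shape.
sample = {"models": [{"name": "mistral:latest"}]}
absent = missing_models(sample, ["mistral:latest", "nomic-embed-text"])
print(absent)
```

Against a live server the check would be `missing_models(fetch_tags(), ["mistral:latest", "nomic-embed-text"])`, and anything it returns can be fetched with `ollama pull`.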
IV. Run the queries
The example provides four query modes: naive, local, global, and hybrid.
For complex tasks, global and hybrid queries tend to be more accurate.
# Perform naive search
print(
    rag.query("What are the top themes in this story?", param=QueryParam(mode="naive"))
)
# Perform local search
print(
    rag.query("What are the top themes in this story?", param=QueryParam(mode="local"))
)
# Perform global search
print(
    rag.query("What are the top themes in this story?", param=QueryParam(mode="global"))
)
# Perform hybrid search
print(
    rag.query("What are the top themes in this story?", param=QueryParam(mode="hybrid"))
)
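Since the four calls differ only in the mode string, they can be driven from a loop, which also makes it easier to compare the answers side by side. This is a small sketch of my own: `compare_modes` and `fake_query` are hypothetical helpers, and the stub stands in for `rag.query` so the snippet runs without a model.

```python
def compare_modes(query_fn, question, modes=("naive", "local", "global", "hybrid")):
    """Run the same question through each mode and collect the answers."""
    results = {}
    for mode in modes:
        try:
            results[mode] = query_fn(question, mode)
        except Exception as exc:  # keep going if one mode fails
            results[mode] = f"<failed: {exc}>"
    return results


def fake_query(question, mode):
    # Stand-in for the real call so the example is self-contained.
    return f"[{mode}] {question}"


answers = compare_modes(fake_query, "What are the top themes in this story?")
for mode, answer in answers.items():
    print(mode, "->", answer)
```

With the objects from this tutorial, the real call would be `compare_modes(lambda q, m: rag.query(q, param=QueryParam(mode=m)), question)`.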
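V. Using a paid online LLM together with the local ollama_embedding model
Because of the limited performance of my local server, the local LLM ran for several hours without finishing the RAG data load. I had to fall back on Alibaba Cloud's paid online model service (ModelScope; the code below calls the OpenAI-compatible DashScope endpoint), combined with the local ollama_embedding model, to read the RAG data, and that ran through without problems. The code changes are as follows.
1) Define the llm_model_func function:
Replace `qwen2-72b-instruct` and the `api_key` value with your own model name and key; nothing else needs to change.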
from lightrag.llm import openai_complete_if_cache
async def llm_model_func(prompt, system_prompt=None, history_messages=None, **kwargs) -> str:
    # Forward the request to the OpenAI-compatible endpoint
    return await openai_complete_if_cache(
        "qwen2-72b-instruct",  # LLM model name
        prompt,
        system_prompt=system_prompt,
        history_messages=history_messages or [],  # avoids a mutable default argument
        api_key="sk-xxxxxxxx",  # LLM API key
        base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",  # LLM URL
        **kwargs
    )
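Hardcoding the key in the source is convenient for a demo but easy to leak. One option is to read it from an environment variable, as in this sketch. The `load_api_key` helper is my own, and `DASHSCOPE_API_KEY` is an assumed variable name that you can replace with whatever your setup uses.

```python
import os


def load_api_key(env=None, var_name="DASHSCOPE_API_KEY"):
    """Read the API key from the environment and fail early if it is missing."""
    env = os.environ if env is None else env
    key = env.get(var_name, "").strip()
    if not key:
        raise RuntimeError(f"Set the {var_name} environment variable first")
    return key


# Demonstration with an explicit mapping instead of the real environment.
demo_key = load_api_key({"DASHSCOPE_API_KEY": " sk-test "})
print(demo_key)
```

In `llm_model_func` the hardcoded string would then become `api_key=load_api_key()`.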
2) Modify step 3 above (configuring the RAG parameters and reading the file) as follows:
rag = LightRAG(
    working_dir=WORKING_DIR,
    llm_model_func=llm_model_func,
    embedding_func=EmbeddingFunc(
        embedding_dim=768,
        max_token_size=8192,
        func=lambda texts: ollama_embedding(
            texts, embed_model="nomic-embed-text", host="http://localhost:11434"
        ),
    ),
)
i = 0
with open("./book.txt", "r", encoding="utf-8") as f:
    rag.insert(f.read())
# Python has no ++ operator: the original `++i` is just +(+i), so this prints 0.
# ("完成行数" means "lines completed".)
print("完成行数:", i)
Execution log:
INFO:lightrag:Logger initialized for working directory: ./dickens
INFO:lightrag:Load KV llm_response_cache with 0 data
INFO:lightrag:Load KV full_docs with 0 data
INFO:lightrag:Load KV text_chunks with 0 data
INFO:lightrag:Loaded graph from ./dickens/graph_chunk_entity_relation.graphml with 0 nodes, 0 edges
INFO:nano-vectordb:Load (0, 768) data
INFO:nano-vectordb:Init {'embedding_dim': 768, 'metric': 'cosine', 'storage_file': './dickens/vdb_entities.json'} 0 data
INFO:nano-vectordb:Load (0, 768) data
INFO:nano-vectordb:Init {'embedding_dim': 768, 'metric': 'cosine', 'storage_file': './dickens/vdb_relationships.json'} 0 data
INFO:nano-vectordb:Load (2, 768) data
INFO:nano-vectordb:Init {'embedding_dim': 768, 'metric': 'cosine', 'storage_file': './dickens/vdb_chunks.json'} 2 data
INFO:lightrag:[New Docs] inserting 1 docs
INFO:lightrag:[New Chunks] inserting 2 chunks
INFO:lightrag:Inserting 2 vectors to chunks
INFO:httpx:HTTP Request: POST http://localhost:11434/api/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/api/embeddings "HTTP/1.1 200 OK"
INFO:lightrag:[Entity Extraction]...
INFO:httpx:HTTP Request: POST https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions "HTTP/1.1 200 OK"
⠙ Processed 1 chunks, 17 entities(duplicated), 0 relations(duplicated)
INFO:httpx:HTTP Request: POST https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions "HTTP/1.1 200 OK"
⠹ Processed 2 chunks, 49 entities(duplicated), 4 relations(duplicated)
INFO:lightrag:Inserting 46 vectors to entities
INFO:httpx:HTTP Request: POST http://localhost:11434/api/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/api/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/api/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/api/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/api/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/api/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/api/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/api/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/api/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/api/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/api/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/api/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/api/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/api/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/api/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/api/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/api/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/api/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/api/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/api/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/api/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/api/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/api/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/api/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/api/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/api/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/api/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/api/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/api/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/api/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/api/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/api/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/api/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/api/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/api/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/api/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/api/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/api/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/api/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/api/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/api/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/api/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/api/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/api/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/api/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/api/embeddings "HTTP/1.1 200 OK"
INFO:lightrag:Inserting 4 vectors to relationships
INFO:httpx:HTTP Request: POST http://localhost:11434/api/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/api/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/api/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/api/embeddings "HTTP/1.1 200 OK"
INFO:lightrag:Writing graph with 46 nodes, 4 edges
完成行数: 0
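The log above is long but regular: every vector stored (2 chunks, 46 entities, 4 relationships) is followed by its own POST to the local embeddings endpoint, and entity extraction goes to the online chat endpoint. A short tally makes that easier to see than scrolling. The `tally_requests` helper and the shortened sample log below are my own illustration.

```python
import re
from collections import Counter

REQUEST = re.compile(r"HTTP Request: POST (\S+)")


def tally_requests(log_text):
    """Count POST requests per URL in an httpx INFO log."""
    return Counter(match.group(1) for match in REQUEST.finditer(log_text))


# A shortened excerpt in the same format as the log above.
sample_log = """\
INFO:lightrag:Inserting 2 vectors to chunks
INFO:httpx:HTTP Request: POST http://localhost:11434/api/embeddings "HTTP/1.1 200 OK"
INFO:httpx:HTTP Request: POST http://localhost:11434/api/embeddings "HTTP/1.1 200 OK"
INFO:lightrag:[Entity Extraction]...
INFO:httpx:HTTP Request: POST https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions "HTTP/1.1 200 OK"
"""

counts = tally_requests(sample_log)
for url, n in counts.most_common():
    print(n, url)
```

Run on a saved copy of the full log, the same function shows how much of the load time goes to per-vector embedding calls versus LLM calls.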
VI. Visualize the relationship graph:
python graph_visual_with_html.py
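Before opening the HTML view, it can be worth checking that the graph file holds what the insert log reported (`Writing graph with 46 nodes, 4 edges`, stored in `./dickens/graph_chunk_entity_relation.graphml`). The sketch below counts nodes and edges with the standard library only; `graph_size` is a helper I wrote for this check, and the inline sample is a made-up three-element graph so the snippet runs on its own.

```python
import xml.etree.ElementTree as ET

# Standard GraphML namespace.
GRAPHML_NS = {"g": "http://graphml.graphdrawing.org/xmlns"}


def graph_size(root):
    """Return (node_count, edge_count) for a parsed GraphML document."""
    nodes = root.findall(".//g:node", GRAPHML_NS)
    edges = root.findall(".//g:edge", GRAPHML_NS)
    return len(nodes), len(edges)


# Tiny hand-written GraphML document for demonstration.
sample = """<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <graph edgedefault="undirected">
    <node id="AKSHARE"/>
    <node id="SSE"/>
    <edge source="AKSHARE" target="SSE"/>
  </graph>
</graphml>"""

size = graph_size(ET.fromstring(sample))
print(size)
```

For the real file, `graph_size(ET.parse("./dickens/graph_chunk_entity_relation.graphml").getroot())` should match the counts in the log if the insert completed.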
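VII. Question testing
Now let's start asking questions.
1. Naive query
# Perform naive search
# The query asks, in Chinese: "Based on the provided material, find all example
# information related to the stock market."
if 1:
    print(
        rag.query("根据提供的材料,给我查找所有关于股票市场相关信息示例?", param=QueryParam(mode="naive"))
    )
Judging from the returned log, the answer reads mostly as the model's own summary rather than the content of our document: it describes the fields in general terms and does not return the document's data examples.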
INFO:httpx:HTTP Request: POST http://localhost:11434/api/embeddings "HTTP/1.1 200 OK"
INFO:lightrag:Truncate 2 to 2 chunks
The provided documents contain detailed information about stock market summaries from both the Shanghai Stock Exchange (SSE) and the Shenzhen Stock Exchange (SZSE). Below is an overview of the data available:

### Shanghai Stock Exchange (SSE)
The SSE provides a summary of the stock market which includes the following data points:
- **Circulating Share Capital**: The total number of shares that are freely tradable on the market.
- **Total Market Value**: The total value of all listed stocks.
- **Average P/E Ratio**: The average price-to-earnings ratio of all listed companies.
- **Number of Listed Companies**: Total number of companies listed on the exchange.
- **Listed Stocks**: Total number of stocks listed on the exchange.
- **Circulating Market Value**: The total value of the circulating share capital.
- **Report Time**: The date when the report was published.
- **Total Shares**: The total number of shares issued by all listed companies.

### Shenzhen Stock Exchange (SZSE)
For the SZSE, there is a more detailed breakdown by security category:
- **Stocks**: Total number of stocks listed, their trading volume, total market value, and circulating market value.
- **Main Board A-Shares**: Specific data for A-shares listed on the main board.
- **Main Board B-Shares**: Specific data for B-shares listed on the main board.
- **Small-Medium Enterprises Board**: Data for stocks listed on the SME board.
- **Growth Enterprise Market (GEM) A-Shares**: Data for A-shares listed on the GEM.
- **Funds**: Trading volume and total market value of funds listed on the exchange.
- **ETFs**: Data specific to Exchange-Traded Funds.
- **LOFs**: Data for Listed Open-Ended Funds.
- **Closed-End Funds**: Information about closed-end funds.
- **Graded Funds**: Details about graded investment funds.
- **Bonds**: Total trading volume and total market value of bonds.
- **Bond Spot**: Trading volume and total market value of bond spot transactions.
- **Bond Repurchase**: Trading volume for bond repurchase agreements.
- **ABS**: Trading volume and total market value of Asset-Backed Securities.
- **Options**: Information regarding options traded on the exchange.

These summaries can be accessed using the `akshare` library in Python by calling the `stock_sse_summary()` function for the SSE data and the `stock_szse_summary(date)` function for the SZSE data, specifying the date for the latter if needed. The data provided offers insights into the overall health and activity of the stock markets in China, including metrics such as trading volumes, market values, and the number of listed securities across different categories
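2. Local query
if 1:
    # Perform local search
    print(
        rag.query("根据提供的材料,给我查找所有关于股票市场相关信息示例?", param=QueryParam(mode="local"))
    )
Judging from the returned log, this answer is again mostly the model's own paraphrase, in English, rather than the examples from our document as written (although it does quote the document's figures):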
INFO:httpx:HTTP Request: POST http://localhost:11434/api/embeddings "HTTP/1.1 200 OK"
INFO:lightrag:Local query uses 46 entites, 4 relations, 2 text units
The provided materials offer comprehensive insights into the stock markets, specifically focusing on the Shanghai and Shenzhen Stock Exchanges in China. Below is a detailed summary of the data available, categorized into sections for clarity:

### Shanghai Stock Exchange (SSE)
The SSE provides a summary of the overall market conditions through the `stock_sse_summary` function accessible via the AKShare library. This function retrieves recent trading day data regarding various aspects of the stock market. Key information includes:
- **Circulating Share Volume**: The total number of shares that are freely tradable on the market.
- **Total Market Capitalization**: The aggregate market value of all listed companies.
- **Average Price-to-Earnings Ratio**: An indicator of the market's valuation level.
- **Number of Listed Companies**: Total count of companies listed on the exchange.
- **Listed Stocks**: The number of stocks currently traded on the exchange.
- **Floating Market Value**: The market value of freely tradable shares.
- **Report Date**: The date on which the data was reported.

For instance, as of December 30, 2021, the SSE reported:
- A circulating share volume of 40,403.47 billion,
- A total market capitalization of 516,714.68 billion yuan,
- An average P/E ratio of 17.92,
- 2,036 listed companies,
- 2,078 listed stocks,
- A floating market value of 432,772.13 billion yuan,
- And a total share volume of 46,234.03 billion.

### Shenzhen Stock Exchange (SZSE)
The SZSE offers a detailed breakdown of statistics on different securities categories through the `stock_szse_summary` function. This function requires a specific date as an input parameter and returns data on:
- **Security Categories**: Various types of securities listed on the exchange.
- **Quantity**: The number of securities listed.
- **Trading Volume**: The total amount of money exchanged in transactions.
- **Total Market Capitalization**: The total value of all securities.
- **Floating Market Value**: The market value of freely tradable securities.

As an example, on June 19, 2020, the SZSE reported:
- 2,284 listed securities,
- A trading volume of 464.7749 billion yuan,
- A total market capitalization of 27.06514 trillion yuan,
- A floating market value of 21.04546 trillion yuan.

### Specific Securities Categories
The SZSE further categorizes the securities into distinct groups, providing detailed statistics for each category:
- **Main Board A-shares**: 460 listed securities with a trading volume of 97.7595 billion yuan, a total market capitalization of 7.864787 trillion yuan, and a floating market value of 6.94399 trillion yuan.
- **Main Board B-shares**: 46 listed securities with a trading volume of 86.2682 million yuan, a total market capitalization of 47.59658 billion yuan, and a floating market value of 47.06385 billion yuan.
- **Small and Medium Enterprises Board**: 960 listed securities with a trading volume of 201.3526 billion yuan, a total market capitalization of 1.130741 trillion yuan, and a floating market value of 866.9555 billion yuan.
- **Growth Enterprise Market (GEM) A-shares**: 818 listed securities with a trading volume of 165.5765 billion yuan, a total market capitalization of 7.845345 trillion yuan, and a floating market value of 5.384854 trillion yuan.
- **Investment Funds**: 551 listed securities with a trading volume of 13.62524 billion yuan, a total market capitalization of 241.7277 billion yuan, and a floating market value of 241.7277 billion yuan.
- **Exchange-Traded Funds (ETFs)**: 100 listed securities with a trading volume of 11.65436 billion yuan, a total market capitalization of 162.8294 billion yuan, and a floating market value of 162.8294 billion yuan.
- **Listed Open-Ended Funds (LOFs)**: 250 listed securities with a trading volume of 733.5768 million yuan, a total market capitalization of 40.43156 billion yuan, and a floating market value of 40.43156 billion yuan.
- **Closed-End Funds**: 1 listed security with a trading volume of 552,757 yuan, a total market capitalization of 762.244 million yuan, and a floating market value of 762.244 million yuan.
- **Structured Funds (Graded Funds)**: 200 listed securities with a trading volume of 1.236746 billion yuan, a total market capitalization of 37.70451 billion yuan, and a floating market value of 37.70451 billion yuan.
- **Bonds**: 7,174 listed securities with a trading volume of 137.1389 billion yuan.
- **Physical Bonds**: 6,599 listed securities with a trading volume of 29.11357 billion yuan, a total market capitalization of 36.83881 trillion yuan, and a floating market value of 1.823072 trillion yuan.
- **Bond Repurchase Agreements**: 13 listed securities with a trading volume of 105.4592 billion yuan.
- **Asset-Backed Securities (ABS)**: 562 listed securities with a trading volume of 2.566134 billion yuan, a total market capitalization of 484.9642 billion yuan, and a floating market value of 484.9642 billion yuan.
- **Options**: 108 listed securities with a trading volume of 244.156 million yuan.

These summaries provide valuable insights into the health and dynamics of the Chinese stock markets, including the size, liquidity, and composition of the securities listed on these exchanges.
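3. Global query
if 1:
    # Perform global search
    print(
        rag.query("根据提供的材料,给我查找所有关于股票市场相关信息示例?", param=QueryParam(mode="global"))
    )
In global mode the model understands our question and answers from the document, returning the interfaces and data examples we were after (the answer is in Chinese, matching the query and the source document). This answer is a success!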
INFO:httpx:HTTP Request: POST http://localhost:11434/api/embeddings "HTTP/1.1 200 OK"
INFO:lightrag:Global query uses 5 entites, 4 relations, 1 text units
INFO:httpx:HTTP Request: POST https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions "HTTP/1.1 200 OK"
### 股票市场信息概览

#### 上海证券交易所(SSE)市场总貌
**接口**: `stock_sse_summary`
- **目标地址**: http://www.sse.com.cn/market/stockdata/statistic/
- **描述**: 提供上海证券交易所的市场总貌数据,包括流通股本、总市值、平均市盈率等关键指标。
- **输入参数**: 无需输入参数。
- **输出参数**:
  - 项目: 包括主板、科创板等不同板块的统计数据。
  - 股票、科创板、主板: 分别展示各板块的具体数据,如流通股本、总市值和平均市盈率等。

**数据示例**:
```markdown
      项目         股票      科创板         主板
0   流通股本   40403.47   413.63   39989.84
1    总市值  516714.68  55719.6  460995.09
2  平均市盈率      17.92     71.0      16.51
3   上市公司       2036      377       1659
4   上市股票       2078      377       1701
5   流通市值  432772.13  22274.3  410497.83
6   报告时间   20211230 20211230   20211230
8    总股本   46234.03   1211.5   45022.54
```

#### 深圳证券交易所(SZSE)市场总貌
**接口**: `stock_szse_summary`
- **目标地址**: http://www.szse.cn/market/overview/index.html
- **描述**: 提供深圳证券交易所市场总貌数据,按证券类别统计,包括股票、债券、基金等多种类型。
- **输入参数**:
  - `date`: 需要指定日期,格式为"YYYYMMDD"。当前交易日的数据需在交易所收盘后获取。
- **输出参数**:
  - 证券类别: 包括股票、主板A股、主板B股、中小板等。
  - 数量: 各类证券的数量,单位为只。
  - 成交金额、总市值、流通市值: 相关的金融数据,单位分别为元。

**数据示例**:
```markdown
    证券类别    数量          成交金额           总市值          流通市值
0     股票  2284  4.647749e+11  2.706514e+13  2.104546e+13
1   主板A股   460  9.775950e+10  7.864787e+12  6.943990e+12
2   主板B股    46  8.626816e+07  4.759658e+10  4.706385e+10
3    中小板   960  2.013526e+11  1.130741e+13  8.669555e+12
4  创业板A股   818  1.655765e+11  7.845345e+12  5.384854e+12
5     基金   551  1.362524e+10  2.417277e+11  2.417277e+11
6    ETF   100  1.165436e+10  1.628294e+11  1.628294e+11
7    LOF   250  7.335768e+08  4.043156e+10  4.043156e+10
8  封闭式基金     1  5.527571e+05  7.622440e+08  7.622440e+08
9   分级基金   200  1.236746e+09  3.770451e+10  3.770451e+10
10    债券  7174  1.371389e+11           NaN           NaN
11  债券现券  6599  2.911357e+10  3.683881e+13  1.823072e+12
12  债券回购    13  1.054592e+11           NaN           NaN
13   ABS   562  2.566134e+09  4.849642e+11  4.849642e+11
14    期权
```

### 结论
以上信息展示了通过`AKShare`库访问上海证券交易所和深圳证券交易所的市场总貌数据的方法及数据示例。这些数据覆盖了流通股本、总市值、平均市盈率以及各类证券的详细统计数据,对于分析中国股市的整体状况提供了全面的视角。
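4. Hybrid query
if 1:
    # Perform hybrid search
    print(
        rag.query("根据提供的材料,给我查找所有关于股票市场相关信息示例?", param=QueryParam(mode="hybrid"))
    )
In hybrid mode the model also understands our question and answers from the document. The answer is more concise and clear. This one is a success as well!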
INFO:httpx:HTTP Request: POST http://localhost:11434/api/embeddings "HTTP/1.1 200 OK"
INFO:lightrag:Local query uses 46 entites, 4 relations, 2 text units
INFO:httpx:HTTP Request: POST http://localhost:11434/api/embeddings "HTTP/1.1 200 OK"
INFO:lightrag:Global query uses 5 entites, 4 relations, 1 text units
INFO:httpx:HTTP Request: POST https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions "HTTP/1.1 200 OK"
## 股票市场信息概览

### AKShare与股票市场数据的集成
AKShare是一个强大的组织,它提供了一个库来访问股票市场的数据,包括来自上海证券交易所和深圳证券交易所的信息。通过其API,AKShare能够从这两个主要的金融市场获取和检索数据。

### 上海证券交易所与AKShare的集成
上海证券交易所是一个重要的金融平台,它不仅提供了股票交易的场所,还详细记录了市场上的各种统计数据,如上市公司的概况、交易量和市值等。AKShare通过`stock_sse_summary`函数与上海证券交易所进行数据交互,这个事件代表了从上海证券交易所获取市场总貌数据的过程。这包括了市场资本化水平、平均市盈率以及上市公司的数量等关键指标。

### 深圳证券交易所与AKShare的集成
同样地,深圳证券交易所也是一个活跃的金融市场,它涵盖了多种证券类别,如股票、债券和基金,并提供了详细的市场数据。AKShare通过`stock_szse_summary`事件从深圳证券交易所获取这些数据,该事件涉及从深圳证券交易所提取总结性数据,涵盖了各类证券的数量、交易量和市场价值等重要统计信息。

### 数据获取与分析
上海证券交易所和深圳证券交易所通过AKShare的API向用户提供数据。`stock_sse_summary`和`stock_szse_summary`事件分别代表了从两个交易所获取数据的过程。这些数据对于理解市场动态、分析投资机会以及进行市场研究至关重要。

### 实时行情数据示例
以下是通过AKShare从上海证券交易所获取的实时行情数据的一个示例:

| 名称 | 类型 | 描述 |
|----------|--------|------------|
| 项目 | object | - |
| 股票 | object | - |
| 科创板 | object | - |
| 主板 | object | - |

请注意,上述示例中的“-”表示具体描述未在提供的材料中明确给出。实际上,这些字段可能包含了具体的数值和统计信息,如股票价格、成交量、科创板的公司数量或主板的市场总值等。

总之,AKShare与上海证券交易所和深圳证券交易所之间的数据交互为用户提供了深入洞察中国股票市场的工具,帮助投资者和分析师更好地理解和利用市场信息。
Appendix: the book.txt document (kept in its original Chinese, since it is the input file used above):
## [AKShare](https://github.com/akfamily/akshare) 股票数据
### A股
#### 股票市场总貌
##### 上海证券交易所
接口: stock_sse_summary
目标地址: http://www.sse.com.cn/market/stockdata/statistic/
描述: 上海证券交易所-股票数据总貌
限量: 单次返回最近交易日的股票数据总貌(当前交易日的数据需要交易所收盘后统计)
输入参数
| 名称 | 类型 | 描述 |
|-----|-----|-----|
| - | - | - |
输出参数-实时行情数据
| 名称 | 类型 | 描述 |
|-----|--------|-----|
| 项目 | object | - |
| 股票 | object | - |
| 科创板 | object | - |
| 主板 | object | - |
接口示例
```python
import akshare as ak
stock_sse_summary_df = ak.stock_sse_summary()
print(stock_sse_summary_df)
```
数据示例
```
项目 股票 科创板 主板
0 流通股本 40403.47 413.63 39989.84
1 总市值 516714.68 55719.6 460995.09
2 平均市盈率 17.92 71.0 16.51
3 上市公司 2036 377 1659
4 上市股票 2078 377 1701
5 流通市值 432772.13 22274.3 410497.83
6 报告时间 20211230 20211230 20211230
8 总股本 46234.03 1211.5 45022.54
```
##### 深圳证券交易所
###### 证券类别统计
接口: stock_szse_summary
目标地址: http://www.szse.cn/market/overview/index.html
描述: 深圳证券交易所-市场总貌-证券类别统计
限量: 单次返回指定 date 的市场总貌数据-证券类别统计(当前交易日的数据需要交易所收盘后统计)
输入参数
| 名称 | 类型 | 描述 |
|------|-----|-------------------------------------|
| date | str | date="20200619"; 当前交易日的数据需要交易所收盘后统计 |
输出参数
| 名称 | 类型 | 描述 |
|------|---------|---------|
| 证券类别 | object | - |
| 数量 | int64 | 注意单位: 只 |
| 成交金额 | float64 | 注意单位: 元 |
| 总市值 | float64 | - |
| 流通市值 | float64 | - |
接口示例
```python
import akshare as ak
stock_szse_summary_df = ak.stock_szse_summary(date="20200619")
print(stock_szse_summary_df)
```
数据示例
```
证券类别 数量 成交金额 总市值 流通市值
0 股票 2284 4.647749e+11 2.706514e+13 2.104546e+13
1 主板A股 460 9.775950e+10 7.864787e+12 6.943990e+12
2 主板B股 46 8.626816e+07 4.759658e+10 4.706385e+10
3 中小板 960 2.013526e+11 1.130741e+13 8.669555e+12
4 创业板A股 818 1.655765e+11 7.845345e+12 5.384854e+12
5 基金 551 1.362524e+10 2.417277e+11 2.417277e+11
6 ETF 100 1.165436e+10 1.628294e+11 1.628294e+11
7 LOF 250 7.335768e+08 4.043156e+10 4.043156e+10
8 封闭式基金 1 5.527571e+05 7.622440e+08 7.622440e+08
9 分级基金 200 1.236746e+09 3.770451e+10 3.770451e+10
10 债券 7174 1.371389e+11 NaN NaN
11 债券现券 6599 2.911357e+10 3.683881e+13 1.823072e+12
12 债券回购 13 1.054592e+11 NaN NaN
13 ABS 562 2.566134e+09 4.849642e+11 4.849642e+11
14 期权 108 2.441560e+08 NaN NaN
```