Spring Boot 3 Initialization
Create the project
Select the base dependencies
Adding dependencies
- API documentation: knife4j
<!-- API documentation -->
<dependency>
<groupId>com.github.xiaoymin</groupId>
<artifactId>knife4j-openapi3-jakarta-spring-boot-starter</artifactId>
<version>4.4.0</version>
</dependency>
- Database (MyBatis-Plus)
After adding it, remove any plain MyBatis dependencies to avoid conflicts.
<!-- MyBatis-Plus -->
<dependency>
<groupId>com.baomidou</groupId>
<artifactId>mybatis-plus-spring-boot3-starter</artifactId>
<version>3.5.12</version>
</dependency>
Add the mapper scan annotation to the application's main class:
@MapperScan("com.cxcs.portsai.mapper")
- Hutool utility library
<!-- Utilities -->
<dependency>
<groupId>cn.hutool</groupId>
<artifactId>hutool-all</artifactId>
<version>5.8.37</version>
</dependency>
- yml configuration
spring:
  application:
    name: ports-weather-ai-backend
  # Database
  datasource:
    driver-class-name: com.mysql.cj.jdbc.Driver
    url: jdbc:mysql://localhost:3306/ports
    username: root
    password: cxcs
server:
  port: 8095
  servlet:
    context-path: /api
# API documentation
springdoc:
  swagger-ui:
    path: /swagger-ui.html
    tags-sorter: alpha
    operations-sorter: alpha
  api-docs:
    path: /v3/api-docs
  group-configs:
    - group: 'default'
      paths-to-match: '/**'
      packages-to-scan: com.cxcs.portsai.controller
# knife4j enhancements
knife4j:
  enable: true
  setting:
    language: zh_cn
mybatis-plus:
  configuration:
    map-underscore-to-camel-case: false # disable underscore-to-camelCase mapping; entity field names map directly to column names
    log-impl: org.apache.ibatis.logging.stdout.StdOutImpl
  global-config:
    db-config:
      logic-delete-field: isDelete # entity field used for global logical delete
      logic-delete-value: 1 # value for deleted rows (default 1)
      logic-not-delete-value: 0 # value for live rows (default 0)
Common code
- Custom exception
BusinessException
- Response wrapper
BaseResponse
- Global exception handler
GlobalExceptionHandler
- Request wrapper
PageRequest
- Global CORS configuration
@Configuration
public class CorsConfig implements WebMvcConfigurer {
    @Override
    public void addCorsMappings(CorsRegistry registry) {
        // Apply to all request paths
        registry.addMapping("/**")
                // Allow cookies to be sent
                .allowCredentials(true)
                // Allowed origins (must use patterns; a literal * conflicts with allowCredentials)
                .allowedOriginPatterns("*")
                .allowedMethods("GET", "POST", "PUT", "DELETE", "OPTIONS")
                .allowedHeaders("*")
                .exposedHeaders("*");
    }
}
Controller test
@RestController
@RequestMapping("/health")
public class HealthController {
    @GetMapping
    public BaseResponse<String> healthTest() {
        return ResultUtils.success("ok");
    }
}
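The health controller returns a BaseResponse built by ResultUtils, and the common-code list names BusinessException and BaseResponse without showing them. A minimal sketch of what such classes might look like follows. The code/data/message layout, the success code 0, and the error helper are assumptions made for illustration, not the project's actual code.

```java
// Hypothetical sketch: field names and the success code 0 are assumptions.
class BaseResponse<T> {
    private final int code;       // 0 means success in this sketch
    private final T data;         // payload, null on failure
    private final String message; // human-readable status

    BaseResponse(int code, T data, String message) {
        this.code = code;
        this.data = data;
        this.message = message;
    }

    int getCode() { return code; }
    T getData() { return data; }
    String getMessage() { return message; }
}

// Business-level failure that carries an application error code.
class BusinessException extends RuntimeException {
    private final int code;

    BusinessException(int code, String message) {
        super(message);
        this.code = code;
    }

    int getCode() { return code; }
}

// Static helpers so a controller can build a response in one call.
final class ResultUtils {
    private ResultUtils() { }

    static <T> BaseResponse<T> success(T data) {
        return new BaseResponse<>(0, data, "ok");
    }

    static <T> BaseResponse<T> error(BusinessException e) {
        return new BaseResponse<>(e.getCode(), null, e.getMessage());
    }
}
```

A global exception handler would typically catch BusinessException and return ResultUtils.error(e), so every endpoint answers with the same envelope.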
Database layer
- MyBatisX code generation
Generate the code and move it into the project.
- After moving it, edit the entity classes under model.entity
The id field uses the snowflake algorithm (IdType.ASSIGN_ID assigns a unique ID).
By default a delete removes the row physically; with @TableLogic it becomes a logical delete that sets the field to 1.
@TableId(type = IdType.ASSIGN_ID)
private Long id;
@TableLogic
private Integer isDelete;
- Create an enum class under model.enums
@Getter
public enum UserRoleEnum {
    USER("用户", "user"),
    ADMIN("管理员", "admin");
    private final String text;
    private final String value;
    UserRoleEnum(String text, String value) {
        this.text = text;
        this.value = value;
    }
    /**
     * Look up an enum constant by its value
     *
     * @param value the value of the enum constant
     * @return the matching constant, or null if none matches
     */
    public static UserRoleEnum getEnumByValue(String value) {
        if (ObjUtil.isEmpty(value)) {
            return null;
        }
        for (UserRoleEnum anEnum : UserRoleEnum.values()) {
            if (anEnum.value.equals(value)) {
                return anEnum;
            }
        }
        return null;
    }
}
- Wrapper classes
Integrating LangChain4j
JUnit test
<properties>
<java.version>21</java.version>
<langchain4j.version>1.0.0-beta3</langchain4j.version>
</properties>
<dependencies>
<!-- langChain4j : https://docs.langchain4j.info/get-started -->
<dependency>
<groupId>dev.langchain4j</groupId>
<artifactId>langchain4j</artifactId>
<version>${langchain4j.version}</version>
</dependency>
<dependency>
<groupId>dev.langchain4j</groupId>
<artifactId>langchain4j-open-ai</artifactId>
<version>${langchain4j.version}</version>
</dependency>
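public class LLMTest {
    /**
     * Test the framework's built-in OpenAI demo model (no VPN required)
     */
    @Test
    void openAITest() {
        ChatLanguageModel model = OpenAiChatModel
                .builder()
                .baseUrl("http://langchain4j.dev/demo/openai/v1")
                .apiKey("demo")
                .modelName("gpt-4o-mini")
                .build();
        // Prompt: "What framework and what model are you?"
        String answer = model.chat("你是什么框架,什么模型?");
        System.out.println(answer);
    }
}
The output is as follows:
我是基于OpenAI的GPT-3模型,这是一个大型的语言模型,使用深度学习技术进行自然语言处理。我的框架是Transformer,这种架构在处理文本生成和理解方面非常有效。如果你需要关于模型或框架的具体信息,随时可以问我!
(In English: "I am based on OpenAI's GPT-3 model, a large language model that uses deep learning for natural language processing. My architecture is the Transformer, which is very effective for text generation and understanding. If you need specific information about the model or framework, feel free to ask!")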
Spring Boot Integration
Reference: Spring Boot Integration | LangChain4j Chinese documentation
- pom configuration
Only one dependency is needed:
<!-- langChain4j : https://docs.langchain4j.info/get-started -->
<dependency>
<groupId>dev.langchain4j</groupId>
<artifactId>langchain4j-open-ai-spring-boot-starter</artifactId>
<version>${langchain4j.version}</version>
</dependency>
- yml model configuration
Use the framework's built-in demo model
# LangChain4j model configuration
langchain4j:
  open-ai:
    chat-model:
      api-key: demo
      model-name: gpt-4o-mini
      # Log requests and responses
      log-requests: true
      log-responses: true
      base-url: http://langchain4j.dev/demo/openai/v1
# The log level must be debug for the LangChain4j request/response logs to show up
logging:
  level:
    root: debug
- Spring Boot test
Running the test starts the Spring context, so dependencies are injected automatically.
@SpringBootTest
public class LLMTest {
    @Autowired
    OpenAiChatModel openAiChatModel;
    @Test
    void springBootModel() {
        String answer = openAiChatModel.chat("你是什么大模型?"); // "What large model are you?"
        System.out.println(answer);
    }
}
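Integrating Alibaba Cloud Bailian
Bailian platform: Bailian console
Bailian model list: Model list, Model Studio (Bailian), Alibaba Cloud Help Center
- Add the Alibaba model dependency to pom.xml
Reference: DashScope (Qwen) | LangChain4j Chinese documentation
<!-- Alibaba Cloud Bailian models -->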
<dependency>
<groupId>dev.langchain4j</groupId>
<artifactId>langchain4j-community-dashscope-spring-boot-starter</artifactId>
<version>${langchain4j.version}</version>
</dependency>
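- Add to the yml configuration
langchain4j:
  community:
    dashscope:
      chat-model:
        api-key: ${DASH_SCOPE_API_KEY}
        model-name: qwen-max
- Set the Bailian API key as a system environment variable named DASH_SCOPE_API_KEY
Bailian platform: Bailian console
After setting it, restart IntelliJ IDEA so the new variable is picked up.
- Spring Boot test
/**
 * Qwen (Tongyi Qianwen) via Alibaba Cloud Bailian
 */
@Autowired
private QwenChatModel qwenChatModel;
@Test
void qianWenTest() {
    String answer = qwenChatModel.chat("你是什么模型?"); // "What model are you?"
    System.out.println(answer);
}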
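AiService
How it works
Proxy pattern
AiServices assembles the Assistant interface together with the other components and uses reflection to create a proxy object that implements Assistant. The proxy handles all input and output conversion. In this example the chat method takes a String, but the model expects a UserMessage, so the proxy wraps the string in a UserMessage and calls the chat language model. The chat method also returns a String, while the model returns an AiMessage, so the proxy converts that back into a string.
In short: the proxy's job is input conversion and output conversion.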
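The proxy idea can be illustrated with the JDK's own java.lang.reflect.Proxy. This is only a sketch of the pattern, not LangChain4j's implementation: UserMsg and AiMsg are simplified stand-ins for the framework's message types, ProxySketch is a hypothetical name, and the model is a plain Function.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.function.Function;

// Simplified stand-ins for the framework's message types (hypothetical).
final class UserMsg {
    final String text;
    UserMsg(String text) { this.text = text; }
}

final class AiMsg {
    final String text;
    AiMsg(String text) { this.text = text; }
}

// Same shape as the Assistant interface used in these notes.
interface Assistant {
    String chat(String userMessage);
}

final class ProxySketch {
    private ProxySketch() { }

    // Builds a proxy that wraps the String argument in a UserMsg before
    // calling the model and unwraps the AiMsg reply back into a String.
    static Assistant create(Function<UserMsg, AiMsg> model) {
        InvocationHandler handler = (proxy, method, args) -> {
            if (!"chat".equals(method.getName())) {
                throw new UnsupportedOperationException(method.getName());
            }
            UserMsg input = new UserMsg((String) args[0]); // input conversion
            AiMsg output = model.apply(input);              // model call
            return output.text;                             // output conversion
        };
        return (Assistant) Proxy.newProxyInstance(
                Assistant.class.getClassLoader(),
                new Class<?>[] { Assistant.class },
                handler);
    }
}
```

For example, ProxySketch.create(msg -> new AiMsg("echo: " + msg.text)).chat("hi") returns "echo: hi"; the caller only ever sees strings.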
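Default approach
Add the dependency
<!-- LangChain4j high-level features -->
<dependency>
<groupId>dev.langchain4j</groupId>
<artifactId>langchain4j-spring-boot-starter</artifactId>
<version>${langchain4j.version}</version>
</dependency>
- Create the assistant/Assistant interface, which wraps user input in a chat method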
public interface Assistant {
String chat(String userMessage);
}
- Create a test class
Build the AI service with AiServices.create
@SpringBootTest
public class AiServiceTest {
    @Autowired
    private QwenChatModel qwenChatModel;
    @Test
    void testChat() {
        Assistant assistant = AiServices.create(Assistant.class, qwenChatModel);
        String answer = assistant.chat("你是什么大模型"); // "What large model are you?"
        System.out.println(answer);
    }
}
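Annotation
Upgrade: use the @AiService annotation
- The AiService implementation is created from the annotation at application startup
- More convenient and efficient
- If the project declares only one model bean, the bare annotation is enough
- If several models are declared, the annotation must name the one to use; otherwise startup fails
// wiringMode = EXPLICIT: the model must be named explicitly
// "qwenChatModel" is the bean name of the model, matching the field name below
/* @Autowired
private QwenChatModel qwenChatModel;
*/
@AiService(wiringMode = EXPLICIT, chatModel = "qwenChatModel")
public interface Assistant {
    String chat(String userMessage);
}
Test:
@Autowired
private Assistant assistant;
@Test
void testAiService() {
    String answer = assistant.chat("你是什么大模型"); // "What large model are you?"
    System.out.println(answer);
}
If the model bean name is wrong (here misspelled as qwenChatModels), the bean cannot be resolved and startup fails with: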
Caused by: org.springframework.beans.factory.BeanCreationException: Error creating bean with name 'assistant': Cannot resolve reference to bean 'qwenChatModels' while setting bean property 'chatLanguageModel'
Chat Memory
Basic usage
- Simple approach: resend the earlier messages with every request (tedious)
@Autowired
private QwenChatModel qwenChatModel;
// Multi-turn chat with an explicit message list
@Test
public void ChatMemoryList() {
    // First turn: "I am CxCS"
    UserMessage userMessage1 = UserMessage.userMessage("我是CxCS");
    ChatResponse chatResponse1 = qwenChatModel.chat(userMessage1);
    AiMessage aiMessage1 = chatResponse1.aiMessage();
    // Print the model's reply
    System.out.println(aiMessage1.text());
    // Second turn: "Do you know who I am?" (the previous turn is sent along in the list)
    UserMessage userMessage2 = UserMessage.userMessage("你知道我是谁吗");
    ChatResponse chatResponse2 = qwenChatModel.chat(Arrays.asList(userMessage1,
            aiMessage1, userMessage2));
    AiMessage aiMessage2 = chatResponse2.aiMessage();
    // Print the model's reply
    System.out.println(aiMessage2.text());
}
- Use ChatMemory for conversation memory
- `MessageWindowChatMemory.withMaxMessages` caps the total number of messages kept
(user messages, replies, and preset messages each count as one)
@Test
public void ChatMemoryMessage() {
    // Cap the number of messages
    MessageWindowChatMemory chatMemory = MessageWindowChatMemory.withMaxMessages(10);
    // Create the AI service
    Assistant assistant = AiServices
            .builder(Assistant.class)
            .chatLanguageModel(qwenChatModel)
            .chatMemory(chatMemory)
            .build();
    // Call the service
    String answer1 = assistant.chat("我是CxCS"); // "I am CxCS"
    System.out.println(answer1);
    String answer2 = assistant.chat("我是谁"); // "Who am I?"
    System.out.println(answer2);
}
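What a message cap means can be shown with a small sliding-window list. This is a sketch of the general idea only: WindowMemorySketch is a hypothetical class that stores plain strings and always evicts the oldest entry, and it makes no attempt to reproduce any special handling the real MessageWindowChatMemory may apply to particular message types.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a message-window memory:
// keeps only the newest maxMessages entries.
final class WindowMemorySketch {
    private final int maxMessages;
    private final List<String> messages = new ArrayList<>();

    WindowMemorySketch(int maxMessages) {
        if (maxMessages < 1) {
            throw new IllegalArgumentException("maxMessages must be >= 1");
        }
        this.maxMessages = maxMessages;
    }

    // Appends a message and evicts from the front once the cap is exceeded.
    void add(String message) {
        messages.add(message);
        while (messages.size() > maxMessages) {
            messages.remove(0);
        }
    }

    // Returns a copy so callers cannot modify the window directly.
    List<String> messages() {
        return new ArrayList<>(messages);
    }
}
```

With a cap of 3, adding five messages m1 to m5 leaves m3, m4, m5, so the model no longer sees anything said in m1 or m2.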
AiService implementation
Reference: AI Services | LangChain4j Chinese documentation
Implement chat memory with AiService (recommended)
- Create a configuration class
config/MemoryChatAssistantConfig
It provides the ChatMemory bean that the AiService injects.
@Configuration
public class MemoryChatAssistantConfig {
    /**
     * Registers a chat memory bean at startup, capped at 10 messages
     */
    @Bean
    public ChatMemory chatMemory() {
        return MessageWindowChatMemory.withMaxMessages(10);
    }
}
- Create a new assistant dedicated to chat with memory
assistant/MemoryChatAssistant
@AiService(
        wiringMode = EXPLICIT,
        chatModel = "qwenChatModel",
        chatMemory = "chatMemory"
)
public interface MemoryChatAssistant {
    String chat(String userMessage);
}
- With memory configured, just call the assistant directly
@Autowired
private MemoryChatAssistant memoryChatAssistant;
@Test
public void ChatMemoryAssistant() {
    String answer1 = memoryChatAssistant.chat("我是CxCS"); // "I am CxCS"
    System.out.println(answer1);
    String answer2 = memoryChatAssistant.chat("我是谁"); // "Who am I?"
    System.out.println(answer2);
}
Isolated conversations
Tag each conversation with an id; chatMemoryProvider supplies a separate memory per id.
Usage within the framework
- Create assistant/SeparateChatAssistant
@AiService(
        wiringMode = EXPLICIT,
        chatModel = "qwenChatModel",
        chatMemory = "chatMemory",
        chatMemoryProvider = "chatMemoryProvider"
)
public interface SeparateChatAssistant {
    /**
     * Chat with per-conversation history
     * @param memoryId conversation id
     * @param userMessage user message
     * @return the model's reply
     */
    String chat(@MemoryId int memoryId, @UserMessage String userMessage);
}
- Add a configuration class
@Configuration
public class SeparateChatAssistantConfig {
    @Bean
    ChatMemoryProvider chatMemoryProvider() {
        return memoryId -> MessageWindowChatMemory.builder()
                .id(memoryId)
                .maxMessages(10)
                .build();
    }
}
- Test: pass the conversation id with every call
@Autowired
private SeparateChatAssistant separateChatAssistant;
@Test
public void ChatMemoryProvider() {
    String answer1 = separateChatAssistant.chat(1,"我是CxCS"); // "I am CxCS"
    System.out.println(answer1);
    String answer2 = separateChatAssistant.chat(1,"我是谁"); // "Who am I?"
    System.out.println(answer2);
    // Simulate a new conversation
    String answer3 = separateChatAssistant.chat(2,"我是谁");
    System.out.println(answer3);
}
How it works
MessageWindowChatMemory.withMaxMessages(10);
The MessageWindowChatMemory class holds a reference to the storage interface ChatMemoryStore.
ChatMemoryStore exposes List<ChatMessage> getMessages(Object var1); to return the stored chat messages.
ChatMemoryStore has two implementations:
1. SingleSlotChatMemoryStore: the plain implementation, backed by a single List (default)
2. InMemoryChatMemoryStore: the isolating implementation, backed by a map keyed by memory id
private final Map<Object, List<ChatMessage>> messagesByMemoryId = new ConcurrentHashMap();
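The isolation comes from keying the message lists by memory id. A minimal sketch of that structure, assuming plain strings in place of ChatMessage; PerIdStoreSketch is a hypothetical class whose three methods only mirror the get, update, and delete operations of a chat memory store.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical per-id store: one independent message list per memory id.
final class PerIdStoreSketch {
    private final Map<Object, List<String>> messagesByMemoryId = new ConcurrentHashMap<>();

    // Returns a copy of the history for this id, or an empty list if none exists.
    List<String> getMessages(Object memoryId) {
        List<String> stored = messagesByMemoryId.get(memoryId);
        return stored == null ? new ArrayList<>() : new ArrayList<>(stored);
    }

    // Replaces the whole history for this id.
    void updateMessages(Object memoryId, List<String> messages) {
        messagesByMemoryId.put(memoryId, new ArrayList<>(messages));
    }

    // Drops the history for this id only; other ids are untouched.
    void deleteMessages(Object memoryId) {
        messagesByMemoryId.remove(memoryId);
    }
}
```

Because each id maps to its own list, a question asked under id 2 never sees what was said under id 1, which matches the behaviour of the test with ids 1 and 2.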
Persistent storage
A non-relational database such as MongoDB, which stores documents as BSON key-value pairs, is the recommended choice.
To save time, these notes persist the conversation records in MySQL instead.
- First, the ai_chat table needs a memoryId column that the framework uses to identify a conversation
DROP TABLE IF EXISTS `ai_chat`;
CREATE TABLE `ai_chat` (
  `id` bigint(20) unsigned NOT NULL AUTO_INCREMENT COMMENT 'Primary key',
  `memoryId` varchar(100) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NULL COMMENT 'Conversation id, generated by the framework',
  `name` varchar(100) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NULL COMMENT 'Conversation name',
  `messages` mediumtext CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NULL COMMENT 'Message list (JSON array of objects)',
  `aiModel` varchar(50) CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci NOT NULL COMMENT 'AI model (e.g. gpt-4, ernie-bot)',
  `status` int(11) NOT NULL DEFAULT 0 COMMENT 'Status (0 = normal, 1 = disabled)',
  `userId` bigint(20) unsigned NOT NULL COMMENT 'Creator (user ID)',
  `createTime` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP COMMENT 'Creation time',
  `updateTime` datetime NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP COMMENT 'Update time',
  `isDelete` tinyint(3) unsigned NOT NULL DEFAULT 0 COMMENT 'Deleted flag (logical delete)',
  PRIMARY KEY (`id`) USING BTREE,
  INDEX `idx_userId`(`userId`) USING BTREE COMMENT 'User ID index'
) ENGINE=InnoDB
CHARACTER SET=utf8mb4 COLLATE=utf8mb4_unicode_ci COMMENT='AI chat table' ROW_FORMAT=DYNAMIC;
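- Generate the service, mapper, and entity with MyBatisX (see the earlier setup steps)
- Create the class store/MysqlChatMemoryStore, which implements the ChatMemoryStore interface
Override the interface's three methods: get messages, update messages, delete messages.
The class is registered in the context with @Component so it can call the service layer and be used by chatMemoryProvider.
Inside the class, database operations go through the service layer.
- Inject the store into the AiService configuration class SeparateChatAssistantConfig to persist the data
@Configuration
public class SeparateChatAssistantConfig {
    @Autowired
    private MysqlChatMemoryStore chatMemoryStore;
    @Bean
    ChatMemoryProvider chatMemoryProvider() {
        return memoryId -> MessageWindowChatMemory.builder()
                .id(memoryId)
                .maxMessages(10)
                .chatMemoryStore(chatMemoryStore) // plug in the store so messages are persisted
                .build();
    }
}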
- Test: the earlier SeparateChatAssistant test needs no changes; just run it again
/**
 * Chat assistant with isolated memory
 */
@Autowired
private SeparateChatAssistant separateChatAssistant;
@Test
public void ChatMemoryProvider() {
    String answer1 = separateChatAssistant.chat(1,"我是CxCS"); // "I am CxCS"
    System.out.println(answer1);
    String answer2 = separateChatAssistant.chat(1,"我是谁"); // "Who am I?"
    System.out.println(answer2);
    // Simulate a new conversation
    String answer3 = separateChatAssistant.chat(2,"我是谁");
    System.out.println(answer3);
}
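Result: the records in the MySQL table are updated.
Execution flow
Stepping through with the debugger shows the flow:
- First, the store's getMessages() is called with the memoryId to load the conversation history
If none exists, the method returns an empty LinkedList.
- The current input is appended to the returned list
- The store's updateMessages() is called to update or create the record (if no record exists for the memoryId, one must be created)
- getMessages() is called again to reload the history
If there is still no record and an empty list comes back, an error is raised.
- The reloaded history is sent to the AI model, and finally the model's reply is saved to the database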
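The call order observed in the debugger can be replayed with plain Java. This is a sketch under stated assumptions: TableStoreSketch is a hypothetical store in which a HashMap plays the role of the ai_chat table, messages are plain strings, the model is a Function, and the sequence mirrors the steps listed in these notes rather than the framework's source.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedList;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

// Hypothetical persistent store; the map stands in for the ai_chat table.
final class TableStoreSketch {
    final Map<String, List<String>> rows = new HashMap<>();
    final List<String> calls = new ArrayList<>(); // records the call order

    List<String> getMessages(String memoryId) {
        calls.add("get");
        List<String> stored = rows.get(memoryId);
        // Unknown id: return an empty list rather than null.
        return stored == null ? new LinkedList<>() : new LinkedList<>(stored);
    }

    void updateMessages(String memoryId, List<String> messages) {
        calls.add("update");
        // Upsert: creates the row when the id is new.
        rows.put(memoryId, new ArrayList<>(messages));
    }
}

final class ChatFlowSketch {
    private ChatFlowSketch() { }

    static String chat(TableStoreSketch store, String memoryId, String userText,
                       Function<List<String>, String> model) {
        List<String> history = store.getMessages(memoryId); // 1. load history
        history.add("user: " + userText);                   // 2. append the input
        store.updateMessages(memoryId, history);            // 3. persist (upsert)
        List<String> prompt = store.getMessages(memoryId);  // 4. reload
        if (prompt.isEmpty()) {
            // Happens when step 3 failed to create the row.
            throw new IllegalStateException("no messages stored for " + memoryId);
        }
        String answer = model.apply(prompt);                // 5. call the model
        prompt.add("ai: " + answer);
        store.updateMessages(memoryId, prompt);             // 6. persist the reply
        return answer;
    }
}
```

The sketch makes the upsert requirement visible: if updateMessages only updated existing rows, step 4 would return an empty list for a new id and the call would fail.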
Prompt Presets
Basic usage
- Add a method to SeparateChatAssistant
Put the @SystemMessage annotation on the method to inject the system prompt
/**
 * Chat with a preset system prompt
 *
 * @param memoryId conversation id
 * @param userMessage user message
 * @return the model's reply
 */
@SystemMessage("你是我的好朋友,请用广东话回答问题。") // system prompt: "You are my good friend; please answer in Cantonese."
String chatPrompt(@MemoryId int memoryId, @UserMessage String userMessage);
- Test
@SpringBootTest
public class PromptTest {
    @Autowired
    private SeparateChatAssistant separateChatAssistant;
    @Test
    public void testSystemMessage() {
        String answer = separateChatAssistant.chatPrompt(3,"今天几号"); // "What is the date today?"
        System.out.println(answer);
    }
}
- Insert the current date into the prompt with {{current_date}}
@SystemMessage("你是我的好朋友,请用广东话回答问题。今天是{{current_date}}")
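Configuring a persona from a template
Store the persona prompt template prepared in advance with AI, system-prompt.txt, in the resources directory.
/**
 * Set the persona through a prompt template
 *
 * @param memoryId conversation id
 * @param userMessage user message
 * @return the model's reply
 */
@SystemMessage(fromResource = "system-prompt.txt")
String chatWeather(@MemoryId int memoryId, @UserMessage String userMessage);
Test again
@Test
public void testWeatherSystemMessage() {
    // "What is the date today, and what is your current role?"
    String answer = separateChatAssistant.chatWeather(4,"今天几号, 你现在是什么身份");
    System.out.println(answer);
}
Test result: the persona is applied successfully.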
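User prompt
The @UserMessage annotation defines a user prompt template; its content is wrapped around the user's question on every call.
- Use @UserMessage to set the text sent with every question
- Use the @V annotation to bind a parameter to a placeholder, so userMessage is substituted into the @UserMessage template
@UserMessage("你是我的好朋友,请用粤语回答问题。{{msg}}")
String chatUser(@MemoryId int memoryId, @V("msg") String userMessage);
Test
/**
 * Preset user prompt applied to every question
 */
@Test
public void testV() {
    String answer1 = separateChatAssistant.chatUser(5, "我是CxCS"); // "I am CxCS"
    System.out.println(answer1);
    String answer2 = separateChatAssistant.chatUser(5, "我是谁"); // "Who am I?"
    System.out.println(answer2);
}
Result: success; every request carries the prompt.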
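Placeholder substitution of the {{name}} kind can be sketched in a few lines of plain Java. This illustrates the idea only and is not LangChain4j's template engine: TemplateSketch and its render method are hypothetical, and failing on a missing variable is a design choice of this sketch.

```java
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical renderer for {{name}} placeholders.
final class TemplateSketch {
    private static final Pattern PLACEHOLDER =
            Pattern.compile("\\{\\{\\s*(\\w+)\\s*\\}\\}");

    private TemplateSketch() { }

    // Replaces every {{name}} with variables.get(name); fails on a missing name.
    static String render(String template, Map<String, String> variables) {
        Matcher matcher = PLACEHOLDER.matcher(template);
        StringBuffer out = new StringBuffer();
        while (matcher.find()) {
            String name = matcher.group(1);
            String value = variables.get(name);
            if (value == null) {
                throw new IllegalArgumentException("missing variable: " + name);
            }
            // quoteReplacement keeps '$' and '\' in the value literal.
            matcher.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        matcher.appendTail(out);
        return out.toString();
    }
}
```

Rendering "Answer in Cantonese. {{msg}}" with msg bound to "Who am I?" yields "Answer in Cantonese. Who am I?", which is the text the model receives as the user message.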
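System and user presets combined
@SystemMessage(fromResource = "prompt-test.txt")
String chat3(
        @MemoryId int memoryId,
        @UserMessage String userMessage,
        @V("username") String username,
        @V("role") String role
);
Placeholders are set in the prompt-test.txt file:
你是我的好朋友,我是{{username}},我是一名{{role}},请用广东话回答问题,回答问题的时候适当添加表情符号。
今天是 {{current_date}}。
(In English: "You are my good friend. I am {{username}}, and I am a {{role}}. Please answer in Cantonese and add emoji where appropriate. Today is {{current_date}}.")
Test
@Test
public void testUserInfo() {
    // "Who am I, and what is my role?" with username "CxCS" and role "AI application engineer"
    String answer = separateChatAssistant.chat3(1, "我是谁,我是什么身份", "CxCS", "ai应用工程师");
    System.out.println(answer);
}
Test result
嘿CxCS呀,你系我嘅好朋友嚟嘅,仲系一名AI应用工程师添!😀 你真係好叻喎! 👍
(In English: "Hey CxCS, you are my good friend, and an AI application engineer too! 😀 You are really great! 👍")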
Agent Basics
Standard chat
- Create the agent service assistant/PortWeatherAgent (using the preset template)
@AiService(
        wiringMode = EXPLICIT,
        chatModel = "qwenChatModel",
        chatMemoryProvider = "agentChatMemoryProvider"
)
public interface PortWeatherAgent {
    @SystemMessage(fromResource = "system-prompt.txt")
    String chat(@MemoryId String memoryId, @UserMessage String userMessage);
}
- Matching configuration class
@Configuration
public class PortWeatherAgentConfig {
    @Autowired
    private MysqlChatMemoryStore mysqlChatMemoryStore;
    @Bean
    ChatMemoryProvider agentChatMemoryProvider() {
        return memoryId -> MessageWindowChatMemory.builder()
                .id(memoryId)
                .maxMessages(20)
                .chatMemoryStore(mysqlChatMemoryStore)
                .build();
    }
}
- Call it from the controller layer
@RestController
@RequestMapping("/agent")
public class AiAgentController {
    @Autowired
    private PortWeatherAgent portWeatherAgent;
    @Operation(summary = "智能体会话") // "Agent chat"
    @PostMapping("/chat")
    public String chat(@RequestBody AiChatEventRequest aiChatEventRequest) {
        return portWeatherAgent.chat(aiChatEventRequest.getMemoryId(), aiChatEventRequest.getMessage());
    }
}
(Screenshot of the API documentation page not included.)
Streaming output
Each provider offers dedicated streaming chat models.
Here the Qwen streaming chat model qwen-plus is used.
langchain4j:
  community:
    dashscope:
      # Streaming chat model
      streaming-chat-model:
        api-key: ${DASH_SCOPE_API_KEY}
        model-name: qwen-plus
- Create a streaming agent that calls the streaming model
- The chat method returns Flux<String> (a Project Reactor type that Spring can stream to the client)
@AiService(
        wiringMode = EXPLICIT,
        streamingChatModel = "qwenStreamingChatModel",
        chatMemoryProvider = "agentChatMemoryProvider"
)
public interface PortWeatherStreamAgent {
    @SystemMessage(fromResource = "system-prompt.txt")
    Flux<String> chat(@MemoryId String memoryId, @UserMessage String userMessage);
}
- Controller layer
- The response content type must be text/event-stream for the output to stream
    @Operation(summary = "智能体流式会话") // "Agent streaming chat"
    @PostMapping(value = "/chat/stream", produces = "text/event-stream;charset=utf-8")
    public Flux<String> streamChat(@RequestBody AiChatEventRequest aiChatEventRequest) {
        return portWeatherStreamAgent.chat(aiChatEventRequest.getMemoryId(), aiChatEventRequest.getMessage());
    }
}
Function Call
RAG
Download the China weather bulletin PDFs from: Weather Forecast_Weather Bulletin
Document parsing
- Parse with the default text parser
FileSystemDocumentLoader loads the document.
TextDocumentParser is LangChain4j's default document parser.
ClassLoader.getSystemClassLoader().getResource("document/1.json").toURI() resolves a file in the resources directory.
@SpringBootTest
public class RAGTest {
    @Test
    public void testReadDocument() throws URISyntaxException {
        // Read a document from the resources directory
        Path documentPath = Paths.get(ClassLoader.getSystemClassLoader().
                getResource("document/1.json").toURI());
        DocumentParser documentParser = new TextDocumentParser(); // document parser
        Document document = FileSystemDocumentLoader.loadDocument(documentPath, documentParser);
        System.out.println(document.text());
    }
}
Available document parsers:
- TextDocumentParser from the langchain4j module parses plain-text files (for example TXT, HTML, MD, JSON).
- ApachePdfBoxDocumentParser from the langchain4j-document-parser-apache-pdfbox module parses PDF files.
- ApachePoiDocumentParser from the langchain4j-document-parser-apache-poi module parses Microsoft Office formats (for example DOC, DOCX, PPT, PPTX, XLS, XLSX).
- ApacheTikaDocumentParser from the langchain4j-document-parser-apache-tika module detects and parses almost any existing file format automatically.
- Parsing a PDF document
Add the pdfbox dependency
<!-- LangChain4j PDF document parser -->
<dependency>
<groupId>dev.langchain4j</groupId>
<artifactId>langchain4j-document-parser-apache-pdfbox</artifactId>
<version>${langchain4j.version}</version>
</dependency>
@Test
public void testParsePDF() throws URISyntaxException {
    Path documentPath = Paths.get(ClassLoader.getSystemClassLoader().
            getResource("document/天气预报_强对流天气预报_强天气落区.pdf").toURI());
    Document document = FileSystemDocumentLoader.loadDocument(
            documentPath,
            new ApachePdfBoxDocumentParser()
    );
    System.out.println(document);
}
Vector conversion and storage
Split the parsed document into segments, convert each segment into a vector, and store the vectors in a vector database.
For now, an in-memory store holds the vectors for testing.
- Add the dependency (in-memory store)
<!-- Easy RAG -->
<dependency>
<groupId>dev.langchain4j</groupId>
<artifactId>langchain4j-easy-rag</artifactId>
<version>${langchain4j.version}</version>
</dependency>
- Test
/**
 * Split the document, then embed and store the segments
 */
@Test
public void testDocumentSplitter() throws URISyntaxException {
    Path documentPath = Paths.get(ClassLoader.getSystemClassLoader().
            getResource("document/天气预报_强对流天气预报_强天气落区.pdf").toURI());
    // The source is a PDF, so use the PDFBox parser (TextDocumentParser only handles plain text)
    DocumentParser documentParser = new ApachePdfBoxDocumentParser();
    Document document = FileSystemDocumentLoader.loadDocument(documentPath, documentParser);
    InMemoryEmbeddingStore<TextSegment> embeddingStore = new InMemoryEmbeddingStore<>(); // in-memory store
    // Custom document splitter
    // Split by paragraph: each segment holds at most 300 tokens, with a 30-token overlap to keep context
    // Note: when the paragraphs together are shorter than the maximum, no overlap is needed.
    DocumentByParagraphSplitter documentSplitter = new DocumentByParagraphSplitter(
            300,
            30,
            // Tokenizer: sizes are counted in tokens
            new HuggingFaceTokenizer());
    // Alternative: count sizes in characters
    // DocumentByParagraphSplitter documentSplitter = new DocumentByParagraphSplitter(300, 30);
    EmbeddingStoreIngestor
            .builder()
            .embeddingStore(embeddingStore)
            .documentSplitter(documentSplitter)
            .build()
            .ingest(document);
}
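The effect of a maximum size plus an overlap can be seen in a simplified splitter. This sketch is an assumption-laden stand-in: OverlapSplitterSketch is hypothetical, it cuts fixed-size windows by character count, and it ignores paragraph boundaries and tokenizers, both of which DocumentByParagraphSplitter takes into account.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical fixed-size splitter: windows of maxChars characters,
// each sharing overlapChars characters with the previous window.
final class OverlapSplitterSketch {
    private OverlapSplitterSketch() { }

    static List<String> split(String text, int maxChars, int overlapChars) {
        if (maxChars <= 0 || overlapChars < 0 || overlapChars >= maxChars) {
            throw new IllegalArgumentException("require 0 <= overlap < max");
        }
        List<String> segments = new ArrayList<>();
        int step = maxChars - overlapChars; // how far each window advances
        for (int start = 0; start < text.length(); start += step) {
            int end = Math.min(start + maxChars, text.length());
            segments.add(text.substring(start, end));
            if (end == text.length()) {
                break; // the last window reached the end of the text
            }
        }
        return segments;
    }
}
```

Splitting "abcdefghij" with a maximum of 4 and an overlap of 1 gives "abcd", "defg", "ghij"; a text shorter than the maximum comes back as a single segment with no overlap at all.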
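Vector database
Text embedding model
Use Alibaba Cloud's text embedding model
Reference: General text embedding synchronous API details, Model Studio (Bailian), Alibaba Cloud Help Center
- Add to the yml configuration
langchain4j:
  community:
    dashscope:
      # Text embedding model
      embedding-model:
        api-key: ${DASH_SCOPE_API_KEY}
        model-name: text-embedding-v3
- Test class
@SpringBootTest
public class EmbeddingTest {
    @Autowired
    private EmbeddingModel embeddingModel;
    @Test
    public void testEmbeddingModel(){
        Response<Embedding> embed = embeddingModel.embed("你好,我是CxCS"); // "Hello, I am CxCS"
        System.out.println("向量维度:" + embed.content().vector().length); // label: "vector dimension"
        System.out.println("向量输出:" + embed.toString()); // label: "vector output"
    }
}
- Running the test prints the vector data (向量维度 = vector dimension, 向量输出 = vector output):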
向量维度:1024
向量输出:Response { content = Embedding { vector = [-0.081431076, 0.008456076, -0.13545518, -0.040912993, -0.043006033, -0.08285276, -0.004504477, 0.026459174, 6.620964E-4, 0.03907665, 0.04960108, 0.014641401, 0.012094213, -3.231808E-5, 0.034949806, -0.008999081, 0.02766366, -0.052483946, -0.03789191, -0.04628381, -0.059789836, -0.017445285, -0.01330857, 0.025807569, 0.053629193, 0.06377846, -0.045849405, -0.033409644, 0.057420358, -0.036766406, -0.0024336518, 0.024919014, 0.022312587, -0.05627511, 0.0069356607, -0.0050943783, -0.026064262, -0.023833003, -0.089408316, 0.020555224, 0.03666768, 0.016497493, 0.03220516, -0.06105356, 0.0059533142, -0.07744245, 0.005770667, -0.048258375, 0.022194114, 0.02193742, -0.007562585, 0.03305422, 0.017652614, -0.021108102, 0.028631197, -0.008421521, -0.0034974487, -0.03378481, 0.01596436, -0.0015487997, -0.03570014, -1.6398147E-4, 0.018146254, -0.028078318, -0.013456662, 0.07894312, 0.011709172, 0.04142638, -0.03333066, -0.027466204, -0.008549868, -0.011235276, 0.009073127, -0.019824635, -0.061803892, -0.038583007, 0.019034809, -0.026399938, -0.012439761, 0.0068171867, 0.08609105, 0.015085679, 0.00929033, 0.0037072464, -0.019360613, 0.040439095, -0.007360192, 0.06298863, -0.04715262, 7.324403E-4, 0.012558235, -0.027150273, -0.01966667, -0.014799367, -0.047626514, -0.036272764, -0.03052678, 0.036588695, 0.020219548, 0.03573963, 0.016448129, 0.03696386, 0.012439761, -0.08348462, -0.013338189, 0.0055337194, 0.009788907, -0.013960176, -0.033824302, -0.02679485, 0.07503348, 0.050706837, -0.02018993, -7.929114E-4, 0.041228924, -0.013575137, 0.0056127016, -0.014463691, -0.006185326, 0.030033138, 0.06417337, 0.025511384, 0.014828986, -0.0033888477, 0.058842044, -0.026597394, -0.046165336, 0.06638488, 0.0051338696, 0.027584678, 0.040991977, 0.0772055, 0.005889141, 0.010633034, -0.03098093, -0.005963187, 0.0035863041, 0.010623162, 0.044506703, -0.021463525, 0.030684745, 0.031672027, -0.010998328, -0.022233605, -0.04387484, -0.06681929, 
0.06796454, 0.00506476, 0.0030358941, 0.025945788, 0.0036356684, -0.009171856, -0.0021226576, 0.020752681, 0.030033138, -0.018817607, -0.01802778, 0.006293927, -0.00829811, 0.011521588, 0.037358776, -3.0559482E-4, -0.044348735, -0.041544855, 0.049482606, 0.026005024, 0.0016277822, 0.011709172, 0.0057756035, -0.020417005, 0.02193742, -0.022549536, 0.01480924, -0.03342939, 0.020673698, 0.031968214, 0.09706963, -0.0020806983, -0.02811781, -0.075862795, -0.027564932, -0.0011878244, -0.02175971, -0.03147457, -0.0254324, -0.014197124, 0.003285183, 1.8696664E-4, -0.0042403787, 0.056354094, -0.01383183, -0.020338023, 0.016615966, -0.04786346, 0.058447134, 0.016388891, 0.003453021, -0.037003353, -0.024603084, -0.005662066, -0.050390907, 0.012528617, 0.012864293, -0.01764274, -0.048258375, -0.021463525, -0.01673444, -0.013456662, -0.0047118063, -0.05256293, -0.030763727, 0.010405959, 0.03125737, -0.025274435, -0.022194114, 0.029421022, 0.024583338, -0.0017067649, -0.02207564, -0.0043317024, -0.018422695, 0.032777783, -0.022687756, 0.05745985, 0.021779455, 0.005671939, 0.020555224, 0.026005024, -0.01149197, -0.0034826396, -0.004158928, 0.006871487, 0.023062922, 0.03951105, 0.014473564, 0.022608772, -0.027327983, -0.038898937, -0.019163156, 0.029440768, -0.00229173, -0.0062198807, 0.008456076, 6.7752274E-4, -0.008826307, 0.042729594, 0.01652711, 0.023694783, 0.005341199, 0.013190096, 0.024089696, -0.009872827, 0.0024570997, -0.0012686582, 0.037674706, 0.00714299, -0.005671939, -0.0014944992, -0.0037664834, 0.005627511, -0.03319244, -0.016388891, 0.025610112, 0.037951145, 0.029559242, 0.0015599065, 0.005148679, 0.011462351, -0.032580327, 0.022904957, -0.009349567, -0.034495655, 0.009951809, 0.0070788166, -0.04292705, 0.002068357, 0.008762133, -0.03415998, 0.052799877, 0.00551891, 0.044822633, -0.014039159, 0.011156294, -0.011886883, 0.014562419, 0.003058108, 0.01512517, -0.025926042, 0.020772427, -0.027999336, -0.036035817, 0.0021843628, 0.046994653, 0.0047266157, -0.018472059, 
-0.009255774, 0.040912993, -0.19319147, -0.006782632, 0.046125844, 0.011373496, -0.024267407, -0.03915563, -0.057104427, 0.021996656, -0.040182404, 0.059631873, 0.01753414, -0.021345051, -0.053984616, 0.025866805, -0.04869278, -0.01058367, -8.5585064E-4, -0.006175453, 0.027446458, -0.049127184, -0.038188092, -0.04055757, 0.04292705, -0.020653954, -0.00344068, 0.028157301, 0.045138564, -0.02484003, -0.051852085, -0.021779455, -0.005696621, -0.0053362628, -9.268116E-4, 0.0016290164, 0.019567942, 0.010712016, -0.005400436, -0.054537494, 0.010475069, -0.0051338696, 0.018916335, 0.046915673, -0.006856678, -0.0036233272, 0.025945788, -0.027505694, 0.0077353595, 0.016497493, 0.00506476, 0.014374835, 0.007044262, 5.757092E-4, -0.009813589, -0.031593043, -0.016151944, -0.0068961694, -0.026281465, 0.035719886, -0.018906463, -0.0066592214, -0.005726239, -0.04577042, 0.042413663, 0.015401609, -0.02207564, 0.009038572, -0.0014266234, -0.056196127, 0.025353419, -0.022115132, 0.026834343, -0.024543846, -0.00876707, -0.02961848, -0.053036824, -6.3741434E-4, -0.029539498, -0.062830664, -0.046204828, -0.11736816, -0.008939845, 0.0076514403, -0.01945934, -0.030921692, -0.03978749, -0.027703151, 0.0014599442, 0.042255696, 0.015599065, 0.23726377, 0.04474365, -0.0056571295, -0.056393586, 0.06966266, -0.023990968, 0.026676377, -0.018965699, 0.03052678, -0.019054554, -0.0027520503, -0.032185413, -0.03252109, -5.254195E-4, -0.014789494, 0.04411179, -0.031237623, -0.024405627, 0.05106226, -0.039353088, -0.028236283, -0.0048302803, 0.017158972, -0.0047710435, -0.041821294, -0.01701088, 0.01718859, 0.04806092, -0.06208033, 0.070886895, -0.0034974487, 0.03392303, -0.009858017, 0.02207564, 0.055564266, -0.029361786, 0.030763727, -0.032718547, -0.024247661, 0.03992571, -0.049798537, -0.02493876, 0.025807569, 0.019795017, 0.0055584013, -0.0252152, -0.002640981, 0.00901389, 0.03260007, -0.0016302505, 0.027900608, -0.019024936, -0.018531295, 0.032817274, 0.005509037, -0.025787823, 0.02693307, 
-0.05382665, -0.035305228, 0.02493876, 0.04292705, 0.032856766, -0.029519752, 0.01652711, -0.040952485, 0.028947128, -0.0058496497, -0.04742906, 0.001959756, -0.007774851, -0.017613122, -0.01837333, -0.015687922, 0.0032185414, 0.020693444, 0.026084008, 0.002162149, 0.029223567, 0.013575137, 0.0640549, 0.013555391, -0.00609647, -0.037260048, -0.009369312, 0.020792173, -0.05595918, -0.001195229, 0.0058842045, -0.023694783, -0.038227584, -0.0064617647, -0.020417005, -0.014621656, 0.025471892, 0.03171152, -0.036430728, 0.02588655, 0.021404287, -0.027051544, 0.034179725, -2.5125571E-5, -0.025866805, 0.02979619, -6.7937386E-4, -0.006466701, -0.006022424, 0.019133538, 0.016843041, -0.009828399, -0.017070116, -0.026459174, -0.009418677, 0.009517404, 0.06531862, -0.03650971, -0.017701978, 0.006269245, 0.0041860784, 0.080957174, 0.01778096, 0.018610278, 0.02671587, -0.045257036, 0.028413994, -0.02934204, -0.03737852, -0.034456164, -0.028808907, -0.029144583, 0.028512722, -0.023220887, -0.011324132, -0.010465196, 0.01871888, 0.044546194, -0.0034949805, 0.011511716, 0.03378481, -0.010850236, -0.004025645, -0.0037343965, -0.020357769, -0.004213229, 0.03846453, -0.022589026, 0.03933334, -0.008431394, 0.027446458, 0.009655625, 0.057301886, 0.015115297, 0.028769417, -0.011314259, -0.009818526, -0.013407298, -0.029243313, -0.013821957, -0.027762389, 0.0089497175, -0.041821294, -0.039590035, -0.010386214, 0.032837022, 0.12281796, -0.008199383, -0.023141906, -0.0031395587, 0.05283937, 0.020930393, -0.005015396, 0.032422364, -0.008268492, -0.059710853, 1.7771087E-4, -0.03978749, -0.03273829, -0.026301209, -0.013654119, 0.013979922, 0.01711948, 0.031494316, -0.009971554, 0.029421022, 0.014197124, 0.037358776, 0.011156294, 0.022115132, -0.03111915, 0.032718547, -0.01921252, -0.052286487, 0.10307231, -0.0016154412, -0.024899269, 0.035640903, 0.0116499355, -0.020535478, -0.0047512976, -0.0027989463, -0.014957332, 0.002328753, 0.01718859, -0.021581998, 0.0026014897, -0.02288521, 
-0.034002014, 0.008974399, -0.0026434492, -0.041821294, -0.021325305, -0.05264191, -1.7030626E-4, -0.014049032, 0.0015327563, 0.02393173, -0.005247407, 0.0032925876, -0.024583338, -0.056788497, -0.050667346, 0.002712559, 0.055761725, 0.03489057, -0.028808907, -0.0153423725, -0.033725575, 0.0055732103, -0.008925035, -0.014384708, 0.02671587, 0.004013304, -0.036766406, -0.037595723, -0.0141181415, 0.03743776, 0.030210849, 0.021404287, 0.051536154, 0.021206832, 0.013900939, 0.03566065, -0.0017894498, -0.013476408, 0.022411317, 0.022944449, 0.05714392, -0.033745322, 0.0021584467, -0.0018918804, 0.01564843, -0.017474903, 0.026241973, 0.02221386, -0.034811586, -0.019725908, 0.015411482, -0.055445794, 7.546542E-4, -0.02416868, 0.019163156, -0.009398931, 0.014009541, -0.040676046, 0.027545186, -0.03356761, -0.026103754, 0.010247994, -0.0026014897, -0.037003353, -0.06227779, -0.021285813, 0.0049882457, -0.04383535, 0.01893608, 0.03803013, 0.011136549, -0.017307065, 0.028690433, 0.019123664, -0.0035566858, -0.013318443, 0.0011495672, -0.0263407, 0.0046081417, -0.0042724656, -0.025945788, 0.02043675, 2.2090449E-4, -0.022312587, -0.013071622, 0.009063255, -0.02539291, -0.043953825, 0.0039812173, -0.03789191, -0.011640063, 0.023773765, 0.023003686, 0.021246323, 0.043716874, 0.005054887, 0.038227584, -0.04055757, -0.022766737, 0.0070541343, -0.03273829, 0.018067272, -0.024879523, 5.886673E-4, -0.01051456, 0.012874166, -0.0077551054, -0.0050943783, -0.0027545185, -0.004171269, 0.0016388892, -0.00308279, 0.01662584, -0.009714861, 0.037319284, -0.014621656, -0.037674706, 0.023023432, 9.601324E-4, -0.0029593797, 0.002400331, -0.010662653, -0.001229784, 0.027861116, -0.0046106097, 0.026439428, 0.011699299, -0.016546857, -0.032501344, -0.015816268, -0.013446789, 0.023398599, -0.026202481, 0.01096871, 0.013930558, 0.02738722, 0.050588362, 0.034791842, -0.04754753, 0.043163996, -0.06887284, -0.005844713, 0.0047710435, 1.6799232E-4, 0.034495655, -0.0019844382, 0.020061584, -9.0459775E-4, 
-0.036233272, 0.0020017156, 0.024306899, -0.0052720895, 0.007922943, -0.01914341, 0.0051437425, -0.008920099, 0.04415128, -0.005844713, -0.025353419, 0.029756699, -0.030862456, -0.0020387387, 0.02484003, 0.060421698, -0.015174534, 0.0019992474, 0.05434004, -0.010524433, 0.015559575, -0.027742643, 0.024227915, -0.047310583, -0.034396928, 0.023299871, -0.0065160654, 0.016882533, -0.034179725, -0.007552712, -0.042413663, 0.03406125, -0.006738204, -0.025550874, -0.0044822632, 0.025906296, 0.03546319, -0.017435411, -0.01526339, 0.0030383624, 0.022036148, -0.1612035, 0.02598528, 0.0034135298, 0.0017277446, -0.018748498, 0.0012859356, 0.048100412, -0.035542175, 0.0054498003, -0.0029717206, -0.021147594, 0.02438588, 0.01659622, -0.022825975, 0.03615429, -0.019330993, 0.020634208, -0.025827315, 0.026952816, 0.0053510717, -0.015836013, 0.024464864, 0.030013392, -0.0046056737, -0.019567942, -0.014128014, 0.010001173, 0.046639234, -0.046086352, -0.023675038, 0.03161279, -0.023655292, 0.0324816, 0.03797089, 0.0029421023, 0.0042107604, -0.029776445, -0.026005024, -0.009823462, -0.034910314, 0.011018074, 0.009137301, 0.011018074, -0.039807238, 0.012242305, 0.021957166, 0.0054399273, -0.0038084427, -0.013180223, -0.038583007, 0.036825642, -0.0199826, 0.0082141915, 0.020318277, 0.023339361, 0.016388891, -0.024642576, 4.9425836E-4, 0.0071726083, 0.02566935, -0.019479087, 0.03951105, -0.035186753, -0.031277113, -0.03560141, 0.037990637, 0.0018104295, 0.027703151, -0.0044797948, 0.017455157, -0.0016672736, -0.0399652, 0.03321219, -0.039688762, 7.349085E-4, -0.018393075, 0.040083677, -0.02389224, -0.026103754, -0.03220516, 0.03402176, -8.1913604E-5, -0.0057213027, 0.012173195, 0.023339361, -0.0058002854, 0.026873834, -0.06883334, -0.021048866, -0.039096393, -0.03489057, -0.018926209, -0.019232266, 0.004627887, 0.0024188424, 0.0014364963, -0.021404287, 0.032086685, 0.0154411, -0.015826141, -0.016230926, -0.015569448, -0.024820285, 0.01278531, -0.023675038, -0.022292841, -0.0033616973, 
0.006422274, -0.0040108357, 0.020041838, -0.010405959, -0.0018931144, 0.013180223, 0.008999081, -0.032639563, -0.02116734, 0.0138614485, -1.2711264E-4, 0.015135043, -0.0025940852, -0.025550874, -0.0040799454, -0.017247828, -0.036766406, 0.031494316, -0.024879523, 0.06831996, 0.025847059, -0.015026442, -0.032639563, 9.786439E-4, -0.057854764, 0.050627854, 0.0054794187, -9.897508E-4, 6.5716E-4, 0.054103088, 0.027545186, -0.03591734, 5.686131E-4, 0.0026903453, 0.038306568, -0.0064716376, -0.03273829, 0.03275804, -0.012735946, -0.012814929, 0.015569448, -0.017790833, -0.024366135, -0.0017030626, 0.016852915, 0.031277113, 0.019301375, -0.03279753, -0.015460846, 0.020792173, 0.03562116, 0.009586515, 0.0068616145, 0.02221386, 0.005563338, -0.016517239, -0.028729925, -0.01226205, -0.00616558, 0.004109564, 0.03074398, 0.008618978, 0.010326977, 0.019271757, 0.05145717, -0.01655673, 0.05137819, 0.031375844, -0.046994653, -0.01103782, 0.020022092, 0.07282197, 0.009774098, 0.0054053725, -0.008441267, 0.0126076, -0.007577394, -0.029322295, -0.04924566, -0.011304386, 0.008046353, -0.0076909317, -7.5897353E-4, 0.017158972, 0.011965865, 0.0051634884, -0.019261884, 0.04561246, 0.017859943, -0.013318443, -0.0025237412, -0.0064420193, 0.014374835, -0.015036315, -0.04478314, 0.046086352, 0.010030792, -0.03305422, 0.011097057, -0.018452313, 0.005074633, 0.015806396, 0.047666006, 0.025294181, -0.03629251, -0.002353435, 0.011136549, 0.014907968, 0.020298531, -0.012183067, 0.09004018, 0.002376883, -0.0067332676, 0.021739963, -0.0094384225, -0.03455489, -0.005356008, 3.0513204E-4, -0.010642907, -0.056077655, 0.024010714, 0.005617638, -0.017721724, -0.020081328, -0.0053214533, -0.020535478, -0.009843208, 0.030230595, -0.026399938, 0.024919014, 0.040636554, 0.009971554, -0.0026064261, 0.044625174, 0.006703649, 0.0010304763, -0.004598269] }, tokenUsage = TokenUsage { inputTokenCount = 6, outputTokenCount = null, totalTokenCount = 6 }, finishReason = null, metadata = {} }
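Vectors like the one printed by the test are useful because similar texts produce vectors that point in similar directions. Cosine similarity is one common way to measure this; whether a given store uses cosine or another metric depends on how its index is configured, so treat the choice here as an assumption. CosineSketch is a hypothetical helper.

```java
// Hypothetical helper: cosine similarity between two embedding vectors.
final class CosineSketch {
    private CosineSketch() { }

    // Returns a value in [-1, 1]; 1 means the vectors point the same way.
    static double cosine(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same dimension");
        }
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        if (normA == 0.0 || normB == 0.0) {
            throw new IllegalArgumentException("zero vector has no direction");
        }
        return dot / (Math.sqrt(normA) * Math.sqrt(normB));
    }
}
```

A retrieval step embeds the user's question with the same model, compares it against the stored segment vectors, and returns the segments with the highest scores.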
Pinecone vector store
- Add the dependency
<!-- Pinecone vector database -->
<dependency>
<groupId>dev.langchain4j</groupId>
<artifactId>langchain4j-pinecone</artifactId>
<version>${langchain4j.version}</version>
</dependency>
- Configure the vector store in config/PineconeEmbeddingStoreConfig
@Configuration
public class PineconeEmbeddingStoreConfig {
    @Autowired
    private EmbeddingModel embeddingModel;
    @Bean
    public EmbeddingStore<TextSegment> embeddingStore() {
        // Create the vector store
        EmbeddingStore<TextSegment> embeddingStore = PineconeEmbeddingStore.builder()
                .apiKey(System.getenv("PINECONE_API_KEY"))
                .index("port-ai-index") // index name
                .nameSpace("port-ai-namespace") // namespace
                .createIndex(PineconeServerlessIndexConfig.builder()
                        .cloud("AWS") // host the index on AWS
                        .region("us-east-1") // AWS region of the index
                        .dimension(embeddingModel.dimension()) // vector dimension of the index
                        .build())
                .build();
        return embeddingStore;
    }
}
- Agent configuration
@AiService(
        wiringMode = EXPLICIT,
        chatModel = "qwenChatModel",
        chatMemoryProvider = "agentChatMemoryProvider",
        tools = "weatherTools"
)
public interface WeatherAnalysisAgent {
    // @SystemMessage(fromResource = "analysis-prompt.txt")
    String analysisChat(@MemoryId String memoryId, @UserMessage String userMessage);
}
- Streaming chat may not be supported with this embedding model (unverified)
- Controller test
@Operation(summary = "气象数据解析会话") // "Weather data analysis chat"
@PostMapping("/chat/analysis")
public String weatherAnalysis(@RequestBody AiChatEventRequest aiChatEventRequest) {
    return weatherAnalysisAgent.analysisChat(aiChatEventRequest.getMemoryId(), aiChatEventRequest.getMessage());
}
- Run: the backend successfully returns the data and image links from the documents