[AI Application Development] Building a Streaming Chat System on the DeepSeek API: Real-Time Responses with Spring Boot + React

果冻kk

Last modified: 2025-05-14 10:56:11

First published: 2025-05-13 19:28:45

Article link: https://blog.csdn.net/Pte_moon/article/details/147929749

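🔥 This article explains how to add streaming output to the DeepSeek AI chat system, building on the V1 release. It walks through the streaming data handling in both the Spring Boot backend and the React frontend, which together deliver a near-real-time AI chat experience.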

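If you have not read the V1 article yet, start there:
[Hands-On Tutorial] Building a DeepSeek LLM Chat System from Scratch - A Complete Spring Boot + React Development Guide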

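DeepSeek AI Streaming Chat System

Project Overview

In the era of large AI models, a responsive AI chat assistant matters. Last week we shared V1 of the DeepSeek AI chat system, which implemented basic question answering. User feedback, however, showed that people expect a smoother conversation, one that feels as immediate as talking to a real person.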

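That is why we built V2. Its highlights are:

✨ Streaming output: the answer is displayed as it is generated, so users do not have to wait for the complete reply
🎯 Improved Markdown handling: formatting problems are repaired so that content renders cleanly
🖼️ Refined user interface: a wider display area and a restyled scrollbar, suited to long technical answers
🔄 Smart segmentation: content is split at semantic boundaries and only where the formatting is complete

With these changes, a V2 conversation feels closer to talking with a person than to a mechanical question-and-answer exchange.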

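Streaming Output in Depth

Core Principle of Streaming Output

In the traditional request-response model, the client sends a request and then waits until the server has finished processing and returns the complete response. When a large model generates long text, this causes a noticeable wait that hurts the user experience.

Streaming takes a different approach:

The client sends a request to the server
The server opens a connection to the AI model and starts receiving generated content
Key point: the server does not wait for the full answer; it pushes content to the client as it arrives
The client renders each fragment as soon as it is received, producing a "typing" effect

Because the user sees the first part of the answer almost immediately, the perceived wait is much shorter.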
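
To make the flow above concrete, here is a minimal Java sketch of the relay step. It assumes the upstream answer arrives as "data:" lines and ends with a "data: [DONE]" marker, which is the shape used by OpenAI-compatible streaming APIs such as DeepSeek's. The class name StreamRelay and the relay method are illustrative names rather than code from this project, and the payloads are treated as opaque strings, whereas the real API sends a JSON chunk on each line.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.function.Consumer;

// Hypothetical helper: forwards each upstream fragment as soon as it is read.
class StreamRelay {
    private static final String DATA_PREFIX = "data:";
    private static final String DONE_MARKER = "[DONE]";

    /**
     * Reads the upstream event stream line by line and hands every payload
     * to the downstream consumer immediately. Returns how many payloads
     * were forwarded before the end marker (or end of input) was reached.
     */
    static int relay(BufferedReader upstream, Consumer<String> downstream) {
        int forwarded = 0;
        try {
            String line;
            while ((line = upstream.readLine()) != null) {
                if (!line.startsWith(DATA_PREFIX)) {
                    continue; // blank separator lines and other fields are skipped
                }
                String payload = line.substring(DATA_PREFIX.length());
                if (payload.startsWith(" ")) {
                    payload = payload.substring(1); // drop the single optional space after the colon
                }
                if (DONE_MARKER.equals(payload)) {
                    break; // upstream signals that the answer is complete
                }
                downstream.accept(payload); // push now; do not wait for the rest
                forwarded++;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return forwarded;
    }
}
```

In a real client the reader would wrap the HTTP response body, and the consumer would parse the JSON chunk and push its content to the browser.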

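Backend Streaming Implementation

1. Extending the data model to support streaming responses

First, we added a Delta class to DeepSeeekResponse.java, dedicated to streaming output: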

@Data
@Builder
public static class Delta {
    /**
     * Incremental content
     */
    private String content;
    
    /**
     * Role
     */
    private String role;
}

This class models the incremental content returned by the DeepSeek API, so we can receive and process the response piece by piece.
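
As a rough illustration of how such deltas combine into one message, the sketch below merges them in arrival order. It assumes the pattern commonly seen in OpenAI-compatible streams, where the role appears in an early chunk and later chunks carry only content. DeltaAccumulator and its nested Delta are hypothetical stand-ins written without Lombok so the example is self-contained; they are not the project's classes.

```java
// Hypothetical sketch: merges streamed deltas into one complete message.
class DeltaAccumulator {
    /** Minimal stand-in for the article's Delta class (plain Java, no Lombok). */
    static final class Delta {
        final String role;
        final String content;

        Delta(String role, String content) {
            this.role = role;
            this.content = content;
        }
    }

    private String role;
    private final StringBuilder content = new StringBuilder();

    /** Merges one delta; a null field means "nothing new for this field in this chunk". */
    void accept(Delta delta) {
        if (delta.role != null) {
            role = delta.role;
        }
        if (delta.content != null) {
            content.append(delta.content);
        }
    }

    String role() {
        return role;
    }

    String content() {
        return content.toString();
    }
}
```

Feeding it the deltas ("assistant", ""), (null, "Hel"), (null, "lo") leaves the role as "assistant" and the content as "Hello".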

2. Core streaming data processing

In DeepSeekClient02.java we implemented several key optimizations:

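// Streaming output parameters
private static final int BUFFER_SIZE = 16384;  // 16 KB
private static final int MAX_CONTENT_BUFFER = 800;  // flush once the buffer exceeds 800 characters
private static final long MAX_EMIT_INTERVAL = 300;  // maximum interval between sends, in milliseconds

// Read the response incrementally through a BufferedSource instead of loading the whole body at once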
BufferedSource source = response.body().source();
StringBuilder lineBuilder = new StringBuilder();
StringBuilder contentBuffer = new StringBuilder();

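The core flow includes:

Efficient buffer management: a 16 KB buffer reduces the number of system calls
Smart segmentation: regular expressions detect sentence boundaries and check that Markdown formatting is complete
Timed flushing: even when no natural break point is found, content is sent at least every 300 ms
Content accumulation: received fragments are accumulated so that each chunk sent is a meaningful unit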
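
The flush policy described above could look roughly like the following sketch. It reuses the 800-character and 300 ms thresholds from the constants shown earlier. Everything else is an assumption made for illustration: the ChunkBuffer class, the SENTENCE_END pattern, and the choice to pass the current time in as a parameter (which keeps the logic easy to test). The Markdown completeness check is left out here to keep the example short.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of "flush on sentence boundary, size limit, or timeout".
class ChunkBuffer {
    private static final int MAX_CONTENT_BUFFER = 800;  // characters
    private static final long MAX_EMIT_INTERVAL = 300;  // milliseconds
    // Assumed boundary rule: sentence-ending punctuation (Latin or CJK) or a newline.
    private static final Pattern SENTENCE_END = Pattern.compile("[.!?。！？\\n]");

    private final StringBuilder buffer = new StringBuilder();
    private long lastEmitMillis;

    ChunkBuffer(long nowMillis) {
        this.lastEmitMillis = nowMillis;
    }

    /**
     * Appends a fragment and returns the text that should be sent now,
     * or an empty Optional if the buffer should keep accumulating.
     */
    Optional<String> append(String fragment, long nowMillis) {
        buffer.append(fragment);
        boolean tooLarge = buffer.length() > MAX_CONTENT_BUFFER;
        boolean tooOld = nowMillis - lastEmitMillis >= MAX_EMIT_INTERVAL;

        // Prefer cutting right after the last sentence boundary.
        int cut = lastBoundaryEnd();
        if (cut < 0 && (tooLarge || tooOld)) {
            cut = buffer.length(); // no boundary found, but a limit was hit: send everything
        }
        if (cut <= 0) {
            return Optional.empty();
        }
        String out = buffer.substring(0, cut);
        buffer.delete(0, cut); // the unfinished tail stays buffered
        lastEmitMillis = nowMillis;
        return Optional.of(out);
    }

    /** Index just past the last boundary character in the buffer, or -1 if there is none. */
    private int lastBoundaryEnd() {
        Matcher matcher = SENTENCE_END.matcher(buffer);
        int end = -1;
        while (matcher.find()) {
            end = matcher.end();
        }
        return end;
    }
}
```

A caller would pass System.currentTimeMillis() as the second argument and forward every non-empty result to the client.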

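3. Preserving formatting integrity

One challenge of streaming is that Markdown formatting can be broken by segmentation. For example, **bold text** may be split into **bold and text** and sent as two separate chunks, which leaves the formatting broken.

To handle this, we implemented a detection-and-repair mechanism: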

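/**
 * Checks whether the text contains incomplete Markdown markers.
 */
private boolean hasIncompleteMarkdown(String text) {
    // Incomplete heading (INCOMPLETE_HEADING is a Pattern constant defined elsewhere in the class, not shown)
    if (INCOMPLETE_HEADING.matcher(text).find()) {
        return true;
    }
    
    // Count "**" markers, advancing by 2 after each match so that
    // "****" counts as two markers rather than three overlapping ones
    int boldCount = 0;
    int index = 0;
    while ((index = text.indexOf("**", index)) >= 0) {
        boldCount++;
        index += 2;
    }
    
    // An odd count means a bold span was opened but not closed
    if (boldCount % 2 != 0) {
        return true;
    }
    
    // More format checks...
    
    return false;
}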
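
The article does not show the remaining format checks, so the following is only a guess at what similar checks might look like. It applies the same parity idea to code fences and inline code spans. MarkdownChecks and its methods are hypothetical names, and these are heuristics: a backtick inside a code block, for example, can throw off the inline-code count.

```java
// Hypothetical parity-based checks in the spirit of hasIncompleteMarkdown.
class MarkdownChecks {
    // Three backticks, built at runtime so this listing contains no literal fence.
    private static final String FENCE = "`".repeat(3);

    /** Counts non-overlapping occurrences of marker in text. */
    static int count(String text, String marker) {
        int count = 0;
        int index = 0;
        while ((index = text.indexOf(marker, index)) >= 0) {
            count++;
            index += marker.length();
        }
        return count;
    }

    /** True if a code fence has been opened but not yet closed. */
    static boolean hasOpenCodeFence(String text) {
        return count(text, FENCE) % 2 != 0;
    }

    /** True if a single-backtick inline code span looks unclosed (fence markers are ignored). */
    static boolean hasOpenInlineCode(String text) {
        String withoutFences = text.replace(FENCE, "");
        return count(withoutFences, "`") % 2 != 0;
    }
}
```

A segmenter could hold back a chunk while any of these checks returns true, and release it once the closing marker arrives or a timeout is reached.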

4. Server-Sent Events (SSE) endpoint

In the controller layer we added a dedicated streaming endpoint:

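@RequestMapping(value = "/ask/stream", method = RequestMethod.POST)
public SseEmitter streamAsk(@RequestBody AskParam askParam) {
    // Create the SseEmitter with a 5-minute timeout
    SseEmitter emitter = new SseEmitter(TimeUnit.MINUTES.toMillis(5));
    
    try {
        // Send an initial "connected" event
        emitter.send(SseEmitter.event()
                .name("connected")
                .data("Connected"));
        
        // Delegate to the streaming logic in DeepSeekClient02.
        // Spring holds back events sent before this method returns, so getResponse
        // needs to do its work asynchronously for chunks to reach the client live.
        new DeepSeekClient02().getResponse(DeepSeekClient02.API_KEY, 
                                         askParam.getAskInfo(), true, emitter);
        
    } catch (Exception e) {
        // Report the failure to the client and close the stream
        emitter.completeWithError(e);
    }
    
    return emitter;
}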
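
For readers new to SSE, it can help to see what an event looks like on the wire. The sketch below formats one event by hand in the text/event-stream format: an optional "event:" line, one "data:" line per line of payload, and a blank line as terminator. SseFrame is a hypothetical helper for illustration only; in the controller above, SseEmitter does this formatting itself.

```java
// Hypothetical helper that formats a single event in the text/event-stream format.
class SseFrame {
    /**
     * Builds one SSE frame. A payload that spans several lines (common for
     * Markdown chunks) needs a separate "data:" line for each line of text;
     * the client joins them back together with newlines.
     */
    static String format(String eventName, String data) {
        StringBuilder frame = new StringBuilder();
        if (eventName != null && !eventName.isEmpty()) {
            frame.append("event:").append(eventName).append('\n');
        }
        for (String line : data.split("\n", -1)) {
            frame.append("data:").append(line).append('\n');
        }
        frame.append('\n'); // a blank line terminates the event
        return frame.toString();
    }
}
```

The multi-line case matters for this project because a streamed Markdown chunk often contains newlines, and a newline written raw into a single "data:" line would end that field early.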

Frontend Streaming Implementation

1. Establishing the SSE connection with EventSource

askQuestionStream: (
  question: string,
  onChunkCallback: (message: string) => void,
  onErrorCallback: (error: string) => void,
  onCompleteCallback: () => void
): () => void => {
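  // EventSource can only issue GET requests, so the backend must accept
  // GET on this path with askInfo as a query parameter
  const eventSource = new EventSource(
    `${API_BASE_URL}/deepseek/ask/stream?askInfo=${encodeURIComponent(question)}`
  );
  
  // Set up timeout handling, event listeners, etc...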
}

2. Markdown format repair

The frontend implements a fixStreamingMarkdown function that repairs formatting problems introduced by chunking:

const fixStreamingMarkdown = (existingContent: string, newChunk: string): string => {
  let combinedContent = existingContent + newChunk;
  
  // Fix a missing space after heading markers
  // e.g. "#Java" => "# Java"
  combinedContent = combinedContent.replace(/^(#+)([^#\s])/gm, '$1 $2');
  
  // Fix level-2 heading markers that were split apart
  combinedContent = combinedContent.replace(/# #/g, '##');
  
  // Fix bold markers that were split apart
  combinedContent = combinedContent.replace(/\* \*/g, '**');
  
  // More format fixes...
  
  return combinedContent;
};

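3. Dynamically updating streamed content

After the user sends a message, we create an empty bot message and then fill it in incrementally through state updates: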

// Create a new bot message whose content starts out empty
const botMessageId = (Date.now() + 1).toString();
botMessageIdRef.current = botMessageId;

const botMessage: MessageType = {
  id: botMessageId,
  content: '',
  sender: 'bot',
  timestamp: new Date().toISOString(),
};

setMessages((prev) => [...prev, botMessage]);

// Call the streaming API
eventSourceRef.current = deepSeekService.askQuestionStream(
  question,
  // Handle each incoming chunk
  (contentChunk) => {
    // Apply the Markdown fixes, then store the accumulated content
    const fixedContent = fixStreamingMarkdown(accumulatedContentRef.current, contentChunk);
    accumulatedContentRef.current = fixedContent;
    
    // Update the message
    setMessages((prevMessages) => {
      return prevMessages.map((msg) => {
        if (msg.id === botMessageId) {
          return {
            ...msg,
            content: fixedContent,
          };
        }
        return msg;
      });
    });
  },
  // Error and completion handlers...
);

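User Interface Improvements

To improve the user experience, we made several changes to the interface:

1. Wider display area for document-style output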

const ChatContainer = styled.div`
  width: 100%;
  max-width: 1400px;  // 900px in V1
  // other styles...
`;

2. Taller message area

const MessagesContainer = styled.div`
  height: 65vh;  // 60vh in V1
  // other styles...
`;

3. Custom scrollbar styling

&::-webkit-scrollbar {
  width: 8px;
}

&::-webkit-scrollbar-track {
  background: #f1f1f1;
  border-radius: 4px;
}

&::-webkit-scrollbar-thumb {
  background: #c1c1c1;
  border-radius: 4px;
}

&::-webkit-scrollbar-thumb:hover {
  background: #a8a8a8;
}

4. Improved responsive layout

@media (max-width: 1400px) {
  padding: 20px;
}

@media (max-width: 768px) {
  padding: 15px;
}

5. Switching between streaming and non-streaming modes

To make testing easier and to cover different needs, we added a mode switch:

<ChatOptions>
  <ThunderboltOutlined style={{ color: useStreamMode ? '#1890ff' : '#bbb' }} />
  <Switch 
    size="small" 
    checked={useStreamMode} 
    onChange={setUseStreamMode} 
    checkedChildren="Stream" 
    unCheckedChildren="Full"
  />
</ChatOptions>