The Simplest Spring AI + Llama 3.x Tutorial Ever, Part 01: Spring AI Chat in Depth

静愚 AGI

Last modified: 2024-07-27 23:04:39

Category: Spring AI. Tags: spring, artificial intelligence, java

First published: 2024-07-27 12:35:27

Article link: https://blog.csdn.net/JingYu_365/article/details/140733695

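The previous article covered the basic concepts and core features of Spring AI. This one focuses on using Spring AI's Chat capability in a real project, with complete code examples to get you started quickly.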

Preparation

Project code:

For the Maven pom and application configuration, see "The Simplest Spring AI Tutorial Ever | 0. Spring AI Overview".

Create a Spring Boot AI project:

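import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

@SpringBootApplication
public class ChatCasesApplication {
    public static void main(String[] args) {
        SpringApplication.run(ChatCasesApplication.class, args);
    }
}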

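Before we start, take a look at the class structure of OllamaChatModel, which we will be using:

[Figure: class structure of OllamaChatModel]

In practice the class exposes just two usable methods, org.springframework.ai.ollama.OllamaChatModel#call and org.springframework.ai.ollama.OllamaChatModel#stream. Both build their request through org.springframework.ai.ollama.OllamaChatModel#ollamaChatRequest. The implementation looks like this: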

public class OllamaChatModel implements ChatModel {

    /**
     * Low-level Ollama API library.
     */
    private final OllamaApi chatApi;
    
    // other code omitted
    
    @Override
    public ChatResponse call(Prompt prompt) {
    
        OllamaApi.ChatResponse response = this.chatApi.chat(ollamaChatRequest(prompt, false));
    
        var generator = new Generation(response.message().content());
        if (response.promptEvalCount() != null && response.evalCount() != null) {
           generator = generator.withGenerationMetadata(ChatGenerationMetadata.from("unknown", null));
        }
        return new ChatResponse(List.of(generator), OllamaChatResponseMetadata.from(response));
    }
    
    @Override
    public Flux<ChatResponse> stream(Prompt prompt) {
    
        Flux<OllamaApi.ChatResponse> response = this.chatApi.streamingChat(ollamaChatRequest(prompt, true));
    
        return response.map(chunk -> {
           Generation generation = (chunk.message() != null) ? new Generation(chunk.message().content())
                 : new Generation("");
           if (Boolean.TRUE.equals(chunk.done())) {
              generation = generation.withGenerationMetadata(ChatGenerationMetadata.from("unknown", null));
           }
           return new ChatResponse(List.of(generation), OllamaChatResponseMetadata.from(chunk));
        });
    }
    
    /**
     * Package access for testing.
     */
    OllamaApi.ChatRequest ollamaChatRequest(Prompt prompt, boolean stream) {
        // code omitted
    }
}

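The implementation shows that every request goes through OllamaApi. The class structure of OllamaApi is as follows:

[Figure: class structure of OllamaApi]

Synchronous Chat

With the startup class defined above, we can call the org.springframework.ai.ollama.OllamaChatModel#call method mentioned earlier directly to implement synchronous question answering with the model.

Synchronous here means that the response is returned to the client only after the model has finished inference.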

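import lombok.AllArgsConstructor;
import org.springframework.ai.ollama.OllamaChatModel;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;

@AllArgsConstructor
@RequestMapping("/chat")
@RestController
public class ChatCaseController {

    // Injected through the constructor that Lombok generates
    private final OllamaChatModel ollamaChatModel;

    // Blocks until the model has produced the complete answer
    @GetMapping
    String chat(@RequestParam String message) {
        return ollamaChatModel.call(message);
    }
}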
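To try the endpoint from code rather than a browser, a small client built on the JDK's java.net.http package is enough. The following is a minimal sketch, not part of the original project: the class name ChatEndpointClient is hypothetical, and the base URL http://localhost:8080 assumes the application runs locally on Spring Boot's default port.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for calling the /chat endpoint defined above.
class ChatEndpointClient {

    // Builds the request URI; the message is URL-encoded so that spaces
    // and non-ASCII characters survive as a query parameter.
    static URI chatUri(String baseUrl, String message) {
        String encoded = URLEncoder.encode(message, StandardCharsets.UTF_8);
        return URI.create(baseUrl + "/chat?message=" + encoded);
    }

    // Sends a blocking GET and returns the whole answer in one piece,
    // mirroring the synchronous behaviour of the controller.
    static String ask(String baseUrl, String message) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(chatUri(baseUrl, message)).GET().build();
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        // Assumption: the application is running on localhost:8080.
        System.out.println(ask("http://localhost:8080", "Why is the sky blue?"));
    }
}
```

Because the call is synchronous, ask returns nothing until the model has finished, so long answers can take noticeably long to come back.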

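Streaming Response: Stream Chat

Streaming APIs help with long-running tasks such as real-time audio transcription or long-form text generation.

Because model inference is a continuous process, Spring AI uses a streaming API for such tasks so that the application can respond in real time.

Streaming spares the user from waiting for the whole inference to finish: as soon as the model produces output, it is returned to the client as a stream, which shortens the time to first token.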

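import lombok.AllArgsConstructor;
import org.springframework.ai.ollama.OllamaChatModel;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;
import reactor.core.publisher.Flux;

@AllArgsConstructor
@RequestMapping("/chat")
@RestController
public class ChatStreamCaseController {

    private final OllamaChatModel ollamaChatModel;

    // Emits each piece of the answer as soon as the model produces it
    @GetMapping("/stream")
    Flux<String> chatStream(@RequestParam String message) {
        return ollamaChatModel.stream(message);
    }
}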
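The difference between the two styles can be shown without any framework. The sketch below is a plain-JDK illustration, not Spring AI or Reactor code: the class StreamingSketch and its methods are hypothetical, a fixed list of strings stands in for the tokens a model would generate, and a Consumer callback stands in for a Flux subscriber. Reactor's Flux is asynchronous and supports backpressure, neither of which is modelled here.

```java
import java.util.List;
import java.util.function.Consumer;

// Hypothetical stand-in that contrasts blocking and streaming delivery.
class StreamingSketch {

    // Blocking style: the caller receives nothing until every token exists.
    static String callBlocking(List<String> tokens) {
        StringBuilder answer = new StringBuilder();
        for (String token : tokens) {
            answer.append(token); // generation continues; the caller is still waiting
        }
        return answer.toString();
    }

    // Streaming style: each token is handed to the caller as soon as it exists.
    static void callStreaming(List<String> tokens, Consumer<String> onChunk) {
        for (String token : tokens) {
            onChunk.accept(token);
        }
    }

    public static void main(String[] args) {
        List<String> tokens = List.of("Spring", " AI", " streams", " tokens");
        System.out.println(callBlocking(tokens));
        callStreaming(tokens, chunk -> System.out.println("chunk: " + chunk));
    }
}
```

Concatenating the streamed chunks yields the same text as the blocking call; what changes is when the first piece becomes visible to the user.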

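Configuring Model Parameters: Chat Options

We can now interact with the model through chat, but one question remains: how do we change model parameters such as temperature?

Start with the call method, public ChatResponse call(Prompt prompt). Its parameter is a Prompt, so that class is worth a closer look:

[Figure: Prompt constructor parameters]

One of its constructors takes a ChatOptions, which evidently carries the configuration parameters for a chat. ChatOptions is an interface with the following structure: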

/**
 * The ChatOptions represent the common options, portable across different chat models.
 */
public interface ChatOptions extends ModelOptions {

    Float getTemperature();

    Float getTopP();

    Integer getTopK();
}

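These are only the options common to all models. Model-specific parameters live in the implementation classes, so let's look at the Ollama implementation:

[Figure: parameter configuration for models deployed with Ollama]

So to configure session-level parameters, we only need to pass a ChatOptions along with the Prompt. For example: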

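import lombok.AllArgsConstructor;
import org.springframework.ai.chat.prompt.Prompt;
import org.springframework.ai.ollama.OllamaChatModel;
import org.springframework.ai.ollama.api.OllamaOptions;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RequestParam;
import org.springframework.web.bind.annotation.RestController;
import reactor.core.publisher.Flux;

@AllArgsConstructor
@RequestMapping("/chat-option")
@RestController
public class ChatOptionCaseController {

    private final OllamaChatModel ollamaChatModel;

    @GetMapping
    Flux<?> chatOptions(@RequestParam String message) {
        return ollamaChatModel.stream(new Prompt(
                message,
                new OllamaOptions()
                        .withTemperature(0.2f)      // Session temperature; higher values give more divergent answers
                        .withModel("llama3")        // Model for this request; overrides the one configured in application
        ));
    }
}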
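For intuition about why a higher temperature makes answers more divergent, the usual definition of temperature scaling helps: the model's scores (logits) are divided by the temperature before they are turned into probabilities. The sketch below illustrates that general technique under stated assumptions and is not code from Ollama or Spring AI; the class TemperatureSketch and the three example logits are made up, and the sketch only handles temperatures greater than zero.

```java
import java.util.Arrays;

// Hypothetical illustration of temperature-scaled softmax.
class TemperatureSketch {

    // Converts logits to probabilities after dividing them by the temperature.
    static double[] softmax(double[] logits, double temperature) {
        if (temperature <= 0) {
            throw new IllegalArgumentException("this sketch requires temperature > 0");
        }
        // Subtract the largest scaled logit for numerical stability.
        double max = Double.NEGATIVE_INFINITY;
        for (double logit : logits) {
            max = Math.max(max, logit / temperature);
        }
        double[] probs = new double[logits.length];
        double sum = 0.0;
        for (int i = 0; i < logits.length; i++) {
            probs[i] = Math.exp(logits[i] / temperature - max);
            sum += probs[i];
        }
        for (int i = 0; i < probs.length; i++) {
            probs[i] /= sum;
        }
        return probs;
    }

    public static void main(String[] args) {
        double[] logits = {2.0, 1.0, 0.0}; // made-up scores for three candidate tokens
        System.out.println("T=0.2: " + Arrays.toString(softmax(logits, 0.2)));
        System.out.println("T=1.0: " + Arrays.toString(softmax(logits, 1.0)));
    }
}
```

At a temperature of 0.2 almost all probability mass lands on the highest-scoring token, while at 1.0 the alternatives keep a meaningful share, so sampling picks them more often.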

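For more Ollama options, see: Chat Properties

Conclusion

This article uses Ollama only as an example. Other models are used in the same way; the details vary but the core logic stays the same, so focus on that.
This chapter touched on how to use Prompt; the next one takes a deeper look at Prompt usage patterns.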
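Application-level properties such as spring.ai.ollama.chat.options.model and spring.ai.ollama.chat.options.temperature act as defaults, and options passed with a Prompt override them for that request. The sketch below shows one simple way such an override rule can work. It is an assumption-laden illustration, not Spring AI's actual merge code: the Options class and the merge method are hypothetical, and only two fields are modelled.

```java
// Hypothetical illustration of "request options override defaults".
class ChatOptionsMergeSketch {

    // Simplified stand-in for an options object; null means "not set".
    static final class Options {
        final String model;
        final Float temperature;

        Options(String model, Float temperature) {
            this.model = model;
            this.temperature = temperature;
        }
    }

    // Per-request values win; anything left unset falls back to the default.
    static Options merge(Options defaults, Options perRequest) {
        if (perRequest == null) {
            return defaults;
        }
        return new Options(
                perRequest.model != null ? perRequest.model : defaults.model,
                perRequest.temperature != null ? perRequest.temperature : defaults.temperature);
    }

    public static void main(String[] args) {
        Options defaults = new Options("llama3.1", 0.8f);   // as if read from application config
        Options perRequest = new Options(null, 0.2f);       // only temperature set on the request
        Options effective = merge(defaults, perRequest);
        System.out.println(effective.model + " @ " + effective.temperature);
    }
}
```

With this rule a request that sets only the temperature still runs against the configured default model, which matches the behaviour the controller example relies on when it overrides the model explicitly.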
