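AI Model Hub is an integration framework for developing large-model applications. It connects to the open APIs of the major model vendors, converts their data into a unified format, and adapts that data for display in different front-end scenarios. It also handles model matching and switching, prompt construction, knowledge-base calls, and data storage, which speeds up the delivery of model-based applications. This article walks through connecting AI Model Hub to a llama3 model running locally.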
GitHub: https://github.com/flower-trees/ai-model-hub
Run llama3:8b with Ollama
Reference: https://blog.csdn.net/fenglingguitar/article/details/140320238
Code changes
Relevant files
ai-model-hub/
│
└── src/
    └── main/
        ├── java/
        │   └── org/
        │       └── salt/
        │           └── ai/
        │               └── hub/
        │                   ├── models.ai/
        │                   │   ├── ollama/
        │                   │   │   ├── dto/
        │                   │   │   │   ├── OllamaRequest.java
        │                   │   │   │   └── OllamaResponse.java
        │                   │   │   ├── OllamaActuator.java
        │                   │   │   └── OllamaListener.java
        │                   │   └── enums/
        │                   │       └── VendorType.java
        │                   └── chat/
        │                       ├── process/
        │                       │   └── SimpleContextProcess.java
        │                       └── service/
        │                           └── ChatService.java
        └── resources/
            └── application-dev.yml
1. Define the Ollama DTOs
OllamaRequest.java
import java.util.List;

import lombok.Data;

// Request body sent to Ollama's /api/chat endpoint
@Data
public class OllamaRequest {
    private String model;
    private List<Message> messages;
    private Options options;
    private boolean stream;

    @Data
    public static class Message {
        private String role;
        private String content;
    }

    @Data
    public static class Options {
        private double temperature;
    }
}
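For orientation, the DTO above is meant to serialize to the JSON body that Ollama's /api/chat endpoint accepts. The following is a minimal sketch that assembles that body by hand with plain JDK string handling, just to make the shape visible. The class name OllamaRequestJsonSketch and its helper methods are hypothetical and not part of AI Model Hub, which serializes through its own JsonUtil.

```java
import java.util.List;

// Hypothetical illustration only: shows the JSON shape that OllamaRequest maps to.
class OllamaRequestJsonSketch {

    // Escapes the characters that must not appear raw inside a JSON string.
    static String escape(String s) {
        StringBuilder sb = new StringBuilder();
        for (char c : s.toCharArray()) {
            switch (c) {
                case '"':  sb.append("\\\""); break;
                case '\\': sb.append("\\\\"); break;
                case '\n': sb.append("\\n"); break;
                case '\r': sb.append("\\r"); break;
                case '\t': sb.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        sb.append(String.format("\\u%04x", (int) c));
                    } else {
                        sb.append(c);
                    }
            }
        }
        return sb.toString();
    }

    // One entry of the "messages" array.
    static String message(String role, String content) {
        return "{\"role\":\"" + escape(role) + "\",\"content\":\"" + escape(content) + "\"}";
    }

    // Full request body; `messages` holds entries already built by message().
    static String chatBody(String model, List<String> messages, double temperature, boolean stream) {
        return "{\"model\":\"" + escape(model) + "\""
                + ",\"messages\":[" + String.join(",", messages) + "]"
                + ",\"options\":{\"temperature\":" + temperature + "}"
                + ",\"stream\":" + stream + "}";
    }

    public static void main(String[] args) {
        String body = chatBody("llama3:8b", List.of(message("user", "Hello")), 0.3, true);
        System.out.println(body);
    }
}
```

In a real project a JSON library should do this work; the hand-written version is only there so that the field names (model, messages, options.temperature, stream) can be compared with the DTO one by one.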
OllamaResponse.java
import lombok.Data;

// One response object returned by Ollama's /api/chat endpoint
@Data
public class OllamaResponse {
    private String model;
    private String created_at;
    private Message message;
    private String done_reason;
    private boolean done;
    private long total_duration;
    private long load_duration;
    private int prompt_eval_count;
    private long prompt_eval_duration;
    private int eval_count;
    private long eval_duration;

    @Data
    public static class Message {
        private String role;
        private String content;
    }
}
2. Add the Actuator and Listener
OllamaActuator.java
@Component
public class OllamaActuator implements AiChatActuator {

    @Value("${models.ollama.chat-url}")
    private String chatUrl;

    @Value("${models.ollama.chat-key}")
    private String chatKey;

    @Autowired
    HttpStreamClient commonHttpClient;

    @Override
    public void pursue(AiChatDto aiChatDto, Consumer<AiChatResponse> responder, Consumer<AiChatResponse> callback) {
        Map<String, String> headers = new HashMap<>();
        headers.put("Content-Type", "application/json");
        headers.put("Authorization", "Bearer " + chatKey);

        // Convert the input parameters into the Ollama request format
        OllamaRequest request = convert(aiChatDto);

        // Call the model, passing in an OllamaListener to handle the response
        commonHttpClient.call(chatUrl, JsonUtil.toJson(request), headers, List.of(new OllamaListener(responder, callback)));
    }

    public static OllamaRequest convert(AiChatDto aiChatDto) {
        OllamaRequest request = new OllamaRequest();
        request.setModel(aiChatDto.getModel());
        request.setStream(aiChatDto.isStream());

        List<OllamaRequest.Message> ollamaMessages = aiChatDto.getMessages().stream()
                .map(OllamaActuator::convertMessage)
                .collect(Collectors.toList());
        request.setMessages(ollamaMessages);

        OllamaRequest.Options options = new OllamaRequest.Options();
        options.setTemperature(0.3);
        request.setOptions(options);
        return request;
    }

    private static OllamaRequest.Message convertMessage(AiChatDto.Message aiChatMessage) {
        OllamaRequest.Message message = new OllamaRequest.Message();
        message.setRole(aiChatMessage.getRole());
        message.setContent(aiChatMessage.getContent());
        return message;
    }
}
OllamaListener.java
public class OllamaListener extends DoListener {

    public OllamaListener(Consumer<AiChatResponse> responder, Consumer<AiChatResponse> callback) {
        super(responder, callback);
    }

    @Override
    public void onMessage(String msg) {
        OllamaResponse response = JsonUtil.fromJson(msg, OllamaResponse.class);
        if (response != null) {
            // Convert the Ollama result into the common response format
            AiChatResponse aiChatResponse = new AiChatResponse();
            aiChatResponse.setVendor(VendorType.OLLAMA.getCode());
            // OllamaResponse has no id field, so vendorId is left unset
            aiChatResponse.setVendorModel(response.getModel());
            List<AiChatResponse.Message> messages = getMessages(response);
            aiChatResponse.setMessages(messages);
            aiChatResponse.setCode(AiChatCode.MESSAGE.getCode());
            aiChatResponse.setMessage(AiChatCode.MESSAGE.getMessage());
            // Write the data back to the caller
            responder.accept(aiChatResponse);
        }
    }

    private static @NotNull List<AiChatResponse.Message> getMessages(OllamaResponse response) {
        List<AiChatResponse.Message> messages = new ArrayList<>();
        // Ollama returns a single message object rather than a list of choices
        if (response.getMessage() != null) {
            AiChatResponse.Message message = new AiChatResponse.Message();
            message.setRole(response.getMessage().getRole());
            message.setContent(response.getMessage().getContent());
            message.setType(MessageType.MARKDOWN.getCode());
            messages.add(message);
        }
        return messages;
    }
}
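When stream is true, Ollama's /api/chat endpoint returns a sequence of JSON objects, each carrying a fragment of the reply in message.content and a done flag that becomes true on the last object. The listener above forwards every fragment as it arrives. The sketch below shows, under simplifying assumptions, how such fragments add up to a complete reply. StreamAccumulatorSketch and Chunk are hypothetical names used only for illustration, and JSON parsing is deliberately left out (the project does that through JsonUtil).

```java
import java.util.List;

// Hypothetical illustration: how streamed Ollama fragments combine into one reply.
class StreamAccumulatorSketch {

    // Already-parsed view of one streamed object (only the fields used here).
    static final class Chunk {
        final String content;
        final boolean done;

        Chunk(String content, boolean done) {
            this.content = content;
            this.done = done;
        }
    }

    private final StringBuilder reply = new StringBuilder();
    private boolean finished;

    // Appends one fragment; anything arriving after the final chunk is ignored.
    void onChunk(Chunk chunk) {
        if (finished) {
            return;
        }
        if (chunk.content != null) {
            reply.append(chunk.content);
        }
        if (chunk.done) {
            finished = true;
        }
    }

    boolean isFinished() {
        return finished;
    }

    String reply() {
        return reply.toString();
    }

    public static void main(String[] args) {
        StreamAccumulatorSketch acc = new StreamAccumulatorSketch();
        for (Chunk c : List.of(new Chunk("Hel", false), new Chunk("lo", false), new Chunk("", true))) {
            acc.onChunk(c);
        }
        System.out.println(acc.reply() + " finished=" + acc.isFinished());
    }
}
```

A front end that renders the stream incrementally would not need the accumulator at all; it becomes useful when the full reply has to be stored, for example as conversation context.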
3. Add the enum value
VendorType.java
OLLAMA("ollama"),
4. Add the configuration
application-dev.yml
# Add under the existing models: section so that ${models.ollama.chat-url} resolves
models:
  ollama:
    chat-url: http://localhost:11434/api/chat
    chat-key: ${OLLAMA_KEY}
5. Update the test call logic
ChatService.java
@Autowired
OllamaActuator ollamaActuator;

public void hub(AiChatRequest aiChatRequest, Consumer<AiChatResponse> responder) {
    ......
    } else if (aiChatDto.getVendor().equals(VendorType.OLLAMA.getCode())) {
        ollamaActuator.pursue(aiChatDto, responder, simpleContextProcess::executeDown);
    }
    ......
}
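The branch above picks an actuator by vendor code. As more vendors are added, the same routing could also be written as a lookup table. The following is only a sketch of that idea and not how AI Model Hub implements it: VendorRouterSketch, Actuator, and the String-based prompt and reply types are simplified, hypothetical stand-ins for ChatService, AiChatActuator, AiChatDto, and AiChatResponse.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical illustration: routing a request to an actuator by vendor code.
class VendorRouterSketch {

    // Simplified stand-in for AiChatActuator: takes a prompt, emits replies.
    interface Actuator {
        void pursue(String prompt, Consumer<String> responder);
    }

    private final Map<String, Actuator> actuators = new HashMap<>();

    // Associates a vendor code such as "ollama" with its actuator.
    void register(String vendorCode, Actuator actuator) {
        actuators.put(vendorCode, actuator);
    }

    // Looks up the actuator for the vendor and delegates the call to it.
    void hub(String vendorCode, String prompt, Consumer<String> responder) {
        Actuator actuator = actuators.get(vendorCode);
        if (actuator == null) {
            throw new IllegalArgumentException("Unsupported vendor: " + vendorCode);
        }
        actuator.pursue(prompt, responder);
    }

    public static void main(String[] args) {
        VendorRouterSketch router = new VendorRouterSketch();
        router.register("ollama", (prompt, responder) -> responder.accept("[ollama] " + prompt));
        router.hub("ollama", "Hello", System.out::println);
    }
}
```

With this layout, supporting a new vendor means one register call instead of another else-if branch, at the cost of a little indirection.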
SimpleContextProcess.java
@Override
public AiChatDto executeUp(AiChatRequest aiChatRequest) {
    ......
    } else if (aiChatDto.getAgent().equals("5")) {
        aiChatDto.setVendor(VendorType.OLLAMA.getCode());
        aiChatDto.setModel("llama3:8b");
    }
    ......
}
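Test
Use agent 5 to route the request to the corresponding model. The prompt in the request below asks for an introduction to the Military Museum in no more than 20 Chinese characters.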
curl --location 'http://127.0.0.1:8080/ai-model-hub/ai/stream/chat' \
--header 'Content-Type: application/json' \
--data '{
"agent": "5",
"content": "介绍一下军博,使用20个汉字以内"
}'
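The same request can also be sent from Java. The sketch below uses the JDK's built-in java.net.http client (Java 11 or later) to build the POST request from the curl command and to print whatever lines the service streams back. ChatClientSketch and its method names are hypothetical, the content argument is assumed to contain no characters that need JSON escaping, and the format of the streamed lines depends on the service and is not interpreted here.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Hypothetical illustration: a Java counterpart of the curl test call.
class ChatClientSketch {

    static final String ENDPOINT = "http://127.0.0.1:8080/ai-model-hub/ai/stream/chat";

    // Builds the same POST request as the curl command.
    // Assumption: agent and content need no JSON escaping.
    static HttpRequest buildRequest(String agent, String content) {
        String body = "{\"agent\":\"" + agent + "\",\"content\":\"" + content + "\"}";
        return HttpRequest.newBuilder(URI.create(ENDPOINT))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(body, StandardCharsets.UTF_8))
                .build();
    }

    // Sends the request and prints each line of the response as it arrives.
    // Requires AI Model Hub to be running locally on port 8080.
    static void streamReply(HttpClient client, HttpRequest request)
            throws IOException, InterruptedException {
        client.send(request, HttpResponse.BodyHandlers.ofLines())
                .body()
                .forEach(System.out::println);
    }
}
```

With the service and Ollama both running, calling ChatClientSketch.streamReply(HttpClient.newHttpClient(), ChatClientSketch.buildRequest("5", "介绍一下军博,使用20个汉字以内")) should issue the same call as the curl command.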