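Introduction
Since GPT took off, an HTTP media type has been steadily gaining popularity: text/event-stream. If it is unfamiliar to you, you probably know its sibling, application/octet-stream. With either type, the client and server essentially keep one long-lived connection open: the server can write the response to the client a piece at a time, and the client can read it piece by piece until everything has been received.
ChatGPT generates its answer token by token, so when many tokens are needed, waiting for the complete response takes a long time. If the data is instead shown to the user continuously as it is generated, the experience stays good (much like playing a video online without first downloading the whole file).
text/event-stream lets the server write content to the client in multiple installments.
Providing a text/event-stream endpoint
Use Spring MVC's SseEmitter to expose the endpoint: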
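@PostMapping(value = "/test-stream", produces = "text/event-stream")
public SseEmitter conversation(@RequestBody ChatRequest request) {
    final SseEmitter emitter = new SseEmitter();
    // Write from a separate thread so the handler can return the emitter immediately
    new Thread(() -> {
        try {
            for (int i = 0; i < 10; i++) {
                // Simulate a time-consuming operation
                Thread.sleep(200L);
                emitter.send("Message " + i + " sent to the client");
            }
            emitter.complete();
        } catch (IOException | InterruptedException e) {
            // send() throws IOException and sleep() throws InterruptedException
            emitter.completeWithError(e);
        }
    }).start();
    return emitter;
}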
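For orientation, each send call reaches the client as one event frame: an optional event: line, one data: line per line of payload, then an empty line that ends the frame. The following is a minimal sketch of that framing in plain Java, not Spring's own writer; the class name SseFrame and the method format are hypothetical names chosen for illustration.

```java
class SseFrame {
    /**
     * Builds one event frame: an optional "event:" line, one "data:" line
     * per payload line, and a terminating blank line.
     */
    static String format(String eventName, String data) {
        StringBuilder sb = new StringBuilder();
        if (eventName != null) {
            sb.append("event:").append(eventName).append('\n');
        }
        // A payload containing newlines must be split across several data: lines
        for (String line : data.split("\n", -1)) {
            sb.append("data:").append(line).append('\n');
        }
        // The empty line tells the client that the event is complete
        sb.append('\n');
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(format(null, "Message 0 sent to the client"));
        System.out.print(format("ping", "line one\nline two"));
    }
}
```

Splitting the payload on newlines matters because a raw newline inside a data: line would otherwise be read by the client as the start of a new field.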
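How does a client read a text/event-stream endpoint? Whatever the protocol, the client reads the data through an InputStream. Note that with this format every payload passed to send is prefixed with data:, and every send is followed by an empty line.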
URL url = new URL(urlStr);
HttpURLConnection conn = (HttpURLConnection) url.openConnection();
conn.setRequestMethod("POST");
conn.setDoOutput(true);
conn.setDoInput(true);
// Write the request body
try (OutputStream os = conn.getOutputStream()) {
    os.write("request body".getBytes(StandardCharsets.UTF_8));
    os.flush();
}
// Read the response line by line as the server sends it
try (InputStream is = conn.getInputStream();
     BufferedReader reader = new BufferedReader(new InputStreamReader(is, StandardCharsets.UTF_8))) {
    String line;
    while ((line = reader.readLine()) != null) {
        // Blank lines only mark the end of an event
        if (StringUtils.isBlank(line)) {
            continue;
        }
        if (line.startsWith("data:")) {
            line = line.substring("data:".length());
            // Process the data
        }
    }
}
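The loop above covers the common case of one data: line per event. A somewhat fuller sketch, assuming the standard event-stream framing (comment lines starting with a colon, optional event: lines, several data: lines joined with a newline, a blank line ending each event), could look like the following. SseParser and Event are hypothetical names, and fields such as id: and retry: are deliberately ignored.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

class SseParser {
    /** One parsed event: its type ("message" if no event: line was sent) and its data. */
    static final class Event {
        final String type;
        final String data;

        Event(String type, String data) {
            this.type = type;
            this.data = data;
        }
    }

    static List<Event> parse(Reader source) throws IOException {
        List<Event> events = new ArrayList<>();
        BufferedReader reader = new BufferedReader(source);
        String type = "message";
        StringBuilder data = new StringBuilder();
        boolean hasData = false;
        String line;
        while ((line = reader.readLine()) != null) {
            if (line.isEmpty()) {
                // A blank line ends the current event
                if (hasData) {
                    events.add(new Event(type, data.toString()));
                }
                type = "message";
                data.setLength(0);
                hasData = false;
            } else if (line.startsWith(":")) {
                // Comment line, often used as a keep-alive: ignore it
            } else if (line.startsWith("event:")) {
                type = stripLeadingSpace(line.substring("event:".length()));
            } else if (line.startsWith("data:")) {
                if (hasData) {
                    data.append('\n');
                }
                data.append(stripLeadingSpace(line.substring("data:".length())));
                hasData = true;
            }
            // Other fields (id:, retry:) are ignored in this sketch
        }
        // An event that was not closed by a blank line before EOF is dropped
        return events;
    }

    private static String stripLeadingSpace(String s) {
        return s.startsWith(" ") ? s.substring(1) : s;
    }

    public static void main(String[] args) throws IOException {
        String stream = ": keep-alive\nevent: ping\ndata: {}\n\ndata:first\ndata: second\n\n";
        for (Event e : parse(new StringReader(stream))) {
            System.out.println(e.type + " -> " + e.data);
        }
    }
}
```

Tracking the event type per frame makes it possible to filter whole events (for example pings) rather than individual lines.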
Timeout settings
By default, Tomcat applies a 30-second timeout to async requests. If your async request takes longer than that, an AsyncRequestTimeoutException is thrown. It can be resolved as follows:
@Component
public class MyWebMvcConfig implements WebMvcConfigurer {
    @Override
    public void configureAsyncSupport(AsyncSupportConfigurer configurer) {
        configurer.setDefaultTimeout(-1L);
    }
}
Optionally, specify the timeout through the constructor when creating the SseEmitter:
final SseEmitter emitter = new SseEmitter(120 * 1000L);
Calling a text/event-stream endpoint
- Create the StreamMessageListener interface
It lets the caller define its own business logic.
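public interface StreamMessageListener<T> {
    /**
     * Called when a message is received.
     * @param message the received message
     */
    void messageReceived(T message);
    /**
     * Called when the connection is closed.
     */
    void done();
    /**
     * Called when reading the stream fails.
     */
    void onException(Throwable t);
}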
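- Define the HttpUtil class
It implements the shared logic for POST requests. The server may send event: ping messages to check whether the client is still alive; the client should simply ignore them.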
public static <R, P> void streamPost(String urlStr, Map<String, String> headerMap, Map<String, String> queryParams,
        P body, ParameterizedTypeReference<R> responseType, StreamMessageListener<R> listener) {
    HttpURLConnection conn = null;
    InputStream is = null;
    try {
        UriComponentsBuilder uriBuilder = UriComponentsBuilder.fromUriString(urlStr);
        if (queryParams != null) {
            for (Map.Entry<String, String> paramEntry : queryParams.entrySet()) {
                uriBuilder.queryParam(paramEntry.getKey(), paramEntry.getValue());
            }
        }
        urlStr = uriBuilder.build().toString();
        URL url = new URL(urlStr);
        conn = getHttpURLConnection("POST", url, headerMap);
        // Write the request body
        OutputStream os = conn.getOutputStream();
        os.write(Objects.requireNonNull(JsonUtil.toJson(body)).getBytes(StandardCharsets.UTF_8));
        os.flush();
        os.close();
        // Fail fast on an error status before reading the response
        throwIfError(conn);
        is = conn.getInputStream();
        String line;
        BufferedReader reader = new BufferedReader(new InputStreamReader(is, StandardCharsets.UTF_8));
        // Read the response line by line
        while ((line = reader.readLine()) != null) {
            // Skip event separators and keep-alive pings
            if (StringUtils.isBlank(line) || "event: ping".equals(line)) {
                continue;
            }
            if (line.startsWith("data:")) {
                line = line.substring("data:".length());
            }
            R message = JsonUtil.toObject(line, convertToTypeReference(responseType));
            listener.messageReceived(message);
        }
    } catch (Throwable e) {
        listener.onException(e);
    } finally {
        if (is != null) {
            try {
                is.close();
            } catch (IOException ignored) { }
        }
        if (conn != null) {
            conn.disconnect();
        }
        listener.done();
    }
}
// On failure the body must be read from the error stream
private static void throwIfError(HttpURLConnection conn) throws IOException {
    if (conn.getResponseCode() >= 400) {
        InputStream inputStream = conn.getErrorStream();
        // getErrorStream() returns null when the server sent no error body
        if (inputStream == null) {
            throw new RuntimeException("HTTP " + conn.getResponseCode());
        }
        ByteArrayOutputStream outputStream = new ByteArrayOutputStream();
        byte[] buffer = new byte[1024];
        int bytesRead;
        while ((bytesRead = inputStream.read(buffer)) != -1) {
            outputStream.write(buffer, 0, bytesRead);
        }
        String errorMsg = outputStream.toString(StandardCharsets.UTF_8.name());
        throw new RuntimeException(errorMsg);
    }
}
private static HttpURLConnection getHttpURLConnection(String method, URL url, Map<String, String> headerMap) throws IOException {
    HttpURLConnection conn = (HttpURLConnection) url.openConnection();
    conn.setRequestMethod(method);
    if (headerMap != null) {
        for (Map.Entry<String, String> entry : headerMap.entrySet()) {
            conn.setRequestProperty(entry.getKey(), entry.getValue());
        }
    }
    if (headerMap == null || !headerMap.containsKey("Content-Type")) {
        conn.setRequestProperty("Content-Type", MediaType.APPLICATION_JSON_VALUE);
    }
    conn.setRequestProperty("Accept", "*/*");
    conn.setDoOutput(true);
    conn.setDoInput(true);
    conn.setReadTimeout(READ_TIME_OUT);
    return conn;
}
private static <T> TypeReference<T> convertToTypeReference(ParameterizedTypeReference<T> parameterizedTypeReference) {
    return new TypeReference<T>() {
        @Override
        public Type getType() {
            return parameterizedTypeReference.getType();
        }
    };
}
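One caveat with matching the literal line event: ping is that a ping frame may also carry a data: line, which the loop in streamPost would then try to decode as a regular message. A possible refinement, sketched below with JDK classes only, remembers the current event type and drops every data: line that belongs to a ping frame. SseDispatcher is a hypothetical name, Listener is a local copy of the StreamMessageListener contract so the example compiles on its own, and the decoder function stands in for JsonUtil.

```java
import java.io.BufferedReader;
import java.io.Reader;
import java.io.StringReader;
import java.util.function.Function;

class SseDispatcher {
    /** Local stand-in for StreamMessageListener so the sketch is self-contained. */
    interface Listener<T> {
        void messageReceived(T message);
        void done();
        void onException(Throwable t);
    }

    static <T> void dispatch(Reader source, Function<String, T> decoder, Listener<T> listener) {
        try (BufferedReader reader = new BufferedReader(source)) {
            String eventType = "message";
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.isEmpty()) {
                    // The frame ended: the next one starts with the default type
                    eventType = "message";
                } else if (line.startsWith("event:")) {
                    eventType = line.substring("event:".length()).trim();
                } else if (line.startsWith("data:") && !"ping".equals(eventType)) {
                    String payload = line.substring("data:".length()).trim();
                    listener.messageReceived(decoder.apply(payload));
                }
                // Everything else (comments, id:, ping payloads) is skipped
            }
        } catch (Throwable t) {
            listener.onException(t);
        } finally {
            // Mirrors streamPost: done() is always called, even after a failure
            listener.done();
        }
    }

    public static void main(String[] args) {
        String stream = "event: ping\ndata: {}\n\ndata: 1\n\ndata: 2\n\n";
        dispatch(new StringReader(stream), s -> Integer.valueOf(s), new Listener<Integer>() {
            @Override
            public void messageReceived(Integer message) {
                System.out.println("received " + message);
            }

            @Override
            public void done() {
                System.out.println("done");
            }

            @Override
            public void onException(Throwable t) {
                System.out.println("failed: " + t);
            }
        });
    }
}
```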
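- Call the endpoint
The example below proxies a text/event-stream endpoint; its logic simply forwards the content received from the downstream service to the upstream caller.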
HttpUtil.streamPost(endPoint, headers, null, input, new ParameterizedTypeReference<DemoClass>() {}, new StreamMessageListener<DemoClass>() {
    @Override
    @SneakyThrows
    public void messageReceived(DemoClass message) {
        emitter.send(message);
    }
    @Override
    public void done() {
        emitter.complete();
    }
    @Override
    public void onException(Throwable e) {
        log.warn("call api error", e);
        try {
            emitter.send("<default error message>");
        } catch (IOException ignored) {}
    }
});
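Proxying a streaming endpoint with nginx
When Nginx acts as a reverse proxy, it normally fetches the response from the backend server and buffers it in memory or on disk before sending it to the client. This mechanism is called proxy buffering.
Proxy buffering helps manage data transfer when network bandwidth is limited: it smooths the traffic and avoids bursts that would strain the available bandwidth.
If your application needs to deliver data in real time, as with video streams, WebSocket, or other low-latency services, disabling proxy buffering ensures the data reaches the client as quickly as possible.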
server {
    listen 80;
    server_name example.com;
    location / {
        proxy_pass http://backend_server;
        proxy_buffering off; # disable proxy buffering
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
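Proxying a streaming endpoint with Java
For no added latency, read with readLine, exactly one line at a time, and flush after each line (the code below does the equivalent byte by byte, flushing at each blank line that ends an event). If latency is not a concern, you can also consider increasing the buffer size; the default currently in use is 256.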
private static void proxy(String proxyUrl, HttpServletRequest request, HttpServletResponse response) {
    HttpURLConnection connection = null;
    InputStream inputStream = null;
    OutputStream outputStream = null;
    try {
        URL url = new URL(proxyUrl);
        connection = (HttpURLConnection) url.openConnection();
        connection.setRequestMethod(request.getMethod());
        Enumeration<String> headerNames = request.getHeaderNames();
        while (headerNames.hasMoreElements()) {
            String headerName = headerNames.nextElement();
            String headerValue = request.getHeader(headerName);
            connection.setRequestProperty(headerName, headerValue);
        }
        // Send request body
        if ("POST".equalsIgnoreCase(request.getMethod()) || "PUT".equalsIgnoreCase(request.getMethod())
                || "PATCH".equalsIgnoreCase(request.getMethod())
                || ("DELETE".equalsIgnoreCase(request.getMethod()) && hasRequestBody(request))) {
            connection.setDoOutput(true);
            outputStream = connection.getOutputStream();
            inputStream = request.getInputStream();
            byte[] requestBodyBytes = StreamUtils.copyToByteArray(inputStream);
            inputStream.close();
            inputStream = null;
            outputStream.write(requestBodyBytes);
            outputStream.flush();
            outputStream.close();
            outputStream = null;
        }
        // Copy the response headers
        Map<String, List<String>> headerFields = connection.getHeaderFields();
        for (Map.Entry<String, List<String>> entry : headerFields.entrySet()) {
            String headerName = entry.getKey();
            // Skip the status line (null header name) and Transfer-Encoding; header names are case-insensitive
            if (headerName != null && !"Transfer-Encoding".equalsIgnoreCase(headerName)) {
                for (String headerValue : entry.getValue()) {
                    response.addHeader(headerName, headerValue);
                }
            }
        }
        response.setStatus(connection.getResponseCode());
        if (response.getStatus() >= 400) {
            // On failure the body is available from the error stream
            inputStream = connection.getErrorStream();
            byte[] bodyBytes = StreamUtils.copyToByteArray(inputStream);
            outputStream = response.getOutputStream();
            outputStream.write(bodyBytes);
            outputStream.flush();
            // Decompress only for logging; the client receives the original bytes
            if ("gzip".equals(response.getHeader("Content-Encoding"))) {
                bodyBytes = decompressGzip(bodyBytes);
            }
            log.warn("Error proxying request, url: {}, status:{}, msg:{}", proxyUrl, response.getStatus()
                    , new String(bodyBytes, StandardCharsets.UTF_8));
            return;
        }
        inputStream = connection.getInputStream();
        outputStream = response.getOutputStream();
        if (response.getContentType() != null && response.getContentType().trim().startsWith("text/event-stream")) {
            // These headers are meant to let an upstream nginx proxy stream the response
            // without configuring proxy_buffering off; X-Accel-Buffering: no is the
            // response header nginx checks to disable buffering for a single response
            response.setHeader("Cache-Control", "no-cache");
            response.setHeader("Connection", "keep-alive");
            response.setHeader("X-Accel-Buffering", "no");
            int byteRead;
            int newLineRepeatCount = 0;
            ByteArrayOutputStream bos = new ByteArrayOutputStream();
            while ((byteRead = inputStream.read()) != -1) {
                bos.write(byteRead);
                if (byteRead == '\n') {
                    newLineRepeatCount++;
                    if (newLineRepeatCount == 1) {
                        continue;
                    }
                    // Two consecutive line breaks end an event: forward it and flush
                    byte[] bytes = bos.toByteArray();
                    outputStream.write(bytes);
                    bos.reset();
                    response.flushBuffer();
                    newLineRepeatCount = 0;
                } else if (byteRead != '\r') {
                    // Ignore '\r' so that CRLF line endings still count as consecutive line breaks
                    newLineRepeatCount = 0;
                }
            }
            // Forward whatever is left after the last complete event
            if (bos.size() > 0) {
                outputStream.write(bos.toByteArray());
            }
        } else {
            byte[] buffer = new byte[4096];
            int bytesRead;
            while ((bytesRead = inputStream.read(buffer)) != -1) {
                outputStream.write(buffer, 0, bytesRead);
            }
        }
    } catch (Exception e) {
        response.setStatus(HttpServletResponse.SC_INTERNAL_SERVER_ERROR);
        log.warn("Error proxying request: {}", proxyUrl, e);
    } finally {
        if (inputStream != null) {
            try {
                inputStream.close();
            } catch (IOException ignored) { }
        }
        if (outputStream != null) {
            try {
                outputStream.close();
            } catch (IOException ignored) { }
        }
        if (connection != null) {
            connection.disconnect();
        }
    }
}
@SneakyThrows
private static byte[] decompressGzip(byte[] compressedData) {
    try (ByteArrayInputStream bis = new ByteArrayInputStream(compressedData);
         GZIPInputStream gzipIS = new GZIPInputStream(bis)) {
        return StreamUtils.copyToByteArray(gzipIS);
    }
}
private static boolean hasRequestBody(HttpServletRequest request) {
    try {
        return request.getInputStream().available() > 0;
    } catch (IOException e) {
        return false;
    }
}
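The event-boundary flushing inside proxy can be isolated and exercised without a servlet container. The sketch below is one possible way to do that with plain JDK streams; EventBoundaryCopier and copy are hypothetical names. It copies the stream byte by byte and flushes each time a blank line (LF or CRLF) closes an event, returning the number of such flushes so the behavior is easy to check.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;

class EventBoundaryCopier {
    /**
     * Copies in to out and flushes after every blank line, i.e. after each
     * complete event. Returns how many event boundaries triggered a flush.
     */
    static int copy(InputStream in, OutputStream out) throws IOException {
        int flushes = 0;
        int lineLength = 0; // bytes seen on the current line, not counting CR or LF
        int b;
        while ((b = in.read()) != -1) {
            out.write(b);
            if (b == '\n') {
                if (lineLength == 0) {
                    // An empty line ends the event: push it to the client now
                    out.flush();
                    flushes++;
                }
                lineLength = 0;
            } else if (b != '\r') {
                lineLength++;
            }
        }
        // Deliver any trailing bytes that were not followed by a blank line
        out.flush();
        return flushes;
    }

    public static void main(String[] args) throws IOException {
        byte[] stream = "data:a\n\ndata:b\r\n\r\ndata:c".getBytes(StandardCharsets.UTF_8);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        int flushes = copy(new ByteArrayInputStream(stream), out);
        System.out.println("event flushes: " + flushes + ", bytes copied: " + out.size());
    }
}
```

In a real proxy the input would usually be wrapped in a BufferedInputStream, so that the single-byte reads do not each hit the socket.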