The exception:
Exception in thread "main" org.jsoup.UnsupportedMimeTypeException: Unhandled content type. Must be text/*, application/xml, or application/xhtml+xml. Mimetype=application/json;charset=UTF-8, URL=https://xxx.com/apiname

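Most of the articles I found online just say to set ignoreContentType(true) on the Jsoup Connection, but none of them explain why, so I am recording the reason here.
Pseudocode for the Jsoup request. Set a breakpoint on the execute() call and step into it: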
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

import org.jsoup.Connection;
import org.jsoup.Jsoup;
// Assumption: JSON.toJSONString comes from fastjson; swap in your own JSON library if needed
import com.alibaba.fastjson.JSON;

public static String searchTest() throws IOException {
    // Map<String, Object> so that the numeric value 2 compiles and stays a number in the JSON
    Map<String, Object> params = new HashMap<>();
    params.put("p1", "v1");
    params.put("p2", 2);
    Connection connect = Jsoup.connect("https://xxx.com/apiname");
    // Serialize the parameters to a JSON string
    String p = JSON.toJSONString(params);
    /*
     * Jsoup sends a POST request with a JSON body.
     *
     * With the method and request body set, execute() writes the body to the
     * connection's output stream, where the server reads and parses it.
     *
     * ignoreContentType(true) skips the check of the response Content-Type.
     * Without it, execute() throws:
     * org.jsoup.UnsupportedMimeTypeException: Unhandled content type. Must be text/*, application/xml, or application/xhtml+xml. Mimetype=application/json;charset=UTF-8, URL=https://xxx.com/apiname
     */
    Connection.Response execute = connect
            .method(Connection.Method.POST)
            // Request headers
            .header("Content-Type", "application/json; charset=UTF-8")
            .header("Accept", "text/plain, */*; q=0.01")
            .header("Accept-Encoding", "gzip,deflate,sdch")
            .header("Accept-Language", "es-ES,es;q=0.8")
            .header("Connection", "keep-alive")
            .header("X-Requested-With", "XMLHttpRequest")
            .ignoreContentType(true)
            .requestBody(p)
            .execute();
    String body = execute.body();
    System.err.println(body);
    return body;
}
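Debugging:
- Step 1: execution stops at the breakpoint on execute()
- Step 2: after the request returns, the Content-Type in the response headers (res) is checked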

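Key point:
Jsoup requires the Content-Type of the response to be text/*, application/xml, or application/xhtml+xml. The server here returns application/json;charset=UTF-8, which matches none of them, so ignoreContentType(true) is needed to skip the check; without it, execute() throws the exception.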
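The check can be reproduced without jsoup. The following is only a minimal sketch: the class name ContentTypeCheck and the method isAccepted are made up for illustration, and the regular expression is assumed to match jsoup's private xmlContentTypeRxp for the version analysed here (it may differ in other releases).

```java
import java.util.regex.Pattern;

public class ContentTypeCheck {
    // Assumption: mirrors jsoup's private xmlContentTypeRxp; the exact pattern may vary by version
    private static final Pattern XML_CONTENT_TYPE =
            Pattern.compile("(application|text)/\\w*\\+?xml.*");

    /** Returns true when the response would be accepted, false when execute() would throw. */
    public static boolean isAccepted(String contentType, boolean ignoreContentType) {
        if (contentType == null || ignoreContentType) {
            return true; // no Content-Type header, or the check is switched off
        }
        return contentType.startsWith("text/")
                || XML_CONTENT_TYPE.matcher(contentType).matches();
    }

    public static void main(String[] args) {
        String json = "application/json;charset=UTF-8";
        System.out.println(isAccepted(json, false));                       // false
        System.out.println(isAccepted(json, true));                        // true
        System.out.println(isAccepted("text/html; charset=UTF-8", false)); // true
        System.out.println(isAccepted("application/xhtml+xml", false));    // true
    }
}
```

The JSON response only passes when the second argument is true, which is what ignoreContentType(true) achieves on the real Connection.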
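Source analysis:
Some notes are written as comments in the code. Focus on the commented lines: the serialization of data into the URL, the writePost call, the Content-Type check, and the request-body write inside writePost.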
static HttpConnection.Response execute(org.jsoup.Connection.Request req, HttpConnection.Response previousResponse) throws IOException {
Validate.notNull(req, "Request must not be null");
String protocol = req.url().getProtocol();
if (!protocol.equals("http") && !protocol.equals("https")) {
throw new MalformedURLException("Only http & https protocols supported");
} else {
boolean methodHasBody = req.method().hasBody(); // whether the HTTP method allows a request body (e.g. POST, PUT)
boolean hasRequestBody = req.requestBody() != null; // whether a raw request body was set
if (!methodHasBody) {
Validate.isFalse(hasRequestBody, "Cannot set a request body for HTTP method " + req.method());
}
String mimeBoundary = null;
if (req.data().size() > 0 && (!methodHasBody || hasRequestBody)) {
// data is present and either the method has no body or a raw request body is set: serialize the data into the URL query string
serialiseRequestUrl(req);
} else if (methodHasBody) {
mimeBoundary = setOutputContentType(req);
}
HttpURLConnection conn = createConnection(req);
HttpConnection.Response res;
try {
conn.connect();
if (conn.getDoOutput()) {
// write the request payload to the connection; see the writePost method below
writePost(req, conn.getOutputStream(), mimeBoundary);
}
int status = conn.getResponseCode();
res = new HttpConnection.Response(previousResponse);
res.setupFromConnection(conn, previousResponse);
res.req = req;
String contentType;
if (res.hasHeader("Location") && req.followRedirects()) {
if (status != 307) {
req.method(Method.GET);
req.data().clear();
}
contentType = res.header("Location");
if (contentType != null && contentType.startsWith("http:/") && contentType.charAt(6) != '/') {
contentType = contentType.substring(6);
}
URL redir = StringUtil.resolve(req.url(), contentType);
req.url(HttpConnection.encodeUrl(redir));
Iterator var11 = res.cookies.entrySet().iterator();
while(var11.hasNext()) {
Entry<String, String> cookie = (Entry)var11.next();
req.cookie((String)cookie.getKey(), (String)cookie.getValue());
}
HttpConnection.Response var21 = execute(req, res);
return var21;
}
if ((status < 200 || status >= 400) && !req.ignoreHttpErrors()) {
throw new HttpStatusException("HTTP error fetching URL", status, req.url().toString());
}
// check whether the response Content-Type is one that jsoup accepts
contentType = res.contentType();
if (contentType != null && !req.ignoreContentType() && !contentType.startsWith("text/") && !xmlContentTypeRxp.matcher(contentType).matches()) {
throw new UnsupportedMimeTypeException("Unhandled content type. Must be text/*, application/xml, or application/xhtml+xml", contentType, req.url().toString());
}
if (contentType != null && xmlContentTypeRxp.matcher(contentType).matches() && req instanceof HttpConnection.Request && !((HttpConnection.Request)req).parserDefined) {
req.parser(Parser.xmlParser());
}
res.charset = DataUtil.getCharsetFromContentType(res.contentType);
if (conn.getContentLength() != 0 && req.method() != Method.HEAD) {
Object bodyStream = null;
try {
bodyStream = conn.getErrorStream() != null ? conn.getErrorStream() : conn.getInputStream();
if (res.hasHeaderWithValue("Content-Encoding", "gzip")) {
bodyStream = new GZIPInputStream((InputStream)bodyStream);
}
res.byteData = DataUtil.readToByteBuffer((InputStream)bodyStream, req.maxBodySize());
} finally {
if (bodyStream != null) {
((InputStream)bodyStream).close();
}
}
} else {
res.byteData = DataUtil.emptyByteBuffer();
}
} finally {
conn.disconnect();
}
res.executed = true;
return res;
}
}
private static void writePost(org.jsoup.Connection.Request req, OutputStream outputStream, String bound) throws IOException {
Collection<org.jsoup.Connection.KeyVal> data = req.data();
BufferedWriter w = new BufferedWriter(new OutputStreamWriter(outputStream, req.postDataCharset()));
if (bound != null) {
for(Iterator var5 = data.iterator(); var5.hasNext(); w.write("\r\n")) {
org.jsoup.Connection.KeyVal keyVal = (org.jsoup.Connection.KeyVal)var5.next();
w.write("--");
w.write(bound);
w.write("\r\n");
w.write("Content-Disposition: form-data; name=\"");
w.write(HttpConnection.encodeMimeName(keyVal.key()));
w.write("\"");
if (keyVal.hasInputStream()) {
w.write("; filename=\"");
w.write(HttpConnection.encodeMimeName(keyVal.value()));
w.write("\"\r\nContent-Type: application/octet-stream\r\n\r\n");
w.flush();
DataUtil.crossStreams(keyVal.inputStream(), outputStream);
outputStream.flush();
} else {
w.write("\r\n\r\n");
w.write(keyVal.value());
}
}
w.write("--");
w.write(bound);
w.write("--");
} else if (req.requestBody() != null) {
// a raw request body is set: write it to outputStream as-is; the server receives and parses it
w.write(req.requestBody());
} else {
boolean first = true;
Iterator var9 = data.iterator();
while(var9.hasNext()) {
org.jsoup.Connection.KeyVal keyVal = (org.jsoup.Connection.KeyVal)var9.next();
if (!first) {
w.append('&');
} else {
first = false;
}
w.write(URLEncoder.encode(keyVal.key(), req.postDataCharset()));
w.write(61);
w.write(URLEncoder.encode(keyVal.value(), req.postDataCharset()));
}
}
w.close();
}
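To make the last two branches of writePost easier to follow, here is a simplified sketch that leaves out the multipart case. The class BodySketch and the method buildBody are hypothetical names, not part of jsoup, and a plain Map stands in for jsoup's KeyVal collection.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

public class BodySketch {
    /** Simplified stand-in for the non-multipart branches of writePost. */
    public static String buildBody(String requestBody, Map<String, String> data, String charset) {
        if (requestBody != null) {
            return requestBody; // raw body (e.g. a JSON string) is sent unchanged
        }
        try {
            StringBuilder sb = new StringBuilder();
            boolean first = true;
            for (Map.Entry<String, String> e : data.entrySet()) {
                if (!first) {
                    sb.append('&');
                }
                first = false;
                // key=value pairs, URL-encoded, as for application/x-www-form-urlencoded
                sb.append(URLEncoder.encode(e.getKey(), charset))
                  .append('=')
                  .append(URLEncoder.encode(e.getValue(), charset));
            }
            return sb.toString();
        } catch (UnsupportedEncodingException ex) {
            throw new IllegalArgumentException("Unsupported charset: " + charset, ex);
        }
    }

    public static void main(String[] args) {
        Map<String, String> data = new LinkedHashMap<>();
        data.put("p1", "v1");
        data.put("p2", "a b");
        System.out.println(buildBody(null, data, "UTF-8"));              // p1=v1&p2=a+b
        System.out.println(buildBody("{\"p1\":\"v1\"}", data, "UTF-8")); // {"p1":"v1"}
    }
}
```

In the request from this article, requestBody(p) is set, so the JSON string goes out as-is and the form-encoding branch is never reached.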
Reference:
https://blog.csdn.net/dietime1943/article/details/78974194