Recently I used Apache HttpClient to send data to a server and found that the Chinese text arrived garbled. My code looked roughly like this:
MultipartEntityBuilder mb = MultipartEntityBuilder.create();
mb.setCharset(Charset.forName("UTF-8"));
// Set the form fields
mb.addPart("name", new StringBody("张三", ContentType.TEXT_PLAIN));
mb.addPart("addr", new StringBody("XXX市XX区", ContentType.TEXT_PLAIN));
((HttpPost) request).setEntity(mb.build());
// Send the request
response = client.execute(request);
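After some research, I found that the problem lies in the charset of ContentType.TEXT_PLAIN. Even though I had already called setCharset(Charset.forName("UTF-8")) on the builder, each StringBody encodes its text with the charset of the ContentType passed to it, so the charset of ContentType.TEXT_PLAIN still matters when adding the parts.
The source code shows the charsets of several common ContentType constants: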
public static final ContentType APPLICATION_FORM_URLENCODED = create("application/x-www-form-urlencoded", Consts.ISO_8859_1);
public static final ContentType APPLICATION_JSON = create("application/json", Consts.UTF_8);
public static final ContentType APPLICATION_XML = create("application/xml", Consts.ISO_8859_1);
public static final ContentType MULTIPART_FORM_DATA = create("multipart/form-data", Consts.ISO_8859_1);
public static final ContentType TEXT_HTML = create("text/html", Consts.ISO_8859_1);
public static final ContentType TEXT_PLAIN = create("text/plain", Consts.ISO_8859_1);
public static final ContentType TEXT_XML = create("text/xml", Consts.ISO_8859_1);
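As the code above shows, most of these types use ISO_8859_1, including the TEXT_PLAIN type used in my code. ISO-8859-1 cannot represent Chinese characters, which is why the Chinese text was garbled.
There are several ways to fix this, and all of them come down to the same thing: create a ContentType whose charset is UTF-8. For example: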
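The effect of an ISO-8859-1 charset on Chinese text can be reproduced without HttpClient at all. The following is only a minimal sketch using the JDK's own charset classes; the class name Latin1Demo and its helper methods are made up for illustration and are not part of HttpClient. It encodes a string to bytes and decodes it again, first with ISO-8859-1 and then with UTF-8:

```java
import java.nio.charset.StandardCharsets;

public class Latin1Demo {
    // Hypothetical helper: encode with ISO-8859-1 and decode again.
    // Characters outside ISO-8859-1 are replaced by '?' during encoding.
    static String roundTripLatin1(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.ISO_8859_1);
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    // Hypothetical helper: the same round trip with UTF-8, which is lossless.
    static String roundTripUtf8(String text) {
        byte[] bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The two-character name from the example above, written as Unicode
        // escapes so the source file's own encoding does not matter.
        String name = "\u5f20\u4e09";
        System.out.println(roundTripLatin1(name)); // prints "??"
        System.out.println(roundTripUtf8(name).equals(name)); // prints "true"
    }
}
```

Once the characters have been replaced by '?' on the client side, the server cannot recover them, whatever charset it uses to decode.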
// Method 1 (the one I used)
mb.addPart("name", new StringBody("张三", ContentType.TEXT_PLAIN.withCharset("UTF-8")));
// Method 2 (untested)
mb.addPart("name", new StringBody("张三", ContentType.create("text/plain", MIME.UTF8_CHARSET)));
// Method 3 (untested)
mb.addPart("name", new StringBody("张三", ContentType.create("text/plain", Charset.forName("UTF-8"))));
// Method 4 (untested)
mb.addPart("name", new StringBody("张三", ContentType.create(HTTP.PLAIN_TEXT_TYPE, HTTP.UTF_8)));
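If it is unclear whether a given charset is safe for a particular value, one option is to ask the JDK before sending. This is only a sketch under my own assumptions: the class CharsetCheck and the method canEncode are hypothetical names, and the check uses the standard CharsetEncoder rather than anything from HttpClient.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class CharsetCheck {
    // Hypothetical helper: true if every character of text can be
    // represented in the given charset without replacement.
    static boolean canEncode(String text, Charset charset) {
        return charset.newEncoder().canEncode(text);
    }

    public static void main(String[] args) {
        // The name from the example above, as Unicode escapes.
        String name = "\u5f20\u4e09";
        System.out.println(canEncode(name, StandardCharsets.ISO_8859_1)); // prints "false"
        System.out.println(canEncode(name, StandardCharsets.UTF_8));      // prints "true"
    }
}
```

A check like this could be run against the charset of the ContentType that is about to be passed to StringBody, to catch a lossy charset before the request goes out.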