Use case
Data is fetched in bulk from a Hive data warehouse and sent to a server from inside a UDF, using HttpURLConnection.
Problem
Some of the Chinese text received by the server is garbled.
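Investigation
- 1. Queried the MySQL database: the source data is not garbled and is encoded as UTF-8.
- 2. Queried the Hive data warehouse: the data is not garbled and is encoded as UTF-8.
- 3. Preliminary conclusion: the garbling happens during the HttpURLConnection call.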
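One plausible explanation for step 3 is that the request body is encoded with a charset other than UTF-8 before it reaches the wire. The sketch below illustrates that effect in isolation; the class name `CharsetDemo` and the method `roundTrip` are hypothetical, and ISO-8859-1 merely stands in for whatever default charset a worker node might have.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CharsetDemo {
    // Encode the text with the client's charset, then decode the bytes as UTF-8,
    // which is what a server expecting UTF-8 would do.
    static String roundTrip(String text, Charset clientCharset) {
        byte[] wire = text.getBytes(clientCharset);
        return new String(wire, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // "\u4e2d\u6587" is the two-character string for "Chinese", written as escapes
        // so the example does not depend on the source file encoding.
        String body = "{\"name\":\"\u4e2d\u6587\"}";

        // The charset that PrintWriter(OutputStream) would use on this JVM.
        System.out.println("default charset: " + Charset.defaultCharset());

        // UTF-8 on both sides: the text survives.
        System.out.println(roundTrip(body, StandardCharsets.UTF_8));

        // ISO-8859-1 cannot represent Chinese characters, so each one becomes '?'.
        System.out.println(roundTrip(body, StandardCharsets.ISO_8859_1));
    }
}
```

Printing `Charset.defaultCharset()` on the nodes that actually run the UDF is a quick way to check whether this explanation applies.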
Solution
Relevant code before the change:
URL url = new URL(requestUrl);
httpURLConnection = (HttpURLConnection) url.openConnection();
httpURLConnection.setRequestProperty("Content-type", "application/json");
httpURLConnection.setDoOutput(true);
httpURLConnection.setDoInput(true);
outputStream = httpURLConnection.getOutputStream();
// PrintWriter(OutputStream) encodes with the JVM default charset, which may not be UTF-8
printWriter = new PrintWriter(outputStream);
printWriter.print(body);
printWriter.flush();
printWriter.close();
Relevant code after the change:
URL url = new URL(requestUrl);
httpURLConnection = (HttpURLConnection) url.openConnection();
httpURLConnection.setRequestProperty("Content-type", "application/json;charset=UTF-8");
httpURLConnection.setDoOutput(true);
httpURLConnection.setDoInput(true);
dataOutputStream = new DataOutputStream(httpURLConnection.getOutputStream());
// Encode the body explicitly as UTF-8 instead of relying on the JVM default charset
dataOutputStream.write(body.getBytes("UTF-8"));
dataOutputStream.flush();
dataOutputStream.close();
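For reference, the fix can also be written as a small self-contained helper. This is only a sketch under assumptions: the class name `Utf8JsonPoster`, the methods `encodeBody` and `postJson`, and the use of POST are illustrative and do not come from the original UDF. It uses try-with-resources so the output stream is always closed, and `StandardCharsets.UTF_8` so no charset name has to be looked up at runtime.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

class Utf8JsonPoster {
    // Encode the body as UTF-8 bytes, independent of the JVM default charset.
    static byte[] encodeBody(String body) {
        return body.getBytes(StandardCharsets.UTF_8);
    }

    // Send the JSON body to requestUrl and return the HTTP status code.
    static int postJson(String requestUrl, String body) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(requestUrl).openConnection();
        try {
            conn.setRequestMethod("POST");
            // Tell the server which charset the bytes are in.
            conn.setRequestProperty("Content-Type", "application/json;charset=UTF-8");
            conn.setDoOutput(true);
            // try-with-resources closes the stream even if write() throws.
            try (OutputStream out = conn.getOutputStream()) {
                out.write(encodeBody(body));
                out.flush();
            }
            return conn.getResponseCode();
        } finally {
            conn.disconnect();
        }
    }
}
```

If a Writer is preferred over raw bytes, wrapping the stream as `new OutputStreamWriter(out, StandardCharsets.UTF_8)` should have the same effect, since the charset is again stated explicitly.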