flume http source windows 中文乱码

flume HTTPSource默认的是utf-8的编码,flume服务用在linux机器上都是没问题的,但当把flume服务部署在windows机器上的时候,会出现乱码问题 。从源码检查乱码出现在那里:

1.在刚接收到请求的时候,打印requestBody里的内容是正常的,数据正常。

2.数据存到event里之后,打印event里的内容,出现乱码。

源码里有个JSONHandler的类处理了request的请求,然后将request的内容放到了event里。代码里都是用“utf-8”进行编码的,而windows机器默认的编码是“GBK”,猜测是这里的问题,将编码统一换成“GBK”中文不乱码了。将源码改动了下边两个地方。

 

package org.apache.flume.source.http;

import com.google.gson.Gson;
import com.google.gson.GsonBuilder;
import com.google.gson.JsonSyntaxException;
import com.google.gson.reflect.TypeToken;
import org.apache.flume.Context;
import org.apache.flume.Event;
import org.apache.flume.event.EventBuilder;
import org.apache.flume.event.JSONEvent;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

import javax.servlet.ServletInputStream;
import javax.servlet.http.HttpServletRequest;
import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.lang.reflect.Type;
import java.nio.charset.Charset;
import java.nio.charset.UnsupportedCharsetException;
import java.util.ArrayList;
import java.util.List;

public class JSONHandlerTest implements HTTPSourceHandler {

  private static final Logger LOG = LoggerFactory.getLogger(JSONHandlerTest.class);
  private final Type listType = new TypeToken<List<JSONEvent>>() {}.getType();
  private final Gson gson;

  public JSONHandlerTest() {
    gson = new GsonBuilder().disableHtmlEscaping().create();
  }

  /**
   * {@inheritDoc}
   */
  @Override
  public List<Event> getEvents(HttpServletRequest request) throws Exception {
    BufferedReader reader = request.getReader();
    String charset = request.getCharacterEncoding();
    //UTF-8 is default for JSON. If no charset is specified, UTF-8 is to
    //be assumed.
    if (charset == null) {
      LOG.debug("Charset is null, default charset of UTF-8 will be used.");
      charset = "UTF-8";
    } else if (!(charset.equalsIgnoreCase("utf-8")
            || charset.equalsIgnoreCase("utf-16")
            || charset.equalsIgnoreCase("utf-32"))) {
      LOG.error("Unsupported character set in request {}. "
              + "JSON handler supports UTF-8, "
              + "UTF-16 and UTF-32 only.", charset);
      throw new UnsupportedCharsetException("JSON handler supports UTF-8, "
              + "UTF-16 and UTF-32 only.");
    }

    /*
     * Gson throws Exception if the data is not parseable to JSON.
     * Need not catch it since the source will catch it and return error.
     */
    List<Event> eventList = new ArrayList<Event>(0);
    try {
      eventList = gson.fromJson(reader, listType);
    } catch (JsonSyntaxException ex) {
      throw new HTTPBadRequestException("Request has invalid JSON Syntax.", ex);
    }

    for (Event e : eventList) {
      ((JSONEvent) e).setCharset("GBK");
    }
    return getSimpleEvents(eventList);
  }

  @Override
  public void configure(Context context) {
  }

  private List<Event> getSimpleEvents(List<Event> events) {
    Charset ch=Charset.forName("GBK");
    List<Event> newEvents = new ArrayList<Event>(events.size());
    for (Event e:events) {
//生成event的时候,用“GBK”编码。
      newEvents.add(EventBuilder.withBody(new String (e.getBody()),ch,e.getHeaders()));
    }
    return newEvents;
  }
}

个人观点,欢迎指正。

  • 0
    点赞
  • 1
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值