Flume agent resends events repeatedly (data surge and inconsistent counts)

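       I have been using Flume for data collection for quite a while without any data discrepancies, but today an import suddenly showed a data surge. The raw data file held only a few hundred thousand records, yet several million events had been collected, and the count was still climbing. My first guess was a plugin update, but after switching the Collector's sink to the plain console sink the problem was still there. Clearing the configuration data and restarting did not fix it either.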

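      Comparing the raw data with earlier data, I found events with extremely large content, caused by an error on the business side. So I suspected that oversized events were the problem, read the source code, and noticed something odd: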

public class SyslogUdpSource extends EventSource.Base {
  static final Logger LOG = LoggerFactory.getLogger(SyslogUdpSource.class);
  final public static int SYSLOG_UDP_PORT = 514;
  int port = SYSLOG_UDP_PORT; // default udp syslog port
  int maxsize = 1 << 16; // 64k is max allowable in RFC 5426

  long rejects = 0;
  DatagramSocket sock;

  public SyslogUdpSource() {
  }

  public SyslogUdpSource(int port) {
    this.port = port;
  }

  @Override
  public void close() throws IOException {
    LOG.info("closing SyslogUdpSource on port " + port);
    if (sock == null) {
      LOG.warn("double close of SyslogUdpSocket on udp:" + port
          + " , (this is ok but odd)");
      return;
    }

    sock.close();
  }

  @Override
  public Event next() throws IOException {
    byte[] buf = new byte[maxsize];
    DatagramPacket pkt = new DatagramPacket(buf, maxsize);
    Event e = null;
    do { // loop until we get a valid packet, drop bad ones.
      sock.receive(pkt);

      ByteBuffer bb = ByteBuffer.wrap(buf, 0, pkt.getLength());
      ByteBufferInputStream bbis = new ByteBufferInputStream(bb);
      DataInputStream in = new DataInputStream(bbis);
      try {
        e = SyslogWireExtractor.extractEvent(in);
      } catch (EventExtractException ex) {
        rejects++;
        LOG.warn(rejects + " rejected packets. packet: " + pkt, ex);
        LOG.debug("raw bytes " + Arrays.toString(pkt.getData()));
        // TODO (jon) maybe have a hook here to do something with rejects
      }

      // need a sane way to fall out of this loop.
    } while (e == null);

    updateEventProcessingStats(e);
    return e;
  }

  @Override
  public void open() throws IOException {
    sock = new DatagramSocket(port);
  }

  public static SourceBuilder builder() {
    return new SourceBuilder() {

      @Override
      public EventSource build(Context ctx, String... argv) {
        int port = SYSLOG_UDP_PORT; // default udp port, need root permissions
        // for this.
        if (argv.length > 1) {
          throw new IllegalArgumentException("usage: syslogUdp([port no]) ");
        }

        if (argv.length == 1) {
          port = Integer.parseInt(argv[0]);
        }

        return new SyslogUdpSource(port);
      }

    };
  }

}

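Note the comment on maxsize: "64k is max allowable in RFC 5426". This is the limit on event size that the source takes from RFC 5426, and the receive buffer is sized to exactly that value.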
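With that limit in mind, one way to confirm that oversized records are present is to count the records in the raw data whose encoded size exceeds the buffer size. The following is a minimal sketch, not part of Flume: the class name OversizedRecordScan and the method countOversized are hypothetical, and it assumes one record per line encoded as UTF-8.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;

// Hypothetical helper, not part of Flume.
class OversizedRecordScan {
  // Same value as maxsize in SyslogUdpSource (1 << 16 bytes).
  static final int LIMIT_BYTES = 1 << 16;

  // Counts the records whose UTF-8 encoding is longer than limitBytes.
  static long countOversized(List<String> records, int limitBytes) {
    return records.stream()
        .filter(r -> r.getBytes(StandardCharsets.UTF_8).length > limitBytes)
        .count();
  }
}
```

For a real file, the list could come from java.nio.file.Files.readAllLines on the raw data file, passing OversizedRecordScan.LIMIT_BYTES as the limit. A non-zero count would point to the business-side records that need attention.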

 

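After looking through various material online, it turns out that when an event is larger than 64K, the Flume agent sends that event again and again, which causes the data surge (I still need to verify the detailed cause when I have time). So the size of each event should be kept under control on the business side.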
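Controlling the event size on the business side could look like the sketch below, which trims a message to a byte budget before it is handed to the sender. This is an illustration under assumptions, not the fix used in production: the class EventSizeGuard and the method truncateUtf8 are hypothetical names, messages are assumed to be UTF-8, and the 65507-byte constant is the largest UDP payload over IPv4 (65535 minus the 20-byte IP header and the 8-byte UDP header). A real budget would need to be smaller to leave room for the syslog header.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of Flume.
class EventSizeGuard {
  // Assumed upper bound: largest UDP payload over IPv4 (65535 - 20 - 8).
  static final int MAX_UDP_PAYLOAD = 65507;

  // Returns msg unchanged if its UTF-8 encoding fits in maxBytes,
  // otherwise the longest prefix that fits without splitting a character.
  static String truncateUtf8(String msg, int maxBytes) {
    byte[] bytes = msg.getBytes(StandardCharsets.UTF_8);
    if (bytes.length <= maxBytes) {
      return msg;
    }
    int end = maxBytes;
    // bytes[end] is the first excluded byte. If it is a continuation byte
    // (10xxxxxx), the cut falls inside a character, so step back to its lead byte.
    while (end > 0 && (bytes[end] & 0xC0) == 0x80) {
      end--;
    }
    return new String(bytes, 0, end, StandardCharsets.UTF_8);
  }
}
```

Truncation silently loses data, so in practice it may be better to log or reject the oversized record and fix the business-side error that produced it.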

 

 


 
