Analysis of Duplicate Data Writes to HDFS Caused by a Flume Exception

Environment

flume-ng 1.6.0-cdh5.15.1

Problem Description

Flume pulls data from Kafka and writes it to HDFS. The source and channel are not within the scope of this analysis and are omitted here. Part of the sink configuration is as follows:

tier1.sinks.sink1.type=hdfs
tier1.sinks.sink1.channel=channel1
tier1.sinks.sink1.hdfs.path=hdfs://xxx/xxx/day=%Y%m%d
tier1.sinks.sink1.hdfs.filePrefix=xxx.log
tier1.sinks.sink1.hdfs.fileType=DataStream
tier1.sinks.sink1.hdfs.closeTries=3
tier1.sinks.sink1.hdfs.round=true
tier1.sinks.sink1.hdfs.roundValue=10
tier1.sinks.sink1.hdfs.roundUnit=minute
tier1.sinks.sink1.hdfs.rollSize=128000000
tier1.sinks.sink1.hdfs.rollCount=0
tier1.sinks.sink1.hdfs.rollInterval=0
tier1.sinks.sink1.hdfs.idleTimeout=1800
tier1.sinks.sink1.hdfs.callTimeout=7200000
tier1.sinks.sink1.hdfs.threadsPoolSize=10
tier1.sinks.sink1.hdfs.rollTimerPoolSize=10
tier1.sinks.sink1.hdfs.batchSize=10000

The permissions of one directory in production were changed, so Flume could no longer write data into that directory, and it wrote duplicate data into the previous directory.
[Figure: flume问题-1]
As the figure above also shows, Flume did not continue consuming data from Kafka normally, which was confirmed by checking the Kafka offsets.

Problem Analysis

First, let's look at how Flume's HDFS sink handles consumed data in the normal case. Below is the process() method of the HDFSEventSink class; we also need to pay attention to the sfWriters member variable:

  private final Object sfWritersLock = new Object();
  private WriterLinkedHashMap sfWriters;

  /**
   * Pull events out of channel and send it to HDFS. Take at most batchSize
   * events per Transaction. Find the corresponding bucket for the event.
   * Ensure the file is open. Serialize the data and write it to the file on
   * HDFS. <br/>
   * This method is not thread safe.
   */
  public Status process() throws EventDeliveryException {
    Channel channel = getChannel();
    Transaction transaction = channel.getTransaction();
    List<BucketWriter> writers = Lists.newArrayList();
    transaction.begin();
    try {
      int txnEventCount = 0;
      for (txnEventCount = 0; txnEventCount < batchSize; txnEventCount++) {
        Event event = channel.take();
        if (event == null) {
          break;
        }

        // reconstruct the path name by substituting place holders
        String realPath = BucketPath.escapeString(filePath, event.getHeaders(),
            timeZone, needRounding, roundUnit, roundValue, useLocalTime);
        String realName = BucketPath.escapeString(fileName, event.getHeaders(),
          timeZone, needRounding, roundUnit, roundValue, useLocalTime);

        String lookupPath = realPath + DIRECTORY_DELIMITER + realName;
        BucketWriter bucketWriter;
        HDFSWriter hdfsWriter = null;
        // Callback to remove the reference to the bucket writer from the
        // sfWriters map so that all buffers used by the HDFS file
        // handles are garbage collected.
        WriterCallback closeCallback = new WriterCallback() {
          @Override
          public void run(String bucketPath) {
            LOG.info("Writer callback called.");
            synchronized (sfWritersLock) {
              sfWriters.remove(bucketPath);
            }
          }
        };
        synchronized (sfWritersLock) {
          bucketWriter = sfWriters.get(lookupPath);
          // we haven't seen this file yet, so open it and cache the handle
          if (bucketWriter == null) {
            hdfsWriter = writerFactory.getWriter(fileType);
            bucketWriter = initializeBucketWriter(realPath, realName,
              lookupPath, hdfsWriter, closeCallback);
            sfWriters.put(lookupPath, bucketWriter);
          }
        }

        // track the buckets getting written in this transaction
        if (!writers.contains(bucketWriter)) {
          writers.add(bucketWriter);
        }

        // Write the data to HDFS
        try {
          bucketWriter.append(event);
        } catch (BucketClosedException ex) {
          LOG.info("Bucket was closed while trying to append, " +
            "reinitializing bucket and writing event.");
          hdfsWriter = writerFactory.getWriter(fileType);
          bucketWriter = initializeBucketWriter(realPath, realName,
            lookupPath, hdfsWriter, closeCallback);
          synchronized (sfWritersLock) {
            sfWriters.put(lookupPath, bucketWriter);
          }
          bucketWriter.append(event);
        }
      }

      if (txnEventCount == 0) {
        sinkCounter.incrementBatchEmptyCount();
      } else if (txnEventCount == batchSize) {
        sinkCounter.incrementBatchCompleteCount();
      } else {
        sinkCounter.incrementBatchUnderflowCount();
      }

      // flush all pending buckets before committing the transaction
      for (BucketWriter bucketWriter : writers) {
        bucketWriter.flush();
      }

      transaction.commit();

      if (txnEventCount < 1) {
        return Status.BACKOFF;
      } else {
        sinkCounter.addToEventDrainSuccessCount(txnEventCount);
        return Status.READY;
      }
    } catch (IOException eIO) {
      transaction.rollback();
      LOG.warn("HDFS IO error", eIO);
      return Status.BACKOFF;
    } catch (Throwable th) {
      transaction.rollback();
      LOG.error("process failed", th);
      if (th instanceof Error) {
        throw (Error) th;
      } else {
        throw new EventDeliveryException(th);
      }
    } finally {
      transaction.close();
    }
  }

The sink's process() does the following:

  • gets the channel
  • manages the transaction
  • maintains BucketWriters and writes data to HDFS through them
  • handles exceptions and records/updates status

Flume implements its own Transaction mechanism. In channel.take(), each event removed from the channel is also stored in a temporary ArrayDeque, so that the data can be rolled back into the channel if an exception occurs.
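
As a rough, hedged sketch of that mechanism (this is not Flume's MemoryChannel/FileChannel code; the class and field names below are made up for illustration), a transaction that can roll back taken events only needs to remember what it handed out:

import java.util.ArrayDeque;
import java.util.Deque;

// Minimal illustrative sketch: a transaction that remembers the events it has
// taken from the channel's queue so they can be pushed back on rollback.
// SimpleTransaction and the String "events" are stand-ins, not Flume classes.
public class SimpleTransaction {

  private final Deque<String> queue;                          // the channel's event queue
  private final Deque<String> takeList = new ArrayDeque<>();  // events taken in this transaction

  public SimpleTransaction(Deque<String> queue) {
    this.queue = queue;
  }

  // take() removes an event from the channel but keeps a reference for rollback.
  public String take() {
    String event = queue.pollFirst();
    if (event != null) {
      takeList.addLast(event);
    }
    return event;
  }

  // commit() forgets the taken events; they are now the sink's responsibility.
  public void commit() {
    takeList.clear();
  }

  // rollback() returns the taken events to the head of the channel queue,
  // preserving their original order.
  public void rollback() {
    while (!takeList.isEmpty()) {
      queue.addFirst(takeList.pollLast());
    }
  }
}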

BucketWriter is the class the sink uses to write data to HDFS. It wraps a series of I/O operations such as open(), close(), flush(), append() and sync(). Underneath it sits a SequenceFile.Writer or an FSDataOutputStream; Flume wraps these in several HDFSWriter implementations, and HDFSWriterFactory decides which concrete writer to create based on the hdfs.fileType property.
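
The mapping from hdfs.fileType to a concrete writer can be pictured roughly as follows. This is only a hedged sketch, not the real HDFSWriterFactory: the three fileType values are the ones documented for the HDFS sink, while StubWriter is a placeholder for the real HDFSWriter implementations:

import java.io.IOException;

// Minimal sketch of how a factory might map the hdfs.fileType setting to a writer.
// The fileType strings match Flume's documented values; the stub writer below is a
// placeholder, not Flume's real HDFSSequenceFile / HDFSDataStream /
// HDFSCompressedDataStream implementations.
public class HdfsWriterFactorySketch {

  public interface HdfsWriter {
    void open(String filePath) throws IOException;
    void append(byte[] eventBody) throws IOException;
    void close() throws IOException;
  }

  // Shared stub so the sketch compiles; the real writers wrap SequenceFile.Writer
  // or FSDataOutputStream and actually talk to HDFS.
  private static class StubWriter implements HdfsWriter {
    private final String kind;
    StubWriter(String kind) { this.kind = kind; }
    public void open(String filePath) { System.out.println(kind + " open " + filePath); }
    public void append(byte[] eventBody) { /* write bytes */ }
    public void close() { System.out.println(kind + " close"); }
  }

  public static HdfsWriter getWriter(String fileType) {
    switch (fileType) {
      case "SequenceFile":     return new StubWriter("SequenceFile");      // SequenceFile.Writer underneath
      case "DataStream":       return new StubWriter("DataStream");        // plain FSDataOutputStream
      case "CompressedStream": return new StubWriter("CompressedStream");  // compressed output stream
      default:
        throw new IllegalArgumentException("Unsupported hdfs.fileType: " + fileType);
    }
  }
}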

sfWriters in HDFSEventSink holds the BucketWriter corresponding to each file path.
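
sfWriters is an instance of the sink's inner WriterLinkedHashMap class: roughly speaking, an access-ordered LinkedHashMap that evicts and closes the least recently used BucketWriter once more than hdfs.maxOpenFiles writers are cached. A hedged sketch (the real inner class adds logging):

import java.io.IOException;
import java.util.LinkedHashMap;
import java.util.Map;

// Stub standing in for org.apache.flume.sink.hdfs.BucketWriter in this sketch.
interface BucketWriterStub {
  void close() throws IOException, InterruptedException;
}

// Hedged sketch of HDFSEventSink's inner WriterLinkedHashMap: an access-ordered
// LinkedHashMap that evicts and closes the least recently used writer once more
// than hdfs.maxOpenFiles writers are cached.
class WriterLinkedHashMapSketch extends LinkedHashMap<String, BucketWriterStub> {

  private final int maxOpenFiles;

  WriterLinkedHashMapSketch(int maxOpenFiles) {
    super(16, 0.75f, true);          // accessOrder = true -> LRU iteration order
    this.maxOpenFiles = maxOpenFiles;
  }

  @Override
  protected boolean removeEldestEntry(Map.Entry<String, BucketWriterStub> eldest) {
    if (size() > maxOpenFiles) {
      try {
        eldest.getValue().close();   // close the evicted writer's HDFS file
      } catch (IOException e) {
        // the real code logs the failure and still evicts the entry
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
      }
      return true;                   // evict
    }
    return false;
  }
}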

Combining this with the cause of the incident described above, we know that once the directory permissions were changed, the BucketWriter could no longer work properly, so an IOException was thrown. After process() catches that exception, it rolls back the channel transaction. However, the rollback does nothing about sfWriters, and that is exactly where the root of the problem lies.

Next, let's look at how SinkRunner runs the sink:

    @Override
    public void run() {
      logger.debug("Polling sink runner starting");

      while (!shouldStop.get()) {
        try {
          if (policy.process().equals(Sink.Status.BACKOFF)) {
            counterGroup.incrementAndGet("runner.backoffs");

            Thread.sleep(Math.min(
                counterGroup.incrementAndGet("runner.backoffs.consecutive")
                * backoffSleepIncrement, maxBackoffSleep));
          } else {
            counterGroup.set("runner.backoffs.consecutive", 0L);
          }
        } catch (InterruptedException e) {
          logger.debug("Interrupted while processing an event. Exiting.");
          counterGroup.incrementAndGet("runner.interruptions");
        } catch (Exception e) {
          logger.error("Unable to deliver event. Exception follows.", e);
          if (e instanceof EventDeliveryException) {
            counterGroup.incrementAndGet("runner.deliveryErrors");
          } else {
            counterGroup.incrementAndGet("runner.errors");
          }
          try {
            Thread.sleep(maxBackoffSleep);
          } catch (InterruptedException ex) {
            Thread.currentThread().interrupt();
          }
        }
      }
      logger.debug("Polling runner exiting. Metrics:{}", counterGroup);
    }

  }

policy can be understood here as the corresponding sink. SinkRunner is a LifecycleAware component; whether it is start()ed or stop()ped is decided by checking its LifecycleState, and we won't analyze here how that lifecycle is maintained. When the SinkRunner is stop()ped, HDFSEventSink's stop() ends up being called; part of that stop() method is shown below:

synchronized (sfWritersLock) {
      for (Entry<String, BucketWriter> entry : sfWriters.entrySet()) {
        LOG.info("Closing {}", entry.getKey());

        try {
          entry.getValue().close();
        } catch (Exception ex) {
          LOG.warn("Exception while closing " + entry.getKey() + ". " +
                  "Exception follows.", ex);
          if (ex instanceof InterruptedException) {
            Thread.currentThread().interrupt();
          }
        }
      }
    }

Here we see sfWriters once again. When stop() runs, it close()s the BucketWriters held in sfWriters. In other words, the writers whose target directories still had correct permissions, i.e. the ones that could still open() normally, were still sitting in sfWriters and now get close()d. close() calls flush(), so the data that had previously been append()ed (but rolled back at the transaction level) is flushed and written into the file as if nothing had gone wrong. Since those rolled-back events are later taken from the channel and written again, each of them ends up on HDFS twice, which is exactly the situation we observed: the data in the 20190509 directory was all duplicated.
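
To make that sequence concrete, here is a self-contained toy reproduction (heavily simplified and hedged; none of these classes are Flume's): buffered appends survive the rollback, get flushed when the sink is stopped, and the rolled-back events are then consumed and written again after the restart, so each event lands in the target directory twice.

import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Toy reproduction of the duplicate-write sequence described above.
public class DuplicateWriteDemo {

  // Toy stand-in for BucketWriter: append() buffers, close() flushes to the "file".
  static class ToyBucketWriter {
    final List<String> buffer = new ArrayList<>();
    final List<String> file = new ArrayList<>();   // pretend this is the HDFS file
    void append(String event) { buffer.add(event); }
    void close() { file.addAll(buffer); buffer.clear(); }  // close() flushes
  }

  public static void main(String[] args) {
    Deque<String> channel = new ArrayDeque<>(Arrays.asList("e1", "e2", "e3"));
    ToyBucketWriter writerForOldDir = new ToyBucketWriter(); // stays cached in "sfWriters"

    // 1) process(): take events and append them to the writer for the old directory.
    Deque<String> takeList = new ArrayDeque<>();
    while (!channel.isEmpty()) {
      String e = channel.pollFirst();
      takeList.addLast(e);
      writerForOldDir.append(e);
    }

    // 2) Appending to the new directory fails (permission denied) -> IOException
    //    -> transaction rollback: events go back to the channel, but the writer
    //    and its buffered appends stay cached.
    while (!takeList.isEmpty()) {
      channel.addFirst(takeList.pollLast());
    }

    // 3) HDFSEventSink.stop() closes every cached writer; close() flushes the
    //    uncommitted appends into the file.
    writerForOldDir.close();

    // 4) After a restart the rolled-back events are consumed and written again.
    //    In reality a new writer/file is created in the same directory; we reuse
    //    the same object here only to collect both copies in one place.
    ToyBucketWriter writerAfterRestart = writerForOldDir;
    while (!channel.isEmpty()) {
      writerAfterRestart.append(channel.pollFirst());
    }
    writerAfterRestart.close();

    // Each event now appears twice in the target "directory".
    System.out.println(writerAfterRestart.file); // [e1, e2, e3, e1, e2, e3]
  }
}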

At this point the analysis is complete.

Summary

This analysis was driven by a production incident, so the goal was to find the cause of that specific problem rather than to dig deeply into Flume's overall runtime machinery. The above may therefore contain mistakes; comments and corrections are welcome.

In a later Flume release (flume-1.9), close() was changed, but judging from the commit description this is not a fix for the problem discussed here.

  /**
   * Close the file handle and rename the temp file to the permanent filename.
   * Safe to call multiple times. Logs HDFSWriter.close() exceptions.
   */
  public void close(boolean callCloseCallback, boolean immediate)
      throws InterruptedException {
    if (callCloseCallback) {
      if (closed.compareAndSet(false, true)) {
        runCloseAction(); //remove from the cache as soon as possible
      } else {
        LOG.warn("This bucketWriter is already closing or closed.");
      }
    }
    doClose(immediate);
  }

commit:

FLUME-3080. Close failure in HDFS Sink might cause data loss

If the HDFS Sink tries to close a file but it fails (e.g. due to timeout) the last block might not end up in COMPLETE state. In this case block recovery should happen but as the lease is still held by Flume the NameNode will start the recovery process only after the hard limit of 1 hour expires.

This change adds an explicit recoverLease() call in case of close failure.
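
For reference, here is a hedged sketch of what an explicit lease recovery looks like with the HDFS client API. DistributedFileSystem#recoverLease is a real Hadoop API, but this snippet only illustrates the idea behind the commit; it is not the FLUME-3080 patch itself.

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hdfs.DistributedFileSystem;

// Illustrative sketch: if closing an HDFS file fails, ask the NameNode to recover
// the lease explicitly instead of waiting for the ~1 hour hard limit to expire.
public class LeaseRecoverySketch {

  public static void recoverLeaseIfPossible(Configuration conf, String bucketPath) {
    try {
      Path path = new Path(bucketPath);
      FileSystem fs = path.getFileSystem(conf);
      if (fs instanceof DistributedFileSystem) {
        // returns true if the file is already closed or the lease was recovered
        boolean closed = ((DistributedFileSystem) fs).recoverLease(path);
        System.out.println("recoverLease(" + bucketPath + ") -> " + closed);
      }
    } catch (Exception e) {
      // best effort: log and move on, as a close-failure path in a sink would do
      e.printStackTrace();
    }
  }
}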