Kafka Log Storage (7): LogManager's Scheduled Tasks

All logs on a broker are managed by the LogManager, which provides the ability to load, create, delete, and query Logs. It also runs periodic tasks: log flushing (log-flusher), log retention (log-retention), checkpoint updating (recovery-point-checkpoint), and log cleaning (Cleaner).
The roles of LogManager's fields:
logDirs: the set of log directories, configured via log.dirs.
ioThreads: the number of threads that perform Log loading; each log directory is assigned this many threads to load its Logs.
scheduler: a KafkaScheduler object, the thread pool used to run periodic tasks.
logs: of type Pool[TopicAndPartition, Log]; maintains the mapping from TopicAndPartition to Log, backed by a hash map.
dirLocks: a collection of FileLocks, one locking each log directory.
recoveryPointCheckpoints: of type Map[File, OffsetCheckpoint]; maps each log directory to its RecoveryPointCheckpoint file.
Scheduled Tasks in LogManager
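
These tasks are registered when the broker starts, in LogManager.startup(). The sketch below is abridged from the 0.10-era source (task and parameter names may differ slightly across versions):

  def startup() {
    if (scheduler != null) {
      // log-retention: periodically delete old segments by time and size
      scheduler.schedule("kafka-log-retention", cleanupLogs,
        delay = InitialTaskDelayMs, period = retentionCheckMs, TimeUnit.MILLISECONDS)
      // log-flusher: periodically flush dirty logs to disk
      scheduler.schedule("kafka-log-flusher", flushDirtyLogs,
        delay = InitialTaskDelayMs, period = flushCheckMs, TimeUnit.MILLISECONDS)
      // recovery-point-checkpoint: periodically persist recovery points
      scheduler.schedule("kafka-recovery-point-checkpoint", checkpointRecoveryPointOffsets,
        delay = InitialTaskDelayMs, period = flushCheckpointMs, TimeUnit.MILLISECONDS)
    }
    // the Cleaner (log compaction) runs in its own background threads
    if (cleanerConfig.enableCleaner)
      cleaner.startup()
  }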

The log-retention task cleans up by two criteria: how long a segment has lived, and the total size of the Log.

  def cleanupLogs() {
    debug("Beginning log cleanup...")
    var total = 0
    val startMs = time.milliseconds
    // logs whose cleanup.policy is compact rather than delete are skipped here
    for(log <- allLogs; if !log.config.compact) {
      debug("Garbage collecting '" + log.name + "'")
      // deletion is delegated to cleanupExpiredSegments and cleanupSegmentsToMaintainSize, which delete by time and by size respectively
      total += cleanupExpiredSegments(log) + cleanupSegmentsToMaintainSize(log)
    }
    debug("Log cleanup completed. " + total + " files deleted in " +
                  (time.milliseconds - startMs) / 1000 + " seconds")
  }

cleanupExpiredSegments deletes log segments based on how long they have existed:

  private def cleanupExpiredSegments(log: Log): Int = {
    if (log.config.retentionMs < 0)
      return 0
    val startMs = time.milliseconds
    // LogManager manages Logs, not LogSegments directly, so deleting LogSegments is delegated to the Log.
    // A LogSegment is deletable if its log file has not been modified within retentionMs.
    log.deleteOldSegments(startMs - _.lastModified > log.config.retentionMs)
  }
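
For intuition, here is a hypothetical, self-contained illustration of this age predicate; SegmentStub and the numbers are made up for the example, standing in for LogSegment:

  case class SegmentStub(baseOffset: Long, lastModified: Long)

  val retentionMs = 7L * 24 * 60 * 60 * 1000      // e.g. retention.ms = 7 days
  val now = System.currentTimeMillis()
  val segs = Seq(
    SegmentStub(0,    now - 10L * 24 * 60 * 60 * 1000),  // 10 days old -> expired
    SegmentStub(1000, now -  1L * 24 * 60 * 60 * 1000))  // 1 day old   -> kept

  // the same shape as the predicate above: delete while untouched for > retentionMs
  val expired = segs.takeWhile(s => now - s.lastModified > retentionMs)
  // expired contains only the 10-day-old segment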

  def deleteOldSegments(predicate: LogSegment => Boolean): Int = {
    lock synchronized {
      // the last entry in the segments skip list is the activeSegment
      val lastEntry = segments.lastEntry
      val deletable =
        if (lastEntry == null) Seq.empty
        // logSegments iterates the values of the segments skip list; test each LogSegment against the deletion predicate
        else logSegments.takeWhile(s => predicate(s) && (s.baseOffset != lastEntry.getValue.baseOffset || s.size > 0))
      val numToDelete = deletable.size
      if (numToDelete > 0) {
        // if every LogSegment qualifies, at least one must survive, so roll a new activeSegment first
        if (segments.size == numToDelete)
          roll()
        // delete the qualifying LogSegments
        deletable.foreach(deleteSegment(_))
      }
      numToDelete
    }
  }
  
  private def deleteSegment(segment: LogSegment) {
    info("Scheduling log segment %d for log %s for deletion.".format(segment.baseOffset, name))
    lock synchronized {
      // remove the LogSegment from the segments collection
      segments.remove(segment.baseOffset)
      // asynchronously delete its log file and index file
      asyncDeleteSegment(segment)
    }
  }
  
  private def asyncDeleteSegment(segment: LogSegment) {
    // rename the log file and index file to a .deleted suffix
    segment.changeFileSuffixes("", Log.DeletedFileSuffix)
    def deleteSeg() {
      // the deferred task that deletes the log file and index file
      info("Deleting segment %d from log %s.".format(segment.baseOffset, name))
      segment.delete()
    }
    scheduler.schedule("delete-file", deleteSeg, delay = config.fileDeleteDelayMs)
  }
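
The rename-then-delete pattern can be shown in isolation with plain Java APIs (the names below are hypothetical, not Kafka's): the cheap rename happens immediately, while the slow physical deletion is pushed onto a scheduler thread after fileDeleteDelayMs.

  import java.io.File
  import java.util.concurrent.{Executors, TimeUnit}

  val deleteScheduler = Executors.newSingleThreadScheduledExecutor()

  def asyncDelete(file: File, delayMs: Long): Unit = {
    // step 1: rename to *.deleted so the segment disappears from normal lookups
    val marked = new File(file.getPath + ".deleted")
    file.renameTo(marked)
    // step 2: physically delete later, off the critical path
    deleteScheduler.schedule(new Runnable {
      def run(): Unit = marked.delete()
    }, delayMs, TimeUnit.MILLISECONDS)
  }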

cleanupSegmentsToMaintainSize compares the current size of the Log against the retention.bytes setting to decide whether LogSegments should be deleted.
 

  private def cleanupSegmentsToMaintainSize(log: Log): Int = {
    if(log.config.retentionSize < 0 || log.size < log.config.retentionSize)
      return 0
    // compute how many bytes must be deleted
    var diff = log.size - log.config.retentionSize
    // decide whether this LogSegment should be deleted
    def shouldDelete(segment: LogSegment) = {
      if(diff - segment.size >= 0) {
        diff -= segment.size
        true
      } else {
        false
      }
    }
    log.deleteOldSegments(shouldDelete)
  }
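
A quick worked example with made-up numbers: suppose the Log holds 1200 MB, retention.bytes is 1000 MB, and the oldest segments are 100 MB each, so 200 MB must be shed:

  var diff = 1200L - 1000L                        // 200 MB to free
  val segmentSizes = Seq(100L, 100L, 100L, 900L)  // oldest first

  val deleted = segmentSizes.takeWhile { size =>
    if (diff - size >= 0) { diff -= size; true } else false
  }
  // deleted == Seq(100, 100): the two oldest segments free exactly 200 MB;
  // the third is kept because the byte quota is already exhausted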

The log-flusher task periodically performs flush operations, checking whether each Log has gone unflushed for longer than flush.ms:
  

  private def flushDirtyLogs() = {
    debug("Checking for dirty logs to flush...")
    // iterate over every Log in the logs pool
    for ((topicAndPartition, log) <- logs) {
      try {
        val timeSinceLastFlush = time.milliseconds - log.lastFlushTime
        debug("Checking if flush is needed on " + topicAndPartition.topic + " flush interval  " + log.config.flushMs +
              " last flushed " + log.lastFlushTime + " time since last flush: " + timeSinceLastFlush)
        // if enough time has elapsed, perform the flush by calling log.flush
        if(timeSinceLastFlush >= log.config.flushMs)
          log.flush
      } catch {
        case e: Throwable =>
          error("Error flushing topic " + topicAndPartition.topic, e)
      }
    }
  }
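
The staleness check itself reduces to a one-liner (a hypothetical standalone rendering, names made up for illustration):

  def shouldFlush(lastFlushMs: Long, flushMs: Long, nowMs: Long): Boolean =
    nowMs - lastFlushMs >= flushMs

  // e.g. with flush.ms = 60000, a Log last flushed 90 seconds ago is due for a flush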

Every log directory contains a RecoveryPointCheckpoint file that records the recoveryPoint of each Log in that directory; at broker startup these checkpoints help the broker recover its Logs.
The recovery-point-checkpoint task periodically calls LogManager.checkpointRecoveryPointOffsets to update these checkpoint files:
 

  def checkpointRecoveryPointOffsets() {
    // invoke checkpointLogsInDir for each log directory
    this.logDirs.foreach(checkpointLogsInDir)
  }
  
  private def checkpointLogsInDir(dir: File): Unit = {
    // look up the TopicAndPartitions in this log directory and their Log objects
    val recoveryPoints = this.logsByDir.get(dir.toString)
    if (recoveryPoints.isDefined) {
      // rewrite this directory's checkpoint file with each Log's recoveryPoint
      this.recoveryPointCheckpoints(dir).write(recoveryPoints.get.mapValues(_.recoveryPoint))
    }
  }

The update of the checkpoint file is implemented in OffsetCheckpoint: every recoveryPoint under the log directory is written to a tmp file, which then replaces the existing file.

  def write(offsets: Map[TopicAndPartition, Long]) {
    lock synchronized {
      // write to temp file and then swap with the existing file
      val fileOutputStream = new FileOutputStream(tempPath.toFile)
      val writer = new BufferedWriter(new OutputStreamWriter(fileOutputStream))
      try {
        // write the current version number
        writer.write(CurrentVersion.toString)
        writer.newLine()
        // write the number of entries
        writer.write(offsets.size.toString)
        writer.newLine()
        // write topic name, partition id, and the Log's recoveryPoint, one entry per line
        offsets.foreach { case (topicPart, offset) =>
          writer.write(s"${topicPart.topic} ${topicPart.partition} $offset")
          writer.newLine()
        }
        // flush to disk
        writer.flush()
        fileOutputStream.getFD().sync()
      } catch {
        case e: FileNotFoundException =>
          if (FileSystems.getDefault.isReadOnly) {
            fatal("Halting writes to offset checkpoint file because the underlying file system is inaccessible : ", e)
            Runtime.getRuntime.halt(1)
          }
          throw e
      } finally {
        writer.close()
      }
      // atomically replace the checkpoint file with the tmp file
      Utils.atomicMoveWithFallback(tempPath, path)
    }
  }
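
For completeness, here is a hypothetical reader for the format written above (a version line, a count line, then one "topic partition offset" line per entry); the real counterpart is OffsetCheckpoint.read:

  import java.io.File
  import scala.io.Source

  def readCheckpoint(file: File): Map[(String, Int), Long] = {
    val lines = Source.fromFile(file).getLines().toList
    val expected = lines(1).toInt                 // lines(0) holds the version number
    val entries = lines.drop(2).map { line =>
      val Array(topic, partition, offset) = line.split(" ")
      (topic, partition.toInt) -> offset.toLong
    }.toMap
    require(entries.size == expected, "corrupt checkpoint: entry count mismatch")
    entries
  }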
  

 
