Kafka Logs

1. How does Kafka find the message corresponding to a given offset?

When consuming, a consumer specifies a Topic and a Partition. Each Partition corresponds to a log directory on disk. To keep any single log from growing so large that cleaning and deletion become difficult, a Partition is split into multiple LogSegments. Each LogSegment consists of a .log file (the message data), a .index file (the offset index), a .timeindex file (the timestamp index), and a few others. The offset index is a sparse index mapping logical offsets to physical file positions; its density is controlled by the broker parameter log.index.interval.bytes (roughly one index entry per that many bytes of appended messages).
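As a concrete illustration, here is a minimal Scala sketch (the base offset 251 is hypothetical, chosen to match the lookup example below): Kafka names each segment's files after the 20-digit, zero-padded base offset of the segment's first message.

object SegmentNaming {
  // Kafka names segment files after the 20-digit, zero-padded base offset
  def fileName(baseOffset: Long, suffix: String): String =
    f"$baseOffset%020d$suffix"

  def main(args: Array[String]): Unit =
    Seq(".log", ".index", ".timeindex").foreach { suffix =>
      println(fileName(251L, suffix)) // e.g. 00000000000000000251.log
    }
}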


Suppose we want to find the message at offset 268. First we must locate the log segment that contains it. Kafka keeps its log segments in a ConcurrentSkipListMap (from java.util.concurrent), a concurrent skip list whose insert, delete, and lookup operations all run in O(log n). (Two follow-up questions worth digging into: how is ConcurrentSkipListMap implemented, and how exactly does Kafka use it?) The segment map is declared as follows:

private val segments: ConcurrentNavigableMap[java.lang.Long, LogSegment] = new ConcurrentSkipListMap[java.lang.Long, LogSegment]
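A minimal, self-contained sketch of how this map answers "which segment holds offset 268?" (the base offsets below are hypothetical, mirroring the example in this post): floorEntry returns the entry with the greatest key less than or equal to the target, which is exactly "the segment whose baseOffset is at most startOffset".

import java.util.concurrent.ConcurrentSkipListMap

object SegmentLookup {
  def main(args: Array[String]): Unit = {
    // hypothetical base offsets -> segment file names
    val segments = new ConcurrentSkipListMap[java.lang.Long, String]()
    segments.put(0L, "00000000000000000000.log")
    segments.put(251L, "00000000000000000251.log")
    segments.put(512L, "00000000000000000512.log")

    // floorEntry: greatest key <= target, i.e. the only segment that can contain offset 268
    val entry = segments.floorEntry(268L)
    println(s"${entry.getKey} -> ${entry.getValue}") // 251 -> 00000000000000000251.log
  }
}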

// Log.read: the log-level read method; parameters include the start offset, the maximum number of bytes to read, the fetch isolation level, and so on.
def read(startOffset: Long,
         maxLength: Int,
         isolation: FetchIsolation,
         minOneMessage: Boolean): FetchDataInfo = {
  maybeHandleIOException(s"Exception while reading from $topicPartition in dir ${dir.getParent}") {
    trace(s"Reading maximum $maxLength bytes at offset $startOffset from log with " +
      s"total length $size bytes")

    val includeAbortedTxns = isolation == FetchTxnCommitted

    // Because we don't use the lock for reading, the synchronization is a little bit tricky.
    // We create the local variables to avoid race conditions with updates to the log.
    // Reads take no lock; the LEO is captured into local variables to avoid racing with appends.
    val endOffsetMetadata = nextOffsetMetadata
    val endOffset = endOffsetMetadata.messageOffset
    var segmentEntry = segments.floorEntry(startOffset) // locate the segment via the thread-safe skip list

    // return error on attempt to read beyond the log end offset or read below log start offset
    if (startOffset > endOffset || segmentEntry == null || startOffset < logStartOffset)
      throw new OffsetOutOfRangeException(s"Received request for offset $startOffset for partition $topicPartition, " +
        s"but we only have log segments in the range $logStartOffset to $endOffset.")

    val maxOffsetMetadata = isolation match {
      case FetchLogEnd => endOffsetMetadata
      case FetchHighWatermark => fetchHighWatermarkMetadata
      case FetchTxnCommitted => fetchLastStableOffsetMetadata
    }

    if (startOffset == maxOffsetMetadata.messageOffset) {
      return emptyFetchDataInfo(maxOffsetMetadata, includeAbortedTxns)
    } else if (startOffset > maxOffsetMetadata.messageOffset) {
      val startOffsetMetadata = convertToOffsetMetadataOrThrow(startOffset)
      return emptyFetchDataInfo(startOffsetMetadata, includeAbortedTxns)
    }

    // Do the read on the segment with a base offset less than the target offset
    // but if that segment doesn't contain any messages with an offset greater than that
    // continue to read from successive segments until we get some messages or we reach the end of the log
    while (segmentEntry != null) {
      val segment = segmentEntry.getValue

      val maxPosition = {
        // Use the max offset position if it is on this segment; otherwise, the segment size is the limit.
        if (maxOffsetMetadata.segmentBaseOffset == segment.baseOffset) {
          maxOffsetMetadata.relativePositionInSegment
        } else {
          segment.size
        }
      }

      val fetchInfo = segment.read(startOffset, maxLength, maxPosition, minOneMessage) // read the messages at this offset from the segment
      if (fetchInfo == null) {
        segmentEntry = segments.higherEntry(segmentEntry.getKey)
      } else {
        return if (includeAbortedTxns)
          addAbortedTransactions(startOffset, segmentEntry, fetchInfo)
        else
          fetchInfo
      }
    }

    // okay we are beyond the end of the last segment with no data fetched although the start offset is in range,
    // this can happen when all messages with offset larger than start offsets have been deleted.
    // In this case, we will return the empty set with log end offset metadata
    FetchDataInfo(nextOffsetMetadata, MemoryRecords.EMPTY)
  }
}

// LogSegment.read: translateOffset first maps the logical start offset to a physical file position, then the corresponding slice of the log is returned as a FetchDataInfo.
@threadsafe
def read(startOffset: Long,
         maxSize: Int,
         maxPosition: Long = size,
         minOneMessage: Boolean = false): FetchDataInfo = {
  if (maxSize < 0)
    throw new IllegalArgumentException(s"Invalid max size $maxSize for log read from segment $log")

  val startOffsetAndSize = translateOffset(startOffset)

  // if the start position is already off the end of the log, return null
  if (startOffsetAndSize == null)
    return null

  val startPosition = startOffsetAndSize.position
  val offsetMetadata = LogOffsetMetadata(startOffset, this.baseOffset, startPosition)

  val adjustedMaxSize =
    if (minOneMessage) math.max(maxSize, startOffsetAndSize.size)
    else maxSize

  // return a log segment but with zero size in the case below
  if (adjustedMaxSize == 0)
    return FetchDataInfo(offsetMetadata, MemoryRecords.EMPTY)

  // calculate the length of the message set to read based on whether or not they gave us a maxOffset
  val fetchSize: Int = min((maxPosition - startPosition).toInt, adjustedMaxSize)

  FetchDataInfo(offsetMetadata, log.slice(startPosition, fetchSize),
    firstEntryIncomplete = adjustedMaxSize < startOffsetAndSize.size)
}

Via the skip list we quickly locate the segment whose baseOffset is 251; 268 - 251 = 17 gives the relative offset. We then binary-search the sparse .index file for the largest indexed relative offset not exceeding 17, which here is 14. That entry points into the .log file, so we scan forward from 14's physical position, since the index holds no physical address for relative offset 17 itself. The firstOffset there is the start of the enclosing RecordBatch, and position pins down the exact location of the message. Relative offsets are stored instead of absolute ones to save space; the absolute offset can always be recovered by adding baseOffset.
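To make the two-step lookup concrete, here is a minimal Scala sketch (hypothetical index entries, not Kafka's actual code): binary-search the sparse index for the greatest entry not exceeding the target relative offset, then scan the .log forward from that physical position.

object SparseIndexLookup {
  // hypothetical (relativeOffset, physicalPosition) pairs, sorted by relativeOffset
  val index = Vector((0, 0), (6, 1050), (14, 2180), (22, 3320))

  // greatest index entry whose relative offset <= target
  def floorEntry(target: Int): (Int, Int) = {
    var lo = 0
    var hi = index.length - 1
    while (lo < hi) {
      val mid = (lo + hi + 1) >>> 1
      if (index(mid)._1 <= target) lo = mid else hi = mid - 1
    }
    index(lo)
  }

  def main(args: Array[String]): Unit = {
    val target = 268 - 251 // relative offset 17
    val (relOffset, position) = floorEntry(target)
    // entry 14 @ 2180: scan the .log forward from here until offset 17 is found
    println(s"start scanning .log at position $position (index entry $relOffset)")
  }
}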

I am still working through the rest of the source for this offset-to-message path; to be updated~

2. LEO and HW

HW (High Watermark): consumers can only pull messages whose offsets are below the HW.

LEO (Log End Offset) marks the offset of the next message to be written to the log; the smallest LEO within the ISR set is the partition's HW.
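A tiny sketch of the relationship (the LEO values are hypothetical):

object Watermark {
  def main(args: Array[String]): Unit = {
    // hypothetical LEOs of the leader and two ISR followers
    val isrLeos = Seq(105L, 103L, 104L)
    val hw = isrLeos.min // the partition HW is the minimum LEO across the ISR
    println(s"HW = $hw, so consumers may fetch offsets up to ${hw - 1}")
  }
}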

3. Kafka's improved binary search: index entries are split into a cold section and a warm section (the recently written tail). A lookup first checks whether the target falls in the warm section and binary-searches there; only if the target is older does it binary-search the cold section.

// AbstractIndex.indexSlotRangeFor: finds the pair of index slots that bound the target
// (the largest entry <= target and the one after it), or sentinel values when the target is out of range.
private def indexSlotRangeFor(idx: ByteBuffer, target: Long, searchEntity: IndexSearchType): (Int, Int) = {
  // check if the index is empty
  if (_entries == 0)
    return (-1, -1)

  def binarySearch(begin: Int, end: Int): (Int, Int) = {
    // binary search for the entry
    var lo = begin
    var hi = end
    while (lo < hi) {
      val mid = (lo + hi + 1) >>> 1
      val found = parseEntry(idx, mid)
      val compareResult = compareIndexEntry(found, target, searchEntity)
      if (compareResult > 0)
        hi = mid - 1
      else if (compareResult < 0)
        lo = mid
      else
        return (mid, mid)
    }
    (lo, if (lo == _entries - 1) -1 else lo + 1)
  }

  val firstHotEntry = Math.max(0, _entries - 1 - _warmEntries)
  // check if the target offset is in the warm section of the index
  if (compareIndexEntry(parseEntry(idx, firstHotEntry), target, searchEntity) < 0) {
    return binarySearch(firstHotEntry, _entries - 1)
  }

  // check if the target offset is smaller than the least offset
  if (compareIndexEntry(parseEntry(idx, 0), target, searchEntity) > 0)
    return (-1, 0)

  binarySearch(0, firstHotEntry)
}
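Why split warm and cold? Per the comments in AbstractIndex, the warm section is sized at the last 8192 bytes' worth of entries (_warmEntries = 8192 / entrySize). Healthy consumers fetch near the log end, so most lookups resolve entirely inside this tail. A plain binary search over the whole file would also probe pages in the middle of the index, and once the OS evicts those cold pages from the page cache, every lookup would trigger page faults; searching the warm section first confines the hot path to a small, recently touched set of pages that is likely still resident.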
