CleanerConfig
numThreads: Int = 1
Number of cleaner threads. Each thread calls cleanerManager.grabFilthiestLog(), which returns the LogToClean object of the topicAndPartition most in need of cleaning, and then starts cleaning it.
dedupeBufferSize: Long = 4*1024*1024L
dedupeBufferLoadFactor: Double = 0.9d
hashAlgorithm: String = "MD5"
offsetMap = new SkimpyOffsetMap(memory = math.min(config.dedupeBufferSize / config.numThreads, Int.MaxValue).toInt, hashAlgorithm = config.hashAlgorithm)
hashAlgorithm is the algorithm used to hash the keys stored in the map.
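The per-thread memory split above can be sketched as follows. This is a minimal illustration, not the Kafka implementation: the object name DedupeBufferSketch is hypothetical, and the assumption that each map entry costs the hash size plus 8 bytes for the offset (16 + 8 bytes for MD5) is mine rather than something stated in these notes.

```scala
object DedupeBufferSketch {
  // Memory handed to each cleaner thread's offset map, mirroring the expression above.
  def memoryPerThread(dedupeBufferSize: Long, numThreads: Int): Int =
    math.min(dedupeBufferSize / numThreads, Int.MaxValue.toLong).toInt

  // Assumption: one entry costs hashSize bytes for the key hash plus 8 bytes for the Long offset.
  def slots(memory: Int, hashSize: Int): Int = memory / (hashSize + 8)
}
```

With the defaults (4 MB, one thread, a 16-byte MD5 hash) this would give a map of roughly 170 thousand slots.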
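Cleaning starts at offset 0 and ends at endOffset.
The offset returned by buildOffsetMap is recorded as endOffset. It is determined by the following two factors:
Segments whose baseOffset is at or below (cleanable.firstDirtyOffset + map.slots * this.dupBufferLoadFactor) are always loaded.
Beyond that point, (message.key, entry.offset) pairs keep being written from further segments into offsetMap only while map.utilization < this.dupBufferLoadFactor; endOffset is the offset reached when loading stops.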
1). Get the set of segments to clean: take all segments from cleanable.firstDirtyOffset to log.activeSegment.baseOffset, recorded as dirty.
2). Use the size of the offsetMap parameter to compute the offset at which one cleaning pass may stop, recorded as minStopOffset:
minStopOffset = (start + map.slots * this.dupBufferLoadFactor).toLong
3). Iterate over dirty. For each segment that satisfies either of the two conditions
segment.baseOffset <= minStopOffset || map.utilization < this.dupBufferLoadFactor
buildOffsetMapForSegment is called to store that segment's entries in offsetMap.
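The three steps above can be sketched with a small in-memory model. Segment and SimpleOffsetMap are hypothetical stand-ins for the real log segment and SkimpyOffsetMap, keys are plain strings instead of hashed byte buffers, and the function returns the last offset it loaded; treat it as an illustration of the selection condition only.

```scala
import scala.collection.mutable

object OffsetMapSketch {
  // Hypothetical stand-in for a log segment: a base offset plus (key, offset) entries.
  final case class Segment(baseOffset: Long, entries: Seq[(String, Long)])

  // Hypothetical stand-in for SkimpyOffsetMap with a fixed number of slots.
  final class SimpleOffsetMap(val slots: Int) {
    private val data = mutable.Map.empty[String, Long]
    def put(key: String, offset: Long): Unit = data.update(key, offset)
    def get(key: String): Option[Long] = data.get(key)
    def utilization: Double = data.size.toDouble / slots
  }

  // Loads dirty segments into the map and returns the last offset that was loaded.
  def buildOffsetMap(dirty: Seq[Segment], start: Long,
                     map: SimpleOffsetMap, loadFactor: Double): Long = {
    val minStopOffset = (start + map.slots * loadFactor).toLong
    var offset = start
    for (segment <- dirty) {
      // Either condition is enough for the segment to be loaded.
      if (segment.baseOffset <= minStopOffset || map.utilization < loadFactor) {
        for ((key, entryOffset) <- segment.entries) {
          map.put(key, entryOffset) // a later offset overwrites an earlier one for the same key
          offset = entryOffset
        }
      }
    }
    offset
  }
}
```

Because a repeated key overwrites its earlier entry, a dirty section with many duplicates fills the map slowly, which is why segments past minStopOffset can still be loaded.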
The segment's messages are read into Cleaner.readBuffer, and a ByteBufferMessageSet is then created from Cleaner.readBuffer.
Each entry has type MessageAndOffset(message: Message, offset: Long).
What offsetMap stores is map.put(message.key, entry.offset).
ioBufferSize: Int = 1024*1024
Cleaner.readBuffer
Cleaner.writeBuffer
The size of each of these two buffers is config.ioBufferSize / config.numThreads / 2.
When a topicAndPartition is cleaned, data is first read from the old segment into Cleaner.readBuffer; the messages to be kept are then written to Cleaner.writeBuffer before they go into the new segment.
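A minimal sketch of the buffer sizing described above. IoBufferSketch and its method names are hypothetical; only the division by numThreads and by 2 comes from these notes.

```scala
import java.nio.ByteBuffer

object IoBufferSketch {
  // ioBufferSize is shared by all threads, and each thread splits its share
  // between one read buffer and one write buffer.
  def bufferSize(ioBufferSize: Int, numThreads: Int): Int = ioBufferSize / numThreads / 2

  // Returns (readBuffer, writeBuffer) for a single cleaner thread.
  def allocate(ioBufferSize: Int, numThreads: Int): (ByteBuffer, ByteBuffer) = {
    val size = bufferSize(ioBufferSize, numThreads)
    (ByteBuffer.allocate(size), ByteBuffer.allocate(size))
  }
}
```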
maxMessageSize: Int = 32*1024*1024
maxIoBytesPerSecond: Double = Double.MaxValue
backOffMs: Long = 15*1000
Call cleanerManager.grabFilthiestLog() to obtain the LogToClean object of the topicAndPartition most in need of cleaning.
If that LogToClean object is empty, no log currently meets the cleaning conditions, so the thread calls backOffWaitLatch.await(config.backOffMs, TimeUnit.MILLISECONDS).
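The back-off behaviour can be sketched as one loop iteration. BackOffSketch, cleanOrSleep, and the grab and clean function parameters are hypothetical names for illustration; CountDownLatch.await(timeout, unit) is the standard Java call and returns once the timeout elapses or the latch is counted down.

```scala
import java.util.concurrent.{CountDownLatch, TimeUnit}

object BackOffSketch {
  // Runs one iteration: cleans if a candidate exists, otherwise waits on the latch.
  // Returns true if something was cleaned.
  def cleanOrSleep[A](grab: () => Option[A], clean: A => Unit,
                      latch: CountDownLatch, backOffMs: Long): Boolean =
    grab() match {
      case Some(log) =>
        clean(log)
        true
      case None =>
        // Nothing to clean right now: wait up to backOffMs (or until shutdown counts the latch down).
        latch.await(backOffMs, TimeUnit.MILLISECONDS)
        false
    }
}
```

Waiting on a latch rather than calling Thread.sleep lets a shutdown wake the thread immediately by counting the latch down.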
enableCleaner: Boolean = true
Whether cleaning is enabled.
LogConfig
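segmentSize: Int = Defaults.SegmentSize
Maximum size in bytes of each segment in a log; once it is exceeded, a new segment must be created.
segmentMs: Long = Defaults.SegmentMs
When the segment currently being written is older than segmentMs, measured from its creation time, a new segment is created.
segmentJitterMs: Long = Defaults.SegmentJitterMs
To keep segments from all rolling at the same moment once config.segmentMs elapses, @param segmentJitterMs staggers the rolls; rolling means starting to write to a new segment.
randomSegmentJitter = if (segmentJitterMs == 0) 0 else Utils.abs(scala.util.Random.nextInt()) % math.min(segmentJitterMs, segmentMs)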
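A self-contained sketch of the jitter expression above. The abs helper is a stand-in for Utils.abs that I assume maps any Int to a non-negative value, and shouldRoll reflects my assumption that the jitter is subtracted from segmentMs when deciding to roll; neither detail is spelled out in these notes.

```scala
import scala.util.Random

object SegmentJitterSketch {
  // Stand-in for Utils.abs: clears the sign bit so the result is never negative
  // (plain math.abs would return a negative value for Int.MinValue).
  def abs(n: Int): Int = n & 0x7fffffff

  // Random jitter in [0, min(segmentJitterMs, segmentMs)), or 0 when jitter is disabled.
  def randomSegmentJitter(segmentJitterMs: Long, segmentMs: Long, rnd: Random = new Random): Long =
    if (segmentJitterMs == 0) 0L
    else abs(rnd.nextInt()) % math.min(segmentJitterMs, segmentMs)

  // Assumption: a segment rolls once its age exceeds segmentMs minus its own jitter.
  def shouldRoll(ageMs: Long, segmentMs: Long, jitterMs: Long): Boolean =
    ageMs > segmentMs - jitterMs
}
```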
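flushInterval: Long = Defaults.FlushInterval
When the number of messages written reaches the FlushInterval threshold, the log of this topicAndPartition is flushed.
Number of messages written since the last flush: unflushedMessages() = this.logEndOffset - this.recoveryPoint
The flush is triggered when unflushedMessages >= config.flushInterval.
The end offset used by the flush is nextOffsetMetadata.messageOffset.
1. Take the list of segments covering the absolute partition offsets from this.recoveryPoint to offset, and flush them one by one.
segment.flush flushes both the index file and the log file.
2. Set this.recoveryPoint to the offset argument.
3. Set this.lastflushedTime to the current time, time.milliseconds.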
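The bookkeeping in the three flush steps can be sketched with a small in-memory model. LogModel and its fields are hypothetical; the actual segment flush is replaced by a counter, so only the offset and timestamp updates are illustrated.

```scala
object FlushSketch {
  // Hypothetical in-memory model of a log's flush bookkeeping.
  final class LogModel(flushInterval: Long, now: () => Long) {
    var logEndOffset: Long = 0L
    var recoveryPoint: Long = 0L
    var lastFlushedTime: Long = now()
    var flushCount: Int = 0

    def unflushedMessages: Long = logEndOffset - recoveryPoint

    // Appends `count` messages and flushes once the threshold is reached.
    def append(count: Long): Unit = {
      logEndOffset += count
      if (unflushedMessages >= flushInterval) flush(logEndOffset)
    }

    def flush(offset: Long): Unit =
      if (offset > recoveryPoint) {
        flushCount += 1          // step 1: stands in for flushing each segment up to offset
        recoveryPoint = offset   // step 2
        lastFlushedTime = now()  // step 3
      }
  }
}
```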
flushMs: Long = Defaults.FlushMs
When the time since the log's last flush reaches log.config.flushMs, the log is flushed.
lastflushedTime.set(time.milliseconds)
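A one-line sketch of the time-based check, under the assumption that a flush is due once the time elapsed since lastflushedTime has reached flushMs. TimedFlushSketch and shouldFlush are hypothetical names.

```scala
object TimedFlushSketch {
  // Assumption: flush when at least flushMs milliseconds have passed since the last flush.
  def shouldFlush(nowMs: Long, lastFlushedMs: Long, flushMs: Long): Boolean =
    nowMs - lastFlushedMs >= flushMs
}
```

With the default flushMs of Long.MaxValue this check practically never fires, so time-based flushing is effectively off unless the value is lowered.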
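retentionSize: Long = Defaults.RetentionSize
retentionMs: Long = Defaults.RetentionMs
Segments are cleaned up according to limits on modification time and on the total byte size of the log:
1. Delete the segments in the log directory whose modification time marks them as expired.
2. If the total size in bytes of the log's segments exceeds log.config.retentionSize, delete segments until the total size is below log.config.retentionSize.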
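The two retention passes can be sketched on a simplified segment model. Seg, byTime and bySize are hypothetical names, and the sketch follows the description above rather than the actual Kafka code; in particular it assumes the oldest segments (lowest baseOffset) are removed first and it does not treat the active segment specially.

```scala
object RetentionSketch {
  final case class Seg(baseOffset: Long, sizeBytes: Long, lastModifiedMs: Long)

  // Pass 1: keep only segments modified within the last retentionMs milliseconds.
  def byTime(segs: Seq[Seg], nowMs: Long, retentionMs: Long): Seq[Seg] =
    segs.filter(s => nowMs - s.lastModifiedMs <= retentionMs)

  // Pass 2: drop the oldest segments until the total size no longer exceeds retentionSize.
  def bySize(segs: Seq[Seg], retentionSize: Long): Seq[Seg] = {
    var remaining = segs.sortBy(_.baseOffset)
    while (remaining.nonEmpty && remaining.map(_.sizeBytes).sum > retentionSize)
      remaining = remaining.tail
    remaining
  }
}
```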
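maxMessageSize: Int = Defaults.MaxMessageSize
Maximum size in bytes of a single message.
maxIndexSize: Int = Defaults.MaxIndexSize
Every segment has a corresponding index; the index file may not grow beyond maxIndexSize.
indexInterval: Int = Defaults.IndexInterval
While data is written to the log, a position entry is added to the index each time roughly indexInterval bytes have been written since the previous entry.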
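A minimal sketch of a sparse index driven by indexInterval. It assumes the interval is counted in bytes appended since the previous index entry and that each entry records an (offset, file position) pair; SegmentModel and its fields are hypothetical.

```scala
import scala.collection.mutable.ArrayBuffer

object SparseIndexSketch {
  // Hypothetical model: records (offset, filePosition) once more than indexInterval
  // bytes have been appended since the previous index entry.
  final class SegmentModel(indexInterval: Int) {
    val index = ArrayBuffer.empty[(Long, Long)]
    private var position = 0L
    private var bytesSinceLastIndexEntry = 0

    def append(offset: Long, messageBytes: Int): Unit = {
      if (bytesSinceLastIndexEntry > indexInterval) {
        index += ((offset, position)) // the entry points at where this message starts
        bytesSinceLastIndexEntry = 0
      }
      position += messageBytes
      bytesSinceLastIndexEntry += messageBytes
    }
  }
}
```

A smaller interval gives a denser index and shorter scans on lookup, at the cost of a larger index file.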
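fileDeleteDelayMs: Long = Defaults.FileDeleteDelayMs
Segments are deleted on a timer:
1. The .deleted suffix is appended to the names of the log and index files.
2. A scheduled task is started that calls segment.delete() after config.fileDeleteDelayMs.
segment.delete() deletes the given files carrying the .deleted suffix.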
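The rename-then-delete sequence can be sketched with plain java.io and a scheduled executor. DelayedDeleteSketch and its methods are hypothetical and operate on a single file, whereas the notes describe a segment's log and index files together; the scheduler is passed in so the caller controls its lifetime.

```scala
import java.io.{File, IOException}
import java.util.concurrent.{ScheduledExecutorService, ScheduledFuture, TimeUnit}

object DelayedDeleteSketch {
  val DeletedSuffix = ".deleted"

  // Step 1: rename the file so that it carries the .deleted suffix.
  def markDeleted(file: File): File = {
    val renamed = new File(file.getPath + DeletedSuffix)
    if (!file.renameTo(renamed)) throw new IOException(s"Failed to rename $file")
    renamed
  }

  // Step 2: delete the renamed file once delayMs milliseconds have passed.
  def scheduleDelete(scheduler: ScheduledExecutorService, file: File, delayMs: Long): ScheduledFuture[_] =
    scheduler.schedule(new Runnable {
      def run(): Unit = { file.delete(); () }
    }, delayMs, TimeUnit.MILLISECONDS)
}
```

Renaming first takes the file out of the live segment set at once, while the delay leaves time for readers that still hold it open to finish.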
segment.lastModified
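deleteRetentionMs: Long = Defaults.DeleteRetentionMs
Compute the deletion timestamp, recorded as deleteHorizonMs; anything older than this timestamp is deleted outright and does not take part in the merge:
1) Take the set of segments whose offsets range from 0 to cleanable.firstDirtyOffset.
2) Take the last segment of that set, which is the segment closest to the current time:
deleteHorizonMs = seg.lastModified - log.config.deleteRetentionMs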
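The two steps for deleteHorizonMs can be sketched on a simplified segment model. Seg is a hypothetical stand-in for a log segment, and returning 0 when no segment lies below firstDirtyOffset is an assumption of this sketch.

```scala
object DeleteHorizonSketch {
  final case class Seg(baseOffset: Long, lastModified: Long)

  // Takes the last segment below firstDirtyOffset and subtracts deleteRetentionMs
  // from its modification time. Assumption: with no such segment the horizon is 0.
  def deleteHorizonMs(segments: Seq[Seg], firstDirtyOffset: Long, deleteRetentionMs: Long): Long =
    segments.sortBy(_.baseOffset).filter(_.baseOffset < firstDirtyOffset).lastOption match {
      case Some(seg) => seg.lastModified - deleteRetentionMs
      case None      => 0L
    }
}
```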
minCleanableRatio: Double = Defaults.MinCleanableDirtyRatio
Minimum ratio of a log's segment list that must be cleanable before the log is cleaned.
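A small sketch of how such a threshold might be applied. It assumes the ratio is dirty bytes divided by total (clean plus dirty) bytes and that a log qualifies only when the ratio is strictly above minCleanableRatio; both points, and the names used, are assumptions rather than something these notes state.

```scala
object CleanableRatioSketch {
  // Assumption: ratio = dirty bytes / (clean bytes + dirty bytes).
  def cleanableRatio(cleanBytes: Long, dirtyBytes: Long): Double =
    dirtyBytes.toDouble / (cleanBytes + dirtyBytes)

  // Assumption: a log qualifies for cleaning only when its ratio exceeds the minimum.
  def isCleanable(cleanBytes: Long, dirtyBytes: Long, minCleanableRatio: Double): Boolean =
    cleanableRatio(cleanBytes, dirtyBytes) > minCleanableRatio
}
```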
compact: Boolean = Defaults.Compact
Whether the data to be cleaned is kept in compacted form.
uncleanLeaderElectionEnable: Boolean = Defaults.UncleanLeaderElectionEnable,
minInSyncReplicas: Int = Defaults.MinInSyncReplicas
Defaults
SegmentSize = 1024 * 1024
config.segmentMs is the lifetime of a segment object (@param segmentMs).
To keep segments from all rolling at the same moment once config.segmentMs elapses, @param segmentJitterMs staggers the rolls.
SegmentMs = Long.MaxValue
SegmentJitterMs = 0L
FlushInterval = Long.MaxValue
FlushMs = Long.MaxValue
RetentionSize = Long.MaxValue
RetentionMs = Long.MaxValue
MaxMessageSize = Int.MaxValue
MaxIndexSize = 1024 * 1024
IndexInterval = 4096
FileDeleteDelayMs = 60 * 1000L
DeleteRetentionMs = 24 * 60 * 60 * 1000L
MinCleanableDirtyRatio = 0.5
Compact = false
UncleanLeaderElectionEnable = true
MinInSyncReplicas = 1