ReplicaManager.updateReplicaLEOAndPartitionHW
replica.logEndOffset is updated in the function
ReplicaManager.updateReplicaLEOAndPartitionHW(topic: String, partitionId: Int, replicaId: Int, offset: LogOffsetMetadata)
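Partition.updateLeaderHWAndMaybeExpandIsr(replicaId: Int)
If replicaId is in the AR but not in the ISR, and the offset it has read has caught up with or passed the HW offset of this partition's leader, the replica is added to the ISR and ZooKeeper is updated.
The node updated in the ZooKeeper cluster is /brokers/topics/{topic}/partitions/{partitionId}/state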
{
"controller_epoch":29,
"leader":2,
"version":1,
"leader_epoch":48,
"isr":[2]
}
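The ISR expansion rule described above can be sketched as follows. This is a minimal model, not Kafka's actual Partition class: PartitionState, IsrExpansion and maybeExpandIsr are hypothetical names, and the use of >= (rather than a strict >) for the catch-up test is an assumption.

```scala
// Simplified stand-in for the state a Partition keeps; all names are illustrative.
final case class PartitionState(
  assignedReplicas: Set[Int],   // AR: broker ids assigned to the partition
  inSyncReplicas: Set[Int],     // ISR: broker ids currently in sync
  leaderHighWatermark: Long     // HW offset of the partition leader
)

object IsrExpansion {
  // Adds replicaId to the ISR when it is in the AR, not yet in the ISR, and its
  // log end offset has reached the leader's HW (>= is an assumption).
  def maybeExpandIsr(state: PartitionState, replicaId: Int, replicaLeo: Long): PartitionState = {
    val eligible =
      state.assignedReplicas.contains(replicaId) &&
      !state.inSyncReplicas.contains(replicaId) &&
      replicaLeo >= state.leaderHighWatermark
    if (eligible) state.copy(inSyncReplicas = state.inSyncReplicas + replicaId)
    else state
  }
}
```

In the real broker the new ISR would then be written to the state node shown above; the sketch only models the membership decision.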
ReplicaFetcherThread.processPartitionData
When the replica is a follower:
BrokerAndInitialOffset(broker: Broker, initOffset: Long)
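The two parameters supplied when a BrokerAndInitialOffset is created:
broker: the Broker information of the leader of the partition this replica belongs to
initOffset: the value of Replica.logEndOffset
If the replica is local, Replica.log is non-empty and Replica.logEndOffset is Replica.log.get.logEndOffsetMetadata
If the replica is not local, Replica.logEndOffsetMetadata is returned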
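The local versus remote distinction above can be illustrated with a small model. LogSketch and ReplicaSketch are hypothetical stand-ins rather than Kafka's Log and Replica classes, and offsets are reduced to plain Long values instead of LogOffsetMetadata.

```scala
// Illustrative stand-ins; these are not Kafka's real classes.
final case class LogSketch(logEndOffset: Long)

final case class ReplicaSketch(log: Option[LogSketch], remoteLogEndOffset: Long = -1L) {
  // A replica is local when it has a log on this broker.
  def isLocal: Boolean = log.isDefined

  // Local replicas read the end offset from their log; remote replicas
  // return the value last recorded for them.
  def logEndOffset: Long = log match {
    case Some(l) => l.logEndOffset
    case None    => remoteLogEndOffset
  }
}
```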
replicaFetcherManager.addFetcherForPartitions(partitionAndOffsets: Map[TopicAndPartition, BrokerAndInitialOffset])
BrokerAndFetcherId(broker: Broker, fetcherId: Int)
fetcherId is a numeric hash value computed from topic and partitionId
The parameter partitionAndOffsets is converted into [BrokerAndFetcherId, BrokerAndInitialOffset]
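A rough sketch of that conversion, under stated assumptions: the case classes below are simplified stand-ins for Kafka's, the exact hash formula is not given in these notes so the one used here is only illustrative, and numFetchers (the number of fetcher threads per source broker) is a hypothetical parameter.

```scala
// Simplified stand-ins for the Kafka types named above.
final case class TopicAndPartition(topic: String, partition: Int)
final case class Broker(id: Int, host: String, port: Int)
final case class BrokerAndInitialOffset(broker: Broker, initOffset: Long)
final case class BrokerAndFetcherId(broker: Broker, fetcherId: Int)

object FetcherGrouping {
  // Illustrative hash: combine the topic hash with the partition id, then bound
  // the result to [0, numFetchers). The formula is an assumption.
  def fetcherId(topic: String, partitionId: Int, numFetchers: Int): Int =
    Math.floorMod(31 * topic.hashCode + partitionId, numFetchers)

  // Group partitions by (leader broker, fetcher id) so that each group can be
  // handed to a single fetcher thread.
  def group(
    partitionAndOffsets: Map[TopicAndPartition, BrokerAndInitialOffset],
    numFetchers: Int
  ): Map[BrokerAndFetcherId, Map[TopicAndPartition, BrokerAndInitialOffset]] =
    partitionAndOffsets.groupBy { case (tp, bio) =>
      BrokerAndFetcherId(bio.broker, fetcherId(tp.topic, tp.partition, numFetchers))
    }
}
```

Because the key includes the leader broker, partitions led by different brokers never share a fetcher thread, while partitions with the same leader are spread over at most numFetchers threads.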
AbstractFetcherManager.fetcherThreadMap = new mutable.HashMap[BrokerAndFetcherId, AbstractFetcherThread]
The subclass of AbstractFetcherThread is ReplicaFetcherThread
In other words, a replica that is a follower needs a thread that fetches data from the leader of its partition
ReplicaFetcherThread.addPartitions(partitionAndOffsets: Map[TopicAndPartition, Long])
addPartitions stores its argument in the member variable
ReplicaFetcherThread.partitionMap = new mutable.HashMap[TopicAndPartition, Long]
Each entry of partitionAndOffsets becomes an entry of this map; the Long value is BrokerAndInitialOffset.initOffset
The thread assembles ReplicaFetcherThread.partitionMap into a FetchRequest command
FetchRequest(
versionId: Short = FetchRequest.CurrentVersion,
correlationId: Int = FetchRequest.DefaultCorrelationId,
clientId: String = ConsumerConfig.DefaultClientId,
replicaId: Int = Request.OrdinaryConsumerId,
maxWait: Int = FetchRequest.DefaultMaxWait,
minBytes: Int = FetchRequest.DefaultMinBytes,
requestInfo: Map[TopicAndPartition, PartitionFetchInfo])
PartitionFetchInfo(offset: Long, fetchSize: Int)
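The step from partitionMap to the requestInfo field can be sketched like this. The case classes are simplified stand-ins for the Kafka types, and buildRequestInfo and its fetchSize parameter are hypothetical names introduced for illustration.

```scala
// Simplified stand-ins for the Kafka types named above.
final case class TopicAndPartition(topic: String, partition: Int)
final case class PartitionFetchInfo(offset: Long, fetchSize: Int)

object FetchRequestSketch {
  // Turn each (partition -> next offset) entry into a PartitionFetchInfo,
  // using the same fetch size for every partition.
  def buildRequestInfo(
    partitionMap: Map[TopicAndPartition, Long],
    fetchSize: Int
  ): Map[TopicAndPartition, PartitionFetchInfo] =
    partitionMap.map { case (tp, offset) => tp -> PartitionFetchInfo(offset, fetchSize) }
}
```

The resulting map is what would be passed as requestInfo; the remaining FetchRequest fields are left at their defaults in this sketch.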
The reply received is
FetchResponse(correlationId: Int, data: Map[TopicAndPartition, FetchResponsePartitionData])
FetchResponsePartitionData(error: Short = ErrorMapping.NoError, hw: Long = -1L, messages: MessageSet)
From the reply, the offset of the last message in messages is taken and then written back to
ReplicaFetcherThread.partitionMap (the mutable.HashMap[TopicAndPartition, Long])
ReplicaManager.getReplica(topic, partitionId) returns the replica this reply belongs to, and the message set is then appended to its log
replica.log.get.append(messageSet, assignOffsets = false)
The minimum of replica.logEndOffset.messageOffset and FetchResponsePartitionData.hw is taken and then stored in replica.highWatermark
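The follower-side rule above is a single min. A minimal sketch, with FollowerHighWatermark and its parameter names chosen here for illustration:

```scala
object FollowerHighWatermark {
  // The follower cannot expose a high watermark beyond what it has written
  // locally, so it takes the smaller of its own LEO and the leader's HW.
  def followerHighWatermark(logEndOffset: Long, leaderHw: Long): Long =
    math.min(logEndOffset, leaderHw)
}
```

For example, a follower whose log ends at 120 while the leader reports hw = 150 records 120; once the follower's log reaches 160 it records 150.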
ReplicaManager periodically writes replica.highWatermark of every replica on this broker to the replication-offset-checkpoint file
This is written whether the replica is a leader or a follower; if a follower is elected leader, it uses this highWatermark
When the replica is the leader:
When a Partition is initialized, it creates a Replica for each entry in the AR list
The created replicas are stored in Partition.assignedReplicaMap = new Pool[Int, Replica]
replicaManager.highWatermarkCheckpoints is responsible for reading and writing the replication-offset-checkpoint file
When a replica is created through Partition.getOrCreateReplica, Replica.initialHighWatermarkValue can be read from replication-offset-checkpoint
If the replication-offset-checkpoint file has no value for the topic partition, Replica.initialHighWatermarkValue is set to log.logEndOffset of this partition's replica
Replica(
val brokerId: Int,
val partition: Partition,
time: Time = SystemTime,
initialHighWatermarkValue: Long = 0L,
val log: Option[Log] = None)
If the fetch command was sent by a follower, the offset fetched for each topic partition is used to call
replicaManager.updateReplicaLEOAndPartitionHW(topicAndPartition.topic, topicAndPartition.partition, replicaId, offset)
partition.getReplica(replicaId)
The offset is used to update the replica: replica.logEndOffset = offset
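partition.updateLeaderHWAndMaybeExpandIsr(replicaId: Int)
If replicaId is in the AR but not in the ISR, and the offset it has read has caught up with or passed the leader's HW offset, the replica is added to the ISR and ZooKeeper is updated.
The node updated in the ZooKeeper cluster is /brokers/topics/{topic}/partitions/{partitionId}/state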
{
"controller_epoch":29,
"leader":2,
"version":1,
"leader_epoch":48,
"isr":[2,3,6]
}
partition.maybeIncrementLeaderHW(leaderReplica: Replica)
Decides whether the leader's high watermark should advance
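The logEndOffset of every replica in partition.inSyncReplicas is collected into a set, and the minimum is taken
This value is compared with the leader's highWatermark; if it is greater than leader.highWatermark, leader.highWatermark is updated to it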
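The high watermark check above can be sketched as a pure function. This is a simplified model that works on plain Long offsets instead of Replica objects; LeaderHighWatermark and the parameter names are hypothetical, and returning the current value for an empty ISR is an assumption made so that the function is total.

```scala
object LeaderHighWatermark {
  // Returns the new leader HW: the minimum LEO across the ISR when that value
  // is larger than the current HW, otherwise the current HW unchanged.
  def maybeIncrementLeaderHW(isrLogEndOffsets: Iterable[Long], currentHw: Long): Long =
    if (isrLogEndOffsets.isEmpty) currentHw
    else math.max(currentHw, isrLogEndOffsets.min)
}
```

With ISR log end offsets of 120, 110 and 130 and a current HW of 100, the result is 110: the HW only moves as far as the slowest in-sync replica has replicated.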