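1. Overview
Coordinator is the implementation class of Discovery and is responsible for electing the master during cluster elections.
2. Creation
Coordinator is created by DiscoveryModule: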
discovery = new Coordinator(NODE_NAME_SETTING.get(settings),
    settings, clusterSettings,
    transportService, namedWriteableRegistry, allocationService, masterService, gatewayMetaState::getPersistedState,
    seedHostsProvider, clusterApplier, joinValidators, new Random(Randomness.get().nextLong()), rerouteService,
    electionStrategy, nodeHealthService);
Here seedHostsProvider is:
final SeedHostsProvider seedHostsProvider = hostsResolver -> {
    final List<TransportAddress> addresses = new ArrayList<>();
    for (SeedHostsProvider provider : filteredSeedProviders) {
        addresses.addAll(provider.getSeedAddresses(hostsResolver));
    }
    return Collections.unmodifiableList(addresses);
};
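clusterApplier is the ClusterApplierService of ClusterService.
persistedStateSupplier is GatewayMetaState#getPersistedState.
3. Startup
Startup calls the doStart method, which does the following:
Creates the CoordinationState and sets the current term on peerFinder.
In single-node discovery mode, checks that the local node alone forms a quorum of the last committed votingConfiguration; if the configuration is non-empty and it does not, startup fails with an IllegalStateException.
Builds the initial ClusterState, assigns it to applierState, and sets it as the initial state of clusterApplier.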
protected void doStart() {
    synchronized (mutex) {
        CoordinationState.PersistedState persistedState = persistedStateSupplier.get();
        coordinationState.set(new CoordinationState(getLocalNode(), persistedState, electionStrategy));
        peerFinder.setCurrentTerm(getCurrentTerm());
        configuredHostsResolver.start();
        final ClusterState lastAcceptedState = coordinationState.get().getLastAcceptedState();
        if (lastAcceptedState.metadata().clusterUUIDCommitted()) {
            logger.info("cluster UUID [{}]", lastAcceptedState.metadata().clusterUUID());
        }
        final VotingConfiguration votingConfiguration = lastAcceptedState.getLastCommittedConfiguration();
        if (singleNodeDiscovery &&
            votingConfiguration.isEmpty() == false &&
            votingConfiguration.hasQuorum(Collections.singleton(getLocalNode().getId())) == false) {
            throw new IllegalStateException("cannot start with [" + DiscoveryModule.DISCOVERY_TYPE_SETTING.getKey() + "] set to [" +
                DiscoveryModule.SINGLE_NODE_DISCOVERY_TYPE + "] when local node " + getLocalNode() +
                " does not have quorum in voting configuration " + votingConfiguration);
        }
        ClusterState initialState = ClusterState.builder(ClusterName.CLUSTER_NAME_SETTING.get(settings))
            .blocks(ClusterBlocks.builder()
                .addGlobalBlock(STATE_NOT_RECOVERED_BLOCK)
                .addGlobalBlock(noMasterBlockService.getNoMasterBlock()))
            .nodes(DiscoveryNodes.builder().add(getLocalNode()).localNodeId(getLocalNode().getId()))
            .build();
        applierState = initialState;
        clusterApplier.setInitialState(initialState);
    }
}
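The single-node check in doStart depends on VotingConfiguration.hasQuorum. As a minimal sketch, assuming the usual strict-majority rule (more than half of the configured voting node IDs must appear among the votes), the same condition can be reproduced with plain collections. QuorumSketch and singleNodeStartAllowed are hypothetical names used only for illustration; they are not Elasticsearch classes.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for a voting configuration; not an Elasticsearch class.
final class QuorumSketch {
    private final Set<String> votingNodeIds;

    QuorumSketch(Collection<String> votingNodeIds) {
        this.votingNodeIds = new HashSet<String>(votingNodeIds);
    }

    boolean isEmpty() {
        return votingNodeIds.isEmpty();
    }

    // Assumed rule: votes from configured nodes must form a strict majority.
    boolean hasQuorum(Collection<String> votes) {
        Set<String> intersection = new HashSet<String>(votingNodeIds);
        intersection.retainAll(votes);
        return intersection.size() * 2 > votingNodeIds.size();
    }

    // Mirrors the single-node startup condition: a non-empty configuration
    // in which the local node alone is not a quorum is rejected.
    static boolean singleNodeStartAllowed(QuorumSketch config, String localNodeId) {
        return config.isEmpty() || config.hasQuorum(Collections.singleton(localNodeId));
    }
}
```

Under this rule a node that previously belonged to a three-node voting configuration cannot restart in single-node mode, because one vote out of three is not a majority.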
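4. Election
The election begins with a call to startInitialJoin.
4.1 becomeCandidate
The main steps are:
- Set the current mode to CANDIDATE. The InitialJoinAccumulator is closed.
- Set the current joinAccumulator to a CandidateJoinAccumulator.
- Activate peerFinder (CoordinatorPeerFinder) to discover all nodes taking part in the election.
- Update the PreVoteResponse held by PreVoteCollector (currentTerm: the current term; lastAcceptedTerm: the term of the last accepted cluster state; lastAcceptedVersion: the version of the last accepted cluster state).
SeedHostsResolver resolves the addresses of the nodes taking part in the election. HandshakingTransportAddressConnector then connects asynchronously to each remote master-eligible node, and the mapping from TransportAddress to the remote Peer is stored in the peersByAddress map.
4.1.1 Finding all nodes taking part in the election
PeerFinder finds all nodes that can take part in the election.
Once such a node is found, PeerFinder first establishes a connection to it and then sends a REQUEST_PEERS_ACTION_NAME request.
CoordinatorPeerFinder#onFoundPeersUpdated starts the election scheduler once the discovered peers, together with the local node, form an election quorum; otherwise it closes any pre-voting round and the scheduler.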
protected void onFoundPeersUpdated() {
    synchronized (mutex) {
        final Iterable<DiscoveryNode> foundPeers = getFoundPeers();
        if (mode == Mode.CANDIDATE) {
            final VoteCollection expectedVotes = new VoteCollection();
            foundPeers.forEach(expectedVotes::addVote);
            expectedVotes.addVote(Coordinator.this.getLocalNode());
            final boolean foundQuorum = coordinationState.get().isElectionQuorum(expectedVotes);
            if (foundQuorum) {
                if (electionScheduler == null) {
                    startElectionScheduler();
                }
            } else {
                closePrevotingAndElectionScheduler();
            }
        }
    }
    clusterBootstrapService.onFoundPeersUpdated();
}
private void startElectionScheduler() {
    assert electionScheduler == null : electionScheduler;
    if (getLocalNode().isMasterNode() == false) {
        return;
    }
    final TimeValue gracePeriod = TimeValue.ZERO;
    electionScheduler = electionSchedulerFactory.startElectionScheduler(gracePeriod, new Runnable() {
        @Override
        public void run() {
            synchronized (mutex) {
                if (mode == Mode.CANDIDATE) {
                    final ClusterState lastAcceptedState = coordinationState.get().getLastAcceptedState();
                    if (localNodeMayWinElection(lastAcceptedState) == false) {
                        logger.trace("skip prevoting as local node may not win election: {}",
                            lastAcceptedState.coordinationMetadata());
                        return;
                    }
                    final StatusInfo statusInfo = nodeHealthService.getHealth();
                    if (statusInfo.getStatus() == UNHEALTHY) {
                        logger.debug("skip prevoting as local node is unhealthy: [{}]", statusInfo.getInfo());
                        return;
                    }
                    if (prevotingRound != null) {
                        prevotingRound.close();
                    }
                    prevotingRound = preVoteCollector.start(lastAcceptedState, getDiscoveredNodes());
                }
            }
        }
        @Override
        public String toString() {
            return "scheduling of new prevoting round";
        }
    });
}
4.2 PreVoteCollector
4.2.1 start
Creates a PreVotingRound and starts pre-voting:
public Releasable start(final ClusterState clusterState, final Iterable<DiscoveryNode> broadcastNodes) {
    PreVotingRound preVotingRound = new PreVotingRound(clusterState, state.v2().getCurrentTerm());
    preVotingRound.start(broadcastNodes);
    return preVotingRound;
}
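A PreVoteRequest is sent to the other nodes. On receiving the request, a node updates the maximum term it has seen and returns its own PreVoteResponse. On receiving a response, the sender updates the maximum term it has seen and, once the collected pre-votes form a quorum, starts the formal election.
[Figure: pre-vote sequence diagram]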
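The response handling of a pre-voting round can be sketched as follows. This is a simplified model, not the PreVoteCollector source: PreVoteRoundSketch is a hypothetical class, node IDs are plain strings, and two rules are assumptions stated here and in the comments: a responder whose last accepted state is fresher than the local one does not count as a pre-vote, and a quorum means a strict majority of the voting nodes.

```java
import java.util.HashSet;
import java.util.Set;

// Simplified, hypothetical model of one pre-voting round; not Elasticsearch code.
final class PreVoteRoundSketch {
    private final Set<String> votingNodeIds;
    private final long localLastAcceptedTerm;
    private final long localLastAcceptedVersion;
    private final Set<String> preVotes = new HashSet<String>();
    private long maxTermSeen;
    private boolean electionStarted;

    PreVoteRoundSketch(Set<String> votingNodeIds, long currentTerm,
                       long lastAcceptedTerm, long lastAcceptedVersion) {
        this.votingNodeIds = new HashSet<String>(votingNodeIds);
        this.maxTermSeen = currentTerm;
        this.localLastAcceptedTerm = lastAcceptedTerm;
        this.localLastAcceptedVersion = lastAcceptedVersion;
    }

    // Handles one response; returns true only for the response that triggers the election.
    boolean onResponse(String nodeId, long currentTerm, long lastAcceptedTerm, long lastAcceptedVersion) {
        // Every response, granted or not, can raise the maximum term seen.
        maxTermSeen = Math.max(maxTermSeen, currentTerm);
        // Assumption: a responder with a fresher accepted state does not grant its pre-vote.
        boolean responderIsFresher = lastAcceptedTerm > localLastAcceptedTerm
            || (lastAcceptedTerm == localLastAcceptedTerm && lastAcceptedVersion > localLastAcceptedVersion);
        if (responderIsFresher) {
            return false;
        }
        preVotes.add(nodeId);
        Set<String> counted = new HashSet<String>(preVotes);
        counted.retainAll(votingNodeIds);
        // Assumption: quorum is a strict majority of the voting nodes.
        boolean quorum = counted.size() * 2 > votingNodeIds.size();
        if (quorum && !electionStarted) {
            electionStarted = true; // the election is started at most once per round
            return true;
        }
        return false;
    }

    long maxTermSeen() {
        return maxTermSeen;
    }
}
```

The point of the pre-vote is that the term is only bumped (in the real election that follows) when a quorum is likely to support the candidate, which avoids needless term increases by isolated nodes.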
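4.3 startElection
A StartJoinRequest is created with the local node as sourceNode and a term equal to the larger of the maximum term seen and the current term, plus 1. It is sent to the other nodes in the cluster as a START_JOIN_ACTION_NAME request. A node that receives the StartJoinRequest processes it, sends a JoinRequest, and returns a StartJoinResponse.
[Figure: detailed flowchart of startElection]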
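The term arithmetic can be illustrated with a small sketch. StartJoinSketch and Receiver are hypothetical names. The receiver-side rule shown here (a request is honoured only if its term is strictly greater than the receiver's current term) is an assumption about the protocol added for illustration; it is not taken from the text above.

```java
import java.util.Optional;

// Hypothetical illustration of the start-join term handling; not Elasticsearch code.
final class StartJoinSketch {
    // Term carried by the StartJoinRequest: one more than anything seen so far.
    static long nextElectionTerm(long currentTerm, long maxTermSeen) {
        return Math.max(currentTerm, maxTermSeen) + 1;
    }

    // Minimal receiver that tracks only its current term.
    static final class Receiver {
        private long currentTerm;

        Receiver(long currentTerm) {
            this.currentTerm = currentTerm;
        }

        // Returns the term the join is cast for, or empty if the request is stale.
        // Assumption: only a strictly greater term is accepted.
        Optional<Long> handleStartJoin(long requestTerm) {
            if (requestTerm <= currentTerm) {
                return Optional.empty();
            }
            currentTerm = requestTerm;
            return Optional.of(requestTerm);
        }

        long currentTerm() {
            return currentTerm;
        }
    }
}
```

Taking the maximum before adding 1 means a candidate that has heard of a higher term during pre-voting proposes a term that every responder can still accept.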
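4.4 JoinAccumulator
Join requests are handled differently depending on the node's role. There are three roles (leader, follower, candidate) plus one initial role, each with its own JoinAccumulator implementation.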
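One way to picture the role-dependent handling is a small strategy interface with one implementation per role. The sketch below is a simplification under stated assumptions, not the JoinHelper source: all class names are hypothetical, and the behaviour assumed for each role is that the initial and follower roles reject joins, the candidate queues them until the election is decided, and the leader processes them immediately.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified model of role-dependent join handling.
final class JoinAccumulatorSketch {
    enum Outcome { REJECTED, QUEUED, PROCESSED }

    interface Accumulator {
        Outcome handleJoinRequest(String senderNodeId);
    }

    // Assumed: before the node has become a candidate, joins are rejected.
    static final class InitialAccumulator implements Accumulator {
        public Outcome handleJoinRequest(String senderNodeId) {
            return Outcome.REJECTED;
        }
    }

    // Assumed: a follower is not a valid join target.
    static final class FollowerAccumulator implements Accumulator {
        public Outcome handleJoinRequest(String senderNodeId) {
            return Outcome.REJECTED;
        }
    }

    // Assumed: a leader admits the joining node right away.
    static final class LeaderAccumulator implements Accumulator {
        final List<String> joinedNodes = new ArrayList<String>();

        public Outcome handleJoinRequest(String senderNodeId) {
            joinedNodes.add(senderNodeId);
            return Outcome.PROCESSED;
        }
    }

    // Assumed: a candidate holds joins until it knows whether it won.
    static final class CandidateAccumulator implements Accumulator {
        private final List<String> pending = new ArrayList<String>();

        public Outcome handleJoinRequest(String senderNodeId) {
            pending.add(senderNodeId);
            return Outcome.QUEUED;
        }

        // Called when leaving candidate mode; queued joins survive only if the election was won.
        List<String> close(boolean wonElection) {
            List<String> admitted = wonElection ? new ArrayList<String>(pending) : new ArrayList<String>();
            pending.clear();
            return admitted;
        }
    }
}
```

Swapping the accumulator whenever the mode changes, as becomeCandidate does in section 4.1, keeps the join-handling code free of mode checks.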
4.5 ClusterStateTaskExecutor
The executor that processes cluster state update tasks.
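As a rough sketch of the idea, assume such an executor receives the current state together with a batch of tasks and returns one new state plus a per-task outcome. Everything below is hypothetical and heavily simplified: the "state" is just a sorted list of node IDs, each task is a node ID asking to join, and BatchExecutorSketch is not the Elasticsearch interface.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical, simplified model of a batching task executor.
final class BatchExecutorSketch {
    // Result of one batch: the resulting state plus a per-task flag.
    static final class Result {
        final List<String> nodes;
        final Map<String, Boolean> taskChangedState;

        Result(List<String> nodes, Map<String, Boolean> taskChangedState) {
            this.nodes = nodes;
            this.taskChangedState = taskChangedState;
        }
    }

    // Applies all join tasks to the current node list in one pass and
    // records, per task, whether it changed the state.
    // Tasks are assumed to be distinct node IDs within one batch.
    static Result execute(List<String> currentNodes, List<String> joinTasks) {
        TreeSet<String> nodes = new TreeSet<String>(currentNodes);
        Map<String, Boolean> outcome = new LinkedHashMap<String, Boolean>();
        for (String nodeId : joinTasks) {
            outcome.put(nodeId, nodes.add(nodeId)); // false if the node was already present
        }
        return new Result(Collections.unmodifiableList(new ArrayList<String>(nodes)), outcome);
    }
}
```

Batching matters here because many joins can arrive at once after an election; applying them in one pass yields a single new state to publish rather than one per join.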