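Reposted from: https://zhuanlan.zhihu.com/p/476876447
Reference: https://juejin.cn/post/6907151199141625870
Reference: https://raft.github.io/raft.pdf
Introduction
Apache Ratis is an open-source Java implementation of the Raft consensus protocol with Multi-Raft support. Its website is Apache Ratis, and the code repository is at ASF Git Repos - ratis.git/summary.
Overall architecture diagram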
Client
1. The client can set a different RetryPolicy for each request type (Read/Write/Watch, etc.). The policies are managed by the following data structure:
EnumMap<RaftProtos.RaftClientRequestProto.TypeCase, RetryPolicy> map
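2. Every client request carries a clientId and a callId, which together uniquely identify the request. RaftServer uses these two fields to support exactly-once request semantics.
3. The client caches the Leader of each Raft Group. If a cached Leader has not been accessed for 60s, the cache entry expires automatically.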
Cache<RaftGroupId, RaftPeerId> LEADER_CACHE = CacheBuilder.newBuilder()
.expireAfterAccess(60, TimeUnit.SECONDS)
.maximumSize(1024).build();
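The per-type retry policy lookup from item 1 can be sketched with plain JDK classes. This is a minimal illustration, not the Ratis client code: RequestType, SimpleRetryPolicy and PerTypeRetryPolicies are hypothetical names that stand in for RaftClientRequestProto.TypeCase and RetryPolicy, and the fallback to a default policy is an assumption of the sketch.

```java
import java.util.EnumMap;
import java.util.Map;

class PerTypeRetryPolicies {
  // Hypothetical stand-in for RaftClientRequestProto.TypeCase.
  enum RequestType { READ, WRITE, WATCH }

  // Hypothetical, simplified policy: may the given attempt (1-based) be retried?
  interface SimpleRetryPolicy {
    boolean shouldRetry(int attempt);
  }

  private final Map<RequestType, SimpleRetryPolicy> map = new EnumMap<>(RequestType.class);
  private final SimpleRetryPolicy defaultPolicy;

  PerTypeRetryPolicies(SimpleRetryPolicy defaultPolicy) {
    this.defaultPolicy = defaultPolicy;
  }

  PerTypeRetryPolicies put(RequestType type, SimpleRetryPolicy policy) {
    map.put(type, policy);
    return this;
  }

  // Look up the policy for this request type, falling back to the default (an assumption).
  boolean shouldRetry(RequestType type, int attempt) {
    return map.getOrDefault(type, defaultPolicy).shouldRetry(attempt);
  }

  public static void main(String[] args) {
    PerTypeRetryPolicies policies = new PerTypeRetryPolicies(attempt -> attempt < 3)
        .put(RequestType.WATCH, attempt -> false);
    System.out.println(policies.shouldRetry(RequestType.READ, 2));  // default policy applies
    System.out.println(policies.shouldRetry(RequestType.WATCH, 1)); // WATCH is never retried here
  }
}
```

An EnumMap suits this lookup because the key set is a small, fixed enum.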
RetryCache
RetryCache is a cache for client requests: it keeps the replies of recently received requests. Its data structure is:
private final Cache<ClientInvocationId, CacheEntry> cache;
Each CacheEntry has an expiry time that the user can configure. RaftServerImpl uses the cache as shown here. Note that this cache only optimizes client retries; it does not strictly guarantee exactly-once semantics.
final RetryCacheImpl.CacheQueryResult queryResult = retryCache.queryCache(ClientInvocationId.valueOf(request));
final CacheEntry cacheEntry = queryResult.getEntry();
if (queryResult.isRetry()) {
// if the previous attempt is still pending or it succeeded, return its
// future
replyFuture = cacheEntry.getReplyFuture();
} else {
// not a retry: process the request as a new one (elided)
}
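The retry check above can be illustrated with a small, self-contained sketch. It is not RetryCacheImpl: SimpleRetryCache and InvocationKey are hypothetical names, entries never expire, and replies are plain strings. It only shows the idea that a retry with the same (clientId, callId) pair gets the future of the first attempt instead of running the request again.

```java
import java.util.Objects;
import java.util.UUID;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

class SimpleRetryCache {
  // Hypothetical key type standing in for ClientInvocationId: (clientId, callId).
  static final class InvocationKey {
    private final UUID clientId;
    private final long callId;

    InvocationKey(UUID clientId, long callId) {
      this.clientId = Objects.requireNonNull(clientId);
      this.callId = callId;
    }

    @Override
    public boolean equals(Object o) {
      if (this == o) return true;
      if (!(o instanceof InvocationKey)) return false;
      InvocationKey that = (InvocationKey) o;
      return callId == that.callId && clientId.equals(that.clientId);
    }

    @Override
    public int hashCode() {
      return Objects.hash(clientId, callId);
    }
  }

  private final ConcurrentMap<InvocationKey, CompletableFuture<String>> cache =
      new ConcurrentHashMap<>();

  // The handler runs only for the first attempt; a retry gets the cached future back.
  CompletableFuture<String> submit(InvocationKey key, Supplier<CompletableFuture<String>> handler) {
    return cache.computeIfAbsent(key, k -> handler.get());
  }

  public static void main(String[] args) {
    SimpleRetryCache cache = new SimpleRetryCache();
    AtomicInteger executions = new AtomicInteger();
    InvocationKey key = new InvocationKey(UUID.randomUUID(), 1L);
    Supplier<CompletableFuture<String>> handler =
        () -> CompletableFuture.completedFuture("reply-" + executions.incrementAndGet());
    CompletableFuture<String> first = cache.submit(key, handler);
    CompletableFuture<String> retry = cache.submit(key, handler);
    System.out.println(first == retry);   // the retry reuses the first attempt's future
    System.out.println(executions.get()); // the handler ran once
  }
}
```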
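Raft Peer role management
In the Raft protocol, each node takes one of three roles: Leader, Candidate, or Follower. Ratis implements and manages the role state machine and its transitions in the class org.apache.ratis.server.impl.RoleInfo. The corresponding methods are as follows:
1. Follower role
1. In RaftServerImpl, the method changeToFollower moves the role state machine into the Follower role. It shuts down the state of the previous role (the Leader's heartbeat daemon threads or the Candidate's election thread) and starts the Follower state. The Follower state itself is implemented in FollowerState.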
private synchronized boolean changeToFollower(long newTerm, boolean force, Object reason) {
final RaftPeerRole old = role.getCurrentRole();
final boolean metadataUpdated = state.updateCurrentTerm(newTerm);
if (old != RaftPeerRole.FOLLOWER || force) {
setRole(RaftPeerRole.FOLLOWER, reason);
if (old == RaftPeerRole.LEADER) {
role.shutdownLeaderState(false);
} else if (old == RaftPeerRole.CANDIDATE) {
role.shutdownLeaderElection();
} else if (old == RaftPeerRole.FOLLOWER) {
role.shutdownFollowerState();
}
role.startFollowerState(this, reason);
}
return metadataUpdated;
}
2. ElectionTimeOut: minRpcTimeoutMs and maxRpcTimeoutMs can be set in the configuration. Ratis picks a random value in the range [minRpcTimeoutMs, maxRpcTimeoutMs] as the election timeout.
TimeDuration getRandomElectionTimeout() {
final int min = properties().minRpcTimeoutMs();
final int millis = min + ThreadLocalRandom.current().nextInt(properties().maxRpcTimeoutMs() - min + 1);
return TimeDuration.valueOf(millis, TimeUnit.MILLISECONDS);
}
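3. The Follower records the time of the last RPC. If the ElectionTimeOut has elapsed since then and the election conditions are met, it starts an election by calling changeToCandidate in RaftServerImpl and enters the Candidate role.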
while (isRunning && server.getInfo().isFollower()) {
final TimeDuration electionTimeout = server.getRandomElectionTimeout();
final TimeDuration extraSleep = electionTimeout.sleep();
final boolean isFollower = server.getInfo().isFollower();
synchronized (server) {
if (outstandingOp.get() == 0
&& isRunning
&& lastRpcTime.elapsedTime().compareTo(electionTimeout) >= 0
&& !lostMajorityHeartbeatsRecently()) {
server.changeToCandidate(false);
break;
}
}
}
2. Candidate role
1. In RaftServerImpl, the method changeToCandidate moves the role state machine into the Candidate role and starts the election. The Candidate state is implemented in the class LeaderElection.
synchronized void changeToCandidate(boolean forceStartLeaderElection) {
Preconditions.assertTrue(getInfo().isFollower());
role.shutdownFollowerState();
setRole(RaftPeerRole.CANDIDATE, "changeToCandidate");
if (state.shouldNotifyExtendedNoLeader()) {
stateMachine.followerEvent().notifyExtendedNoLeader(getRoleInfoProto());
}
// start election
role.startLeaderElection(this, forceStartLeaderElection);
}
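2. Leader election runs in a background Daemon; the code is in run() of the LeaderElection class. LeaderElection implements the leader election procedure with two optimizations: Pre-Vote and priority-based election. Both can be disabled through configuration.
- Pre-Vote: keeps network partitions from causing frequent elections. Before the real election, a Pre-Vote round is held; only a Candidate that passes Pre-Vote increments its Term and starts the real election.
- Priority-based election: keeps multiple Candidates from splitting the vote. Every Raft Peer is assigned a priority. When Candidates with different priorities appear at the same time, the higher-priority Candidate forces the lower-priority ones back to Follower.
The election proceeds as follows: a Pre-Vote round first, then the real election if Pre-Vote succeeds; once elected, the state machine enters the Leader role. In each round, the Candidate passes only if it receives votes from a Majority and also from every Peer whose Priority is higher than its own.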
if (skipPreVote || askForVotes(Phase.PRE_VOTE)) {
if (askForVotes(Phase.ELECTION)) {
server.changeToLeader();
}
}
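The pass condition described above (Majority plus the votes of all higher-priority Peers) can be written down as a small predicate. This is a sketch under stated assumptions, not the logic of LeaderElection.waitForResults: ElectionCheckSketch and its parameters are hypothetical, peers are identified by strings, and the candidate is assumed to vote for itself.

```java
import java.util.Map;
import java.util.Set;

class ElectionCheckSketch {
  /**
   * @param priorities priority of every peer in the group, including the candidate
   * @param candidate  id of the candidate
   * @param granted    ids of the other peers that granted their vote
   */
  static boolean passed(Map<String, Integer> priorities, String candidate, Set<String> granted) {
    // The candidate votes for itself, so it starts with one vote.
    final int votes = granted.size() + 1;
    final boolean majority = votes > priorities.size() / 2;

    // Every peer with a strictly higher priority must also have granted its vote.
    final int own = priorities.get(candidate);
    final boolean higherPriorityAgreed = priorities.entrySet().stream()
        .filter(e -> !e.getKey().equals(candidate) && e.getValue() > own)
        .allMatch(e -> granted.contains(e.getKey()));

    return majority && higherPriorityAgreed;
  }

  public static void main(String[] args) {
    Map<String, Integer> priorities = Map.of("s1", 1, "s2", 0, "s3", 0);
    // s2 has a majority (itself + s3) but lacks the vote of the higher-priority s1.
    System.out.println(passed(priorities, "s2", Set.of("s3")));
    // With s1's vote, s2 satisfies both conditions.
    System.out.println(passed(priorities, "s2", Set.of("s1")));
  }
}
```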
Here, the askForVotes method sends the corresponding RequestVote RPC to all Peers in the Raft Group and collects the final result.
final TermIndex lastEntry = server.getState().getLastEntry();
final Executor voteExecutor = new Executor(this, others.size());
try {
final int submitted = submitRequests(phase, electionTerm, lastEntry, others, voteExecutor);
r = waitForResults(phase, electionTerm, submitted, conf, voteExecutor);
} finally {
voteExecutor.shutdown();
}
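submitRequests sends the RPC requests, using a fixed-size thread pool for the network tasks. waitForResults waits for RPC replies until the next ElectionTimeOut and returns the result of this round according to whether the election succeeded (Majority + Priority). If it succeeded, changeToLeader in RaftServerImpl is called to enter the Leader role.
3. Leader role
After winning the election, changeToLeader in RaftServerImpl switches the role state machine to the Leader role. The Leader state is implemented in LeaderState.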
synchronized void changeToLeader() {
Preconditions.assertTrue(getInfo().isCandidate());
role.shutdownLeaderElection();
setRole(RaftPeerRole.LEADER, "changeToLeader");
state.becomeLeader();
// start sending AppendEntries RPC to followers
final LogEntryProto e = role.startLeaderState(this);
getState().setRaftConf(e);
}
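Right after becoming Leader, the new Leader first appends a placeholder entry (in effect a no-op that carries the current configuration). Committing it ensures that all entries from earlier terms are replicated to the Raft Peers and committed, and it also yields an exact Term and Index to report back to the ServerStateMachine.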
LogEntryProto start() {
// In the beginning of the new term, replicate a conf entry in order
// to finally commit entries in the previous term.
// Also this message can help identify the last committed index and the conf.
final LogEntryProto placeHolder = LogProtoUtils.toLogEntryProto(
server.getRaftConf(), server.getState().getCurrentTerm(), raftLog.getNextIndex());
CodeInjectionForTesting.execute(APPEND_PLACEHOLDER,
server.getId().toString(), null);
raftLog.append(placeHolder);
processor.start();
senders.forEach(LogAppender::start);
return placeHolder;
}
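Why a new Leader needs this placeholder follows from the commit rule in the Raft paper: a Leader only commits, by counting replicas, an entry from its own term. The sketch below illustrates that rule with plain arrays. CommitIndexSketch and its parameters are hypothetical and this is not how Ratis computes the commit index internally.

```java
import java.util.Arrays;

class CommitIndexSketch {
  /**
   * @param matchIndex  highest log index known to be stored on each peer, leader included
   * @param termAt      termAt[i] is the term of the entry at index i (slot 0 is unused)
   * @param currentTerm the leader's current term
   * @param oldCommit   the commit index before this update
   */
  static long advance(long[] matchIndex, long[] termAt, long currentTerm, long oldCommit) {
    long[] sorted = matchIndex.clone();
    Arrays.sort(sorted);
    // In ascending order, the element at (n - 1) / 2 is stored on a majority of the n peers.
    long majorityIndex = sorted[(sorted.length - 1) / 2];
    // Only an entry from the current term may be committed by counting replicas.
    for (long n = majorityIndex; n > oldCommit; n--) {
      if (termAt[(int) n] == currentTerm) {
        return n;
      }
    }
    return oldCommit;
  }

  public static void main(String[] args) {
    // New leader in term 3; indices 1..3 hold entries from terms 1, 2, 2.
    long[] before = {0, 1, 2, 2};
    System.out.println(advance(new long[] {3, 3, 1}, before, 3, 0)); // nothing can commit yet
    // After the term-3 placeholder at index 4 reaches a majority, everything up to 4 commits.
    long[] after = {0, 1, 2, 2, 3};
    System.out.println(advance(new long[] {4, 4, 1}, after, 3, 0));
  }
}
```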
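Next, a LogAppender background thread (LogAppenderDefault.run()) is started for each Follower. It sends log entries and snapshots to the Follower so that the Follower catches up with the Leader.
Raft Log management
Apache Ratis implements two log strategies: an in-memory log and a segmented on-disk log. The in-memory log is not recommended and exists mainly for testing, so this section focuses on SegmentedRaftLog.
The Raft Log layer exposes two main primitives, append and get. RaftServer hands entries to the log layer through append, and the StateMachineUpdater and the Leader's LogAppender worker threads fetch the latest entries through get.
1. Overall structure
SegmentedRaftLog stores the log in files on the local disk as segment files, each containing multiple log entries. A single segment file is capped at 8MB, and a single log entry is also capped at 8MB.
When the entries in a segment reach the 8MB limit, the segment file is closed and a new one is opened for subsequent writes. A closed segment file can no longer be modified; it can only be truncated because of a conflict with the Leader.
Ratis provides two caches as performance optimizations:
- An in-memory cache for the currently open segment file, represented by the in-memory class LogSegment. The current policy simply keeps the entire segment in memory and flushes it to disk once it is full.
- A cache for Raft entries, which keeps entries that have been read from disk.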
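The rolling behaviour of a segmented log can be illustrated with a purely in-memory sketch. SegmentedLogSketch is a hypothetical class, not SegmentedRaftLog: it writes no files, counts only payload bytes, and starts indices at 0 for simplicity. It shows the one rule described above: when the next entry would push the open segment past its limit, that segment is closed and a new one is opened.

```java
import java.util.ArrayList;
import java.util.List;

class SegmentedLogSketch {
  static final class Segment {
    final long startIndex;
    final List<byte[]> entries = new ArrayList<>();
    long sizeInBytes = 0;
    boolean open = true;

    Segment(long startIndex) {
      this.startIndex = startIndex;
    }
  }

  private final long maxSegmentBytes;
  private final List<Segment> segments = new ArrayList<>();
  private long nextIndex = 0;

  SegmentedLogSketch(long maxSegmentBytes) {
    this.maxSegmentBytes = maxSegmentBytes;
  }

  long append(byte[] entry) {
    if (entry.length > maxSegmentBytes) {
      throw new IllegalArgumentException("entry exceeds the size limit");
    }
    Segment current = segments.isEmpty() ? null : segments.get(segments.size() - 1);
    if (current == null || current.sizeInBytes + entry.length > maxSegmentBytes) {
      if (current != null) {
        current.open = false; // a closed segment is no longer appended to
      }
      current = new Segment(nextIndex);
      segments.add(current);
    }
    current.entries.add(entry);
    current.sizeInBytes += entry.length;
    return nextIndex++;
  }

  byte[] get(long index) {
    for (Segment s : segments) {
      if (index >= s.startIndex && index < s.startIndex + s.entries.size()) {
        return s.entries.get((int) (index - s.startIndex));
      }
    }
    return null; // no segment contains this index
  }

  int segmentCount() {
    return segments.size();
  }

  public static void main(String[] args) {
    SegmentedLogSketch log = new SegmentedLogSketch(10); // tiny limit to force a roll
    log.append(new byte[4]);
    log.append(new byte[4]);
    long third = log.append(new byte[4]); // 8 + 4 > 10, so a new segment is opened
    System.out.println(third + " " + log.segmentCount());
  }
}
```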
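2. Appending new entries
The Leader appends new entries to RaftLog in the method appendImpl. The method first acquires the log's writeLock and then:
- Before appending an entry whose content is a TransactionContext, lets the TransactionContext run its preAppendTransaction hook for any extra logic.
- Submits an asynchronous appendEntry task to RaftLogSequentialOps.Runner, which guarantees sequential execution.
- Once the task completes, returns the index to the upper-layer StateMachine.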
private long appendImpl(long term, TransactionContext operation) throws StateMachineException {
checkLogState();
try(AutoCloseableLock writeLock = writeLock()) {
final long nextIndex = getNextIndex();
// This is called here to guarantee strict serialization of callback executions in case
// the SM wants to attach a logic depending on ordered execution in the log commit order.
operation = operation.preAppendTransaction();
// build the log entry after calling the StateMachine
final LogEntryProto e = operation.initLogEntry(term, nextIndex);
appendEntry(e);
return nextIndex;
}
}
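The asynchronous appendEntry write works as follows:
- Hand the entry to the in-memory cache of the open segment.
- Update the log entry cache directly so that the newly written entry is cached. Note that these two steps must not be swapped; otherwise the state becomes inconsistent.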
protected CompletableFuture<Long> appendEntryImpl(LogEntryProto entry) {
try(AutoCloseableLock writeLock = writeLock()) {
validateLogEntry(entry);
// find the in-memory cache of the current segment
final LogSegment currentOpenSegment = cache.getOpenSegment();
if (currentOpenSegment == null) {
// no open segment yet: create a new segment file
cache.addOpenSegment(entry.getIndex());
fileLogWorker.startLogSegment(entry.getIndex()); // submit a create-segment-file task to the IO worker
} else if (isSegmentFull(currentOpenSegment, entry)) {
// the current segment is full: flush it and start a new segment
cache.rollOpenSegment(true);
fileLogWorker.rollLogSegment(currentOpenSegment);
}
// If the entry has state machine data, then the entry should be inserted
// to statemachine first and then to the cache. Not following the order
// will leave a spurious entry in the cache.
CompletableFuture<Long> writeFuture =
fileLogWorker.writeLogEntry(entry).getFuture();
cache.appendEntry(entry, LogSegment.Op.WRITE_CACHE_WITHOUT_STATE_MACHINE_CACHE);
return writeFuture;
}
}
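3. Reading the entry at an index
The StateMachineUpdater calls get to obtain committed entries and apply them to the upper-layer StateMachine. The Leader's LogAppender threads call it to read newly appended entries that must be replicated to all Followers.
A read first consults the log entry cache and goes to disk only on a cache miss.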
public LogEntryProto get(long index) throws RaftLogIOException {
checkLogState();
final LogSegment segment;
final LogRecord record;
try (AutoCloseableLock readLock = readLock()) {
segment = cache.getSegment(index);
if (segment == null) {
return null;
}
record = segment.getLogRecord(index);
if (record == null) {
return null;
}
final LogEntryProto entry = segment.getEntryFromCache(record.getTermIndex());
if (entry != null) {
return entry;
}
}
// the entry is not in the segment's cache. Load the cache without holding the lock.
return segment.loadCache(record);
}
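The read path above follows a common pattern: check the cache under a read lock, and on a miss do the slow load without holding the lock. The sketch below shows that pattern with JDK classes only. CacheFirstReader and diskLoader are hypothetical names; the loader stands in for reading an entry from a segment file, and caching the loaded value afterwards is an assumption of the sketch.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.LongFunction;

class CacheFirstReader {
  private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
  private final Map<Long, String> entryCache = new HashMap<>();
  // Hypothetical stand-in for reading an entry from a segment file.
  private final LongFunction<String> diskLoader;

  CacheFirstReader(LongFunction<String> diskLoader) {
    this.diskLoader = diskLoader;
  }

  String get(long index) {
    lock.readLock().lock();
    try {
      String cached = entryCache.get(index);
      if (cached != null) {
        return cached;
      }
    } finally {
      lock.readLock().unlock();
    }
    // Cache miss: do the slow read without holding the lock.
    String loaded = diskLoader.apply(index);
    if (loaded != null) {
      lock.writeLock().lock();
      try {
        entryCache.putIfAbsent(index, loaded);
      } finally {
        lock.writeLock().unlock();
      }
    }
    return loaded;
  }

  public static void main(String[] args) {
    AtomicInteger diskReads = new AtomicInteger();
    CacheFirstReader reader = new CacheFirstReader(i -> {
      diskReads.incrementAndGet();
      return "entry-" + i;
    });
    reader.get(5);
    reader.get(5); // served from the cache
    System.out.println(diskReads.get());
  }
}
```

Loading outside the lock keeps writers from being blocked by disk I/O, at the cost that two concurrent readers may both load the same entry.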
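Apply and Snapshot management: StateMachineUpdater
This dedicated thread continuously tracks the committed log entries and applies them to the upper-layer StateMachine. StateMachineUpdater also takes snapshots based on the state of the log and, after a snapshot succeeds, purges the log entries that are no longer needed. Its work loop can be simplified as follows: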
for(; state != State.STOP; ) {
waitForCommit();
final MemoizedSupplier<List<CompletableFuture<Message>>> futures = applyLog();
checkAndTakeSnapshot(futures);
}
1. Apply Log
The applyLog method hands committed entries to the upper-layer StateMachine:
private MemoizedSupplier<List<CompletableFuture<Message>>> applyLog() throws RaftLogIOException {
final MemoizedSupplier<List<CompletableFuture<Message>>> futures = MemoizedSupplier.valueOf(ArrayList::new);
final long committed = raftLog.getLastCommittedIndex();
for(long applied; (applied = getLastAppliedIndex()) < committed && state == State.RUNNING && !shouldStop(); ) {
final long nextIndex = applied + 1;
final LogEntryProto next = raftLog.get(nextIndex);
final CompletableFuture<Message> f = server.applyLogToStateMachine(next);
final long incremented = appliedIndex.incrementAndGet(debugIndexChange);
}
return futures;
}
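Stripped of Ratis types, the apply loop is a cursor that walks from the last applied index up to the committed index. The following sketch assumes an in-memory list as the log and a Consumer as the state machine; ApplyLoopSketch is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

class ApplyLoopSketch {
  private final List<String> log;  // log.get(i - 1) holds the entry at Raft index i
  private long appliedIndex = 0;   // index of the last applied entry; 0 means none

  ApplyLoopSketch(List<String> log) {
    this.log = log;
  }

  // Applies the entries in (appliedIndex, committed] strictly in index order.
  long applyUpTo(long committed, Consumer<String> stateMachine) {
    while (appliedIndex < committed) {
      final long next = appliedIndex + 1;
      stateMachine.accept(log.get((int) (next - 1)));
      appliedIndex = next;
    }
    return appliedIndex;
  }

  public static void main(String[] args) {
    List<String> applied = new ArrayList<>();
    ApplyLoopSketch updater = new ApplyLoopSketch(Arrays.asList("a", "b", "c"));
    updater.applyUpTo(2, applied::add); // only the committed prefix is applied
    updater.applyUpTo(3, applied::add); // continues where the previous call stopped
    System.out.println(applied);
  }
}
```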
2. Snapshot
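A snapshot is started when the log exceeds 400000 entries or when the user triggers one manually. The snapshot mechanism calls takeSnapshot() on the upper-layer StateMachine. Once the SM has completed and persisted the snapshot, it returns the log index covered by the snapshot to the underlying RaftLog, which can then run a purge task to delete the entries the snapshot covers.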
private void takeSnapshot() {
final long i;
i = stateMachine.takeSnapshot();
takeSnapshotTimerContext.stop();
server.getSnapshotRequestHandler().completeTakingSnapshot(i);
stateMachine.getStateMachineStorage().cleanupOldSnapshots(snapshotRetentionPolicy);
snapshotIndex.updateIncreasingly(i, infoIndexChange);
final long purgeIndex = i;
raftLog.purge(purgeIndex);
}
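The interplay of threshold, snapshot index and purge can be sketched in a few lines. SnapshotPurgeSketch is hypothetical: it keeps the log in a TreeMap, treats every appended entry as already applied, and uses a threshold of 3 where the text mentions 400000. The exact comparison used to trigger a snapshot is an assumption of the sketch.

```java
import java.util.TreeMap;

class SnapshotPurgeSketch {
  private final TreeMap<Long, String> log = new TreeMap<>();
  private final long autoTriggerThreshold;
  private long snapshotIndex = 0; // last index covered by the latest snapshot
  private long appliedIndex = 0;  // last index applied to the state machine

  SnapshotPurgeSketch(long autoTriggerThreshold) {
    this.autoTriggerThreshold = autoTriggerThreshold;
  }

  // For brevity, every appended entry counts as applied immediately.
  void appendAndApply(String entry) {
    appliedIndex++;
    log.put(appliedIndex, entry);
  }

  boolean shouldTakeSnapshot() {
    return appliedIndex - snapshotIndex >= autoTriggerThreshold;
  }

  long takeSnapshotAndPurge() {
    // A real state machine would persist its state here and report the covered index.
    snapshotIndex = appliedIndex;
    // Entries up to and including the snapshot index are no longer needed.
    log.headMap(snapshotIndex, true).clear();
    return snapshotIndex;
  }

  int logSize() {
    return log.size();
  }

  public static void main(String[] args) {
    SnapshotPurgeSketch s = new SnapshotPurgeSketch(3);
    s.appendAndApply("a");
    s.appendAndApply("b");
    s.appendAndApply("c");
    if (s.shouldTakeSnapshot()) {
      System.out.println(s.takeSnapshotAndPurge() + " " + s.logSize());
    }
  }
}
```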
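Building applications on top: StateMachine
To build your own application on Ratis, for example a KV storage service, you implement the basic operation interfaces of StateMachine, listed here.
1. DataApi
By default, an operation and its data are both written to the RaftLog as a log entry. For a data-intensive application this can cause the data to be copied several times, so DataApi is exposed to manage operations and data separately.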
interface DataApi {
DataApi DEFAULT = new DataApi() {};
default CompletableFuture<ByteString> read(LogEntryProto entry);
default CompletableFuture<?> write(LogEntryProto entry);
default CompletableFuture<DataStream> stream(RaftClientRequest request);
default CompletableFuture<?> link(DataStream stream, LogEntryProto entry);
default CompletableFuture<Void> flush(long logIndex);
default CompletableFuture<Void> truncate(long logIndex);
}
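FileStore gives a concrete example. This state machine stores a mapping from filename to file content. For a write, plain Raft would first commit the content to the log, then read it from the log into memory at apply time, and finally write it to the target file. This copies the data several times and lowers I/O efficiency. With DataApi, the data can be written to the file directly before the commit, and a log entry without the data is then committed. The StateMachine's upper-layer logic is responsible for the consistency and durability of the data written ahead of the commit.
In the FileStore example, startTransaction is overridden to turn the write operation into a writecommit operation before the log entry is committed, and write is overridden so that the file content is written to the file when the entry is appended to the log. Finally, apply commits the operation.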
startTransaction();
WriteLog() {
StateMachine.write();
writeRaftLog();
}
apply() {
commitStateMachineWrite();
}
2. EventApi
EventApi provides hooks through which the underlying Raft layer notifies the upper-layer StateMachine of state changes (for example, a Leader change).
interface EventApi {
EventApi DEFAULT = new EventApi() {};
default void notifyLeaderChanged(RaftGroupMemberId groupMemberId, RaftPeerId newLeaderId) {}
default void notifyTermIndexUpdated(long term, long index) {}
default void notifyConfigurationChanged(long term, long index, RaftConfigurationProto newRaftConfiguration) {}
default void notifyGroupRemove() {}
default void notifyLogFailed(Throwable cause, LogEntryProto failedEntry) {}
}
3. LeaderEventApi
LeaderEventApi provides hooks that notify the upper-layer SM of special events while the current Peer is the Leader.
interface LeaderEventApi {
LeaderEventApi DEFAULT = new LeaderEventApi() {};
default void notifyFollowerSlowness(RoleInfoProto roleInfoProto) {}
default void notifyNotLeader(Collection<TransactionContext> pendingEntries){}
}
4. FollowerEventApi
FollowerEventApi provides hooks that notify the upper-layer SM of special events while the current Peer is a Follower.
interface FollowerEventApi {
FollowerEventApi DEFAULT = new FollowerEventApi() {};
default void notifyExtendedNoLeader(RoleInfoProto roleInfoProto) {}
default CompletableFuture<TermIndex> notifyInstallSnapshotFromLeader(
RoleInfoProto roleInfoProto, TermIndex firstTermIndexInLog) {}
}
5. Lifecycle interface
void initialize(RaftServer raftServer, RaftGroupId raftGroupId, RaftStorage storage);
LifeCycle.State getLifeCycleState();
void pause();
void reinitialize() throws IOException;
6. Snapshot interface
SnapshotInfo getLatestSnapshot();
void cleanupOldSnapshots(SnapshotRetentionPolicy snapshotRetentionPolicy) throws IOException;
long takeSnapshot() throws IOException;
7. State machine query interface
CompletableFuture<Message> query(Message request);
CompletableFuture<Message> queryStale(Message request, long minIndex);
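8. State machine update interface
Note that applyTransaction is the interface RaftLog calls for committed entries. It is always called in log order (linearizable semantics), but the upper-layer StateMachine decides how each transaction is executed, so it can use asynchrony and parallelism where appropriate to improve throughput.
// convert the client request into a TransactionContext
TransactionContext startTransaction(RaftClientRequest request) throws IOException;
// extra logic that may run before the entry is appended
TransactionContext preAppendTransaction(TransactionContext trx) throws IOException;
// notify the user that the transaction failed
TransactionContext cancelTransaction(TransactionContext trx) throws IOException;
TransactionContext applyTransactionSerial(TransactionContext trx) throws InvalidProtocolBufferException;
// called in log order; the SM decides how the trx is actually executed
CompletableFuture<Message> applyTransaction(TransactionContext trx);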
9. Example: CounterStateMachine
CounterStateMachine implements a very simple application state machine: it manages an Integer counter and accepts two operations, Get and Increment.
Get: override the query interface
public CompletableFuture<Message> query(Message request) {
String msg = request.getContent().toString(Charset.defaultCharset());
assertEquals(msg, "GET");
return CompletableFuture.completedFuture(
Message.valueOf(counter.toString()));
}
Increment: override the applyTransaction interface
public CompletableFuture<Message> applyTransaction(TransactionContext trx) {
final RaftProtos.LogEntryProto entry = trx.getLogEntry();
//check if the command is valid
String logData = entry.getStateMachineLogEntry().getLogData()
.toString(Charset.defaultCharset());
assertEquals(logData, "INCREMENT");
//update the last applied term and index
final long index = entry.getIndex();
updateLastAppliedTermIndex(entry.getTerm(), index);
//actual execution of the command: increment the counter
counter.incrementAndGet();
//return the new value of the counter to the client
final CompletableFuture<Message> f =
CompletableFuture.completedFuture(Message.valueOf(counter.toString()));
return f;
}
Snapshot interface:
public long takeSnapshot() {
//get the last applied index
final TermIndex last = getLastAppliedTermIndex();
//create a file with a proper name to store the snapshot
final File snapshotFile =
storage.getSnapshotFile(last.getTerm(), last.getIndex());
//serialize the counter object and write it into the snapshot file
try (ObjectOutputStream out = new ObjectOutputStream(
new BufferedOutputStream(new FileOutputStream(snapshotFile)))) {
out.writeObject(counter);
} catch (IOException ioe) {
LOG.warn("Failed to write snapshot file \"" + snapshotFile
+ "\", last applied index=" + last);
}
//return the index of the stored snapshot (which is the last applied one)
return last.getIndex();
}
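takeSnapshot only shows the write side. As a complement, the following sketch round-trips a counter through Java serialization to show what restoring from such a snapshot file could look like. CounterSnapshotSketch and its methods are hypothetical; the file here is a temporary file rather than one obtained from the state machine storage, and checked exceptions are wrapped to keep the example short.

```java
import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.util.concurrent.atomic.AtomicInteger;

class CounterSnapshotSketch {
  // Write side: same stream stack as the takeSnapshot() example.
  static void save(AtomicInteger counter, File file) {
    try (ObjectOutputStream out = new ObjectOutputStream(
        new BufferedOutputStream(new FileOutputStream(file)))) {
      out.writeObject(counter);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Read side: deserialize the counter that save() wrote.
  static AtomicInteger load(File file) {
    try (ObjectInputStream in = new ObjectInputStream(
        new BufferedInputStream(new FileInputStream(file)))) {
      return (AtomicInteger) in.readObject();
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    } catch (ClassNotFoundException e) {
      throw new IllegalStateException(e);
    }
  }

  // Saves the counter to a temporary file and reads it back.
  static AtomicInteger roundTrip(AtomicInteger counter) {
    try {
      File file = File.createTempFile("counter-snapshot", ".bin");
      try {
        save(counter, file);
        return load(file);
      } finally {
        file.delete();
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  public static void main(String[] args) {
    System.out.println(roundTrip(new AtomicInteger(42)).get());
  }
}
```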
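Multi-Raft implementation
Membership change within a single group
Ratis allows several Peers to be changed at once, using the two-phase membership change via Config(Old, New) described in the Raft paper. The entry point is: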
public RaftClientReply setConfiguration(SetConfigurationRequest request) throws IOException {
return waitForReply(request, setConfigurationAsync(request));
}
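The corresponding setConfigurationAsync can be simplified as follows. It first initializes the data structures the new Peers need, such as RPC addresses and LogAppender threads.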
final RaftConfigurationImpl current = getRaftConf();
getRaftServer().addRaftPeers(peersInNewConf);
// add staging state into the leaderState
pending = leaderState.startSetConfiguration(request);
Collection<RaftPeer> newPeers = configurationStagingState.getNewPeers();
// set the staging state
this.stagingState = configurationStagingState;
if (newPeers.isEmpty()) {
applyOldNewConf();
} else {
// update the LeaderState's sender list
addAndStartSenders(newPeers);
}
Once initialization is complete, a Config(Old, New) entry is appended to the log and waits to be applied.
final ServerState state = server.getState();
final RaftConfigurationImpl current = state.getRaftConf();
final RaftConfigurationImpl oldNewConf= stagingState.generateOldNewConf(current, state.getLog().getNextIndex());
// apply the (old, new) configuration to log, and use it as the current conf
long index = state.getLog().append(state.getCurrentTerm(), oldNewConf);
updateConfiguration(index, oldNewConf);
When the upper layer applies the log, it picks up this entry, writes the latest Config to the persistent file, and starts using it.
if (next.hasConfigurationEntry()) {
// the reply should have already been set. only need to record
// the new conf in the metadata file and notify the StateMachine.
state.writeRaftConfiguration(next);
stateMachine.event().notifyConfigurationChanged(next.getTerm(), next.getIndex(), next.getConfigurationEntry());
}
Note that during the transition, Config(Old, New) changes the condition for winning an election: a Peer must obtain a Majority in both the Old and the New Config to win.
boolean hasMajority(Collection<RaftPeerId> others, RaftPeerId selfId) {
Preconditions.assertTrue(!others.contains(selfId));
return conf.hasMajority(others, selfId) &&
(oldConf == null || oldConf.hasMajority(others, selfId));
}
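A worked example makes the joint-majority rule concrete. The sketch below uses plain string sets instead of Ratis configuration objects; JointConfSketch is a hypothetical name, and the voters set is assumed to include the candidate itself.

```java
import java.util.Set;

class JointConfSketch {
  // True if more than half of the peers in conf are among the voters.
  static boolean hasMajority(Set<String> conf, Set<String> voters) {
    long count = conf.stream().filter(voters::contains).count();
    return count > conf.size() / 2;
  }

  // During a transition (oldConf != null), both configurations must reach a majority.
  static boolean hasJointMajority(Set<String> oldConf, Set<String> newConf, Set<String> voters) {
    return hasMajority(newConf, voters) && (oldConf == null || hasMajority(oldConf, voters));
  }

  public static void main(String[] args) {
    Set<String> oldConf = Set.of("s1", "s2", "s3");
    Set<String> newConf = Set.of("s3", "s4", "s5");
    // All of the new configuration agrees, but only s3 of the old one: not enough.
    System.out.println(hasJointMajority(oldConf, newConf, Set.of("s3", "s4", "s5")));
    // s1, s2, s3 form a majority of old; s3, s4 form a majority of new.
    System.out.println(hasJointMajority(oldConf, newConf, Set.of("s1", "s2", "s3", "s4")));
  }
}
```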
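Multi-group management
The RaftServer interface defines all the operations a Raft server has to provide; the corresponding implementation class is RaftServerImpl. This class implements a Raft-based server, in which RoleInfo represents the server's role, StateMachine represents the data state machine of the upper-layer application, and RaftLog represents the server's Raft log.
RaftServerProxy also implements the RaftServer interface and is the entry point of the Multi-Raft implementation. It maintains a map from RaftGroupId to RaftServerImpl (held as a CompletableFuture<RaftServerImpl>, as in addNew) to relate each RaftGroup to this server's member instance in it. Every Member of every Group is a RaftServerImpl with its own RaftLog and StateMachine.
The factory method build() ultimately returns a RaftServerProxy instance: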
public RaftServer build() throws IOException {
return newRaftServer(
serverId,
group,
Objects.requireNonNull(stateMachineRegistry , "Neither 'stateMachine' nor 'setStateMachineRegistry' " +
"is initialized."),
Objects.requireNonNull(properties, "The 'properties' field is not initialized."),
parameters);
}
RaftServerProxy accepts the corresponding client requests; the entry point for changing group management is:
public CompletableFuture<RaftClientReply> groupManagementAsync(GroupManagementRequest request) {
final RaftGroupId groupId = request.getRaftGroupId();
final GroupManagementRequest.Add add = request.getAdd();
if (add != null) {
return groupAddAsync(request, add.getGroup());
}
final GroupManagementRequest.Remove remove = request.getRemove();
if (remove != null) {
return groupRemoveAsync(request, remove.getGroupId(),
remove.isDeleteDirectory(), remove.isRenameDirectory());
}
}
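Taking Add Group as an example, the RaftServerProxy that receives the request creates a new RaftServerImpl and starts it as a Follower. Presumably, then, adding a new Group requires sending an AddGroup request to every Peer involved, and the same holds for removing a group.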
impls.addNew(newGroup)
.thenApplyAsync(newImpl -> {
final boolean started = newImpl.start();
return newImpl.newSuccessReply(request);
});
synchronized CompletableFuture<RaftServerImpl> addNew(RaftGroup group) {
final RaftGroupId groupId = group.getGroupId();
final CompletableFuture<RaftServerImpl> newImpl = newRaftServerImpl(group);
final CompletableFuture<RaftServerImpl> previous = map.put(groupId, newImpl);
return newImpl;
}
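The proxy's bookkeeping can be sketched without Ratis types. In the sketch below, MultiGroupProxySketch and GroupServer are hypothetical stand-ins for RaftServerProxy and RaftServerImpl, group ids are plain UUIDs, and rejecting a duplicate group id with a failed future is an assumption of the sketch rather than a statement about Ratis.

```java
import java.util.UUID;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

class MultiGroupProxySketch {
  // Hypothetical stand-in for RaftServerImpl: one instance per group on this server.
  static final class GroupServer {
    final UUID groupId;
    volatile String role = "NOT_STARTED";

    GroupServer(UUID groupId) {
      this.groupId = groupId;
    }

    void start() {
      role = "FOLLOWER"; // a newly added member starts as a Follower
    }
  }

  private final ConcurrentMap<UUID, CompletableFuture<GroupServer>> impls =
      new ConcurrentHashMap<>();

  CompletableFuture<GroupServer> addGroup(UUID groupId) {
    CompletableFuture<GroupServer> created = new CompletableFuture<>();
    if (impls.putIfAbsent(groupId, created) != null) {
      // Assumption of this sketch: adding an existing group fails.
      CompletableFuture<GroupServer> failed = new CompletableFuture<>();
      failed.completeExceptionally(new IllegalStateException("group already exists: " + groupId));
      return failed;
    }
    GroupServer server = new GroupServer(groupId);
    server.start();
    created.complete(server);
    return created;
  }

  boolean removeGroup(UUID groupId) {
    return impls.remove(groupId) != null;
  }

  int groupCount() {
    return impls.size();
  }

  public static void main(String[] args) {
    MultiGroupProxySketch proxy = new MultiGroupProxySketch();
    UUID g1 = UUID.randomUUID();
    System.out.println(proxy.addGroup(g1).join().role);
    System.out.println(proxy.addGroup(g1).isCompletedExceptionally()); // duplicate add is rejected
  }
}
```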