MongoDB consistency protocol: how MongoDB replica set synchronization works

Synchronization process

After choosing which member to sync from, the secondary pulls oplog entries from that member. For each op, it:

1.Applies the op.

2.Writes the op to its own oplog (also local.oplog.rs).

3.Requests the next op.
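
The three steps above can be sketched as a small in-memory loop. This is only an illustration under stated assumptions: the `Node` class, `fetch_ops_after`, and `sync_once` are hypothetical names, the oplog is a plain Python list, and each op is reduced to a key/value set. Real MongoDB oplog entries and the real fetch mechanism are more involved.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical in-memory stand-in for a replica set member."""
    data: dict = field(default_factory=dict)
    oplog: list = field(default_factory=list)  # entries: {"ts", "key", "value"}

    def last_ts(self):
        # Timestamp of the newest op this node has, or 0 if the oplog is empty
        return self.oplog[-1]["ts"] if self.oplog else 0


def fetch_ops_after(source, ts):
    """Return the source's oplog entries whose ts is greater than the given ts."""
    return [op for op in source.oplog if op["ts"] > ts]


def sync_once(secondary, source):
    """Pull every op the secondary is missing and return how many were applied."""
    applied = 0
    for op in fetch_ops_after(source, secondary.last_ts()):
        secondary.data[op["key"]] = op["value"]  # 1. apply the op
        secondary.oplog.append(op)               # 2. write it to the local oplog
        applied += 1                             # 3. move on to the next op
    return applied


# Build a primary with three writes, then let an empty secondary catch up.
primary = Node()
for ts, (k, v) in enumerate([("a", 1), ("b", 2), ("a", 3)], start=1):
    primary.data[k] = v
    primary.oplog.append({"ts": ts, "key": k, "value": v})

secondary = Node()
sync_once(secondary, primary)
```

After the call, the secondary holds the same data and the same oplog as the primary, and a second `sync_once` has nothing left to apply.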

How far has the secondary synced?

The primary can tell how far a secondary has synced from the timestamp of the oplog entries that the secondary requests.

How does primary know where secondary is synced to? Well, secondary is querying primary's oplog for more results. So, if secondary requests an op written at 3pm, primary knows secondary has replicated all ops written before 3pm.

So, it goes like:

1.Do a write on primary.

2.Write is written to the oplog on primary, with a field “ts” saying the write occurred at time t.

3.{getLastError:1,w:2} is called on primary. primary has done the write, so it is just waiting for one more server to get the write (w:2).

4.secondary queries the oplog on primary and gets the op

5.secondary applies the op from time t

6.secondary requests ops with {ts:{$gt:t}} from primary's oplog

7.primary records that secondary has applied up to t because it is requesting ops > t.

8.getLastError notices that primary and secondary both have the write, so w:2 is satisfied and it returns.
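
As a rough illustration of steps 1 to 8, the sketch below models how a primary could infer a secondary's progress from the timestamp in its oplog query. Everything here is an assumption made for the example: the `Primary` class, its method names, and the integer timestamps are hypothetical and are not MongoDB's implementation or API.

```python
class Primary:
    """Hypothetical model of a primary that tracks secondary progress."""

    def __init__(self):
        self.oplog = []      # list of {"ts": int, "op": str}
        self.synced_to = {}  # secondary name -> highest ts known to be replicated

    def write(self, op):
        ts = len(self.oplog) + 1  # stand-in for a real oplog timestamp
        self.oplog.append({"ts": ts, "op": op})
        return ts

    def query_oplog(self, secondary, after_ts):
        # A request for ops with ts > after_ts implies that the secondary
        # already has every op up to and including after_ts.
        self.synced_to[secondary] = max(self.synced_to.get(secondary, 0), after_ts)
        return [e for e in self.oplog if e["ts"] > after_ts]

    def write_concern_met(self, ts, w):
        # The primary itself counts as one of the w members.
        have_it = 1 + sum(1 for t in self.synced_to.values() if t >= ts)
        return have_it >= w


primary = Primary()
t = primary.write("insert {x: 1}")                  # steps 1-2
before = primary.write_concern_met(t, w=2)          # step 3: still waiting
ops = primary.query_oplog("secondary", after_ts=0)  # step 4: secondary gets the op
last = ops[-1]["ts"]                                # step 5: secondary applies it
primary.query_oplog("secondary", after_ts=last)     # step 6: asks for ts > t
after = primary.write_concern_met(t, w=2)           # steps 7-8: w:2 is satisfied
```

Here `before` is `False` and `after` is `True`: the write concern is only satisfied once the secondary asks for ops newer than t.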

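Synchronization principle

If A syncs from B and B syncs from C, how does C know how far A has synced? Look at the oplog read protocol:

When A starts syncing from B, A tells B: I will be syncing from your oplog; if you have any writes, let me know.

B replies: I am not the primary, so let me forward that. B then tells the primary C: treat me as A; I am syncing from you on A's behalf. At this point B has two connections to the primary C, one for itself and one on behalf of A.

When A requests ops (writes) from B, B turns to C on A's behalf, and that completes A's request.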

A            B            C

  <====>

               <====>

<====> is a "real" sync connection. The connection that B holds to C on behalf of A is a "ghost" connection.
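
A minimal sketch of the forwarding idea, assuming a hypothetical `Member` class in which each member records the last timestamp a requester has asked past and, if it is not the primary, relays that information upstream under the requester's name. The class and method names are invented for illustration and do not correspond to MongoDB internals.

```python
class Member:
    """Hypothetical replica set member used to illustrate ghost connections."""

    def __init__(self, name, sync_source=None):
        self.name = name
        self.sync_source = sync_source  # None means this member is the primary
        self.progress = {}              # member name -> last ts it asked past

    def request_ops(self, on_behalf_of, after_ts):
        # Record how far the requesting member has synced.
        self.progress[on_behalf_of] = after_ts
        if self.sync_source is not None:
            # Not the primary: forward upstream on the requester's behalf,
            # which plays the role of the "ghost" connection.
            self.sync_source.request_ops(on_behalf_of, after_ts)


c = Member("C")                 # primary
b = Member("B", sync_source=c)  # B syncs from C
a = Member("A", sync_source=b)  # A syncs from B

b.sync_source.request_ops("B", after_ts=7)  # B's own connection to C
a.sync_source.request_ops("A", after_ts=5)  # A asks B; B forwards to C as "A"
```

After these two calls, C knows the progress of both B and A even though A never talked to C directly.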

Initial sync

An initial sync runs when a new member is added or when a member is resynced.

It consists of the following 7 steps:

1.Check the oplog. If it is not empty, this node does not initial sync, it just starts syncing normally. If the oplog is empty, then initial sync is necessary, continue to step #2.

2.Get the latest oplog time from the source member: call this time start.

3.Clone all of the data from the source member to the destination member.

4.Build indexes on destination. In version 2.0 this is part of the clone step; in 2.2 the indexes are built after the data has been cloned.

5.Get the latest oplog time from the sync target, which is called minValid.

6.Apply the sync target's oplog from start to minValid.

7.Become a "normal" member (transition into secondary state).

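My understanding: the oplog entries between start and minValid are ops that were written on the source while the clone was running and have not yet been applied on the destination, that is, the part that has not yet reached eventual consistency. Applying them is simply an oplog replay.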
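
The seven steps can be sketched over in-memory structures. This is an illustration only, under several assumptions: `initial_sync` and its parameters are hypothetical, the source oplog is assumed to be non-empty, ops are simple key/value sets, the index build is skipped, and the replayed range is taken as start < ts <= minValid.

```python
def initial_sync(dest_data, dest_oplog, source_data, source_oplog, concurrent_writes):
    """Hypothetical initial sync; returns True if an initial sync was performed.

    Oplog entries are dicts of the form {"ts", "key", "value"}.
    concurrent_writes are applied to the source after the clone, to mimic
    writes that arrive while the clone is in progress.
    """
    if dest_oplog:                      # step 1: non-empty oplog, sync normally
        return False
    start = source_oplog[-1]["ts"]      # step 2: latest oplog time on the source
    dest_data.update(source_data)       # step 3: clone all data
    for op in concurrent_writes:        # the source keeps taking writes meanwhile
        source_data[op["key"]] = op["value"]
        source_oplog.append(op)
    # step 4 (index build) is omitted in this sketch
    min_valid = source_oplog[-1]["ts"]  # step 5: latest oplog time after the clone
    for op in source_oplog:             # step 6: replay ops in (start, minValid]
        if start < op["ts"] <= min_valid:
            dest_data[op["key"]] = op["value"]
            dest_oplog.append(op)
    return True                         # step 7: ready to act as a secondary


src_data = {"a": 1}
src_oplog = [{"ts": 1, "key": "a", "value": 1}]
writes = [{"ts": 2, "key": "b", "value": 2}, {"ts": 3, "key": "a", "value": 9}]

dest_data, dest_oplog = {}, []
ran = initial_sync(dest_data, dest_oplog, src_data, src_oplog, writes)
ran_again = initial_sync(dest_data, dest_oplog, src_data, src_oplog, [])
```

The first call clones `{"a": 1}` and then replays the two ops written during the clone, so the destination ends up matching the source. The second call returns `False` because the destination oplog is no longer empty.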

In the source file rs_initialsync.cpp, the initial sync steps are described as follows:

/**
 * Do the initial sync for this member. There are several steps to this process:
 *
 *     Record start time.
 *     Clone.
 *     Set minValid1 to sync target’s latest op time.
 *     Apply ops from start to minValid1, fetching missing docs as needed.
 *     Set minValid2 to sync target’s latest op time.
 *     Apply ops from minValid1 to minValid2.
 *     Build indexes.
 *     Set minValid3 to sync target’s latest op time.
 *     Apply ops from minValid2 to minValid3.
 *
 * At that point, initial s
