How MongoDB updates the synced oplog position

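In MongoDB, replica set members stay consistent by syncing and replaying the oplog. MongoDB uses a pull model: a node actively pulls oplog entries from its sync source, and the pulling node must promptly tell the other nodes the latest point in time it has synced to.
[Figure: two secondaries pulling the oplog from the primary]
As the figure shows, two secondaries pull the oplog from the primary. Whenever a secondary's position changes, it calls replSetUpdatePosition to report the new position to its sync source.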
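The pull-then-report flow described above can be sketched with a toy model. Everything here (OplogEntry, SourceNode, PullingNode, and their members) is hypothetical and exists only for illustration; it is not MongoDB code, and a plain integer stands in for the oplog timestamp.

```cpp
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Toy model only: none of these types exist in MongoDB.
struct OplogEntry {
    std::uint64_t ts;  // stand-in for the oplog timestamp
    std::string op;    // stand-in for the operation payload
};

struct SourceNode {
    std::vector<OplogEntry> oplog;                  // assumed ordered by ts
    std::map<std::string, std::uint64_t> progress;  // member name -> last reported ts

    // Return every entry strictly newer than `after`.
    std::vector<OplogEntry> fetchAfter(std::uint64_t after) const {
        std::vector<OplogEntry> out;
        for (const auto& e : oplog) {
            if (e.ts > after) out.push_back(e);
        }
        return out;
    }

    // Stand-in for handling a replSetUpdatePosition call.
    void updatePosition(const std::string& member, std::uint64_t ts) {
        progress[member] = ts;
    }
};

struct PullingNode {
    std::string name;
    std::uint64_t lastApplied = 0;

    // Pull one batch, "apply" it, and report only if the position moved.
    // Returns true when a report was sent.
    bool syncOnce(SourceNode& source) {
        const auto batch = source.fetchAfter(lastApplied);
        if (batch.empty()) return false;
        lastApplied = batch.back().ts;
        source.updatePosition(name, lastApplied);
        return true;
    }
};
```

Calling syncOnce repeatedly on a node that is already caught up sends nothing, which mirrors the idea that a report is only needed when the position changes.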
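Inside mongod, a dedicated thread named SyncSourceFeedback reports the current node's progress to its sync source. A primary does not need this, because it does not sync data from any other node, and neither do nodes that store no data, such as arbiters. Two classes handle this job, SyncSourceFeedback and Reporter; the figure below shows how they call each other:
[Figure: call relationship between SyncSourceFeedback and Reporter]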

SyncSourceFeedback

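SyncSourceFeedback is responsible for:

  • deciding whether the node needs to report its position to its sync source;
  • handling role changes: for example, once a node goes from secondary to primary, it no longer needs to report its position;
  • handling sync source changes: for example, the node was syncing from node A and later switches to node B;
  • invoking Reporter: SyncSourceFeedback hands the actual reporting work over to Reporter.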
void SyncSourceFeedback::run(executor::TaskExecutor* executor,
                             BackgroundSync* bgsync,
                             ReplicationCoordinator* replCoord) {
    Client::initThread("SyncSourceFeedback");

    HostAndPort syncTarget;

    // keepAliveInterval indicates how frequently to forward progress in the absence of updates.
    Milliseconds keepAliveInterval(0);

    while (true) {  // breaks once _shutdownSignaled is true
        // Check the member state to decide whether a position report is needed
        {  
            while (!_positionChanged && !_shutdownSignaled) {
          
                MemberState state = replCoord->getMemberState();
                if (!(state.primary() || state.startup())) {
                    break;
                }
            }

            // Stop if shutdown has been signaled
            if (_shutdownSignaled) {
                break;
            }

            _positionChanged = false;
        }

        {
            stdx::lock_guard<stdx::mutex> lock(_mtx);
            MemberState state = replCoord->getMemberState();
            if (state.primary() || state.startup()) {
                continue;
            }
        }

        // Check whether the sync source has changed
        const HostAndPort target = bgsync->getSyncTarget();
        if (target.empty()) {
            if (syncTarget != target) {
                syncTarget = target;
            }
            // Loop back around again; the keepalive functionality will cause us to retry
            continue;
        }

        if (syncTarget != target) {
            LOG(1) << "setting syncSourceFeedback to " << target;
            syncTarget = target;
        }

        // Create the Reporter
        Reporter reporter(executor,
                          makePrepareReplSetUpdatePositionCommandFn(replCoord, syncTarget, bgsync),
                          syncTarget,
                          keepAliveInterval,
                          syncSourceFeedbackNetworkTimeoutSecs);
        {
            stdx::lock_guard<stdx::mutex> lock(_mtx);
            if (_shutdownSignaled) {
                break;
            }
            _reporter = &reporter;
        }
        // Report the position upstream
        auto status = _updateUpstream(&reporter);
    }
}

Status SyncSourceFeedback::_updateUpstream(Reporter* reporter) {
    auto syncTarget = reporter->getTarget();

    auto triggerStatus = reporter->trigger();
    if (!triggerStatus.isOK()) {
        warning() << "unable to schedule reporter to update replication progress on " << syncTarget
                  << ": " << triggerStatus;
        return triggerStatus;
    }

    auto status = reporter->join();

    if (!status.isOK()) {
        log() << "SyncSourceFeedback error sending update to " << syncTarget << ": " << status;
    }

    // Sync source blacklisting will be done in BackgroundSync and SyncSourceResolver.

    return status;
}
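The per-iteration decisions made in run() above can be condensed into a small, testable sketch. The names ToyMemberState, ToyAction, and ToyFeedbackLoop are hypothetical, and the sketch leaves out locking, waiting, and shutdown handling; it only mirrors the state check and the sync target bookkeeping visible in the excerpt.

```cpp
#include <string>

// Hypothetical, reduced set of member states for illustration.
enum class ToyMemberState { Startup, Primary, Secondary };

// What one loop iteration ends up doing.
enum class ToyAction { SkipNotReporting, SkipNoSyncSource, Report };

struct ToyFeedbackLoop {
    std::string syncTarget;  // last sync source seen; empty means none

    ToyAction step(ToyMemberState state, const std::string& target) {
        // A primary or a node still starting up has nothing to report.
        if (state == ToyMemberState::Primary || state == ToyMemberState::Startup) {
            return ToyAction::SkipNotReporting;
        }
        // Track sync source changes, including a change to "no source".
        if (syncTarget != target) {
            syncTarget = target;
        }
        // Without a sync source there is nobody to report to.
        if (target.empty()) {
            return ToyAction::SkipNoSyncSource;
        }
        // In the real loop this is where a Reporter is created and used.
        return ToyAction::Report;
    }
};
```

Keeping the decision separate from the thread loop is a choice made for this sketch only, so that each branch can be exercised without threads or timers.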

Reporter

Reporter mainly uses executor::TaskExecutor to handle the command's request, callback, and response.
The command is built by TopologyCoordinator::prepareReplSetUpdatePositionCommand; Reporter::trigger() then starts a command, and Reporter::join() waits for it to finish.

Status Reporter::join() {
    stdx::unique_lock<stdx::mutex> lk(_mutex);
    _condition.wait(lk, [this]() { return !_isActive_inlock(); });
    return _status;
}

Status Reporter::trigger() {
    if (_keepAliveTimeoutWhen != Date_t()) {
        // Reset keep alive expiration to signal handler that it was canceled internally.
        invariant(_prepareAndSendCommandCallbackHandle.isValid());
        _keepAliveTimeoutWhen = Date_t();
        _executor->cancel(_prepareAndSendCommandCallbackHandle);
        return Status::OK();
    } else if (_isActive_inlock()) {
        _isWaitingToSendReporter = true;
        return Status::OK();
    }

    auto scheduleResult =
        _executor->scheduleWork([=](const executor::TaskExecutor::CallbackArgs& args) {
            _prepareAndSendCommandCallback(args, true);
        });

    _status = scheduleResult.getStatus();

    _prepareAndSendCommandCallbackHandle = scheduleResult.getValue();

    return _status;
}

void Reporter::_prepareAndSendCommandCallback(const executor::TaskExecutor::CallbackArgs& args,
                                              bool fromTrigger) {
    // Must call without holding the lock.
    auto prepareResult = _prepareCommand();

    _sendCommand_inlock(prepareResult.getValue(), _updatePositionTimeout);
    if (!_status.isOK()) {
        _onShutdown_inlock();
        return;
    }

    invariant(_remoteCommandCallbackHandle.isValid());
    _prepareAndSendCommandCallbackHandle = executor::TaskExecutor::CallbackHandle();
    _keepAliveTimeoutWhen = Date_t();
}

void Reporter::_sendCommand_inlock(BSONObj commandRequest, Milliseconds netTimeout) {
    LOG(2) << "Reporter sending slave oplog progress to upstream updater " << _target << ": "
           << commandRequest;

    auto scheduleResult = _executor->scheduleRemoteCommand(
        executor::RemoteCommandRequest(_target, "admin", commandRequest, nullptr, netTimeout),
        [this](const executor::TaskExecutor::RemoteCommandCallbackArgs& rcbd) {
            _processResponseCallback(rcbd);
        });

    _status = scheduleResult.getStatus();
    _remoteCommandCallbackHandle = scheduleResult.getValue();
}

As the code above shows, almost all of the work goes through executor::TaskExecutor* const _executor, and the call is ultimately made through executor::TaskExecutor::scheduleRemoteCommand.
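The trigger logic above (start a send when idle, otherwise just set a flag) can be illustrated with a single-threaded toy. ToyExecutor and ToyReporter are hypothetical stand-ins, not MongoDB classes. Re-sending once the in-flight update completes is an assumption about what the _isWaitingToSendReporter flag is for; the excerpt does not show the response handler.

```cpp
#include <deque>
#include <functional>
#include <utility>

// Hypothetical single-threaded stand-in for executor::TaskExecutor.
class ToyExecutor {
public:
    void scheduleWork(std::function<void()> fn) { _queue.push_back(std::move(fn)); }

    // Run queued callbacks until none remain (callbacks may schedule more).
    void runAll() {
        while (!_queue.empty()) {
            auto fn = std::move(_queue.front());
            _queue.pop_front();
            fn();
        }
    }

private:
    std::deque<std::function<void()>> _queue;
};

// Hypothetical, simplified reporter: at most one update in flight at a time.
class ToyReporter {
public:
    explicit ToyReporter(ToyExecutor* executor) : _executor(executor) {}

    // Start a send if idle; otherwise remember that another send is needed.
    void trigger() {
        if (_active) {
            _waitingToSendAgain = true;
            return;
        }
        _active = true;
        _executor->scheduleWork([this] { _send(); });
    }

    bool isActive() const { return _active; }
    int sentCount() const { return _sent; }

private:
    void _send() {
        ++_sent;  // stand-in for scheduleRemoteCommand
        _executor->scheduleWork([this] { _onResponse(); });
    }

    void _onResponse() {
        if (_waitingToSendAgain) {  // assumed behavior: coalesce pending triggers
            _waitingToSendAgain = false;
            _executor->scheduleWork([this] { _send(); });
            return;
        }
        _active = false;  // a blocking join() would wake up at this point
    }

    ToyExecutor* _executor;
    bool _active = false;
    bool _waitingToSendAgain = false;
    int _sent = 0;
};
```

With this toy, three trigger() calls made before the executor runs produce two sends: the first one, plus a single follow-up that covers every trigger that arrived while it was in flight.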

What replSetUpdatePosition sends

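prepareReplSetUpdatePositionCommand is implemented by TopologyCoordinator::prepareReplSetUpdatePositionCommand. It mainly sends the sync source the ApplyTime, the DurableTime, and the other per-member information held in the _memberData vector, one entry per replica set member.
The command as printed in the log looks like this:
[Figure: the replSetUpdatePosition command as printed in the log]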
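Since the log output survives only as an image, here is a rough sketch of the idea of packing per-member progress into one command. ToyMemberData, buildUpdatePositionSketch, and all field names in the output string are illustrative assumptions; they are not taken from the server's actual command format.

```cpp
#include <cstddef>
#include <cstdint>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical per-member record; plain integers stand in for the times.
struct ToyMemberData {
    int memberId;
    std::uint64_t appliedTime;
    std::uint64_t durableTime;
};

// Render one entry per member into a single command-like string.
// The layout is illustrative only, not the real command format.
std::string buildUpdatePositionSketch(const std::vector<ToyMemberData>& members) {
    std::ostringstream out;
    out << "{ replSetUpdatePosition: 1, optimes: [";
    for (std::size_t i = 0; i < members.size(); ++i) {
        const ToyMemberData& m = members[i];
        if (i > 0) out << ", ";
        out << "{ memberId: " << m.memberId
            << ", appliedTime: " << m.appliedTime
            << ", durableTime: " << m.durableTime << " }";
    }
    out << "] }";
    return out.str();
}
```

The point of the sketch is only that a single command carries one progress entry for each member the node knows about, rather than just the sender's own position.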
