初见mongodb: find命令源码一撇

在上周尽管磕磕碰碰地完成了一次查mongo询任务,但总想一谈工具背后的真相,即为什么Mongodb不支持sql?虽然我知道它是NoSQL系列的, 但还是好奇着它的查询是怎么实现的. 上网找了mongodb的代码,最新版的是4.0.5. https://fastdl.mongodb.org/src/mongodb-src-r4.0.5.zip
   匆匆一瞥,尽管Mongodb的源码有超过一万个C/C++的文件,超过三百万行代码.但是带着问题找源码还是比较容易的,何况我只是对它的查询感到好奇而已.

准备

进入了源代码src/mongo, 这时c/c++代码已经剩下不到4千个文件了.

它是一个数据库服务器, 于是直奔db目录. 大概浏览一下文件,

发现了代码入口 dbmain.cpp.

从这里开始,不打算按部就班地阅读代码,那样工作量太大.

一般服务器的工作流程

首先, Mongodb作为一个数据库服务器,也符合一般服务器的工作方式.

Yes
NO
main
初始化,读配置文件等
启动网络等待
等待网络连接
启动客户端

服务器工作流程在mongodb中的实现

顺着这个逻辑,很容易找到mongdo的执行流程
在src/mongo/db/目录下

main, dbmain.cpp
mongoDbMain,db.cpp
initAndListen
_initAndListen
serviceContext->getTransportLayer()->start()
serviceContext->getServiceEntryPoint()->start()

最后的代码是:

ExitCode _initAndListen(int listenPort) {
    Client::initThread("initandlisten");
	...
		start = serviceContext->getTransportLayer()->start();
        if (!start.isOK()) {
            error() << "Failed to start the listener: " << start.toString();
            return EXIT_NET_ERROR;
        }

        start = serviceContext->getServiceEntryPoint()->start();
        if (!start.isOK()) {
            error() << "Failed to start the service entry point: " << start;
            return EXIT_NET_ERROR;
        }
	....
	return waitForShutdown();
和预期的不一样,没有找到循环语句. 似乎线索在这里中断了. 但其实并不怎么影响代码阅读, 回过头来检查代码
int mongoDbMain(int argc, char* argv[], char** envp) {
    ....
	//在这里设置了ServiceEntryPointMongod 作为ServiceEntryPoint
    service->setServiceEntryPoint(std::make_unique<ServiceEntryPointMongod>(service));
	...
	ExitCode exitCode = initAndListen(serverGlobalParams.port);
    exitCleanly(exitCode);
    return 0;
}
ExitCode _initAndListen(int listenPort) {
	....
	
	auto tl = //在这里创建TransportLayer
		transport::TransportLayerManager::createWithConfig(&serverGlobalParams, serviceContext);
	auto res = tl->setup();
	if (!res.isOK()) {
		error() << "Failed to set up listener: " << res;
		return EXIT_NET_ERROR;
	}
	serviceContext->setTransportLayer(std::move(tl));//在这里设置了TransportLayer
std::unique_ptr<TransportLayer> TransportLayerManager::createWithConfig(
 	...
	//在这里创建了TransportLayerASIO
    auto transportLayerASIO = stdx::make_unique<transport::TransportLayerASIO>(opts, sep);
    transportLayer = std::move(transportLayerASIO);
	.....
    std::vector<std::unique_ptr<TransportLayer>> retVector;
    retVector.emplace_back(std::move(transportLayer));
    return stdx::make_unique<TransportLayerManager>(std::move(retVector));
}

于是在src/mongo/transport/transport_layer_asio.cpp 里发现消息循环

Status TransportLayerASIO::start() {
	.....
	_listenerThread = stdx::thread([this] {
		setThreadName("listener");
		while (_running.load()) {//消息循环在这里
			_acceptorReactor->run();
		}
	});
	....
	return Status::OK();
}
void TransportLayerASIO::_acceptConnection(GenericAcceptor& acceptor) {
   ....
        try {
            std::shared_ptr<ASIOSession> session(
				//在这里创建新的Session,代表一个客户端连接
                new ASIOSession(this, std::move(peerSocket), true));
            _sep->startSession(std::move(session));
        } catch (const DBException& e) {
            warning() << "Error accepting new connection " << e;
        }

        _acceptConnection(acceptor);
    };

    acceptor.async_accept(*_ingressReactor, std::move(acceptCb));
}
在这里处理客户端请求
DbResponse ServiceEntryPointMongod::handleRequest(OperationContext* opCtx, const Message& m) {
    return ServiceEntryPointCommon::handleRequest(opCtx, m, Hooks{});
}
在这里解析客户端命令
//service_entry_point_common.cpp
DbResponse ServiceEntryPointCommon::handleRequest(OperationContext* opCtx,
                                                  const Message& m,
                                                  const Hooks& behaviors) {
	.....
    DbMessage dbmsg(m);
	.....
    DbResponse dbresponse;
    if (op == dbMsg || op == dbCommand || (op == dbQuery && isCommand)) {
        dbresponse = receivedCommands(opCtx, m, behaviors); //这是一个非常重要的函数,查询命令, aggregate,map/reduce 命令都会经过这里
    }
	....
//service_entry_point_common.cpp
DbResponse receivedCommands(OperationContext* opCtx,
                            const Message& message,
                            const ServiceEntryPointCommon::Hooks& behaviors) {
	....
	// 获得command
	c = CommandHelpers::findCommand(request.getCommandName()
	....
	//执行真正的命令
	execCommandDatabase(opCtx, c, request, replyBuilder.get(), behaviors);
	....
回顾一下:
  • MongoDB的源代码中,将网络传输层包装了起来. 抽象成了TransportLayer, 目前只有一个具体的子类TransportLayerASIO. 以后也许可能会增加新的传输层.
  • 对协议的命令抽象成了ServiceEntryPoint, DB服务器用到的是ServiceEntryPointMongod. 处理请求的函数是ServiceEntryPointMongod::handleRequest,它直接转手给了ServiceEntryPointCommon::handleRequest.
    也就是, ServiceEntryPointCommon::handleRequest将是查询命令故事开始的地方.

查询语句find的处理:

还是一样的不打算按部就班地阅读代码. 通过简单的list命令发现了commands目录,有理由相信所有命令的处理都在这里.
发现find_cmd.cpp

//find_cmd.cpp
class FindCmd : public BasicCommand {
public:
	//看上去像是注册了find命令.
	//另外发现run函数,有理由相信run是find的处理函数 
    FindCmd() : BasicCommand("find") {}
	....
}

run 函数前的注释说明了find的执行过程.从注释中可以知道,find命令按条件过滤完数据(文档)后, 没有做其他处理,直接返回给了客户端.
也就是说,没法实现SQL语句的group by, Sum等复杂用法.
也可以顺着代码大致看一眼.

//find_cmd.cpp

/**
     * Runs a query using the following steps:
     *   --Parsing.
     *   --Acquire locks.
     *   --Plan query, obtaining an executor that can run it.
     *   --Generate the first batch.
     *   --Save state for getMore, transferring ownership of the executor to a ClientCursor.
     *   --Generate response to send to the client.
     */
bool run(OperationContext* opCtx,
		 const std::string& dbname,
		 const BSONObj& cmdObj,
		 BSONObjBuilder& result) override {
	......
	//从这里才拿到Request,也就是查询还没开始.
	const QueryRequest& originalQR = exec->getCanonicalQuery()->getQueryRequest();

	// Stream query results, adding them to a BSONArray as we go.
	CursorResponseBuilder firstBatch(/*isInitialResponse*/ true, &result);
	BSONObj obj;
	PlanExecutor::ExecState state = PlanExecutor::ADVANCED;
	long long numResults = 0;
	while (!FindCommon::enoughForFirstBatch(originalQR, numResults) &&
		   PlanExecutor::ADVANCED == (state = exec->getNext(&obj, NULL))) {
		// If we can't fit this result inside the current batch, then we stash it for later.
		if (!FindCommon::haveSpaceForNext(obj, numResults, firstBatch.bytesUsed())) {
			exec->enqueue(obj);
			break;
		}

		// Add result to output buffer.
		firstBatch.append(obj);//将文档加入firstBatch!!!!!,值得关注firstBatch的动向
		numResults++;
	}
	..........
	// Generate the response object to send to the client.
		//已经将firstBatch发送给客户端了,中间没什么处理. 
		//也就是只能在exec->getNext里根据find参数过滤文档了
	firstBatch.done(cursorId, nss.ns());
	return true;
}

其实至此已经能远远地看清MongoDB find命令的执行过程了. 也能看到它的查询处理比较简单,仅仅只是执行了简单的文档过滤之后就将结果返回给客户端了.没有笛卡尔乘积的相关代码,做不了复杂的关联查询.

只不过我们要更严谨一点. 要弄清

  1. 命令的注册过程,确定find命令一定是由find_cmd.cpp来完成的.
  2. 从网络消息到find命令处理的中间步骤过程
  3. exec->getNext真的执行了过滤操作?

下面开始填坑过程:

find命令的注册:

//find_cmd.cpp
class FindCmd : public BasicCommand {
public:
   FindCmd() : BasicCommand("find") {}
	....
} findCmd;// 这里生成了一个全局对象findCmd, 同时会调用FindCmd的构造函数,
//进而调用BasicCommand的构造函数, 再调用Command的构造函数,完成命令的注册
// commands.h
class BasicCommand : public Command {
// commands.h
/**
 * Serves as a base for server commands. See the constructor for more details.
 */
class Command {
public:
...
    /**
     * Constructs a new command and causes it to be registered with the global commands list. It is
     * not safe to construct commands other than when the server is starting up.
     *
     * @param oldName an optional old, deprecated name for the command
     */
    Command(StringData name, StringData oldName = StringData());
// commands.cpp
Command::Command(StringData name, StringData oldName)
	...
	//在这里注册了命令与处理函数
    globalCommandRegistry()->registerCommand(this, name, oldName);
}
// commands.cpp
void CommandRegistry::registerCommand(Command* command, StringData name, StringData oldName) {
    for (StringData key : {name, oldName}) {
        ...
        auto hashedKey = CommandMap::HashedKey(key);
		...
        _commands[hashedKey] = command;
    }
}

从网络请求到find的处理, find命令的解析:

// service_entry_point_common.cpp
DbResponse ServiceEntryPointCommon::handleRequest(OperationContext* opCtx,
                                                  const Message& m,
                                                  const Hooks& behaviors) {
	...
	if (op == dbMsg || op == dbCommand || (op == dbQuery && isCommand)) {
      dbresponse = receivedCommands(opCtx, m, behaviors);
	...
// service_entry_point_common.cpp
DbResponse receivedCommands(OperationContext* opCtx,
                            const Message& message,
                            const ServiceEntryPointCommon::Hooks& behaviors) {
    auto replyBuilder = rpc::makeReplyBuilder(rpc::protocolForMessage(message));
    [&] {
		...
		//通过"find", 获得全局对象findCmd
		c = CommandHelpers::findCommand(request.getCommandName())
		...
	}();//挺有意思的C++11语法, 匿名函数调用
//src/mongo/db/commands.cpp
Command* CommandHelpers::findCommand(StringData name) {
    return globalCommandRegistry()->findCommand(name);
}
.....
Command* CommandRegistry::findCommand(StringData name) const {
    auto it = _commands.find(name);
    if (it == _commands.end())
        return nullptr;
    return it->second;
}

从网络请求到find的处理, find命令的调用:

// service_entry_point_common.cpp
DbResponse receivedCommands(OperationContext* opCtx,
                            const Message& message,
                            const ServiceEntryPointCommon::Hooks& behaviors) {
    auto replyBuilder = rpc::makeReplyBuilder(rpc::protocolForMessage(message));
    [&] {
		...
		//通过"find", 获得全局对象findCmd
		c = CommandHelpers::findCommand(request.getCommandName())
		...
		//在这里执行命令
		runCommandImpl(opCtx,
							invocation.get(),
							request,
							replyBuilder,
							startOperationTime,
							behaviors,
							&extraFieldsBuilder,
							sessionOptions)) ;
		...
	}()
}

bool runCommandImpl(OperationContext* opCtx,
                    CommandInvocation* invocation,
                    const OpMsgRequest& request,
                    rpc::ReplyBuilderInterface* replyBuilder,
                    LogicalTime startOperationTime,
                    const ServiceEntryPointCommon::Hooks& behaviors,
                    BSONObjBuilder* extraFieldsBuilder,
                    const boost::optional<OperationSessionInfoFromClient>& sessionOptions) {
	....
	// 从这里进入FindCmd::run()
	invocation->run(opCtx, &crb);
	..
}

数据的查询过滤,以下开启gdb大法,一路跟踪下去了:

// commands/find_cmd.cpp
bool run(OperationContext* opCtx,
		 const std::string& dbname,
		 const BSONObj& cmdObj,
		 BSONObjBuilder& result) override {
	...
	// 注意: exec->getNext
	while (!FindCommon::enoughForFirstBatch(originalQR, numResults) &&
		   PlanExecutor::ADVANCED == (state = exec->getNext(&obj, NULL))) {
		// If we can't fit this result inside the current batch, then we stash it for later.
		if (!FindCommon::haveSpaceForNext(obj, numResults, firstBatch.bytesUsed())) {
			exec->enqueue(obj);
			break;
		}

		// Add result to output buffer.
		firstBatch.append(obj);
		numResults++;
	}
	...
}
// query/plan_executor.cpp
PlanExecutor::ExecState PlanExecutor::getNext(BSONObj* objOut, RecordId* dlOut) {
    ...
    ExecState state = getNextImpl(objOut ? &snapshotted : NULL, dlOut);
	...
    return state;
}
...
PlanExecutor::ExecState PlanExecutor::getNextImpl(Snapshotted<BSONObj>* objOut, RecordId* dlOut) {
	.....
	//  _root->work是真正干活的地方
	PlanStage::StageState code = _root->work(&id);
	.....
}
// exec/plan_stage.cpp
PlanStage::StageState PlanStage::work(WorkingSetID* out) {
    ...
    StageState workResult = doWork(out);
	...
    return workResult;
}
// exec/collection_scan.cpp
PlanStage::StageState CollectionScan::doWork(WorkingSetID* out) {
 ....
 record = _cursor->next();
...
	return returnIfMatches(member, id, out);
}
........
PlanStage::StageState CollectionScan::returnIfMatches(WorkingSetMember* member,
                                                      WorkingSetID memberID,
                                                      WorkingSetID* out) {
    ++_specificStats.docsTested;

    if (Filter::passes(member, _filter)) {
  		//找到了
        return PlanStage::ADVANCED;
    } else if (_endCondition && Filter::passes(member, _endCondition.get())) {
        //文件结尾
        return PlanStage::IS_EOF;
    } else {
        // 没找到. 找下一条吧.
        return PlanStage::NEED_TIME;
    }
}
  • 1
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 1
    评论
评论 1
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值