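Query plan generation starts at QueryPlanner::plan, which contains the basic control flow, while the actual QuerySolutionNode objects are built by the QueryPlannerAccess class. The implementation is quite complex, so we first look at the overall call flow and the structure of QueryPlanner shown in the figure below, and then analyze the relevant code piece by piece.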
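1. tailable is set
A QuerySolution is produced via makeCollectionScan.
2. A hint is specified and the hint uses the $natural operator
A collection scan is used directly. It was not obvious to me at first why, but the documentation shows that this is simply how $natural is defined.
A QuerySolution is produced via makeCollectionScan.
[Documentation for the $natural operator](https://docs.mongodb.com/manual/reference/operator/meta/natural/)
3. snapshot is specified
If snapshot is specified and the _id index is not explicitly required (that option is used by MMapV1), a collection scan is used as well, and the QuerySolution is produced via makeCollectionScan; otherwise an index is used.
4. min & max are specified
The actual minimum and maximum index bounds are computed, and a QuerySolution is produced via makeIndexScan.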
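The four special cases above amount to a chain of early exits that runs before any index selection. The following is only a minimal sketch of that decision order, not MongoDB code: `QueryOptions`, `EarlyPlan` and `chooseEarlyPlan` are hypothetical names invented for illustration, and the real planner also validates the hint and the min/max bounds.

```cpp
// Hypothetical, simplified stand-in for the planner inputs (not a MongoDB type).
struct QueryOptions {
    bool tailable = false;            // case 1: tailable cursor
    bool hintIsNatural = false;       // case 2: hint uses $natural
    bool snapshot = false;            // case 3: snapshot requested
    bool snapshotUseIdIndex = false;  // case 3: _id index explicitly required
    bool hasMinOrMax = false;         // case 4: min and/or max given
};

enum class EarlyPlan {
    CollectionScan,       // corresponds to makeCollectionScan in the text
    IndexScanWithBounds,  // corresponds to makeIndexScan in the text
    ContinuePlanning      // fall through to index selection and enumeration
};

// Checks the special cases in the order in which the article lists them.
EarlyPlan chooseEarlyPlan(const QueryOptions& q) {
    if (q.tailable) {
        return EarlyPlan::CollectionScan;
    }
    if (q.hintIsNatural) {
        return EarlyPlan::CollectionScan;
    }
    if (q.snapshot && !q.snapshotUseIdIndex) {
        return EarlyPlan::CollectionScan;
    }
    if (q.hasMinOrMax) {
        return EarlyPlan::IndexScanWithBounds;
    }
    return EarlyPlan::ContinuePlanning;
}
```

A query that matches none of the cases returns `ContinuePlanning`, which corresponds to the index-based steps described next.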
5. QueryPlannerIXSelect
At this point the tag (a TagData* or RelevantTag*) of every MatchExpression node is still empty, so we need to find the indexes that correspond to each node. This is implemented in QueryPlannerIXSelect::rateIndices:
void QueryPlannerIXSelect::rateIndices(MatchExpression* node,
                                       string prefix,
                                       const vector<IndexEntry>& indices) {
    // Do not traverse tree beyond logical NOR node
    MatchExpression::MatchType exprtype = node->matchType();
    if (exprtype == MatchExpression::NOR) {
        return;
    }
    // Every indexable node is tagged even when no compatible index is
    // available.
    if (Indexability::isBoundsGenerating(node)) {
        string fullPath;
        if (MatchExpression::NOT == node->matchType()) {
            fullPath = prefix + node->getChild(0)->path().toString();
        } else {
            fullPath = prefix + node->path().toString();
        }
        verify(NULL == node->getTag());
        RelevantTag* rt = new RelevantTag();
        node->setTag(rt);
        rt->path = fullPath;
        // TODO: This is slow, with all the string compares.
        for (size_t i = 0; i < indices.size(); ++i) {
            BSONObjIterator it(indices[i].keyPattern);
            BSONElement elt = it.next();
            if (elt.fieldName() == fullPath && compatible(elt, indices[i], node)) {
                rt->first.push_back(i);
            }
            while (it.more()) {
                elt = it.next();
                if (elt.fieldName() == fullPath && compatible(elt, indices[i], node)) {
                    rt->notFirst.push_back(i);
                }
            }
        }
        // If this is a NOT, we have to clone the tag and attach
        // it to the NOT's child.
        if (MatchExpression::NOT == node->matchType()) {
            RelevantTag* childRt = static_cast<RelevantTag*>(rt->clone());
            childRt->path = rt->path;
            node->getChild(0)->setTag(childRt);
        }
    } else if (Indexability::arrayUsesIndexOnChildren(node)) {
        // See comment in getFields about all/elemMatch and paths.
        if (!node->path().empty()) {
            prefix += node->path().toString() + ".";
        }
        for (size_t i = 0; i < node->numChildren(); ++i) {
            rateIndices(node->getChild(i), prefix, indices);
        }
    } else if (node->isLogical()) {
        for (size_t i = 0; i < node->numChildren(); ++i) {
            rateIndices(node->getChild(i), prefix, indices);
        }
    }
}
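As the code above shows, every bounds-generating node gets a RelevantTag. Indexes whose leading (first) key field matches the node's path go into the tag's first field, and indexes in which the path appears at a later position go into notFirst. For a NOT node the tag is also cloned onto its child. For array operators and logical nodes the function recurses depth-first into all children, and it never descends below a NOR node.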
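To make the first / notFirst split concrete, here is a small self-contained sketch. It is an assumption-laden simplification rather than the MongoDB implementation: `SimpleIndex`, `SimpleTag` and `tagPath` are hypothetical names, an index is reduced to its ordered list of key fields, and the `compatible(...)` check from the real code is left out.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical simplified index description: only the ordered key fields.
struct SimpleIndex {
    std::vector<std::string> fields;
};

// Hypothetical miniature of RelevantTag.
struct SimpleTag {
    std::string path;
    std::vector<std::size_t> first;     // indexes whose leading field is 'path'
    std::vector<std::size_t> notFirst;  // indexes with 'path' in a later position
};

// Classifies every index by where 'path' occurs in its key pattern.
SimpleTag tagPath(const std::string& path, const std::vector<SimpleIndex>& indices) {
    SimpleTag tag;
    tag.path = path;
    for (std::size_t i = 0; i < indices.size(); ++i) {
        const std::vector<std::string>& fields = indices[i].fields;
        for (std::size_t pos = 0; pos < fields.size(); ++pos) {
            if (fields[pos] != path) {
                continue;
            }
            if (pos == 0) {
                tag.first.push_back(i);
            } else {
                tag.notFirst.push_back(i);
            }
        }
    }
    return tag;
}
```

With the indexes {a}, {a, b}, {b, a} and {c}, a predicate on a ends up with the first two indexes in first and the third in notFirst, while {c} is not recorded at all.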
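After that, QueryPlannerIXSelect::stripInvalidAssignments removes assignments that are considered invalid for partial indexes, 2dsphere indexes and the like.
Another function that prunes index assignments is QueryPlannerIXSelect::stripUnneededAssignments. It looks for EQ children of AND nodes; if such an EQ predicate can use a single-field unique index, all other index assignments in that AND subtree are cleared and only the unique index stays assigned to the predicate: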
void QueryPlannerIXSelect::stripUnneededAssignments(MatchExpression* node,
                                                    const std::vector<IndexEntry>& indices) {
    if (MatchExpression::AND == node->matchType()) {
        for (size_t i = 0; i < node->numChildren(); i++) {
            MatchExpression* child = node->getChild(i);
            if (MatchExpression::EQ != child->matchType()) {
                continue;
            }
            if (!child->getTag()) {
                continue;
            }
            // We found a EQ child of an AND which is tagged.
            RelevantTag* rt = static_cast<RelevantTag*>(child->getTag());
            // Look through all of the indices for which this predicate can be answered with
            // the leading field of the index.
            for (std::vector<size_t>::const_iterator i = rt->first.begin(); i != rt->first.end();
                 ++i) {
                size_t index = *i;
                if (indices[index].unique && 1 == indices[index].keyPattern.nFields()) {
                    // Found an EQ predicate which can use a single-field unique index.
                    // Clear assignments from the entire tree, and add back a single assignment
                    // for 'child' to the unique index.
                    clearAssignments(node);
                    RelevantTag* newRt = static_cast<RelevantTag*>(child->getTag());
                    newRt->first.push_back(index);
                    // Tag state has been reset in the entire subtree at 'root'; nothing
                    // else for us to do.
                    return;
                }
            }
        }
    }
    for (size_t i = 0; i < node->numChildren(); i++) {
        stripUnneededAssignments(node->getChild(i), indices);
    }
}
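The pruning rule is easier to see on a flattened model. The sketch below is only an illustration under simplifying assumptions: the AND node is reduced to a flat list of tagged children, and `IndexInfo`, `PredicateTag` and `stripForUniqueIndex` are hypothetical names that do not exist in MongoDB.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical index metadata: uniqueness and number of key fields.
struct IndexInfo {
    bool unique;
    std::size_t numFields;
};

// Hypothetical tagged child of an AND node.
struct PredicateTag {
    bool isEquality;
    std::vector<std::size_t> first;
    std::vector<std::size_t> notFirst;
};

// If an equality child can use a single-field unique index, clears every
// assignment under the AND and keeps only that one. Returns true if it did.
bool stripForUniqueIndex(std::vector<PredicateTag>& andChildren,
                         const std::vector<IndexInfo>& indices) {
    for (std::size_t c = 0; c < andChildren.size(); ++c) {
        if (!andChildren[c].isEquality) {
            continue;
        }
        // Copy the candidates because the tags are cleared inside the loop.
        const std::vector<std::size_t> candidates = andChildren[c].first;
        for (std::size_t k = 0; k < candidates.size(); ++k) {
            const std::size_t index = candidates[k];
            assert(index < indices.size());
            if (indices[index].unique && indices[index].numFields == 1) {
                for (auto& child : andChildren) {
                    child.first.clear();
                    child.notFirst.clear();
                }
                andChildren[c].first.push_back(index);
                return true;
            }
        }
    }
    return false;
}
```

The reasoning behind the rule is that an equality match on a single-field unique index returns at most one document, so enumerating any other index for the same AND cannot produce a better plan.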
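6. PlanEnumerator
The PlanEnumerator class abstracts the MatchExpression into four assignment types: PredicateAssignment, OrAssignment, ArrayAssignment and AndAssignment, and handles each type accordingly. Concretely, it collects all candidate indexes of each leaf node into an array and steps through them one by one, which enumerates every possible combination. Take the simplest one, PredicateAssignment, as an example: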
struct PredicateAssignment {
    PredicateAssignment() : indexToAssign(0) {}
    std::vector<IndexID> first;
    // Not owned here.
    MatchExpression* expr;
    // Enumeration state. An indexed predicate's possible states are the indices that the
    // predicate can directly use (the 'first' indices). As such this value ranges from 0
    // to first.size()-1 inclusive.
    size_t indexToAssign;
};
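The vector first stores the IDs of the indexes usable for a field. Each enumeration step picks one index ID from it; looking up the index by that ID gives one QuerySolution candidate for the node. indexToAssign records which index the enumeration is currently at, and its value ranges from 0 to first.size()-1.
Each call to PlanEnumerator::getNext advances indexToAssign by 1; as long as indexToAssign < first.size(), getNext can be called again, until every possibility has been enumerated.
The implementation is shown in the following code: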
Status PlanEnumerator::init() {
    // Fill out our memo structure from the tagged _root.
    _done = !prepMemo(_root, PrepMemoContext());
    // Dump the tags. We replace them with IndexTag instances.
    _root->resetTag();
    return Status::OK();
}
bool PlanEnumerator::prepMemo(MatchExpression* node, PrepMemoContext context) {
    PrepMemoContext childContext;
    childContext.elemMatchExpr = context.elemMatchExpr;
    if (Indexability::nodeCanUseIndexOnOwnField(node)) {
        // The RelevantTag attached to this node earlier by rateIndices.
        RelevantTag* rt = static_cast<RelevantTag*>(node->getTag());
        // We know we can use an index, so grab a memo spot.
        size_t myMemoID;
        NodeAssignment* assign;
        allocateAssignment(node, &assign, &myMemoID);
        assign->pred.reset(new PredicateAssignment());
        assign->pred->expr = node;
        assign->pred->first.swap(rt->first);
        return true;
    } else if (Indexability::isBoundsGeneratingNot(node)) {
        bool childIndexable = prepMemo(node->getChild(0), childContext);
        size_t myMemoID;
        NodeAssignment* assign;
        allocateAssignment(node, &assign, &myMemoID);
        OrAssignment* orAssignment = new OrAssignment();
        orAssignment->subnodes.push_back(memoIDForNode(node->getChild(0)));
        assign->orAssignment.reset(orAssignment);
        return true;
    } else if (MatchExpression::OR == node->matchType()) {
        // For an OR to be indexed, all its children must be indexed.
        for (size_t i = 0; i < node->numChildren(); ++i) {
            if (!prepMemo(node->getChild(i), childContext)) {
                return false;
            }
        }
        // If we're here we're fully indexed and can be in the memo.
        size_t myMemoID;
        NodeAssignment* assign;
        allocateAssignment(node, &assign, &myMemoID);
        OrAssignment* orAssignment = new OrAssignment();
        for (size_t i = 0; i < node->numChildren(); ++i) {
            orAssignment->subnodes.push_back(memoIDForNode(node->getChild(i)));
        }
        assign->orAssignment.reset(orAssignment);
        return true;
    } else if (Indexability::arrayUsesIndexOnChildren(node)) {
        ...
        size_t myMemoID;
        NodeAssignment* assign;
        allocateAssignment(node, &assign, &myMemoID);
        assign->arrayAssignment.reset(aa.release());
        return true;
    } else if (MatchExpression::AND == node->matchType()) {
        ...
    }
}
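PlanEnumerator::init() creates a NodeAssignment for every MatchExpression node, which, depending on the node type, holds one of the four kinds of assignment described above. Now let us look at getNext, which enumerates the combinations of index choices across all assignments.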
bool PlanEnumerator::getNext(MatchExpression** tree) {
    if (_done) {
        return false;
    }
    // Tag with our first solution.
    tagMemo(memoIDForNode(_root));
    *tree = _root->shallowClone().release();
    tagForSort(*tree);
    sortUsingTags(*tree);
    _root->resetTag();
    LOG(5) << "Enumerator: memo just before moving:" << endl
           << dumpMemo();
    _done = nextMemo(memoIDForNode(_root));
    return true;
}
bool PlanEnumerator::nextMemo(size_t id) {
    NodeAssignment* assign = _memo[id];
    verify(NULL != assign);
    if (NULL != assign->pred) {
        PredicateAssignment* pa = assign->pred.get();
        pa->indexToAssign++;
        if (pa->indexToAssign >= pa->first.size()) {
            pa->indexToAssign = 0;
            return true;
        }
        return false;
    } else if (NULL != assign->orAssignment) {
        OrAssignment* oa = assign->orAssignment.get();
        // Limit the number of OR enumerations
        oa->counter++;
        if (oa->counter >= _orLimit) {
            return true;
        }
        // OR just walks through telling its children to
        // move forward.
        for (size_t i = 0; i < oa->subnodes.size(); ++i) {
            // If there's no carry, we just stop. If there's a carry, we move the next child
            // forward.
            if (!nextMemo(oa->subnodes[i])) {
                return false;
            }
        }
        // If we're here, the last subnode had a carry, therefore the OR has a carry.
        return true;
    } else if (NULL != assign->arrayAssignment) {
        ArrayAssignment* aa = assign->arrayAssignment.get();
        // moving to next on current subnode is OK
        if (!nextMemo(aa->subnodes[aa->counter])) {
            return false;
        }
        // Move to next subnode.
        ++aa->counter;
        if (aa->counter < aa->subnodes.size()) {
            return false;
        }
        aa->counter = 0;
        return true;
    } else if (NULL != assign->andAssignment) {
        AndAssignment* aa = assign->andAssignment.get();
        // One of our subnodes might have to move on to its next enumeration state.
        const AndEnumerableState& aes = aa->choices[aa->counter];
        for (size_t i = 0; i < aes.subnodesToIndex.size(); ++i) {
            if (!nextMemo(aes.subnodesToIndex[i])) {
                return false;
            }
        }
        // None of the subnodes had another enumeration state, so we move on to the
        // next top-level choice.
        ++aa->counter;
        if (aa->counter < aa->choices.size()) {
            return false;
        }
        aa->counter = 0;
        return true;
    }
    // This shouldn't happen.
    verify(0);
    return false;
}
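The carry logic in nextMemo behaves like an odometer: each predicate is a digit that wraps around to 0 and reports a carry, and the parent only advances the next child when the previous one carried. The following is a minimal sketch of that idea for predicates under a single OR-like parent. `MiniPredicate`, `advance`, `advanceAll` and `enumerateAll` are hypothetical names, and the `_orLimit` cap as well as the AND and array cases of the real code are left out.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical miniature of PredicateAssignment: candidate index ids plus a cursor.
struct MiniPredicate {
    std::vector<int> first;
    std::size_t indexToAssign;
};

// Moves one predicate to its next index. Returns true on wrap-around (a carry).
bool advance(MiniPredicate& p) {
    ++p.indexToAssign;
    if (p.indexToAssign >= p.first.size()) {
        p.indexToAssign = 0;
        return true;
    }
    return false;
}

// Advances the children like odometer digits: stop at the first child that
// moves without a carry. A carry out of the last child means we are done.
bool advanceAll(std::vector<MiniPredicate>& preds) {
    for (std::size_t i = 0; i < preds.size(); ++i) {
        if (!advance(preds[i])) {
            return false;
        }
    }
    return true;
}

// Collects every combination of index choices, one index id per predicate.
std::vector<std::vector<int> > enumerateAll(std::vector<MiniPredicate> preds) {
    std::vector<std::vector<int> > out;
    for (std::size_t i = 0; i < preds.size(); ++i) {
        if (preds[i].first.empty()) {
            return out;  // a predicate without any index: nothing to enumerate
        }
        preds[i].indexToAssign = 0;
    }
    bool done = false;
    while (!done) {
        std::vector<int> combo;
        for (std::size_t i = 0; i < preds.size(); ++i) {
            combo.push_back(preds[i].first[preds[i].indexToAssign]);
        }
        out.push_back(combo);
        done = advanceAll(preds);
    }
    return out;
}
```

For two predicates with 2 and 3 candidate indexes this sketch yields 2 × 3 = 6 combinations, with the first predicate changing fastest.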
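7. QueryPlannerAccess
The main job of QueryPlannerAccess is to produce a QuerySolutionNode from a MatchExpression and the indexes. Earlier we listed the cases that require a collection scan; from this point on, query solutions are built from indexes. QueryPlannerAccess::buildIndexedDataAccess creates a QuerySolutionNode for each MatchExpression node according to its type, and the result is a tree of QuerySolutionNode objects. The PlanCacheIndexTree also has to be created or updated along the way. See the following code: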
// If we have any relevant indices, we try to create indexed plans.
if (0 < relevantIndices.size()) {
    // The enumerator spits out trees tagged with IndexTag(s).
    PlanEnumeratorParams enumParams;
    enumParams.intersect = params.options & QueryPlannerParams::INDEX_INTERSECTION;
    enumParams.root = query.root();
    enumParams.indices = &relevantIndices;
    PlanEnumerator isp(enumParams);
    isp.init();
    MatchExpression* rawTree;
    while (isp.getNext(&rawTree) && (out->size() < params.maxIndexedSolutions)) {
        ...
        PlanCacheIndexTree* cacheData;
        Status indexTreeStatus =
            cacheDataFromTaggedTree(clone.get(), relevantIndices, &cacheData);
        if (!indexTreeStatus.isOK()) {
            LOG(5) << "Query is not cachable: " << indexTreeStatus.reason() << endl;
        }
        unique_ptr<PlanCacheIndexTree> autoData(cacheData);
        // This can fail if enumeration makes a mistake.
        QuerySolutionNode* solnRoot = QueryPlannerAccess::buildIndexedDataAccess(
            query, rawTree, false, relevantIndices, params);
        if (NULL == solnRoot) {
            continue;
        }
        QuerySolution* soln = QueryPlannerAnalysis::analyzeDataAccess(query, params, solnRoot);
        if (NULL != soln) {
            LOG(5) << "Planner: adding solution:" << endl
                   << soln->toString();
            if (indexTreeStatus.isOK()) {
                SolutionCacheData* scd = new SolutionCacheData();
                scd->tree.reset(autoData.release());
                soln->cacheData.reset(scd);
            }
            out->push_back(soln);
        }
    }
}
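The loop above skips a tagged tree whenever buildIndexedDataAccess returns NULL. The sketch below illustrates, under heavy simplification, how a tagged expression tree might be turned into a tree of plan nodes and when such a build can fail. It is not the MongoDB implementation: `TaggedExpr`, `PlanNode`, `buildAccess` and the stage strings are hypothetical, and the real code additionally handles filters, fetch stages, index bounds and index intersection.

```cpp
#include <cstddef>
#include <memory>
#include <string>
#include <utility>
#include <vector>

enum class ExprKind { And, Or, Leaf };

// Hypothetical tagged expression: a leaf carries the index chosen by the enumerator.
struct TaggedExpr {
    ExprKind kind;
    std::string path;  // leaf only
    int indexId;       // leaf only; -1 means no index was assigned
    std::vector<TaggedExpr> children;
};

// Hypothetical plan node, standing in for QuerySolutionNode.
struct PlanNode {
    std::string stage;  // illustrative labels: "IXSCAN", "AND", "OR"
    int indexId;
    std::vector<std::unique_ptr<PlanNode> > children;
};

// Mirrors the expression tree with plan nodes. Returns nullptr when the tree
// cannot be answered from indexes, which the caller treats as "skip this tree".
std::unique_ptr<PlanNode> buildAccess(const TaggedExpr& expr) {
    std::unique_ptr<PlanNode> node(new PlanNode());
    node->indexId = -1;
    if (expr.kind == ExprKind::Leaf) {
        if (expr.indexId < 0) {
            return nullptr;
        }
        node->stage = "IXSCAN";
        node->indexId = expr.indexId;
        return node;
    }
    node->stage = (expr.kind == ExprKind::And) ? "AND" : "OR";
    for (std::size_t i = 0; i < expr.children.size(); ++i) {
        std::unique_ptr<PlanNode> child = buildAccess(expr.children[i]);
        if (!child) {
            if (expr.kind == ExprKind::Or) {
                return nullptr;  // every branch of an OR must be indexed
            }
            continue;  // AND: an unindexed child would become a filter (omitted here)
        }
        node->children.push_back(std::move(child));
    }
    if (node->children.empty()) {
        return nullptr;
    }
    return node;
}
```

The asymmetry between AND and OR in this sketch follows the comment in prepMemo that all children of an OR must be indexed for the OR itself to be indexed.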
At this point the QuerySolutions have been generated. The next step is to create a PlanExecutor, pick the best query solution, and execute it.