M102: MongoDB for DBAs Final Exam

Environment

OS: Windows 10 Home (Chinese edition)
MongoDB: MongoDB 3.4

MongoDB install path: E:\MongoDB\Server\3.4\bin\
MongoDB data path: E:\MongoDB\data

Questions and Solutions

Question 1

Download Handouts:

rollback_553ed0e3d8ca3966d777dfe0.zip

Problems 1 through 3 are an exercise in running mongod’s, replica sets, and an exercise in testing of replica set rollbacks, which can occur when a former primary rejoins a set after it has previously had a failure.

Get the files from Download Handout link, and extract them. Use a.bat instead of a.sh on Windows.

Start a 3 member replica set (with default options for each member, all are peers). (a.sh will start the mongod’s for you if you like.)

# if on unix:
chmod +x a.sh
./a.sh

You will need to initiate the replica set next.

Run:

mongo --shell --port 27003 a.js

// ourinit() will initiate the set for us.
// to see its source code type without the parentheses:
ourinit

// now execute it:
ourinit()

We will now do a test of replica set rollbacks. This is the case where data never reaches a majority of the set. We’ll test a couple scenarios.

Now, let’s do some inserts. Run:

db.foo.insert( { _id : 1 }, { writeConcern : { w : 2 } } )
db.foo.insert( { _id : 2 }, { writeConcern : { w : 2 } } )
db.foo.insert( { _id : 3 }, { writeConcern : { w : 2 } } )

Note: if 27003 is not primary, make it primary – using rs.stepDown() on the mongod on port 27001 (perhaps also rs.freeze()) for example.

Next, let’s shut down that server on port 27001:

var a = connect("localhost:27001/admin");
a.shutdownServer()
rs.status()

At this point the mongod on 27001 is shut down. We now have only our 27003 mongod, and our arbiter on 27002, running.

Let’s insert some more documents:

db.foo.insert( { _id : 4 } )
db.foo.insert( { _id : 5 } )
db.foo.insert( { _id : 6 } )

Now, let’s restart the mongod that is shut down. If you like you can cut and paste the relevant mongod invocation from a.sh.

Now run ps again and verify three are up:

ps -A | grep mongod

Now, we want to see if any data that we attempted to insert isn’t there. Go into the shell to any member of the set. Use rs.status() to check state. Be sure the member is “caught up” to the latest optime (if it’s a secondary). Also on a secondary you might need to invoke rs.slaveOk() before doing a query.)

Now run:

db.foo.find()

to see what data is there after the set recovered from the outage. How many documents do you have?

Solution

Download the handout zip given in the question and extract it; it contains the following three files:

    a.js
    a.sh (to be used by Mac/Linux users)
    a.bat (to be used by Windows users)

I am using a Linux environment here, so run the script:

./a.sh

The output is:

Already running mongo* processes (this is fyi, should be none probably):
50275 ttys002    0:00.00 grep mongo

make / reset dirs

running mongod processes...
about to fork child process, waiting until server is ready for connections.
forked process: 50282
child process started successfully, parent exiting
about to fork child process, waiting until server is ready for connections.
forked process: 50285
child process started successfully, parent exiting
about to fork child process, waiting until server is ready for connections.
forked process: 50288
child process started successfully, parent exiting

50282 ??         0:00.07 mongod --fork --logpath a.log --smallfiles --oplogSize 50 --port 27001 --dbpath data/z1 --replSet z
50285 ??         0:00.05 mongod --fork --logpath b.log --smallfiles --oplogSize 50 --port 27002 --dbpath data/z2 --replSet z
50288 ??         0:00.06 mongod --fork --logpath c.log --smallfiles --oplogSize 50 --port 27003 --dbpath data/z3 --replSet z
50291 ttys002    0:00.00 grep mongo

Now run:

  mongo --shell --port 27003 a.js

Open a shell against the 27003 member, loading a.js:

mongo --shell --port 27003 a.js

As instructed, run the helper:

ourinit()

The output is:

{
    "info" : "Config now saved locally.  Should come online in about a minute.",
    "ok" : 1
}
waiting for set to initiate...
{
    "setName" : "z",
    "setVersion" : 1,
    "ismaster" : false,
    "secondary" : true,
    "hosts" : [
        "localhost:27003",
        "localhost:27001"
    ],
    "arbiters" : [
        "localhost:27002"
    ],
    "me" : "localhost:27003",
    "maxBsonObjectSize" : 16777216,
    "maxMessageSizeBytes" : 48000000,
    "maxWriteBatchSize" : 1000,
    "localTime" : ISODate("2015-05-01T20:02:22.656Z"),
    "maxWireVersion" : 2,
    "minWireVersion" : 0,
    "ok" : 1
}
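
For reference, judging from the member list the set ends up with, ourinit() is roughly equivalent to the following (a sketch inferred from the output; the actual source is in a.js and can be viewed by typing ourinit without the parentheses):

// a sketch of what ourinit() amounts to -- the set name "z", two data-bearing
// members and an arbiter on 27002 match the rs.status() output below:
rs.initiate({
    _id : "z",
    members : [
        { _id : 1, host : "localhost:27001" },
        { _id : 2, host : "localhost:27002", arbiterOnly : true },
        { _id : 3, host : "localhost:27003" }
    ]
})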

Check the status of all the members:

rs.status()

The output is:

{
    "set" : "z",
    "date" : ISODate("2015-05-01T20:08:49Z"),
    "myState" : 1,
    "members" : [
        {
            "_id" : 1,
            "name" : "localhost:27001",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 386,
            "optime" : Timestamp(1430510540, 1),
            "optimeDate" : ISODate("2015-05-01T20:02:20Z"),
            "lastHeartbeat" : ISODate("2015-05-01T20:08:48Z"),
            "lastHeartbeatRecv" : ISODate("2015-05-01T20:08:48Z"),
            "pingMs" : 0,
            "syncingTo" : "localhost:27003"
        },
        {
            "_id" : 2,
            "name" : "localhost:27002",
            "health" : 1,
            "state" : 7,
            "stateStr" : "ARBITER",
            "uptime" : 386,
            "lastHeartbeat" : ISODate("2015-05-01T20:08:48Z"),
            "lastHeartbeatRecv" : ISODate("2015-05-01T20:08:49Z"),
            "pingMs" : 0
        },
        {
            "_id" : 3,
            "name" : "localhost:27003",
            "health" : 1,
            "state" : 1,
            "stateStr" : "PRIMARY",
            "uptime" : 416,
            "optime" : Timestamp(1430510540, 1),
            "optimeDate" : ISODate("2015-05-01T20:02:20Z"),
            "electionTime" : Timestamp(1430510551, 1),
            "electionDate" : ISODate("2015-05-01T20:02:31Z"),
            "self" : true
        }
    ],
    "ok" : 1
}

At this point nothing has been shut down yet, and my mongod on port 27003 is the primary. If your primary is a different member, switch it over first, for example:

z:SECONDARY> exit
bye
mongo --port 27001
z:PRIMARY> rs.stepDown()
2015-05-01T16:23:04.511-0400 DBClientCursor::init call() failed
2015-05-01T16:23:04.513-0400 Error: error doing query: failed at src/mongo/shell/query.js:81
2015-05-01T16:23:04.514-0400 trying reconnect to 127.0.0.1:27001 (127.0.0.1) failed
2015-05-01T16:23:04.515-0400 reconnect 127.0.0.1:27001 (127.0.0.1) ok
z:SECONDARY>exit
bye
mongo --shell --port 27003 a.js

With the primary in place, run the inserts:

z:PRIMARY> db.foo.insert( { _id : 1 }, { writeConcern : { w : 2 } } )
WriteResult({ "nInserted" : 1 })
z:PRIMARY> db.foo.insert( { _id : 2 }, { writeConcern : { w : 2 } } )
WriteResult({ "nInserted" : 1 })
z:PRIMARY> db.foo.insert( { _id : 3 }, { writeConcern : { w : 2 } } )
WriteResult({ "nInserted" : 1 })
z:PRIMARY>

Shut down the server on port 27001:

z:PRIMARY> var a = connect("localhost:27001/admin");
connecting to: localhost:27001/admin
z:PRIMARY> a.shutdownServer()
2015-05-01T16:11:42.920-0400 DBClientCursor::init call() failed
server should be down...
z:PRIMARY> rs.status()
{
    "set" : "z",
    "date" : ISODate("2015-05-01T20:26:26Z"),
    "myState" : 1,
    "members" : [
        {
            "_id" : 1,
            "name" : "localhost:27001",
            "health" : 0,
            "state" : 8,
            "stateStr" : "(not reachable/healthy)",
            "uptime" : 0,
            "optime" : Timestamp(1430511923, 3),
            "optimeDate" : ISODate("2015-05-01T20:25:23Z"),
            "lastHeartbeat" : ISODate("2015-05-01T20:26:24Z"),
            "lastHeartbeatRecv" : ISODate("2015-05-01T20:26:18Z"),
            "pingMs" : 0,
            "syncingTo" : "localhost:27003"
        },
        {
            "_id" : 2,
            "name" : "localhost:27002",
            "health" : 1,
            "state" : 7,
            "stateStr" : "ARBITER",
            "uptime" : 1443,
            "lastHeartbeat" : ISODate("2015-05-01T20:26:26Z"),
            "lastHeartbeatRecv" : ISODate("2015-05-01T20:26:24Z"),
            "pingMs" : 0
        },
        {
            "_id" : 3,
            "name" : "localhost:27003",
            "health" : 1,
            "state" : 1,
            "stateStr" : "PRIMARY",
            "uptime" : 1473,
            "optime" : Timestamp(1430511923, 3),
            "optimeDate" : ISODate("2015-05-01T20:25:23Z"),
            "electionTime" : Timestamp(1430511789, 1),
            "electionDate" : ISODate("2015-05-01T20:23:09Z"),
            "self" : true
        }
    ],
    "ok" : 1
}
z:PRIMARY>

At this point my primary and the arbiter are still up, so the set can still accept writes (the primary plus the arbiter are still a majority of the voting members). Insert the remaining documents:

z:PRIMARY> db.foo.insert( { _id : 4 } )
WriteResult({ "nInserted" : 1 })
z:PRIMARY> db.foo.insert( { _id : 5 } )
WriteResult({ "nInserted" : 1 })
z:PRIMARY> db.foo.insert( { _id : 6 } )
WriteResult({ "nInserted" : 1 })

The exam then has us restart the 27001 member (cut and paste its mongod line from a.sh) and wait for it to catch up; since it simply resyncs these writes from the primary's oplog, the query result is the same either way. Run the query:

z:PRIMARY> db.foo.find()
{ "_id" : 1 }
{ "_id" : 2 }
{ "_id" : 3 }
{ "_id" : 4 }
{ "_id" : 5 }
{ "_id" : 6 }
z:PRIMARY>

There are 6 documents, so the answer is 6.

Answer: 6
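
As an aside (a sketch; the exact error text varies by version): with 27001 down, a write concern of { w : 2 } can no longer be satisfied, because the arbiter holds no data. Such a write is still applied on the primary, but its acknowledgement waits until the wtimeout expires:

// hypothetical extra insert, not part of the exam steps:
db.foo.insert( { _id : 7 }, { writeConcern : { w : 2, wtimeout : 5000 } } )
// the WriteResult reports the insert together with a writeConcernError
// ("waiting for replication timed out") -- the write exists only on the
// primary, which is exactly the situation that leads to rollbacks in Question 2.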

Question 2

Let’s do that again with a slightly different crash/recover scenario for each process. Start with the following:

With all three members (mongod’s) up and running, you should be fine; otherwise, delete your data directory, and, once again:

./a.sh
mongo --shell --port 27003 a.js
ourinit() // you might need to wait a bit after this.
// be sure 27003 is the primary.
// use rs.stepDown() elsewhere if it isn't.
db.foo.drop()
db.foo.insert( { _id : 1 }, { writeConcern : { w : 2 } } )
db.foo.insert( { _id : 2 }, { writeConcern : { w : 2 } } )
db.foo.insert( { _id : 3 }, { writeConcern : { w : 2 } } )
var a = connect("localhost:27001/admin");
a.shutdownServer()
rs.status()
db.foo.insert( { _id : 4 } )
db.foo.insert( { _id : 5 } )
db.foo.insert( { _id : 6 } )

Now this time, shut down the mongod on port 27003 (in addition to the other member being shut down already) before doing anything else. One way of doing this in Unix would be:

ps -A | grep mongod
# should see the 27003 and 27002 ones running (only)
ps ax | grep mongo | grep 27003 | awk '{print $1}' | xargs kill
# wait a little for the shutdown perhaps...then:
ps -A | grep mongod
# should get that just the arbiter is present…

Now restart just the 27001 member. Wait for it to get healthy – check this with rs.status() in the shell. Then query

> db.foo.find()

Then add another document:

> db.foo.insert( { _id : "last" } )

After this, restart the third set member (mongod on port 27003). Wait for it to come online and enter a healthy state (secondary or primary).

Run (on any member – try multiple if you like) :

> db.foo.find()

You should see a difference from problem 1 in the result above.

Question: Which of the following are true about mongodb’s operation in these scenarios? Check all that apply.

Check all that apply:

  • The MongoDB primary does not write to its data files until a majority acknowledgement comes back from the rest of the cluster. When 27003 was primary, it did not perform the last 3 writes.
  • When 27003 came back up, it transmitted its write ops that the other member had not yet seen so that it would also have them.
  • MongoDB preserves the order of writes in a collection in its consistency model. In this problem, 27003’s oplog was effectively a “fork” and to preserve write ordering a rollback was necessary during 27003’s recovery phase.

Solution

Rebuild the environment:

killall mongod
rm -r data
./a.sh
mongo --shell --port 27003 a.js

Run the script given in the question:

db.foo.drop()
db.foo.insert( { _id : 1 }, { writeConcern : { w : 2 } } )
db.foo.insert( { _id : 2 }, { writeConcern : { w : 2 } } )
db.foo.insert( { _id : 3 }, { writeConcern : { w : 2 } } )
var a = connect("localhost:27001/admin");
a.shutdownServer()
rs.status()  // first server, if it's not down yet, should be down soon.
db.foo.insert( { _id : 4 } )
db.foo.insert( { _id : 5 } )
db.foo.insert( { _id : 6 } )
db.foo.find() // Let's see what we wrote.
exit

Simulate an unclean shutdown of the 27003 process:

# note: piping into `read` only keeps the variable in shells like zsh/ksh; in bash,
# use e.g.: kill $(ps ax | grep mongo | grep 27003 | grep -v grep | awk '{print $1}')
ps ax | grep mongo | grep 27003 | read processId otherStuff
kill $processId

Good: both of our data-bearing servers are now down. As we will see, having an arbiter does nothing to help preserve our data.

I can find the command that launched the other server and start it again by evaluating the corresponding line from a.sh (assuming I have not modified it):

cat a.sh | grep 27001 | read launchMongod
eval $launchMongod
mongo --port 27001

Wait for 27001 to become primary, then run:

db.foo.insert( { _id : "last" } )
db.foo.find()
exit

So now I can see that I only have 4 documents. Let's see what is left once I bring the other mongod back up.

cat a.sh | grep 27003 | grep z3 | read launchMongod
eval $launchMongod
mongo --port 27003

Check the data:

rs.slaveOk()
db.foo.find() // Still only 4 documents: { _id: 1 }, { _id: 2 }, { _id: 3 }, { _id: "last" }.
              // The writes 4, 5 and 6 were rolled back when 27003 rejoined.

Now let's go through the options:

  • The MongoDB primary does not write to its data files until a majority acknowledgement comes back from the rest of the cluster. When 27003 was primary, it did not perform the last 3 writes.

False. We can see from the shell's acknowledgements that the writes were performed, and we could see the documents with find() right after inserting them. We also know that the default write concern is { w : 1 }, not { w : "majority" }, so this answer is doubly wrong.

  • When 27003 came back up, it transmitted its write ops that the other member had not yet seen so that it would also have them.

False. When we checked for those writes after 27003 came back up, they could not be found, so this cannot be correct.

  • MongoDB preserves the order of writes in a collection in its consistency model. In this problem, 27003’s oplog was effectively a “fork” and to preserve write ordering a rollback was necessary during 27003’s recovery phase.

True. Because we performed writes on the mongod on port 27001 that the mongod on port 27003 knew nothing about, 27003 had to put its conflicting writes somewhere when it came back. The behaviour we observed once all the servers were back up is consistent with this answer, and what the lectures and documentation say about rollbacks (and when they are triggered) tells us that this is exactly what happened.

Answer:

  • MongoDB preserves the order of writes in a collection in its consistency model. In this problem, 27003’s oplog was effectively a “fork” and to preserve write ordering a rollback was necessary during 27003’s recovery phase.

Question 3

In question 2 the mongod on port 27003 does a rollback. Go to that mongod’s data directory. Look for a rollback directory inside. Find the .bson file there. Run the bsondump utility on that file. What are its contents?

  • There is no such file.
  • It contains 2 documents.
  • It contains 3 documents.
  • It contains 4 documents.
  • It contains 8 documents.
  • The file exists but is 0 bytes long.

Solution

  • It contains 3 documents. (The documents { _id : 4 }, { _id : 5 } and { _id : 6 } existed only on 27003; they were rolled back when it rejoined the set and written to the rollback .bson file, which bsondump shows as three documents.)

Question 4

Keep the three member replica set from the above problems running. We’ve had a request to make the third member never eligible to be primary. (The member should still be visible as a secondary.)

Reconfigure the replica set so that the third member can never be primary. Then run:

$ mongo --shell a.js --port 27003

And run:

> part4()

And enter the result in the text box below (with no spaces or line feeds just the exact value returned).

Solution

233
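
The value above is what part4() returns after the reconfiguration. The exam does not prescribe how to do the reconfig, but the standard way to make a member ineligible to be primary while keeping it a visible secondary is to give it priority 0. A sketch, run from the primary (the array index 2 assumes localhost:27003 is the third entry in the config, so check rs.conf() first):

cfg = rs.conf()
cfg.members                    // verify which index corresponds to localhost:27003
cfg.members[2].priority = 0    // priority 0: never eligible to become primary,
                               // but still a normal, visible secondary
rs.reconfig(cfg)
rs.conf()                      // confirm the new configuration took effect

Afterwards, reconnect with mongo --shell a.js --port 27003 and run part4() to obtain the value to submit.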

Question 5

Suppose we have blog posts in a (not sharded*) postings collection, of the form:

{
  _id : ...,
  author : 'joe',
  title : 'Too big to fail',
  text : '...',
  tags : [ 'business', 'finance' ],
  when : ISODate("2008-11-03"),
  views : 23002,
  votes : 4,
  voters : ['joe', 'jane', 'bob', 'somesh'],
  comments : [
    { commenter : 'allan',
      comment : 'Well, i don't think so…',
      flagged : false,
      plus : 2
    },
    ...
  ]
}

Which of these statements is true?

Note: to get a multiple answer question right in this final you must get all the components right, so even if some parts are simple, take your time.

*Certain restrictions apply to unique constraints on indexes when sharded, so I mention this to be clear.

  • We can create an index to make the following query fast/faster:
db.postings.find({"comments.flagged" : true })
  • One way to assure people vote at most once per posting is to use this form of update:
db.postings.update(
  { _id: ... },
  { $inc : {votes:1}, $push : {voters:'joe'} }
);

combined with an index on {voters : 1} which has a unique key constraint.

  • One way to assure people vote at most once per posting is to use this form of update:
db.postings.update(
  { _id: ... , voters:{$ne:'joe'}},
  { $inc : {votes:1}, $push : {voters:'joe'}} ); 

Solution

  • We can create an index to make the following query fast/faster:
db.postings.find({"comments.flagged" : true })
  • One way to assure people vote at most once per posting is to use this form of update:
db.postings.update(
  { _id: ... , voters:{$ne:'joe'}},
  { $inc : {votes:1}, $push : {voters:'joe'}} ); 
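
Why the unique-index option from the question is not among the answers: a unique index on { voters : 1 } is a multikey index, and its uniqueness constraint applies across documents, so it would stop 'joe' from appearing in the voters array of two different postings rather than stopping him from voting twice on the same posting. The $ne guard on the query side is what makes the chosen update a no-op when the voter is already recorded. A quick illustration in the mongo shell (the _id value 1 is hypothetical):

db.postings.update(
  { _id : 1, voters : { $ne : 'joe' } },
  { $inc : { votes : 1 }, $push : { voters : 'joe' } }
)
// first call: nMatched 1, nModified 1 -- the vote is counted

db.postings.update(
  { _id : 1, voters : { $ne : 'joe' } },
  { $inc : { votes : 1 }, $push : { voters : 'joe' } }
)
// second call: nMatched 0, nModified 0 -- 'joe' is already in voters, so the
// document no longer matches the query and nothing is double-counted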

Question 6

Which of these statements is true?

Note: to get a multiple answer question right in this final you must get all the components right, so even if some parts are simple, take your time.

  • MongoDB(v3.0) supports transactions spanning multiple documents, if those documents all reside on the same shard.
  • MongoDB supports atomic operations on individual documents.
  • MongoDB allows you to choose the storage engine separately for each collection on your mongod.
  • MongoDB has a data type for binary data.

Solution

  • MongoDB(v3.0) supports transactions spanning multiple documents, if those documents all reside on the same shard.

False. MongoDB does not support multi-document transactions (sharded or not), as emphasised throughout the course. There is a page in the docs showing how to simulate this behaviour at the application level if your system needs it, at some cost in performance.

  • MongoDB supports atomic operations on individual documents.

True. See the docs.

  • MongoDB allows you to choose the storage engine separately for each collection on your mongod.

False. The storage engine is chosen when the mongod process starts and cannot be changed for a single collection; see the docs. (Options can be passed through to the storage engine to control how a collection is configured, but that is not the same as choosing the storage engine itself per collection.)

  • MongoDB has a data type for binary data.

True. See the documentation (the BSON BinData type).

Answer

  • MongoDB supports atomic operations on individual documents.
  • MongoDB has a data type for binary data.
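
A small illustration of the two true statements (a sketch; the collection and field names are made up): an update to a single document is atomic even when it touches several fields, and binary data is stored using the BSON BinData type:

// atomic on one document: both fields change together or not at all
db.counters.update(
  { _id : "pageviews" },
  { $inc : { total : 1 }, $set : { lastHit : new Date() } }
)

// BinData is BSON's binary type (subtype 0 = generic binary, base64 payload)
db.files.insert( { _id : "logo", data : BinData(0, "SGVsbG8gd29ybGQ=") } )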

Question 7

Which of these statements is true?

  • MongoDB is “multi-master” – you can write anywhere, anytime.
  • MongoDB supports reads from slaves/secondaries that are in remote locations.
  • Most MongoDB queries involve javascript execution on the database server(s).

Solution

  • MongoDB supports reads from slaves/secondaries that are in remote locations. (The other options are false: all writes in a replica set go to the primary, so MongoDB is not “multi-master”, and ordinary queries are executed natively by the server without running JavaScript.)
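
For the true option: reads from secondaries (for example ones placed in a remote data centre) are opt-in per connection or per query. In the mongo shell this might look like (a sketch):

rs.slaveOk()                                    // allow reads on a secondary connection (older helper)
db.getMongo().setReadPref("secondaryPreferred") // or set an explicit read preference
db.foo.find()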

Question 8

Download Handouts:

gene_backup_553f1c22d8ca396a7a77dfee.zip

We have been asked by our users to pull some data from a previous database backup of a sharded cluster. They’d like us to set up a temporary data mart for this purpose, in addition to answering some questions from the data. The next few questions involve this user request.

First we will restore the backup. Download gene_backup.zip from the Download Handout link and unzip this to a temp location on your computer.

The original cluster that was backed up consisted of two shards, each of which was a three member replica set. The first one named “s1” and the second “s2”. We have one mongodump (backup) for each shard, plus the config database. After you unzip you will see something like this:

$ ls -la
total 0
drwxr-xr-x   5 dwight  staff  170 Dec 11 13:47 .
drwxr-xr-x  17 dwight  staff  578 Dec 11 13:49 ..
drwxr-xr-x   4 dwight  staff  136 Dec 11 13:45 config_server
drwxr-xr-x   5 dwight  staff  170 Dec 11 13:46 s1
drwxr-xr-x   5 dwight  staff  170 Dec 11 13:46 s2

Our data mart will be temporary, so we won’t need more than one mongod per shard, nor more than one config server (we are not worried about downtime, the mart is temporary).

As a first step, restore the config server backup and run a mongod config server instance with that restored data. The backups were made with mongodump. Thus you will use the mongorestore utility to restore.

Once you have the config server running, confirm the restore of the config server data by running the last javascript line below in the mongo shell, and entering the 5 character result it returns.

$ mongo localhost:27019/config
configsvr>
configsvr> db
config
configsvr> db.chunks.find().sort({_id:1}).next().lastmodEpoch.getTimestamp().toUTCString().substr(20,6)

Notes:

  • You must do this with MongoDB 3.0. The mongorestore may not work with prior versions of MongoDB.
  • If you do not see the prompt with ‘configsvr’ before the ‘>’, then you are not running as a config server.

Enter answer here:

Solution

We will first make a directory, temp_mart, to hold our mongods and their data files. Download and unzip the handout, then cd to the appropriate directory.

ls -la | grep gene_backup  # see that we've got our download.
mkdir temp_mart
mkdir temp_mart/s1
mkdir temp_mart/s2
mkdir temp_mart/cfg
mongod --dbpath temp_mart/cfg --logpath temp_mart/cfg.log --fork --port 27019 --configsvr --replSet csReplSet
# the config server above is started as a (single-member) replica set, so
# initiate it before restoring, otherwise it will not accept mongorestore's writes:
mongo localhost:27019/config --eval "rs.initiate()"
mongorestore --port 27019 gene_backup/config_server
mongo localhost:27019/config

Once the restore is done, run the query:

configsvr> db.chunks.find().sort({_id:1}).next().lastmodEpoch.getTimestamp().toUTCString().substr(20,6)
39:15
configsvr>

Answer: 39:15 (substr(20,6) of the toUTCString() value picks out the minutes-and-seconds portion of the first chunk's lastmodEpoch timestamp)

Question 9

Now that the config server from question #8 is up and running, we will restore the two shards (“s1” and “s2”).

If we inspect our restored config db, we see this in db.shards:

~/dba/final $ mongo localhost:27019/config
MongoDB shell version: 3.0.0
connecting to: localhost:27019/config
configsvr> db.shards.find()
{ "_id" : "s1", "host" : "s1/genome_svr1:27501,genome_svr2:27502,genome_svr2:27503" }
{ "_id" : "s2", "host" : "s2/genome_svr4:27601,genome_svr5:27602,genome_svr5:27603" }

From this we know when we run a mongos for the cluster, it will expect the first shard to be a replica set named “s1”, and the second to be a replica set named “s2”, and also to be able to resolve and connect to at least one of the seed hostnames for each shard.

If we were restoring this cluster as “itself”, it would be best to assign the hostnames “genome_svr1” etc. to the appropriate IP addresses in DNS, and not change config.shard. However, for this problem, our job is not to restore the cluster, but rather to create a new temporary data mart initialized with this dataset.

Thus instead we will update the config.shards metadata to point to the locations of our new shard servers. Update the config.shards collection such that your output is:

configsvr> db.shards.find()
{ "_id" : "s1", "host" : "localhost:27501" }
{ "_id" : "s2", "host" : "localhost:27601" }
configsvr>

Be sure when you do this nothing is running except the single config server. mongod and mongos processes cache metadata, so this is important. After the update restart the config server itself for the same reason.

Now start a mongod for each shard – one on port 27501 for shard “s1” and on port 27601 for shard “s2”. At this point if you run ps you should see three mongod’s – one for each shard, and one for our config server. Note they need not be replica sets, but just regular mongod’s, as we did not begin our host string in config.shards with setname/. Finally, use mongorestore to restore the data for each shard.

The next step is to start a mongos for the cluster.

Connect to the mongos with a mongo shell. Run this:

use snps
var x = db.elegans.aggregate( [ { $match : { N2 : "T" } } , { $group : { _id:"$N2" , n : { $sum : 1 } } } ] ).next(); print( x.n )

Enter the number output for n.

Notes:

  • You must do this with MongoDB 3.0. The mongoimport tool may not work with prior versions of MongoDB.

Solution

Looking at the shards collection in the config database, we see the two documents representing our shards:

$ mongo localhost:27019/config
MongoDB shell version: 3.0.0
connecting to: localhost:27019/config
configsvr> db.shards.find()
{ "_id" : "s1", "host" : "s1/genome_svr1:27501,genome_svr2:27502,genome_svr2:27503" }
{ "_id" : "s2", "host" : "s2/genome_svr4:27601,genome_svr5:27602,genome_svr5:27603" }

Our first task is to update these documents.

configsvr> var s1 = db.shards.findOne( { _id : "s1" } )
configsvr> var s2 = db.shards.findOne( { _id : "s2" } )
configsvr> s1.host = "localhost:27501"
localhost:27501
configsvr> s2.host = "localhost:27601"
localhost:27601
configsvr> db.shards.save(s1)
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })
configsvr> db.shards.save(s2)
WriteResult({ "nMatched" : 1, "nUpserted" : 0, "nModified" : 1 })
configsvr> db.shards.find()
{ "_id" : "s1", "host" : "localhost:27501" }
{ "_id" : "s2", "host" : "localhost:27601" }
configsvr>
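
An equivalent, slightly shorter way to make the same edit (a sketch):

db.shards.update( { _id : "s1" }, { $set : { host : "localhost:27501" } } )
db.shards.update( { _id : "s2" }, { $set : { host : "localhost:27601" } } )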

Now kill the config server process (it must be restarted anyway, so it does not serve stale cached metadata), then start our shards.

ps ax | grep mongo | grep 27019 | read cfgProcessId otherStuff
kill $cfgProcessId

Start our shard servers and load the data:

mongod --dbpath temp_mart/s1 --logpath temp_mart/s1.log --port 27501 --fork
mongod --dbpath temp_mart/s2 --logpath temp_mart/s2.log --port 27601 --fork
mongorestore --port 27501 gene_backup/s1
mongorestore --port 27601 gene_backup/s2

Restart the config server, then start a mongos:

mongod --dbpath temp_mart/cfg --logpath temp_mart/cfg.log --fork --port 27019 --configsvr --replSet csReplSet
mongos --configdb csReplSet/localhost:27019 --logpath temp_mart/mongos.log --fork
mongo

And get the answer:

mongos> sh.status() // see what our cluster looks like
mongos> use snps
mongos> var x = db.elegans.aggregate( [ { $match : { N2 : "T" } } , { $group : { _id:"$N2" , n : { $sum : 1 } } } ] ).next(); print( x.n )
47664

Answer: 47664

Question 10

Now, for our temporary data mart, once again from a mongo shell connected to the cluster:

1) create an index { N2 : 1, mutant : 1 } for the “snps.elegans” collection.
2) now run:

db.elegans.find( { N2 : "T", mutant : "A" } ).limit( 5 ).explain( "executionStats" )

Based on the explain output, which of the following statements below are true?

  • No shards are queried.
  • 1 shard in total is queried.
  • 2 shards in total are queried.
  • 5 documents in total are examined.
  • 7 documents in total are examined.
  • 8 documents in total are examined.
  • Thousands of documents are examined.

Solution

Create the index:

mongos> db.elegans.createIndex( { N2 : 1, mutant : 1 } )
{
    "raw" : {
        "localhost:27501" : {
            "createdCollectionAutomatically" : false,
            "numIndexesBefore" : 2,
            "numIndexesAfter" : 3,
            "ok" : 1
        },
        "localhost:27601" : {
            "createdCollectionAutomatically" : false,
            "numIndexesBefore" : 2,
            "numIndexesAfter" : 3,
            "ok" : 1
        }
    },
    "ok" : 1
}
mongos>

Look at the execution plan:

mongos> db.elegans.find({N2:"T",mutant:"A"}).limit(5).explain("executionStats")
{
 "queryPlanner" : {
  "mongosPlannerVersion" : 1,
  "winningPlan" : {
   "stage" : "SHARD_MERGE",
   "shards" : [
    {
     "shardName" : "s1",
     "connectionString" : "localhost:27501",
     "serverInfo" : {
      "host" : "cross-mb-air.local",
      "port" : 27501,
      "version" : "3.0.2",
      "gitVersion" : "6201872043ecbbc0a4cc169b5482dcf385fc464f"
     },
     "plannerVersion" : 1,
     "namespace" : "snps.elegans",
     "indexFilterSet" : false,
     "parsedQuery" : {
      "$and" : [
       {
        "N2" : {
         "$eq" : "T"
        }
       },
       {
        "mutant" : {
         "$eq" : "A"
        }
       }
      ]
     },
     "winningPlan" : {
      "stage" : "LIMIT",
      "limitAmount" : 0,
      "inputStage" : {
       "stage" : "KEEP_MUTATIONS",
       "inputStage" : {
        "stage" : "SHARDING_FILTER",
        "inputStage" : {
         "stage" : "FETCH",
         "inputStage" : {
          "stage" : "IXSCAN",
          "keyPattern" : {
           "N2" : 1,
           "mutant" : 1
          },
          "indexName" : "N2_1_mutant_1",
          "isMultiKey" : false,
          "direction" : "forward",
          "indexBounds" : {
           "N2" : [
            "[\"T\", \"T\"]"
           ],
           "mutant" : [
            "[\"A\", \"A\"]"
           ]
          }
         }
        }
       }
      }
     },
     "rejectedPlans" : [ ]
    },
    {
     "shardName" : "s2",
     "connectionString" : "localhost:27601",
     "serverInfo" : {
      "host" : "cross-mb-air.local",
      "port" : 27601,
      "version" : "3.0.2",
      "gitVersion" : "6201872043ecbbc0a4cc169b5482dcf385fc464f"
     },
     "plannerVersion" : 1,
     "namespace" : "snps.elegans",
     "indexFilterSet" : false,
     "parsedQuery" : {
      "$and" : [
       {
        "N2" : {
         "$eq" : "T"
        }
       },
       {
        "mutant" : {
         "$eq" : "A"
        }
       }
      ]
     },
     "winningPlan" : {
      "stage" : "LIMIT",
      "limitAmount" : 3,
      "inputStage" : {
       "stage" : "KEEP_MUTATIONS",
       "inputStage" : {
        "stage" : "SHARDING_FILTER",
        "inputStage" : {
         "stage" : "FETCH",
         "inputStage" : {
          "stage" : "IXSCAN",
          "keyPattern" : {
           "N2" : 1,
           "mutant" : 1
          },
          "indexName" : "N2_1_mutant_1",
          "isMultiKey" : false,
          "direction" : "forward",
          "indexBounds" : {
           "N2" : [
            "[\"T\", \"T\"]"
           ],
           "mutant" : [
            "[\"A\", \"A\"]"
           ]
          }
         }
        }
       }
      }
     },
     "rejectedPlans" : [ ]
    }
   ]
  }
 },
 "executionStats" : {
  "nReturned" : 7,
  "executionTimeMillis" : 0,
  "totalKeysExamined" : 8,
  "totalDocsExamined" : 8,
  "executionStages" : {
   "stage" : "SHARD_MERGE",
   "nReturned" : 7,
   "executionTimeMillis" : 0,
   "totalKeysExamined" : 8,
   "totalDocsExamined" : 8,
   "totalChildMillis" : NumberLong(0),
   "shards" : [
    {
     "shardName" : "s1",
     "executionSuccess" : true,
     "executionStages" : {
      "stage" : "LIMIT",
      "nReturned" : 5,
      "executionTimeMillisEstimate" : 0,
      "works" : 7,
      "advanced" : 5,
      "needTime" : 1,
      "needFetch" : 0,
      "saveState" : 0,
      "restoreState" : 0,
      "isEOF" : 1,
      "invalidates" : 0,
      "limitAmount" : 0,
      "inputStage" : {
       "stage" : "KEEP_MUTATIONS",
       "nReturned" : 5,
       "executionTimeMillisEstimate" : 0,
       "works" : 6,
       "advanced" : 5,
       "needTime" : 1,
       "needFetch" : 0,
       "saveState" : 0,
       "restoreState" : 0,
       "isEOF" : 0,
       "invalidates" : 0,
       "inputStage" : {
        "stage" : "SHARDING_FILTER",
        "nReturned" : 5,
        "executionTimeMillisEstimate" : 0,
        "works" : 6,
        "advanced" : 5,
        "needTime" : 0,
        "needFetch" : 0,
        "saveState" : 0,
        "restoreState" : 0,
        "isEOF" : 0,
        "invalidates" : 0,
        "chunkSkips" : 1,
        "inputStage" : {
         "stage" : "FETCH",
         "nReturned" : 6,
         "executionTimeMillisEstimate" : 0,
         "works" : 6,
         "advanced" : 6,
         "needTime" : 0,
         "needFetch" : 0,
         "saveState" : 0,
         "restoreState" : 0,
         "isEOF" : 0,
         "invalidates" : 0,
         "docsExamined" : 6,
         "alreadyHasObj" : 0,
         "inputStage" : {
          "stage" : "IXSCAN",
          "nReturned" : 6,
          "executionTimeMillisEstimate" : 0,
          "works" : 6,
          "advanced" : 6,
          "needTime" : 0,
          "needFetch" : 0,
          "saveState" : 0,
          "restoreState" : 0,
          "isEOF" : 0,
          "invalidates" : 0,
          "keyPattern" : {
           "N2" : 1,
           "mutant" : 1
          },
          "indexName" : "N2_1_mutant_1",
          "isMultiKey" : false,
          "direction" : "forward",
          "indexBounds" : {
           "N2" : [
            "[\"T\", \"T\"]"
           ],
           "mutant" : [
            "[\"A\", \"A\"]"
           ]
          },
          "keysExamined" : 6,
          "dupsTested" : 0,
          "dupsDropped" : 0,
          "seenInvalidated" : 0,
          "matchTested" : 0
         }
        }
       }
      }
     }
    },
    {
     "shardName" : "s2",
     "executionSuccess" : true,
     "executionStages" : {
      "stage" : "LIMIT",
      "nReturned" : 2,
      "executionTimeMillisEstimate" : 0,
      "works" : 3,
      "advanced" : 2,
      "needTime" : 0,
      "needFetch" : 0,
      "saveState" : 0,
      "restoreState" : 0,
      "isEOF" : 1,
      "invalidates" : 0,
      "limitAmount" : 3,
      "inputStage" : {
       "stage" : "KEEP_MUTATIONS",
       "nReturned" : 2,
       "executionTimeMillisEstimate" : 0,
       "works" : 3,
       "advanced" : 2,
       "needTime" : 0,
       "needFetch" : 0,
       "saveState" : 0,
       "restoreState" : 0,
       "isEOF" : 1,
       "invalidates" : 0,
       "inputStage" : {
        "stage" : "SHARDING_FILTER",
        "nReturned" : 2,
        "executionTimeMillisEstimate" : 0,
        "works" : 3,
        "advanced" : 2,
        "needTime" : 0,
        "needFetch" : 0,
        "saveState" : 0,
        "restoreState" : 0,
        "isEOF" : 1,
        "invalidates" : 0,
        "chunkSkips" : 0,
        "inputStage" : {
         "stage" : "FETCH",
         "nReturned" : 2,
         "executionTimeMillisEstimate" : 0,
         "works" : 3,
         "advanced" : 2,
         "needTime" : 0,
         "needFetch" : 0,
         "saveState" : 0,
         "restoreState" : 0,
         "isEOF" : 1,
         "invalidates" : 0,
         "docsExamined" : 2,
         "alreadyHasObj" : 0,
         "inputStage" : {
          "stage" : "IXSCAN",
          "nReturned" : 2,
          "executionTimeMillisEstimate" : 0,
          "works" : 3,
          "advanced" : 2,
          "needTime" : 0,
          "needFetch" : 0,
          "saveState" : 0,
          "restoreState" : 0,
          "isEOF" : 1,
          "invalidates" : 0,
          "keyPattern" : {
           "N2" : 1,
           "mutant" : 1
          },
          "indexName" : "N2_1_mutant_1",
          "isMultiKey" : false,
          "direction" : "forward",
          "indexBounds" : {
           "N2" : [
            "[\"T\", \"T\"]"
           ],
           "mutant" : [
            "[\"A\", \"A\"]"
           ]
          },
          "keysExamined" : 2,
          "dupsTested" : 0,
          "dupsDropped" : 0,
          "seenInvalidated" : 0,
          "matchTested" : 0
         }
        }
       }
      }
     }
    }
   ]
  }
 },
 "ok" : 1
}
mongos>

We can see "totalDocsExamined" : 8 at the SHARD_MERGE stage (6 documents examined on shard s1 plus 2 on shard s2), and that both shards were queried. Based on this output, every other statement is false.

Answer:

  • 2 shards in total are queried.
  • 8 documents in total are examined.