M101P: MongoDB for Developers - Final Exam

Q1

> db.messages.find({"headers.From":"andrew.fastow@enron.com","headers.To":"jeff.skilling@enron.com"},{"headers":1}).count()
3

Q2


# The To field of some messages contains duplicate recipients
db.messages.aggregate([
{$project:{"headers.From":1,"headers.To":1}}
,{$unwind:"$headers.To"}
,{$group:{_id:{id:"$_id",from:"$headers.From",to:"$headers.To"},count:{$sum:1}}}
,{$match:{count:{$gt:1}}}
],{allowDiskUse:true})

{ "_id" : { "id" : ObjectId("4f16fc98d1e2d32371004584"), "from" : "rhonda.denton@enron.com", "to" : "melissa.murphy@enron.com" }, "count" : 2 }

# Confirm
db.messages.find({_id:ObjectId("4f16fc98d1e2d32371004584")})

    "melissa.murphy@enron.com",
    ...
    "melissa.murphy@enron.com",

# Test whether $setUnion removes the duplicates
#db.messages.aggregate([ {$project:{"headers.From":1,"headers.To":1,newto:{$setUnion:["$headers.To"]}}},{$match:{"$headers.To.1":{$exists:true}}} ])
db.messages.aggregate([ {$project:{"headers.From":1,"headers.To":1,newto:{$setUnion:["$headers.To"]}}},{$match:{_id:ObjectId("4f16fc98d1e2d32371004584")}} ]).pretty()


# Incorrect approach: a recipient listed twice in one message is counted twice
db.messages.aggregate([
{$unwind: "$headers.To"}
, {$group : {_id : {"from": "$headers.From", "to" : "$headers.To"}, emails : {$sum: 1}}}
, {$sort: {emails : -1}}
, {$limit: 5}
])


# Correct approach: deduplicate To with $setUnion before unwinding
db.messages.aggregate([
{$project:{"headers.From":1,"headers.To":1,newto:{$setUnion:["$headers.To"]}}}
,{$unwind:"$newto"}
,{$group:{
  _id:{from:"$headers.From",to:"$newto"}
  ,count:{$sum:1}
  }}
,{$sort:{count:-1}}
,{$limit:5}
])

{ "_id" : { "from" : "susan.mara@enron.com", "to" : "jeff.dasovich@enron.com" }, "count" : 750 }
{ "_id" : { "from" : "soblander@carrfut.com", "to" : "soblander@carrfut.com" }, "count" : 679 }
{ "_id" : { "from" : "susan.mara@enron.com", "to" : "james.steffes@enron.com" }, "count" : 646 }
{ "_id" : { "from" : "susan.mara@enron.com", "to" : "richard.shapiro@enron.com" }, "count" : 616 }
{ "_id" : { "from" : "evelyn.metoyer@enron.com", "to" : "kate.symes@enron.com" }, "count" : 567 }


# Another correct approach: deduplicate per message with $addToSet
db.messages.aggregate([
    {"$unwind" : "$headers.To"},
    {
        "$group" : {
            "_id" : {
                "_id" : "$_id",
                "from" : "$headers.From"
            },
            "to" : {"$addToSet" : "$headers.To"}
        }
    },
    {"$unwind" : "$to"},
    {
        "$group" : {
            "_id" : {
                "from" : "$_id.from",
                "to" : "$to"
            },
            "count_msg" : {
                "$sum" : 1
            }
        }
    },
    { "$sort" : {"count_msg" : -1} },
    { "$limit" : 5 }
])

{ "_id" : { "from" : "susan.mara@enron.com", "to" : "jeff.dasovich@enron.com" }, "count_msg" : 750 }
{ "_id" : { "from" : "soblander@carrfut.com", "to" : "soblander@carrfut.com" }, "count_msg" : 679 }
{ "_id" : { "from" : "susan.mara@enron.com", "to" : "james.steffes@enron.com" }, "count_msg" : 646 }
{ "_id" : { "from" : "susan.mara@enron.com", "to" : "richard.shapiro@enron.com" }, "count_msg" : 616 }
{ "_id" : { "from" : "evelyn.metoyer@enron.com", "to" : "kate.symes@enron.com" }, "count_msg" : 567 }
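
Why the deduplication step matters can be seen on a tiny in-memory example. The sketch below is not part of the exam solution; the sample documents and the helper name count_pairs are made up for illustration, and a Python set plays the role that $setUnion / $addToSet play in the pipelines above.

```python
from collections import Counter

# Hypothetical sample data shaped like documents in the messages collection.
messages = [
    {"_id": 1, "headers": {"From": "a@x.com",
                           "To": ["b@x.com", "b@x.com", "c@x.com"]}},
    {"_id": 2, "headers": {"From": "a@x.com", "To": ["b@x.com"]}},
]


def count_pairs(docs, dedupe):
    """Count (from, to) pairs over all messages.

    With dedupe=True each recipient counts at most once per message,
    which is what $setUnion / $addToSet achieve in the aggregation.
    """
    counts = Counter()
    for doc in docs:
        recipients = doc["headers"]["To"]
        if dedupe:
            recipients = set(recipients)
        for to in recipients:
            counts[(doc["headers"]["From"], to)] += 1
    return counts


naive = count_pairs(messages, dedupe=False)
correct = count_pairs(messages, dedupe=True)
print(naive[("a@x.com", "b@x.com")])    # 3: message 1 is counted twice
print(correct[("a@x.com", "b@x.com")])  # 2: one per message
```

Without deduplication, message 1 contributes two counts for the same pair, which is exactly the inflation the incorrect pipeline suffers from.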

Q3

> db.messages.find({"headers.Message-ID":"<8147308.1075851042335.JavaMail.evans@thyme>"}).count()
1
> db.messages.find({"headers.Message-ID":"<8147308.1075851042335.JavaMail.evans@thyme>"}).pretty()

db.messages.update({"headers.Message-ID":"<8147308.1075851042335.JavaMail.evans@thyme>"},{$push:{"headers.To":"mrpotatohead@mongodb.com"}})

Q4

self.posts.update_one( {'permalink': permalink} , {'$inc':{"comments.%s.num_likes"%(comment_ordinal):1}})

With PyMongo 2.5 the script fails, because update_one does not exist in that version; PyMongo must be upgraded to 3.x.
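
The update document in the Q4 answer relies on dot notation with an array index. As a minimal sketch, the hypothetical helper build_like_update (a name introduced here, not part of the course code) builds that document on its own so it can be inspected without a database connection:

```python
def build_like_update(comment_ordinal):
    """Return the update document that increments num_likes on one comment.

    "comments.<n>.num_likes" addresses the n-th element (0-based) of the
    comments array through dot notation.
    """
    field = "comments.%d.num_likes" % comment_ordinal
    return {"$inc": {field: 1}}


update = build_like_update(2)
print(update)  # {'$inc': {'comments.2.num_likes': 1}}
```

The result is the second argument passed to update_one in the answer above; the first argument remains the {'permalink': permalink} filter.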

Q5

a_1_c_1
a_1_b_1
a_1_b_1_c_-1
c_1

Q6

Remove all indexes from the collection, leaving only the index on _id in place
Set w=0, j=0 on writes

Q7

> db.images.find({tags:"kittens"}).count()
49932

#!/usr/bin/env python
import pymongo

conn = pymongo.MongoClient(host="127.0.0.1", port=27017)
db = conn.photo

# Collect the ids of every image referenced by at least one album.
album_images = set()
for album in db.albums.find():
    album_images.update(album['images'])

# Collect the ids of all images in the collection.
images = set(image['_id'] for image in db.images.find())

# Orphans are images that no album references.
orphan_images = images - album_images
print(len(orphan_images))
for image_id in orphan_images:
    db.images.delete_one({'_id': image_id})

# count() is the PyMongo 3.x API; PyMongo 4 removed it in favor of count_documents().
print(db.images.count())
print(db.images.find({'tags': "kittens"}).count())

10263
89737
44822

44822

Q8

Maybe, it depends on whether Node 2 has processed the write.

Q9

patient_id

Q10

The query scanned every document in the collection.
The query avoided sorting the documents because it was able to use an index's ordering.


> db.messages.count()
120477

-eof-
