The Comments Conundrum

霸气显露无疑!看例子了解Mongodb聚合框架

One of the most common questions we get is:

I have a collection of blog posts and each post has an array of comments. How do I get…
…all comments by a given author
…the most recent comments
…the most popular commenters?

And so on. The answer to this has always been “Well, you can’t do that on the server side…” You can either do it on the client side or store comments in their own collection. What you really want is the ability to treat embedded documents like a “real” collection.

The aggregation pipeline gives you this ability by letting you “unwind” arrays into separate documents, then doing whatever else you need to do in subsequent pipeline operators.

For example…

Getting all comments by Serious Cat

Serious Cat’s comments are scattered between post documents, so there wasn’t a good way of querying for just those embedded documents. Now there is.

Let’s assume we want each comment by Serious Cat, along with the title and url of the post Serious Cat was commenting on. So, the steps we need to take are:

  1. Extract the fields we want (title, url, comments)
  2. Unwind the comments field: make each comment into a “real” document
  3. Query our new “comments collection” for “Serious Cat”

Using the aggregation pipeline, this looks like:




> db.runCommand({aggregate: "posts", pipeline: [
{
   // extract the fields 
   $project: {
        title : 1,
        url : 1,
        comments : 1
    }
},
{
    // explode the "comments" array into separate documents
    $unwind: "$comments"
},
{
    // query like a boss
    $match: {comments.author : "Serious Cat"}
}]})

Now, this works well for something like a blog, where you have human-generated (small) data. If you’ve got gigs of comments to go through, you probably want to filter out as many as possible (e.g., with $match or $limit) before sending it to the “everything-in-memory” parts of the pipeline.

Getting the most recent comments

Let’s assume our site lists the 10 most recent comments across all posts, with links back to the posts they appeared on, e.g.,

  1. Great post! -Jerry (February 2nd, 2012) from This is a Great Post
  2. What does batrachophagous mean? -Fred (February 2nd, 2012) from Fun with Crosswords
  3. Where can I get discount Prada shoes? -Tom (February 1st, 2012) from Rant about Spam

To extract these comments from a collection of posts, you could do something like:

> db.runCommand({aggregate: "posts", pipeline: [
{
   // extract the fields
   $project: {
        title : 1,
        url : 1,
        comments : 1
    }
{
    // explode "comments" array into separate documents
    $unwind: "$comments"
},
{
    // sort newest first
    $sort: {
        "comments.date" : -1
    }
},
{
    // get the 10 newest
    $limit: 10
}]})

Let’s take a moment to look at what $unwind does to a sample document.

Suppose you have a document that looks like this after the $project:

{
    "url" : "/blog/spam",
    "title" : "Rant about Spam",
    "comments" : [
        {text : "Where can I get discount Prada shoes?", ...},
        {text : "First!", ...},
        {text : "I hate spam, too!", ...},
        {text : "I love spam.", ...}
    ]
}

Then, after unwinding the comments field, you’d have:

{
    "url" : "/blog/spam",
    "title" : "Rant about Spam",
    "comments" : [
        {text : "Where can I get discount Prada shoes?", ...},
    ]
}
{
    "url" : "/blog/spam",
    "title" : "Rant about Spam",
    "comments" : [
        {text : "First!", ...}
    ]
}
{
    "url" : "/blog/spam",
    "title" : "Rant about Spam",
    "comments" : [
        {text : "I hate spam, too!", ...}
    ]
},
{
    "url" : "/blog/spam",
    "title" : "Rant about Spam",
    "comments" : [
        {text : "I love spam.", ...}
    ]
}

Then we $sort$limit, and Bob’s your uncle.

Rank commenters by popularity

Suppose we allow users to upvote comments and we want to see who the most popular commenters are.

The steps we want to take are:

  1. Project out the fields we need (similar to above)
  2. Unwind the comments array (similar to above)
  3. Group by author, taking a count of votes (this will sum up all of the votes for each comment)
  4. Sort authors to find the most popular commenters

Using the pipeline, this would look like:






> db.runCommand({aggregate: "posts", pipeline: [
{
   // extract the fields we'll need
   $project: {
        title : 1,
        url : 1,
        comments : 1
    }
},
{
    // explode "comments" array into separate documents
    $unwind: "$comments"
},
{
    // count up votes by author
    $group : {
        _id : "$comments.author",
        popularity : {$sum : "$comments.votes"}
    }
},
{
    // sort by the new popular field
    $sort: {
        "popularity" : -1
    }
}]})

As I mentioned before, there are a couple downsides to using the aggregation pipeline: a lot of the pipeline is done in-memory and can be very CPU- and memory-intensive. However, used judiciously, it give you a lot more freedom to mush around your embedded documents.

  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论
提供的源码资源涵盖了Java应用等多个领域,每个领域都包含了丰富的实例和项目。这些源码都是基于各自平台的最新技术和标准编写,确保了在对应环境下能够无缝运行。同时,源码中配备了详细的注释和文档,帮助用户快速理解代码结构和实现逻辑。 适用人群: 适合毕业设计、课程设计作业。这些源码资源特别适合大学生群体。无论你是计算机相关专业的学生,还是对其他领域编程感兴趣的学生,这些资源都能为你提供宝贵的学习和实践机会。通过学习和运行这些源码,你可以掌握各平台开发的基础知识,提升编程能力和项目实战经验。 使用场景及目标: 在学习阶段,你可以利用这些源码资源进行课程实践、课外项目或毕业设计。通过分析和运行源码,你将深入了解各平台开发的技术细节和最佳实践,逐步培养起自己的项目开发和问题解决能力。此外,在求职或创业过程中,具备跨平台开发能力的大学生将更具竞争力。 其他说明: 为了确保源码资源的可运行性和易用性,特别注意了以下几点:首先,每份源码都提供了详细的运行环境和依赖说明,确保用户能够轻松搭建起开发环境;其次,源码中的注释和文档都非常完善,方便用户快速上手和理解代码;最后,我会定期更新这些源码资源,以适应各平台技术的最新发展和市场需求。 所有源码均经过严格测试,可以直接运行,可以放心下载使用。有任何使用问题欢迎随时与博主沟通,第一时间进行解答!

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值