MongoDB aggregate: grouping and counting (group count)

1. Collection name: mycol. Initial data

// 1
{
    "_id": ObjectId("5e05fe4a32780f42806a80c5"),
    "author": "tom",
    "books": [
        { "type": "IT类", "name": "mongodb", "price": NumberInt("100") },
        { "type": "IT类", "name": "java", "price": NumberInt("50") },
        { "type": "文学类", "name": "红楼梦", "price": NumberInt("20") }
    ],
    "year": NumberInt("2018")
}

// 2
{
    "_id": ObjectId("5e05fe6032780f42806a80c7"),
    "author": "tom",
    "books": [
        { "type": "IT类", "name": "程序员的修养", "price": NumberInt("30") },
        { "type": "文学类", "name": "简爱", "price": NumberInt("50") },
        { "type": "哲学类", "name": "西方哲学史", "price": NumberInt("20") }
    ],
    "year": "2019"
}

// 3
{
    "_id": ObjectId("5e05fe6332780f42806a80c9"),
    "author": "jack",
    "books": [
        { "type": "IT类", "name": "程序员的修养", "price": NumberInt("30") },
        { "type": "文学类", "name": "简爱", "price": NumberInt("50") },
        { "type": "哲学类", "name": "西方哲学史", "price": NumberInt("20") }
    ],
    "year": "2019"
}

// 4
{
    "_id": ObjectId("5e0854c432780f0d80c7bb43"),
    "author": "jack",
    "books": [
        { "type": "IT类", "name": "程序员的修养", "price": NumberInt("30") },
        { "type": "IT类", "name": "高并发编程", "price": NumberInt("50") }
    ],
    "year": "2018"
}

2. Query: in 2018, how many books did each author write in each book category?

2.1 Filter to the year 2018. $match is the equivalent of SQL's WHERE.

db.mycol.aggregate([
  { $match: { year: 2018 } }
]);
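
Equality in $match is type-sensitive: the number 2018 does not match the string "2018". The sketch below is plain JavaScript, not MongoDB code. It uses strict equality as a rough stand-in for that behaviour, and `docs` is a hypothetical, trimmed copy of the sample data that keeps only author and year.

```javascript
// Hypothetical, trimmed copy of the four sample documents.
const docs = [
  { author: "tom", year: 2018 },    // year stored as a number
  { author: "tom", year: "2019" },  // year stored as a string
  { author: "jack", year: "2019" }, // year stored as a string
  { author: "jack", year: "2018" }, // year stored as a string
];

// Rough stand-in for { $match: { year: 2018 } }: no string/number coercion.
const matched = docs.filter((d) => d.year === 2018);

// Rough stand-in for { $match: { year: { $in: [2018, "2018"] } } },
// which accepts either representation.
const matchedEither = docs.filter((d) => d.year === 2018 || d.year === "2018");

console.log(matched.length, matchedEither.length);
```

If the year field is stored with mixed types, as in the sample data, either normalise the stored values or match on both representations.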

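2.2 The books field is an array, so it cannot be grouped on directly and has to be flattened first. $unwind splits a document whose array holds several elements into several documents, one per element.

db.mycol.aggregate([
  { $match: { year: 2018 } },
  { $unwind: "$books" }
]);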
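
To make the flattening concrete, here is a minimal plain-JavaScript sketch of what $unwind does to one document. It is an illustration only, not how MongoDB implements the stage, and `docs` is a hypothetical, trimmed copy of sample document 1.

```javascript
// Hypothetical one-document sample, trimmed from the data above.
const docs = [
  {
    author: "tom",
    books: [
      { type: "IT类", name: "mongodb" },
      { type: "IT类", name: "java" },
      { type: "文学类", name: "红楼梦" },
    ],
  },
];

// Stand-in for { $unwind: "$books" }: emit one row per array element,
// with the array field replaced by that single element.
const unwound = docs.flatMap((doc) =>
  doc.books.map((book) => ({ ...doc, books: book }))
);

console.log(unwound.length); // 3
console.log(unwound.map((row) => row.books.name));
```

After this step every row carries exactly one book, so `books.type` is a plain value that can be used as a grouping key.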

2.3 Group by author and book category

db.mycol.aggregate([
  { $match: { year: 2018 } },
  { $unwind: "$books" },
  { $group: { "_id": { "author": "$author", "bookType": "$books.type" }, count: { $sum: 1 } } }
]);
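
The grouping step can be pictured as building a map keyed by the compound _id. The plain-JavaScript sketch below assumes hypothetical `rows` shaped like the output of the $unwind stage. It mimics the effect of $group with count: {$sum: 1} and makes no claim about MongoDB's internals.

```javascript
// Hypothetical rows, shaped like the output of $unwind on sample document 1.
const rows = [
  { author: "tom", books: { type: "IT类" } },
  { author: "tom", books: { type: "IT类" } },
  { author: "tom", books: { type: "文学类" } },
];

// Stand-in for $group with a compound _id and count: { $sum: 1 }.
const groups = new Map();
for (const row of rows) {
  const id = { author: row.author, bookType: row.books.type };
  const key = JSON.stringify(id); // serialise the compound key for Map lookup
  const group = groups.get(key) || { _id: id, count: 0 };
  group.count += 1; // $sum: 1 adds one per input row
  groups.set(key, group);
}

const result = [...groups.values()];
console.log(result);
```

Each distinct (author, bookType) pair ends up as one output document, which is why the _id of a $group result is always unique.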

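2.4 Use $project to control which fields appear in the result. This is the equivalent of the SELECT list in SQL: it decides which columns are shown and what their aliases are.

db.mycol.aggregate([
  { $match: { year: 2018 } },
  { $unwind: "$books" },
  { $group: { "_id": { "author": "$author", "bookType": "$books.type" }, count: { $sum: 1 } } },
  { $project: { "_id": 0, "author": "$_id.author", "bookType": "$_id.bookType", "bookTypeCnt": "$count" } }
]);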

2.5 Sort by author ({author: -1} sorts in descending order)

db.mycol.aggregate([
  { $match: { year: 2018 } },
  { $unwind: "$books" },
  { $group: { "_id": { "author": "$author", "bookType": "$books.type" }, count: { $sum: 1 } } },
  { $project: { "_id": 0, "author": "$_id.author", "bookType": "$_id.bookType", "bookTypeCnt": "$count" } },
  { $sort: { author: -1 } }
]);

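Final query result. Note that the output below covers all four documents, which is what the pipeline returns when the $match stage is omitted. With { $match: { year: 2018 } } in place, only document 1 matches, because it is the only one whose year is stored as the number 2018 (document 4 stores the string "2018"); the result would then contain just tom's IT类 (2) and 文学类 (1) groups.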
// 1
{
    "author": "tom",
    "bookType": "IT类",
    "bookTypeCnt": 3
}

// 2
{
    "author": "tom",
    "bookType": "文学类",
    "bookTypeCnt": 2
}

// 3
{
    "author": "tom",
    "bookType": "哲学类",
    "bookTypeCnt": 1
}

// 4
{
    "author": "jack",
    "bookType": "文学类",
    "bookTypeCnt": 1
}

// 5
{
    "author": "jack",
    "bookType": "哲学类",
    "bookTypeCnt": 1
}

// 6
{
    "author": "jack",
    "bookType": "IT类",
    "bookTypeCnt": 3
}
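
The reshaping done by $project and the ordering done by $sort can be mimicked in plain JavaScript as follows. This is a sketch under assumptions: `grouped` is a hypothetical two-element stand-in for the $group output, and the string comparison only approximates MongoDB's default ordering for simple ASCII names like these.

```javascript
// Hypothetical $group output for two of the groups.
const grouped = [
  { _id: { author: "jack", bookType: "IT类" }, count: 3 },
  { _id: { author: "tom", bookType: "IT类" }, count: 3 },
];

// Stand-in for $project: drop _id, lift its parts to top-level fields,
// and expose count under the alias bookTypeCnt.
const projected = grouped.map((g) => ({
  author: g._id.author,
  bookType: g._id.bookType,
  bookTypeCnt: g.count,
}));

// Stand-in for { $sort: { author: -1 } }: descending by author.
projected.sort((a, b) =>
  a.author < b.author ? 1 : a.author > b.author ? -1 : 0
);

console.log(projected.map((p) => p.author)); // [ 'tom', 'jack' ]
```

Because the sort key is author only, the relative order of one author's categories is not determined by this stage.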

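3. Query: how many different categories of books did each author write?

The first three steps are the same as in the second query: first $match, then $unwind to flatten the array, then group by author and type.

db.mycol.aggregate([
  { $match: { year: 2018 } },
  { $unwind: "$books" },
  { $group: { "_id": { "author": "$author", "bookType": "$books.type" } } }
]);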

Result (computed over all four documents, i.e. without the $match stage):

// 1
{
    "_id": {
        "author": "jack",
        "bookType": "文学类"
    }
}

// 2
{
    "_id": {
        "author": "tom",
        "bookType": "IT类"
    }
}

// 3
{
    "_id": {
        "author": "tom",
        "bookType": "文学类"
    }
}

// 4
{
    "_id": {
        "author": "tom",
        "bookType": "哲学类"
    }
}

// 5
{
    "_id": {
        "author": "jack",
        "bookType": "哲学类"
    }
}

// 6
{
    "_id": {
        "author": "jack",
        "bookType": "IT类"
    }
}

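Then group again by author and count. The first $group has already collapsed duplicate (author, category) pairs, so each remaining document stands for one distinct category.

db.mycol.aggregate([
  { $match: { year: 2018 } },
  { $unwind: "$books" },
  { $group: { "_id": { "author": "$author", "bookType": "$books.type" } } },
  { $group: { "_id": { "author": "$_id.author" }, "typeCnt": { $sum: 1 } } }
]);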

Query result:

// 1
{
    "_id": {
        "author": "tom"
    },
    "typeCnt": 3
}

// 2
{
    "_id": {
        "author": "jack"
    },
    "typeCnt": 3
}
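
The two-stage grouping amounts to "deduplicate the pairs, then count per author". A minimal plain-JavaScript sketch of that idea follows. The `rows` array is hypothetical illustration data (one row per book), not the sample collection.

```javascript
// Hypothetical unwound rows: one row per book.
const rows = [
  { author: "tom", type: "IT类" },
  { author: "tom", type: "IT类" },
  { author: "tom", type: "文学类" },
  { author: "jack", type: "IT类" },
];

// First $group: collapse duplicate (author, type) pairs.
const pairs = new Set(rows.map((r) => JSON.stringify([r.author, r.type])));

// Second $group: count the remaining pairs per author.
const typeCnt = {};
for (const pair of pairs) {
  const [author] = JSON.parse(pair);
  typeCnt[author] = (typeCnt[author] || 0) + 1;
}

console.log(typeCnt); // { tom: 2, jack: 1 }
```

Counting directly per author without the first step would give the number of books (3 for tom here), not the number of distinct categories.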

