MongoDB--聚合框架，管道

最新推荐文章于 2024-07-15 11:21:56 发布

Keep hunger

最新推荐文章于 2024-07-15 11:21:56 发布

阅读量172

点赞数

分类专栏： MongoDB 文章标签： MongoDB

本文链接：https://blog.csdn.net/ITgagaga/article/details/103298600

版权

MongoDB 专栏收录该内容

33 篇文章 1 订阅

订阅专栏

MongoDB–聚合框架

文章目录

- MongoDB--聚合框架

使用聚合框架可以对集合中的文档进行变换和组合。可以用多个构件创建一个管道(pipeline,类似一个流)，用于对一连串的文档进行处理。这些构件包括：

筛选(filtering)

投射(projecting)

分组(grouping)

排序(sorting)

限制(limiting)

跳过(skipping)

一：管道操作符

准备如下数据：

db.reac.insert([{"title":"a","author":"a1","date":"ag2","count":3},{"title":"tt","author":"a1","date":"afffd2","count":5},{"title":"aas","author":"b","date":"a2g","count":2},{"title":"ffa","author":"c","date":"adg2","count":22},{"title":"afa","author":"c","date":"agd2","count":13},{"title":"aafs","author":"a1","date":"adg2","count":45},{"title":"aee","author":"d","date":"agd2","count":33},{"title":"aef","author":"e","date":"adg2","count":1},{"title":"adg","author":"e","date":"a2sdg","count":63}])

db.reac.insert([{"title":"a","author":"a1","date":"agds2","count":3,"totalPay":0,"count1":4},{"title":"aa","author":"a1","date":"asdg2","count":32,"totalPay":0,"count1":14},{"title":"dsf","author":"a1","date":"acsg2","count":12,"totalPay":0,"count1":3},{"title":"ertf","author":"a1","date":"agccc2","count":11,"totalPay":0,"count1":6}])

1. $match

$match用于对文档集合进行筛选。match可以使用所有的常规查询操作符，但是不能使用地理空间操作符。一般来说match放在管道的前面位置：一是可以快速将不需要的文档过滤掉减小工作量；一方面是如果在投射和分组之前执行match，查询就可以使用索引。

筛选出count大于10的文档
cqsm>db.reac.aggregate({$match:{"count":{"$gt":10}}})
{ "_id" : ObjectId("5ddf7d2dd4733a02ad33ef23"), "title" : "ffa", "author" : "c", "date" : "adg2", "count" : 22 }
{ "_id" : ObjectId("5ddf7d2dd4733a02ad33ef24"), "title" : "afa", "author" : "c", "date" : "agd2", "count" : 13 }
{ "_id" : ObjectId("5ddf7d2dd4733a02ad33ef25"), "title" : "aafs", "author" : "a1", "date" : "adg2", "count" : 45 }
{ "_id" : ObjectId("5ddf7d2dd4733a02ad33ef26"), "title" : "aee", "author" : "d", "date" : "agd2", "count" : 33 }
{ "_id" : ObjectId("5ddf7d2dd4733a02ad33ef28"), "title" : "adg", "author" : "e", "date" : "a2sdg", "count" : 63 }

2. $project

$project可以从文档中提取字段，加载到内存中，还可以在提取的字段上进行一些操作。

//默认情况下_id会被返回，值取0就不会返回了
cqsm>db.reac.aggregate({$project:{"author":1,"_id":0}})
{ "author" : "a1" }
{ "author" : "a1" }
{ "author" : "b" }
{ "author" : "c" }
{ "author" : "c" }
{ "author" : "a1" }
{ "author" : "d" }
{ "author" : "e" }
{ "author" : "e" }

//对投影的字段在返回结果中重命名,一定要排除_id,所以重命名之后不能使用_id为索引
db.reac.aggregate({"$project":{"userId":"$_id","_id":0}})  //成功
db.reac.aggregate({"$project":{"au":"$author","_id":0}})

#####2.1 管道表达式

可以通过一些表达式进行组合进行任意深度的嵌套，以便创造复杂的表达式

2.2 数学表达式

操作符	语法	意义
$add	$add:[ex1,ex2…]	接受一个或者多个表达式作为参数，将参数相加,可以是常数，如果是文档中的数记住要加$
$substract	$substract:[ex1,ex2.]	只接受两个参数，第一个参数减去第二个参数
$multiply	$multiply[ex1,ex2…]	一到多个参数，相乘
$divide	$divide[ex1,ex2.]	只接受两个参数，用第一个参数除以第二个参数
$mod	$mod[ex1,ex2.]	只接受两个参数,取模

cqsm>db.reac.aggregate({"$project":{"totalPay":{"$add":["$count1","$count"]}}})
{ "_id" : ObjectId("5ddf7d2dd4733a02ad33ef20"), "totalPay" : null }
{ "_id" : ObjectId("5ddf7d2dd4733a02ad33ef21"), "totalPay" : null }
{ "_id" : ObjectId("5ddf7d2dd4733a02ad33ef22"), "totalPay" : null }
{ "_id" : ObjectId("5ddf7d2dd4733a02ad33ef23"), "totalPay" : null }
{ "_id" : ObjectId("5ddf7d2dd4733a02ad33ef24"), "totalPay" : null }
{ "_id" : ObjectId("5ddf7d2dd4733a02ad33ef25"), "totalPay" : null }
{ "_id" : ObjectId("5ddf7d2dd4733a02ad33ef26"), "totalPay" : null }
{ "_id" : ObjectId("5ddf7d2dd4733a02ad33ef27"), "totalPay" : null }
{ "_id" : ObjectId("5ddf7d2dd4733a02ad33ef28"), "totalPay" : null }
{ "_id" : ObjectId("5ddf894fd4733a02ad33ef29"), "totalPay" : 7 }
{ "_id" : ObjectId("5ddf894fd4733a02ad33ef2a"), "totalPay" : 46 }
{ "_id" : ObjectId("5ddf894fd4733a02ad33ef2b"), "totalPay" : 15 }
{ "_id" : ObjectId("5ddf894fd4733a02ad33ef2c"), "totalPay" : 17 }

2.3 日期表达式

日期字段的表达式，只能对日期类型的字段进行日期操作，不能对数值类型的字段进行日期操作。使用方法都是一样的，接受一个日期类型的参数，返回一个数值

$year
$month
$week
$dayOfMonth
$dayOfWeek
$dayOfYear
$hour
$minute
$second

//抽取出字段totalPay，获取当前日期的年份再赋值给totalPay
db.reac.aggregate({$project:{"totalPay":{$year:new Date()}}})

2.4 字符串表达式

操作符和语法	含义
$substr:[exp,startOffset,numToReturn]	exp是字符串，返回从startOffset开始的numToReturn个字节，是字节
$concat:[exp1,exp2…]	多个字符串连接
$toLower:exp	返回小写形式
$toUpper:exp	返回大写形式

2.5 逻辑表达式

逻辑表达式比较多

$cmp:[exp1,exp2] //比较两个参数的大小,返回0/1
$strcasecmp:[string1,string2]//比较两个罗马字符组成的字符串的大小
gt,gte,lt,lte [exp1,exp2] //返回true或者false
$and:[exp1,exp2…] //如果所有的表达式都是true就返回true
$or:[exp1,exp2…] //如果任意的表达式都是true就返回true
$not:exp //对参数取反
$cond:[booleanExpr,trueExpr,falseExpr] //如果booleanExpr为true就返回trueExpr，否则返回falseExpr
$ifNull:[expr,expr2] //如果expr是null就返回expr2

2.6 注意项

算数操作符只能接受数值类型参数，日期操作符只能接受日期类型的参数，字符串操作符只能接受字符串类型的参数。必要的时候进行条件判断，避免报错。

2.6 运用

找出autuor是a1并且title是a的文档，将totalPay改为count+count1

cqsm>db.reac.aggregate({$match:{"author":"a1","title":"a"}},{$project:{"totalPay":{$add:["$count","$count1"]},"author":1,"title":1}})
{ "_id" : ObjectId("5ddf7d2dd4733a02ad33ef20"), "title" : "a", "author" : "a1", "totalPay" : null }
{ "_id" : ObjectId("5ddf894fd4733a02ad33ef29"), "title" : "a", "author" : "a1", "totalPay" : 7 }

3. $group

$group可以将文档依据特定字段的不同值进行分组。

//按照title分组
cqsm>db.reac.aggregate({"$group":{"_id":"$title"}})
{ "_id" : "ertf" }
{ "_id" : "dsf" }
{ "_id" : "ffa" }
{ "_id" : "tt" }
{ "_id" : "aas" }
{ "_id" : "aee" }
{ "_id" : "aa" }
{ "_id" : "afa" }
{ "_id" : "a" }
{ "_id" : "aafs" }
{ "_id" : "aef" }
{ "_id" : "adg" }

3.1 分组操作符

分组操作符可以允许对每个分组进行计算。例如sum,它就对计算结果+1.

3.2 算数操作符

**" $s u m " : v a l u e * * ，对分组的每一个文档，将 v a l u e 与计算结果相加，注意引用文档中的字段一定要加$

cqsm>db.reac.aggregate({$group:{"_id":"$title","totalPay":{"$sum":"$count"}}})
{ "_id" : "ertf", "totalPay" : 11 }
{ "_id" : "dsf", "totalPay" : 12 }
{ "_id" : "ffa", "totalPay" : 22 }
{ "_id" : "tt", "totalPay" : 5 }
{ "_id" : "aas", "totalPay" : 2 }
{ "_id" : "aee", "totalPay" : 33 }
{ "_id" : "aa", "totalPay" : 32 }
{ "_id" : "afa", "totalPay" : 13 }
{ "_id" : "a", "totalPay" : 6 }
{ "_id" : "aafs", "totalPay" : 45 }
{ "_id" : "aef", "totalPay" : 1 }
{ "_id" : "adg", "totalPay" : 63 }

"$avg":value，返回每个分组的平均值

cqsm>db.reac.aggregate({"$group":{"_id":"$author","totalPay":{$avg:"$count"}}})
{ "_id" : "e", "totalPay" : 32 }
{ "_id" : "a1", "totalPay" : 15.857142857142858 }
{ "_id" : "b", "totalPay" : 2 }
{ "_id" : "c", "totalPay" : 17.5 }
{ "_id" : "d", "totalPay" : 33 }

3.3 极值操作符

"$max":expr，返回分组内的最大值

"$min":expr，返回分组内的最小值

db.reac.aggregate({"$group":{"_id":"$author","totalPay":{"$max":"$count"}}})

db.reac.aggregate({"$group":{"_id":"$author","totalPay":{"$min":"$count"}}})

"$first":expr，返回分组的第一个值，如果数据没排序这个操作没有意义

"$lase":expr，返回分组的最后一个值

db.reac.aggregate({"$group":{"_id":"$author","totalPay":{"$last":"$count"}}})

3.4 数组操作符

“$addToSet”:expr，如果数组不包含expr那就将它添加到数组中，正在返回结果集中，每个元素最多只出现一次，而且元素是不确定的。

“$push”:expr ,不管expr是什么值，都将它添加到数组中。返回包含所有值得数组。

4. $unwind

$unwind可以将数组中的每一个值拆分为单独的文档。

//想要得到的定用户的所有评论，只需要评论
db.blog.aggregate({"$project":{"comments":"$comments"}},{"$unwind":"$comments"},{"$mstch":{"commnets.author":"Mark"}})

5. $sort

$sort可以根据任何字段进行排序，与普通查询的语法相同。sort不能流式工作，只能在接受到所有的文档之后才能排序。

//排序使用的字段可以是文档中存在的字段，也可以是投影的时候重命名的文件
cqsm>db.reac.aggregate({$project:{"title":1,"sort":"$count","_id":0}},{"$sort":{"sort":1}})
{ "title" : "aef", "sort" : 1 }
{ "title" : "aas", "sort" : 2 }
{ "title" : "a", "sort" : 3 }
{ "title" : "a", "sort" : 3 }
{ "title" : "tt", "sort" : 5 }
{ "title" : "ertf", "sort" : 11 }
{ "title" : "dsf", "sort" : 12 }
{ "title" : "afa", "sort" : 13 }
{ "title" : "ffa", "sort" : 22 }
{ "title" : "aa", "sort" : 32 }
{ "title" : "aee", "sort" : 33 }
{ "title" : "aafs", "sort" : 45 }
{ "title" : "adg", "sort" : 63 }

6. $limit

db.reac.find().limit(3)  //只返回三条数据

二：注意

尽量在管道的开始阶段执行projiect,group,unwind操作，这样就先将不需要的字段和文档过滤掉；
管道如果不是在原本的集合中使用的数据就不能使用索引；
MongoDB不允许单一的聚合操作占用过多的系统内存，如果MongoDB发现某个聚合操作占用了20%以上的内存，就会直接报错；

Keep hunger

关注

0
点赞
踩
0

收藏

觉得还不错? 一键收藏
0
评论
MongoDB--聚合框架，管道

MongoDB–聚合框架文章目录MongoDB--聚合框架一：管道操作符1. $match2. $project2.2 数学表达式2.3 日期表达式2.4 字符串表达式2.5 逻辑表达式2.6 注意项2.6 运用3. $group使用聚合框架可以对集合中的文档进行变换和组合。可以用多个构件创建一个管道(pipeline,类似一个流)，用于对一连串的文档进行处理。这些构件包括：筛选(filte...
复制链接

扫一扫

专栏目录