问题:
MongoDB,每条数据都有一个创建时间(create_time)字段。
按月统计一下业务数据的创建次数。
思路:
典型的数据库聚合操作,能用一条命令搞定。
直接上脚本
数据记录样例:
初始脚本
db.getCollection('XXX').aggregate(
[
{
$project: {
date: {
$dateToString: {
format: "%Y-%m-%d",
date: "$create_time"
}
},
year: {
$dateToString: {
format: "%Y",
date: "$create_time"
}
},
year_month: {
$dateToString: {
format: "%Y-%m",
date: "$create_time"
}
},
day: {
$dayOfMonth: "$create_time"
},
responseTime: "$create_time"
}
},
{
$match: {
$or: [
{
year: '2017'
},
{
year: '2018'
},
{
year: '2019'
},
{
year: '2020'
},
{
year: '2021'
}
]
}
},
{
$group: {
"_id": "$year_month",
"sum": {
$sum: 1
}
}
},
{
$sort: {
"_id": 1
}
}
]
)
执行结果
/* 1 */
{
"_id" : "2019-05",
"sum" : 19.0
}
/* 2 */
{
"_id" : "2019-06",
"sum" : 68.0
}
/* 3 */
{
"_id" : "2019-07",
"sum" : 66.0
}
/* 4 */
{
"_id" : "2019-08",
"sum" : 113.0
}
/* 5 */
{
"_id" : "2019-09",
"sum" : 42.0
}
/* 6 */
{
"_id" : "2019-10",
"sum" : 17.0
}
/* 7 */
{
"_id" : "2019-11",
"sum" : 12.0
}
/* 8 */
{
"_id" : "2019-12",
"sum" : 67.0
}
/* 9 */
{
"_id" : "2020-01",
"sum" : 13.0
}
/* 10 */
{
"_id" : "2020-02",
"sum" : 57.0
}
/* 11 */
{
"_id" : "2020-03",
"sum" : 45.0
}
/* 12 */
{
"_id" : "2020-04",
"sum" : 49.0
}
/* 13 */
{
"_id" : "2020-05",
"sum" : 23.0
}
/* 14 */
{
"_id" : "2020-06",
"sum" : 47.0
}
/* 15 */
{
"_id" : "2020-07",
"sum" : 67.0
}
/* 16 */
{
"_id" : "2020-08",
"sum" : 66.0
}
/* 17 */
{
"_id" : "2020-09",
"sum" : 27.0
}
/* 18 */
{
"_id" : "2020-10",
"sum" : 20.0
}
/* 19 */
{
"_id" : "2020-11",
"sum" : 22.0
}
/* 20 */
{
"_id" : "2020-12",
"sum" : 15.0
}
/* 21 */
{
"_id" : "2021-01",
"sum" : 15.0
}
/* 22 */
{
"_id" : "2021-02",
"sum" : 1.0
}
/* 23 */
{
"_id" : "2021-03",
"sum" : 13.0
}
/* 24 */
{
"_id" : "2021-04",
"sum" : 7.0
}
/* 25 */
{
"_id" : "2021-05",
"sum" : 8.0
}
/* 26 */
{
"_id" : "2021-06",
"sum" : 6.0
}
/* 27 */
{
"_id" : "2021-07",
"sum" : 3.0
}
优化执行语句一
有同学说了,match部分你按年写的,每次都要加年限,好麻烦哦。
db.getCollection('XXX').aggregate(
[
{
$project: {
date: {
$dateToString: {
format: "%Y-%m-%d",
date: "$create_time"
}
},
year: {
$dateToString: {
format: "%Y",
date: "$create_time"
}
},
year_month: {
$dateToString: {
format: "%Y-%m",
date: "$create_time"
}
},
day: {
$dayOfMonth: "$create_time"
},
responseTime: "$create_time"
}
},
{
$match: {
year: {$regex: "20[0-9]{2}"}
}
},
{
$group: {
"_id": "$year_month",
"sum": {
$sum: 1
}
}
},
{
$sort: {
"_id": 1
}
}
]
)
咱们直接正则匹配一下就可以喽。(year: {$regex: "20[0-9]{2}"})
优化执行语句二
又有同学说了,你咋知道的必须是20开头的呢,九几年的不行吗。
db.getCollection('XXX').aggregate(
[
{
$project: {
date: {
$dateToString: {
format: "%Y-%m-%d",
date: "$create_time"
}
},
year: {
$dateToString: {
format: "%Y",
date: "$create_time"
}
},
year_month: {
$dateToString: {
format: "%Y-%m",
date: "$create_time"
}
},
day: {
$dayOfMonth: "$create_time"
},
responseTime: "$create_time"
}
},
{
$match: {
year: {$regex: "[0-9]{4}"}
}
},
{
$group: {
"_id": "$year_month",
"sum": {
$sum: 1
}
}
},
{
$sort: {
"_id": 1
}
}
]
)
再换个正则语句就可以搞定喽。(year: {$regex: "[0-9]{4}"})
总结:
MongoDB有很多聚方式,可用来直接统计数据。
你学废了吗?