一、mongodb实现聚合、distinct
1、group by
db.collection_201605.aggregate([
{
$match:{
"createTime":{$gte:ISODate("2016-05-09T16:00:00.000Z"), $lt:ISODate("2016-05-10T16:00:00.000Z")}, "source" : "9001"
}
},
{
$group:{
_id:{isMonthPayLamp:"$isMonthPayLamp"},totalAmount:{$sum:"$times"},count:{$sum:1}
}
},
{
$sort: {_id:1}
}
])
match:过滤条件。相当于where
group:需要分组的字段,这里是以isMonthPayLamp分组,times累加存储到totalAmount,分组数量count
2、distinct
db.runCommand({"distinct":"collection_201605", "key":"fansUid", "query":{"createTime":{$gte:ISODate("2016-05-07T16:00:00.000Z"), $lt:ISODate("2016-05-08T16:00:00.000Z")}, "isHighChanceUser": "1"}})
distinct:表示需要对哪个collection进行distinct
key:fansUid,对fansUid进行去重
query:内嵌查询条件
二、spring集成mongodb如何实现group by、distinct
版本:mongo-java-driver-3.0.2.jar、spring4.0、spring-data-mongodb-1.8.0.RELEASE.jar
1、group by
写法一:
public List<Document> testMethod(Map<String, Object> conditionMap, String collectionName){
dayAddOne(conditionMap);
Aggregation agg = Aggregation.newAggregation(
Aggregation.match(
conditionMap.containsKey("source")?Criteria.where("createTime").gte(conditionMap.get("beginDate")).lt(conditionMap.get("endDate")).and("source").is(conditionMap.get("source")):
Criteria.where("createTime").gte(conditionMap.get("beginDate")).lt(conditionMap.get("endDate"))
),//筛选符合条件的记录
Aggregation.group(Aggregation.fields().and("source").and("isMonthPayLamp"))//设置分组字段
.count().as("count").sum("times").as("totalAmount")
);
AggregationResults<Document> results = mongoTemplate.aggregate(agg, collectionName, Document.class);//agg日期会少一天,所以在前面传递过来的日期需要增加1天
List<Document> list = results.getMappedResults();
return list;
}
说明:dayAddOne:由于日期是大于等于beginDate 小于endDate;方法作用是endDate加一天
Aggregation.match里面为具体的查询条件
Aggregation.group为分组具体实现,此例以source及isMonthPayLamp做组合聚合,分组数量存在count字段,times累计的和存到totalAmount
mongoTemplate.aggregate执行查询语句
写法二:
public List<Document> testMethod2(Date from, Date to, String source, String collectionName) {
BasicDBObject createTime = new BasicDBObject("$gt", from).append("$lt", to);
Map<String, Object> map = new HashMap<>();
map.put("createTime", createTime);
if (!BlankUtil.isBlank(source)) {
map.put("source", source);
}
BasicDBObject matchQuery = new BasicDBObject(map);
BasicDBObject group = new BasicDBObject("_id", new BasicDBObject("isMonthPayLamp", "$isMonthPayLamp").append("source", "$source")).append("totalAmount", new BasicDBObject("$sum", "$times")).append("count", new BasicDBObject("$sum", 1));
BasicDBObject sort = new BasicDBObject("_id",1);
List<Document> aggregate = mongoClient.aggregate(collectionName, matchQuery, group, sort);
return aggregate;
}
此语句实现与写法1实现一样的功能
2、日期聚合:
正常的日期存储到mongodb时候,如果不做时区方面的处理。会比正常的北京时间少8个小时(
即2016-05-23 18:00存储到mongodb,会变成2016-05-23T10:00,在sql语句中,查询某天的数据:
where create_time >= "2016-05-08 00:00:00" and create_time < "2016-05-09 00:00:00";
但是,在mongodb中:
db.collection.find({"query":{"createDate":{$gte:ISODate("2016-05-07T16:00:00.000Z"),$lt:ISODate("2016-05-08T16:00:00.000Z")}}})
写法1与2的区别正是在时区部分
写法1:(该写法有时区问题)
public List<Document> monthLampList(Map<String, Object> conditionMap, String collectionName){
dayAddOne(conditionMap);
Aggregation agg = Aggregation.newAggregation(
Aggregation.project()
.andExpression("year(createDate)").as("year")
.andExpression("month(createDate)").as("month")
.andExpression("dayOfMonth(createDate)").as("day")
.andExpression("source").as("source")
.andExpression("createDate").as("createDate"),
Aggregation.match(
conditionMap.containsKey("source")?Criteria.where("createDate").gte(conditionMap.get("beginDate")).lt(conditionMap.get("endDate")).and("source").is(conditionMap.get("source")):
Criteria.where("createDate").gte(conditionMap.get("beginDate")).lt(conditionMap.get("endDate"))
),//筛选符合条件的记录
Aggregation.group(Aggregation.fields().and("source").and("year").and("month").and("day"))//设置分组字段
.count().as("count")
);
AggregationResults<Document> results = mongoTemplate.aggregate(agg, collectionName, Document.class);//agg日期会少一天,所以在前面传递过来的日期需要增加1天
List<Document> list = results.getMappedResults();
return list;
}
说明:
Aggregation.project()
.andExpression("year(createDate)").as("year")
.andExpression("month(createDate)").as("month")
.andExpression("dayOfMonth(createDate)").as("day")
.andExpression("source").as("source")
.andExpression("createDate").as("createDate"),
上述语句表示:解析createDate的年月日,并分别存储在year month day三个字段中;source、createDate表示字段保留,下面查询语句需要用到,此处如果不加,下面语句会找不到字段
Aggregation.group(Aggregation.fields().and("source").and("year").and("month").and("day"))//设置分组字段
.count().as("count")
表示以source及年月日做分组处理,数量保存到count字段。
写法二:实现与写法1一致的功能,区别在于加上了8小时时区差异处理
public List<Document> monthLampList(Map<String, Object> conditionMap, String collectionName){
dayAddOne(conditionMap);//日期增加1天
BasicDBObject createDate = new BasicDBObject("$gte", conditionMap.get("beginDate")).append("$lt", conditionMap.get("endDate"));
Map<String, Object> map = new HashMap<>();
map.put("createDate", createDate);
if (!BlankUtil.isBlank(conditionMap.get("source"))) {
map.put("source", conditionMap.get("source"));
}
BasicDBList dbList = new BasicDBList();
dbList.add("$createDate");
dbList.add(28800000);//解决timezone 8小时时差
Map<String, Object> dbObjIdMap = new HashMap<String, Object>();
dbObjIdMap.put("year", new BasicDBObject("$year", new BasicDBObject("$add", dbList)));
dbObjIdMap.put("month", new BasicDBObject("$month", new BasicDBObject("$add", dbList)));
dbObjIdMap.put("day",
new BasicDBObject("$dayOfMonth", new BasicDBObject("$add", dbList)));
BasicDBObject matchQuery = new BasicDBObject(map);
BasicDBObject group = new BasicDBObject("_id", new BasicDBObject(dbObjIdMap).
append("source", "$source")).append("count", new BasicDBObject("$sum", 1));
BasicDBObject sort = new BasicDBObject("_id",1);
List<Document> aggregate = mongoClient.aggregate(collectionName, matchQuery, group, sort);
List<Document> docList = new ArrayList<Document>();
if (null != aggregate && 0 < aggregate.size()) {
for (Document doc : aggregate) {
Document tempdoc = (Document)doc.get("_id");
tempdoc.put("count", doc.get("count").toString());
docList.add(tempdoc);
}
}
return docList;
}
注意:如果group中只有一个分组字段,那么该分组字段返回结果中是存放在key为_id的值中,如果是多个分组字段,则按实际字段名存放
3、distinct
spring mongodb没有具体的distinct关键字,可以用addToSet替代实现
例如,可以对group修改如下:
Aggregation.group(Aggregation.fields().and("source").and("isMonthPayLamp").and("year").and("month").and("day"))//设置分组字段
.count().as("count").sum("times").as("totalAmount").addToSet("fansUid").as("distinctFansUid")//增加publishDate为分组后输出的字段(addToSet相当于distinct,把distinct结果集存放到fansUid
说明:
addToSet("fansUid").as("distinctFansUid")该语句,把分组中fansUid集合起来,去重,存放到distinctFansUid返回。结果格式为 [ fansUid, fansUid ]
可以通过解析distinctFansUid获得distinct fansUid列表