Python操作Mongodb的高级操作
数据准备,向Mongodb中的test库下的students集合中插入4条数据:
from pymongo import MongoClient
client = MongoClient()
database = client.test
collection = database.students
students = [
{"name":"张三","gender":"男","age":20,"score":89},
{"name":"李四","gender":"男","age":21,"score":78},
{"name":"小花","gender":"女","age":19,"score":95},
{"name":"小白","gender":"女","age":21,"score":83}
]
collection.insert_many(students)
1、显式的OR查询:
语法:find({"$or":[{条件1},{条件2}...]}) ,例如:
students = collection.find({"$or":[{"age":{"$gt":20}},{"score":{"$gt":90}}]}) # 查询age>20或score>90的记录
2、显式的AND查询:
语法:find({"$and":[{条件1},{条件2}...]}) ,例如:
students = collection.find({"$and":[{"age":{"$gt":20}},{"score":{"$gt":80}}]}) # 查询age>20且score>80的记录
3、在AND中嵌套OR查询:
students = collection.find({"$and":[{"$or":[{"age":{"$gt":20}},{"gender":"女"}]},{"$or":[{"score":{"$gt":90}},{"name":"小白"}]}]}) # 查询(age>20或者gender为女)并且(score>90或name为小白)的记录
4、当记录中嵌套有字典和列表时的相关操作:
数据准备,向Mongodb中的test库下的students集合中插入5条数据:
from pymongo import MongoClient
client = MongoClient()
database = client.test
collection = database.students
students = [
{"name":"张三","address":{"province":"广东省","city":"深圳"},"score":[80,87,79]},
{"name":"李四","address":{"province":"广东省","city":"广州"},"score":[90,80,95]},
{"name":"王五","address":{"province":"广东省","city":"深圳"},"score":[78,86,96]},
{"name":"赵六","address":{"province":"广东省","city":"深圳"},"score":[90,92,86]},
{"name":"刘七","address":{"province":"广东省","city":"东莞"},"score":[69,97,98]}
]
collection.insert_many(students)
① 使用点符号定位嵌套的字典中字段
datas = collection.find({"address.city":"深圳"}) # 查询city字段值为深圳的所有记录
datas = collection.find({"$or":[{"address.city":"深圳"},{"address.city":"广州"}]}) # 查询city字段值为深圳或广州的所有记录
② 记录中嵌套数组的操作
datas = collection.find({"score":80}) # 查询数组score中包含80的所有记录
datas = collection.find({"score.0":80}) # 查询数组score中的第一个元素为80的所有记录
datas = collection.find({"score":{"$ne":80}}) # 查询数组score中不包含80的所有记录
datas = collection.find({"score":{"$size":3}}) # 查询数组score长度为3的所有记录
5、聚合操作
语法格式:aggregate([{聚合条件1},{聚合条件2}...])
聚合条件可以有多个,前一个聚合条件的输出,作为下一个聚合条件的输入,当聚合条件为0个时,相当于find()
datas = collection.aggregate([]) # 查询所有数据相当于find()
聚合关键字$match:筛选符合条件的记录,用法跟find()的第一个参数类似
datas = collection.aggregate([{"$match":{"address.city":"深圳"}}]) # 筛选city字段值为深圳的记录
聚合关键字$project:
①可以指定返回部分字段值或指定不返回某些字段值,用法跟find()的第二个参数类似,返回则value为1,不返回则value为0:
datas = collection.aggregate([{"$project":{"_id":0,"name":1,"score":1}}]) # 只返回记录中的name和score字段值,且不返回_id字段值
②将已有的字段的值赋值给新的字段(新的字段可以不存在)
datas = collection.aggregate([{"$project":{"城市":"$address.city"}}]) # 将address下的city的值赋值给新字段“城市”
③特殊值处理,当值为0或1这样的情况,要使用$literal关键字,否则会报错
datas = collection.aggregate([{"$project":{"score":{"$literal":1}}}]) # 将所有记录的score赋值为1
聚合关键字$group:
插入新数据:
students = [
{"name":"张三","score":90,"date":"2019-03-16"},
{"name":"李四","score":87,"date":"2019-03-16"},
{"name":"王五","score":89,"date":"2019-03-16"},
{"name":"张三","score":88,"date":"2019-03-17"},
{"name":"李四","score":98,"date":"2019-03-17"},
{"name":"王五","score":86,"date":"2019-03-17"}
]
collection.insert_many(students)
①使用$group进行分组操作(会去重)
datas = collection.aggregate([{"$group":{"_id":"$name"}}]) # 按name字段进行分组
返回结果:
②分组并计算最大值、最小值、平均值、总和、记录条数等结果
datas = collection.aggregate([{"$group":{
"_id":"$name", # 根据name进行分组
"max_score":{"$max":"$score"}, # 获取最大值
"min_score":{"$min":"$score"}, # 获取最小值
"avg_score":{"$avg":"$score"}, # 获取平均值
"sum_score":{"$sum":"$score"}, # 获取总分数
"count":{"$sum":1} # 统计每个name参与计算的记录数
}}])
返回结果:
③获取最早或最晚插入的记录
datas = collection.aggregate([{"$group":{"_id":"$name","date":{"$last":"$date"},"score":{"$last":"$score"}}}]) # 根据name字段分组,取最后插入的date和score字段值
返回结果:
datas = collection.aggregate([{"$group":{"_id":"$name","date":{"$first":"$date"},"score":{"$first":"$score"}}}]) # 根据name字段分组,取最早插入的date和score字段值
返回结果:
$last 和 $first 可以混合使用:
datas = collection.aggregate([{"$group":{"_id":"$name","date":{"$first":"$date"},"score":{"$last":"$score"}}}]) # # 根据name字段分组,取最早插入的date和最晚插入的score字段值
返回结果:
聚合关键字$unwind: