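MongoDB CRUD Operations. Official documentation: MongoDB CRUD Operations (MongoDB Manual)
Note: the syntax differs slightly between MongoDB versions. For example, db.{collection}.insertOne and related helpers were only added in MongoDB 3.2, so older shells do not support them.
1. Basic CRUD operations
1.1 Inserting documents
// mongo shell: switch to the database rongsong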
use rongsong
// Insert a single document into the collection fruit
db.fruit.insert({name: "apple"})
// Insert multiple documents
db.fruit.insert([ {name: "apple"}, {name: "pear"}, {name: "orange"} ])
1.2 Querying documents
(1) Query all documents
db.fruit.find()
// Result
{ "_id" : ObjectId("5e8955c6690ab826e1e07ee8"), "name" : "apple" }
{ "_id" : ObjectId("5e8955c6690ab826e1e07ee9"), "name" : "pear" }
{ "_id" : ObjectId("5e8955c6690ab826e1e07eea"), "name" : "orange" }
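(2) Filtered queries
db.fruit.find({"name": "apple"})
// Result
{ "_id" : ObjectId("5e8955c6690ab826e1e07ee8"), "name" : "apple" }
To return only some fields, pass a projection as the second argument: 1 includes a field and 0 excludes it. The example below returns the first 5 documents whose @type is Team and shows only the name field. Note that {"@type": ["Team"]} matches documents whose @type is exactly the array ["Team"]; if @type is stored as a plain string, or may hold other values as well, use {"@type": "Team"} instead.
db.kg_link.find({"@type": ["Team"]}, {"name": 1, "_id": 0}).limit(5)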
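(3) Regex queries
// Find documents whose assert_info value contains u'. Inside a double-quoted string the single quote needs no escaping, so "u\'" is equivalent to "u'"
db.getCollection('http_action').find({"assert_info": {$regex:"u\'"}})
// Alternatives can be combined with |; "$regex" also accepts more complex regular expressions
db.getCollection('job').find({"module": {"$regex": "召回|主题标签"}})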
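The two patterns above can be tried out locally before they are sent to the server. The sketch below uses Python's re module as a stand-in for MongoDB's regex engine, which is an assumption that should hold for simple patterns like these; the sample strings are made up for illustration and do not come from a real collection.

```python
import re

# Hypothetical sample values for the "module" field.
modules = ["召回", "主题标签", "排序", "召回-v2"]

# "召回|主题标签" is an alternation: a value matches if it contains either substring.
# $regex is unanchored, which corresponds to re.search rather than re.match.
pattern = re.compile("召回|主题标签")
matched = [m for m in modules if pattern.search(m)]
print(matched)

# Inside a double-quoted string, "u\'" and "u'" are the same two characters,
# so the backslash in the shell query is harmless but not required.
same_literal = "u\'" == "u'"
contains_u_quote = re.search("u'", "[u'ok']") is not None
print(same_literal, contains_u_quote)
```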
(4) Not-equal filter
// Match documents whose status is not "removed"
db.getCollection('jobs').find({"status": {"$ne": "removed"}})
(5) Greater than, less than, greater than or equal to, less than or equal to
db.collection.find({ "field" : { $gt: value } } ); // greater than : field > value
db.collection.find({ "field" : { $lt: value } } ); // less than : field < value
db.collection.find({ "field" : { $gte: value } } ); // greater than or equal to : field >= value
db.collection.find({ "field" : { $lte: value } } ); // less than or equal to : field <= value
(6) in and not in ($in, $nin)
db.collection.find( { "field" : { $in : array } } );
Example: find documents whose version is 20201101 and whose assessor is rongsong or wangzhe12
db.getCollection('article_quality_evaluation').find({"version": "20201101", "assessor": {$in: ["rongsong", "wangzhe12"]}})
Example: find documents whose version is 20201101 and whose assessor is neither rongsong nor wangzhe12
db.getCollection('article_quality_evaluation').find({"version": "20201101", "assessor": {$nin: ["rongsong", "wangzhe12"]}})
Reference: https://blog.csdn.net/u010808135/article/details/89955182?utm_medium=distribute.pc_aggpage_search_result.none-task-blog-2~all~sobaiduend~default-2-89955182.nonecase&utm_term=mongodb%20%E5%AD%97%E7%AC%A6%E4%B8%8D%E7%AD%89%E4%BA%8E
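One detail of $nin is easy to miss: it also matches documents that do not have the field at all. The sketch below emulates the two queries above in plain Python over a handful of made-up documents (the _id values and the assessor "lisi" are hypothetical); it only illustrates the matching rules and does not talk to a server.

```python
# Hypothetical documents; document 4 has no "assessor" field at all.
docs = [
    {"_id": 1, "version": "20201101", "assessor": "rongsong"},
    {"_id": 2, "version": "20201101", "assessor": "wangzhe12"},
    {"_id": 3, "version": "20201101", "assessor": "lisi"},
    {"_id": 4, "version": "20201101"},
    {"_id": 5, "version": "20201001", "assessor": "lisi"},
]
names = ["rongsong", "wangzhe12"]

# Emulates {"version": "20201101", "assessor": {"$in": names}}
in_ids = [d["_id"] for d in docs
          if d["version"] == "20201101" and d.get("assessor") in names]

# Emulates {"version": "20201101", "assessor": {"$nin": names}}
# A missing field counts as "not in the list", so document 4 matches too.
nin_ids = [d["_id"] for d in docs
           if d["version"] == "20201101" and d.get("assessor") not in names]

print(in_ids, nin_ids)
```

If documents without the field should be excluded, the $nin condition can be combined with {"$exists": true} on the same field.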
(7) Distinct values: use distinct, for example to list every assessor value (with duplicates removed) among documents whose "version" is "FM_20210106"
Examples:
db.getCollection('recall_evaluation').distinct("assessor",{"version" : "FM_20210106"})
db.getCollection('new_feed_satisfaction_evaluation').distinct("assessor", {"version" : "20210716", "satisfaction": {"$ne": ""}})
(8) Range queries on times stored as strings
// Example
db.getCollection('h5_performance').find({"time": {$gte:"2021-11-11 17:17:35",$lte:"2021-11-11 17:17:38"}})
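This range query works on strings because a fixed-width, zero-padded "YYYY-MM-DD HH:MM:SS" value sorts lexicographically in the same order as it sorts chronologically. The sketch below checks that claim in plain Python with made-up sample values, and also shows how the ordering breaks once the zero padding is dropped.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"
lower, upper = "2021-11-11 17:17:35", "2021-11-11 17:17:38"

# Hypothetical sample values for the "time" field.
samples = [
    "2021-11-11 17:17:34",
    "2021-11-11 17:17:36",
    "2021-11-11 17:17:38",
    "2021-11-11 17:17:39",
]

# Plain string comparison, which is what $gte/$lte do on string fields.
by_string = [s for s in samples if lower <= s <= upper]

# The same range evaluated on parsed datetime objects.
lo_dt, hi_dt = datetime.strptime(lower, FMT), datetime.strptime(upper, FMT)
by_datetime = [s for s in samples if lo_dt <= datetime.strptime(s, FMT) <= hi_dt]

print(by_string == by_datetime)

# Without zero padding the string order no longer follows time order:
# the 9th is earlier than the 11th, yet the comparison below is not True.
unpadded_in_order = "2021-11-9 08:00:00" < "2021-11-11 08:00:00"
print(unpadded_in_order)
```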
(9) Checking whether a field exists
db.getCollection('people_pooling').find({"team":{"$exists": false}})
db.getCollection('people_pooling').find({"team":{"$exists": true}})
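$exists only checks whether the field is present: {"$exists": true} matches every document that has the field, whatever its value (including null or an empty string). To match only the documents whose value is an empty string, compare against "" with $eq, as in this query: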
db.collection.find({ fieldName: { $exists: true, $eq: "" } })
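The difference between a missing field, a null value and an empty string can be illustrated without a server. The sketch below emulates the three queries from this subsection in plain Python over four made-up documents (the _id values and the team name "search" are hypothetical).

```python
# Hypothetical documents covering the four interesting cases.
docs = [
    {"_id": 1, "team": "search"},  # normal value
    {"_id": 2, "team": ""},        # empty string
    {"_id": 3, "team": None},      # explicit null
    {"_id": 4},                    # field missing
]

# Emulates {"team": {"$exists": true}}: the field is present, any value.
exists_true = [d["_id"] for d in docs if "team" in d]

# Emulates {"team": {"$exists": false}}: the field is absent.
exists_false = [d["_id"] for d in docs if "team" not in d]

# Emulates {"team": {"$exists": true, "$eq": ""}}: present and equal to "".
empty_string = [d["_id"] for d in docs if "team" in d and d["team"] == ""]

print(exists_true, exists_false, empty_string)
```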
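1.3 Deleting documents
// Delete the documents whose name is "apple" (remove deletes every match; pass {justOne: true} as the second argument to delete only one)
db.fruit.remove({"name": "apple"})
// Delete all documents in the collection; use with great care
db.fruit.remove({})
1.4 Updating documents
(1) Updating a single document
// Add a field "address" with the value "fujian" to the pear document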
db.fruit.update({"name": "pear"}, {$set:{"address": "fujian"}})
// Change the value of "address" on the pear document to "beijing"
db.fruit.update({"name": "pear"}, {$set:{"address": "beijing"}})
(2) Bulk update by condition
// Set the two fields "manually_tagging" and "manually_tagging_reason" on every document whose _id is in the list
db.getCollection('article_quality_evaluation').updateMany({"_id": {$in: [ObjectId("5faa63723d6865136d019940"), ObjectId("5faa63723d6865136d019941")]}}, {$set:{"manually_tagging": 1, "manually_tagging_reason": "质量高"}})
2. Aggregation
2.1 Counting documents per distinct value of a field
Sample data:
{
_id: ObjectId(7df78ad8902c),
title: 'MongoDB Overview',
description: 'MongoDB is no sql database',
by_user: 'runoob.com',
url: 'http://www.runoob.com',
tags: ['mongodb', 'database', 'NoSQL'],
likes: 100
},
{
_id: ObjectId(7df78ad8902d),
title: 'NoSQL Overview',
description: 'No sql database is very fast',
by_user: 'runoob.com',
url: 'http://www.runoob.com',
tags: ['mongodb', 'database', 'NoSQL'],
likes: 10
},
{
_id: ObjectId(7df78ad8902e),
title: 'Neo4j Overview',
description: 'Neo4j is no sql database',
by_user: 'Neo4j',
url: 'http://www.neo4j.com',
tags: ['neo4j', 'database', 'NoSQL'],
likes: 750
}
...
Query:
> db.mycol.aggregate([{$group : {_id : "$by_user", num_tutorial : {$sum : 1}}}])
{
"result" : [
{
"_id" : "runoob.com",
"num_tutorial" : 2
},
{
"_id" : "Neo4j",
"num_tutorial" : 1
}
],
"ok" : 1
}
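Ps:
- The _id field names the field to group by (documents with the same value of that field form one group); here $by_user means grouping by the by_user field.
- num_tutorial uses $sum: 1, which adds 1 for every document in the group, so the result is the number of documents that share the same by_user value. With another constant, the result is the group size multiplied by that constant.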
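What $group with a constant $sum computes can be reproduced in a few lines of plain Python. The sketch below is an emulation of the pipeline above, not pymongo code; the helper name group_sum is hypothetical, and the documents are trimmed copies of the sample data.

```python
from collections import Counter

# Trimmed copies of the sample documents above.
docs = [
    {"title": "MongoDB Overview", "by_user": "runoob.com", "likes": 100},
    {"title": "NoSQL Overview", "by_user": "runoob.com", "likes": 10},
    {"title": "Neo4j Overview", "by_user": "Neo4j", "likes": 750},
]

def group_sum(documents, key, constant=1):
    """Emulate {$group: {_id: "$<key>", total: {$sum: <constant>}}}."""
    totals = Counter()
    for doc in documents:
        totals[doc[key]] += constant  # add the constant once per document
    return dict(totals)

counts = group_sum(docs, "by_user")                # $sum: 1 -> group sizes
doubled = group_sum(docs, "by_user", constant=2)   # $sum: 2 -> group sizes times 2
print(counts, doubled)
```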
2.2 Deduplicating by a field
Sample data:
{
"_id" : ObjectId("652bba385d5ead41bcee3f6b"),
"scene" : "client",
"plugin" : "bepTool",
"query" : "AI性能测试",
"user_id" : "zhangsan",
"intent" : "AI_PERF",
"query_time" : NumberLong(1697364536371)
},
{
"_id" : ObjectId("652bba445d5ead41bcee3f6c"),
"scene" : "client",
"plugin" : "tor",
"query" : "机器人",
"user_id" : "lisi",
"intent" : "OPENTESTING_ROBOT",
"query_time" : NumberLong(1697364548034)
},
{
"_id" : ObjectId("652bba445d5ead41bcee3f6c"),
"scene" : "client",
"plugin" : "tor",
"query" : "机器人",
"user_id" : "wangwu",
"intent" : "OPENTESTING_ROBOT",
"query_time" : NumberLong(1697364548034)
},
...
Suppose we need the distinct values of the user_id field. Query:
db.getCollection("ai_plugin_data_statistics").aggregate([{
$group: {
_id: '$user_id',
uniqueValues: {
$addToSet: '$user_id'
}
}
}])
Ps:
- Note that the query uses $user_id, not user_id; the leading $ is easy to forget.
The equivalent Python code is below; replace $fieldName with $user_id.
from pymongo import MongoClient
# Connect to the MongoDB server
client = MongoClient('mongodb://localhost:27017/')
db = client['your_database']  # replace with your database name
collection = db['your_collection']  # replace with your collection name
# Build the aggregation pipeline
pipeline = [
    {"$group": {"_id": "$fieldName", "uniqueValues": {"$addToSet": "$fieldName"}}}
]
# Run the aggregation
result = collection.aggregate(pipeline)
# Print the results
for doc in result:
    print(doc['_id'])  # the deduplicated field value
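When only the list of distinct values is needed, pymongo's Collection.distinct is a shorter alternative to the $group pipeline and mirrors the shell's distinct helper from section 1.2 (7). The wrapper below is a minimal sketch; the function name distinct_values is hypothetical, and it assumes the values are mutually comparable so that they can be sorted.

```python
def distinct_values(collection, field, query=None):
    """Return the sorted distinct values of `field` among documents matching `query`.

    `collection` is expected to be a pymongo Collection, or any object that
    exposes the same distinct(key, filter) method.
    """
    return sorted(collection.distinct(field, query or {}))

# Usage, assuming `db` is a connected pymongo database handle:
# distinct_values(db["ai_plugin_data_statistics"], "user_id")
# distinct_values(db["recall_evaluation"], "assessor", {"version": "FM_20210106"})
```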
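3. Permission operations
3.1 Creating a read-only user
// Switch to the database mytest (it is created on first write)
use mytest
// Create a collection so that show dbs lists the database
db.createCollection('book')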
db.createUser({ user: 'myread', pwd: 'myread_pwd', roles: [{ role: 'read', db: 'mytest' }] })
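Once the read-only user exists, a client can connect with it through a connection URI. The sketch below builds such a URI with the standard library only; the helper name build_mongo_uri is hypothetical, and localhost:27017 is assumed to be the server address, as in the Python example earlier. Because the user was created in mytest, authSource points at that database.

```python
from urllib.parse import quote_plus

def build_mongo_uri(user, password, host="localhost", port=27017, auth_db="mytest"):
    """Build a MongoDB connection URI, percent-encoding the credentials."""
    return "mongodb://%s:%s@%s:%d/?authSource=%s" % (
        quote_plus(user),      # escapes characters such as @ : / in the user name
        quote_plus(password),  # and in the password
        host,
        port,
        quote_plus(auth_db),
    )

uri = build_mongo_uri("myread", "myread_pwd")
print(uri)

# The URI can then be passed to pymongo, for example:
# from pymongo import MongoClient
# client = MongoClient(uri)
```

With this account, reads such as find succeed on mytest, while write operations such as insert should be rejected by the server as unauthorized.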