绝好的MongoDB学习资料之三. Schema Design
1. Document-Oriented
MongoDB 是一种面向文档(document-oriented)的数据库,其内存储的是一种 JSON-like 结构化数据。尽管拥有和关系型数据库 Database/Table 类似的的 DB/Collection 概念,但同一 Collection 内的 Document 可以拥有不同的属性。
(注: 以下 > 提示符表示 mongo JS 代码,>>> 为 Python 代码)
-
> use blog
-
switched to db blog
-
> db.users.insert({name:"user1", age:15})
-
> db.users.insert({name:"user2", age:20, sex:1})
-
> db.users.find()
-
{ "_id" : ObjectId("4c479885089df9b53474170a"), "name" : "user1", "age" : 15 }
-
{ "_id" : ObjectId("4c479896089df9b53474170b"), "name" : "user2", "age" : 20, "sex" : 1 }
可以通过 $exists 判断某个字段是否存在。
-
> db.users.find({sex:{$exists:true}})
-
{ "_id" : ObjectId("4c479896089df9b53474170b"), "name" : "user2", "age" : 20, "sex" : 1 }
2. Embed vs. Reference
Document 采取 JSON-like 这种层级结构,因此我们可以直接用嵌入(Embed)代替传统关系型数据库的关联引用(Reference)。
-
> u = db.users.findOne({name:"user1"})
-
{
-
"_id" : ObjectId("4c479885089df9b53474170a"),
-
"name" : "user1",
-
"age" : 15
-
}
-
> u.address = ["address1", "address2"]
-
[ "address1", "address2" ]
-
> db.users.save(u)
-
> db.users.findOne({name:"user1"})
-
{
-
"_id" : ObjectId("4c479885089df9b53474170a"),
-
"name" : "user1",
-
"age" : 15,
-
"address" : [
-
"address1",
-
"address2"
-
]
-
}
-
> db.users.findOne({address:"address1"})
-
{
-
"_id" : ObjectId("4c479885089df9b53474170a"),
-
"name" : "user1",
-
"age" : 15,
-
"address" : [
-
"address1",
-
"address2"
-
]
-
}
MongoDB 支持以 "." 分割的 namespace 路径,但需要注意 key 不能以 "$" 开头,不能包含 "." 字符 (条件表达式中的多级路径须用引号)。
-
> u = db.users.findOne({address:"address1"})
-
{
-
"_id" : ObjectId("4c479885089df9b53474170a"),
-
"name" : "user1",
-
"age" : 15,
-
"address" : [
-
"address1",
-
"address2"
-
]
-
}
-
> u.im = {msn:"user1@hotmail.com", qq:12345678}
-
{ "msn" : "user1@hotmail.com", "qq" : 12345678 }
-
> db.users.save(u)
-
> u = db.users.findOne({"im.qq":12345678})
-
{
-
"_id" : ObjectId("4c479885089df9b53474170a"),
-
"name" : "user1",
-
"age" : 15,
-
"address" : [
-
"address1",
-
"address2"
-
],
-
"im" : {
-
"msn" : "user1@hotmail.com",
-
"qq" : 12345678
-
}
-
}
-
> u.im.qq
-
12345678
-
> u.im.msn
-
user1@hotmail.com
-
> db.users.update({"im.qq":12345678}, {$set:{"im.qq":12345}})
-
> u = db.users.findOne({"im.qq":12345})
-
{
-
"_id" : ObjectId("4c479885089df9b53474170a"),
-
"name" : "user1",
-
"age" : 15,
-
"address" : [
-
"address1",
-
"address2"
-
],
-
"im" : {
-
"msn" : "user1@hotmail.com",
-
"qq" : 12345
-
}
-
}
-
> u = db.users.find({"im.qq":{$exists:true}}, {"im.qq":1})
-
{ "_id" : ObjectId("4c479885089df9b53474170a"), "im" : { "qq" : 12345 } }
注意: 由于每篇文档都包含完整的 key 数据,因此使用尽可能短的 key 可以有效节省存储空间。
3. Operator
(1) $all: 判断数组属性是否包含全部条件。
-
> db.users.insert({name:"user3", data:[1,2,3,4,5,6,7]})
-
> db.users.insert({name:"user4", data:[1,2,3]})
-
> db.users.find({data:{$all:[2,3,4]}})
-
{ "_id" : ObjectId("4c47a133b48cde79c6780df0"), "name" : "user3", "data" : [ 1, 2, 3, 4, 5, 6, 7 ] }
注意和 $in 的区别。$in 是检查目标属性值是条件表达式中的一员,而 $all 则要求属性值包含全部条件元素。
(2) $size: 匹配数组属性元素数量。
-
> db.users.insert({name:"user3", data:[1,2,3,4,5,6,7]})
-
> db.users.insert({name:"user4", data:[1,2,3]})
-
> db.users.find({data:{$size:3}})
-
{ "_id" : ObjectId("4c47a13bb48cde79c6780df1"), "name" : "user4", "data" : [ 1, 2, 3 ] }
(3) $type: 判断属性类型。
-
> db.users.insert({name:"user5", t:1})
-
> db.users.insert({name:"user6", t:1.34})
-
> db.users.insert({name:"user7", t:"abc"})
-
> db.users.find({t:{$type:1}})
-
{ "_id" : ObjectId("4c47a231b48cde79c6780df2"), "name" : "user5", "t" : 1 }
-
{ "_id" : ObjectId("4c47a23eb48cde79c6780df3"), "name" : "user6", "t" : 1.34 }
-
> db.users.find({t:{$type:2}})
-
{ "_id" : ObjectId("4c47a258b48cde79c6780df4"), "name" : "user7", "t" : "abc" }
类型值:
- double:1
- string: 2
- object: 3
- array: 4
- binary data: 5
- object id: 7
- boolean: 8
- date: 9
- null: 10
- regular expression: 11
- javascript code: 13
- symbol: 14
- javascript code with scope: 15
- 32-bit integer: 16
- timestamp: 17
- 64-bit integer: 18
- min key: 255
- max key: 127
(4) 使用正则表达式进行查询。
-
> db.users.find({name:/user[135]/i}, {name:1})
-
{ "_id" : ObjectId("4c479885089df9b53474170a"), "name" : "user1" }
-
{ "_id" : ObjectId("4c47a133b48cde79c6780df0"), "name" : "user3" }
-
{ "_id" : ObjectId("4c47a231b48cde79c6780df2"), "name" : "user5" }
-
>>> users = db.users.find({"name" : {"$regex" : r"(?i)user[135]"}}, ["name"])
-
>>> for u in users: print u
-
...
-
{u'_id': ObjectId('4c479885089df9b53474170a'), u'name': u'user1'}
-
{u'_id': ObjectId('4c47a133b48cde79c6780df0'), u'name': u'user3'}
-
{u'_id': ObjectId('4c47a231b48cde79c6780df2'), u'name': u'user5'}
正则表达式标记:
i: 忽略大小写。
m: 默认为单行处理,此标记表示多行。
x: 扩展。
(5) 数组属性元素值匹配。
-
> db.users.find({data:"abc"})
-
{ "_id" : ObjectId("4c47a481b48cde79c6780df5"), "name" : "user8", "data" : [ { "a" : 1, "b" : 10 }, 3, "abc" ] }
-
> db.users.find({data:{$elemMatch:{a:1, b:{$gt:5}}}})
-
{ "_id" : ObjectId("4c47a481b48cde79c6780df5"), "name" : "user8", "data" : [ { "a" : 1, "b" : 10 }, 3, "abc" ] }
{data:"abc"} 仅简单匹配数组属性是否包含该元素。$elemMatch 则可以处理更复杂的元素查找条件。当然也可以写成如下方式。
-
> db.users.find({"data.a":1, "data.b":{$gt:5}})
-
{ "_id" : ObjectId("4c47a481b48cde79c6780df5"), "name" : "user8", "data" : [ { "a" : 1, "b" : 10 }, 3, "abc" ] }
还可以直接使用序号进行操作。
-
> db.users.find({"data.2":3})
-
{ "_id" : ObjectId("4c47a133b48cde79c6780df0"), "name" : "user3", "data" : [ 1, 2, 3, 4, 5, 6, 7 ] }
-
{ "_id" : ObjectId("4c47a13bb48cde79c6780df1"), "name" : "user4", "data" : [ 1, 2, 3 ] }
(6) $not: 取反,表示返回条件不成立的文档。
似乎只能跟正则和 $mod 一起使用????
-
> u = db.users.find({name:{$not:/user3/}}, {name:1})
-
{ "_id" : ObjectId("4c479885089df9b53474170a"), "name" : "user1" }
-
{ "_id" : ObjectId("4c47a13bb48cde79c6780df1"), "name" : "user4" }
-
{ "_id" : ObjectId("4c47a231b48cde79c6780df2"), "name" : "user5" }
-
{ "_id" : ObjectId("4c47a23eb48cde79c6780df3"), "name" : "user6" }
-
{ "_id" : ObjectId("4c47a258b48cde79c6780df4"), "name" : "user7" }
-
{ "_id" : ObjectId("4c47a481b48cde79c6780df5"), "name" : "user8" }
-
{ "_id" : ObjectId("4c479896089df9b53474170b"), "name" : "user2" }
(7) $unset: 和 $set 相反,表示移除文档属性。
-
> u = db.users.find({name:"user1"})
-
{ "_id" : ObjectId("4c479885089df9b53474170a"), "name" : "user1", "age" : 15, "address" : [ "address1", "address2" ], "im" : { "msn" : "user1@hotmail.com", "qq" : 12345 } }
-
> db.users.update({name:"user1"}, {$unset:{address:1, im:1}})
-
> u = db.users.find({name:"user1"})
-
{ "_id" : ObjectId("4c479885089df9b53474170a"), "age" : 15, "name" : "user1" }
(8) $push: 和 $ pushAll 都是向数组属性添加元素。
-
> u = db.users.find({name:"user1"})
-
{ "_id" : ObjectId("4c479885089df9b53474170a"), "age" : 15, "name" : "user1" }
-
> db.users.update({name:"user1"}, {$push:{data:1}})
-
> u = db.users.find({name:"user1"})
-
{ "_id" : ObjectId("4c479885089df9b53474170a"), "age" : 15, "data" : [ 1 ], "name" : "user1" }
-
> db.users.update({name:"user1"}, {$pushAll:{data:[2,3,4,5]}})
-
> u = db.users.find({name:"user1"})
-
{ "_id" : ObjectId("4c479885089df9b53474170a"), "age" : 15, "data" : [ 1, 2, 3, 4, 5 ], "name" : "user1" }
(9) $addToSet: 和 $push 类似,不过仅在该元素不存在时才添加 (Set 表示不重复元素集合)。
-
> db.users.update({name:"user2"}, {$addToSet:{data:1}})
-
> db.users.update({name:"user2"}, {$addToSet:{data:1}})
-
> u = db.users.find({name:"user2"})
-
{ "_id" : ObjectId("4c479896089df9b53474170b"), "data" : [ 1 ], "name" : "user2" }
-
> db.users.update({name:"user2"}, {$push:{data:1}})
-
> u = db.users.find({name:"user2"})
-
{ "_id" : ObjectId("4c479896089df9b53474170b"), "data" : [ 1, 1 ], "name" : "user2" }
要添加多个元素,使用 $each。
-
> db.users.update({name:"user2"}, {$addToSet:{data:{$each:[1,2,3,4]}}})
-
> u = db.users.find({name:"user2"})
-
{ "_id" : ObjectId("4c479896089df9b53474170b"), "data" : [ 1, 2, 3, 4 ], "name" : "user2" }
(10) $pop: 移除数组属性的元素,$pull 按值移除,$pullAll 移除所有符合提交的元素。
-
> u = db.users.find({name:"user3"})
-
{ "_id" : ObjectId("4c47a133b48cde79c6780df0"), "data" : [ 1, 2, 3, 4, 5, 6, 7, 2, 3 ], "name" : "user3" }
-
> db.users.update({name:"user3"}, {$pop:{data:1}}) // 移除最后一个元素
-
> u = db.users.find({name:"user3"})
-
{ "_id" : ObjectId("4c47a133b48cde79c6780df0"), "data" : [ 1, 2, 3, 4, 5, 6, 7, 2 ], "name" : "user3" }
-
> db.users.update({name:"user3"}, {$pop:{data:-1}}) // 移除第一个元素
-
> u = db.users.find({name:"user3"})
-
{ "_id" : ObjectId("4c47a133b48cde79c6780df0"), "data" : [ 2, 3, 4, 5, 6, 7, 2 ], "name" : "user3" }
-
> db.users.update({name:"user3"}, {$pull:{data:2}}) // 移除全部 2
-
> u = db.users.find({name:"user3"})
-
{ "_id" : ObjectId("4c47a133b48cde79c6780df0"), "data" : [ 3, 4, 5, 6, 7 ], "name" : "user3" }
-
> db.users.update({name:"user3"}, {$pullAll:{data:[4,5,6]}}) // 移除 4,5,6
-
> u = db.users.find({name:"user3"})
-
{ "_id" : ObjectId("4c47a133b48cde79c6780df0"), "data" : [ 3, 7 ], "name" : "user3" }
(11) $where: 用 JS 代码来代替有些丑陋的 $lt、$gt。
MongoDB 内置了 Javascript Engine (SpiderMonkey)。可直接使用 JS Expression,甚至使用 JS Function 写更复杂的 Code Block。
-
> for (var i = 0; i < 10; i++) db.users.insert({name:"user"+i, age:i})
-
> db.users.find()
-
{ "_id" : ObjectId("4c47b3372a9b2be866da226e"), "name" : "user0", "age" : 0 }
-
{ "_id" : ObjectId("4c47b3372a9b2be866da226f"), "name" : "user1", "age" : 1 }
-
{ "_id" : ObjectId("4c47b3372a9b2be866da2270"), "name" : "user2", "age" : 2 }
-
{ "_id" : ObjectId("4c47b3372a9b2be866da2271"), "name" : "user3", "age" : 3 }
-
{ "_id" : ObjectId("4c47b3372a9b2be866da2272"), "name" : "user4", "age" : 4 }
-
{ "_id" : ObjectId("4c47b3372a9b2be866da2273"), "name" : "user5", "age" : 5 }
-
{ "_id" : ObjectId("4c47b3372a9b2be866da2274"), "name" : "user6", "age" : 6 }
-
{ "_id" : ObjectId("4c47b3372a9b2be866da2275"), "name" : "user7", "age" : 7 }
-
{ "_id" : ObjectId("4c47b3372a9b2be866da2276"), "name" : "user8", "age" : 8 }
-
{ "_id" : ObjectId("4c47b3372a9b2be866da2277"), "name" : "user9", "age" : 9 }
-
> db.users.find({$where:"this.age > 7 || this.age < 3"})
-
{ "_id" : ObjectId("4c47b3372a9b2be866da226e"), "name" : "user0", "age" : 0 }
-
{ "_id" : ObjectId("4c47b3372a9b2be866da226f"), "name" : "user1", "age" : 1 }
-
{ "_id" : ObjectId("4c47b3372a9b2be866da2270"), "name" : "user2", "age" : 2 }
-
{ "_id" : ObjectId("4c47b3372a9b2be866da2276"), "name" : "user8", "age" : 8 }
-
{ "_id" : ObjectId("4c47b3372a9b2be866da2277"), "name" : "user9", "age" : 9 }
-
> db.users.find("this.age > 7 || this.age < 3")
-
{ "_id" : ObjectId("4c47b3372a9b2be866da226e"), "name" : "user0", "age" : 0 }
-
{ "_id" : ObjectId("4c47b3372a9b2be866da226f"), "name" : "user1", "age" : 1 }
-
{ "_id" : ObjectId("4c47b3372a9b2be866da2270"), "name" : "user2", "age" : 2 }
-
{ "_id" : ObjectId("4c47b3372a9b2be866da2276"), "name" : "user8", "age" : 8 }
-
{ "_id" : ObjectId("4c47b3372a9b2be866da2277"), "name" : "user9", "age" : 9 }
-
> db.users.find({$where: function(){ return this.age > 7 || this.age < 3;}})
-
{ "_id" : ObjectId("4c47b3372a9b2be866da226e"), "name" : "user0", "age" : 0 }
-
{ "_id" : ObjectId("4c47b3372a9b2be866da226f"), "name" : "user1", "age" : 1 }
-
{ "_id" : ObjectId("4c47b3372a9b2be866da2270"), "name" : "user2", "age" : 2 }
-
{ "_id" : ObjectId("4c47b3372a9b2be866da2276"), "name" : "user8", "age" : 8 }
-
{ "_id" : ObjectId("4c47b3372a9b2be866da2277"), "name" : "user9", "age" : 9 }
-
>>> for u in db.users.find({"$where":"this.age > 7 || this.age < 3"}): print u
-
...
-
{u'age': 0.0, u'_id': ObjectId('4c47b3372a9b2be866da226e'), u'name': u'user0'}
-
{u'age': 1.0, u'_id': ObjectId('4c47b3372a9b2be866da226f'), u'name': u'user1'}
-
{u'age': 2.0, u'_id': ObjectId('4c47b3372a9b2be866da2270'), u'name': u'user2'}
-
{u'age': 8.0, u'_id': ObjectId('4c47b3372a9b2be866da2276'), u'name': u'user8'}
-
{u'age': 9.0, u'_id': ObjectId('4c47b3372a9b2be866da2277'), u'name': u'user9'}
-
>>> for u in db.users.find().where("this.age > 7 || this.age < 3"): print u
-
...
-
{u'age': 0.0, u'_id': ObjectId('4c47b3372a9b2be866da226e'), u'name': u'user0'}
-
{u'age': 1.0, u'_id': ObjectId('4c47b3372a9b2be866da226f'), u'name': u'user1'}
-
{u'age': 2.0, u'_id': ObjectId('4c47b3372a9b2be866da2270'), u'name': u'user2'}
-
{u'age': 8.0, u'_id': ObjectId('4c47b3372a9b2be866da2276'), u'name': u'user8'}
-
{u'age': 9.0, u'_id': ObjectId('4c47b3372a9b2be866da2277'), u'name': u'user9'}
-
>>> for u in db.users.find().where("function() { return this.age > 7 || this.age < 3;}"): print u
-
...
-
{u'age': 0.0, u'_id': ObjectId('4c47b3372a9b2be866da226e'), u'name': u'user0'}
-
{u'age': 1.0, u'_id': ObjectId('4c47b3372a9b2be866da226f'), u'name': u'user1'}
-
{u'age': 2.0, u'_id': ObjectId('4c47b3372a9b2be866da2270'), u'name': u'user2'}
-
{u'age': 8.0, u'_id': ObjectId('4c47b3372a9b2be866da2276'), u'name': u'user8'}
-
{u'age': 9.0, u'_id': ObjectId('4c47b3372a9b2be866da2277'), u'name': u'user9'}