mongodb02_索引、mongodb和python交互、代码练习

1.索引备份和python交互

1.1 为什么mongdb需要创建索引

  • 加快查询速度
  • 进行数据的去重

1.2 mongodb创建简单的索引方法

  • 语法:
    • db.集合.ensureIndex({属性:1}),1表示升序, -1表示降序
    • db.集合.createIndex({属性:1})
    • 上面两个命令效果等价
  • 具体操作:db.db_name.ensureIndex({name:1})

1.3 创建索引前后查询速度对比

测试:插入10万条数据到数据库中 插入数据:

`for(i=0;i<100000;i++){db.t255.insert({name:'test'+i,age:i})}`

创建索引前:

db.t1.find({name:'test10000'})
db.t1.find({name:'test10000'}).explain('executionStats')

创建索引后:

db.t255.ensureIndex({name:1})    //升序以name创建唯一索引,创建时不能有两个相同的name值
db.t1.find({name:'test10000'}).explain('executionStats')

1.4 索引的查看

默认情况下_id是集合的索引
查看方式:db.collection_name.getIndexes()
添加索引前:

> db.test2000.insert({"name":"hello",age:20})
WriteResult({ "nInserted" : 1 })
> db.test2000.find()
{ "_id" : ObjectId("5ae0232f625b9ddd91a0e7ae"), "name" : "hello", "age" : 20 }
> db.test2000.getIndexes()
[
    {
        "v" : 2,
        "key" : {
            "_id" : 1
        },
        "name" : "_id_",
        "ns" : "test2000.test2000"
    }
]

添加name为索引后:

> db.test2000.ensureIndex({name:1})
{
    "createdCollectionAutomatically" : false,
    "numIndexesBefore" : 1,
    "numIndexesAfter" : 2,
    "ok" : 1
}
> db.test2000.getIndexes()
[
    {
        "v" : 2,
        "key" : {
            "_id" : 1
        },
        "name" : "_id_",
        "ns" : "test2000.test2000"
    },
    {
        "v" : 2,
        "key" : {
            "name" : 1
        },
        "name" : "name_1",
        "ns" : "test2000.test2000"
    }
]

1.5 mongodb创建唯一索引

在默认情况下mongdb的索引字段的值是可以相同的,仅仅能够提高查询速度

添加唯一索引的语法:

db.collection_name.ensureIndex({"name":1},{"unique":true})

使用普通索引的效果如下:

> db.test2000.getIndexes()
[
    {
        "v" : 2,
        "key" : {
            "_id" : 1
        },
        "name" : "_id_",
        "ns" : "test2000.test2000"
    },
    {
        "v" : 2,
        "key" : {
            "name" : 1
        },
        "name" : "name_1",
        "ns" : "test2000.test2000"
    }
]
> db.test2000.insert({name:"hello",age:40})
WriteResult({ "nInserted" : 1 })
> db.test2000.find()
{ "_id" : ObjectId("5ae0232f625b9ddd91a0e7ae"), "name" : "hello", "age" : 20 }
{ "_id" : ObjectId("5ae02421625b9ddd91a0e7af"), "name" : "hello", "age" : 30 }
{ "_id" : ObjectId("5ae02432625b9ddd91a0e7b0"), "name" : "hello", "age" : 40 }

添加age为唯一索引之后:

> db.test2000.createIndex({age:1},{unique:true})
{
    "createdCollectionAutomatically" : false,
    "numIndexesBefore" : 2,
    "numIndexesAfter" : 3,
    "ok" : 1
}
> db.test2000.getIndexes()
[
    {
        "v" : 2,
        "key" : {
            "_id" : 1
        },
        "name" : "_id_",
        "ns" : "test2000.test2000"
    },
    {
        "v" : 2,
        "key" : {
            "name" : 1
        },
        "name" : "name_1",
        "ns" : "test2000.test2000"
    },
    {
        "v" : 2,
        "unique" : true,
        "key" : {
            "age" : 1
        },
        "name" : "age_1",
        "ns" : "test2000.test2000"
    }
]
> db.test2000.insert({"name":"world",age:20})
WriteResult({
    "nInserted" : 0,
    "writeError" : {
        "code" : 11000,
        "errmsg" : "E11000 duplicate key error collection: test2000.test2000 index: age_1 dup key: { : 20.0 }"
    }
})

1.6 删除索引

语法:db.t1.dropIndex({'索引名称':1})

> db.test2000.getIndexes()
[
    {
        "v" : 2,
        "key" : {
            "_id" : 1
        },
        "name" : "_id_",
        "ns" : "test2000.test2000"
    },
    {
        "v" : 2,
        "key" : {
            "name" : 1
        },
        "name" : "name_1",
        "ns" : "test2000.test2000"
    },
    {
        "v" : 2,
        "unique" : true,
        "key" : {
            "age" : 1
        },
        "name" : "age_1",
        "ns" : "test2000.test2000"
    }
]
> db.test2000.dropIndex({age:1})
{ "nIndexesWas" : 3, "ok" : 1 }
> db.test2000.dropIndex({name:1})
{ "nIndexesWas" : 2, "ok" : 1 }
> db.test2000.getIndexes()
[
    {
        "v" : 2,
        "key" : {
            "_id" : 1
        },
        "name" : "_id_",
        "ns" : "test2000.test2000"
    }
]

1.6 建立复合索引

在进行数据去重的时候,可能用一个字段来保证数据的唯一性,这个时候可以考虑建立复合索引来实现。

例如:抓全贴吧信息,如果把帖子的名字作为唯一索引对数据进行去重是不可取的,因为可能有很多帖子名字相同

建立复合索引的语法:db.collection_name.ensureIndex({字段1:1,字段2:1})

> db.test2000.getIndexes()
[
    {
        "v" : 2,
        "key" : {
            "_id" : 1
        },
        "name" : "_id_",
        "ns" : "test2000.test2000"
    }
]
> db.test2000.createIndex({name:1,age:1})
{
    "createdCollectionAutomatically" : false,
    "numIndexesBefore" : 1,
    "numIndexesAfter" : 2,
    "ok" : 1
}
> db.test2000.getIndexes()
[
    {
        "v" : 2,
        "key" : {
            "_id" : 1
        },
        "name" : "_id_",
        "ns" : "test2000.test2000"
    },
    {
        "v" : 2,
        "key" : {
            "name" : 1,
            "age" : 1
        },
        "name" : "name_1_age_1",
        "ns" : "test2000.test2000"
    }
]

1.7 建立索引注意点

  • 根据需要选择是否需要建立唯一索引
  • 索引字段是升序还是降序在单个索引的情况下不影响查询效率,但是带复合索引的条件下会有影响
    例如:在进行查询的时候如果字段1需要升序的方式排序输出,字段2需要降序的方式排序输出,那么此时复合索引的建立需要把字段1设置为1,字段2设置为-1

2. mongodb的备份和恢复

2.1 备份

备份的语法:

`mongodump -h dbhost -d dbname -o dbdirectory`
  • -h: 服务器地址, 也可以指定端⼝号
  • -d: 需要备份的数据库名称
  • -o: 备份的数据存放位置, 此⽬录中存放着备份出来的数据

示例:mongodump -h 192.168.196.128:27017 -d test1 -o ~/Desktop/test1bak

2.2 恢复

恢复语法:mongorestore -h dbhost -d dbname --dir dbdirectory

  • -h: 服务器地址
  • -d: 需要恢复的数据库实例
  • –dir: 备份数据所在位置

示例:mongorestore -h 192.168.196.128:27017 -d test2 --dir ~/Desktop/test1bak/test1

mongodb和python交互

1. mongdb和python交互的模块

pymongo提供了mongdb和python交互的所有方法
安装方式: pip install pymongo

2. 使用pymongo

  1. 导入pymongo并选择要操作的集合 数据库和集合乜有会自动创建

     from pymongo import MongoClient
     client = MongoClient(host,port)   #实例化连接的客户端
     collection = client[db名][集合名]	#创建数据库的collections的连接
    
  2. 添加一条数据

    ret = collection.insert_one({"name":"test10010","age":33})
    print(ret)
    
  3. 添加多条数据

     item_list = [{"name":"test1000{}".format(i)} for i in range(10)]
         #insert_many接收一个列表,列表中为所有需要插入的字典
     t = collection.insert_many(item_list)
    
  4. 查找一条数据

     #find_one查找并且返回一个结果,接收一个字典形式的条件
     t = collection.find_one({"name":"test10005"})
     print(t)
    
  5. 查找全部数据
    结果是一个Cursor游标对象,是一个可迭代对象,可以类似读文件的指针,但是只能够进行一次读取

     #find返回所有满足条件的结果,如果条件为空,则返回数据库的所有
     t = collection.find({"name":"test10005"})
         #结果是一个Cursor游标对象,是一个可迭代对象,可以类似读文件的指针,
     for i in t:
         print(i)
     for i in t: #此时t中没有内容,这是游标的特殊属性
         print(i)
    
  6. 更新一条数据 注意使用$set 命令

     #update_one更新一条数据
     collection.update_one({"name":"test10005"},{"$set":{"name":"new_test10005"}})
    
  7. 更新全部数据, 注意使用$set 命令

     # update_one更新全部数据
     collection.update_many({"name":"test10005"},{"$set":{"name":"new_test10005"}})
    
  8. 删除一条数据

     #delete_one删除一条数据
     collection.delete_one({"name":"test10010"})
    
  9. 删除全部数据

     #delete_may删除所有满足条件的数据
     collection.delete_many({"name":"test10010"})
    

代码练习

    >db.hero.aggregate({$group:{_id:"$hometown",count:{$sum:1}}})
    { "_id" : "华⼭", "count" : 1 }
    { "_id" : "⼤理", "count" : 2 }
    { "_id" : "桃花岛", "count" : 2 }
    { "_id" : "蒙古", "count" : 2 }

    > db.hero.aggregate({$group:{_id:"$hometown",count:{$sum:1},total_age:{$sum:"$age"}}})
    { "_id" : "华⼭", "count" : 1, "total_age" : 18 }
    { "_id" : "⼤理", "count" : 2, "total_age" : 61 }
    { "_id" : "桃花岛", "count" : 2, "total_age" : 58 }
    { "_id" : "蒙古", "count" : 2, "total_age" : 38 }

{$sum:1}{$sum:"$age"}的区别

    > db.hero.aggregate({$group:{_id:"$hometown",count:{$sum:1},total_age:{$sum:"$age"},avg_age:{$avg:"$age"}}})
    { "_id" : "华⼭", "count" : 1, "total_age" : 18, "avg_age" : 18 }
    { "_id" : "⼤理", "count" : 2, "total_age" : 61, "avg_age" : 30.5 }
    { "_id" : "桃花岛", "count" : 2, "total_age" : 58, "avg_age" : 29 }
    { "_id" : "蒙古", "count" : 2, "total_age" : 38, "avg_age" : 19 }

    > db.hero.aggregate({$group:{_id:null,count:{$sum:1}}})
    { "_id" : null, "count" : 7 }

    > db.hero.aggregate({$group:{_id:"$gender",hometown:{$push:"$hometown"}}})
    { "_id" : false, "hometown" : [ "桃花岛", "蒙古" ] }
    { "_id" : true, "hometown" : [ "蒙古", "桃花岛", "⼤理", "⼤理", "华⼭" ] }

   > db.hero.aggregate({$group:{_id:"$gender",hometown:{$push:"$hometown"},name:{$push:"$name"}}})
    { "_id" : false, "hometown" : [ "桃花岛", "蒙古" ], "name" : [ "⻩蓉", "华筝" ] }
    { "_id" : true, "hometown" : [ "蒙古", "桃花岛", "⼤理", "⼤理", "华⼭" ], "name" : [ "郭靖", "⻩药师", "段誉", "段王爷", "洪七公" ] }

    > db.hero.aggregate({$group:{_id:"$gender",name:{$push:"$$ROOT"}}})
    { "_id" : false, "name" : [ { "_id" : ObjectId("5c416f8fea98ce04531787ee"), "name" : "⻩蓉", "hometown" : "桃花岛", "age" : 18, "gender" : false }, { "_id" : ObjectId("5c416f8fea98ce04531787ef"), "name" : "华筝", "hometown" : "蒙古", "age" : 18, "gender" : false } ] }
    { "_id" : true, "name" : [ { "_id" : ObjectId("5c416f8fea98ce04531787ed"), "name" : "郭靖", "hometown" : "蒙古", "age" : 20, "gender" : true }, { "_id" : ObjectId("5c416f8fea98ce04531787f0"), "name" : "⻩药师", "hometown" : "桃花岛", "age" : 40, "gender" : true }, { "_id" : ObjectId("5c416f8fea98ce04531787f1"), "name" : "段誉", "hometown" : "⼤理", "age" : 16, "gender" : true }, { "_id" : ObjectId("5c416f8fea98ce04531787f2"), "name" : "段王爷", "hometown" : "⼤理", "age" : 45, "gender" : true }, { "_id" : ObjectId("5c416f8fea98ce04531787f3"), "name" : "洪七公", "hometown" : "华⼭", "age" : 18, "gender" : true } ] }

    > db.hero.aggregate({$group:{_id:{hometown:"$hometown",gender:"$gender"},count:{$sum:1}}})
    { "_id" : { "hometown" : "华⼭", "gender" : true }, "count" : 1 }
    { "_id" : { "hometown" : "蒙古", "gender" : true }, "count" : 1 }
    { "_id" : { "hometown" : "桃花岛", "gender" : false }, "count" : 1 }
    { "_id" : { "hometown" : "⼤理", "gender" : true }, "count" : 2 }
    { "_id" : { "hometown" : "蒙古", "gender" : false }, "count" : 1 }
    { "_id" : { "hometown" : "桃花岛", "gender" : true }, "count" : 1 }

插入多条数据把字典放入列表中
db.land.insert([{ "country" : "china", "province" : "sh", "userid" : "a" },{ "country" : "china", "province" : "sh", "userid" : "b" },{ "country" : "china", "province" : "sh", "userid" : "a" },{ "country" : "china", "province" : "sh", "userid" : "c" },{ "country" : "china", "province" : "bj", "userid" : "da" },{ "country" : "china", "province" : "bj", "userid" : "fa" }])


   > db.land.aggregate({$group:{_id:{country:"$country",province:"$province",userid:"$userid"}}})
{ "_id" : { "country" : "china", "province" : "sh", "userid" : "a" } }
{ "_id" : { "country" : "china", "province" : "sh", "userid" : "c" } }
{ "_id" : { "country" : "china", "province" : "sh", "userid" : "b" } }
{ "_id" : { "country" : "china", "province" : "bj", "userid" : "fa" } }
{ "_id" : { "country" : "china", "province" : "bj", "userid" : "da" } }

 > db.land.aggregate({$group:{_id:{country:"$country",province:"$province",userid:"$userid"}}},{$group:{_id:{country:"$_id.country",province:"$_id.province"},count:{$sum:1}}})
    { "_id" : { "country" : "china", "province" : "bj" }, "count" : 2 }
    { "_id" : { "country" : "china", "province" : "sh" }, "count" : 3 }

 > db.hero.aggregate({$match:{age:{$gt:18}}},{$group:{_id:"$gender",count:{$sum:1}}})
    { "_id" : true, "count" : 3 }

> db.hero.aggregate({$group:{_id:"$hometown",count:{$sum:1}}},{$project:{_id:0,count:1,hometown:"$_id"}})
    { "count" : 1, "hometown" : "华⼭" }
    { "count" : 2, "hometown" : "⼤理" }
    { "count" : 2, "hometown" : "桃花岛" }
    { "count" : 2, "hometown" : "蒙古" }

 > db.hero.aggregate({$group:{_id:"$hometown",count:{$sum:1}}},{$project:{_id:0,sum:"$count",hometown:"$_id"}})
    ----------打印结果同上,只是把count重命名为sum
  • 0
    点赞
  • 0
    收藏
    觉得还不错? 一键收藏
  • 0
    评论

“相关推荐”对你有帮助么?

  • 非常没帮助
  • 没帮助
  • 一般
  • 有帮助
  • 非常有帮助
提交
评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值