1. Exporting data: mongoexport
Command:
mongoexport -d database -c collection -o output_file
Options:
-d  the database to use
-c  the collection to export
-o  the output file name
Exporting in CSV format:
mongoexport -d database -c collection --csv -f uid,username,age -o output_file
Options:
--csv  export in CSV format
-f     the fields (columns) to export
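The export commands above can be wrapped in a small script. A sketch, assuming a database `mydb`, a collection `users`, and the fields `uid`, `username`, `age` (all placeholder names), guarded so it skips gracefully on machines without the MongoDB tools:

```shell
#!/bin/sh
# Hypothetical database/collection names, for illustration only.
DB=mydb
COLL=users
OUT=./export

mkdir -p "$OUT"

if command -v mongoexport >/dev/null 2>&1; then
    # Whole collection as JSON, one document per line.
    mongoexport -d "$DB" -c "$COLL" -o "$OUT/$COLL.json"
    # Selected fields as CSV; the -f field list is required with --csv.
    mongoexport -d "$DB" -c "$COLL" --csv -f uid,username,age -o "$OUT/$COLL.csv"
else
    echo "mongoexport not found, skipping"
fi
```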
2. Importing data: mongoimport
Importing JSON data:
mongoimport -d database -c collection input_file
☞ Importing data implicitly creates the collection
Importing CSV data:
mongoimport -d database -c collection --type csv --headerline --file input_file
Options:
--type        the format of the file to import
--headerline  use the first line as the field names instead of importing it as data
--file        the path of the file to import
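A CSV import can be sketched end to end as follows; the CSV content and the database/collection names are made up for illustration, and the import itself only runs where mongoimport is installed:

```shell
#!/bin/sh
# Build a small CSV whose first line is the header row that
# --headerline tells mongoimport to use as field names, not data.
cat > users.csv <<'EOF'
uid,username,age
1,alice,30
2,bob,25
EOF

if command -v mongoimport >/dev/null 2>&1; then
    # Hypothetical database/collection names, for illustration only.
    mongoimport -d mydb -c users --type csv --headerline --file users.csv
else
    echo "mongoimport not found, skipping"
fi
```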
3. Backing up data: mongodump
Command:
mongodump -d database [-o output_directory]
4. Restoring data: mongorestore
Command:
mongorestore -d database [--drop] input_directory/*
Options:
--drop  drop all existing collections before importing
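mongodump and mongorestore combine naturally into a dated backup routine. A sketch, with the directory layout and database name as assumptions, guarded for machines without the tools:

```shell
#!/bin/sh
# One directory per day, e.g. backup/2014-05-14.
BACKUP_DIR="backup/$(date +%Y-%m-%d)"
mkdir -p "$BACKUP_DIR"

if command -v mongodump >/dev/null 2>&1; then
    mongodump -d mydb -o "$BACKUP_DIR"
    # To restore later, dropping existing collections first:
    #   mongorestore -d mydb --drop "$BACKUP_DIR/mydb"
else
    echo "mongodump not found, skipping"
fi
```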
5. Master-slave setup
Start mongod on the master with the --master option, and on each slave with the --slave and --source <master ip> options; the slaves will then replicate from the master.
☞ The latest MongoDB releases no longer recommend this scheme
6. Cluster deployment (Replica Sets)
1) Create the data storage paths
$ sudo mkdir -p /data/replicaset/db/r0
$ sudo mkdir -p /data/replicaset/db/r1
$ sudo mkdir -p /data/replicaset/db/r2
2) Create the log file path
$ sudo mkdir -p /data/replicaset/log
3) Create the key file shared by all members. It holds the replica set's private key; if the key file contents differ between instances, they will not be able to join the set.
$ sudo mkdir -p /data/replicaset/key
$ sudo touch /data/replicaset/key/r0
$ sudo vim /data/replicaset/key/r0
Type "this is rs1 super secret key" into r0, then save and exit.
$ sudo cp /data/replicaset/key/r0 /data/replicaset/key/r1
$ sudo cp /data/replicaset/key/r0 /data/replicaset/key/r2
$ sudo chmod 600 /data/replicaset/key/r*
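Instead of typing a secret by hand, the key can be generated randomly. A sketch (the file name is an assumption; copy the identical file to every member and keep the 600 permissions):

```shell
#!/bin/sh
# 741 random bytes, base64-encoded: a commonly used size for MongoDB key files.
openssl rand -base64 741 > mongodb-keyfile
# The key file must not be readable by other users, or mongod will refuse it.
chmod 600 mongodb-keyfile
```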
4) Start the three instances
$ sudo /usr/local/mongodb/bin/mongod --replSet rs1 --keyFile /data/replicaset/key/r0 --fork --port 28010 --dbpath /data/replicaset/db/r0 --logpath=/data/replicaset/log/r0.log --logappend
about to fork child process, waiting until server is ready for connections.
forked process: 3985
all output going to: /data/replicaset/log/r0.log
child process started successfully, parent exiting
$ sudo /usr/local/mongodb/bin/mongod --replSet rs1 --keyFile /data/replicaset/key/r1 --fork --port 28011 --dbpath /data/replicaset/db/r1 --logpath=/data/replicaset/log/r1.log --logappend
about to fork child process, waiting until server is ready for connections.
forked process: 4038
all output going to: /data/replicaset/log/r1.log
child process started successfully, parent exiting
$ sudo /usr/local/mongodb/bin/mongod --replSet rs1 --keyFile /data/replicaset/key/r2 --fork --port 28012 --dbpath /data/replicaset/db/r2 --logpath=/data/replicaset/log/r2.log --logappend
about to fork child process, waiting until server is ready for connections.
forked process: 4088
all output going to: /data/replicaset/log/r2.log
child process started successfully, parent exiting
5) Configure and initialize the Replica Set
$ mongo -port 28010
MongoDB shell version: 2.4.5
connecting to: 127.0.0.1:28010/test
> config_rs1 = {_id:'rs1', members:[
... {_id:0,host:'localhost:28010',priority:1},
... {_id:1,host:'localhost:28011'},
... {_id:2,host:'localhost:28012'}]
... }
{
    "_id" : "rs1",
    "members" : [
        {
            "_id" : 0,
            "host" : "localhost:28010",
            "priority" : 1
        },
        {
            "_id" : 1,
            "host" : "localhost:28011"
        },
        {
            "_id" : 2,
            "host" : "localhost:28012"
        }
    ]
}
> rs.initiate(config_rs1);
{
    "info" : "Config now saved locally. Should come online in about a minute.",
    "ok" : 1
}
>
6) Check the Replica Set status
> rs.status()
{
    "set" : "rs1",
    "date" : ISODate("2014-05-14T06:45:00Z"),
    "myState" : 1,
    "members" : [
        {
            "_id" : 0,
            "name" : "localhost:28010",
            "health" : 1,              -- 1 = healthy; 0 = unhealthy
            "state" : 1,               -- 1 = primary; 2 = secondary
            "stateStr" : "PRIMARY",    -- this member is the primary
            "uptime" : 532,
            "optime" : Timestamp(1400049697, 1),
            "optimeDate" : ISODate("2014-05-14T06:41:37Z"),
            "self" : true
        },
        {
            "_id" : 1,
            "name" : "localhost:28011",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 202,
            "optime" : Timestamp(1400049697, 1),
            "optimeDate" : ISODate("2014-05-14T06:41:37Z"),
            "lastHeartbeat" : ISODate("2014-05-14T06:44:58Z"),
            "lastHeartbeatRecv" : ISODate("2014-05-14T06:44:59Z"),
            "pingMs" : 0,
            "syncingTo" : "localhost:28010"
        },
        {
            "_id" : 2,
            "name" : "localhost:28012",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 202,
            "optime" : Timestamp(1400049697, 1),
            "optimeDate" : ISODate("2014-05-14T06:41:37Z"),
            "lastHeartbeat" : ISODate("2014-05-14T06:44:58Z"),
            "lastHeartbeatRecv" : ISODate("2014-05-14T06:44:59Z"),
            "pingMs" : 0,
            "syncingTo" : "localhost:28010"
        }
    ],
    "ok" : 1
}
rs1:PRIMARY>
7) The replication log: oplog
MongoDB's Replica Sets architecture keeps a log of write operations, the oplog. oplog.rs is a fixed-size capped collection in the "local" database that records the Replica Set's operations. By default, on 64-bit MongoDB the oplog is fairly large and can take up to 5% of disk space. Its size can be changed with mongod's --oplogSize option.
rs1:PRIMARY> use local
switched to db local
rs1:PRIMARY> show collections
oplog.rs
slaves
startup_log
system.indexes
system.replset
rs1:PRIMARY> db.oplog.rs.find()
{ "ts" : Timestamp(1400049697, 1), "h" : NumberLong(0), "v" : 2, "op" : "n", "ns" : "", "o" : { "msg" : "initiating set" } }
rs1:PRIMARY>
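The startup flags used for the r0 instance, plus an explicit oplog size, can also be expressed as a mongod config file. A sketch: the YAML format shown here is only accepted by MongoDB 2.6 and later (this tutorial itself runs 2.4.5), and the 1024 MB size is a made-up value:

```shell
#!/bin/sh
# Mirror the r0 startup flags as a YAML config file, adding an explicit
# oplog size (the oplog size only takes effect when data files are first created).
cat > r0.conf <<'EOF'
replication:
  replSetName: rs1
  oplogSizeMB: 1024
security:
  keyFile: /data/replicaset/key/r0
net:
  port: 28010
storage:
  dbPath: /data/replicaset/db/r0
systemLog:
  destination: file
  path: /data/replicaset/log/r0.log
  logAppend: true
processManagement:
  fork: true
EOF

if command -v mongod >/dev/null 2>&1; then
    mongod -f r0.conf || echo "mongod could not start outside the tutorial environment"
else
    echo "mongod not found, skipping"
fi
```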
Field descriptions:
ts: the timestamp of the operation
op: the operation type (i: insert; d: delete; u: update)
ns: the namespace, i.e. the collection name the operation applies to
o: the content of the document
View the master's replication info:
rs1:PRIMARY> db.printReplicationInfo()
configured oplog size:   2156.3529296875MB
log length start to end: 0secs (0hrs)
oplog first event time:  Tue May 13 2014 23:41:37 GMT-0700 (PDT)
oplog last event time:   Tue May 13 2014 23:41:37 GMT-0700 (PDT)
now:                     Tue May 13 2014 23:58:10 GMT-0700 (PDT)
rs1:PRIMARY>
Field descriptions:
configured oplog size: the configured size of the oplog file
log length start to end: the time span covered by the oplog
oplog first event time: when the first oplog entry was written
oplog last event time: when the last oplog entry was written
now: the current time
View the slaves' replication info:
rs1:PRIMARY> db.printSlaveReplicationInfo()
source: localhost:28011
    syncedTo: Tue May 13 2014 23:41:37 GMT-0700 (PDT)
        = 1284 secs ago (0.36hrs)
source: localhost:28012
    syncedTo: Tue May 13 2014 23:41:37 GMT-0700 (PDT)
        = 1284 secs ago (0.36hrs)
rs1:PRIMARY>
Field descriptions:
source: the slave's address and port
syncedTo: how far the slave has synced, i.e. its replication lag
8) The replica set configuration
Besides the oplog collection, the local database also contains a collection that stores the replica set configuration: system.replset
rs1:PRIMARY> use local
switched to db local
rs1:PRIMARY> show collections;
oplog.rs
slaves
startup_log
system.indexes
system.replset
rs1:PRIMARY> db.system.replset.find()
{ "_id" : "rs1", "version" : 1, "members" : [ { "_id" : 0, "host" : "localhost:28010" }, { "_id" : 1, "host" : "localhost:28011" }, { "_id" : 2, "host" : "localhost:28012" } ] }
rs1:PRIMARY>
This shows the Replica Set's configuration; you can also run rs.conf() on any member to view it:
rs1:PRIMARY> rs.conf()
{
    "_id" : "rs1",
    "version" : 1,
    "members" : [
        {
            "_id" : 0,
            "host" : "localhost:28010"
        },
        {
            "_id" : 1,
            "host" : "localhost:28011"
        },
        {
            "_id" : 2,
            "host" : "localhost:28012"
        }
    ]
}
rs1:PRIMARY>
9) Read/write splitting
Insert a test document on the primary:
rs1:PRIMARY> use test
switched to db test
rs1:PRIMARY> db.c1.insert({age:10})
rs1:PRIMARY> db.c1.find()
{ "_id" : ObjectId("53731736a6711e57d44675a7"), "age" : 10 }
rs1:PRIMARY>
Query it on a secondary:
$ mongo -port 28011
MongoDB shell version: 2.4.5
connecting to: 127.0.0.1:28011/test
rs1:SECONDARY> show collections;
Wed May 14 00:13:05.014 JavaScript execution failed: error: { "$err" : "not master and slaveOk=false", "code" : 13435 } at src/mongo/shell/query.js:L128
rs1:SECONDARY>
When the query fails like this, the secondary is refusing to serve reads. Run rs.slaveOk() or db.getMongo().setSlaveOk() to allow reads on the secondary and take load off the primary:
rs1:SECONDARY> rs.slaveOk()
rs1:SECONDARY> show collections;
c1
system.indexes
rs1:SECONDARY> db.c1.find()
{ "_id" : ObjectId("53731736a6711e57d44675a7"), "age" : 10 }
rs1:SECONDARY>
As shown, the data in collection c1 has been replicated from the primary to the secondary.
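Enabling secondary reads can also be scripted rather than typed interactively. A sketch, guarded so it is a no-op on machines without a mongo shell or a running secondary on 28011:

```shell
#!/bin/sh
# Feed the read-preference command and the query to the mongo shell
# non-interactively; 28011 is this tutorial's first secondary.
if command -v mongo >/dev/null 2>&1; then
    mongo --port 28011 --quiet <<'EOF'
rs.slaveOk()
db.c1.find()
EOF
fi
# Marker so callers can confirm the script ran to completion.
echo "secondary read attempted" > read_check.txt
```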
10) Failover
An improvement of Replica Sets over traditional master-slave replication is automatic failover: if one member of the set is stopped, the remaining members automatically elect a new primary.
Shut down the primary on port 28010:
rs1:PRIMARY> use admin
switched to db admin
rs1:PRIMARY> db.shutdownServer()
Wed May 14 00:20:24.659 DBClientCursor::init call() failed
server should be down...
Wed May 14 00:20:24.659 trying reconnect to 127.0.0.1:28010
Wed May 14 00:20:24.660 reconnect 127.0.0.1:28010 ok
rs1:SECONDARY>
Check the Replica Set status from the 28011 instance:
rs1:SECONDARY> rs.status()
{
    "set" : "rs1",
    "date" : ISODate("2014-05-14T07:22:16Z"),
    "myState" : 2,
    "syncingTo" : "localhost:28012",
    "members" : [
        {
            "_id" : 0,
            "name" : "localhost:28010",
            "health" : 0,
            "state" : 8,
            "stateStr" : "(not reachable/healthy)",
            "uptime" : 0,
            "optime" : Timestamp(1400051510, 1),
            "optimeDate" : ISODate("2014-05-14T07:11:50Z"),
            "lastHeartbeat" : ISODate("2014-05-14T07:22:15Z"),
            "lastHeartbeatRecv" : ISODate("2014-05-14T07:20:23Z"),
            "pingMs" : 0
        },
        {
            "_id" : 1,
            "name" : "localhost:28011",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 2676,
            "optime" : Timestamp(1400051510, 1),
            "optimeDate" : ISODate("2014-05-14T07:11:50Z"),
            "errmsg" : "syncing to: localhost:28012",
            "self" : true
        },
        {
            "_id" : 2,
            "name" : "localhost:28012",
            "health" : 1,
            "state" : 1,
            "stateStr" : "PRIMARY",
            "uptime" : 2434,
            "optime" : Timestamp(1400051510, 1),
            "optimeDate" : ISODate("2014-05-14T07:11:50Z"),
            "lastHeartbeat" : ISODate("2014-05-14T07:22:15Z"),
            "lastHeartbeatRecv" : ISODate("2014-05-14T07:22:14Z"),
            "pingMs" : 0,
            "syncingTo" : "localhost:28010"
        }
    ],
    "ok" : 1
}
rs1:SECONDARY>
As shown, the 28010 instance is unreachable and the set automatically elected 28012 as the new primary. This automatic failover greatly improves the system's availability.
11) Adding a node
MongoDB Replica Sets provide not only high availability but also a way to balance load, so adding nodes is very common in practice. When read pressure spikes and a three-node set can no longer keep up, more nodes can be added to spread the load; when pressure is low, nodes can be removed to cut hardware costs.
There are two ways to add a node: via the oplog alone, or via a database snapshot (--fastsync) combined with the oplog.
a. Adding a node via the oplog: configure and start the new node
$ sudo mkdir -p /data/replicaset/db/r3
$ sudo cp /data/replicaset/key/r0 /data/replicaset/key/r3
$ sudo chmod 600 /data/replicaset/key/r3
$ sudo /usr/local/mongodb/bin/mongod --replSet rs1 --keyFile /data/replicaset/key/r3 --fork --port 28013 --dbpath /data/replicaset/db/r3 --logpath=/data/replicaset/log/r3.log --logappend
about to fork child process, waiting until server is ready for connections.
forked process: 9170
all output going to: /data/replicaset/log/r3.log
child process started successfully, parent exiting
Add the node to the existing Replica Set:
$ mongo -port 28012
MongoDB shell version: 2.4.5
connecting to: 127.0.0.1:28012/test
rs1:PRIMARY> rs.add("localhost:28013")
{ "down" : [ "localhost:28010" ], "ok" : 1 }
rs1:PRIMARY> rs.status()
{
    "set" : "rs1",
    "date" : ISODate("2014-05-14T07:32:39Z"),
    "myState" : 1,
    "members" : [
        {
            "_id" : 0,
            "name" : "localhost:28010",
            "health" : 0,
            "state" : 8,
            "stateStr" : "(not reachable/healthy)",
            "uptime" : 0,
            "optime" : Timestamp(1400051510, 1),
            "optimeDate" : ISODate("2014-05-14T07:11:50Z"),
            "lastHeartbeat" : ISODate("2014-05-14T07:32:38Z"),
            "lastHeartbeatRecv" : ISODate("2014-05-14T07:20:23Z"),
            "pingMs" : 0
        },
        {
            "_id" : 1,
            "name" : "localhost:28011",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 3052,
            "optime" : Timestamp(1400052742, 1),
            "optimeDate" : ISODate("2014-05-14T07:32:22Z"),
            "lastHeartbeat" : ISODate("2014-05-14T07:32:38Z"),
            "lastHeartbeatRecv" : ISODate("2014-05-14T07:32:39Z"),
            "pingMs" : 0,
            "syncingTo" : "localhost:28012"
        },
        {
            "_id" : 2,
            "name" : "localhost:28012",
            "health" : 1,
            "state" : 1,
            "stateStr" : "PRIMARY",
            "uptime" : 3274,
            "optime" : Timestamp(1400052742, 1),
            "optimeDate" : ISODate("2014-05-14T07:32:22Z"),
            "self" : true
        },
        {
            "_id" : 3,
            "name" : "localhost:28013",
            "health" : 1,
            "state" : 5,
            "stateStr" : "STARTUP2",
            "uptime" : 17,
            "optime" : Timestamp(0, 0),
            "optimeDate" : ISODate("1970-01-01T00:00:00Z"),
            "lastHeartbeat" : ISODate("2014-05-14T07:32:38Z"),
            "lastHeartbeatRecv" : ISODate("2014-05-14T07:32:39Z"),
            "pingMs" : 0,
            "lastHeartbeatMessage" : "initial sync need a member to be primary or secondary to do our initial sync"
        }
    ],
    "ok" : 1
}
rs1:PRIMARY>
The new node takes some time to go from initialization to full sync; once it has caught up, check the Replica Set status again:
rs1:PRIMARY> rs.status()
{
    "set" : "rs1",
    "date" : ISODate("2014-05-14T07:34:00Z"),
    "myState" : 1,
    "members" : [
        {
            "_id" : 0,
            "name" : "localhost:28010",
            "health" : 0,
            "state" : 8,
            "stateStr" : "(not reachable/healthy)",
            "uptime" : 0,
            "optime" : Timestamp(1400051510, 1),
            "optimeDate" : ISODate("2014-05-14T07:11:50Z"),
            "lastHeartbeat" : ISODate("2014-05-14T07:33:59Z"),
            "lastHeartbeatRecv" : ISODate("2014-05-14T07:20:23Z"),
            "pingMs" : 0
        },
        {
            "_id" : 1,
            "name" : "localhost:28011",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 3133,
            "optime" : Timestamp(1400052742, 1),
            "optimeDate" : ISODate("2014-05-14T07:32:22Z"),
            "lastHeartbeat" : ISODate("2014-05-14T07:33:58Z"),
            "lastHeartbeatRecv" : ISODate("2014-05-14T07:33:59Z"),
            "pingMs" : 0,
            "syncingTo" : "localhost:28012"
        },
        {
            "_id" : 2,
            "name" : "localhost:28012",
            "health" : 1,
            "state" : 1,
            "stateStr" : "PRIMARY",
            "uptime" : 3355,
            "optime" : Timestamp(1400052742, 1),
            "optimeDate" : ISODate("2014-05-14T07:32:22Z"),
            "self" : true
        },
        {
            "_id" : 3,
            "name" : "localhost:28013",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 98,
            "optime" : Timestamp(1400052742, 1),
            "optimeDate" : ISODate("2014-05-14T07:32:22Z"),
            "lastHeartbeat" : ISODate("2014-05-14T07:34:00Z"),
            "lastHeartbeatRecv" : ISODate("2014-05-14T07:33:59Z"),
            "pingMs" : 0,
            "syncingTo" : "localhost:28012"
        }
    ],
    "ok" : 1
}
rs1:PRIMARY>
Verify that the data has been replicated:
$ mongo -port 28013
MongoDB shell version: 2.4.5
connecting to: 127.0.0.1:28013/test
rs1:SECONDARY> rs.slaveOk()
rs1:SECONDARY> show collections;
c1
system.indexes
rs1:SECONDARY> db.c1.find()
{ "_id" : ObjectId("53731736a6711e57d44675a7"), "age" : 10 }
rs1:SECONDARY>
b. Adding a node via a database snapshot (--fastsync) plus the oplog
Adding a node via the oplog alone is simple and needs little manual work, but the oplog is a capped collection that recycles its space, so entries needed for the initial sync may already have been overwritten, leaving the new node inconsistent. The snapshot approach avoids this: take a copy of one member's physical data files as the initial data, then let the oplog replay the remaining operations until the node is consistent.
Copy one member's data files to use as the initial data:
$ sudo scp -r /data/replicaset/db/r3 /data/replicaset/db/r4
$ sudo cp /data/replicaset/key/r3 /data/replicaset/key/r4
$ sudo chmod 600 /data/replicaset/key/r4
After taking the snapshot, insert a new document into collection c1, to verify later that this update is also replicated:
$ mongo -port 28012
MongoDB shell version: 2.4.5
connecting to: 127.0.0.1:28012/test
rs1:PRIMARY> db.c1.find()
{ "_id" : ObjectId("53731736a6711e57d44675a7"), "age" : 10 }
rs1:PRIMARY> db.c1.insert({age:20})
rs1:PRIMARY> db.c1.find()
{ "_id" : ObjectId("53731736a6711e57d44675a7"), "age" : 10 }
{ "_id" : ObjectId("53731f2e7f446fd6b048d515"), "age" : 20 }
rs1:PRIMARY>
Start the new node on port 28014 and add it to the current Replica Set:
$ sudo /usr/local/mongodb/bin/mongod --replSet rs1 --keyFile /data/replicaset/key/r4 --fork --port 28014 --dbpath /data/replicaset/db/r4 --logpath=/data/replicaset/log/r4.log --logappend --fastsync
about to fork child process, waiting until server is ready for connections.
forked process: 18764
all output going to: /data/replicaset/log/r4.log
child process started successfully, parent exiting
$ mongo -port 28012
MongoDB shell version: 2.4.5
connecting to: 127.0.0.1:28012/test
rs1:PRIMARY> rs.add("localhost:28014")
{ "down" : [ "localhost:28010" ], "ok" : 1 }
rs1:PRIMARY>
Verify that the data has been replicated to the 28014 instance:
$ mongo -port 28014
MongoDB shell version: 2.4.5
connecting to: 127.0.0.1:28014/test
rs1:SECONDARY> rs.slaveOk()
rs1:SECONDARY> show collections;
c1
system.indexes
rs1:SECONDARY> db.c1.find()
{ "_id" : ObjectId("53731736a6711e57d44675a7"), "age" : 10 }
{ "_id" : ObjectId("53731f2e7f446fd6b048d515"), "age" : 20 }
rs1:SECONDARY>
12) Removing nodes
Remove the two newly added nodes, 28013 and 28014, from the Replica Set:
rs1:PRIMARY> rs.remove("localhost:28013")
Wed May 14 00:51:58.235 DBClientCursor::init call() failed
Wed May 14 00:51:58.237 JavaScript execution failed: Error: error doing query: failed at src/mongo/shell/query.js:L78
Wed May 14 00:51:58.237 trying reconnect to 127.0.0.1:28012
Wed May 14 00:51:58.238 reconnect 127.0.0.1:28012 ok
rs1:PRIMARY> rs.remove("localhost:28014")
Wed May 14 00:52:29.567 DBClientCursor::init call() failed
Wed May 14 00:52:29.567 JavaScript execution failed: Error: error doing query: failed at src/mongo/shell/query.js:L78
Wed May 14 00:52:29.567 trying reconnect to 127.0.0.1:28012
Wed May 14 00:52:29.567 reconnect 127.0.0.1:28012 ok
rs1:PRIMARY> rs.status()
{
    "set" : "rs1",
    "date" : ISODate("2014-05-14T07:52:54Z"),
    "myState" : 1,
    "members" : [
        {
            "_id" : 0,
            "name" : "localhost:28010",
            "health" : 0,
            "state" : 8,
            "stateStr" : "(not reachable/healthy)",
            "uptime" : 0,
            "optime" : Timestamp(0, 0),
            "optimeDate" : ISODate("1970-01-01T00:00:00Z"),
            "lastHeartbeat" : ISODate("2014-05-14T07:52:53Z"),
            "lastHeartbeatRecv" : ISODate("1970-01-01T00:00:00Z"),
            "pingMs" : 0
        },
        {
            "_id" : 1,
            "name" : "localhost:28011",
            "health" : 1,
            "state" : 2,
            "stateStr" : "SECONDARY",
            "uptime" : 25,
            "optime" : Timestamp(1400053949, 1),
            "optimeDate" : ISODate("2014-05-14T07:52:29Z"),
            "lastHeartbeat" : ISODate("2014-05-14T07:52:53Z"),
            "lastHeartbeatRecv" : ISODate("2014-05-14T07:52:53Z"),
            "pingMs" : 0,
            "lastHeartbeatMessage" : "syncing to: localhost:28012",
            "syncingTo" : "localhost:28012"
        },
        {
            "_id" : 2,
            "name" : "localhost:28012",
            "health" : 1,
            "state" : 1,
            "stateStr" : "PRIMARY",
            "uptime" : 4489,
            "optime" : Timestamp(1400053949, 1),
            "optimeDate" : ISODate("2014-05-14T07:52:29Z"),
            "self" : true
        }
    ],
    "ok" : 1
}
rs1:PRIMARY>