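This article is reposted from 运维生存时间 (ttlsa): http://www.ttlsa.com/html/1096.html
MongoDB High-Availability Architecture: Replica Set
A Replica Set is a group of n mongod nodes that together provide a high-availability deployment with automatic failover and automatic recovery.
A Replica Set can also be used to split reads from writes. If slaveOk is specified at connection time, or set on the connection to a secondary, the secondaries share the read load while the primary handles the writes.
By default, secondary nodes in a Replica Set do not accept reads.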
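As a rough illustration of "specifying it at connection time", the sketch below builds a replica-set connection string. The helper name buildReplicaSetUri is hypothetical. The replicaSet and readPreference options are the connection-string options used by current MongoDB drivers; they are an assumption here, since the 2.0.3 setup in this article relies on slaveOk instead.

```javascript
// Hypothetical helper (not part of MongoDB): builds a connection string
// that lists the data-bearing members and asks the driver to prefer
// secondaries for reads. Assumes a driver that understands the
// `replicaSet` and `readPreference` URI options.
function buildReplicaSetUri(hosts, setName, readPreference) {
  var uri = "mongodb://" + hosts.join(",") +
            "/?replicaSet=" + encodeURIComponent(setName);
  if (readPreference) {
    uri += "&readPreference=" + encodeURIComponent(readPreference);
  }
  return uri;
}

// The arbiter (192.168.198.132) holds no data, so it is left out of the seed list.
var uri = buildReplicaSetUri(
  ["192.168.198.131:27017", "192.168.198.129:27017"],
  "myset",
  "secondaryPreferred"
);
console.log(uri);
```

With secondaryPreferred, reads go to a secondary when one is available and fall back to the primary otherwise; writes always go to the primary.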
The environment consists of three servers:
192.168.198.131 192.168.198.129 192.168.198.132
Install the mongod service on each of the three servers as follows:
# wget http://fastdl.mongodb.org/linux/mongodb-linux-x86_64-2.0.3.tgz
# tar zxvf mongodb-linux-x86_64-2.0.3.tgz -C ../software/
# ln -s mongodb-linux-x86_64-2.0.3 /usr/local/mongodb
# useradd mongodb
# mkdir -p /data/mongodb/myset
# cd /usr/local/mongodb/bin
# ./mongod --replSet myset --dbpath /data/mongodb/myset --oplogSize 100 --logpath /data/mongodb/myset/myset.log --logappend --fork
# ./mongo // run the following on any one of the servers
> config={_id:"myset",members:[
... {_id:0,host:"192.168.198.131:27017"},
... {_id:1,host:"192.168.198.129:27017"},
... {_id:2,host:"192.168.198.132:27017",arbiterOnly:true}]}
Output:
{
"_id" : "myset",
"members" : [
{
"_id" : 0,
"host" : "192.168.198.131:27017"
},
{
"_id" : 1,
"host" : "192.168.198.129:27017"
},
{
"_id" : 2,
"host" : "192.168.198.132:27017",
"arbiterOnly" : true
}
]
}
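The config document above can also be generated rather than typed by hand. This is a minimal sketch in plain JavaScript; buildReplSetConfig is a hypothetical helper, not a MongoDB function, and it assumes member _id values are simply assigned in order, data-bearing hosts first and arbiters last.

```javascript
// Hypothetical helper: builds a replica set config document of the same
// shape as the one passed to rs.initiate() in this article.
function buildReplSetConfig(setName, dataHosts, arbiterHosts) {
  var members = [];
  var nextId = 0;
  dataHosts.forEach(function (host) {
    members.push({ _id: nextId++, host: host });
  });
  (arbiterHosts || []).forEach(function (host) {
    // arbiterOnly members vote in elections but store no data
    members.push({ _id: nextId++, host: host, arbiterOnly: true });
  });
  return { _id: setName, members: members };
}

var config = buildReplSetConfig(
  "myset",
  ["192.168.198.131:27017", "192.168.198.129:27017"],
  ["192.168.198.132:27017"]
);
console.log(JSON.stringify(config, null, 2));
```

The resulting object has the same fields as the hand-written config, so it could be handed to rs.initiate() in the same way.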
> rs.initiate(config) // initialize the replica set
Output:
{
"info" : "Config now saved locally. Should come online in about a minute.",
"ok" : 1
}
> rs.conf() // view the configuration
{
"_id" : "myset",
"version" : 1,
"members" : [
{
"_id" : 0,
"host" : "192.168.198.131:27017"
},
{
"_id" : 1,
"host" : "192.168.198.129:27017"
},
{
"_id" : 2,
"host" : "192.168.198.132:27017",
"arbiterOnly" : true
}
]
}
> rs.status() // view the status
{
"set" : "myset",
"date" : ISODate("2012-03-01T08:45:01Z"),
"myState" : 1,
"members" : [
{
"_id" : 0,
"name" : "192.168.198.131:27017",
"health" : 1,
"state" : 1,
"stateStr" : "PRIMARY",
"optime" : {
"t" : 1330591378000,
"i" : 1
},
"optimeDate" : ISODate("2012-03-01T08:42:58Z"),
"self" : true
},
{
"_id" : 1,
"name" : "192.168.198.129:27017",
"health" : 1,
"state" : 2,
"stateStr" : "SECONDARY",
"uptime" : 121,
"optime" : {
"t" : 1330591378000,
"i" : 1
},
"optimeDate" : ISODate("2012-03-01T08:42:58Z"),
"lastHeartbeat" : ISODate("2012-03-01T08:45:01Z"),
"pingMs" : 0
},
{
"_id" : 2,
"name" : "192.168.198.132:27017",
"health" : 1,
"state" : 7,
"stateStr" : "ARBITER",
"uptime" : 121,
"optime" : {
"t" : 0,
"i" : 0
},
"optimeDate" : ISODate("1970-01-01T00:00:00Z"),
"lastHeartbeat" : ISODate("2012-03-01T08:45:01Z"),
"pingMs" : 1
}
],
"ok" : 1
}
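state: 1 means the member is PRIMARY and accepts reads and writes; 2 means SECONDARY, which rejects writes and serves reads only when slaveOk is set; 7 means ARBITER, which votes but holds no data.
health: 1 means the member is healthy; 0 means it is unreachable or unhealthy.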
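The state and health fields can be turned into a readable summary with a small lookup table. The sketch below is plain JavaScript; STATE_NAMES, describeState and summarizeMembers are hypothetical names, and the sample object is a trimmed copy of the rs.status() output shown in this article. Note that 2.0.3 prints state 8 as "(not reachable/healthy)" in stateStr; the table labels it DOWN, the name used in the mongod logs.

```javascript
// Replica set member state codes and their names (assumption: only the
// commonly documented codes are listed; others fall through to UNRECOGNIZED).
var STATE_NAMES = {
  0: "STARTUP",
  1: "PRIMARY",
  2: "SECONDARY",
  3: "RECOVERING",
  5: "STARTUP2",
  6: "UNKNOWN",
  7: "ARBITER",
  8: "DOWN",
  9: "ROLLBACK"
};

function describeState(state) {
  return STATE_NAMES[state] || "UNRECOGNIZED(" + state + ")";
}

// Produces one line per member: "<host> <state name> <healthy|unhealthy>"
function summarizeMembers(status) {
  return status.members.map(function (m) {
    return m.name + " " + describeState(m.state) + " " +
           (m.health === 1 ? "healthy" : "unhealthy");
  });
}

// Trimmed copy of the rs.status() output from this article
var sampleStatus = {
  set: "myset",
  members: [
    { name: "192.168.198.131:27017", health: 1, state: 1 },
    { name: "192.168.198.129:27017", health: 1, state: 2 },
    { name: "192.168.198.132:27017", health: 1, state: 7 }
  ]
};
var summary = summarizeMembers(sampleStatus);
summary.forEach(function (line) { console.log(line); });
```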
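At any given time, each Replica Set has exactly one primary, which accepts writes. Those writes are then replicated asynchronously to the other members. If the primary dies, the remaining members automatically vote to elect a new primary, and the old server rejoins as an ordinary member once it recovers. Any writes that had not yet been replicated from the old primary to the other members may be lost.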
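The voting rule behind automatic failover can be checked with simple arithmetic: a candidate needs votes from a strict majority of the voting members. The sketch below assumes every member has one vote (the default); the function names are hypothetical.

```javascript
// Smallest number of votes that forms a strict majority
function majority(votingMembers) {
  return Math.floor(votingMembers / 2) + 1;
}

// True if the members that can still reach each other are enough to elect a primary
function canElectPrimary(votingMembers, reachableMembers) {
  return reachableMembers >= majority(votingMembers);
}

// This article's set: 2 data nodes + 1 arbiter = 3 voting members
console.log(majority(3));           // votes needed
console.log(canElectPrimary(3, 2)); // primary down, secondary + arbiter remain
console.log(canElectPrimary(3, 1)); // only one member left
```

This is why the arbiter matters in a two-data-node setup: with only two voting members, the loss of either one would leave the survivor without a majority.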
PRIMARY> db.test.insert({"name":"foobar","age":25})
PRIMARY> db.test.find()
{ "_id" : ObjectId("4f4f38fc47db2bfa5ceb2aee"), "name" : "foobar", "age" : 25 }
SECONDARY> db.test.find()
error: { "$err" : "not master and slaveok=false", "code" : 13435 }
SECONDARY> db.test.insert({"name":"foobar","age":25})
not master
Enable slaveOk in the shell connected to the secondary (the setting applies per connection, so running it in the primary's shell has no effect on the secondary's session):
SECONDARY> db.getMongo().setSlaveOk()
SECONDARY> use test
switched to db test
SECONDARY> db.test.find()
{ "_id" : ObjectId("4f4f38fc47db2bfa5ceb2aee"), "name" : "foobar", "age" : 25 }
Run pkill mongod on 192.168.198.131 (the current primary). Its log shows the shutdown:
Thu Mar 1 17:17:51 got kill or ctrl c or hup signal 15 (Terminated), will terminate after current cmd ends
Thu Mar 1 17:17:51 [interruptThread] now exiting
Thu Mar 1 17:17:51 dbexit:
Thu Mar 1 17:17:51 [interruptThread] shutdown: going to close listening sockets...
Thu Mar 1 17:17:51 [interruptThread] closing listening socket: 7
Thu Mar 1 17:17:51 [interruptThread] closing listening socket: 8
Thu Mar 1 17:17:51 [interruptThread] closing listening socket: 9
Thu Mar 1 17:17:51 [interruptThread] removing socket file: /tmp/mongodb-27017.sock
Thu Mar 1 17:17:51 [interruptThread] shutdown: going to flush diaglog...
Thu Mar 1 17:17:51 [interruptThread] shutdown: going to close sockets...
Thu Mar 1 17:17:51 [conn1] end connection 127.0.0.1:58614
Thu Mar 1 17:17:51 [interruptThread] shutdown: waiting for fs preallocator...
Thu Mar 1 17:17:51 [interruptThread] shutdown: lock for final commit...
Thu Mar 1 17:17:51 [interruptThread] shutdown: final commit...
Thu Mar 1 17:17:52 [interruptThread] shutdown: closing all files...
Thu Mar 1 17:17:52 [interruptThread] closeAllFiles() finished
Thu Mar 1 17:17:52 [interruptThread] journalCleanup...
Thu Mar 1 17:17:52 [interruptThread] removeJournalFiles
Thu Mar 1 17:17:52 [interruptThread] shutdown: removing fs lock...
Thu Mar 1 17:17:52 dbexit: really exiting now
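192.168.198.129 is then elected primary. Its log (the servers' clocks are not synchronized, so the hour differs between machines):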
Thu Mar 1 00:17:51 [conn144] end connection 192.168.198.131:35714
Thu Mar 1 00:17:51 [rsSync] replSet syncThread: 10278 dbclient error communicating with server: 192.168.198.131:27017
Thu Mar 1 00:17:52 [rsHealthPoll] DBClientCursor::init call() failed
Thu Mar 1 00:17:52 [rsHealthPoll] replSet info 192.168.198.131:27017 is down (or slow to respond): DBClientBase::findN: transport error: 192.168.198.131:27017 query: { replSetHeartbeat: "myset", v: 1, pv: 1, checkEmpty: false, from: "192.168.198.129:27017" }
Thu Mar 1 00:17:52 [rsHealthPoll] replSet member 192.168.198.131:27017 is now in state DOWN
Thu Mar 1 00:17:52 [rsMgr] not electing self, 192.168.198.132:27017 would veto
Thu Mar 1 00:17:58 [rsMgr] replSet info electSelf 1
Thu Mar 1 00:17:58 [rsMgr] replSet PRIMARY
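The failover time can be read off this log: the old primary is marked DOWN at 00:17:52 and this node becomes PRIMARY at 00:17:58. The sketch below does the same calculation in plain JavaScript. The function names are hypothetical, and the parsing assumes the 2.0-style log format shown here, with both events on the same day.

```javascript
// Extracts the HH:MM:SS timestamp from a log line as seconds since midnight
function secondsOfDay(line) {
  var m = /(\d{2}):(\d{2}):(\d{2})/.exec(line);
  if (!m) return null;
  return Number(m[1]) * 3600 + Number(m[2]) * 60 + Number(m[3]);
}

// Seconds between the first "is now in state DOWN" line and the next
// "replSet PRIMARY" line; null if either event is missing.
function failoverSeconds(lines) {
  var downAt = null;
  for (var i = 0; i < lines.length; i++) {
    var t = secondsOfDay(lines[i]);
    if (t === null) continue;
    if (downAt === null && lines[i].indexOf("is now in state DOWN") !== -1) {
      downAt = t;
    } else if (downAt !== null && lines[i].indexOf("replSet PRIMARY") !== -1) {
      return t - downAt;
    }
  }
  return null;
}

// Lines taken from the 192.168.198.129 log above
var logLines = [
  "Thu Mar 1 00:17:52 [rsHealthPoll] replSet member 192.168.198.131:27017 is now in state DOWN",
  "Thu Mar 1 00:17:52 [rsMgr] not electing self, 192.168.198.132:27017 would veto",
  "Thu Mar 1 00:17:58 [rsMgr] replSet info electSelf 1",
  "Thu Mar 1 00:17:58 [rsMgr] replSet PRIMARY"
];
var elapsed = failoverSeconds(logLines);
console.log(elapsed + " seconds from DOWN to PRIMARY");
```

In this run the set was without a primary for about 6 seconds, during which writes would have been rejected.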
Log on the arbiter, 192.168.198.132:
Thu Mar 1 04:17:51 [conn143] end connection 192.168.198.131:56260
Thu Mar 1 04:17:53 [rsHealthPoll] DBClientCursor::init call() failed
Thu Mar 1 04:17:53 [rsHealthPoll] replSet info 192.168.198.131:27017 is down (or slow to respond): DBClientBase::findN: transport error: 192.168.198.131:27017 query: { replSetHeartbeat: "myset", v: 1, pv: 1, checkEmpty: false, from: "192.168.198.132:27017" }
Thu Mar 1 04:17:53 [rsHealthPoll] replSet member 192.168.198.131:27017 is now in state DOWN
Thu Mar 1 04:17:58 [conn144] replSet info voting yea for 192.168.198.129:27017 (1)
Thu Mar 1 04:17:59 [rsHealthPoll] replSet member 192.168.198.129:27017 is now in state PRIMARY
Thu Mar 1 04:18:05 [rsHealthPoll] couldn't connect to 192.168.198.131:27017: couldn't connect to server 192.168.198.131:27017
PRIMARY> rs.status();
{
"set" : "myset",
"date" : ISODate("2012-03-01T09:20:37Z"),
"myState" : 1,
"syncingTo" : "192.168.198.131:27017",
"members" : [
{
"_id" : 0,
"name" : "192.168.198.131:27017",
"health" : 0,
"state" : 8,
"stateStr" : "(not reachable/healthy)",
"uptime" : 0,
"optime" : {
"t" : 1330591997000,
"i" : 1
},
"optimeDate" : ISODate("2012-03-01T08:53:17Z"),
"lastHeartbeat" : ISODate("2012-03-01T09:17:50Z"),
"pingMs" : 0,
"errmsg" : "socket exception"
},
{
"_id" : 1,
"name" : "192.168.198.129:27017",
"health" : 1,
"state" : 1,
"stateStr" : "PRIMARY",
"optime" : {
"t" : 1330591997000,
"i" : 1
},
"optimeDate" : ISODate("2012-03-01T08:53:17Z"),
"self" : true
},
{
"_id" : 2,
"name" : "192.168.198.132:27017",
"health" : 1,
"state" : 7,
"stateStr" : "ARBITER",
"uptime" : 2244,
"optime" : {
"t" : 0,
"i" : 0
},
"optimeDate" : ISODate("1970-01-01T00:00:00Z"),
"lastHeartbeat" : ISODate("2012-03-01T09:20:36Z"),
"pingMs" : 0
}
],
"ok" : 1
}
PRIMARY> db.test.find()
{ "_id" : ObjectId("4f4f38fc47db2bfa5ceb2aee"), "name" : "foobar", "age" : 25 }
{ "_id" : ObjectId("4f4f3fe2a7c9a9d1eb78392f"), "name" : "ttlsa", "age" : 1 }
Start the mongod service on 192.168.198.131 again. It rejoins the set as a secondary:
Thu Mar 1 17:23:24 [initandlisten] MongoDB starting : pid=6977 port=27017 dbpath=/data/mongodb/myset 64-bit host=node2
Thu Mar 1 17:23:24 [initandlisten] db version v2.0.3, pdfile version 4.5
Thu Mar 1 17:23:24 [initandlisten] git version: 05bb8aa793660af8fce7e36b510ad48c27439697
Thu Mar 1 17:23:24 [initandlisten] build info: Linux ip-10-110-9-236 2.6.21.7-2.ec2.v1.2.fc8xen #1 SMP Fri Nov 20 17:48:28 EST 2009 x86_64 BOOST_LIB_VERSION=1_41
Thu Mar 1 17:23:24 [initandlisten] options: { dbpath: "/data/mongodb/myset", fork: true, logappend: true, logpath: "/data/mongodb/myset/myset.log", oplogSize: 100, replSet: "myset" }
Thu Mar 1 17:23:24 [initandlisten] journal dir=/data/mongodb/myset/journal
Thu Mar 1 17:23:24 [initandlisten] recover : no journal files present, no recovery needed
Thu Mar 1 17:23:26 [initandlisten] waiting for connections on port 27017
Thu Mar 1 17:23:26 [websvr] admin web console waiting for connections on port 28017
Thu Mar 1 17:23:27 [initandlisten] connection accepted from 192.168.198.129:43753 #1
Thu Mar 1 17:23:27 [initandlisten] connection accepted from 127.0.0.1:37253 #2
Thu Mar 1 17:23:27 [rsStart] trying to contact 192.168.198.129:27017
Thu Mar 1 17:23:27 [rsStart] replSet STARTUP2
Thu Mar 1 17:23:27 [rsSync] replSet SECONDARY
SECONDARY> use test
switched to db test
SECONDARY> db.test.find()
{ "_id" : ObjectId("4f4f38fc47db2bfa5ceb2aee"), "name" : "foobar", "age" : 25 }
{ "_id" : ObjectId("4f4f3fe2a7c9a9d1eb78392f"), "name" : "ttlsa", "age" : 1 }