假设MongoDB副本集部署在多机房环境中,下面模拟下机房故障发生时的现象已经处理过程。
0.架构规划
主机名 IP 角色 机房
hadoop1 10.1.245.72 primary 北京
hadoop2 10.1.245.73 secondary1 北京
hadoop3 10.1.245.74 secondary2 上海
1.查看副本集状态
[mgousr01@hadoop1 ~]$ mongo 10.1.245.72:37027
rstl:PRIMARY> rs.status()
{
"set" : "rstl",
"date" : ISODate("2015-12-23T03:26:38.917Z"),
"myState" : 1,
"members" : [
{
"_id" : 0,
"name" : "10.1.245.72:37027",
"health" : 1,
"state" : 1,
"stateStr" : "PRIMARY",
"uptime" : 815,
"optime" : Timestamp(1450841104, 1),
"optimeDate" : ISODate("2015-12-23T03:25:04Z"),
"electionTime" : Timestamp(1450841106, 1),
"electionDate" : ISODate("2015-12-23T03:25:06Z"),
"configVersion" : 1,
"self" : true
},
{
"_id" : 1,
"name" : "10.1.245.73:37037",
"health" : 1,
"state" : 2,
"stateStr" : "SECONDARY",
"uptime" : 94,
"optime" : Timestamp(1450841104, 1),
"optimeDate" : ISODate("2015-12-23T03:25:04Z"),
"lastHeartbeat" : ISODate("2015-12-23T03:26:38.312Z"),
"lastHeartbeatRecv" : ISODate("2015-12-23T03:26:38.188Z"),
"pingMs" : 0,
"configVersion" : 1
},
{
"_id" : 2,
"name" : "10.1.245.74:37047",
"health" : 1,
"state" : 2,
"stateStr" : "SECONDARY",
"uptime" : 94,
"optime" : Timestamp(1450841104, 1),
"optimeDate" : ISODate("2015-12-23T03:25:04Z"),
"lastHeartbeat" : ISODate("2015-12-23T03:26:38.312Z"),
"lastHeartbeatRecv" : ISODate("2015-12-23T03:26:38.238Z"),
"pingMs" : 0,
"lastHeartbeatMessage" : "could not find member to sync from",
"configVersion" : 1
}
],
"ok" : 1
}
2.测试数据同步功能
2.1 primary执行写入操作
rstl:PRIMARY> use test
switched to db test
rstl:PRIMARY> db.goods.insert({name:"eggs",price:38})
WriteResult({ "nInserted" : 1 })
rstl:PRIMARY> show collections;
goods
system.indexes
rstl:PRIMARY> db.goods.find();
{ "_id" : ObjectId("567a152c8d26acb133400410"), "name" : "eggs", "price" : 38 }
2.2 secondary1查看同步数据
[mgousr01@hadoop2 ~]$ mongo 10.1.245.73:37037
rstl:SECONDARY> use test
switched to db test
rstl:SECONDARY> db.goods.find();
{ "_id" : ObjectId("567a152c8d26acb133400410"), "name" : "eggs", "price" : 38 }
2.3 secondary2查看同步数据
rstl:SECONDARY> use test
switched to db test
rstl:SECONDARY> db.goods.find();
{ "_id" : ObjectId("567a152c8d26acb133400410"), "name" : "eggs", "price" : 38 }
注:数据同步正常
3.模拟北京机房故障
3.1 同时将primary和secondary1上的mongod进程kill掉
[mgousr01@hadoop1 ~]$ ps -ef|grep mongod|grep -v grep
mgousr01 32555 1 0 11:13 ? 00:00:04 mongod -f mongodb/conf/mg72.conf
[mgousr01@hadoop1 ~]$ kill -9 32555
[mgousr01@hadoop2 ~]$ ps -ef|grep mongod |grep -v grep
mgousr01 15981 1 0 11:13 ? 00:00:05 mongod -f mongodb/conf/mg73.conf
[mgousr01@hadoop2 ~]$ kill -9 15981
3.2 secondary2上查看当前副本集状态
rstl:SECONDARY> rs.status()
{
"set" : "rstl",
"date" : ISODate("2015-12-23T03:36:08.448Z"),
"myState" : 2,
"members" : [
{
"_id" : 0,
"name" : "10.1.245.72:37027",
"health" : 0,
"state" : 8,
"stateStr" : "(not reachable/healthy)",
"uptime" : 0,
"optime" : Timestamp(0, 0),
"optimeDate" : ISODate("1970-01-01T00:00:00Z"),
"lastHeartbeat" : ISODate("2015-12-23T03:36:07.019Z"),
"lastHeartbeatRecv" : ISODate("2015-12-23T03:34:51Z"),
"pingMs" : 0,
"lastHeartbeatMessage" : "Failed attempt to connect to 10.1.245.72:37027; couldn't connect to server 10.1.245.72:37027 (10.1.245.72), connection attempt failed", "configVersion" : -1
},
{
"_id" : 1,
"name" : "10.1.245.73:37037",
"health" : 0,
"state" : 8,
"stateStr" : "(not reachable/healthy)",
"uptime" : 0,
"optime" : Timestamp(0, 0),
"optimeDate" : ISODate("1970-01-01T00:00:00Z"),
"lastHeartbeat" : ISODate("2015-12-23T03:36:07.021Z"),
"lastHeartbeatRecv" : ISODate("2015-12-23T03:34:52.948Z"),
"pingMs" : 0,
"lastHeartbeatMessage" : "Failed attempt to connect to 10.1.245.73:37037; couldn't connect to server 10.1.245.73:37037 (10.1.245.73), connection attempt failed",
"configVersion" : -1
},
{
"_id" : 2,
"name" : "10.1.245.74:37047",
"health" : 1,
"state" : 2,
"stateStr" : "SECONDARY",
"uptime" : 1345,
"optime" : Timestamp(1450841388, 2),
"optimeDate" : ISODate("2015-12-23T03:29:48Z"),
"configVersion" : 1,
"self" : true
}
],
"ok" : 1
}
rstl:SECONDARY> db.goods.find();
{ "_id" : ObjectId("567a152c8d26acb133400410"), "name" : "eggs", "price" : 38 }
注:目前已经无法连接到primary和secondary1,当前还可以查询数据
rstl:SECONDARY> db.goods.insert({name:"cake",price:100});
WriteResult({ "writeError" : { "code" : undefined, "errmsg" : "not master" } })
注:secondary成员是无法接受写入请求的
3.3 secondary2上重新初始化副本集配置
rstl:SECONDARY> cfg = rs.conf()
rstl:SECONDARY> printjson(cfg) --备份当前配置
rstl:SECONDARY> cfg.members = [cfg.members[2]] --在配置中删除不可用成员
rstl:SECONDARY> rs.reconfig(cfg, {force : true}) --让新配置重新生效
{ "ok" : 1 }
rstl:PRIMARY>
注:这个时候提示符已经从"SECONDARY"变成了"PRIMARY"
rstl:PRIMARY> rs.status(); --查看当前副本集状态
{
"set" : "rstl",
"date" : ISODate("2015-12-23T05:49:39.772Z"),
"myState" : 1,
"members" : [
{
"_id" : 2,
"name" : "10.1.245.74:37047",
"health" : 1,
"state" : 1,
"stateStr" : "PRIMARY",
"uptime" : 9356,
"optime" : Timestamp(1450841388, 2),
"optimeDate" : ISODate("2015-12-23T03:29:48Z"),
"electionTime" : Timestamp(1450849750, 1),
"electionDate" : ISODate("2015-12-23T05:49:10Z"),
"configVersion" : 21037,
"self" : true
}
],
"ok" : 1
}
rstl:PRIMARY> db.goods.insert({name:"cake",price:100});
WriteResult({ "nInserted" : 1 })
rstl:PRIMARY> db.goods.find();
{ "_id" : ObjectId("567a152c8d26acb133400410"), "name" : "eggs", "price" : 38 }
{ "_id" : ObjectId("567a361bb72708516df2cb36"), "name" : "cake", "price" : 100 }
至此,新的primary配置成功并能够正常接收写入请求。
来自 “ ITPUB博客 ” ,链接:http://blog.itpub.net/20801486/viewspace-1878235/,如需转载,请注明出处,否则将追究法律责任。
转载于:http://blog.itpub.net/20801486/viewspace-1878235/