Xiaowanzi Learns MongoDB Series: A Failure Drill for a Replica Set in a Disaster-Recovery Environment

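Assume a MongoDB replica set is deployed across multiple data centers. The steps below simulate a data-center failure and show both the symptoms and the recovery procedure.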
0. Architecture plan
Hostname   IP             Role          Data center
hadoop1    10.1.245.72    primary       Beijing
hadoop2    10.1.245.73    secondary1    Beijing
hadoop3    10.1.245.74    secondary2    Shanghai
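
The layout above maps onto a replica set configuration document. The following is a minimal sketch that builds one as a plain JavaScript object. The set name `rstl` and the ports come from the `rs.status()` output in section 1; the helper name `buildReplSetConfig` is hypothetical and not part of MongoDB.

```javascript
// Hypothetical helper (not a MongoDB API): build a replica set
// configuration document from a list of "host:port" strings.
function buildReplSetConfig(setName, hosts) {
  var members = [];
  for (var i = 0; i < hosts.length; i++) {
    members.push({ _id: i, host: hosts[i] });
  }
  return { _id: setName, members: members };
}

var plannedCfg = buildReplSetConfig("rstl", [
  "10.1.245.72:37027", // hadoop1, Beijing
  "10.1.245.73:37037", // hadoop2, Beijing
  "10.1.245.74:37047"  // hadoop3, Shanghai
]);
// In the mongo shell, a document of this shape could be passed to rs.initiate(plannedCfg).
```

Two of the three members sit in Beijing, so losing that data center removes the majority of the set. That is the situation section 3 simulates.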


1. Check the replica set status

[mgousr01@hadoop1 ~]$ mongo 10.1.245.72:37027
rstl:PRIMARY> rs.status()
{
        "set" : "rstl",
        "date" : ISODate("2015-12-23T03:26:38.917Z"),
        "myState" : 1,
        "members" : [
                {
                        "_id" : 0,
                        "name" : "10.1.245.72:37027",
                        "health" : 1,
                        "state" : 1,
                        "stateStr" : "PRIMARY",
                        "uptime" : 815,
                        "optime" : Timestamp(1450841104, 1),
                        "optimeDate" : ISODate("2015-12-23T03:25:04Z"),
                        "electionTime" : Timestamp(1450841106, 1),
                        "electionDate" : ISODate("2015-12-23T03:25:06Z"),
                        "configVersion" : 1,
                        "self" : true
                },
                {
                        "_id" : 1,
                        "name" : "10.1.245.73:37037",
                        "health" : 1,
                        "state" : 2,
                        "stateStr" : "SECONDARY",
                        "uptime" : 94,
                        "optime" : Timestamp(1450841104, 1),
                        "optimeDate" : ISODate("2015-12-23T03:25:04Z"),
                        "lastHeartbeat" : ISODate("2015-12-23T03:26:38.312Z"),
                        "lastHeartbeatRecv" : ISODate("2015-12-23T03:26:38.188Z"),
                        "pingMs" : 0,
                        "configVersion" : 1
                },
                {
                        "_id" : 2,
                        "name" : "10.1.245.74:37047",
                        "health" : 1,
                        "state" : 2,
                        "stateStr" : "SECONDARY",
                        "uptime" : 94,
                        "optime" : Timestamp(1450841104, 1),
                        "optimeDate" : ISODate("2015-12-23T03:25:04Z"),
                        "lastHeartbeat" : ISODate("2015-12-23T03:26:38.312Z"),
                        "lastHeartbeatRecv" : ISODate("2015-12-23T03:26:38.238Z"),
                        "pingMs" : 0,
                        "lastHeartbeatMessage" : "could not find member to sync from",
                        "configVersion" : 1
                }
        ],
        "ok" : 1
}
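
The full `rs.status()` document is long, and during a drill usually only the state and health of each member matter. The sketch below condenses a status document into one line per member. `summarizeMembers` is a hypothetical helper, and the sample input is trimmed from the output above.

```javascript
// Hypothetical helper: one summary line per member of an
// rs.status()-shaped document.
function summarizeMembers(status) {
  var lines = [];
  for (var i = 0; i < status.members.length; i++) {
    var m = status.members[i];
    lines.push(m.name + " " + m.stateStr + " health=" + m.health);
  }
  return lines;
}

// Sample input, trimmed from the rs.status() output above.
var sampleStatus = {
  set: "rstl",
  members: [
    { _id: 0, name: "10.1.245.72:37027", health: 1, stateStr: "PRIMARY" },
    { _id: 1, name: "10.1.245.73:37037", health: 1, stateStr: "SECONDARY" },
    { _id: 2, name: "10.1.245.74:37047", health: 1, stateStr: "SECONDARY" }
  ]
};
var summary = summarizeMembers(sampleStatus);
// In the mongo shell: summarizeMembers(rs.status()).forEach(function (l) { print(l); });
```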



2. Test data replication
2.1 Insert a document on the primary
rstl:PRIMARY> use test
switched to db test
rstl:PRIMARY> db.goods.insert({name:"eggs",price:38})
WriteResult({ "nInserted" : 1 })
rstl:PRIMARY> show collections;
goods
system.indexes
rstl:PRIMARY> db.goods.find();
{ "_id" : ObjectId("567a152c8d26acb133400410"), "name" : "eggs", "price" : 38 }


2.2 Check the replicated data on secondary1
[mgousr01@hadoop2 ~]$ mongo 10.1.245.73:37037
rstl:SECONDARY> use test
switched to db test
rstl:SECONDARY>  db.goods.find();
{ "_id" : ObjectId("567a152c8d26acb133400410"), "name" : "eggs", "price" : 38 }


2.3 Check the replicated data on secondary2
rstl:SECONDARY> use test
switched to db test
rstl:SECONDARY> db.goods.find();
{ "_id" : ObjectId("567a152c8d26acb133400410"), "name" : "eggs", "price" : 38 }
Note: data replication is working normally.
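
Querying each secondary by hand confirms replication for a single document. A complementary check is to compare each member's `optimeDate` with the primary's in the `rs.status()` output. The sketch below does this on a plain object; `lagBehindPrimary` is a hypothetical helper, and the sample dates are the ones shown in section 1. The legacy shell's built-in `rs.printSlaveReplicationInfo()` reports similar information.

```javascript
// Hypothetical helper: seconds by which each member's last applied
// operation trails the primary's, based on optimeDate.
function lagBehindPrimary(status) {
  var primary = null;
  var i;
  for (i = 0; i < status.members.length; i++) {
    if (status.members[i].stateStr === "PRIMARY") {
      primary = status.members[i];
    }
  }
  if (primary === null) {
    return null; // no primary, nothing to compare against
  }
  var lag = {};
  for (i = 0; i < status.members.length; i++) {
    var m = status.members[i];
    lag[m.name] = (primary.optimeDate.getTime() - m.optimeDate.getTime()) / 1000;
  }
  return lag;
}

// Sample input using the optimeDate values from section 1.
var syncedStatus = {
  members: [
    { name: "10.1.245.72:37027", stateStr: "PRIMARY", optimeDate: new Date("2015-12-23T03:25:04Z") },
    { name: "10.1.245.73:37037", stateStr: "SECONDARY", optimeDate: new Date("2015-12-23T03:25:04Z") },
    { name: "10.1.245.74:37047", stateStr: "SECONDARY", optimeDate: new Date("2015-12-23T03:25:04Z") }
  ]
};
var lag = lagBehindPrimary(syncedStatus);
// In the mongo shell: printjson(lagBehindPrimary(rs.status()));
```

A lag of 0 for every member matches the observation that all three nodes return the same document.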



3. Simulate a failure of the Beijing data center

3.1 Kill the mongod processes on the primary and secondary1 at the same time
[mgousr01@hadoop1 ~]$ ps -ef|grep mongod|grep -v grep
mgousr01 32555     1  0 11:13 ?        00:00:04 mongod -f mongodb/conf/mg72.conf
[mgousr01@hadoop1 ~]$ kill -9 32555

[mgousr01@hadoop2 ~]$ ps -ef|grep mongod |grep -v grep
mgousr01 15981     1  0 11:13 ?        00:00:05 mongod -f mongodb/conf/mg73.conf
[mgousr01@hadoop2 ~]$ kill -9 15981


3.2 Check the current replica set status on secondary2
rstl:SECONDARY> rs.status()
{
        "set" : "rstl",
        "date" : ISODate("2015-12-23T03:36:08.448Z"),
        "myState" : 2,
        "members" : [
                {
                        "_id" : 0,
                        "name" : "10.1.245.72:37027",
                        "health" : 0,
                        "state" : 8,
                        "stateStr" : "(not reachable/healthy)",
                        "uptime" : 0,
                        "optime" : Timestamp(0, 0),
                        "optimeDate" : ISODate("1970-01-01T00:00:00Z"),
                        "lastHeartbeat" : ISODate("2015-12-23T03:36:07.019Z"),
                        "lastHeartbeatRecv" : ISODate("2015-12-23T03:34:51Z"),
                        "pingMs" : 0,
                        "lastHeartbeatMessage" : "Failed attempt to connect to 10.1.245.72:37027; couldn't connect to server 10.1.245.72:37027 (10.1.245.72), connection attempt failed",
                        "configVersion" : -1
                },
                {
                        "_id" : 1,
                        "name" : "10.1.245.73:37037",
                        "health" : 0,
                        "state" : 8,
                        "stateStr" : "(not reachable/healthy)",
                        "uptime" : 0,
                        "optime" : Timestamp(0, 0),
                        "optimeDate" : ISODate("1970-01-01T00:00:00Z"),
                        "lastHeartbeat" : ISODate("2015-12-23T03:36:07.021Z"),
                        "lastHeartbeatRecv" : ISODate("2015-12-23T03:34:52.948Z"),
                        "pingMs" : 0,
                        "lastHeartbeatMessage" : "Failed attempt to connect to 10.1.245.73:37037; couldn't connect to server 10.1.245.73:37037 (10.1.245.73), connection attempt failed",
                        "configVersion" : -1
                },
                {
                        "_id" : 2,
                        "name" : "10.1.245.74:37047",
                        "health" : 1,
                        "state" : 2,
                        "stateStr" : "SECONDARY",
                        "uptime" : 1345,
                        "optime" : Timestamp(1450841388, 2),
                        "optimeDate" : ISODate("2015-12-23T03:29:48Z"),
                        "configVersion" : 1,
                        "self" : true
                }
        ],
        "ok" : 1
}
rstl:SECONDARY> db.goods.find();
{ "_id" : ObjectId("567a152c8d26acb133400410"), "name" : "eggs", "price" : 38 }
Note: the primary and secondary1 are now unreachable, but data can still be queried on secondary2.

rstl:SECONDARY> db.goods.insert({name:"cake",price:100});
WriteResult({ "writeError" : { "code" : undefined, "errmsg" : "not master" } })
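Note: a secondary member cannot accept write requests. With only one of the three members reachable, the survivor cannot win an election, so it stays SECONDARY.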
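
The reason no new primary appears is the majority rule: a member can be elected only if it can reach more than half of the voting members. The sketch below applies that rule to an `rs.status()`-shaped object. `canElectPrimary` is a hypothetical helper and assumes every member holds exactly one vote (the default).

```javascript
// Hypothetical helper. Assumption: every member has exactly one vote.
// Returns true when the healthy members form a strict majority.
function canElectPrimary(status) {
  var total = status.members.length;
  var healthy = 0;
  for (var i = 0; i < total; i++) {
    if (status.members[i].health === 1) {
      healthy++;
    }
  }
  var majority = Math.floor(total / 2) + 1;
  return healthy >= majority;
}

// Health values as seen from secondary2 after the Beijing nodes were killed.
var afterFailure = {
  members: [
    { name: "10.1.245.72:37027", health: 0 },
    { name: "10.1.245.73:37037", health: 0 },
    { name: "10.1.245.74:37047", health: 1 }
  ]
};
var electable = canElectPrimary(afterFailure);
// In the mongo shell: canElectPrimary(rs.status())
```

With three members the majority is two, so a single surviving node in Shanghai cannot promote itself.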


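3.3 Force a reconfiguration of the replica set on secondary2
rstl:SECONDARY> cfg = rs.conf()
rstl:SECONDARY> printjson(cfg)                        // print the current configuration as a backup

rstl:SECONDARY> cfg.members = [cfg.members[2]]        // remove the unreachable members from the configuration

rstl:SECONDARY> rs.reconfig(cfg, {force : true})      // force the new configuration to take effect
{ "ok" : 1 }
rstl:PRIMARY>
Note: the prompt has now changed from "SECONDARY" to "PRIMARY".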
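
Picking the surviving member by array index works for this three-node set but is easy to get wrong in a larger one. As a sketch only, the filtering step could be derived from the health values in `rs.status()` instead. `keepReachableMembers` is a hypothetical helper, and the sample configuration is a simplified stand-in for the real `rs.conf()` output, which the article does not show.

```javascript
// Hypothetical helper: return a copy of cfg whose members list keeps only
// the members that the status document reports as healthy (matched by _id).
function keepReachableMembers(cfg, status) {
  var healthyIds = {};
  var i;
  for (i = 0; i < status.members.length; i++) {
    if (status.members[i].health === 1) {
      healthyIds[status.members[i]._id] = true;
    }
  }
  var kept = [];
  for (i = 0; i < cfg.members.length; i++) {
    if (healthyIds[cfg.members[i]._id] === true) {
      kept.push(cfg.members[i]);
    }
  }
  // Shallow-copy the top-level fields so the original cfg stays untouched.
  var result = {};
  for (var key in cfg) {
    if (Object.prototype.hasOwnProperty.call(cfg, key)) {
      result[key] = cfg[key];
    }
  }
  result.members = kept;
  return result;
}

// Simplified sample inputs (assumed shapes, not the article's real rs.conf()).
var oldCfg = {
  _id: "rstl",
  version: 1,
  members: [
    { _id: 0, host: "10.1.245.72:37027" },
    { _id: 1, host: "10.1.245.73:37037" },
    { _id: 2, host: "10.1.245.74:37047" }
  ]
};
var failureStatus = {
  members: [
    { _id: 0, health: 0 },
    { _id: 1, health: 0 },
    { _id: 2, health: 1 }
  ]
};
var newCfg = keepReachableMembers(oldCfg, failureStatus);
// In the mongo shell: rs.reconfig(keepReachableMembers(rs.conf(), rs.status()), {force : true})
```

A forced reconfiguration bypasses the usual majority check, so it is best reserved for cases like this drill, where the majority of the set is genuinely lost.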

rstl:PRIMARY> rs.status();                            // check the current replica set status
{
        "set" : "rstl",
        "date" : ISODate("2015-12-23T05:49:39.772Z"),
        "myState" : 1,
        "members" : [
                {
                        "_id" : 2,
                        "name" : "10.1.245.74:37047",
                        "health" : 1,
                        "state" : 1,
                        "stateStr" : "PRIMARY",
                        "uptime" : 9356,
                        "optime" : Timestamp(1450841388, 2),
                        "optimeDate" : ISODate("2015-12-23T03:29:48Z"),
                        "electionTime" : Timestamp(1450849750, 1),
                        "electionDate" : ISODate("2015-12-23T05:49:10Z"),
                        "configVersion" : 21037,
                        "self" : true
                }
        ],
        "ok" : 1
}

rstl:PRIMARY> db.goods.insert({name:"cake",price:100}); 
WriteResult({ "nInserted" : 1 })
rstl:PRIMARY> db.goods.find();
{ "_id" : ObjectId("567a152c8d26acb133400410"), "name" : "eggs", "price" : 38 }
{ "_id" : ObjectId("567a361bb72708516df2cb36"), "name" : "cake", "price" : 100 }

At this point the new primary has been configured successfully and accepts write requests normally.

Source: ITPUB blog, http://blog.itpub.net/20801486/viewspace-1878235/. If you repost this article, please credit the source; otherwise legal action will be pursued.
