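The current deployment has three mongos instances (the mongo routing servers).
There are three mongod groups (shards). Each group is a replica set of three machines: one primary and two secondaries.
When the cluster was first set up, the member IPs of each replica set were hard-coded in server.conf, so any change to a set of three machines meant editing the configuration on all three: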
shardsvr=true
replSet=unilog-shard03/10.70.175.69:27017,10.70.175.70:27017,10.70.175.71:27017
port=27017
oplogSize=512
logappend=true
fork=true
auth=true
The shard was then registered on mongos like this:
db.runCommand({addshard:"unilog-shard01/10.70.175.63:27017,10.70.175.64:27017,10.70.175.65:27017"});
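Because the same "setName/host,host,host" string appears both in server.conf and in the addshard call, it can help to treat it as data rather than editing it by hand. The following is only a minimal sketch in plain JavaScript; parseShardString and withoutHost are hypothetical helper names, not part of MongoDB.

```javascript
// Hypothetical helper (not a MongoDB API): split "setName/host1,host2"
// into the replica set name and the list of hosts.
function parseShardString(s) {
  var slash = s.indexOf('/');
  if (slash < 0) {
    // No set name present: treat the string as a single standalone host
    return { setName: null, hosts: [s] };
  }
  return {
    setName: s.slice(0, slash),
    hosts: s.slice(slash + 1).split(',')
  };
}

// Hypothetical helper: rebuild the string with one host removed.
function withoutHost(s, host) {
  var p = parseShardString(s);
  var kept = p.hosts.filter(function (h) { return h !== host; });
  return (p.setName ? p.setName + '/' : '') + kept.join(',');
}

var shard01 = 'unilog-shard01/10.70.175.63:27017,10.70.175.64:27017,10.70.175.65:27017';
console.log(withoutHost(shard01, '10.70.175.65:27017'));
// prints "unilog-shard01/10.70.175.63:27017,10.70.175.64:27017"
```

This only manipulates the string; it does not change anything in the cluster.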
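Recently, as part of a machine capacity expansion, one host (65) was removed, and after that mongos could no longer connect.
Checking the port with netstat -an | grep 27017 showed that
connections to 65 stayed stuck in a pending state and never completed.
The server.conf file on the mongod hosts needs to be adjusted so that replSet carries only the set name: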
shardsvr=true
replSet=unilog-shard02
port=27017
oplogSize=512
logappend=true
fork=true
auth=true
Press ESC and type :wq! to save and exit, then restart mongod.
Then, on the primary (for example, if 62 is the primary), run the following commands:
var cfg={ _id:'unilog-shard02', members:[ {_id:0,host:'10.70.175.66:27017',priority:2}, {_id:1,host:'10.70.175.67:27017',priority:1}, {_id:2,host:'10.70.175.68:27017'} ] };
var s= rs.status(); s;
var m=s.members;
if(m&&m.length>0){ rs.reconfig(cfg); }else{ rs.initiate(cfg); }
rs.slaveOk();rs.status()
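Every member of a replica set config needs its own _id, and the members array has to be closed properly, which is easy to get wrong when typing the document by hand. As a rough sketch, the config could be generated from a host list instead; buildReplSetConfig and duplicateMemberIds are hypothetical helpers, written as plain JavaScript.

```javascript
// Hypothetical helper: build a replica set config document from a host list.
// Member _id values come from the array index, so they are always unique.
function buildReplSetConfig(setName, hosts, priorities) {
  priorities = priorities || {};
  return {
    _id: setName,
    members: hosts.map(function (host, i) {
      var m = { _id: i, host: host };
      // Only set priority when one was given for this host
      if (priorities[host] !== undefined) { m.priority = priorities[host]; }
      return m;
    })
  };
}

// Hypothetical helper: return the _id values used by more than one member.
function duplicateMemberIds(cfg) {
  var seen = {}, dups = [];
  cfg.members.forEach(function (m) {
    if (seen[m._id] && dups.indexOf(m._id) < 0) { dups.push(m._id); }
    seen[m._id] = true;
  });
  return dups;
}

var cfg02 = buildReplSetConfig(
  'unilog-shard02',
  ['10.70.175.66:27017', '10.70.175.67:27017', '10.70.175.68:27017'],
  { '10.70.175.66:27017': 2, '10.70.175.67:27017': 1 }
);
console.log(JSON.stringify(cfg02));
console.log(duplicateMemberIds(cfg02).length); // prints 0
```

The resulting object has the same shape as the cfg variable passed to rs.reconfig() or rs.initiate().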
These commands must be run in the mongo shell, authenticated as the admin user:
mongo@oa-04-va59:~> mongo
MongoDB shell version: 2.0.2
connecting to: test
> use admin
switched to db admin
> db.auth("admin","admin123")
1
PRIMARY> var cfg={ _id:'shard0', members:[ {_id:0,host:'10.70.175.63:27017',priority:2}, {_id:1,host:'10.70.175.63:27017',priority:1} }; \
Tue Jun 9 11:28:46 SyntaxError: missing ] after element list (shell):1
PRIMARY> var s= rs.status(); s;\
Tue Jun 9 11:28:46 SyntaxError: illegal character (shell):1
PRIMARY> var m=s.members; \
Tue Jun 9 11:28:46 SyntaxError: illegal character (shell):1
PRIMARY> if(m&&m.length>0){ rs.reconfig(cfg); }else{ rs.initiate(cfg); } \
Tue Jun 9 11:28:46 SyntaxError: illegal character (shell):1
PRIMARY> rs.slaveOk();rs.status()
{
"set" : "unilog-shard01",
"date" : ISODate("2015-06-09T03:28:48Z"),
"myState" : 1,
"members" : [
{
"_id" : 0,
"name" : "10.70.175.63:27017",
"health" : 1,
"state" : 1,
"stateStr" : "PRIMARY",
"optime" : {
"t" : 1433820527000,
"i" : 2
},
"optimeDate" : ISODate("2015-06-09T03:28:47Z"),
"self" : true
},
{
"_id" : 3,
"name" : "10.70.175.64:27017",
"health" : 1,
"state" : 3,
"stateStr" : "RECOVERING",
"uptime" : 2097,
"optime" : {
"t" : 1403622942000,
"i" : 1
},
"optimeDate" : ISODate("2014-06-24T15:15:42Z"),
"lastHeartbeat" : ISODate("2015-06-09T03:28:47Z"),
"pingMs" : 0,
"errmsg" : "error RS102 too stale to catch up"
}
],
"ok" : 1
}
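"errmsg" : "error RS102 too stale to catch up" means that this member has fallen too far behind the primary to catch up from its oplog, so the machines are not in sync. (The SyntaxError lines in the session above come from the missing ] and the trailing backslashes in the pasted commands, so the reconfig itself never ran.) Running db.repairDatabase() then failed with: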
PRIMARY> db.repairDatabase()
{
"errmsg" : "Cannot repair database logs having size: 328891891712 (bytes) because free disk space is: 158043725824 (bytes)",
"ok" : 0
}
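There is not enough disk space. Apart from adding capacity, the other options still have to be tried in practice (to be added later).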
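To see how much space is missing, the two byte counts in the errmsg can be compared directly. The sketch below assumes the message keeps the exact wording shown above; parseRepairError is a hypothetical helper, not a MongoDB function.

```javascript
// Hypothetical helper: pull the two byte counts out of the repairDatabase
// error message and compute how much free space is missing.
// Assumes the message format shown in this post.
function parseRepairError(errmsg) {
  var m = /having size: (\d+) \(bytes\).*free disk space is: (\d+) \(bytes\)/.exec(errmsg);
  if (!m) { return null; }
  var dbSize = Number(m[1]);
  var free = Number(m[2]);
  return {
    dbSize: dbSize,
    free: free,
    // Bytes still needed before free space matches the database size
    shortfall: Math.max(0, dbSize - free)
  };
}

var repairErr = 'Cannot repair database logs having size: 328891891712 (bytes) ' +
  'because free disk space is: 158043725824 (bytes)';
var parsedRepair = parseRepairError(repairErr);
console.log(parsedRepair.shortfall); // prints 170848165888
```

So free space would have to grow by at least about 171 GB before free space matches the database size reported in the message.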
Step two: fix the configuration on mongos. At first I used db.runCommand({removeshard:"unilog-shard01/10.70.175.65:27017"});
mongo@oa-04-va70:/mongodb/mongo/bin> ./mongo admin
MongoDB shell version: 2.0.2
connecting to: admin
> db.runCommand({removeshard:"oa-04-va61"});
{ "ok" : 0, "errmsg" : "unauthorized" }
> db.auth("admin","admin123")
1
mongos> db.runCommand({removeshard:"unilog-shard01/10.70.175.65:27017"});
{
"msg" : "draining started successfully",
"state" : "started",
"shard" : "unilog-shard01",
"ok" : 1
}
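Running the command again returns msg: "draining ongoing", state: "ongoing", which means data is being migrated off the shard. Note that removeshard acts on the whole shard (unilog-shard01), not on a single host, so the shard's draining flag is now true: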
mongos> use config
switched to db config
mongos> db.shards.find()
{ "_id" : "unilog-shard02", "host" : "unilog-shard02/10.70.175.66:27017,10.70.175.67:27017,10.70.175.68:27017" }
{ "_id" : "unilog-shard03", "host" : "unilog-shard03/10.70.175.69:27017,10.70.175.71:27017,10.70.175.70:27017" }
{ "_id" : "unilog-shard01", "draining" : true, "host" : "unilog-shard01/10.70.175.64:27017,10.70.175.63:27017" }
Update this document to clear the draining flag:
mongos> db.shards.update({ "_id" : "unilog-shard01"},{ "_id" : "unilog-shard01", "host" : "unilog-shard01/10.70.175.64:27017,10.70.175.63:27017" })
Then check again:
mongos> db.shards.find()
{ "_id" : "unilog-shard02", "host" : "unilog-shard02/10.70.175.66:27017,10.70.175.67:27017,10.70.175.68:27017" }
{ "_id" : "unilog-shard03", "host" : "unilog-shard03/10.70.175.69:27017,10.70.175.71:27017,10.70.175.70:27017" }
{ "_id" : "unilog-shard01", "host" : "unilog-shard01/10.70.175.64:27017,10.70.175.63:27017" }
Connecting with MongoVUE now succeeds.
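A quick way to confirm that no shard is still marked as draining is to filter the documents from config.shards. This is only a sketch; drainingShards is a hypothetical helper, and the sample arrays just copy the documents shown above. In the mongo shell the array could come from db.shards.find().toArray().

```javascript
// Hypothetical helper: names of shards whose document still has draining: true.
function drainingShards(shards) {
  return shards
    .filter(function (s) { return s.draining === true; })
    .map(function (s) { return s._id; });
}

// Documents as they looked before the update
var shardsBefore = [
  { _id: 'unilog-shard02', host: 'unilog-shard02/10.70.175.66:27017,10.70.175.67:27017,10.70.175.68:27017' },
  { _id: 'unilog-shard03', host: 'unilog-shard03/10.70.175.69:27017,10.70.175.71:27017,10.70.175.70:27017' },
  { _id: 'unilog-shard01', draining: true, host: 'unilog-shard01/10.70.175.64:27017,10.70.175.63:27017' }
];

// Same documents after the update, without the draining field
var shardsAfter = shardsBefore.map(function (s) {
  return { _id: s._id, host: s.host };
});

console.log(drainingShards(shardsBefore)); // prints [ 'unilog-shard01' ]
console.log(drainingShards(shardsAfter));  // prints []
```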
Related mongo commands:
Check the status of a mongod replica set: rs.status();
List the databases: show dbs
Check the stats of a database:
RECOVERING> db.stats()
{
"db" : "logs",
"collections" : 13,
"objects" : 119434683,
"avgObjSize" : 2392.0039992403213,
"dataSize" : 285688239384,
"storageSize" : 297047594944,
"numExtents" : 181,
"indexes" : 24,
"indexSize" : 19946562048,
"fileSize" : 343363551232,
"nsSizeMB" : 16,
"ok" : 1
}
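This was run on mongod. The RECOVERING> prompt shows that the member is still recovering, that is, the replica set is not in sync. If there is enough disk space, try a repair.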
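How far a recovering member is behind can be estimated from the optimeDate fields in the rs.status() output. The following is a minimal sketch with a hypothetical helper, memberLag. The sample uses ISO date strings because plain JavaScript has no ISODate; in the mongo shell the values are already Date objects, which new Date() also accepts.

```javascript
// Hypothetical helper: summarize each member of an rs.status() result,
// with its lag behind the primary in whole days.
function memberLag(status) {
  var primary = status.members.filter(function (m) {
    return m.stateStr === 'PRIMARY';
  })[0];
  return status.members.map(function (m) {
    var lagMs = primary ? new Date(primary.optimeDate) - new Date(m.optimeDate) : null;
    return {
      name: m.name,
      stateStr: m.stateStr,
      lagDays: lagMs === null ? null : Math.floor(lagMs / 86400000),
      errmsg: m.errmsg || null
    };
  });
}

// Values copied from the rs.status() output earlier in this post
var sampleStatus = {
  set: 'unilog-shard01',
  members: [
    { name: '10.70.175.63:27017', stateStr: 'PRIMARY', optimeDate: '2015-06-09T03:28:47Z' },
    { name: '10.70.175.64:27017', stateStr: 'RECOVERING', optimeDate: '2014-06-24T15:15:42Z',
      errmsg: 'error RS102 too stale to catch up' }
  ]
};

var lagReport = memberLag(sampleStatus);
console.log(lagReport[1].lagDays); // prints 349
```

A lag of almost a year fits the "too stale to catch up" message: the secondary's last applied operation is far older than anything an oplog of the configured size is likely to still hold.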
The collection can then be verified on mongos:
mongo@oa-04-va59:~> mongo
MongoDB shell version: 2.0.2
connecting to: test
> use admin
switched to db admin
> db.auth("admin","admin123")
1
PRIMARY> help
PRIMARY> show dbs
PRIMARY> use logs
switched to db logs
PRIMARY> show collections
PRIMARY> db.oplog.findOne()
Delete the lock file:
mongo@oa-04-va63:/mongodb/unilog-shard02-02/data> pwd
/mongodb/unilog-shard02-02/data
mongo@oa-04-va63:/mongodb/unilog-shard02-02/data> rm -r mongod.lock
View the log file:
mongo@oa-04-va63:/mongodb/unilog-shard02-02/logs> pwd
/mongodb/unilog-shard02-02/logs
mongo@oa-04-va63:/mongodb/unilog-shard02-02/logs> tail -f server.log