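The current deployment has three mongos instances, i.e. the mongo router servers.
There are three groups of mongod, each a three-machine replica set with one primary and two secondaries.
When this was first set up, the member IPs of each replica set were hard-coded in server.conf, so any change to a set meant editing the configuration on all three of its machines: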
shardsvr=true
replSet=unilog-shard03/10.70.175.69:27017,10.70.175.70:27017,10.70.175.71:27017
port=27017
oplogSize=512
logappend=true
fork=true
auth=true
When the shards were configured on mongos:
db.runCommand({addshard:"unilog-shard01/10.70.175.63:27017,10.70.175.64:27017,10.70.175.65:27017"});
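The addshard argument has the form setName/host:port,host:port,... As a rough illustration of that layout, here is a minimal sketch in plain JavaScript (written for Node.js, not for the 2.0.2 shell). The helpers parseShardString and withoutHost are hypothetical names introduced only for this example and are not part of MongoDB.

```javascript
// Hypothetical helpers (not part of MongoDB): split and rebuild a
// "<setName>/<host:port>,<host:port>,..." seed string.
function parseShardString(s) {
  const slash = s.indexOf('/');
  if (slash < 0) throw new Error('expected "<setName>/<hosts>"');
  return {
    setName: s.slice(0, slash),
    hosts: s.slice(slash + 1).split(',').filter(h => h.length > 0),
  };
}

// Return the same seed string with one host left out.
function withoutHost(s, host) {
  const p = parseShardString(s);
  return p.setName + '/' + p.hosts.filter(h => h !== host).join(',');
}

const seed = 'unilog-shard01/10.70.175.63:27017,10.70.175.64:27017,10.70.175.65:27017';
const trimmed = withoutHost(seed, '10.70.175.65:27017');
console.log(trimmed);
```

This only manipulates the string; it does not talk to mongos or change any cluster state.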
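Recently, as part of a machine capacity expansion, host 65 was removed, and mongos could no longer connect.
Checking the port with netstat -an | grep 27017 showed that
attempts to connect to 65 stayed stuck in a waiting state.
The server.conf file on the mongod machines needs to be adjusted so that replSet holds only the set name: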
shardsvr=true
replSet=unilog-shard02
port=27017
oplogSize=512
logappend=true
fork=true
auth=true
Press ESC and type :wq! to save and exit, then restart mongod.
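To double-check that a server.conf no longer carries a hard-coded host list, one option is a small script along these lines. It is a sketch in plain JavaScript for Node.js; parseConf and replSetHasSeedList are hypothetical helper names, and the two sample texts are abbreviated copies of the configs shown above.

```javascript
// Hypothetical helper: parse simple key=value lines into an object.
function parseConf(text) {
  const conf = {};
  for (const raw of text.split('\n')) {
    const line = raw.trim();
    if (line === '' || line.startsWith('#')) continue; // skip blanks and comments
    const eq = line.indexOf('=');
    if (eq < 0) continue; // not a key=value line
    conf[line.slice(0, eq).trim()] = line.slice(eq + 1).trim();
  }
  return conf;
}

// True when replSet still has the "<name>/<hosts>" form with hard-coded members.
function replSetHasSeedList(conf) {
  return typeof conf.replSet === 'string' && conf.replSet.includes('/');
}

const oldConf = [
  'shardsvr=true',
  'replSet=unilog-shard03/10.70.175.69:27017,10.70.175.70:27017,10.70.175.71:27017',
  'port=27017',
].join('\n');

const newConf = [
  'shardsvr=true',
  'replSet=unilog-shard02',
  'port=27017',
].join('\n');

console.log(replSetHasSeedList(parseConf(oldConf)));
console.log(replSetHasSeedList(parseConf(newConf)));
```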
Then, on the primary (for example, if 62 is the primary), run the following commands:
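// Each member needs a distinct _id; the members array must be closed with ].
var cfg = { _id: 'unilog-shard02', members: [ {_id:0, host:'10.70.175.66:27017', priority:2}, {_id:1, host:'10.70.175.67:27017', priority:1}, {_id:2, host:'10.70.175.68:27017'} ] };
var s = rs.status(); s;
var m = s.members;
// Reconfigure an existing set; otherwise initiate a new one.
if (m && m.length > 0) { rs.reconfig(cfg); } else { rs.initiate(cfg); }
rs.slaveOk(); rs.status()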
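Member _id values have to be unique within a replica set config, and a hand-typed literal makes it easy to repeat one. The sketch below builds the config from a host list so the ids cannot collide. It is plain JavaScript for Node.js (the 2.0.2 shell predates this syntax), and buildReplSetConfig and duplicateMemberIds are hypothetical helpers, not MongoDB functions.

```javascript
// Hypothetical helper: build a replica-set config document from a host list,
// giving every member a distinct _id (its index in the list).
function buildReplSetConfig(setName, hosts, priorities) {
  const members = hosts.map((host, i) => {
    const m = { _id: i, host: host };
    if (priorities && priorities[i] !== undefined) m.priority = priorities[i];
    return m;
  });
  return { _id: setName, members: members };
}

// Return the member _id values that appear more than once in a config.
function duplicateMemberIds(cfg) {
  const seen = new Set();
  const dups = new Set();
  for (const m of cfg.members) {
    if (seen.has(m._id)) dups.add(m._id);
    seen.add(m._id);
  }
  return Array.from(dups);
}

const cfg = buildReplSetConfig(
  'unilog-shard02',
  ['10.70.175.66:27017', '10.70.175.67:27017', '10.70.175.68:27017'],
  [2, 1] // priorities for the first two members only
);
console.log(JSON.stringify(cfg));
console.log(duplicateMemberIds(cfg));
```

The resulting object has the same shape as the cfg passed to rs.reconfig or rs.initiate; printing it first gives a chance to review it before pasting it into the shell.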
This requires logging in to the mongo shell as the admin user:
mongo@oa-04-va59:~> mongo
MongoDB shell version: 2.0.2
connecting to: test
> use admin
switched to db admin
> db.auth("admin","admin123")
1
PRIMARY> var cfg={ _id:'shard0', members:[ {_id:0,host:'10.70.175.63:27017',priority:2}, {_id:1,host:'10.70.175.63:27017',priority:1} }; \
Tue Jun 9 11:28:46 SyntaxError: missing ] after element list (shell):1
PRIMARY> var s= rs.status(); s;\
Tue Jun 9 11:28:46 SyntaxError: illegal character (shell):1
PRIMARY> var m=s.members; \
Tue Jun 9 11:28:46 SyntaxError: illegal character (shell):1
PRIMARY> if(m&&m.length>0){ rs.reconfig(cfg); }else{ rs.initiate(cfg); } \
Tue Jun 9 11:28:46 SyntaxError: illegal character (shell):1
PRIMARY> rs.slaveOk();rs.status()
{
"set" : "unilog-shard01",
"date" : ISODate("2015-06-09T03:28:48Z"),
"myState" : 1,
"members" : [
{
"_id" : 0,
"name" : "10.70.175.63:27017",
"health" : 1,
"state" : 1,
"stateStr" : "PRIMARY",
"optime" : {
"t" : 1433820527000,
"i" : 2
},
"optimeDate" : ISODate("2015-06-09T03:28:47Z"),
"self" : true
},
{
"_id" : 3,
"name" : "10.70.175.64:27017",
"health" : 1,
"state" : 3,
"stateStr" : "RECOVERING",
"uptime" : 2097,
"optime" : {
"t" : 1403622942000,
"i" : 1
},
"optimeDate" : ISODate("2014-06-24T15:15:42Z"),
"lastHeartbeat" : ISODate("2015-06-09T03:28:47Z"),
"pingMs" : 0,
"errmsg" : "error RS102 too stale to catch up"
}
],
"ok" : 1
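}
The line "errmsg" : "error RS102 too stale to catch up" means that member has fallen too far behind the primary to catch up from the oplog, so the machines are not in sync. The SyntaxError lines come from the missing ] in the pasted cfg literal and from the trailing backslashes, so neither rs.reconfig nor rs.initiate actually ran. Running db.repairDatabase() then produced: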
PRIMARY> db.repairDatabase()
{
"errmsg" : "Cannot repair database logs having size: 328891891712 (bytes) because free disk space is: 158043725824 (bytes)",
"ok" : 0
}
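There is not enough disk space. Apart from adding capacity, other options have yet to be tried in practice (to be added later).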
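To put the two numbers from the error message side by side, here is a small arithmetic sketch in plain JavaScript for Node.js. The byte counts are copied from the message; the variable names and the toGiB helper are made up for this example. The database comes to roughly 306 GiB against about 147 GiB free.

```javascript
// Figures copied from the db.repairDatabase() error message.
const dbSizeBytes = 328891891712;
const freeBytes = 158043725824;

const GIB = 1024 * 1024 * 1024;

// Convert a byte count to GiB.
function toGiB(bytes) {
  return bytes / GIB;
}

// How much more free space would be needed just to match the database size.
const shortfallBytes = dbSizeBytes - freeBytes;

console.log('database size: ' + toGiB(dbSizeBytes).toFixed(1) + ' GiB');
console.log('free space:    ' + toGiB(freeBytes).toFixed(1) + ' GiB');
console.log('shortfall:     ' + toGiB(shortfallBytes).toFixed(1) + ' GiB');
```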
Step two is to fix the configuration on mongos. At first I used db.runCommand({removeshard:"unilog-shard01/10.70.175.65:27017"});
mongo@oa-04-va70:/mongodb/mongo/bin> ./mongo admin
MongoDB shell version: 2.0.2
connecting to: admin
> db.runCommand({removeshard:"oa-04-va61"});
{ "ok" : 0, "errmsg" : "unauthorized" }
> db.auth("admin","admin123")
1
mongos> db.runCommand({removeshard:"unilog-shard01/10.70.175.65:27017"});
{
"msg" : "draining started successfully",
"state" : "started",
"shard" : "unilog-shard01",
"ok" : 1
}
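Running it again returns msg: "draining ongoing", state: "ongoing", which means data is being migrated off the shard: removeshard drains the whole shard rather than dropping a single host. The shard's draining flag is now true: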
mongos> use config
switched to db config
mongos> db.shards.find()
{ "_id" : "unilog-shard02", "host" : "unilog-shard02/10.70.175.66:27017,10.70.175.67:27017,10.70.175.68:27017" }
{ "_id" : "unilog-shard03", "host" : "unilog-shard03/10.70.175.69:27017,10.70.175.71:27017,10.70.175.70:27017" }
{ "_id" : "unilog-shard01", "draining" : true, "host" : "unilog-shard01/10.70.175.64:27017,10.70.175.63:27017" }
Update the shard document to clear this state:
mongos> db.shards.update({ "_id" : "unilog-shard01"},{ "_id" : "unilog-shard01", "host" : "unilog-shard01/10.70.175.64:27017,10.70.175.63:27017" })
Checking again afterwards:
mongos> db.shards.find()
{ "_id" : "unilog-shard02", "host" : "unilog-shard02/10.70.175.66:27017,10.70.175.67:27017,10.70.175.68:27017" }
{ "_id" : "unilog-shard03", "host" : "unilog-shard03/10.70.175.69:27017,10.70.175.71:27017,10.70.175.70:27017" }
{ "_id" : "unilog-shard01", "host" : "unilog-shard01/10.70.175.64:27017,10.70.175.63:27017" }
Connecting with MongoVUE now succeeds.
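The update above replaces the whole shard document with a copy that has no draining field. The same effect can be mimicked on plain objects, as in this sketch in plain JavaScript for Node.js. The shards array is copied from the db.shards.find() output, and drainingShards and clearDraining are hypothetical helpers that never touch the config database.

```javascript
// Shard documents copied from the db.shards.find() output taken while draining.
const shards = [
  { _id: 'unilog-shard02', host: 'unilog-shard02/10.70.175.66:27017,10.70.175.67:27017,10.70.175.68:27017' },
  { _id: 'unilog-shard03', host: 'unilog-shard03/10.70.175.69:27017,10.70.175.71:27017,10.70.175.70:27017' },
  { _id: 'unilog-shard01', draining: true, host: 'unilog-shard01/10.70.175.64:27017,10.70.175.63:27017' },
];

// Hypothetical helper: list the _id of every shard flagged as draining.
function drainingShards(docs) {
  return docs.filter(d => d.draining === true).map(d => d._id);
}

// Hypothetical helper: return a copy of a shard document without the draining flag.
function clearDraining(doc) {
  const copy = Object.assign({}, doc);
  delete copy.draining;
  return copy;
}

console.log(drainingShards(shards));
console.log(JSON.stringify(clearDraining(shards[2])));
```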
Useful mongo commands:
Check the status of a mongod replica set: rs.status();
List the databases: show dbs
Check the status of a particular database:
RECOVERING> db.stats()
{
"db" : "logs",
"collections" : 13,
"objects" : 119434683,
"avgObjSize" : 2392.0039992403213,
"dataSize" : 285688239384,
"storageSize" : 297047594944,
"numExtents" : 181,
"indexes" : 24,
"indexSize" : 19946562048,
"fileSize" : 343363551232,
"nsSizeMB" : 16,
"ok" : 1
}
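This was run on a mongod whose prompt reads
RECOVERING>
The member is still in the recovering state, i.e. the replica set is not in sync. If there is enough disk space, try a repair.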
The collection can be verified on mongos:
mongo@oa-04-va59:~> mongo
MongoDB shell version: 2.0.2
connecting to: test
> use admin
switched to db admin
> db.auth("admin","admin123")
1
PRIMARY> help
PRIMARY> show dbs
PRIMARY> use logs
switched to db logs
PRIMARY> show collections
PRIMARY> db.oplog.findOne()
Delete the lock file:
mongo@oa-04-va63:/mongodb/unilog-shard02-02/data> pwd
/mongodb/unilog-shard02-02/data
mongo@oa-04-va63:/mongodb/unilog-shard02-02/data> rm -r mongod.lock
View the log file:
mongo@oa-04-va63:/mongodb/unilog-shard02-02/logs> pwd
/mongodb/unilog-shard02-02/logs
mongo@oa-04-va63:/mongodb/unilog-shard02-02/logs> tail -f server.log