MongoDB集群包括一定数量的mongod(分片存储数据)、mongos(路由处理)、config server(配置节点)、clients(客户端)、arbiter(仲裁节点:为了选举某个分片存储数据节点那台为主节点)。
1、 shards:一个shard为一组mongod,通常一组为两台,主从或互为主从,这一组mongod中的数据时相同的,具体可见《mongodb分布式之数据复制》。数据分割按有序分割方式,每个分片上的数据为某一范围的数据块,故可支持指定分片的范围查询,这同google的BigTable 类似。数据块有指定的最大容量,一旦某个数据块的容量增长到最大容量时,这个数据块会切分成为两块;当分片的数据过多时,数据块将被迁移到系统的其他分片中。另外,新的分片加入时,数据块也会迁移。
2、 mongos:可以有多个,相当于一个控制中心,负责路由和协调操作,使得集群像一个整体的系统。mongos可以运行在任何一台服务器上,有些选择放在 shards服务器上,也有放在client 服务器上的。mongos启动时需要从config servers上获取基本信息,然后接受client 端的请求,路由到shards服务器上,然后整理返回的结果发回给client服务器。
3、config server:存储集群的信息,包括分片和块数据信息。主要存储块数据信息,每个config server上都有一份所有块数据信息的拷贝,以保证每台config server上的数据的一致性。
4、shard key:为了分割数据集,需要制定分片key的格式,类似于用于索引的key格式,通常由一个或多个字段组成以分发数据,比如:
{ name : 1 }
{ _id : 1 }
{ lastname : 1, firstname : 1 }
{ tag : 1, timestamp : -1 }
mongoDB的分片为有序存储(1为升序,-1为降序),shard key相邻的数据通常会存在同一台服务(数据块)上。
环境
CentOS6.4_64
网络配置
192.168.10.151 os-1
192.168.10.152 os-2
192.168.10.153 os-3
192.168.10.154 os-4
192.168.10.155 os-5
192.168.10.156 os-6
192.168.10.167 os-14
192.168.10.166 os-13
架构和资源分配
192.168.10.151||=========client 不集群的参与任何角色
================================================
192.168.10.152|| || 1.1
192.168.10.153||__mongos(集群的控制系统,路由处理) || 1.2
192.168.10.154|| || 1.3
192.168.10.155|| || 2.1 ==nongod(分片存储节点,每个节点担任两种角色,这些节点也可以分布在不同的节点上任意安排)
192.168.10.156 || || 2.2
192.168.10.167 ||===config(集群配置节点) || 2.3
192.168.10.166 || || 2.4
开始安装
cd /usr/src/ && wget http://fastdl.mongodb.org/linux/mongodb-linux-x86_64-2.4.3.tgz && tar -zxvf mongodb-linux-x86_64-2.4.3.tgz && PATH=/usr/lib64/qt-3.3/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin:/usr/local/mfs/bin:/usr/local/mfs/sbin:/usr/local/mfs/bin:/usr/local/mfs/sbin:/usr/lpp/mmfs/bin:/usr/lpp/mmfs/bin:/root/bin:/usr/src/mongodb-linux-x86_64-2.4.3/bin && echo "PATH=/usr/lib64/qt-3.3/bin:/usr/local/sbin:/usr/local/bin:/sbin:/bin:/usr/sbin:/usr/bin:/root/bin:/usr/local/mfs/bin:/usr/local/mfs/sbin:/usr/local/mfs/bin:/usr/local/mfs/sbin:/usr/lpp/mmfs/bin:/usr/lpp/mmfs/bin:/root/bin:/usr/src/mongodb-linux-x86_64-2.4.3/bin" >> /etc/profile && mkdir -p /home/mongodb/data/ && mkdir -p /home/mongodb/log && mkdir -p /home/mongodb/config && mkdir -p /home/mongodb/arbiter
单节点配置参数说明
mongod --fork --bind_ip 127.0.0.1 --port 11811 --dbpath /data0/mongodb/data --directoryperdb --logpath /data0/mongodb/log/db1.log --logappend --nohttpinterface
netstat -ntlp|grep mongod
参数说明:
–logpath 日志文件路径
–master 指定为主机器
–slave 指定为从机器
–source 指定主机器的IP地址
–pologSize 指定日志文件大小不超过64M.因为resync是非常操作量大且耗时,最好通过设置一个足够大的oplogSize来避免resync(默认的 oplog大小是空闲磁盘大小的5%)。
–logappend 日志文件末尾添加
–port 启用端口号
–fork 在后台运行
–only 指定只复制哪一个数据库
–slavedelay 指从复制检测的时间间隔
–auth 是否需要验证权限登录(用户名和密码)
–noauth 不需要验证权限登录(用户名和密码)
###################################################################
每台机器 mongod --shardsvr --replSet NGD2 --fork --port 11813 --maxConns 20000 --dbpath /home/mongodb/data --directoryperdb --logpath /home/mongodb/log/db.log --logappend --nohttpinterface
mongo --port 11813
config={_id:"NGD2",members:[ {_id:0,host:"192.168.10.152:11813"}, {_id:1,host:"192.168.10.153:11813"}, {_id:2,host:"192.168.10.154:11813"}, {_id:3,host:"192.168.10.154:11813"}, {_id:4,host:"192.168.10.155:11813"}, {_id:5,host:"192.168.10.156:11813"}, {_id:6,host:"192.168.10.166:11813"}, {_id:7,host:"192.168.10.167:11813"}] }
rs.initiate(config)
rs.status()
###################################################################
配置config
mongod --configsvr --dbpath /home/mongodb/config --port 20000 --logpath /home/mongodb/log/config1.log --logappend --fork
###################################################################
配置mongos
mongos --configdb 192.168.10.156:20000,192.168.10.166:20000,192.168.10.167:20000 --port 30000 --chunkSize 1 --logpath /home/mongodb/log/mongos.log --logappend --fork
###################################################################
配置shard cluster
[root@os-3 home]# mongo --port 30000
MongoDB shell version: 2.4.3
connecting to: 127.0.0.1:30000/test
mongos>db.runCommand({addshard:“shard1/192.168.10.152:11813,192.168.10.153:11813,,192.168.10.154:11813,192.168.10.155:11813,192.168.10.156:11813,192.168.10.166:11813,192.168.10.167:11813”});
mongos> db.runCommand({listshards:1}) #查看shard
mongos> db.runCommand({enablesharding:"test"}); #激活数据库分片
mongos> printShardingStatus() #查看shard信息
mongos> for (var i=1;i<=500000;i++) db.data.save ({_id:i,value:”www.ttlsa.com”}) #大量插入数据
Mongodb自带了Web控制台,默认和数据服务一同开启。他的端口在Mongodb数据库服务器端口的基础上加1000,此处位21000,
############################################################################################
原文章收集
MongoDB的auto-sharding功能是指mongodb通过mongos自动建立一个水平扩展的数据库集群系统,将数据库分表存储在sharding的各个节点上。
通过把Sharding和Replica Sets相结合,可以搭建一个分布式的,高可用性,自动水平扩展的集群。
要构建MongoDB Sharding Cluster,需要三种角色:
Shard Server: mongod 实例, 使用 Replica Sets,确保每个数据节点都具有备份、自动容错转移、自动恢复能力。用于存储实际的数据块,实际生产环境中一个shard server角色可由几台机器组个一个relica set承担,防止主机单点故障
Config Server: mongod 实例,使用 3 个配置服务器,确保元数据完整性(two-phase commit)。存储了整个 Cluster Metadata,其中包括 chunk 信息。
Route Server: mongos 实例,配合 LVS,实现负载平衡,提高接入性能(high performance)。前端路由,客户端由此接入,且让整个集群看上去像单一数据库,前端应用可以透明使用。
环境如下:
192.168.198.131
shard1:10001
shard2:10002
shard3:10003
config1:20000
192.168.198.129
shard1:10001
shard2:10002
shard3:10003
config2:20000
192.168.198.132
shard1:10001
shard2:10002
shard3:10003
config3:20000
192.168.198.133
mongos:30000
分别在三台服务器上安装mongod服务,安装如下:
# wget http://fastdl.mongodb.org/linux/mongodb-linux-x86_64-2.0.3.tgz
# tar zxvf mongodb-linux-x86_64-2.0.3.tgz -C ../software/
# ln -s mongodb-linux-x86_64-2.0.3 /usr/local/mongodb
# useradd mongodb
# mkdir -p /data/mongodb/shard1
# mkdir -p /data/mongodb/shard2
# mkdir -p /data/mongodb/shard3
# mkdir -p /data/mongodb/config1
配置shard1的replica set
192.168.198.131
# cd /usr/local/mongodb/bin
# ./mongod –shardsvr –replSet shard1 –port 10001 –dbpath /data/mongodb/shard1 –oplogSize 100 –logpath /data/mongodb/shard1/shard1.log –logappend –fork
192.168.198.129
# ./mongod –shardsvr –replSet shard1 –port 10001 –dbpath /data/mongodb/shard1 –oplogSize 100 –logpath /data/mongodb/shard1/shard1.log –logappend –fork
192.168.198.132
# ./mongod –shardsvr –replSet shard1 –port 10001 –dbpath /data/mongodb/shard1 –oplogSize 100 –logpath /data/mongodb/shard1/shard1.log –logappend –fork
连接到192.168.198.131
# ./mongo –port 10001
> config={_id:”shard1″,members:[
... {_id:0,host:"192.168.198.131:10001"},
... {_id:1,host:"192.168.198.129:10001"},
... {_id:2,host:"192.168.198.132:10001"}]
… }
> rs.initiate(config)
{
“info” : “Config now saved locally. Should come online in about a minute.”,
“ok” : 1
}
PRIMARY> rs.status()
{
“set” : “shard1″,
“date” : ISODate(“2012-03-02T02:37:55Z”),
“myState” : 1,
“members” : [
{
"_id" : 0,
"name" : "192.168.198.131:10001",
"health" : 1,
"state" : 1,
"stateStr" : "PRIMARY",
"optime" : {
"t" : 1330655827000,
"i" : 1
},
"optimeDate" : ISODate("2012-03-02T02:37:07Z"),
"self" : true
},
{
"_id" : 1,
"name" : "192.168.198.129:10001",
"health" : 1,
"state" : 2,
"stateStr" : "SECONDARY",
"uptime" : 36,
"optime" : {
"t" : 1330655827000,
"i" : 1
},
"optimeDate" : ISODate("2012-03-02T02:37:07Z"),
"lastHeartbeat" : ISODate("2012-03-02T02:37:53Z"),
"pingMs" : 0
},
{
"_id" : 2,
"name" : "192.168.198.132:10001",
"health" : 1,
"state" : 2,
"stateStr" : "SECONDARY",
"uptime" : 36,
"optime" : {
"t" : 1330655827000,
"i" : 1
},
"optimeDate" : ISODate("2012-03-02T02:37:07Z"),
"lastHeartbeat" : ISODate("2012-03-02T02:37:53Z"),
"pingMs" : 466553
}
],
“ok” : 1
}
配置shard2的replica set
192.168.198.129
# ./mongod –shardsvr –replSet shard2 –port 10002 –dbpath /data/mongodb/shard2 –oplogSize 100 –logpath /data/mongodb/shard2/shard2.log –logappend –fork
192.168.198.131
# ./mongod –shardsvr –replSet shard2 –port 10002 –dbpath /data/mongodb/shard2 –oplogSize 100 –logpath /data/mongodb/shard2/shard2.log –logappend –fork
192.168.198.132
# ./mongod –shardsvr –replSet shard2 –port 10002 –dbpath /data/mongodb/shard2 –oplogSize 100 –logpath /data/mongodb/shard2/shard2.log –logappend –fork
连接到192.168.198.129
# ./mongo –port 10002
> config={_id:”shard2″,members:[
... {_id:0,host:"192.168.198.129:10002"},
... {_id:1,host:"192.168.198.131:10002"},
... {_id:2,host:"192.168.198.132:10002"}]
… }
{
“_id” : “shard2″,
“members” : [
{
"_id" : 0,
"host" : "192.168.198.129:10002"
},
{
"_id" : 1,
"host" : "192.168.198.131:10002"
},
{
"_id" : 2,
"host" : "192.168.198.132:10002"
}
]
}
> rs.initiate(config)
{
“info” : “Config now saved locally. Should come online in about a minute.”,
“ok” : 1
}
> rs.status()
{
“set” : “shard2″,
“date” : ISODate(“2012-03-02T02:53:17Z”),
“myState” : 1,
“members” : [
{
"_id" : 0,
"name" : "192.168.198.129:10002",
"health" : 1,
"state" : 1,
"stateStr" : "PRIMARY",
"optime" : {
"t" : 1330656717000,
"i" : 1
},
"optimeDate" : ISODate("2012-03-02T02:51:57Z"),
"self" : true
},
{
"_id" : 1,
"name" : "192.168.198.131:10002",
"health" : 1,
"state" : 2,
"stateStr" : "SECONDARY",
"uptime" : 73,
"optime" : {
"t" : 1330656717000,
"i" : 1
},
"optimeDate" : ISODate("2012-03-02T02:51:57Z"),
"lastHeartbeat" : ISODate("2012-03-02T02:53:17Z"),
"pingMs" : 1
},
{
"_id" : 2,
"name" : "192.168.198.132:10002",
"health" : 1,
"state" : 2,
"stateStr" : "SECONDARY",
"uptime" : 73,
"optime" : {
"t" : 1330656717000,
"i" : 1
},
"optimeDate" : ISODate("2012-03-02T02:51:57Z"),
"lastHeartbeat" : ISODate("2012-03-02T02:53:17Z"),
"pingMs" : 209906
}
],
“ok” : 1
}
配置shard3的replica set
192.168.198.132
# ./mongod –shardsvr –replSet shard3 –port 10003 –dbpath /data/mongodb/shard3 –oplogSize 100 –logpath /data/mongodb/shard3/shard3.log –logappend –fork
192.168.198.129
# ./mongod –shardsvr –replSet shard3 –port 10003 –dbpath /data/mongodb/shard3 –oplogSize 100 –logpath /data/mongodb/shard3/shard3.log –logappend –fork
192.168.198.131
# ./mongod –shardsvr –replSet shard3 –port 10003 –dbpath /data/mongodb/shard3 –oplogSize 100 –logpath /data/mongodb/shard3/shard3.log –logappend –fork
连接到192.168.198.132
# ./mongo –port 10003
> config={_id:”shard3″,members:[
... {_id:0,host:"192.168.198.132:10003"},
... {_id:1,host:"192.168.198.131:10003"},
... {_id:2,host:"192.168.198.129:10003"}]
… }
{
“_id” : “shard3″,
“members” : [
{
"_id" : 0,
"host" : "192.168.198.132:10003"
},
{
"_id" : 1,
"host" : "192.168.198.131:10003"
},
{
"_id" : 2,
"host" : "192.168.198.129:10003"
}
]
}
>rs.initiate(config)
{
“info” : “Config now saved locally. Should come online in about a minute.”,
“ok” : 1
}
> rs.status()
{
“set” : “shard3″,
“date” : ISODate(“2012-03-02T03:04:52Z”),
“myState” : 1,
“members” : [
{
"_id" : 0,
"name" : "192.168.198.132:10003",
"health" : 1,
"state" : 1,
"stateStr" : "PRIMARY",
"optime" : {
"t" : 1330657451000,
"i" : 1
},
"optimeDate" : ISODate("2012-03-02T03:04:11Z"),
"self" : true
},
{
"_id" : 1,
"name" : "192.168.198.131:10003",
"health" : 1,
"state" : 2,
"stateStr" : "SECONDARY",
"uptime" : 39,
"optime" : {
"t" : 1330657451000,
"i" : 1
},
"optimeDate" : ISODate("2012-03-02T03:04:11Z"),
"lastHeartbeat" : ISODate("2012-03-02T03:04:52Z"),
"pingMs" :
},
{
"_id" : 2,
"name" : "192.168.198.129:10003",
"health" : 1,
"state" : 2,
"stateStr" : "SECONDARY",
"uptime" : 39,
"optime" : {
"t" : 1330657451000,
"i" : 1
},
"optimeDate" : ISODate("2012-03-02T03:04:11Z"),
"lastHeartbeat" : ISODate("2012-03-02T03:04:52Z"),
"pingMs" : 0
}
],
“ok” : 1
}
配置config
192.168.198.131
# ./mongod –configsvr –dbpath /data/mongodb/config1 –port 20000 –logpath /data/mongodb/config1/config1.log –logappend –fork
192.168.198.129
# ./mongod –configsvr –dbpath /data/mongodb/config2 –port 20000 –logpath /data/mongodb/config2/config2.log –logappend –fork
192.168.198.132
# ./mongod –configsvr –dbpath /data/mongodb/config3 –port 20000 –logpath /data/mongodb/config3/config3.log –logappend –fork
配置mongos
# ./mongos –configdb 192.168.198.131:20000,192.168.198.129:20000,192.168.198.132:20000 –port 30000 –chunkSize 1 –logpath /data/mongodb/mongos.log –logappend –fork
配置shard cluster
# ./mongo –port 30000
mongos> use admin
switched to db admin
加入shards
mongos> db.runCommand({addshard:”shard1/192.168.198.131:10001,192.168.198.129:10001,192.168.198.132:10001″});
{ “shardAdded” : “shard1″, “ok” : 1 }
mongos> db.runCommand({addshard:”shard2/192.168.198.131:10002,192.168.198.129:10002,192.168.198.132:10002″});
{ “shardAdded” : “shard2″, “ok” : 1 }
mongos> db.runCommand({addshard:”shard3/192.168.198.131:10003,192.168.198.129:10003,192.168.198.132:10003″});
{ “shardAdded” : “shard3″, “ok” : 1 }
列出shards
mongos> db.runCommand({listshards:1})
{
“shards” : [
{
"_id" : "shard1",
"host" : "shard1/192.168.198.129:10001,192.168.198.131:10001,192.168.198.132:10001"
},
{
"_id" : "shard2",
"host" : "shard2/192.168.198.129:10002,192.168.198.131:10002,192.168.198.132:10002"
},
{
"_id" : "shard3",
"host" : "shard3/192.168.198.129:10003,192.168.198.131:10003,192.168.198.132:10003"
}
],
“ok” : 1
}
激活数据库分片
mongos> db.runCommand({enablesharding:”test”});
{ “ok” : 1 }
通过以上命令,可以将数据库test跨shard,如果不执行,数据库只会存放在一个shard,一旦激活数据库分片,数据库中的不同的 collection将被存放在不同的shard上,但一个collection仍旧存放在同一个shard上,要使collection也分片,需对 collection做些其他操作。
collection分片
mongos> db.runCommand({shardcollection:”test.data”,key:{_id:1}})
{ “collectionsharded” : “test.data”, “ok” : 1 }
分片的collection只能有一个在分片key上的唯一索引,其他唯一索引不被允许。
查看shard信息
mongos> printShardingStatus()
— harding Status —
sharding version: { “_id” : 1, “version” : 3 }
shards:
{ “_id” : “shard1″, “host” : “shard1/192.168.198.129:10001,192.168.198.131:10001,192.168.198.132:10001″ }
{ “_id” : “shard2″, “host” : “shard2/192.168.198.129:10002,192.168.198.131:10002,192.168.198.132:10002″ }
{ “_id” : “shard3″, “host” : “shard3/192.168.198.129:10003,192.168.198.131:10003,192.168.198.132:10003″ }
databases:
{ “_id” : “admin”, “partitioned” : false, “primary” : “config” }
{ “_id” : “test”, “partitioned” : true, “primary” : “shard1″ }
test.data chunks:
shard1 1
{ “_id” : { $minKey : 1 } } –>> { “_id” : { $maxKey : 1 } } on : shard1 { “t” : 1000, “i” : 0 }
mongos> use test
switched to db test
mongos> db.data.stats()
{
“sharded” : true,
“flags” : 1,
“ns” : “test.data”,
“count” : 0,
“numExtents” : 1,
“size” : 0,
“storageSize” : 8192,
“totalIndexSize” : 8176,
“indexSizes” : {
“_id_” : 8176
},
“avgObjSize” : 0,
“nindexes” : 1,
“nchunks” : 1,
“shards” : {
“shard1″ : {
“ns” : “test.data”,
“count” : 0,
“size” : 0,
“storageSize” : 8192,
“numExtents” : 1,
“nindexes” : 1,
“lastExtentSize” : 8192,
“paddingFactor” : 1,
“flags” : 1,
“totalIndexSize” : 8176,
“indexSizes” : {
“_id_” : 8176
},
“ok” : 1
}
},
“ok” : 1
}
测试:插入大量数据
mongos> for (var i=1;i<=500000;i++) db.data.save ({_id:i,value:”www.ttlsa.com”})
mongos> printShardingStatus()
— Sharding Status —
sharding version: { “_id” : 1, “version” : 3 }
shards:
{ “_id” : “shard1″, “host” : “shard1/192.168.198.129:10001,192.168.198.131:10001,192.168.198.132:10001″ }
{ “_id” : “shard2″, “host” : “shard2/192.168.198.129:10002,192.168.198.131:10002,192.168.198.132:10002″ }
{ “_id” : “shard3″, “host” : “shard3/192.168.198.129:10003,192.168.198.131:10003,192.168.198.132:10003″ }
databases:
{ “_id” : “admin”, “partitioned” : false, “primary” : “config” }
{ “_id” : “test”, “partitioned” : true, “primary” : “shard1″ }
test.data chunks:
shard1 6
shard2 5
shard3 11
too many chunks to print, use verbose if you want to force print
mongos> db.data.stats()
{
“sharded” : true,
“flags” : 1,
“ns” : “test.data”,
“count” : 500000,
“numExtents” : 19,
“size” : 22000084,
“storageSize” : 43614208,
“totalIndexSize” : 14062720,
“indexSizes” : {
“_id_” : 14062720
},
“avgObjSize” : 44.000168,
“nindexes” : 1,
“nchunks” : 22,
“shards” : {
“shard1″ : {
“ns” : “test.data”,
“count” : 112982,
“size” : 4971232,
“avgObjSize” : 44.00021242321786,
“storageSize” : 11182080,
“numExtents” : 6,
“nindexes” : 1,
“lastExtentSize” : 8388608,
“paddingFactor” : 1,
“flags” : 1,
“totalIndexSize” : 3172288,
“indexSizes” : {
“_id_” : 3172288
},
“ok” : 1
},
“shard2″ : {
“ns” : “test.data”,
“count” : 124978,
“size” : 5499056,
“avgObjSize” : 44.00019203379795,
“storageSize” : 11182080,
“numExtents” : 6,
“nindexes” : 1,
“lastExtentSize” : 8388608,
“paddingFactor” : 1,
“flags” : 1,
“totalIndexSize” : 3499328,
“indexSizes” : {
“_id_” : 3499328
},
“ok” : 1
},
“shard3″ : {
“ns” : “test.data”,
“count” : 262040,
“size” : 11529796,
“avgObjSize” : 44.000137383605555,
“storageSize” : 21250048,
“numExtents” : 7,
“nindexes” : 1,
“lastExtentSize” : 10067968,
“paddingFactor” : 1,
“flags” : 1,
“totalIndexSize” : 7391104,
“indexSizes” : {
“_id_” : 7391104
},
“ok” : 1
}
},
“ok” : 1
}
#########################
本文属笔者原创
作者:john
转载请注明出处