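Overview
Sharding distributes a large data set across multiple hosts. Each shard is an independent database, and together the shards form a single logical database. Sharding reduces the number of operations each server handles: as the cluster grows, each shard processes a smaller share of the data, so the overall capacity of the system increases. Sharding also reduces the amount of data each server has to store.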
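Components
- Shard: a shard stores data. To provide high availability and data consistency, each shard in a production sharded cluster is usually a replica set.
- Query router: mongos is the interface through which client applications access the shards.
- Config servers: the config servers store the cluster's metadata, which includes the mapping of the cluster's data set to the shards. The query router uses this metadata to direct each operation to the right shard.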
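Data partitioning
- To shard a collection, you specify a shard key. The shard key is either an indexed field or an indexed compound field that exists in every document in the collection. MongoDB divides the shard key values into chunks and distributes the chunks evenly across the shards, using either range-based or hash-based partitioning.
- Range-based partitioning: MongoDB divides the data set into ranges determined by the shard key values (for example 1-9, 10-19, 20-29).
- Hash-based partitioning: MongoDB computes a hash of the shard key field's value and uses these hashes to create the chunks.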
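The difference between the two partitioning schemes can be sketched in plain JavaScript. This is only an illustration: the chunk boundaries, the shard names `shardA`/`shardB`, and the `toyHash` function are assumptions made up for the sketch, not MongoDB's real boundaries or its hashed-index function.

```javascript
// Illustrative only: boundaries, shard names and the hash are made up.

// Range-based: each chunk covers [min, max) of raw shard key values.
const rangeChunks = [
  { min: -Infinity, max: 10, shard: "shardA" },
  { min: 10, max: 20, shard: "shardB" },
  { min: 20, max: Infinity, shard: "shardA" },
];

// Find the chunk whose half-open range contains the value.
function findChunk(chunks, value) {
  return chunks.find((c) => value >= c.min && value < c.max);
}

// Toy 32-bit FNV-1a hash; it stands in for the real hashed index.
function toyHash(value) {
  let h = 0x811c9dc5;
  for (const ch of String(value)) {
    h ^= ch.charCodeAt(0);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h;
}

// Hash-based: chunks cover ranges of the hash, not of the raw value.
const hashChunks = [
  { min: 0, max: 2 ** 31, shard: "shardA" },
  { min: 2 ** 31, max: 2 ** 32, shard: "shardB" },
];

function shardForRange(value) {
  return findChunk(rangeChunks, value).shard;
}

function shardForHash(value) {
  return findChunk(hashChunks, toyHash(value)).shard;
}

console.log(shardForRange(5), shardForRange(15), shardForRange(25));
console.log(shardForHash(5), shardForHash(6), shardForHash(7));
```

With range-based partitioning, neighbouring key values land in the same chunk, which suits range queries; with hash-based partitioning, neighbouring values are usually scattered, which tends to spread monotonically increasing keys more evenly.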
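Shard key
- To distribute the documents of a collection, MongoDB partitions the collection by its shard key. The shard key consists of one or more immutable fields that exist in every document in the target collection.
- You choose the shard key when you shard a collection. <font color=red size=4>Once a collection is sharded, its shard key cannot be changed</font> (this holds for the MongoDB 3.4 release used in these notes). A sharded collection has exactly one shard key. To shard a non-empty collection, the collection must already have an index that starts with the shard key. For an empty collection, MongoDB creates the index if the collection does not yet have a suitable index for the specified shard key.
- The choice of shard key affects the performance, efficiency, and scalability of the sharded cluster. Even a cluster with the best possible hardware can be bottlenecked by a poor shard key. The shard key and its backing index also determine which sharding strategy the cluster can use.
- The shard key determines how a collection's documents are distributed among the shards of the cluster. The shard key fields must be indexed and must not be empty in any document of the collection. The key can be a single field or a compound of several fields.
- MongoDB distributes data across the shards by ranges of shard key values. Each range, also called a chunk, covers a non-overlapping range of shard key values. MongoDB spreads the chunks, together with the documents they contain, across the shards of the cluster. When a chunk grows beyond the maximum chunk size, MongoDB splits it into smaller chunks based on the shard key ranges.
- Syntax for specifying the shard key: sh.shardCollection(namespace, key)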
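The rule that every document must carry a non-empty value for each shard key field can be checked before sharding. The sketch below is plain JavaScript over an in-memory array; the function name `docsMissingShardKey` and the sample documents are hypothetical and only illustrate the rule stated above.

```javascript
// Returns the documents that lack a shard key field or hold null in it.
function docsMissingShardKey(docs, shardKey) {
  const fields = Object.keys(shardKey);
  return docs.filter((doc) =>
    fields.some((f) => doc[f] === undefined || doc[f] === null)
  );
}

// Hypothetical sample data for a compound key on name and age.
const sampleShardKey = { name: 1, age: 1 };
const sampleDocs = [
  { name: "user1", age: 1 },
  { name: "user2" },          // age is missing
  { name: null, age: 3 },     // name is null
];

console.log(docsMissingShardKey(sampleDocs, sampleShardKey).length); // 2
```

Running a check like this first shows which documents would need fixing before the collection is passed to sh.shardCollection(namespace, key).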
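Data balancing
- Splitting is a background process that keeps chunks from growing too large. When a chunk grows beyond the configured chunk size, the splitting process divides it in two. Splitting is efficient because it only changes metadata and does not migrate any data.
- The balancer is a background process that manages chunk migrations. In MongoDB 3.4, the version used in these notes, the balancer runs on the primary of the config server replica set (in earlier versions it ran on a mongos instance). When data is unevenly distributed in the cluster, the balancer migrates chunks from the shard that holds more chunks to the shard that holds fewer, until the shards are balanced.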
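The interplay of splitting and balancing can be sketched as a toy simulation. Everything here is an assumption for illustration: the 64 MB limit, the one-chunk threshold, and the function names `splitOversized`, `countByShard`, and `balance` are not MongoDB internals, and the real balancer uses its own migration thresholds.

```javascript
// Toy model: chunk sizes are in MB; the limit is an assumed value.
const MAX_CHUNK_MB = 64;

// Halve chunks until none exceeds the limit. Chunks stay on their shard,
// mirroring the fact that a split does not migrate data.
function splitOversized(chunks) {
  let out = chunks;
  while (out.some((c) => c.sizeMB > MAX_CHUNK_MB)) {
    out = out.flatMap((c) =>
      c.sizeMB > MAX_CHUNK_MB
        ? [{ ...c, sizeMB: c.sizeMB / 2 }, { ...c, sizeMB: c.sizeMB / 2 }]
        : [c]
    );
  }
  return out;
}

// Count how many chunks each shard currently holds.
function countByShard(chunks, shards) {
  const counts = Object.fromEntries(shards.map((s) => [s, 0]));
  for (const c of chunks) counts[c.shard] += 1;
  return counts;
}

// Move one chunk at a time from the fullest shard to the emptiest one
// until their chunk counts differ by at most `threshold`.
function balance(chunks, shards, threshold = 1) {
  const result = chunks.map((c) => ({ ...c }));
  for (;;) {
    const counts = countByShard(result, shards);
    const sorted = [...shards].sort((a, b) => counts[a] - counts[b]);
    const least = sorted[0];
    const most = sorted[sorted.length - 1];
    if (counts[most] - counts[least] <= threshold) return result;
    result.find((c) => c.shard === most).shard = least;
  }
}

const shardNames = ["shard", "sha"];
let simChunks = [
  { shard: "shard", sizeMB: 100 },
  { shard: "shard", sizeMB: 30 },
  { shard: "shard", sizeMB: 40 },
];
simChunks = splitOversized(simChunks);      // the 100 MB chunk becomes two 50 MB chunks
simChunks = balance(simChunks, shardNames); // two chunks migrate to "sha"
console.log(countByShard(simChunks, shardNames)); // { shard: 2, sha: 2 }
```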
Deploying the sharded cluster
- IP and port plan
172.17.237.33:30000 config1
172.17.237.34:30001 config2
172.17.237.36:30002 config3
172.17.237.37:40004 mongos
172.17.237.38:50000 shard1
172.17.237.39:50001 shard2
172.17.237.40:50002 shard3
172.17.237.41:60000 sha1
172.17.237.42:60001 sha2
172.17.237.43:60002 sha3
Configuring the config server replica set
- Configuration file for config1
[root@My-Dev db1]# vim config1.conf
logpath=/home/mongodb/test/db1/log/db1.log
pidfilepath=/home/mongodb/test/db1/db1.pid
logappend=true
port=30000
fork=true
dbpath=/home/mongodb/test/db1/data
configsvr=true # this option makes the instance a config server
oplogSize=512
replSet=config
- Configuration file for config2
[root@My-Dev db2]# vim config2.conf
logpath=/home/mongodb/test/db2/log/db2.log
pidfilepath=/home/mongodb/test/db2/db2.pid
logappend=true
port=30001
fork=true
dbpath=/home/mongodb/test/db2/data
oplogSize=512
replSet=config
configsvr=true
- Configuration file for config3
[root@My-Dev db3]# vim config3.conf
logpath=/home/mongodb/test/db3/log/db3.log
pidfilepath=/home/mongodb/test/db3/db3.pid
logappend=true
port=30002
fork=true
dbpath=/home/mongodb/test/db3/data
oplogSize=512
replSet=config
configsvr=true
- Start the config servers
[root@My-Dev bin]# ./mongod -f /home/mongodb/test/db1/config1.conf
about to fork child process, waiting until server is ready for connections.
forked process: 5260
child process started successfully, parent exiting
[root@My-Dev bin]# ./mongod -f /home/mongodb/test/db2/config2.conf
about to fork child process, waiting until server is ready for connections.
forked process: 5202
child process started successfully, parent exiting
[root@My-Dev bin]# ./mongod -f /home/mongodb/test/db3/config3.conf
about to fork child process, waiting until server is ready for connections.
forked process: 4260
child process started successfully, parent exiting
- Initialize the config replica set
> use admin
switched to db admin
> config = { _id:"config",members:[ {_id:0,host:"conf1:30000"}, {_id:1,host:"conf2:30001"}, {_id:2,host:"conf3:30002"}] } // define the replica set
{
"_id" : "config",
"members" : [
{
"_id" : 0,
"host" : "conf1:30000"
},
{
"_id" : 1,
"host" : "conf2:30001"
},
{
"_id" : 2,
"host" : "conf3:30002"
}
]
}
> rs.initiate(config) // initialize the replica set
{ "ok" : 1 }
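The replica set document above follows a fixed shape, so it can be generated from a list of hosts. This is a minimal sketch in plain JavaScript; the helper name `buildReplSetConfig` is hypothetical, and the host names are the ones used in these notes.

```javascript
// Build a replica set configuration document from a name and a host list.
// Member _id values are assigned in list order, starting at 0.
function buildReplSetConfig(name, hosts) {
  return {
    _id: name,
    members: hosts.map((host, i) => ({ _id: i, host })),
  };
}

const replConfig = buildReplSetConfig("config", [
  "conf1:30000",
  "conf2:30001",
  "conf3:30002",
]);

console.log(JSON.stringify(replConfig, null, 2));
```

In the mongo shell, the resulting object would then be passed to rs.initiate(), as in the transcript above. The same helper would also produce the documents for the sha and shard replica sets.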
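Configuring mongos
- Create the mongos configuration file
Pitfall: mongos failed to start at first. The cause was that MongoDB 3.4 requires the config servers to be deployed as a replica set (replica set config servers were introduced in 3.2). I was running 3.4, the latest version at the time, so two more config servers had to be added.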
[root@My-Dev db4]# vim mongos.conf
logpath=/home/mongodb/test/db4/log/db4.log
pidfilepath=/home/mongodb/test/db4/db4.pid
logappend=true
port=40004
fork=true
configdb=config/172.17.237.33:30000,172.17.237.34:30001,172.17.237.36:30002 # <config replica set name>/<host:port list>; separate multiple config servers with commas
- Start mongos
[root@My-Dev bin]# ./mongos -f /home/mongodb/test/db4/mongos.conf
about to fork child process, waiting until server is ready for connections.
forked process: 6268
child process started successfully, parent exiting
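The configdb value and the argument to sh.addShard() share the same shape: a replica set name, a slash, then a comma-separated list of host:port pairs. For configdb the name has to match the replSet value of the config servers, which is config in the configuration files above. A small sketch, with hypothetical helper names `buildSeedList` and `parseSeedList`:

```javascript
// Build "<replSetName>/<host1:port>,<host2:port>,..." from its parts.
function buildSeedList(replSetName, hosts) {
  if (!replSetName || hosts.length === 0) {
    throw new Error("a replica set name and at least one host are required");
  }
  return `${replSetName}/${hosts.join(",")}`;
}

// Split such a string back into the replica set name and the host list.
function parseSeedList(seedList) {
  const slash = seedList.indexOf("/");
  if (slash < 0) throw new Error("missing '<replSetName>/' prefix");
  return {
    replSetName: seedList.slice(0, slash),
    hosts: seedList.slice(slash + 1).split(","),
  };
}

const configdbValue = buildSeedList("config", [
  "172.17.237.33:30000",
  "172.17.237.34:30001",
  "172.17.237.36:30002",
]);
console.log(configdbValue);
console.log(parseSeedList(configdbValue).hosts.length);
```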
Deploying the shard2 replica set (sha)
- Configuration files for the sha replica set members
[root@My-Dev db8]# more shard21.conf
logpath=/home/mongodb/test/db8/log/db8.log
pidfilepath=/home/mongodb/test/db8/db8.pid
directoryperdb=true
logappend=true
port=60000
fork=true
dbpath=/home/mongodb/test/db8/data
oplogSize=512
replSet=sha
shardsvr=true
[root@My-Dev db9]# more shard22.conf
logpath=/home/mongodb/test/db9/log/db9.log
pidfilepath=/home/mongodb/test/db9/db9.pid
directoryperdb=true
logappend=true
port=60001
fork=true
dbpath=/home/mongodb/test/db9/data
oplogSize=512
replSet=sha
shardsvr=true
[root@My-Dev db10]# more shard23.conf
logpath=/home/mongodb/test/db10/log/db10.log
pidfilepath=/home/mongodb/test/db10/db10.pid
directoryperdb=true
logappend=true
port=60002
fork=true
dbpath=/home/mongodb/test/db10/data
oplogSize=512
replSet=sha
shardsvr=true
- Start the shard members
[root@My-Dev bin]# ./mongod -f /home/mongodb/test/db8/shard21.conf
[root@My-Dev bin]# ./mongod -f /home/mongodb/test/db9/shard22.conf
[root@My-Dev bin]# ./mongod -f /home/mongodb/test/db10/shard23.conf
- Initialize the sha replica set
> use admin
switched to db admin
> sha = { _id:"sha",members:[ {_id:0,host:"sha1:60000"}, {_id:1,host:"sha2:60001"}, {_id:2,host:"sha3:60002"}]}
{
"_id" : "sha",
"members" : [
{
"_id" : 0,
"host" : "sha1:60000"
},
{
"_id" : 1,
"host" : "sha2:60001"
},
{
"_id" : 2,
"host" : "sha3:60002"
}
]
}
> rs.initiate(sha)
{ "ok" : 1 }
Deploying the shard1 replica set (shard)
- Configuration files for the shard replica set members
[root@My-Dev db5]# vim shard1.conf
logpath=/home/mongodb/test/db5/log/db5.log
pidfilepath=/home/mongodb/test/db5/db5.pid
directoryperdb=true
logappend=true
port=50000
fork=true
dbpath=/home/mongodb/test/db5/data
oplogSize=512
replSet=shard
shardsvr=true
[root@My-Dev db6]# vim shard2.conf
logpath=/home/mongodb/test/db6/log/db6.log
pidfilepath=/home/mongodb/test/db6/db6.pid
directoryperdb=true
logappend=true
port=50001
fork=true
dbpath=/home/mongodb/test/db6/data
oplogSize=512
replSet=shard
shardsvr=true
[root@My-Dev db7]# vim shard3.conf
logpath=/home/mongodb/test/db7/log/db7.log
pidfilepath=/home/mongodb/test/db7/db7.pid
directoryperdb=true
logappend=true
port=50002
fork=true
dbpath=/home/mongodb/test/db7/data
oplogSize=512
replSet=shard
shardsvr=true
- Start the shard members
[root@My-Dev bin]# ./mongod -f /home/mongodb/test/db5/shard1.conf
[root@My-Dev bin]# ./mongod -f /home/mongodb/test/db6/shard2.conf
[root@My-Dev bin]# ./mongod -f /home/mongodb/test/db7/shard3.conf
- Initialize the shard replica set
> use admin
switched to db admin
> shard = { _id:"shard",members:[ {_id:0,host:"shard1:50000"}, {_id:1,host:"shard2:50001"}, {_id:2,host:"shard3:50002"}] }
{
"_id" : "shard",
"members" : [
{
"_id" : 0,
"host" : "shard1:50000"
},
{
"_id" : 1,
"host" : "shard2:50001"
},
{
"_id" : 2,
"host" : "shard3:50002"
}
]
}
> rs.initiate(shard)
{ "ok" : 1 }
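Configuring sharding
- Whether the collection to be sharded already contains data
Each database has a primary shard, which stores that database's unsharded collections. In this walkthrough it is the first shard that was added.
A shard key must be backed by an index. If the collection already contains data, create the index yourself first and then shard the collection. If the collection is empty, you can shard it directly without creating the index beforehand.
- Log in to mongos to configure sharding, then add the shard replica sets to the cluster
[root@My-Dev bin]# ./mongo mongos:40004 # log in to mongos
mongos> sh.status() // check the sharding status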
--- Sharding Status ---
sharding version: {
"_id" : 1,
"minCompatibleVersion" : 5,
"currentVersion" : 6,
"clusterId" : ObjectId("589b0cff36b0915841e2a0a2")
}
shards:
active mongoses:
"3.4.1" : 1
autosplit:
Currently enabled: yes
balancer:
Currently enabled: yes
Currently running: no
Balancer lock taken at Wed Feb 08 2017 20:20:16 GMT+0800 (CST) by ConfigServer:Balancer
Failed balancer rounds in last 5 attempts: 0
Migration Results for the last 24 hours:
No recent migrations
databases:
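- Add the shard replica sets
# First log in to each shard replica set to find out which member is the primary. This experiment uses two shard replica sets. Syntax: sh.addShard("<replSetName>/<primary host>:<port>")
mongos> sh.addShard("shard/shard1:50000")
{ "shardAdded" : "shard", "ok" : 1 }
mongos> sh.addShard("sha/sha1:60000")
{ "shardAdded" : "sha", "ok" : 1 }
mongos> sh.status() // confirm that both shards have been added to the cluster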
--- Sharding Status ---
sharding version: {
"_id" : 1,
"minCompatibleVersion" : 5,
"currentVersion" : 6,
"clusterId" : ObjectId("589b0cff36b0915841e2a0a2")
}
shards:
{ "_id" : "sha", "host" : "sha/sha1:60000,sha2:60001,sha3:60002", "state" : 1 }
{ "_id" : "shard", "host" : "shard/shard1:50000,shard2:50001,shard3:50002", "state" : 1 }
active mongoses:
"3.4.1" : 1
autosplit:
Currently enabled: yes
balancer:
Currently enabled: yes
Currently running: no
Balancer lock taken at Wed Feb 08 2017 20:20:16 GMT+0800 (CST) by ConfigServer:Balancer
Failed balancer rounds in last 5 attempts: 5
Last reported error: Cannot accept sharding commands if not started with --shardsvr
Time of Reported error: Thu Feb 09 2017 17:42:21 GMT+0800 (CST)
Migration Results for the last 24 hours:
No recent migrations
databa
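- Enable sharding on a database and create the shard key
mongos> sh.enableSharding("zhao") // enable sharding on the zhao database
{ "ok" : 1 }
mongos> sh.shardCollection("zhao.call",{name:1,age:1}) // shard the call collection in zhao on an ascending compound key of name and age
{ "collectionsharded" : "zhao.call", "ok" : 1 }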
{ "collectionsharded" : "zhao.call", "ok" : 1 }
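With a compound shard key such as { name: 1, age: 1 }, chunk ranges are ordered by name first and by age only when the names are equal. The sketch below illustrates that ordering in plain JavaScript. It assumes same-typed values and simple character-by-character string comparison; `compareCompound` and the sample documents are hypothetical.

```javascript
// Compare two documents field by field, in shard key order.
function compareCompound(a, b, fields) {
  for (const f of fields) {
    if (a[f] < b[f]) return -1;
    if (a[f] > b[f]) return 1;
  }
  return 0;
}

const keyFields = ["name", "age"];
const people = [
  { name: "user2", age: 2 },
  { name: "user10", age: 10 },
  { name: "user1", age: 1 },
];

people.sort((a, b) => compareCompound(a, b, keyFields));
console.log(people.map((d) => d.name)); // [ 'user1', 'user10', 'user2' ]
```

Because names are compared as strings, "user10" sorts before "user2", so documents with numerically adjacent suffixes do not necessarily end up in adjacent key ranges.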
- Check the sh.status() output
mongos> sh.status()
--- Sharding Status ---
sharding version: {
"_id" : 1,
"minCompatibleVersion" : 5,
"currentVersion" : 6,
"clusterId" : ObjectId("589b0cff36b0915841e2a0a2")
}
shards:
{ "_id" : "sha", "host" : "sha/sha1:60000,sha2:60001,sha3:60002", "state" : 1 }
{ "_id" : "shard", "host" : "shard/shard1:50000,shard2:50001,shard3:50002", "state" : 1 }
active mongoses:
"3.4.1" : 1
autosplit:
Currently enabled: yes
balancer:
Currently enabled: yes
Currently running: no
Balancer lock taken at Wed Feb 08 2017 20:20:16 GMT+0800 (CST) by ConfigServer:Balancer
Failed balancer rounds in last 5 attempts: 5
Last reported error: Cannot accept sharding commands if not started with --shardsvr
Time of Reported error: Thu Feb 09 2017 17:56:02 GMT+0800 (CST)
Migration Results for the last 24 hours:
No recent migrations
databases:
{ "_id" : "zhao", "primary" : "shard", "partitioned" : true }
zhao.call
shard key: { "name" : 1, "age" : 1 }
unique: false
balancing: true
chunks:
shard 1
{ "name" : { "$minKey" : 1 }, "age" : { "$minKey" : 1 } } -->> { "name" : { "$maxKey" : 1 }, "age" : { "$maxKey" : 1 } } on : shard Timestamp(1, 0)
- Insert test data in bulk to verify sharding
mongos> for ( var i=1;i<10000000;i++){db.call.insert({"name":"user"+i,age:i})};
- Check whether the data has now been distributed across both shards
mongos> sh.status()
--- Sharding Status ---
sharding version: {
"_id" : 1,
"minCompatibleVersion" : 5,
"currentVersion" : 6,
"clusterId" : ObjectId("589b0cff36b0915841e2a0a2")
}
shards:
{ "_id" : "sha", "host" : "sha/sha1:60000,sha2:60001,sha3:60002", "state" : 1 }
{ "_id" : "shard", "host" : "shard/shard1:50000,shard2:50001,shard3:50002", "state" : 1 }
active mongoses:
"3.4.1" : 1
autosplit:
Currently enabled: yes
balancer:
Currently enabled: yes
Currently running: no
Balancer lock taken at Wed Feb 08 2017 20:20:16 GMT+0800 (CST) by ConfigServer:Balancer
Failed balancer rounds in last 5 attempts: 5
Last reported error: Cannot accept sharding commands if not started with --shardsvr
Time of Reported error: Thu Feb 09 2017 17:56:02 GMT+0800 (CST)
Migration Results for the last 24 hours:
4 : Success
databases:
{ "_id" : "zhao", "primary" : "shard", "partitioned" : true }
zhao.call
shard key: { "name" : 1, "age" : 1 }
unique: false
balancing: true
chunks: # the data is now split into chunks spread across both shards
sha 4
shard 5
{ "name" : { "$minKey" : 1 }, "age" : { "$minKey" : 1 } } -->> { "name" : "user1", "age" : 1 } on : sha Timestamp(4, 1)
{ "name" : "user1", "age" : 1 } -->> { "name" : "user1", "age" : 21 } on : shard Timestamp(5, 1)
{ "name" : "user1", "age" : 21 } -->> { "name" : "user1", "age" : 164503 } on : shard Timestamp(2, 2)
{ "name" : "user1", "age" : 164503 } -->> { "name" : "user1", "age" : 355309 } on : shard Timestamp(2, 3)
{ "name" : "user1", "age" : 355309 } -->> { "name" : "user1", "age" : 523081 } on : sha Timestamp(3, 2)
{ "name" : "user1", "age" : 523081 } -->> { "name" : "user1", "age" : 710594 } on : sha Timestamp(3, 3)
{ "name" : "user1", "age" : 710594 } -->> { "name" : "user1", "age" : 875076 } on : shard Timestamp(4, 2)
{ "name" : "user1", "age" : 875076 } -->> { "name" : "user1", "age" : 1056645 } on : shard Timestamp(4, 3)
{ "name" : "user1", "age" : 1056645 } -->> { "name" : { "$maxKey" : 1 }, "age" : { "$maxKey" : 1 } } on : sha Timestamp(5, 0)
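- Check the balancer state: sh.getBalancerState() returns true when the balancer is enabled and false when it is disabled. It does not report whether the chunks are evenly distributed; for that, compare the per-shard chunk counts in sh.status().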
mongos> sh.getBalancerState()
true
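Choosing a shard key
Points to consider
- Where should the data be stored?
- Where should the data be read from?
- The shard key should be the primary key
- The shard key should, as far as possible, let queries avoid being broadcast to every shard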
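The last point can be made concrete. mongos can route a query to specific shards when the query constrains the shard key or a leading prefix of a compound shard key; otherwise it has to broadcast the query to all shards. The sketch below is a simplification in plain JavaScript that only looks at which top-level fields a query mentions; `targetedPrefixLength` and `canTarget` are hypothetical helper names.

```javascript
// Count how many leading shard key fields the query constrains.
function targetedPrefixLength(shardKey, query) {
  let n = 0;
  for (const field of Object.keys(shardKey)) {
    if (!Object.prototype.hasOwnProperty.call(query, field)) break;
    n += 1;
  }
  return n;
}

// A query is targetable when it constrains at least the first key field.
function canTarget(shardKey, query) {
  return targetedPrefixLength(shardKey, query) > 0;
}

const compoundKey = { name: 1, age: 1 };
console.log(canTarget(compoundKey, { name: "user1" }));          // true
console.log(canTarget(compoundKey, { name: "user1", age: 21 })); // true
console.log(canTarget(compoundKey, { age: 21 }));                // false
```

A query on age alone skips the leading name field, so under this model it would be sent to every shard.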
Advanced sharding
- When data is not evenly distributed across the shards
- Sharding: adding and removing shards
mongos> sh.addShard("sha4/192.168.2.10:21001")
Balancer
- Start the balancer
Once the balancer is running, chunks are automatically redistributed evenly across the shards.
mongos> sh.startBalancer()
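- Set the balancer's active window
By default the balancer can run at any time. To reduce its impact on the system, you can set an active window so that the balancer only runs within the specified time range.
# Set the active window (the settings collection lives in the config database)
mongos> use config
mongos> db.settings.update({ _id : "balancer" }, { $set : { activeWindow : { start : "23:00", stop : "6:00" } } }, true )
- View the balancer's active window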
mongos> db.settings.find();
{ "_id" : "balancer", "activeWindow" : { "start" : "23:00", "stop" : "6:00" }, "stopped" : false }
mongos> sh.getBalancerWindow()
{ "start" : "23:00", "stop" : "6:00" }
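The window above runs from 23:00 to 6:00 and therefore wraps past midnight. A minimal sketch of how such a window can be evaluated, in plain JavaScript: the helper names `toMinutes` and `inActiveWindow` are hypothetical, and treating the start as inclusive and the stop as exclusive is an assumption of this sketch, not documented balancer behaviour.

```javascript
// Convert "HH:MM" (or "H:MM") to minutes since midnight.
function toMinutes(hhmm) {
  const [h, m] = hhmm.split(":").map(Number);
  return h * 60 + m;
}

// True when `now` falls inside the window; handles windows that wrap midnight.
function inActiveWindow(activeWindow, now) {
  const start = toMinutes(activeWindow.start);
  const stop = toMinutes(activeWindow.stop);
  const t = toMinutes(now);
  return start <= stop
    ? t >= start && t < stop   // same-day window, e.g. 01:00 to 05:00
    : t >= start || t < stop;  // wrapping window, e.g. 23:00 to 6:00
}

const balancerWindow = { start: "23:00", stop: "6:00" };
console.log(inActiveWindow(balancerWindow, "23:30")); // true
console.log(inActiveWindow(balancerWindow, "02:00")); // true
console.log(inActiveWindow(balancerWindow, "12:00")); // false
```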
- Remove the balancer's active window
mongos> db.settings.update({ "_id" : "balancer" }, { $unset : { activeWindow : 1 }});
mongos> db.settings.find();
{ "_id" : "chunksize", "value" : 10 }
{ "_id" : "balancer", "stopped" : false }
Running MongoDB commands from a shell script
[root@My-Dev ~]# echo -e "use zhao \n db.call.find()" |mongo --port 60001
Adding a shard key in MongoDB
- First switch to the admin database on mongos
mongos> use admin
switched to db admin
mongos> db.runCommand({"enablesharding":"zl"}) // enable sharding on the zl database
{ "ok" : 1 }
mongos> db.runCommand({"shardcollection":"zl.t_srvappraise_back","key":{"sa_seid":"hashed"}}) // same command as in the script below, applied to the zl database
- Sharding script
#!/bin/bash
url=10.241.96.155
port=30000
ent=test1
./mongo $url:$port/admin <<EOF
db.runCommand({"enablesharding":"$ent"});
db.runCommand({"shardcollection":"$ent.t_srvappraise_back","key":{"sa_seid":"hashed"}})
exit;
EOF
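Original URL: https://www.jianshu.com/p/cb55bb333e2d
This is someone else's write-up that I have not verified yet. I am simply keeping it as notes for my own reference.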