NoSQL: MongoDB Sharded Cluster (Part 2)

熬夜泡枸杞

Last modified: 2022-08-23 01:22:57

First published: 2022-08-08 00:55:28

Article link: https://blog.csdn.net/weixin_46818279/article/details/126220006

7. MongoDB Sharding Cluster

MongoDB Sharding Cluster (MSC) is one of the earliest widely used distributed sharding solutions.

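[Figure: sharded cluster architecture diagram]
How the diagram works:
(1) mongos receives an insert request from the App Server and asks the Config Servers for the shard metadata (each shard is itself a highly available replica set). That metadata tells mongos which sharding strategy applies and which shard should store the document. mongos then forwards the insert to that shard.
(2) An analogy: the App Server request is a guest, mongos is the receptionist, the Config Servers are the lobby manager, and a shard is a room. A guest arrives, the receptionist asks the lobby manager which room is free, the manager answers "Shard1", and the receptionist takes the guest to Shard1.
(3) The cluster has a balancer that spreads chunks evenly across the shards (chunk placement can also be configured by hand). Each shard stores its data in chunks. A chunk is a contiguous range of shard-key values and is 64 MB by default (128 MB from MongoDB 6.0), so in effect a shard's data is carved into 64 MB units.
(4) Advantages of a MongoDB sharded cluster: several sharding strategies (automatic or manually configured), centrally managed configuration metadata, and flexible addition and removal of nodes. When a node is added or removed, MongoDB rebalances the chunks automatically, whereas Redis needs manual intervention.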
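
The routing step in (1) can be pictured as a lookup in a table of chunk ranges. The following is only a minimal sketch in plain JavaScript, not how mongos is implemented: the chunk table, its split points, and the function name `routeToShard` are all hypothetical and chosen for illustration.

```javascript
// Hypothetical chunk table, similar in spirit to what the config servers
// hold for a collection range-sharded on "id". Each chunk covers [min, max).
const chunks = [
  { min: -Infinity, max: 1000, shard: "shard1" },
  { min: 1000, max: 5000, shard: "shard2" },
  { min: 5000, max: Infinity, shard: "shard1" },
];

// Return the shard that owns the chunk containing the given shard-key value.
function routeToShard(chunkTable, key) {
  const chunk = chunkTable.find((c) => key >= c.min && key < c.max);
  if (!chunk) {
    throw new Error("no chunk covers key " + key);
  }
  return chunk.shard;
}

console.log(routeToShard(chunks, 42));    // shard1
console.log(routeToShard(chunks, 1000));  // shard2
console.log(routeToShard(chunks, 99999)); // shard1
```

Because every key falls into exactly one range, the router only needs the chunk table to decide where a document goes. The data itself never passes through the config servers.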

7.1 Cluster planning

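10 instances: 38017-38026
(1) config server: 38018-38020
	A 3-member replica set (1 primary, 2 secondaries; arbiters are not supported), replica set name configReplSet
(2) shard nodes:
	sh1: 38021-38023 (1 primary, 1 secondary, 1 arbiter; replica set name sh1)
	sh2: 38024-38026 (1 primary, 1 secondary, 1 arbiter; replica set name sh2)
(3) mongos: 38017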

7.2 Configuring the shard nodes

7.2.1 Create the directories

mkdir -p /mongodb/38021/conf  /mongodb/38021/log  /mongodb/38021/data
mkdir -p /mongodb/38022/conf  /mongodb/38022/log  /mongodb/38022/data
mkdir -p /mongodb/38023/conf  /mongodb/38023/log  /mongodb/38023/data
mkdir -p /mongodb/38024/conf  /mongodb/38024/log  /mongodb/38024/data
mkdir -p /mongodb/38025/conf  /mongodb/38025/log  /mongodb/38025/data
mkdir -p /mongodb/38026/conf  /mongodb/38026/log  /mongodb/38026/data

7.2.2 Write the configuration files

# First replica set: 38021-38023 (1 primary, 1 secondary, 1 arbiter)
cat >  /mongodb/38021/conf/mongodb.conf  <<EOF
systemLog:
  destination: file
  path: /mongodb/38021/log/mongodb.log   
  logAppend: true
storage:
  journal:
    enabled: true
  dbPath: /mongodb/38021/data
  directoryPerDB: true
  #engine: wiredTiger
  wiredTiger:
    engineConfig:
      cacheSizeGB: 1
      directoryForIndexes: true
    collectionConfig:
      blockCompressor: zlib
    indexConfig:
      prefixCompression: true
net:
  bindIp: 10.0.0.61,127.0.0.1
  port: 38021
replication:
  oplogSizeMB: 2048
  replSetName: sh1
sharding:
  clusterRole: shardsvr
processManagement: 
  fork: true
EOF


\cp  /mongodb/38021/conf/mongodb.conf  /mongodb/38022/conf/
\cp  /mongodb/38021/conf/mongodb.conf  /mongodb/38023/conf/

sed 's#38021#38022#g' /mongodb/38022/conf/mongodb.conf -i
sed 's#38021#38023#g' /mongodb/38023/conf/mongodb.conf -i

# Second replica set: 38024-38026 (1 primary, 1 secondary, 1 arbiter)
cat > /mongodb/38024/conf/mongodb.conf <<EOF
systemLog:
  destination: file
  path: /mongodb/38024/log/mongodb.log   
  logAppend: true
storage:
  journal:
    enabled: true
  dbPath: /mongodb/38024/data
  directoryPerDB: true
  wiredTiger:
    engineConfig:
      cacheSizeGB: 1
      directoryForIndexes: true
    collectionConfig:
      blockCompressor: zlib
    indexConfig:
      prefixCompression: true
net:
  bindIp: 10.0.0.61,127.0.0.1
  port: 38024
replication:
  oplogSizeMB: 2048
  replSetName: sh2
sharding:
  clusterRole: shardsvr
processManagement: 
  fork: true
EOF

\cp  /mongodb/38024/conf/mongodb.conf  /mongodb/38025/conf/
\cp  /mongodb/38024/conf/mongodb.conf  /mongodb/38026/conf/

sed 's#38024#38025#g' /mongodb/38025/conf/mongodb.conf -i
sed 's#38024#38026#g' /mongodb/38026/conf/mongodb.conf -i

7.2.3 Start all nodes and initiate the replica sets

mongod -f  /mongodb/38021/conf/mongodb.conf 
mongod -f  /mongodb/38022/conf/mongodb.conf 
mongod -f  /mongodb/38023/conf/mongodb.conf 
mongod -f  /mongodb/38024/conf/mongodb.conf 
mongod -f  /mongodb/38025/conf/mongodb.conf 
mongod -f  /mongodb/38026/conf/mongodb.conf 
ps -ef | grep mongod

mongo --port 38021
use  admin
config = {_id: 'sh1', members: [
                          {_id: 0, host: '10.0.0.61:38021'},
                          {_id: 1, host: '10.0.0.61:38022'},
                          {_id: 2, host: '10.0.0.61:38023',"arbiterOnly":true}]
           }

rs.initiate(config)


mongo --port 38024 
use admin
config = {_id: 'sh2', members: [
                          {_id: 0, host: '10.0.0.61:38024'},
                          {_id: 1, host: '10.0.0.61:38025'},
                          {_id: 2, host: '10.0.0.61:38026',"arbiterOnly":true}]
           }
  
rs.initiate(config)
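
The two configuration documents above differ only in the replica set name and the ports, so they could be generated rather than typed by hand. This is a small sketch under that assumption; `buildReplSetConfig` is a hypothetical helper, not part of the mongo shell, and the result would still be passed to rs.initiate() manually.

```javascript
// Build a replica-set configuration document of the shape used above.
// dataPorts: ports of the data-bearing members.
// arbiterPort: optional port of an arbiter member.
function buildReplSetConfig(name, host, dataPorts, arbiterPort) {
  const members = dataPorts.map((port, i) => ({ _id: i, host: host + ":" + port }));
  if (arbiterPort !== undefined) {
    members.push({
      _id: members.length,
      host: host + ":" + arbiterPort,
      arbiterOnly: true,
    });
  }
  return { _id: name, members: members };
}

const sh1 = buildReplSetConfig("sh1", "10.0.0.61", [38021, 38022], 38023);
const sh2 = buildReplSetConfig("sh2", "10.0.0.61", [38024, 38025], 38026);
console.log(JSON.stringify(sh1, null, 2));
console.log(JSON.stringify(sh2, null, 2));
```

Leaving out the last argument yields a set without an arbiter, which is the shape the config server replica set needs.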

7.3 Configuring the config server nodes

7.3.1 Create the directories

# config server:38018-38020

mkdir -p /mongodb/38018/conf  /mongodb/38018/log  /mongodb/38018/data
mkdir -p /mongodb/38019/conf  /mongodb/38019/log  /mongodb/38019/data
mkdir -p /mongodb/38020/conf  /mongodb/38020/log  /mongodb/38020/data

7.3.2 Write the configuration files

cat > /mongodb/38018/conf/mongodb.conf <<EOF
systemLog:
  destination: file
  path: /mongodb/38018/log/mongodb.log
  logAppend: true
storage:
  journal:
    enabled: true
  dbPath: /mongodb/38018/data
  directoryPerDB: true
  #engine: wiredTiger
  wiredTiger:
    engineConfig:
      cacheSizeGB: 1
      directoryForIndexes: true
    collectionConfig:
      blockCompressor: zlib
    indexConfig:
      prefixCompression: true
net:
  bindIp: 10.0.0.61,127.0.0.1
  port: 38018
replication:
  oplogSizeMB: 2048
  replSetName: configReplSet
sharding:
  clusterRole: configsvr
processManagement: 
  fork: true
EOF

\cp /mongodb/38018/conf/mongodb.conf /mongodb/38019/conf/
\cp /mongodb/38018/conf/mongodb.conf /mongodb/38020/conf/

sed 's#38018#38019#g' /mongodb/38019/conf/mongodb.conf -i
sed 's#38018#38020#g' /mongodb/38020/conf/mongodb.conf -i

7.3.3 Start the nodes and initiate the replica set

mongod -f /mongodb/38018/conf/mongodb.conf 
mongod -f /mongodb/38019/conf/mongodb.conf 
mongod -f /mongodb/38020/conf/mongodb.conf 

mongo --port 38018
use  admin
config = {_id: 'configReplSet', members: [
                          {_id: 0, host: '10.0.0.61:38018'},
                          {_id: 1, host: '10.0.0.61:38019'},
                          {_id: 2, host: '10.0.0.61:38020'}]
           }
           
rs.initiate(config)

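Notes:
(1) In older versions the config server could be a single node, although a replica set was officially recommended. Newer versions require a replica set.
(2) From MongoDB 3.4 on, the config servers must form a replica set, and that replica set cannot contain an arbiter.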

7.4 Configuring the mongos node

7.4.1 Create the directories

mkdir -p /mongodb/38017/conf  /mongodb/38017/log

7.4.2 Write the configuration file

cat > /mongodb/38017/conf/mongos.conf <<EOF
systemLog:
  destination: file
  path: /mongodb/38017/log/mongos.log
  logAppend: true
net:
  bindIp: 10.0.0.61,127.0.0.1
  port: 38017
sharding:
  configDB: configReplSet/10.0.0.61:38018,10.0.0.61:38019,10.0.0.61:38020
processManagement: 
  fork: true
EOF

7.4.3 Start mongos

mongos -f /mongodb/38017/conf/mongos.conf

7.5 Adding shards to the cluster

Connect to one of the mongos instances (10.0.0.61) and run the following.

(1) Connect to the admin database through mongos
su - mongod
$ mongo 10.0.0.61:38017/admin

(2) Add the shards
db.runCommand( { addshard : "sh1/10.0.0.61:38021,10.0.0.61:38022,10.0.0.61:38023",name:"shard1"} )
db.runCommand( { addshard : "sh2/10.0.0.61:38024,10.0.0.61:38025,10.0.0.61:38026",name:"shard2"} )

(3) List the shards
mongos> db.runCommand( { listshards : 1 } )

(4) View the overall status
mongos> sh.status();

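At this point the sharded cluster is usable, but there is a drawback: chunks are managed automatically by default. If Shard1 holds 6 chunks and Shard2 holds none, 3 chunks are migrated from Shard1 to Shard2. Migrations cost disk I/O and network I/O, and over a long period they can hurt performance noticeably. The next section therefore covers the two sharding strategies, range (for example, with 100 MB of data each shard ends up with about 50 MB) and hash, configured by hand so that the data is spread evenly from the start. Hash usually gives a more even spread than range.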
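
The 6-versus-0 example above can be replayed with a toy model. This is a deliberately simplified sketch, assuming the balancer moves one chunk at a time from the fullest shard to the emptiest until the counts differ by at most one. The real balancer uses its own thresholds and scheduling, and the function name `balance` is hypothetical.

```javascript
// Move one chunk at a time from the fullest shard to the emptiest shard
// until the chunk counts differ by at most one.
// Mutates chunkCounts and returns the number of migrations performed.
function balance(chunkCounts) {
  let migrations = 0;
  for (;;) {
    const names = Object.keys(chunkCounts);
    const fullest = names.reduce((a, b) => (chunkCounts[b] > chunkCounts[a] ? b : a));
    const emptiest = names.reduce((a, b) => (chunkCounts[b] < chunkCounts[a] ? b : a));
    if (chunkCounts[fullest] - chunkCounts[emptiest] <= 1) {
      return migrations;
    }
    chunkCounts[fullest] -= 1;
    chunkCounts[emptiest] += 1;
    migrations += 1;
  }
}

const counts = { shard1: 6, shard2: 0 };
const moved = balance(counts);
// shard1 and shard2 end with 3 chunks each, after 3 migrations
console.log(counts, moved);
```

Each migration in this model stands for a real chunk copy over the network, which is why spreading data evenly up front is cheaper than letting the balancer fix a skew afterwards.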

7.6 Using the sharded cluster

7.6.1 Range sharding: configuration and test

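# Manually shard the large collection vast in the test database
1. Enable sharding for the database
mongo --port 38017 admin
use admin
db.runCommand( { enablesharding : "test" } )
General form:
admin> db.runCommand( { enablesharding : "<database name>" } )

2. Shard the collection on a chosen shard key
# Create the index
use test
> db.vast.createIndex( { id: 1 } )

# Enable sharding on the collection
use admin
> db.runCommand( { shardcollection : "test.vast",key : {id: 1} } )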


3. Insert test data to verify sharding
use test
for(i=1;i<1000000;i++){ db.vast.insert({"id":i,"name":"shenzheng","age":70,"date":new Date()}); }
db.vast.stats()


4. Check the result on each shard
shard1:
mongo --port 38021
use test
db.vast.count();

shard2:
mongo --port 38024
use test
db.vast.count();

7.6.2 Hash sharding: configuration and test

# Hash-shard the large collection vast in the oldboy database, using a hashed index
(1) Enable sharding for oldboy
mongo --port 38017 admin
use admin
db.runCommand( { enablesharding : "oldboy" } )

(2) Create a hashed index on oldboy.vast
use oldboy
db.vast.createIndex( { id: "hashed" } )

(3) Shard the collection
use admin
sh.shardCollection( "oldboy.vast", { id: "hashed" } )

(4) Insert roughly 100,000 documents as a test
use oldboy
for(i=1;i<100000;i++){ db.vast.insert({"id":i,"name":"shenzheng","age":70,"date":new Date()}); }

(5) Check the hash sharding result on each shard
mongo --port 38021
use oldboy
db.vast.count();

mongo --port 38024
use oldboy
db.vast.count();

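Summary:

MongoDB spreads a collection's data across the shards, so at least two shards (replica sets) are needed.
hash:  two chunks are created on each shard up front, and writes are spread across them
range: a single chunk is created on one shard first; only once it has filled up are chunks created on the other shard.
       After inserting 1,000,000 documents, a count taken during the test may exceed 1,000,000.
       Chunks split when they fill up and are then migrated between shards; while a migration is
       in flight, the same documents can briefly be counted on both shards. This is normal: the
       leftovers are cleaned up automatically and the count returns to the true value. It is one
       drawback of range sharding, which is also slower.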
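
The difference between the two strategies shows up most clearly with a steadily increasing key such as the `id` used in the tests above. The sketch below is only an illustration: the split point 500000 is hypothetical, and the multiplicative hash is a stand-in chosen for brevity, not the hash function MongoDB uses for hashed indexes.

```javascript
// Range sharding sketch: two chunks split at id 500000 (hypothetical split point).
function rangeShard(id) {
  return id < 500000 ? "shard1" : "shard2";
}

// Hashed sharding sketch: a multiplicative hash stands in for MongoDB's
// real hash function; the top bit of the 32-bit hash picks the shard.
function hashedShard(id) {
  const h = Math.imul(id, 2654435761) >>> 0;
  return (h >>> 31) === 0 ? "shard1" : "shard2";
}

// Count which shard receives each of the first n sequential inserts.
function firstWrites(shardFn, n) {
  const counts = { shard1: 0, shard2: 0 };
  for (let id = 1; id <= n; id++) {
    counts[shardFn(id)] += 1;
  }
  return counts;
}

const rangeCounts = firstWrites(rangeShard, 1000);
const hashedCounts = firstWrites(hashedShard, 1000);
console.log("range :", rangeCounts);
console.log("hashed:", hashedCounts);
```

With the range function, every one of the first 1000 inserts lands on shard1, which matches the "fill one shard first" behaviour described in the summary. With the hashed function, consecutive ids scatter across both shards from the first write onward.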

7.7 Querying and managing the sharded cluster

7.7.1 Check whether this is a sharded cluster

admin> db.runCommand({ isdbgrid : 1})

7.7.2 List all shards

admin> db.runCommand({ listshards : 1})

7.7.3 List the databases with sharding enabled

admin> use config
config> db.databases.find( { "partitioned": true } )
Or:
config> db.databases.find()   // list the sharding status of all databases

7.7.4 View the shard key

# Check whether a collection uses range or hashed sharding
config> db.collections.find().pretty()
{
    "_id" : "test.vast",
    "lastmodEpoch" : ObjectId("58a599f19c898bbfb818b63c"),
    "lastmod" : ISODate("1970-02-19T17:02:47.296Z"),
    "dropped" : false,
    "key" : {
        "id" : 1
    },
    "unique" : false
}

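7.7.5 View detailed sharding information

# This single command covers all of the queries above
admin> sh.status()
(1) Shard node information
  shards:
  	 # 10.0.0.61:38023 and 10.0.0.61:38026 are arbiters, so they are not listed here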
     {  "_id" : "shard1",  "host" : "sh1/10.0.0.61:38021,10.0.0.61:38022",  "state" : 1 }
     {  "_id" : "shard2",  "host" : "sh2/10.0.0.61:38024,10.0.0.61:38025",  "state" : 1 }
(2) Balancer status
 balancer:
     Currently enabled:  yes
     Currently running:  no
     Failed balancer rounds in last 5 attempts:  0
     Migration Results for the last 24 hours: 
            2 : Success
(3) Databases and collections with sharding enabled
databases:
# config is a built-in system database; it can be ignored
{  "_id" : "config",  "primary" : "config",  "partitioned" : true }

# oldboy database, vast collection
{  "_id" : "oldboy",  "primary" : "shard1",  "partitioned" : true,  "version" : {  "uuid" : UUID("909c68cb-d986-4900-9815-d18a01ecf6c1"),  "lastMod" : 1 } }
oldboy.vast
# Uses the hashed sharding strategy
shard key: { "id" : "hashed" }
balancing: true
chunks:
shard1	2
shard2	2

# test database, vast collection
{  "_id" : "test",  "primary" : "shard2",  "partitioned" : true,  "version" : {  "uuid" : UUID("26644a79-e406-47c9-9df8-8f9158799eec"),  "lastMod" : 1 } }
test.vast
# Uses range sharding
shard key: { "id" : 1 }
balancing: true
chunks:
shard1	2
shard2	2

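7.7.6 Remove a shard

(1) Check whether the balancer is enabled
sh.getBalancerState()
(2) Remove shard2 (use with caution)
mongos> db.runCommand( { removeShard: "shard2" } )
Note: removing a shard immediately triggers the balancer, which drains its chunks to the remaining shards.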

7.7.7 Add a shard

(1)  Build a new replica set for shard3 (called sh3 here; follow the same steps as above)
(2)  Add the shard
db.runCommand( { addshard : "sh3/10.0.0.61:38027,10.0.0.61:38028,10.0.0.61:38029",name:"shard3"} )

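7.8 Managing the balancer

7.8.1 About the balancer

Overview
The balancer is a background process of the sharded cluster (driven by mongos in older releases; since MongoDB 3.4 it runs on the primary of the config server replica set). It automatically inspects how chunks are distributed across all shards and migrates chunks as needed.
When does it run?
1. Automatically; it tries to migrate when the system is not busy
2. Immediately when a shard is being removed
3. Only inside a preset time window, if one is configured (for example, the balancer can be restricted to the early morning hours)

Avoid the balancer window while backing up the cluster.
The balancer can be stopped and started when needed (for example, around a backup):
mongos> sh.stopBalancer()
mongos> sh.startBalancer()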

7.8.2 Scheduling the balancing window

Reference:
https://docs.mongodb.com/manual/tutorial/manage-sharded-cluster-balancer/#schedule-the-balancing-window

# Connect to mongos
mongo --port 38017 admin
use config
sh.setBalancerState( true )
# Restrict the balancer to 3:00-5:00 in the early morning
db.settings.update({ _id : "balancer" }, { $set : { activeWindow : { start : "3:00", stop : "5:00" } } }, true )

# Show the balancer's time window
sh.getBalancerWindow()
sh.status()
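
To reason about when a window such as "3:00" to "5:00" is open, for instance when choosing a backup time that avoids it, a small check like the one below can help. It is a sketch in plain JavaScript and not the balancer's own logic: the function names are hypothetical, and treating the window as start-inclusive and stop-exclusive is an assumption.

```javascript
// Convert "H:MM" or "HH:MM" to minutes since midnight.
function toMinutes(hhmm) {
  const [h, m] = hhmm.split(":").map(Number);
  return h * 60 + m;
}

// True if `now` falls inside [start, stop). A window whose start is later
// than its stop is treated as wrapping past midnight.
function inBalancingWindow(start, stop, now) {
  const s = toMinutes(start);
  const e = toMinutes(stop);
  const t = toMinutes(now);
  return s <= e ? (t >= s && t < e) : (t >= s || t < e);
}

console.log(inBalancingWindow("3:00", "5:00", "4:30"));  // true
console.log(inBalancingWindow("3:00", "5:00", "12:00")); // false
console.log(inBalancingWindow("23:00", "6:00", "1:15")); // true
```

A backup scheduled for a time where this check returns false would fall outside the configured window.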

Per-collection balancing (for reference)
# Disable balancing for a collection
sh.disableBalancing("students.grades")

# Enable balancing for a collection
sh.enableBalancing("students.grades")

# Check whether balancing is enabled or disabled for a collection
db.getSiblingDB("config").collections.findOne({_id : "students.grades"}).noBalance;