MongoDB Sharding in Practice

融极

Last modified: 2022-07-11 17:30:01

Category: Databases. Tags: mongodb, database

First published: 2022-07-07 18:32:41

Article link: https://blog.csdn.net/tianzhonghaoqing/article/details/125664654

Overview

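As data volumes keep growing, distribution and horizontal scaling have become one of the main ways to scale a system. MongoDB, a leading document-oriented NoSQL database, is exactly such a distributed system: a sharded cluster splits a collection's data into chunks and stores each chunk on one of several nodes. This article briefly describes MongoDB's sharding features and walks through a sharding example.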

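Why sharding is needed

Storage requirements exceed the disk capacity of a single machine.
The active data set exceeds the memory of a single machine, so many requests have to read from disk, which hurts performance.
Write IOPS exceed what a single MongoDB node can serve.
MongoDB supports both automatic and manual sharding; the basic unit of sharding is the collection.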
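
The three pressures above can be turned into a rough sizing estimate. The sketch below is only a rule of thumb: the function name `estimateShardCount`, its fields, and every number in it are hypothetical and are not taken from this article's environment.

```javascript
// Rough estimate of how many shards a workload needs: take the largest of the
// three pressures (disk, memory, write IOPS). All figures are hypothetical.
function estimateShardCount(load, node) {
  const byDisk = Math.ceil(load.dataGB / node.diskGB);
  const byMemory = Math.ceil(load.workingSetGB / node.ramGB);
  const byWrites = Math.ceil(load.writeIops / node.writeIops);
  return Math.max(1, byDisk, byMemory, byWrites);
}

const shardsNeeded = estimateShardCount(
  { dataGB: 3000, workingSetGB: 200, writeIops: 15000 }, // assumed workload
  { diskGB: 1000, ramGB: 64, writeIops: 10000 }          // assumed per-node capacity
);
console.log(shardsNeeded); // prints 4 (memory is the tightest constraint here)
```

In this made-up case the working set, not the disk, decides the shard count, which is the second reason in the list above.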

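Sharded cluster architecture

Mongos

The routing node that clients connect to; all reads and writes go through mongos.

Config Server

Stores the cluster's metadata and configuration.

Shard Server

Each shard holds a portion of a sharded collection's data, and each shard can be deployed as a replica set.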

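What is the primary shard

The primary shard stores the data of all collections in a database that have not been sharded.
Every database has a primary shard.
The primary shard can be changed with the movePrimary command.
When a sharded cluster is built on top of an environment that already uses replica sets, the existing data stays on its original shard.
A shard key can be created that targets a single shard.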
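
As a minimal sketch of the movePrimary step mentioned above, the snippet below builds the command document. The helper name `movePrimaryCommand` is hypothetical; the database name `testdb` and shard name `shard1` are the ones used later in this article. Actually running it requires a mongo shell connected to mongos, so that call is shown only as a comment.

```javascript
// Build the command document that moves a database's primary shard.
// The command name must be the first key in the document.
function movePrimaryCommand(dbName, targetShard) {
  return { movePrimary: dbName, to: targetShard };
}

const moveCmd = movePrimaryCommand("testdb", "shard1");
console.log(JSON.stringify(moveCmd));

// In a mongo shell connected to mongos, the command would be run as:
//   db.adminCommand(moveCmd);
```

Moving the primary shard copies all unsharded collections of that database, so it is usually done while the database is not being written to.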

MongoDB sharding example

Environment

Operating system: CentOS Linux release 8.5.2111
MongoDB version: v4.4.6
3 virtual machines
192.168.230.128, 192.168.230.129, 192.168.230.130
Cluster layout
2 shard replica sets
shard1 (192.168.230.128:27017, 192.168.230.129:27017, 192.168.230.130:27017)
shard2 (192.168.230.128:27018, 192.168.230.129:27018, 192.168.230.130:27018)
1 config server replica set
(192.168.230.128:28018, 192.168.230.129:28018, 192.168.230.130:28018)
1 mongos node
(192.168.230.128:28017)

# more /etc/redhat-release 
# cd /usr/local/mongodb/bin
[root@master bin]# ./mongod --version
db version v4.4.6

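Set up the shard replica sets

Add the configuration file for the yidian_repl replica set, mongo.conf, on all three nodes (128/129/130).

# Configuration file for the yidian_repl replica set
fork=true
# Data directory
dbpath=/opt/mongo/data/db
port=27017
bind_ip=0.0.0.0
# Log file
logpath=/opt/mongo/logs/mongodb.log
logappend=true
# Name of the replica set
replSet=yidian_repl
# smallfiles was an MMAPv1-only option, removed in MongoDB 4.2; leave it out on v4.4.6
# smallfiles=true
# Required for members of a sharded cluster
shardsvr=true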
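Add the configuration file for the yidian_repl2 replica set, mongo2.conf, on all three nodes (128/129/130).

# Configuration file for the yidian_repl2 replica set
fork=true
# Data directory
dbpath=/opt/mongo/data/db2
port=27018
bind_ip=0.0.0.0
# Log file
logpath=/opt/mongo/logs/mongodb2.log
logappend=true
# Name of the replica set
replSet=yidian_repl2
# smallfiles was an MMAPv1-only option, removed in MongoDB 4.2; leave it out on v4.4.6
# smallfiles=true
# Required for members of a sharded cluster
shardsvr=true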

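Start the replica sets

Start both instances on each of 192.168.230.128, 192.168.230.129, and 192.168.230.130.

# Make sure the directories named in the config files exist; create them if they do not
# cd /opt/mongo
# mkdir -pv data/db
# mkdir -pv data/db2
# mkdir logs
# Start the yidian_repl member
./mongod -f /opt/mongo/mongo.conf
# Start the yidian_repl2 member
./mongod -f /opt/mongo/mongo2.conf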

Log in to the replica sets

# Open the mongo shell
# To configure the yidian_repl replica set
./mongo --port 27017
# To configure the yidian_repl2 replica set
./mongo --port 27018

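After connecting on port 27017, run the initialization commands

./mongo --port 27017

var rsconf = {
	_id:'yidian_repl', // must match the replica set name given in the configuration file
	members:
	[
		{
			_id:1, // member id
			host:'192.168.230.128:27017' // IP of the member's node and the port its mongod listens on
		},
		{
			_id:2,
			host:'192.168.230.129:27017'
		},
		{
			_id:3,
			host:'192.168.230.130:27017'
		},
	]
};
// Initialize the replica set with the rsconf configuration
rs.initiate(rsconf);
// Check the status
rs.status();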

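After connecting on port 27018, run the initialization commands

./mongo --port 27018

var rsconf = {
	_id:'yidian_repl2', // must match the replica set name given in the configuration file
	members:
	[
		{
			_id:1, // member id
			host:'192.168.230.128:27018' // IP of the member's node and the port its mongod listens on
		},
		{
			_id:2,
			host:'192.168.230.129:27018'
		},
		{
			_id:3,
			host:'192.168.230.130:27018'
		},
	]
};
// Initialize the replica set with the rsconf configuration
rs.initiate(rsconf);
// Check the status
rs.status();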
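
The two rsconf documents above differ only in the replica set name and the port, so they could be generated rather than typed twice. The helper below is a sketch; `buildReplSetConfig` is a hypothetical name, and member ids start at 1 only to match the documents above.

```javascript
// Build a replica set configuration document from a name, a host list, and a port.
function buildReplSetConfig(name, hosts, port) {
  return {
    _id: name, // must match replSet in the mongod configuration file
    members: hosts.map((host, index) => ({
      _id: index + 1,
      host: host + ":" + port,
    })),
  };
}

const nodeIps = ["192.168.230.128", "192.168.230.129", "192.168.230.130"];
const shard1Conf = buildReplSetConfig("yidian_repl", nodeIps, 27017);
const shard2Conf = buildReplSetConfig("yidian_repl2", nodeIps, 27018);
console.log(JSON.stringify(shard1Conf, null, 2));

// In the mongo shell on the matching port, the result would be passed to:
//   rs.initiate(shard1Conf);
```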

Set up the config server replica set

cd /opt/mongo
mkdir -pv mongo-cfg/logs
cd mongo-cfg
mkdir data

Create the config server configuration file mongo-cfg.conf on all three nodes (128/129/130);
on 129 and 130, change the IP address accordingly.

systemLog:
  # Send all MongoDB log output to a file
  destination: file
  # Log file location
  path: /opt/mongo/mongo-cfg/logs/mongodb.log
  # On restart, append new entries to the end of the existing log file
  logAppend: true
storage:
  # Directory where the mongod instance stores its data (applies to mongod only)
  dbPath: /opt/mongo/mongo-cfg/data
  journal:
    # Enable the durability journal so data files stay valid and recoverable
    enabled: true
  # Store each database in its own directory
  directoryPerDB: true
  wiredTiger:
    engineConfig:
      # Maximum cache size in GB (tune for your environment)
      cacheSizeGB: 1
      # Store indexes in a separate subdirectory
      directoryForIndexes: true
    collectionConfig:
      # Block compression for collection data
      blockCompressor: zlib
    indexConfig:
      prefixCompression: true
net:
  # IP address the instance binds to
  bindIp: 192.168.230.128
  # Port the instance listens on
  port: 28018
replication:
  oplogSizeMB: 2048
  # Name of the config server replica set
  replSetName: configReplSet
sharding:
  clusterRole: configsvr
processManagement:
  # Run mongos or mongod as a background daemon
  fork: true

Start the config server replica set
Start the config server on each node.

./mongod -f /opt/mongo/mongo-cfg.conf

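Log in and configure the nodes
As with the shard nodes, log in to any one member and configure the replica set from there.

./mongo --host 192.168.230.128 --port 28018
// Initialize the config server replica set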
rs.initiate(
	{
		_id:"configReplSet",
		configsvr:true,
		members:[
			{_id:0, host:"192.168.230.128:28018"},
			{_id:1, host:"192.168.230.129:28018"},
			{_id:2, host:"192.168.230.130:28018"}
		]
	}
);
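
After each rs.initiate call it is worth confirming that the set has elected exactly one primary before moving on. The sketch below reads the `set`, `members[].name`, `members[].stateStr`, and `members[].health` fields of an rs.status() result; the function name `summarizeReplSet` and the sample document are hypothetical stand-ins for real shell output.

```javascript
// Summarize an rs.status()-style document: which member is primary,
// which members are unhealthy, and whether the set looks ready.
function summarizeReplSet(status) {
  const primaries = status.members
    .filter((m) => m.stateStr === "PRIMARY")
    .map((m) => m.name);
  const unhealthy = status.members
    .filter((m) => m.health !== 1)
    .map((m) => m.name);
  return {
    set: status.set,
    primaries: primaries,
    unhealthy: unhealthy,
    ready: primaries.length === 1 && unhealthy.length === 0,
  };
}

// Hand-written sample shaped like rs.status() output (not captured from a real run).
const sampleStatus = {
  set: "configReplSet",
  members: [
    { name: "192.168.230.128:28018", stateStr: "PRIMARY", health: 1 },
    { name: "192.168.230.129:28018", stateStr: "SECONDARY", health: 1 },
    { name: "192.168.230.130:28018", stateStr: "SECONDARY", health: 1 },
  ],
};
const summary = summarizeReplSet(sampleStatus);
console.log(JSON.stringify(summary));

// In the mongo shell the real document would be passed instead:
//   summarizeReplSet(rs.status());
```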

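Set up the mongos routing service

Only one mongos node is needed; it runs on 128.

Prepare the mongos configuration.
Create the mongos/log directory.
Create the mongos.conf configuration file.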

systemLog:
  destination: file
  path: /opt/mongo/mongos/log/mongos.log
  logAppend: true
net:
  bindIp: 192.168.230.128
  port: 28017
sharding:
  configDB: configReplSet/192.168.230.128:28018,192.168.230.129:28018,192.168.230.130:28018
processManagement:
  fork: true

Start the routing service
Start the routing service on node 128.

[root@master mongos]# /usr/local/mongodb/bin/mongos --config /opt/mongo/mongos/mongos.conf

Log in to the mongos node
Use the IP and port of the mongos service.

[root@master mongos]# /usr/local/mongodb/bin/mongo 192.168.230.128:28017

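Add the shards to the cluster

Log in to the mongos node and apply the following settings; log in as described in "Log in to the mongos node" above.

Switch to the admin database.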

mongos> use admin;
switched to db admin

Add the shard1 replica set

db.runCommand({addshard:
"yidian_repl/192.168.230.128:27017,192.168.230.129:27017,192.168.230.130:27017",name:"shard1"});

Add the shard2 replica set

db.runCommand({addshard:
"yidian_repl2/192.168.230.128:27018,192.168.230.129:27018,192.168.230.130:27018",name:"shard2"});
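
The shard connection string has the form replicaSetName/host:port,host:port,... and is easy to mistype. The sketch below assembles it from the same parts used above; `addShardCommand` is a hypothetical helper name.

```javascript
// Build the addShard command document for a shard that is a replica set.
function addShardCommand(replSetName, hosts, port, shardName) {
  const seedList = hosts.map((host) => host + ":" + port).join(",");
  return { addShard: replSetName + "/" + seedList, name: shardName };
}

const clusterIps = ["192.168.230.128", "192.168.230.129", "192.168.230.130"];
const addShard1 = addShardCommand("yidian_repl", clusterIps, 27017, "shard1");
const addShard2 = addShardCommand("yidian_repl2", clusterIps, 27018, "shard2");
console.log(addShard1.addShard);

// In a mongo shell connected to mongos, each document would be run as:
//   db.adminCommand(addShard1);
```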

View shard information

// List the shards
mongos> db.runCommand({listshards:1});
// Show the sharding status
mongos> sh.status();

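Enable sharding

Log in to the mongos node and apply the following settings; log in as described in "Log in to the mongos node" above.

Enable sharding on the database that contains the collection

db.runCommand({enablesharding:"testdb"});
or
sh.enableSharding("<database>")

Note: <database> is the database name.
Example: sh.enableSharding("mongodbtest")
Note: you can check the sharding status with sh.status().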

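Create an index on the shard key field

db.<collection>.createIndex(<keyPatterns>,<options>)

Notes
<collection>: the collection name.
<keyPatterns>: the fields to index and the index type for each.
Common index types:
1: ascending index
-1: descending index
"hashed": hashed index
<options>: optional parameters; not used in this example.

Ascending index example:
db.customer.createIndex({name:1})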

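Shard the collection

db.runCommand({shardcollection:"testdb.users",key:{name:1}}); // switch to the admin database first, then run the command
or
db.adminCommand({shardcollection:"testdb.users",key:{name:1}})  // runs against the admin database directly; no need to switch
or
sh.shardCollection("<database>.<collection>",{ "<key>":<value> } ) // shell helper; it also targets the admin database, so no switch is needed

Notes
<database>: the database name.
<collection>: the collection name.
<key>: the shard key; MongoDB distributes the data according to the shard key's value.
<value>:
1: range-based sharding, which usually works well for range queries on the shard key.
"hashed": hash-based sharding, which usually spreads writes evenly across the shards.

Examples:
Range-based sharding:
sh.shardCollection("mongodbtest.customer",{"name":1})
Hash-based sharding:
sh.shardCollection("mongodbtest.customer",{"name":"hashed"})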
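
The difference between the two strategies shows up most clearly with a steadily increasing key. The simulation below is a sketch under loud assumptions: two shards, a made-up split point, a numeric `id` key, and a simple multiplicative hash standing in for MongoDB's real hashed-index function (which is different). It only illustrates the tendency described above.

```javascript
// Hypothetical split point for a range-sharded, steadily increasing id.
const SPLIT_POINT = 750000;

// Range sharding: the key value itself decides the shard.
function rangeShard(id) {
  return id < SPLIT_POINT ? "shard1" : "shard2";
}

// Hash sharding: a hash of the key decides the shard. This multiplicative
// hash is a stand-in for illustration, NOT MongoDB's actual hash function.
function hashedShard(id) {
  const hash = Math.imul(id, 2654435761) >>> 0; // unsigned 32-bit result
  return hash < 0x80000000 ? "shard1" : "shard2";
}

// Count where a run of consecutive new ids would land.
function countByShard(pickShard, firstId, count) {
  const counts = { shard1: 0, shard2: 0 };
  for (let id = firstId; id < firstId + count; id++) {
    counts[pickShard(id)]++;
  }
  return counts;
}

const rangeCounts = countByShard(rangeShard, 1500000, 1000);
const hashedCounts = countByShard(hashedShard, 1500000, 1000);
console.log("range :", JSON.stringify(rangeCounts));
console.log("hashed:", JSON.stringify(hashedCounts));
```

With the range strategy every new id is above the split point, so all 1000 writes hit one shard until the balancer moves chunks; with the hashed strategy the same writes are spread over both shards, at the cost of range queries on the key having to visit every shard.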

Check the result:
db.collectionName.stats();

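Test the sharded cluster

Insert test data

// Switch to the sharded database first; otherwise the documents go into the current database
use testdb
var arr=[];
for (var i = 0; i < 1500000; i++) {
	var uid = i;
	var name = "name" + i;
	arr.push({"name":name,"id":uid});
}
db.users.insertMany(arr);

Note: with too little data you may not see any effect from sharding.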
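
Building all 1,500,000 documents in one array keeps them all in shell memory at once. One possible alternative, sketched below, is to generate the same documents in smaller batches; the generator name `userBatches` and the batch size of 10000 are arbitrary choices, not part of the original test.

```javascript
// Yield the test documents ({name: "name<i>", id: i}) in batches of batchSize.
function* userBatches(total, batchSize) {
  for (let start = 0; start < total; start += batchSize) {
    const end = Math.min(start + batchSize, total);
    const batch = [];
    for (let i = start; i < end; i++) {
      batch.push({ name: "name" + i, id: i });
    }
    yield batch;
  }
}

// Small dry run so the batching can be checked without a database.
const batchSizes = [];
for (const batch of userBatches(25, 10)) {
  batchSizes.push(batch.length);
}
console.log(batchSizes.join(","));

// In a mongo shell connected to mongos, with testdb selected:
//   for (const batch of userBatches(1500000, 10000)) {
//     db.users.insertMany(batch);
//   }
```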

View the data distribution

mongos> sh.status();
--- Sharding Status --- 
  sharding version: {
  	"_id" : 1,
  	"minCompatibleVersion" : 5,
  	"currentVersion" : 6,
  	"clusterId" : ObjectId("62c7a02f3e10b2f25a292763")
  }
  shards:
        {  "_id" : "shard1",  "host" : "yidian_repl/192.168.230.128:27017,192.168.230.129:27017,192.168.230.130:27017",  "state" : 1 }
        {  "_id" : "shard2",  "host" : "yidian_repl2/192.168.230.128:27018,192.168.230.129:27018,192.168.230.130:27018",  "state" : 1 }
  active mongoses:
        "4.4.6" : 1
  autosplit:
        Currently enabled: yes
  balancer:
        Currently enabled:  yes
        Currently running:  no
        Failed balancer rounds in last 5 attempts:  0
        Migration Results for the last 24 hours: 
                514 : Success
  databases:
        {  "_id" : "config",  "primary" : "config",  "partitioned" : true }
                config.system.sessions
                        shard key: { "_id" : 1 }
                        unique: false
                        balancing: true
                        chunks:
                                shard1	512
                                shard2	512
                        too many chunks to print, use verbose if you want to force print
        {  "_id" : "testdb",  "primary" : "shard2",  "partitioned" : true,  "version" : {  "uuid" : UUID("e9eca766-c259-46dc-af16-c51a520e568d"),  "lastMod" : 1 } }
                testdb.users
                        shard key: { "name" : 1 }
                        unique: false
                        balancing: true
                        chunks:
                                shard1	2
                                shard2	3
                        { "name" : { "$minKey" : 1 } } -->> { "name" : "name0" } on : shard1 Timestamp(2, 0) 
                        { "name" : "name0" } -->> { "name" : "name331231" } on : shard1 Timestamp(3, 0) 
                        { "name" : "name331231" } -->> { "name" : "name554493" } on : shard2 Timestamp(3, 1) 
                        { "name" : "name554493" } -->> { "name" : "name779270" } on : shard2 Timestamp(1, 4) 
                        { "name" : "name779270" } -->> { "name" : { "$maxKey" : 1 } } on : shard2 Timestamp(1, 5) 
mongos>
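
The chunk boundaries in the output above are strings, so they are compared character by character rather than numerically. The sketch below copies the five testdb.users chunks from that output (with null standing in for $minKey and $maxKey) and looks up the owning shard for a few names. `shardFor` is a hypothetical helper, and JavaScript's string comparison matches MongoDB's default ordering only for plain ASCII values like these.

```javascript
// The testdb.users chunks from the sh.status() output; null means $minKey/$maxKey.
const chunks = [
  { min: null, max: "name0", shard: "shard1" },
  { min: "name0", max: "name331231", shard: "shard1" },
  { min: "name331231", max: "name554493", shard: "shard2" },
  { min: "name554493", max: "name779270", shard: "shard2" },
  { min: "name779270", max: null, shard: "shard2" },
];

// A chunk covers [min, max): the lower bound is inclusive, the upper exclusive.
function shardFor(name) {
  const chunk = chunks.find(
    (c) => (c.min === null || name >= c.min) && (c.max === null || name < c.max)
  );
  return chunk.shard;
}

console.log(shardFor("name2"));       // shard1
console.log(shardFor("name1000000")); // shard1, because "1..." sorts before "3..."
console.log(shardFor("name400000"));  // shard2
console.log(shardFor("name9"));       // shard2
```

This is why "name1000000" lands on shard1 even though its id is larger than 779270: only the leading characters matter when strings are compared.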