副本集
创建目录
创建如下目录结构,一个replset1就是一个副本集
└── sharding
└── replset1
├── node1
│ ├── data
│ │ └── db
│ ├── log
│ └── mongodb.conf
├── node2
│ ├── data
│ │ └── db
│ ├── log
│ └── mongodb.conf
└── node3
├── data
│ └── db
├── log
└── mongodb.conf
修改配置文件
分别修改三个节点的配置文件,内容如下:
replSet = replication1很重要,三个节点保持一致
dbpath=/usr/local/mongodb/sharding/replset1/node3/data/db
logpath=/usr/local/mongodb/sharding/replset1/node3/log/mongod.log
logappend = true
fork = true
replSet = replication1
bind_ip = xxxxxxxxx
port = 27003
启动3个节点
- 分别到3个节点对应目录下启动三个节点:
mongod --config mongodb.conf
- 查看节点状态
[root@iZwz9io83xn1czh8rty3qlZ node1]# netstat -antp | grep mongod
tcp 0 0 xxxxxxxxx:27001 0.0.0.0:* LISTEN 9734/mongod
tcp 0 0 xxxxxxxxx:27002 0.0.0.0:* LISTEN 9752/mongod
tcp 0 0 xxxxxxxxx:27003 0.0.0.0:* LISTEN 9770/mongod
初始化replSet
- 连接到mongodb
[root@iZwz9io83xn1czh8rty3qlZ node3]# mongo --host xxxxxxxx -port 27001
MongoDB shell version: 3.2.10
connecting to: xxxxxxx:27001/test
- 查看replSet状态
> rs.status()
{
"info" : "run rs.initiate(...) if not yet done for the set",
"ok" : 0,
"errmsg" : "no replset config has been received",
"code" : 94
}
提示需要运行初始化方法
- 仲裁模式初始化replSet
rs.initiate({"_id":"replication1",members:[{"_id":1,"host":"xxxxxxxxx:27001",priority:2},{"_id":2,"host":"xxxxxxxxx:27002",priority:3},{"_id":3,"host":"xxxxxxx:27003",arbiterOnly:true}]});
{ "ok" : 1 }
replication1:OTHER>
replication1:SECONDARY>
- 非仲裁模式初始化replSet
> rs.initiate({"_id":"replication2",members:[{"_id":1,"host":"xxxxxx:27004"},{"_id":2,"host":"xxxxxx:27005"},{"_id":3,"host":"xxxxxx:27006" }]})
{ "ok" : 1 }
replication2:OTHER>
replication2:SECONDARY>
replication2:PRIMARY>
一般采用第二种方式
- 副本集高可用演示
杀掉PRIMARY节点,SECONDARY节点自动升级为PRIMARY节点;
replication2:PRIMARY> rs.status()
...
{
"_id" : 1,
"name" : "xxxxxxx:27004",
...
"stateStr" : "(not reachable/healthy)",
...
},
{
"_id" : 2,
"name" : "xxxxxx:27005",
...
"stateStr" : "PRIMARY",
...
}
...
重启挂掉的节点,该节点自动加入副本集,并置为SECONDARY节点
replication2:PRIMARY> rs.status()
...
{
"_id" : 1,
"name" : "xxxxxxx:27004",
...
"stateStr" : "SECONDARY",
...
},
{
"_id" : 2,
"name" : "xxxxxx:27005",
...
"stateStr" : "PRIMARY",
...
}
...
副本集的备份机制
replication1:PRIMARY> show dbs
local 0.000GB
replication1:PRIMARY> use local
switched to db local
replication1:PRIMARY> show collections
me
oplog.rs
replset.election
replset.minvalid
startup_log
system.replset
replication1:PRIMARY> db.oplog.rs.find();
{ "ts" : Timestamp(1479627942, 1), "h" : NumberLong("1151216259917658225"), "v" : 2, "op" : "n", "ns" : "", "o" : { "msg" : "initiating set" } }
{ "ts" : Timestamp(1479627953, 1), "t" : NumberLong(1), "h" : NumberLong("5065139276939243047"), "v" : 2, "op" : "n", "ns" : "", "o" : { "msg" : "new primary" } }
{ "ts" : Timestamp(1479627966, 1), "t" : NumberLong(2), "h" : NumberLong("-1762348660098026158"), "v" : 2, "op" : "n", "ns" : "", "o" : { "msg" : "new primary" } }
如上,PRIMARY节点会把该节点上的所有操作记录到local.oplog.rs中,然后让SECONDARY节点执行同样的操作来保存一份同样的数据。
Sharding集群
创建文件夹
└── sharding
├── mongos
│ ├── data
│ │ └── db
│ ├── log
│ └── mongos.conf
├── configsvr
│ ├── configsvr1
│ │ ├── data
│ │ │ └── db
│ │ ├── log
│ │ └── configsvr.conf
│ ├── configsvr2
│ ...
│ └── configsvr3
│ ...
└── shards
├── shard1
│ ├── node1
│ │ ├── data
│ │ │ └── db
│ │ ├── log
│ │ └── mongodb.conf
│ ├── node2
│ │ ...
│ └── node3
│ ...
├── shard2
│ ...
└── shard3
...
每一个shard就是一个副本集
configsvr
制作configsvr副本集,配置如下
dbpath=/usr/local/mongodb/sharding/configsvr/configsvr1/data/db
logpath=/usr/local/mongodb/sharding/configsvr/configsvr1/logs/config.log
logappend = true
fork = true
replSet = shardConfig
bind_ip = xxxxxxxx
port = 28001
configsvr = true
连接、启动、初始化与副本集一致
> rs.initiate({"_id":"shardConfig",members:[{"_id":1,"host":"xxxxxxx:28001"},{"_id":2,"host":"xxxxxxx:28002"},{"_id":3,"host":"xxxxxxx:28003" }]})
{ "ok" : 1 }
mongos
- 制作mongos副本集,配置如下
logpath=/usr/local/mongodb/sharding/mongos/logs/config.log
logappend = true
fork = true
bind_ip = xxxxxx
port = 29001
configdb = shardConfig/xxxxx:28001,xxxxxxxx:28002,xxxxxxxx:28003
- 启动
mongos --config mongos.conf
about to fork child process, waiting until server is ready for connections.
forked process: 11368
child process started successfully, parent exiting
- 连接
[root@iZwz9io83xn1czh8rty3qlZ mongos]# mongo --host xxxx --port 29001
MongoDB shell version: 3.2.10
connecting to: xxxxxx:29001/test
mongos>
- 添加shard
mongos> sh.addShard("replication1/xxxxxxxxxx:27001")
{ "shardAdded" : "replication1", "ok" : 1 }
- 查看状态
mongos> sh.status()
--- Sharding Status ---
sharding version: {
...
}
shards:
{ "_id" : "replication1", "host" : "replication1/xxx:27001,xxx:27002" }
{ "_id" : "replication2", "host" : "replication2/xxx:27004,xxxx:27005,xxxx:27006" }
...
databases:
{ "_id" : "test", "primary" : "replication2", "partitioned" : false }
数据库分片
选择数据库
mongos> show dbs
config 0.001GB
test 0.000GB
mongos> use test
switched to db test
设置该数据库支持分片
mongos> sh.enableSharding("test")
{ "ok" : 1 }
此时数据块chunk已生成,并且范围是($minKey,$maxKey)
按索引分片
数据可能不均匀,但查询速度很快
设置索引
mongos> db.test.ensureIndex({name:1})
{
"raw" : {
"replication2/xxxx:27004,xxxx:27005,xxxx:27006" : {
"createdCollectionAutomatically" : false,
"numIndexesBefore" : 1,
"numIndexesAfter" : 2,
"ok" : 1,
"$gleStats" : {
"lastOpTime" : Timestamp(1479650951, 1),
"electionId" : ObjectId("7fffffff0000000000000007")
}
}
},
"ok" : 1
}
将集合以索引name划分片段
mongos> sh.shardCollection("test.test",{"name":1})
{ "collectionsharded" : "test.test", "ok" : 1 }
按hash分片
数据分布均匀,全局检索,速度慢
mongos> sh.shardCollection("test.test",{"name":"hashed"})
插入数据
mongos> for(var i=0;i<100;i++) db.test.insert({name:"test_"+i});
WriteResult({ "nInserted" : 1 })
随着数据的插入,分片会逐渐增加,分片范围逐渐变小,并且所有占用的存储空间趋于平衡。
查看分片效果
mongos> use config
switched to db config
mongos> db.databases.find()
{ "_id" : "test", "primary" : "replication2", "partitioned" : true }
mongos> db.chunks.find()
{ "_id" : "test.test-name_MinKey", "lastmod" : Timestamp(2, 0), "lastmodEpoch" : ObjectId("5831ae8a18b75802b03e0af2"), "ns" : "test.test", "min" : { "name" : { "$minKey" : 1 } }, "max" : { "name" : "test_0" }, "shard" : "replication1" }
{ "_id" : "test.test-name_\"test_0\"", "lastmod" : Timestamp(2, 1), "lastmodEpoch" : ObjectId("5831ae8a18b75802b03e0af2"), "ns" : "test.test", "min" : { "name" : "test_0" }, "max" : { "name" : "test_8" }, "shard" : "replication2" }
{ "_id" : "test.test-name_\"test_8\"", "lastmod" : Timestamp(1, 3), "lastmodEpoch" : ObjectId("5831ae8a18b75802b03e0af2"), "ns" : "test.test", "min" : { "name" : "test_8" }, "max" : { "name" : { "$maxKey" : 1 } }, "shard" : "replication2" }
此时,通过name作为条件查询就不会全局扫描,而是通过这个分区范围进行查找,从而提高效率。
支持聚合分片
sh.shardCollection("test.test",{"name":1,"age":1})