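When data is imbalanced, check the collection's sharding key and the values that existing records hold in the sharding key field to determine the cause of the imbalance. Also review the overall structure of the collection so that a new sharding key can be chosen.
- View the sharding key of the imbalanced collection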
> db.snapshot(SDB_SNAP_CATALOG, {Name:"test.nobalance"})
{
"_id": {
"$oid": "5cef6f39c07175a73358f64a"
},
"Name": "test.nobalance",
"UniqueID": 12884901909,
"Version": 4,
"ReplSize": -1,
"Attribute": 1,
"AttributeDesc": "Compressed",
"CompressionType": 1,
"CompressionTypeDesc": "lzw",
"ShardingKey": {
"status": 1
},
"EnsureShardingIndex": false,
"ShardingType": "hash",
"Partition": 4096,
"InternalV": 3,
"CataInfo": [
{
"ID": 0,
"GroupID": 1000,
"GroupName": "group1",
"LowBound": {
"": 0
},
"UpBound": {
"": 1365
}
},
{
"ID": 1,
"GroupID": 1001,
"GroupName": "group2",
"LowBound": {
"": 1365
},
"UpBound": {
"": 2730
}
},
{
"ID": 2,
"GroupID": 1002,
"GroupName": "group3",
"LowBound": {
"": 2730
},
"UpBound": {
"": 4096
}
}
],
"AutoSplit": true
}
Return 1 row(s).
Takes 0.007257s.
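The CataInfo ranges in this output can be checked with a short calculation. The following is a minimal sketch in plain JavaScript that runs outside the database on values copied from the snapshot. It assumes each range is half-open, that is, LowBound is included and UpBound is excluded. The function name partitionsPerGroup is hypothetical.

```javascript
// CataInfo entries copied from the snapshot output (only the fields used here).
const cataInfo = [
  { GroupName: "group1", LowBound: { "": 0 }, UpBound: { "": 1365 } },
  { GroupName: "group2", LowBound: { "": 1365 }, UpBound: { "": 2730 } },
  { GroupName: "group3", LowBound: { "": 2730 }, UpBound: { "": 4096 } },
];

// Count hash partitions per group, assuming each range is [LowBound, UpBound).
function partitionsPerGroup(entries) {
  const counts = {};
  for (const e of entries) {
    const size = e.UpBound[""] - e.LowBound[""];
    counts[e.GroupName] = (counts[e.GroupName] || 0) + size;
  }
  return counts;
}

const partitionCounts = partitionsPerGroup(cataInfo);
console.log(partitionCounts);
```

Under that assumption the three groups own 1365, 1365, and 1366 of the 4096 hash partitions, so the ranges themselves are split evenly and the cause of the imbalance has to be looked for in the sharding key values.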
The current sharding key of collection test.nobalance is status. Check the values of the sharding key field in the existing data:
> db.test.nobalance.find({},{account:null,age:null,status:null}).limit(5)
{
"account": "D00005",
"age": 26,
"status": "正常"
}
{
"account": "D00006",
"age": 27,
"status": "正常"
}
{
"account": "D00008",
"age": 29,
"status": "正常"
}
{
"account": "D00009",
"age": 20,
"status": "正常"
}
{
"account": "D00012",
"age": 20,
"status": "正常"
}
Return 5 row(s).
Takes 0.001466s.
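Note:
Every record holds the same string in the status field, so status is not suitable as a sharding key. A new sharding key must be planned and the collection rebuilt.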
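Why a constant sharding key value defeats hash sharding can be illustrated with a small sketch. The hash below is a toy function chosen for illustration only and is not the hash SequoiaDB uses. The names toyHash and partitionsUsed are hypothetical, and the sample values are taken from the query output above.

```javascript
// Toy string hash (illustrative only; not the hash function SequoiaDB uses).
function toyHash(value) {
  let h = 0;
  for (const ch of String(value)) {
    h = (h * 31 + ch.codePointAt(0)) >>> 0; // keep within unsigned 32 bits
  }
  return h;
}

const PARTITIONS = 4096; // same as the Partition value of the collection

// Return the set of distinct partitions that the given key values map to.
function partitionsUsed(values) {
  return new Set(values.map((v) => toyHash(v) % PARTITIONS));
}

// The five sample records all share one status value but have distinct accounts.
const statusValues = Array(5).fill("正常");
const accountValues = ["D00005", "D00006", "D00008", "D00009", "D00012"];

const statusPartitions = partitionsUsed(statusValues);
const accountPartitions = partitionsUsed(accountValues);
console.log(statusPartitions.size, accountPartitions.size);
```

Identical values always produce the same hash, so every record sharded on status lands in a single partition and therefore on a single group, whatever hash function is used. Distinct account values can spread across many partitions.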
- Create a temporary collection nobalance_temp with the field account as the sharding key
> db.test.createCL("nobalance_temp",{ShardingKey:{account:1},ShardingType:"hash",EnsureShardingIndex:false,AutoSplit:true,Compressed:true,CompressionType:"lzw"});
localhost:11810.test.nobalance_temp
Takes 0.107423s.
- Export the data from the nobalance collection
$ sdbexprt --hostname localhost --svcname 11810 --type json --dir ./ --cscl test.nobalance
Exported successfully with 1 successful collections, 1800000 successful records
done!
- Import the data into the nobalance_temp collection
$ sdbimprt --hosts=localhost:11810 --type=json -c test -l nobalance_temp --file=test.nobalance.json
parsed records: 1800000
parse failure: 0
sharding records: 1800000
sharding failure: 0
imported records: 1800000
import failure: 0
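Before moving on, it is worth confirming that the import summary reports no failures and that the imported count matches the 1800000 records exported earlier. A minimal sketch of that check in plain JavaScript follows. It works on the summary text copied from the output above, and the helper names parseSummary and importIsComplete are hypothetical.

```javascript
// Summary text copied from the sdbimprt output above.
const importSummary = `parsed records: 1800000
parse failure: 0
sharding records: 1800000
sharding failure: 0
imported records: 1800000
import failure: 0`;

// Parse "label: number" lines into an object keyed by label.
function parseSummary(text) {
  const result = {};
  for (const line of text.split("\n")) {
    const m = line.match(/^\s*(.+?):\s*(\d+)\s*$/);
    if (m) result[m[1]] = Number(m[2]);
  }
  return result;
}

// True when no step failed and the imported count equals the exported count.
function importIsComplete(summary, exportedRecords) {
  const failures =
    summary["parse failure"] +
    summary["sharding failure"] +
    summary["import failure"];
  return failures === 0 && summary["imported records"] === exportedRecords;
}

const parsedSummary = parseSummary(importSummary);
const importComplete = importIsComplete(parsedSummary, 1800000);
console.log(importComplete);
```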
- Confirm whether the data distribution is balanced
> db.snapshot(SDB_SNAP_COLLECTIONS, {Name:"test.nobalance_temp"})
{
"Name": "test.nobalance_temp",
"UniqueID": 12884901911,
"Details": [
{
"GroupName": "group1",
"Group": [
{
"ID": 2,
"LogicalID": 7,
"Sequence": 1,
"Indexes": 1,
"Status": "Normal",
"TotalRecords": 599207,
"TotalDataPages": 1237,
"TotalIndexPages": 376,
"TotalLobPages": 0,
"TotalDataFreeSpace": 27324,
"TotalIndexFreeSpace": 7191122,
"NodeName": "test:11820"
}
]
},
{
"GroupName": "group2",
"Group": [
{
"ID": 2,
"LogicalID": 5,
"Sequence": 1,
"Indexes": 1,
"Status": "Normal",
"TotalRecords": 600950,
"TotalDataPages": 1250,
"TotalIndexPages": 352,
"TotalLobPages": 0,
"TotalDataFreeSpace": 60536,
"TotalIndexFreeSpace": 5568215,
"NodeName": "test:11830"
}
]
},
{
"GroupName": "group3",
"Group": [
{
"ID": 2,
"LogicalID": 5,
"Sequence": 1,
"Indexes": 1,
"Status": "Normal",
"TotalRecords": 599843,
"TotalDataPages": 1235,
"TotalIndexPages": 368,
"TotalLobPages": 0,
"TotalDataFreeSpace": 32688,
"TotalIndexFreeSpace": 6648558,
"NodeName": "test:11840"
}
]
}
]
}
Return 1 row(s).
Takes 0.015044s.
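The TotalRecords values in this snapshot can be compared numerically. The sketch below is plain JavaScript run on values copied from the output. It reports the total and the largest relative deviation from the per-group mean. The function name balanceReport is hypothetical, and what counts as "balanced" is a judgment call that this document does not define.

```javascript
// TotalRecords per group, copied from the snapshot output above.
const recordsPerGroup = { group1: 599207, group2: 600950, group3: 599843 };

// Summarize how far the groups deviate from a perfectly even split.
function balanceReport(counts) {
  const values = Object.values(counts);
  const total = values.reduce((a, b) => a + b, 0);
  const mean = total / values.length;
  // Largest absolute deviation from the mean, as a fraction of the mean.
  const maxDeviation =
    Math.max(...values.map((v) => Math.abs(v - mean))) / mean;
  return { total, mean, maxDeviation };
}

const report = balanceReport(recordsPerGroup);
console.log(report.total, (report.maxDeviation * 100).toFixed(2) + "%");
```

For these values the total is 1800000, matching the number of imported records, and the largest deviation from the mean of 600000 is 950 records, about 0.16%.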
- Once the data distribution is confirmed to be balanced, rename collection nobalance to nobalance_bak and collection nobalance_temp to nobalance
> db.test.renameCL("nobalance", "nobalance_bak");
Takes 0.031626s.
> db.test.renameCL("nobalance_temp", "nobalance");
Takes 0.012640s.