The problem
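An interface had been deployed to production but was not yet open to external users. The business team opened the feature every few days to test it and found that queries often took a long time. The logs showed a pattern: the business team clicked through only once every few days, the first query was slow, and the queries after it were normal. The slow table saw almost no updates or queries in between. The database is SequoiaDB (巨杉数据库), a distributed database.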
Solution
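1. After the business team had not queried for a few days, run the query with explain. Some data nodes returned noticeably more slowly than the others (compare the ElapsedTime field of each node). Example explain output: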
sdbadmin@sdbserver1:~$ sdb 'db.company.employee.find({ "age": 18}).explain({Run:true})'
{
  "NodeName": "sdbserver1:11820",
  "GroupName": "group1",
  "Role": "data",
  "Name": "company.employee",
  "ScanType": "tbscan",
  "IndexName": "",
  "UseExtSort": false,
  "Query": {
    "$and": [
      {
        "age": {
          "$et": 18
        }
      }
    ]
  },
  "IXBound": null,
  "NeedMatch": true,
  "ReturnNum": 0,
  "ElapsedTime": 0.000038,
  "DataRead": 0,
  "IndexRead": 0,
  "UserCPU": 0,
  "SysCPU": 0
}
{
  "NodeName": "sdbserver1:11830",
  "GroupName": "group2",
  "Role": "data",
  "Name": "company.employee",
  "ScanType": "tbscan",
  "IndexName": "",
  "UseExtSort": false,
  "Query": {
    "$and": [
      {
        "age": {
          "$et": 18
        }
      }
    ]
  },
  "IXBound": null,
  "NeedMatch": true,
  "ReturnNum": 0,
  "ElapsedTime": 0.000056,
  "DataRead": 1,
  "IndexRead": 0,
  "UserCPU": 0,
  "SysCPU": 0
}
{
  "NodeName": "sdbserver1:11840",
  "GroupName": "group3",
  "Role": "data",
  "Name": "company.employee",
  "ScanType": "tbscan",
  "IndexName": "",
  "UseExtSort": false,
  "Query": {
    "$and": [
      {
        "age": {
          "$et": 18
        }
      }
    ]
  },
  "IXBound": null,
  "NeedMatch": true,
  "ReturnNum": 1,
  "ElapsedTime": 0.000278,
  "DataRead": 1,
  "IndexRead": 0,
  "UserCPU": 0,
  "SysCPU": 0
}
Return 3 row(s).
sdbadmin@sdbserver1:
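When there are many data nodes, picking out the slow ones by eye gets tedious. The following is a minimal Python sketch, not part of the original troubleshooting, that ranks nodes by ElapsedTime. It assumes the explain output has been captured as text and that each per-node block is strict JSON, as in the sample above; the function names parse_explain and slowest_nodes are hypothetical.

```python
import json


def parse_explain(text):
    """Extract the per-node plan objects from captured explain output.

    Assumes the output is a run of JSON objects, possibly surrounded by
    non-JSON text such as the shell prompt or "Return N row(s).".
    """
    decoder = json.JSONDecoder()
    plans, pos = [], 0
    while True:
        start = text.find("{", pos)
        if start == -1:
            break
        try:
            obj, pos = decoder.raw_decode(text, start)
        except json.JSONDecodeError:
            pos = start + 1  # not JSON (e.g. part of the command line); skip it
            continue
        # Keep only objects that look like a node's access plan.
        if isinstance(obj, dict) and "NodeName" in obj:
            plans.append(obj)
    return plans


def slowest_nodes(text, top=3):
    """Return (NodeName, GroupName, ElapsedTime) for the slowest nodes."""
    ranked = sorted(parse_explain(text),
                    key=lambda p: p.get("ElapsedTime", 0.0),
                    reverse=True)
    return [(p.get("NodeName"), p.get("GroupName"), p.get("ElapsedTime"))
            for p in ranked[:top]]
```

Run against the sample output above, slowest_nodes would list sdbserver1:11840 (group3) first, since its ElapsedTime of 0.000278 is the largest of the three.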
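2. The machines hosting the slow nodes turned out to be short on memory. (Our assessment: memory on the slow nodes was under pressure, so the data files and index files needed by the query had been evicted from memory and had to be read back from disk on the first query.)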
Use sar -B 2 to watch paging activity at 2-second intervals (pages paged in and out, and major faults).
Use iostat -x -d 2 to watch how busy the disks are (%util) and the average time per I/O request (the await columns).
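To make the sar -B check repeatable, its output can be parsed and scanned for bursts of major page faults, which is what a query would cause if its data files had been evicted from memory. This is a minimal sketch, assuming the sysstat column names pgpgin/s and majflt/s appear in the header line; the function names and the threshold of 100 faults per second are illustrative assumptions, not values from the original investigation.

```python
def parse_sar_paging(text):
    """Turn captured `sar -B` output into one dict per sample.

    Column names are read from the header line, so the function does not
    depend on the exact set of columns printed by a given sysstat version.
    """
    columns, samples = None, []
    for line in text.splitlines():
        tokens = line.split()
        if "pgpgin/s" in tokens:
            # Header line: everything from pgpgin/s onward is a column name.
            columns = tokens[tokens.index("pgpgin/s"):]
            continue
        if (columns is None or len(tokens) < len(columns)
                or tokens[0].startswith("Average")):
            continue  # banner, blank line, or summary row
        try:
            # The timestamp may be one or two tokens, so take values from the right.
            values = [float(v) for v in tokens[-len(columns):]]
        except ValueError:
            continue
        samples.append(dict(zip(columns, values)))
    return samples


def high_major_faults(samples, threshold=100.0):
    """Samples whose majflt/s exceeds the (assumed) threshold."""
    return [s for s in samples if s.get("majflt/s", 0.0) > threshold]
```

A sustained jump in majflt/s at the moment of the slow first query would be consistent with the eviction explanation in step 2; a flat line would suggest looking elsewhere.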
3. Use split to move the data from the memory-constrained node's group to a group on a machine with free memory.
sdb 'db.company.employee.split( "group1", "groupxx", 100 )'
4. Continued monitoring afterwards showed that subsequent queries stayed fairly stable.
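The follow-up observation can be turned into a number by timing the same query several times in a row after an idle period and comparing the first run with the rest. The sketch below is one possible way to do that in Python; time_command and cold_start_ratio are hypothetical helpers, and the timings include process start-up and connection overhead, so they are only meaningful relative to each other.

```python
import statistics
import subprocess
import time


def time_command(argv, runs=3):
    """Run argv `runs` times back to back; return wall-clock seconds per run."""
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        subprocess.run(argv, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        timings.append(time.perf_counter() - start)
    return timings


def cold_start_ratio(timings):
    """First run divided by the median of the later runs.

    A value near 1 means the first query is no slower than the rest;
    a large value indicates a cold-start penalty.
    """
    if len(timings) < 2:
        raise ValueError("need at least two timings")
    return timings[0] / statistics.median(timings[1:])
```

For example, after leaving the table untouched for a few days, cold_start_ratio(time_command(["sdb", 'db.company.employee.find({ "age": 18})'])) could be recorded before and after the split to see whether the first-query penalty has gone away.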