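1. Benefits of batch queries
1) Fetching documents one at a time means sending one request per document, so the network overhead is high. A batch query removes most of that overhead.
2) Use mget for batch queries
2. Querying with mget
1) If the documents to fetch are not all in the same index, use the following syntax: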
GET /_mget
{
  "docs": [
    {
      "_index": "test_index",
      "_type": "test_type",
      "_id": 10
    },
    {
      "_index": "test_index",
      "_type": "test_type",
      "_id": 8
    },
    {
      "_index": "ecommerce",
      "_type": "product",
      "_id": 5
    }
  ]
}
Result: the returned documents are likewise wrapped in a docs array
{
  "docs": [
    {
      "_index": "test_index",
      "_type": "test_type",
      "_id": "10",
      "_version": 7,
      "found": true,
      "_source": {
        "test_field": "test_201712291052",
        "test_field2": "test002"
      }
    },
    {
      "_index": "test_index",
      "_type": "test_type",
      "_id": "8",
      "_version": 2,
      "found": true,
      "_source": {
        "test_field": "xxx81"
      }
    },
    {
      "_index": "ecommerce",
      "_type": "product",
      "_id": "5",
      "_version": 1,
      "found": true,
      "_source": {
        "name": "zhuyan yagao",
        "desc": "meibai jiankang",
        "price": 54,
        "producer": "zhuyan producer",
        "tags": [
          "fangzhu",
          "meibai",
          "jiankang"
        ]
      }
    }
  ]
}
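The request body above can also be assembled programmatically. The following is a minimal Python sketch, assuming the documents are given as (index, type, id) tuples. The helper name build_mget_body is hypothetical and not part of any Elasticsearch client; the sketch only builds the JSON body and does not send it.

```python
import json


def build_mget_body(doc_refs):
    """Build the body for GET /_mget from (index, type, id) tuples.

    Hypothetical helper for illustration; it only produces the dict
    that would be sent as the request body.
    """
    return {
        "docs": [
            {"_index": index, "_type": doc_type, "_id": doc_id}
            for index, doc_type, doc_id in doc_refs
        ]
    }


# The same three documents as in the request above.
refs = [
    ("test_index", "test_type", 10),
    ("test_index", "test_type", 8),
    ("ecommerce", "product", 5),
]
body = build_mget_body(refs)
print(json.dumps(body, indent=2))
```

Because each entry carries its own _index and _type, one request can span any number of indices.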
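2) If the documents are all in one index but may be in different types, put the index in the URL and give only _type and _id for each document: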
GET /test_index/_mget
{
  "docs": [
    {
      "_type": "test_type",
      "_id": 10
    },
    {
      "_type": "test_type",
      "_id": 8
    }
  ]
}
Result:
{
  "docs": [
    {
      "_index": "test_index",
      "_type": "test_type",
      "_id": "10",
      "_version": 7,
      "found": true,
      "_source": {
        "test_field": "test_201712291052",
        "test_field2": "test002"
      }
    },
    {
      "_index": "test_index",
      "_type": "test_type",
      "_id": "8",
      "_version": 2,
      "found": true,
      "_source": {
        "test_field": "xxx81"
      }
    }
  ]
}
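Since every entry in the docs array carries a found flag, client code usually needs to separate hits from misses. The following Python sketch shows one way to do that. The helper name split_mget_response and the missing document with id 99 are assumptions made for illustration; they do not come from the results above.

```python
def split_mget_response(response):
    """Split an mget response into found sources and missing keys.

    Returns (found, missing): found maps (index, type, id) to _source,
    missing lists the (index, type, id) keys whose found flag is false.
    """
    found, missing = {}, []
    for doc in response.get("docs", []):
        key = (doc["_index"], doc["_type"], doc["_id"])
        if doc.get("found"):
            found[key] = doc["_source"]
        else:
            missing.append(key)
    return found, missing


# Sample response: the first entry mirrors the result above,
# the second (id 99) is a made-up miss for illustration.
sample = {
    "docs": [
        {
            "_index": "test_index",
            "_type": "test_type",
            "_id": "10",
            "_version": 7,
            "found": True,
            "_source": {
                "test_field": "test_201712291052",
                "test_field2": "test002",
            },
        },
        {
            "_index": "test_index",
            "_type": "test_type",
            "_id": "99",
            "found": False,
        },
    ]
}

found, missing = split_mget_response(sample)
print(found)
print(missing)
```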
3) If the documents are all in the same index and the same type, put both in the URL and give only _id for each document:
GET /test_index/test_type/_mget
{
  "docs": [
    {
      "_id": 10
    },
    {
      "_id": 8
    }
  ]
}
This returns the same documents:
{
  "docs": [
    {
      "_index": "test_index",
      "_type": "test_type",
      "_id": "10",
      "_version": 7,
      "found": true,
      "_source": {
        "test_field": "test_201712291052",
        "test_field2": "test002"
      }
    },
    {
      "_index": "test_index",
      "_type": "test_type",
      "_id": "8",
      "_version": 2,
      "found": true,
      "_source": {
        "test_field": "xxx81"
      }
    }
  ]
}
The syntax in 3) above can also be shortened to a plain list of ids:
GET /test_index/test_type/_mget
{
  "ids": [1, 2, 10]
}
Result:
{
  "docs": [
    {
      "_index": "test_index",
      "_type": "test_type",
      "_id": "1",
      "_version": 2,
      "found": true,
      "_source": {
        "test_field": "xxx",
        "test_field2": "xxx2"
      }
    },
    {
      "_index": "test_index",
      "_type": "test_type",
      "_id": "2",
      "_version": 1,
      "found": true,
      "_source": {
        "test_field": "xxxxx000"
      }
    },
    {
      "_index": "test_index",
      "_type": "test_type",
      "_id": "10",
      "_version": 7,
      "found": true,
      "_source": {
        "test_field": "test_201712291052",
        "test_field2": "test002"
      }
    }
  ]
}
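The three request forms differ only in how much of the address moves from the body into the URL. The following Python sketch picks the most compact form for a given set of documents. The function name build_mget_request is hypothetical, and the sketch only returns the path and body rather than sending anything.

```python
def build_mget_request(refs):
    """Return (path, body) for the most compact mget form.

    refs is a non-empty list of (index, type, id) tuples.
    Hypothetical helper for illustration only.
    """
    if not refs:
        raise ValueError("refs must not be empty")

    indices = {index for index, _, _ in refs}
    types = {doc_type for _, doc_type, _ in refs}

    if len(indices) == 1 and len(types) == 1:
        # Same index and same type: the "ids" shorthand is enough.
        index, doc_type = next(iter(indices)), next(iter(types))
        return f"/{index}/{doc_type}/_mget", {"ids": [i for _, _, i in refs]}

    if len(indices) == 1:
        # Same index, different types: index goes into the URL.
        index = next(iter(indices))
        docs = [{"_type": t, "_id": i} for _, t, i in refs]
        return f"/{index}/_mget", {"docs": docs}

    # Different indices: every entry is fully addressed.
    docs = [{"_index": x, "_type": t, "_id": i} for x, t, i in refs]
    return "/_mget", {"docs": docs}


path, body = build_mget_request([
    ("test_index", "test_type", 1),
    ("test_index", "test_type", 2),
    ("test_index", "test_type", 10),
])
print(path, body)
```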
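3. Takeaways
Importance of mget: very high.
As a rule, whenever you need to fetch several documents at once, use the batch API, so that the data comes back in as few network requests as possible and the network overhead stays low.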
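4. Optimal bulk size
A bulk request is loaded into memory, so if it is too large, performance drops instead of improving. You therefore have to experiment repeatedly to find the best bulk size. A common starting point is 1000 to 5000 documents per request, increased gradually from there. Measured by payload size, 5 MB to 15 MB per request is a good range.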
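The sizing advice above can be applied on the client side by cutting a large set of bulk actions into batches that respect both a document count and a byte limit. The following Python sketch shows one way to do this. The function name chunk_bulk_actions and the default limits (1000 documents, 5 MB) are assumptions chosen from the ranges above, not recommended values; each batch is a newline-delimited JSON string in the action-line-plus-source-line layout that the bulk API expects.

```python
import json


def chunk_bulk_actions(pairs, max_docs=1000, max_bytes=5 * 1024 * 1024):
    """Yield bulk bodies (NDJSON strings) bounded by count and size.

    pairs is an iterable of (action, source) dicts. A single entry
    larger than max_bytes is still emitted, alone in its own batch.
    """
    lines, count, size = [], 0, 0
    for action, source in pairs:
        entry = json.dumps(action) + "\n" + json.dumps(source) + "\n"
        entry_size = len(entry.encode("utf-8"))
        # Close the current batch if adding this entry would exceed a limit.
        if lines and (count + 1 > max_docs or size + entry_size > max_bytes):
            yield "".join(lines)
            lines, count, size = [], 0, 0
        lines.append(entry)
        count += 1
        size += entry_size
    if lines:
        yield "".join(lines)


# 2500 small index actions, split with the default limits.
pairs = [
    (
        {"index": {"_index": "test_index", "_type": "test_type", "_id": i}},
        {"test_field": f"value {i}"},
    )
    for i in range(2500)
]
batches = list(chunk_bulk_actions(pairs))
print([batch.count("\n") // 2 for batch in batches])  # documents per batch
```

Tuning then amounts to rerunning the same load with different max_docs and max_bytes values and comparing throughput.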