Optimizing terms queries in yii-elasticsearch
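Overview
A terms query works like SQL's IN. By default the request body may carry at most 1024 values (this figure matches the default of indices.query.bool.max_clause_count; recent Elasticsearch releases cap a terms list separately through index.max_terms_count, 65536 by default). With a short list no optimization is needed, but once the list grows past roughly 500 values the request time increases noticeably.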
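Scenario
The query takes 140 ms when run directly in Kibana, but 2 seconds when issued from the application.
The ES index holds 2 million documents. From 1,300 codes, the query has to select the splittable items, sort them by price, and return the first 15.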
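Suspected causes of the latency
1. The request body is too large
2. The code field is not of type keyword
3. The query is not written optimally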
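To check the first suspicion, the serialized size of the terms clause can be measured before it is sent. The sketch below uses made-up data (1,300 generated 20-character codes and 11-digit ids) and an assumed field name `code_id`; none of these values come from the real index.

```php
<?php
// Hypothetical data: 1,300 generated 20-character codes and 11-digit numeric ids.
$codes = [];
$ids = [];
for ($i = 1; $i <= 1300; $i++) {
    $codes[] = str_pad((string)$i, 20, '0', STR_PAD_LEFT); // 20-byte code string
    $ids[]   = 10000000000 + $i;                            // 11-digit integer id
}

// Serialize each terms clause the way it would travel in the request body.
$codeBytes = strlen(json_encode(['terms' => ['code' => $codes]]));
$idBytes   = strlen(json_encode(['terms' => ['code_id' => $ids]])); // field name assumed

printf("terms on code: %d bytes, terms on id: %d bytes\n", $codeBytes, $idBytes);
```

Comparing the two numbers gives a rough idea of how much of the body is taken up by the value list alone.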
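Optimizations:
1. Replace a terms clause that carries too many values with should + terms (several smaller terms clauses), or with individual term clauses
2. Replace match with term wherever full-text matching is not needed
3. Query by id instead of by code
4. Set the id field type to keyword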
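Points 2 and 4 can be illustrated with a mapping body and an exact-match clause written as PHP arrays. This is only a sketch: the field name `code_id` and the type chosen for `is_group` are assumptions, and how the mapping is pushed to the index depends on the client in use.

```php
<?php
// Hypothetical mapping body; field names follow the article, types are assumed.
$mapping = [
    'properties' => [
        'code_id'  => ['type' => 'keyword'], // stored as one exact token, not analyzed
        'is_group' => ['type' => 'integer'], // assumed to be a numeric flag
    ],
];

// For exact values, term skips the analysis step that match performs.
$exactClause = ['term' => ['is_group' => 1]];

echo json_encode($mapping), "\n";
echo json_encode($exactClause), "\n";
```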
// Original version
public function getSeqList($data, $order = "price desc"){
    $must = [];
    if (isset($data['name']) && $data['name']) {
        $must[] = ['wildcard' => ['name' => "*{$data['name']}*"]];
    }
    $must[] = ['match' => ['is_group' => $data['is_group']]];
    $must[] = ['terms' => ['code' => $data['code']]];
    $query['bool']['must'] = $must;
    $start = ($data['page'] - 1) * $data['limit'];
    $limit = $data['limit'] ?? 10;
    $res = $this->getList($query, $order, $start, $limit);
    return array_column($res, '_source');
}
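// Optimized version: should + terms
public function getSeqList($data, $order = "price desc"){
    $must = [];
    if (isset($data['name']) && $data['name']) {
        $must[] = ['wildcard' => ['name' => "*{$data['name']}*"]];
    }
    $slice_num = 260;
    // Query by the code's primary key instead of the code itself to shrink the
    // request body (a code is 20 bytes, an id is at most 11 bytes).
    // The field name code_id is assumed; use the field that stores the id in your mapping.
    $code_id_count = count($data['code_id']);
    if ($code_id_count > $slice_num) {
        $slice_must['bool']['should'] = [];
        // Split into should + terms; "<" avoids an empty trailing slice
        for ($i = 0; $i < $code_id_count; $i += $slice_num) {
            $code_id_arr = array_slice($data['code_id'], $i, $slice_num);
            $slice_must['bool']['should'][] = ['terms' => ['code_id' => $code_id_arr]];
        }
        $must[] = $slice_must;
    } elseif ($code_id_count > 0) {
        // Short lists still need the filter; a single terms clause is enough
        $must[] = ['terms' => ['code_id' => $data['code_id']]];
    }
    $must[] = ['term' => ['is_group' => $data['is_group']]];
    $query['bool']['must'] = $must;
    // Resolve the default limit before using it to compute the offset
    $limit = $data['limit'] ?? 10;
    $start = ($data['page'] - 1) * $limit;
    $res = $this->getList($query, $order, $start, $limit);
    return array_column($res, '_source');
}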
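The splitting step can also be isolated in a small helper, which makes it easy to check on its own. The function below is a hypothetical sketch (the name `buildChunkedTerms` and the field `code_id` are not from yii2-elasticsearch or the original code); it relies on PHP's built-in `array_chunk`, so every slice is non-empty by construction. The caller is expected to skip the clause when the value list is empty.

```php
<?php
/**
 * Build a bool/should clause made of terms queries, each holding at most
 * $chunkSize values. Hypothetical helper, not part of yii2-elasticsearch.
 */
function buildChunkedTerms(string $field, array $values, int $chunkSize = 260): array
{
    $should = [];
    foreach (array_chunk($values, $chunkSize) as $chunk) {
        $should[] = ['terms' => [$field => $chunk]];
    }
    // A bool query containing only should clauses requires at least one of them to match.
    return ['bool' => ['should' => $should]];
}

// Usage with made-up ids: 1,300 values split into slices of 260.
$clause = buildChunkedTerms('code_id', range(1, 1300));
echo count($clause['bool']['should']), " terms clauses\n";
```

The returned array can be appended to `$must` in the same way as `$slice_must` in the method above.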
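Side note
Optimizations worth considering in other scenarios:
1. terms aggregations
2. scroll cursors
3. Use filter instead of must when the score does not need to be computed
4. Check the ES servers, nodes, replicas, and shards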
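As an illustration of point 3, the exact-match clauses from this article's query could be moved into filter context, since the results are sorted by price and the relevance score is not used. The builder below is a hypothetical sketch with an assumed `code_id` field and made-up ids, not the author's production code.

```php
<?php
// Hypothetical builder: the article's exact-match clauses placed in filter context.
function buildFilterQuery(array $codeIds, $isGroup): array
{
    return [
        'bool' => [
            // Clauses under filter are not scored, and Elasticsearch may cache them.
            'filter' => [
                ['term'  => ['is_group' => $isGroup]],
                ['terms' => ['code_id' => $codeIds]], // field name assumed
            ],
        ],
    ];
}

$query = buildFilterQuery([101, 102, 103], 1);
echo json_encode($query), "\n";
```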