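Searching Elasticsearch with input such as errrrrrrrrrrrrrrrrrrrrr fails with too_complex_to_determinize_exception.
The cause is that the text carried by the query is too long: the regular expression built from it becomes too complex, and determinizing its automaton would produce more than 10000 states (the default limit).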
"caused_by": {
"type": "too_complex_to_determinize_exception",
"reason": "too_complex_to_determinize_exception: Determinizing .*\\,ER\\,[^\\|]\\|R[^\\|]*\\|R[^\\|]*\\|R[^\\|]*\\|R[^\\|]*\\|R[^\\|]*\\|R[^\\|]*\\|R[^\\|]*\\|R[^\\|]*\\|R[^\\|]*\\|R[^\\|]*\\|R.* would result in more than 10000 states.",
"caused_by": {
"type": "too_complex_to_determinize_exception",
"reason": "too_complex_to_determinize_exception: Determinizing automaton with 52 states and 95 transitions would result in more than 10000 states."
}
}
}
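【Solutions】
- Filter characters (e.g. let only Chinese characters, English letters, and digits take part in the search; here special characters are replaced with spaces)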
String afterKeyword = keyword.replaceAll("[!$^&*+=|';'\\\"<>/?~#%……&*——|{}‘']", " ");
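- Limit the length (e.g. use only the first 20 characters in the search)
- Set a custom maximum number of determinized states (a higher limit lets the query run but costs more memory and CPU)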
query.should(QueryBuilders.regexpQuery(fieldName, reg).maxDeterminizedStates(100000).boost(1.0F));
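The first two options can be combined into one small helper. The following is only a minimal sketch: the class name KeywordSanitizer, the method sanitize, and the MAX_LENGTH constant are hypothetical names introduced here, and the allowed range U+4E00..U+9FA5 is an assumption about what "Chinese" should cover. It uses a whitelist, so every regex metacharacter is dropped, not just the ones listed by hand.

```java
import java.util.regex.Pattern;

// Hypothetical helper: cleans a user keyword before it is used to build a regexp query.
final class KeywordSanitizer {
    // Assumption: "Chinese, English, digits" means U+4E00..U+9FA5, a-z, A-Z, 0-9.
    // Any run of other characters (including regex metacharacters) becomes one space.
    private static final Pattern NOT_ALLOWED =
            Pattern.compile("[^\\u4e00-\\u9fa5a-zA-Z0-9]+");

    // Assumption: 20 characters, matching the example in the list above.
    private static final int MAX_LENGTH = 20;

    static String sanitize(String keyword) {
        if (keyword == null) {
            return "";
        }
        // Step 1: filter characters.
        String cleaned = NOT_ALLOWED.matcher(keyword).replaceAll(" ").trim();
        // Step 2: limit the length so the generated regex stays small.
        if (cleaned.length() > MAX_LENGTH) {
            cleaned = cleaned.substring(0, MAX_LENGTH).trim();
        }
        return cleaned;
    }
}
```

Calling KeywordSanitizer.sanitize(keyword) before building reg keeps the pattern short, so raising maxDeterminizedStates may not be needed at all.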