ElasticSearch Study Notes (33): Extending the IK Analyzer Dictionary, and Tokenized Aggregation Queries on text Full-Text Fields
Tokenization fails on domain-specific terms
We have already seen that the IK analyzer does a good job of tokenizing Chinese text full-text fields. However, for certain domain-specific vocabulary (person names, book titles, proprietary terms, and so on), IK does not split the text the way we expect.
For example, let's run the following analysis:
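To make the shape of an `_analyze` response concrete, here is a small Python sketch (not part of the original notes) that checks each token's `start_offset`/`end_offset` against the source string. The response dict below is hand-written in the same shape the API returns; it is not taken from a live cluster:

```python
# Illustrative _analyze-style response: each token carries the character
# offsets it was sliced from. This sample data is an assumption, not
# output captured from Elasticsearch.
response = {
    "tokens": [
        {"token": "斗", "start_offset": 0, "end_offset": 1, "type": "CN_CHAR", "position": 0},
        {"token": "破", "start_offset": 1, "end_offset": 2, "type": "CN_CHAR", "position": 1},
        {"token": "苍穹", "start_offset": 2, "end_offset": 4, "type": "CN_WORD", "position": 2},
    ]
}

text = "斗破苍穹"

def tokens_match_source(resp, source):
    """Verify that slicing the source by each token's offsets yields that token."""
    return all(
        source[t["start_offset"]:t["end_offset"]] == t["token"]
        for t in resp["tokens"]
    )

print(tokens_match_source(response, text))  # -> True
```

Slicing the source string by the reported offsets is a quick sanity check that the analyzer split the text where you think it did.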
GET _analyze?pretty
{
  "analyzer": "ik_smart",
  "text": ["斗破苍穹真好看"]
}
The result is as follows:
{
  "tokens": [
    {
      "token": "斗",
      "start_offset": 0,
      "end_offset": 1,
      "type": "CN_CHAR",
      "position": 0
    },
    {
      "token": "破",
      "start_offset": 1,
      "end_offset": 2,
      "type": "CN_CHAR",
      "position": 1
    },
    {
      "token": "苍穹",
      "start_offset": 2,
      "end_offset": 4,
      "type": "CN_WORD",
      "position": 2
    },
    {
      "token": "真",
      "start_offset": 4,
      "end_offset": 5,
      "type": "CN_CHAR",
      "position": 3
    },
    {
      "token": "好看",
      "start_offset": 5,
      "end_offset": 7,
      "type": "CN_WORD",
      "position": 4