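There's no need to explain what synonyms are, so let's go straight to how to set them up.
Test environment: ES 5.5.1
1. Synonym dictionary
Prepare a synonym dictionary with one synonym group per line, terms separated by commas. Example, syno.dic: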
西红柿,番茄,tomato
马铃薯,土豆
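Dictionary location: elasticsearch-5.5.1/config/analysis (the synonyms_path setting used in the next step is resolved relative to the config directory)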
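A malformed line in the dictionary is easy to miss until index creation fails, so it can help to check the file first. The following is a minimal sketch, not part of the original setup: the function name parse_synonym_groups is hypothetical, and it only handles the comma-separated form used above (it assumes blank lines and lines starting with # can be skipped, and it does not handle "=>" mappings).

```python
def parse_synonym_groups(text):
    """Parse comma-separated synonym groups, one group per line."""
    groups = []
    for line_no, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        # Assumption: blank lines and '#' comment lines carry no synonyms.
        if not line or line.startswith("#"):
            continue
        terms = [term.strip() for term in line.split(",")]
        # A usable group needs at least two terms, none of them empty.
        if len(terms) < 2 or any(not term for term in terms):
            raise ValueError(
                f"line {line_no}: expected at least two non-empty terms, got {raw!r}"
            )
        groups.append(terms)
    return groups


# The sample dictionary from this post.
SAMPLE = "西红柿,番茄,tomato\n马铃薯,土豆\n"
groups = parse_synonym_groups(SAMPLE)
print(groups)
```

To check a real file, read it as UTF-8 (for example with open(path, encoding="utf-8").read()) and pass the text to the function.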
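2. Configure the analyzer
The ik_smart tokenizer used here comes from the IK analysis plugin, which must already be installed.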
DELETE syno
PUT syno
{
  "settings": {
    "analysis": {
      "filter": {
        "my_synonym_filter": {
          "type": "synonym",
          "synonyms_path": "analysis/syno.dic"
        }
      },
      "analyzer": {
        "my_synonyms": {
          "tokenizer": "ik_smart",
          "filter": [
            "lowercase",
            "my_synonym_filter"
          ]
        }
      }
    }
  }
}
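If the index is created from a script rather than from the Console, the same settings body can be built programmatically. This is only a sketch under stated assumptions: build_synonym_settings is a hypothetical helper, its defaults simply mirror the values above, and it builds and prints the JSON without sending anything to Elasticsearch.

```python
import json


def build_synonym_settings(synonyms_path="analysis/syno.dic", tokenizer="ik_smart"):
    """Return the index settings body shown above as a Python dict."""
    return {
        "settings": {
            "analysis": {
                "filter": {
                    # Synonym token filter backed by the dictionary file.
                    "my_synonym_filter": {
                        "type": "synonym",
                        "synonyms_path": synonyms_path,
                    }
                },
                "analyzer": {
                    # Tokenize first, then lowercase, then expand synonyms.
                    "my_synonyms": {
                        "tokenizer": tokenizer,
                        "filter": ["lowercase", "my_synonym_filter"],
                    }
                },
            }
        }
    }


body = build_synonym_settings()
print(json.dumps(body, ensure_ascii=False, indent=2))
```

The printed JSON is what would be sent as the body of the PUT syno request. Keeping lowercase ahead of my_synonym_filter means tokens are already lowercased when they are matched against dictionary entries such as tomato.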
3. Test the analysis output
GET /syno/_analyze
{
  "text": "我爱吃土豆",
  "analyzer": "my_synonyms"
}
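Result (synonyms are emitted at the same position as the original token; note that the tudou token appears only if the potato line of the dictionary also lists tudou, which the two-line sample in section 1 does not):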
{
  "tokens": [
    {
      "token": "我",
      "start_offset": 0,
      "end_offset": 1,
      "type": "CN_CHAR",
      "position": 0
    },
    {
      "token": "爱吃",
      "start_offset": 1,
      "end_offset": 3,
      "type": "CN_WORD",
      "position": 1
    },
    {
      "token": "土豆",
      "start_offset": 3,
      "end_offset": 5,
      "type": "CN_WORD",
      "position": 2
    },
    {
      "token": "马铃薯",
      "start_offset": 3,
      "end_offset": 5,
      "type": "SYNONYM",
      "position": 2
    },
    {
      "token": "tudou",
      "start_offset": 3,
      "end_offset": 5,
      "type": "SYNONYM",
      "position": 2
    }
  ]
}
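The key thing to look for in this response is that the synonym tokens share a position (and offsets) with the original token. A small sketch for checking that from a script follows; tokens_by_position is a hypothetical helper, and the response dict is the output above trimmed to the two fields it needs.

```python
from collections import defaultdict


def tokens_by_position(response):
    """Group the token strings of an _analyze response by their position."""
    grouped = defaultdict(list)
    for tok in response["tokens"]:
        grouped[tok["position"]].append(tok["token"])
    return dict(grouped)


# The _analyze output from this post, reduced to token and position.
response = {
    "tokens": [
        {"token": "我", "position": 0},
        {"token": "爱吃", "position": 1},
        {"token": "土豆", "position": 2},
        {"token": "马铃薯", "position": 2},
        {"token": "tudou", "position": 2},
    ]
}

stacked = tokens_by_position(response)
for position in sorted(stacked):
    print(position, stacked[position])
```

Any position that maps to more than one token is a place where the synonym filter expanded the input.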