A Simple Synonym Search Setup for Elasticsearch Documents Using the IK Analyzer
Environment:
elasticsearch 6.4.3
kibana 6.4.3
IK analyzer plugin 6.4.3
Note: all three versions must match.
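1. Create an index that uses the IK analyzer
Create an index named "syno" whose custom analyzer is built on the IK tokenizer.
Note: the synonyms.txt file (UTF-8 encoded) contains the synonym rule below, where 西红柿 and 番茄 are two Chinese words for "tomato". The synonyms_path setting is resolved relative to the Elasticsearch config directory.
西红柿,番茄 =>西红柿,番茄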
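To make the effect of this rule concrete, here is a minimal Python sketch of how an explicit mapping of the form "a,b => a,b" rewrites tokens. It is only an illustration of the rule format, not the synonym filter's actual implementation; the helper names parse_synonym_rule and expand_tokens are hypothetical, and the sketch assumes ik_smart emits each of the two words as a single token.

```python
def parse_synonym_rule(line):
    """Parse one Solr-style synonym rule into {input_token: [output_tokens]}."""
    line = line.strip()
    if "=>" in line:
        # Explicit mapping: every token on the left is replaced by all tokens on the right.
        left, right = line.split("=>", 1)
        inputs = [t.strip() for t in left.split(",") if t.strip()]
        outputs = [t.strip() for t in right.split(",") if t.strip()]
    else:
        # Equivalent synonyms: each token expands to the whole group.
        inputs = [t.strip() for t in line.split(",") if t.strip()]
        outputs = list(inputs)
    return {token: list(outputs) for token in inputs}


def expand_tokens(tokens, rules):
    """Replace each token by its synonyms; tokens without a rule pass through unchanged."""
    expanded = []
    for token in tokens:
        expanded.extend(rules.get(token, [token]))
    return expanded


rules = parse_synonym_rule("西红柿,番茄 =>西红柿,番茄")
indexed = expand_tokens(["西红柿"], rules)  # tokens stored for a document named 西红柿
queried = expand_tokens(["番茄"], rules)    # tokens produced for a query for 番茄
print(set(indexed) & set(queried))          # shared tokens are what makes the match possible
```

Because both words expand to the same pair of tokens, a document indexed under one word shares tokens with a query for the other.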
Request to create the index:
PUT /syno
{
  "settings": {
    "number_of_shards": 10,
    "analysis": {
      "filter": {
        "my_synonym_filter": {
          "type": "synonym",
          "synonyms_path": "analysis/synonyms.txt"
        }
      },
      "analyzer": {
        "my_synonyms": {
          "type": "custom",
          "tokenizer": "ik_smart",
          "filter": [
            "lowercase",
            "my_synonym_filter"
          ]
        }
      }
    }
  }
}
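Before moving on, it can help to confirm that the new analyzer really emits both synonyms. The sketch below builds a request to the index's _analyze API using only the Python standard library. The address http://localhost:9200 is an assumption about the local setup, and the helper names are hypothetical; the request is built but not sent, so the snippet runs without a cluster.

```python
import json
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: a local node on the default port


def build_analyze_request(index, analyzer, text):
    """Build (but do not send) a request to the _analyze API of the given index."""
    body = json.dumps({"analyzer": analyzer, "text": text}, ensure_ascii=False)
    return urllib.request.Request(
        url=f"{ES_URL}/{index}/_analyze",
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_tokens(request):
    """Send the request and return the token strings from the response."""
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    return [item["token"] for item in payload["tokens"]]


request = build_analyze_request("syno", "my_synonyms", "番茄")
print(request.get_method(), request.full_url)
# fetch_tokens(request) needs a running cluster, so it is not called here.
```

With a running cluster, calling fetch_tokens(request) should list the tokens the analyzer produces for 番茄; if the synonym file was loaded, both words are expected to appear.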
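2. Mapping: assign a type and an analyzer to each field
Note: the string type is no longer supported as of Elasticsearch 6.0; see the official Elasticsearch mapping documentation (Chinese edition) for details.
Configure the mapping for documents of type "fruit" in the index "syno".
Example configuration (fields "id", "name", and "detail"):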
PUT /syno/_mapping/fruit
{
  "properties": {
    "id": {
      "type": "text",
      "store": true
    },
    "name": {
      "type": "text",
      "index": true,
      "analyzer": "my_synonyms",
      "search_analyzer": "my_synonyms"
    },
    "detail": {
      "type": "text"
    }
  }
}
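Note: the "name" field must be of type "text" here. If it is "keyword", the mapping request fails, because keyword fields are not analyzed and do not accept the analyzer and search_analyzer parameters. See the official Elasticsearch mapping documentation (English edition) for the difference between the two types.
The error message is: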
{
  "error": {
    "root_cause": [
      {
        "type": "mapper_parsing_exception",
        "reason": "Mapping definition for [name] has unsupported parameters: [analyzer : my_synonyms] [search_analyzer : my_synonyms]"
      }
    ],
    "type": "mapper_parsing_exception",
    "reason": "Mapping definition for [name] has unsupported parameters: [analyzer : my_synonyms] [search_analyzer : my_synonyms]"
  },
  "status": 400
}
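This kind of mistake can be caught before the request is sent. The following sketch is a purely local pre-check written for illustration; it is not how Elasticsearch validates mappings, and the function name find_unsupported_analyzers is hypothetical. It flags keyword fields that declare analyzer settings.

```python
def find_unsupported_analyzers(properties):
    """Return {field: [params]} for keyword fields that declare analyzer settings."""
    analyzer_params = ("analyzer", "search_analyzer")
    problems = {}
    for field, config in properties.items():
        if config.get("type") == "keyword":
            bad = [param for param in analyzer_params if param in config]
            if bad:
                problems[field] = bad
    return problems


# A faulty variant of the mapping above, with "name" declared as keyword.
faulty_properties = {
    "id": {"type": "text", "store": True},
    "name": {
        "type": "keyword",
        "analyzer": "my_synonyms",
        "search_analyzer": "my_synonyms",
    },
    "detail": {"type": "text"},
}

problems = find_unsupported_analyzers(faulty_properties)
print(problems)
```

Changing the type of "name" back to "text" makes the check return an empty result.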
3. Prepare test data
Add a test document to the index.
Document to add:
PUT /syno/fruit/1
{
  "id": "1",
  "name": "西红柿",
  "detail": "美味又好吃。"
}
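If more than one test document is needed, the same data can be sent in a single request to the _bulk API, which expects newline-delimited JSON: one action line followed by one source line per document. The sketch below only builds that body; the helper name build_bulk_body is hypothetical, and the second document is invented purely for illustration.

```python
import json


def build_bulk_body(index, doc_type, docs):
    """Build a newline-delimited _bulk body that indexes each document under its "id"."""
    lines = []
    for doc in docs:
        action = {"index": {"_index": index, "_type": doc_type, "_id": doc["id"]}}
        lines.append(json.dumps(action, ensure_ascii=False))
        lines.append(json.dumps(doc, ensure_ascii=False))
    # The bulk body must end with a newline.
    return "\n".join(lines) + "\n"


docs = [
    {"id": "1", "name": "西红柿", "detail": "美味又好吃。"},
    {"id": "2", "name": "番茄", "detail": "hypothetical second document"},
]

bulk_body = build_bulk_body("syno", "fruit", docs)
print(bulk_body)
```

The resulting text would be sent as the body of POST /_bulk with the Content-Type header set to application/x-ndjson.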
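4. Search with a synonym and check the result
Search for the document using a synonym of the indexed term and verify the result.
Search request: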
GET /syno/fruit/_search
{
  "query": {
    "match": {
      "name": "番茄"
    }
  }
}
Search result:
{
  "took": 2,
  "timed_out": false,
  "_shards": {
    "total": 10,
    "successful": 10,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 0.46029136,
    "hits": [
      {
        "_index": "syno",
        "_type": "fruit",
        "_id": "1",
        "_score": 0.46029136,
        "_source": {
          "id": "1",
          "name": "西红柿",
          "detail": "美味又好吃。"
        }
      }
    ]
  }
}
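When this check is scripted rather than read by eye, the relevant parts of the response can be pulled out with a few lines of Python. The sketch below assumes the 6.x response shape shown above, where hits.total is a plain integer; the function name summarize_hits is hypothetical, and the response is a trimmed copy of the result above.

```python
def summarize_hits(response):
    """Return the total hit count and a list of (_id, name) pairs from a search response."""
    hits = response["hits"]
    pairs = [(hit["_id"], hit["_source"]["name"]) for hit in hits["hits"]]
    return hits["total"], pairs


# Trimmed copy of the search result for the query 番茄.
response = {
    "hits": {
        "total": 1,
        "max_score": 0.46029136,
        "hits": [
            {
                "_index": "syno",
                "_type": "fruit",
                "_id": "1",
                "_score": 0.46029136,
                "_source": {"id": "1", "name": "西红柿", "detail": "美味又好吃。"},
            }
        ],
    }
}

total, pairs = summarize_hits(response)
print(total, pairs)
```

The query term 番茄 does not appear in the returned document, whose name is 西红柿, which shows that the match came from the synonym rule.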
That completes this simple walkthrough of synonym-based document search with the IK analyzer.