Elasticsearch Synonym Configuration

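Copyright notice: this is an original article by the blog author; reproduction without the author's permission is prohibited (http://blog.csdn.net/napoay). https://blog.csdn.net/napoay/article/details/80825381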

This post skips the definition of synonyms and goes straight to how to configure them.

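Test environment: Elasticsearch 5.5.1, with the IK analysis plugin installed (it provides the ik_smart tokenizer used below).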

1. Synonym dictionary

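Prepare a synonym dictionary file with one synonym group per line and the terms in a group separated by commas. For example, syno.dic: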

西红柿,番茄,tomato
马铃薯,土豆,tudou

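Dictionary location: elasticsearch-5.5.1/config/analysis. The synonyms_path setting is resolved relative to the config directory, so this file is referenced as analysis/syno.dic.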
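Before copying a dictionary into the config directory, it can help to check that every line parses as a synonym group. The following is a minimal Python sketch, not part of Elasticsearch: the function name parse_synonym_lines is hypothetical, and it assumes only the comma-separated form shown above (blank lines and lines starting with # are skipped; the explicit "=>" mapping form is not handled).

```python
from typing import List


def parse_synonym_lines(lines: List[str]) -> List[List[str]]:
    """Parse comma-separated synonym groups, one group per line.

    Hypothetical helper for checking a dictionary file offline.
    Assumes the simple "a,b,c" form only; "=>" mappings are not handled.
    """
    groups = []
    for raw in lines:
        line = raw.strip()
        # Skip blank lines and '#' comment lines.
        if not line or line.startswith("#"):
            continue
        terms = [t.strip() for t in line.split(",") if t.strip()]
        if len(terms) < 2:
            raise ValueError(f"a synonym group needs at least two terms: {line!r}")
        groups.append(terms)
    return groups


# Sample lines in the same format as syno.dic.
sample = ["西红柿,番茄,tomato", "", "马铃薯,土豆"]
groups = parse_synonym_lines(sample)
print(groups)
```

To check a real file, pass its lines instead of the sample, for example open("syno.dic", encoding="utf-8").read().splitlines().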

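2. Configure the analyzer

Delete any existing syno index, then recreate it with a synonym token filter that reads the dictionary and a custom analyzer that applies lowercase and that filter after the ik_smart tokenizer: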

DELETE syno

PUT syno
{
  "settings": {
    "analysis": {
      "filter": {
        "my_synonym_filter": {
          "type": "synonym",
          "synonyms_path" : "analysis/syno.dic"
        }
      },
      "analyzer": {
        "my_synonyms": {
          "tokenizer": "ik_smart",
          "filter": [
            "lowercase",
            "my_synonym_filter"
          ]
        }
      }
    }
  }
}
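The same index settings can be built and sent from a script instead of the Kibana Console. The sketch below uses only the Python standard library. The helper names build_synonym_settings and put_index are hypothetical, and the node address http://localhost:9200 is an assumption; neither comes from the original post. The request function is defined but not called, so the block runs without a live cluster.

```python
import json
import urllib.request


def build_synonym_settings(synonyms_path: str, tokenizer: str = "ik_smart") -> dict:
    """Return the index-creation body used above as a Python dict."""
    return {
        "settings": {
            "analysis": {
                "filter": {
                    "my_synonym_filter": {
                        "type": "synonym",
                        "synonyms_path": synonyms_path,
                    }
                },
                "analyzer": {
                    "my_synonyms": {
                        "tokenizer": tokenizer,
                        # lowercase runs first, so dictionary entries
                        # should themselves be lowercase.
                        "filter": ["lowercase", "my_synonym_filter"],
                    }
                },
            }
        }
    }


def put_index(index: str, body: dict, base_url: str = "http://localhost:9200") -> str:
    """Send PUT /<index> with the given body (base_url is an assumed address)."""
    request = urllib.request.Request(
        f"{base_url}/{index}",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")


body = build_synonym_settings("analysis/syno.dic")
print(json.dumps(body, ensure_ascii=False, indent=2))
```

With a node running, put_index("syno", body) would create the index; like the console version, it fails if the index already exists, which is why the console example deletes it first.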

3. Test the analysis

GET /syno/_analyze
{
  "text":"我爱吃土豆",
  "analyzer": "my_synonyms"
}

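Result. The synonyms of 土豆 are emitted at the same position (2) and with the same offsets as the original token, with type SYNONYM: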

{
  "tokens": [
    {
      "token": "我",
      "start_offset": 0,
      "end_offset": 1,
      "type": "CN_CHAR",
      "position": 0
    },
    {
      "token": "爱吃",
      "start_offset": 1,
      "end_offset": 3,
      "type": "CN_WORD",
      "position": 1
    },
    {
      "token": "土豆",
      "start_offset": 3,
      "end_offset": 5,
      "type": "CN_WORD",
      "position": 2
    },
    {
      "token": "马铃薯",
      "start_offset": 3,
      "end_offset": 5,
      "type": "SYNONYM",
      "position": 2
    },
    {
      "token": "tudou",
      "start_offset": 3,
      "end_offset": 5,
      "type": "SYNONYM",
      "position": 2
    }
  ]
}
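One way to read this response is to group tokens by their position: any position holding more than one token is a place where synonyms were injected. The following Python sketch does that on a trimmed copy of the response above (offsets omitted); the function name group_by_position is hypothetical.

```python
from collections import defaultdict


def group_by_position(tokens):
    """Map each token position to the list of tokens emitted there."""
    grouped = defaultdict(list)
    for tok in tokens:
        grouped[tok["position"]].append(tok["token"])
    # Return a plain dict ordered by position.
    return dict(sorted(grouped.items()))


# Trimmed copy of the _analyze response shown above.
response = {
    "tokens": [
        {"token": "我", "type": "CN_CHAR", "position": 0},
        {"token": "爱吃", "type": "CN_WORD", "position": 1},
        {"token": "土豆", "type": "CN_WORD", "position": 2},
        {"token": "马铃薯", "type": "SYNONYM", "position": 2},
        {"token": "tudou", "type": "SYNONYM", "position": 2},
    ]
}

stacked = group_by_position(response["tokens"])
for position, terms in stacked.items():
    print(position, terms)
```

Because the synonyms share a position with the original token, a query analyzed with the same analyzer can match a document on any term in the group.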
