- Download the elasticsearch-analysis-pinyin plugin (pinyin analyzer)
https://codeload.github.com/medcl/elasticsearch-analysis-pinyin/zip/v6.4.3
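Unzip the archive. The pinyin plugin version must match your Elasticsearch version exactly. If it does not, change the Elasticsearch version in pom.xml to the version you run, then rebuild the plugin with Maven.
- Compile and package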
mvn clean install -Dmaven.test.skip
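- After the build succeeds, find the packaged file under the target\releases directory
Copy the file out, unzip it, and rename the extracted folder to pinyin.
Upload the pinyin folder to the plugins directory under the Elasticsearch installation directory on the server.
- Start Elasticsearch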
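Since the plugin and Elasticsearch versions have to match, it can help to check the uploaded plugin before starting the node. The following is a minimal sketch, not part of the original setup: `PluginVersionCheck` is a hypothetical helper, the descriptor path assumes the installation directory used later in this post, and `6.4.3` is taken from the download URL. It reads the `elasticsearch.version` entry from the plugin's `plugin-descriptor.properties` and compares it with the version you expect.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Properties;

// Hypothetical helper for illustration; not part of Elasticsearch or the plugin.
class PluginVersionCheck {

    // Returns the "elasticsearch.version" entry of a plugin descriptor, or null if absent.
    static String declaredVersion(Reader descriptor) {
        Properties props = new Properties();
        try {
            props.load(descriptor);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty("elasticsearch.version");
    }

    // True only when the descriptor declares exactly the given Elasticsearch version.
    static boolean matches(Reader descriptor, String esVersion) {
        return esVersion.equals(declaredVersion(descriptor));
    }

    public static void main(String[] args) throws IOException {
        // Assumed default location; pass your own path as the first argument.
        Path descriptor = Paths.get(args.length > 0 ? args[0]
                : "/usr/local/elasticsearch6.4/plugins/pinyin/plugin-descriptor.properties");
        String esVersion = args.length > 1 ? args[1] : "6.4.3";

        if (!Files.exists(descriptor)) {
            System.out.println("descriptor not found: " + descriptor);
            return;
        }
        try (Reader reader = Files.newBufferedReader(descriptor, StandardCharsets.UTF_8)) {
            System.out.println(matches(reader, esVersion) ? "versions match" : "version mismatch");
        }
    }
}
```

If the check reports a mismatch, rebuilding the plugin against the right version (as described above) is likely simpler than debugging a failed node startup.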
su xiaobo
cd /usr/local/elasticsearch6.4/bin
./elasticsearch -d
- Test with Postman
URL: 192.168.2.115:9200/_analyze
Body: { "analyzer": "pinyin", "text": "华为" }
- Test result
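The same request can also be built in code rather than in Postman. This is a rough sketch under a few assumptions: it uses the JDK's `java.net.http` client (Java 11 or later), `AnalyzeRequestSketch` and its methods are hypothetical names, and the JSON escaping only handles quotes and backslashes. The sample text is the two-character word from the request above, written with Unicode escapes so the source stays ASCII-only.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch of the _analyze call; names are illustrative only.
class AnalyzeRequestSketch {

    // Minimal JSON string quoting: escapes backslash and double quote only.
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    // Builds the request body: {"analyzer":...,"text":...}
    static String analyzeBody(String analyzer, String text) {
        return "{\"analyzer\":" + quote(analyzer) + ",\"text\":" + quote(text) + "}";
    }

    // Builds (but does not send) a POST to <baseUrl>/_analyze with a JSON body.
    static HttpRequest analyzeRequest(String baseUrl, String analyzer, String text) {
        return HttpRequest.newBuilder(URI.create(baseUrl + "/_analyze"))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(
                        analyzeBody(analyzer, text), StandardCharsets.UTF_8))
                .build();
    }

    // Sends the request; requires a reachable Elasticsearch node.
    static String send(HttpRequest request) throws Exception {
        return HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString())
                .body();
    }

    public static void main(String[] args) {
        // Unicode escapes for the two-character sample text used in the Postman test.
        String text = "\u534E\u4E3A";
        HttpRequest request = analyzeRequest("http://192.168.2.115:9200", "pinyin", text);
        // Only prints the request; call send(request) against a running node to see tokens.
        System.out.println(request.method() + " " + request.uri());
        System.out.println(analyzeBody("pinyin", text));
    }
}
```

Keeping request construction separate from `send` makes the body easy to inspect without a running cluster.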
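- My existing index was set up with the IK analyzer. To use pinyin together with IK, you have to define a custom analyzer (ik + pinyin)
- Check which analyzer the existing index uses
GET /goods/_mapping
Result: the name field of the goods index uses the IK analyzer.
- Delete the index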
DELETE /goods
- Define the custom analyzer ik_smart_pinyin
The ik_smart_pinyin analyzer combines IK (as the tokenizer) with pinyin (as a token filter).
PUT /goods
{
  "settings": {
    "analysis": {
      "analyzer": {
        "ik_smart_pinyin": {
          "type": "custom",
          "tokenizer": "ik_smart",
          "filter": ["my_pinyin", "word_delimiter"]
        },
        "ik_max_word_pinyin": {
          "type": "custom",
          "tokenizer": "ik_max_word",
          "filter": ["my_pinyin", "word_delimiter"]
        }
      },
      "filter": {
        "my_pinyin": {
          "type": "pinyin",
          "keep_separate_first_letter": true,
          "keep_full_pinyin": true,
          "keep_original": true,
          "limit_first_letter_length": 16,
          "lowercase": true,
          "remove_duplicated_term": true
        }
      }
    }
  }
}
- Recreate the type mapping with the pinyin analyzer
The name field (and detail) now uses the new analyzer.
POST /goods/_mapping/goods
{
  "goods": {
    "properties": {
      "@timestamp": { "type": "date" },
      "@version": {
        "type": "text",
        "fields": { "keyword": { "type": "keyword", "ignore_above": 256 } }
      },
      "attribute_list": {
        "type": "text",
        "fields": { "keyword": { "type": "keyword", "ignore_above": 256 } }
      },
      "category_id": { "type": "long" },
      "created_time": { "type": "date" },
      "detail": {
        "type": "text",
        "analyzer": "ik_smart_pinyin",
        "search_analyzer": "ik_smart_pinyin"
      },
      "id": { "type": "long" },
      "main_image": {
        "type": "text",
        "fields": { "keyword": { "type": "keyword", "ignore_above": 256 } }
      },
      "name": {
        "type": "text",
        "analyzer": "ik_smart_pinyin",
        "search_analyzer": "ik_smart_pinyin"
      },
      "revision": { "type": "long" },
      "status": { "type": "long" },
      "sub_images": {
        "type": "text",
        "fields": { "keyword": { "type": "keyword", "ignore_above": 256 } }
      },
      "subtitle": {
        "type": "text",
        "analyzer": "ik_smart",
        "search_analyzer": "ik_smart"
      },
      "updated_time": { "type": "date" }
    }
  }
}
- Start Logstash to re-sync the data
cd /usr/local/logstash-6.4.3
./bin/logstash -f mysql.conf
- Test code
URL: http://127.0.0.1:8500/search?name=huawei
@Override
public BaseResponse<List<ProductDto>> search(String name) {
    // 1. Build the query conditions
    BoolQueryBuilder builder = QueryBuilders.boolQuery();
    // 2. Full-text match on the name, subtitle and detail fields
    builder.must(QueryBuilders.multiMatchQuery(name, "name", "subtitle", "detail"));
    Pageable pageable = new QPageRequest(0, 5);
    // 3. Run the search through the ES repository
    Page<ProductEntity> page = productReposiory.search(builder, pageable);
    // 4. Get the result list
    List<ProductEntity> content = page.getContent();
    // 5. Convert the entities to DTOs
    MapperFactory mapperFactory = new DefaultMapperFactory.Builder().build();
    List<ProductDto> mapAsList = mapperFactory.getMapperFacade().mapAsList(content, ProductDto.class);
    return setResultSuccess(mapAsList);
}
URL: http://127.0.0.1:8500/search?name=%E5%8D%8E%E4%B8%BA
- With this, searching by pinyin and searching by Chinese characters return the same results.
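The second URL is simply the percent-encoded (UTF-8) form of name=华为. As a small illustration, the sketch below builds both test URLs with the JDK's `URLEncoder` (the `Charset` overload needs Java 10 or later). `SearchUrlSketch` and `searchUrl` are hypothetical names, and the sample text is written with Unicode escapes so the source stays ASCII-only.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical sketch: builds the two test URLs used against the search endpoint.
class SearchUrlSketch {

    // Appends the UTF-8 percent-encoded keyword as the "name" query parameter.
    static String searchUrl(String baseUrl, String name) {
        return baseUrl + "/search?name=" + URLEncoder.encode(name, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String base = "http://127.0.0.1:8500";
        // Pinyin keyword: plain ASCII, so it is left unchanged.
        System.out.println(searchUrl(base, "huawei"));
        // Chinese keyword (Unicode escapes for the two characters): percent-encoded.
        System.out.println(searchUrl(base, "\u534E\u4E3A"));
        // prints http://127.0.0.1:8500/search?name=%E5%8D%8E%E4%B8%BA
    }
}
```

Most browsers and Postman apply this encoding automatically; doing it explicitly matters mainly when the URL is assembled by hand in client code.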
Configuring the pinyin analyzer and a custom analyzer in Elasticsearch