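Everything so far has dealt with plain English text. English separates words with spaces, so ES can tokenize it out of the box. Searching Chinese text requires installing an additional plugin.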
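Download the elasticsearch-analysis-ik plugin
Unzipping the download produces the elasticsearch-analysis-ik-6.5.4 directory. Copy that directory into the plugins directory under the ES installation path. On my development machine, for example, the full directory structure is as follows: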
Restart ES
Check that the plugin is active
Open http://localhost:9200/_cat/plugins in a browser. Output similar to the following means the installation succeeded:
sRdVRrd analysis-ik 6.5.4
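The same check can be scripted instead of eyeballed in the browser. The following is only a minimal sketch: the class name CatPluginsCheck and the method hasPlugin are hypothetical, and it assumes each line of the _cat/plugins output has the form "node plugin version", as in the sample above. Fetching the text from ES is left out; the sketch only inspects text you already have.

```java
/** Checks whether a plugin is listed in the text returned by GET /_cat/plugins. */
final class CatPluginsCheck {

    /**
     * Returns true if any line of the output names the given plugin.
     * Assumes whitespace-separated columns: node name, plugin name, version.
     */
    static boolean hasPlugin(String catOutput, String pluginName) {
        for (String line : catOutput.split("\\R")) {
            String[] columns = line.trim().split("\\s+");
            // The plugin name is the second column.
            if (columns.length >= 2 && columns[1].equals(pluginName)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Sample line copied from the output shown above.
        String sample = "sRdVRrd analysis-ik 6.5.4";
        System.out.println(hasPlugin(sample, "analysis-ik"));
    }
}
```

On a multi-node cluster the output has one line per node and plugin, so a stricter check would confirm that every node lists analysis-ik.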
Configure the analyzer
Add the following annotation to the text property of the Blog class
@Field(type = FieldType.Text, analyzer = "ik_max_word", searchAnalyzer = "ik_smart")
private String text;
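Here, analyzer sets the index-time tokenizer to ik_max_word mode, which splits the text at the finest granularity. searchAnalyzer sets the search-time tokenizer to ik_smart mode, which splits the text at the coarsest granularity.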
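To see how the two modes differ on a concrete sentence, you can send the same text to the ES _analyze endpoint with each analyzer and compare the returned tokens arrays. The following is a rough sketch using only the JDK. The class IkAnalyzeDemo and its methods are hypothetical names, the ES address http://localhost:9200 is assumed from the URL used earlier, and the hand-rolled JSON escaping covers only quotes and backslashes.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;

final class IkAnalyzeDemo {

    /** Builds the JSON body for POST /_analyze. */
    static String analyzeBody(String analyzer, String text) {
        return "{\"analyzer\":\"" + escape(analyzer) + "\",\"text\":\"" + escape(text) + "\"}";
    }

    /** Minimal escaping: backslash and double quote only (an assumption for this sketch). */
    private static String escape(String s) {
        return s.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    /** Sends the request and returns the raw JSON response as a string. */
    static String analyze(String baseUrl, String analyzer, String text) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(baseUrl + "/_analyze").openConnection();
        conn.setRequestMethod("POST");
        conn.setRequestProperty("Content-Type", "application/json");
        conn.setDoOutput(true);
        try (OutputStream out = conn.getOutputStream()) {
            out.write(analyzeBody(analyzer, text).getBytes(StandardCharsets.UTF_8));
        }
        try (InputStream in = conn.getInputStream();
             Scanner scanner = new Scanner(in, "UTF-8").useDelimiter("\\A")) {
            return scanner.hasNext() ? scanner.next() : "";
        }
    }

    public static void main(String[] args) {
        String text = "我参观了北京大学";
        for (String analyzer : new String[] {"ik_max_word", "ik_smart"}) {
            try {
                System.out.println(analyzer + ": " + analyze("http://localhost:9200", analyzer, text));
            } catch (IOException e) {
                // ES is not running or the analyzer is not installed.
                System.out.println(analyzer + ": request failed: " + e.getMessage());
            }
        }
    }
}
```

Indexing with the finer-grained mode stores more candidate terms per document, while analyzing the query with the coarser mode keeps the search terms close to what the user typed.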
Rebuild the index and insert Chinese test data
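// Drop and recreate the index so the new analyzer settings apply to all documents
elasticsearchTemplate.deleteIndex("website");
elasticsearchTemplate.createIndex("website");
// Apply the mapping derived from the annotations on Blog
elasticsearchTemplate.putMapping(Blog.class);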
List<IndexQuery> indexQueries = Arrays.asList(
    new IndexQueryBuilder().withObject(new Blog(1, "Mary Jones", "Jane is an expert in her field", 80, parseDate("2019-06-21"))).build(),
    new IndexQueryBuilder().withObject(new Blog(2, "Jane Smith", "I am starting to get the hang of this...", 0, parseDate("2019-06-20"))).build(),
    new IndexQueryBuilder().withObject(new Blog(3, "John Smith", "The Query DSL is really powerful and flexible", 100, parseDate("2019-06-20"))).build(),
    new IndexQueryBuilder().withObject(new Blog(4, "Mary Jones", "Still trying this out...", 0, parseDate("2019-06-20"))).build(),
    new IndexQueryBuilder().withObject(new Blog(5, "Mary Jones", "However did I manage before Elasticsearch?", 200, parseDate("2019-06-19"))).build(),
    new IndexQueryBuilder().withObject(new Blog(6, "Jane Smith", "I like to collect rock albums", 0, parseDate("2019-06-19"))).build(),
    new IndexQueryBuilder().withObject(new Blog(7, "Douglas Fir", "I like to build cabinets", 50, parseDate("2019-06-19"))).build(),
    new IndexQueryBuilder().withObject(new Blog(8, "John Smith", "I love to go rock climbing", 40, parseDate("2019-06-18"))).build(),
    new IndexQueryBuilder().withObject(new Blog(9, "Mary Jones", "I am Mary Jones, welcome to my blog!", 500, parseDate("2019-06-17"))).build(),
    new IndexQueryBuilder().withObject(new Blog(10, "Mary Jones", "My first blog entry", 400, parseDate("2019-06-17"))).build(),
    new IndexQueryBuilder().withObject(new Blog(11, "小明", "试试中文分词", 5, parseDate("2019-06-21"))).build(),
    new IndexQueryBuilder().withObject(new Blog(12, "李三", "我参观了北京大学", 5, parseDate("2019-06-21"))).build()
);
elasticsearchTemplate.bulkIndex(indexQueries);
Test