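Background:
The project uses spring-data-elasticsearch annotations to configure the mapping and implement full-text search.
A new requirement called for spelling correction and autocomplete:
The material I found online says this can be done with the ES suggester feature, but most of it uses the Elasticsearch client directly. So here is a summary of the steps when using spring-data-elasticsearch; I hope it helps anyone who needs it.
This post goes straight to the configuration and does not cover the basics of configuring ES mappings through annotations or of Suggest.
Note: only English spelling correction is implemented here; Chinese is not.
Steps
1. Configure the index object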
import io.swagger.annotations.ApiModelProperty;
import lombok.Data;
import lombok.NoArgsConstructor;
import lombok.ToString;
import lombok.experimental.Accessors;
import lombok.experimental.FieldNameConstants;
import org.springframework.data.annotation.Id;
import org.springframework.data.elasticsearch.annotations.*;
import org.springframework.data.elasticsearch.core.completion.Completion;
import java.io.Serializable;
import java.util.List;

@Data
@Accessors(chain = true)
@ToString
@NoArgsConstructor
@FieldNameConstants
@Document(indexName = SysProtocol.EsIndexKey.SEARCH_MODEL_EN)
@Setting(settingPath = "/elasticsearch/settings.json")
public class SearchModelEnIndex implements Serializable {

    private static final long serialVersionUID = 1L;

    @Id
    private String id;

    // Full-text field; also the field the term suggester reads in step 3
    @ApiModelProperty(value = "Title")
    @Field(type = FieldType.Text, analyzer = "my_analyzer")
    private String name;

    // Completion field; the completion suggester in step 4 queries this field
    @ApiModelProperty(value = "Title used for autocomplete")
    @CompletionField(analyzer = "my_analyzer", searchAnalyzer = "my_analyzer")
    private Completion completionName;
}
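The class above only declares completionName; the field still has to be filled in when a document is indexed. As a rough sketch, one option is to register the full title plus every suffix that starts at a word boundary, so that a prefix typed from the middle of a title can still match. CompletionInputs and inputsFor are hypothetical names used for illustration, not part of spring-data-elasticsearch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration; not part of spring-data-elasticsearch.
final class CompletionInputs {

    // Returns the title itself, followed by each suffix that begins at a word boundary.
    static List<String> inputsFor(String title) {
        List<String> inputs = new ArrayList<>();
        String[] words = title.trim().split("\\s+");
        for (int start = 0; start < words.length; start++) {
            String suffix = String.join(" ", Arrays.copyOfRange(words, start, words.length));
            if (!suffix.isEmpty()) {
                inputs.add(suffix);
            }
        }
        return inputs;
    }

    public static void main(String[] args) {
        // prints [Elastic Index Guide, Index Guide, Guide]
        System.out.println(inputsFor("Elastic Index Guide"));
    }
}
```

Assuming the Completion(String[]) constructor is available in the version in use, the result could then be passed along the lines of new Completion(inputs.toArray(new String[0])) before the document is saved.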
2. settings.json
My project uses a custom analyzer configuration:
{
  "index": {
    "highlight": {
      "max_analyzed_offset": 200000000
    }
  },
  "analysis": {
    "char_filter": {
      "XtoS": {
        "type": "mapping",
        "mappings": ["_=>|", ".=>|"]
      }
    },
    "analyzer": {
      "my_analyzer": {
        "type": "custom",
        "char_filter": ["XtoS"],
        "tokenizer": "standard",
        "filter": ["lowercase"]
      }
    }
  }
}
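The XtoS char filter is there because, as far as I understand the standard tokenizer's word-break rules, an underscore or a dot between letters does not split a token, so foo_bar or foo.bar would be indexed as a single term. Mapping both characters to | forces a split. The plain-Java sketch below only approximates that pipeline (the real tokenizer follows Unicode text segmentation and has many more rules); MyAnalyzerSketch and its methods are illustrative names.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Rough approximation of my_analyzer; not the actual Elasticsearch implementation.
final class MyAnalyzerSketch {

    // Mirrors the "XtoS" mapping char filter: '_' and '.' become '|'.
    static String charFilter(String text) {
        return text.replace('_', '|').replace('.', '|');
    }

    // Crude stand-in for the standard tokenizer: '_' and '.' are kept inside tokens,
    // every other non-letter, non-digit character separates tokens.
    static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String part : text.split("[^\\p{L}\\p{N}_.]+")) {
            if (!part.isEmpty()) {
                tokens.add(part);
            }
        }
        return tokens;
    }

    // char filter -> tokenizer -> lowercase filter
    static List<String> analyze(String text) {
        List<String> tokens = new ArrayList<>();
        for (String token : tokenize(charFilter(text))) {
            tokens.add(token.toLowerCase(Locale.ROOT));
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("Spring_Data.Elasticsearch")); // [Spring_Data.Elasticsearch]
        System.out.println(analyze("Spring_Data.Elasticsearch"));  // [spring, data, elasticsearch]
    }
}
```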
3. Spelling correction
Implemented with the ES term suggester.
// suggestKeyword holds the user's raw search input
TermSuggestionBuilder suggestionBuilder = SuggestBuilders.termSuggestion(SearchModelEnIndex.Fields.name)
        .suggestMode(TermSuggestionBuilder.SuggestMode.MISSING) // only suggest for terms that are not in the index
        .analyzer("my_analyzer")
        .size(1); // number of suggestions returned per term
SuggestBuilder suggestBuilder = new SuggestBuilder();
suggestBuilder.addSuggestion("suggest_text", suggestionBuilder).setGlobalText(suggestKeyword);
Suggest suggest = esRestTemplate.suggest(suggestBuilder, SearchModelEnIndex.class).getSuggest();
TermSuggestion termSuggestion = suggest.getSuggestion("suggest_text");
for (TermSuggestion.Entry entry : termSuggestion.getEntries()) {
    if (!entry.getOptions().isEmpty()) {
        // print the top correction for this term
        System.out.println(entry.getOptions().get(0).getText().toString());
    }
}
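Conceptually, the term suggester proposes indexed terms that lie within a small edit distance of each input token, and SuggestMode.MISSING limits this to tokens that do not occur in the index. The plain-Java sketch below imitates that behaviour over an in-memory term-frequency map. It is only an illustration under simplifying assumptions: TermSuggestSketch is a made-up name, it uses plain Levenshtein distance with a fixed limit of 2, and it ignores the other tuning options of the real suggester.

```java
import java.util.Map;
import java.util.Optional;

// Illustration only; Elasticsearch's term suggester is considerably more elaborate.
final class TermSuggestSketch {

    // Classic dynamic-programming Levenshtein distance.
    static int editDistance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= a.length(); i++) {
            int[] cur = new int[b.length() + 1];
            cur[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                cur[j] = Math.min(Math.min(cur[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            prev = cur;
        }
        return prev[b.length()];
    }

    // Returns a correction only when the token itself is not an indexed term (MISSING mode).
    static Optional<String> suggest(String token, Map<String, Integer> termFrequencies) {
        if (termFrequencies.containsKey(token)) {
            return Optional.empty();
        }
        String best = null;
        int bestDistance = Integer.MAX_VALUE;
        int bestFrequency = -1;
        for (Map.Entry<String, Integer> e : termFrequencies.entrySet()) {
            int d = editDistance(token, e.getKey());
            if (d > 2) {
                continue; // too far away to count as a typo
            }
            // prefer the smaller distance, then the more frequent term
            if (d < bestDistance || (d == bestDistance && e.getValue() > bestFrequency)) {
                best = e.getKey();
                bestDistance = d;
                bestFrequency = e.getValue();
            }
        }
        return Optional.ofNullable(best);
    }
}
```

For example, with the terms elasticsearch and spring in the map, suggest("elastcsearch", terms) yields elasticsearch, while suggest("spring", terms) yields nothing because the token already exists.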
4. Autocomplete
Implemented with the ES completion suggester.
String keyword = "elastic i";
CompletionSuggestionBuilder completionSuggestionBuilder = SuggestBuilders.completionSuggestion(SearchModelEnIndex.Fields.completionName)
        .size(5)              // return at most 5 suggestions
        .skipDuplicates(true) // drop suggestions with identical text
        .prefix(keyword);
SuggestBuilder suggestBuilder = new SuggestBuilder();
suggestBuilder.addSuggestion("suggest_text", completionSuggestionBuilder);
Suggest suggest = esRestTemplate.suggest(suggestBuilder, SearchModelEnIndex.class).getSuggest();
CompletionSuggestion completionSuggestion = suggest.getSuggestion("suggest_text");
for (CompletionSuggestion.Entry entry : completionSuggestion.getEntries()) {
    for (CompletionSuggestion.Entry.Option option : entry.getOptions()) {
        // print each completed title
        System.out.println(option.getText().toString());
    }
}
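The completion suggester matches from the beginning of each stored input, which is why "elastic i" can complete a title that starts with "Elastic I" but not one that merely contains it. The sketch below mimics prefix, size and skipDuplicates over an in-memory list. It is a simplification: PrefixCompletionSketch is a made-up name, a case-insensitive comparison stands in for the lowercase filter, and results come back in list order rather than by weight.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

// Illustration only; the real completion suggester works on an in-memory FST per field.
final class PrefixCompletionSketch {

    // Returns up to `size` distinct inputs that start with `prefix`, ignoring case.
    static List<String> complete(List<String> inputs, String prefix, int size) {
        String normalizedPrefix = prefix.toLowerCase(Locale.ROOT);
        Set<String> hits = new LinkedHashSet<>(); // the set drops duplicates, like skipDuplicates(true)
        for (String input : inputs) {
            if (hits.size() >= size) {
                break; // like size(n)
            }
            if (input.toLowerCase(Locale.ROOT).startsWith(normalizedPrefix)) {
                hits.add(input);
            }
        }
        return new ArrayList<>(hits);
    }

    public static void main(String[] args) {
        List<String> inputs = List.of(
                "Elastic Index Guide", "Elastic Index Guide", "Elastic Stack", "Guide to Elastic Index");
        // prints [Elastic Index Guide]
        System.out.println(complete(inputs, "elastic i", 5));
    }
}
```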
Note: the index field queried by the completion suggester must be of the Completion type: