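Add the following dependencies to pom.xml:

<dependency>
    <groupId>com.thihy</groupId>
    <artifactId>elasticsearch-analysis-paoding</artifactId>
    <version>1.4.2.1</version>
</dependency>
<dependency>
    <groupId>org.elasticsearch</groupId>
    <artifactId>elasticsearch</artifactId>
    <version>1.5.2</version>
</dependency>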
Download the paoding analyzer from the web.

Under src/main/resources/paoding, create a file named paoding-analysis.properties with the following contents:

paoding.analyzer.mode=most-words
paoding.analyzer.dictionaries.compiler=net.paoding.analysis.analyzer.impl.MostWordsModeDictionariesCompiler
paoding.dic.home=classpath:paoding/dic
paoding.dic.detector.interval=60
paoding.knife.class.letterKnife=net.paoding.analysis.knife.LetterKnife
paoding.knife.class.numberKnife=net.paoding.analysis.knife.NumberKnife
paoding.knife.class.cjkKnife=net.paoding.analysis.knife.CJKKnife
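
Before wiring the file into the analyzer, it can help to confirm that the properties parse the way you expect. The sketch below is only a sanity check using java.util.Properties from the JDK; it is not how paoding itself reads the file. The class name PaodingConfigCheck and the helpers parse and classpathLocation are hypothetical names introduced for illustration, and treating the "classpath:" prefix as something to strip is an assumption about how the value is interpreted.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

public class PaodingConfigCheck {

    // Hypothetical helper: drops a leading "classpath:" prefix, if present.
    public static String classpathLocation(String value) {
        String prefix = "classpath:";
        return value.startsWith(prefix) ? value.substring(prefix.length()) : value;
    }

    // Parses properties text with the standard JDK loader.
    public static Properties parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }

    public static void main(String[] args) {
        // A subset of the settings from paoding-analysis.properties.
        String text = String.join("\n",
                "paoding.analyzer.mode=most-words",
                "paoding.dic.home=classpath:paoding/dic",
                "paoding.dic.detector.interval=60");
        Properties props = parse(text);

        System.out.println(props.getProperty("paoding.analyzer.mode"));
        System.out.println(classpathLocation(props.getProperty("paoding.dic.home")));
        System.out.println(Integer.parseInt(props.getProperty("paoding.dic.detector.interval")));
    }
}
```

In a real project you would load the file from the classpath instead of an inline string, for example through getResourceAsStream on the class loader.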

Copy the dic folder into src/main/resources/paoding.
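
Test:

import java.io.IOException;
import net.paoding.analysis.analyzer.PaodingAnalyzer;
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;
import org.junit.Test;

@Test
public void test() throws IOException {
    Analyzer analyzer = new PaodingAnalyzer("classpath:paoding/paoding-analysis.properties");
    String text = "我爱北京天安门";
    TokenStream tokenStream = analyzer.tokenStream("", text);
    // Register the term attribute once, before consuming the stream.
    CharTermAttribute charTermAttribute = tokenStream.addAttribute(CharTermAttribute.class);
    tokenStream.reset();
    while (tokenStream.incrementToken()) {
        // Prints the text of the current token.
        System.out.println(charTermAttribute);
    }
    // Finish and release the stream as the TokenStream contract requires.
    tokenStream.end();
    tokenStream.close();
}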

Run the unit test. The console prints:

我爱
北京
天安
天安门