Attribute:
TokenStream stream = a.tokenStream("content",new StringReader(str));
// Position increment attribute: stores the distance between consecutive tokens
PositionIncrementAttribute pia = stream.addAttribute(PositionIncrementAttribute.class);
// Start and end character offsets of each token
OffsetAttribute oa = stream.addAttribute(OffsetAttribute.class);
// Stores the text of each token (the term produced by tokenization)
CharTermAttribute cta = stream.addAttribute(CharTermAttribute.class);
// Type of each token, as assigned by the tokenizer in use
TypeAttribute ta = stream.addAttribute(TypeAttribute.class);
while (stream.incrementToken()) {
System.out.print(pia.getPositionIncrement()+":");
System.out.print(cta+"["+oa.startOffset()+"-"+oa.endOffset()+"]-->"+ta.type()+"\n");
}
Analyzer a1 = new StandardAnalyzer(Version.LUCENE_35);
Analyzer a2 = new StopAnalyzer(Version.LUCENE_35);
Analyzer a3 = new SimpleAnalyzer(Version.LUCENE_35);
Analyzer a4 = new WhitespaceAnalyzer(Version.LUCENE_35);
String txt = "how are you thank you";
AnalyzerUtils.displayAllTokenInfo(txt, a1);
System.out.println("------------------------------");
AnalyzerUtils.displayAllTokenInfo(txt, a2);
System.out.println("------------------------------");
AnalyzerUtils.displayAllTokenInfo(txt, a3);
System.out.println("------------------------------");
AnalyzerUtils.displayAllTokenInfo(txt, a4);
The resulting output is:
1:how[0-3]--><ALPHANUM>
2:you[8-11]--><ALPHANUM>
1:thank[12-17]--><ALPHANUM>
1:you[18-21]--><ALPHANUM>
------------------------------
1:how[0-3]-->word
2:you[8-11]-->word
1:thank[12-17]-->word
1:you[18-21]-->word
------------------------------
1:how[0-3]-->word
1:are[4-7]-->word
1:you[8-11]-->word
1:thank[12-17]-->word
1:you[18-21]-->word
------------------------------
1:how[0-3]-->word
1:are[4-7]-->word
1:you[8-11]-->word
1:thank[12-17]-->word
1:you[18-21]-->word
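
In the first two outputs, "you" has a position increment of 2 because StandardAnalyzer and StopAnalyzer drop the stop word "are", while the offsets still point into the original string. The sketch below imitates that bookkeeping in plain Java, without Lucene. It is only an illustration under stated assumptions: the class name PositionIncrementSketch, the method describe, and the small STOP_WORDS set are all hypothetical, and the whitespace splitting is a stand-in for a real tokenizer, not Lucene's implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class PositionIncrementSketch {

    // Hypothetical stop-word set for illustration; not Lucene's default list.
    static final Set<String> STOP_WORDS = new HashSet<>(Arrays.asList("are", "the", "is"));

    // Splits text on whitespace and formats every kept token as
    // "increment:term[start-end]", where increment counts the positions
    // advanced since the previously kept token.
    static List<String> describe(String text, Set<String> stopWords) {
        List<String> out = new ArrayList<>();
        int skipped = 0; // stop words dropped since the last kept token
        int i = 0;
        int n = text.length();
        while (i < n) {
            // Skip leading whitespace.
            while (i < n && Character.isWhitespace(text.charAt(i))) {
                i++;
            }
            if (i >= n) {
                break;
            }
            int start = i;
            // Consume one whitespace-delimited token.
            while (i < n && !Character.isWhitespace(text.charAt(i))) {
                i++;
            }
            String term = text.substring(start, i).toLowerCase();
            if (stopWords.contains(term)) {
                skipped++; // the token is removed, but its position is still counted
                continue;
            }
            out.add((skipped + 1) + ":" + term + "[" + start + "-" + i + "]");
            skipped = 0;
        }
        return out;
    }

    public static void main(String[] args) {
        for (String line : describe("how are you thank you", STOP_WORDS)) {
            System.out.println(line);
        }
    }
}
```

Passing an empty stop-word set to describe keeps every token with an increment of 1, which matches the SimpleAnalyzer and WhitespaceAnalyzer outputs above.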