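Earlier versions of Lucene iterated over a TokenStream by calling TokenStream.next(). The current version changed this slightly, and the following snippet walks through the contents of a TokenStream: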
private static void displayTokenStream(TokenStream ts) throws IOException {
    // Fetch each attribute once; incrementToken() refreshes it for every token.
    TermAttribute termAtt = (TermAttribute) ts.getAttribute(TermAttribute.class);
    TypeAttribute typeAtt = (TypeAttribute) ts.getAttribute(TypeAttribute.class);
    while (ts.incrementToken()) {
        // Print the term and its type on one line, separated by a space.
        System.out.print(termAtt.term());
        System.out.print(' ');
        System.out.println(typeAtt.type());
    }
    System.out.println(' ');
}
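To make the pattern above a bit more concrete, here is a toy model of it in plain Java. It is only a sketch: ToyTokenStream and ToyTermAttribute are hypothetical stand-ins I made up for illustration, not Lucene classes, and the whitespace splitting is an assumption chosen to keep the example small. What it tries to show is why the attribute is fetched once before the loop: the stream keeps overwriting the same attribute object each time incrementToken() is called.

```java
import java.util.ArrayList;
import java.util.List;

public class ToyTokenStreamDemo {

    /** Hypothetical stand-in for a term attribute: one mutable holder reused for every token. */
    static final class ToyTermAttribute {
        private String term = "";

        String term() {
            return term;
        }

        void setTerm(String term) {
            this.term = term;
        }
    }

    /** Hypothetical stand-in for a token stream; it simply splits the text on whitespace. */
    static final class ToyTokenStream {
        private final String[] words;
        private final ToyTermAttribute termAtt = new ToyTermAttribute();
        private int pos = 0;

        ToyTokenStream(String text) {
            String trimmed = text.trim();
            // Guard the empty case, because "".split(...) would yield one empty string.
            this.words = trimmed.isEmpty() ? new String[0] : trimmed.split("\\s+");
        }

        /** Always returns the same attribute instance. */
        ToyTermAttribute getTermAttribute() {
            return termAtt;
        }

        /** Moves to the next token and overwrites the attribute; returns false when exhausted. */
        boolean incrementToken() {
            if (pos >= words.length) {
                return false;
            }
            termAtt.setTerm(words[pos++]);
            return true;
        }
    }

    /** Walks the stream the same way displayTokenStream does and collects the terms. */
    static List<String> collectTerms(String text) {
        ToyTokenStream ts = new ToyTokenStream(text);
        ToyTermAttribute termAtt = ts.getTermAttribute(); // fetched once, before the loop
        List<String> terms = new ArrayList<>();
        while (ts.incrementToken()) {
            terms.add(termAtt.term()); // read the value before the next advance replaces it
        }
        return terms;
    }

    public static void main(String[] args) {
        System.out.println(collectTerms("I like Lucene")); // prints [I, like, Lucene]
    }
}
```

The real TokenStream carries more state than this (several attributes per token, for example), so treat the sketch as a picture of the loop shape only.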
In Lucene 3.3.0 the following approach also works:
TokenStream ts = new SmartChineseAnalyzer(Version.LUCENE_33)
        .tokenStream("", new StringReader("我喜欢李小球"));
CharTermAttribute termAtt = (CharTermAttribute) ts.getAttribute(CharTermAttribute.class);
while (ts.incrementToken()) {
    // Only the first length() chars of the shared buffer belong to the current token.
    String token = new String(termAtt.buffer(), 0, termAtt.length());
    System.out.println(token);
}
Sigh. A method shown struck through as deprecated is just annoying to look at.
Reposted from: https://blog.51cto.com/greyshine/297783