1. In Solr, analysis is split into two phases: index time and query time.
<fieldType name="text_standard" class="solr.TextField" positionIncrementGap="100" autoGeneratePhraseQueries="true">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
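Why the filter must appear in both analyzer chains can be illustrated with a plain-Java sketch (no Solr dependency; the class and field names below are illustrative): a toy inverted index that lowercases tokens at "index time" and applies the same normalization at "query time", so that an uppercase query still matches.

```java
import java.util.*;

public class LowercaseDemo {
    // A toy inverted index: term -> document ids, mimicking Solr's index.
    private final Map<String, Set<Integer>> index = new HashMap<>();

    // Index-time analysis: split on whitespace and lowercase each token,
    // analogous to StandardTokenizer + LowerCaseFilter in the "index" analyzer.
    void addDocument(int docId, String text) {
        for (String token : text.split("\\s+")) {
            index.computeIfAbsent(token.toLowerCase(Locale.ROOT), k -> new TreeSet<>())
                 .add(docId);
        }
    }

    // Query-time analysis must apply the same lowercasing,
    // analogous to the "query" analyzer chain.
    Set<Integer> search(String term) {
        return index.getOrDefault(term.toLowerCase(Locale.ROOT), Collections.emptySet());
    }

    public static void main(String[] args) {
        LowercaseDemo demo = new LowercaseDemo();
        demo.addDocument(1, "Solr Query AAA");
        // Both "AAA" and "aaa" match because both sides are normalized.
        System.out.println(demo.search("AAA")); // [1]
        System.out.println(demo.search("aaa")); // [1]
    }
}
```

If the lowercasing were removed from `search` (i.e. no query-time filter), `search("AAA")` would return an empty set even though the document contains "AAA", which is exactly the mismatch the dual analyzer configuration prevents.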
Configuring the solr.LowerCaseFilterFactory filter in both analyzer chains solves the case-sensitivity problem: the filter converts field values to lowercase both when the index is built and when queries are parsed.
2. When using the IKAnalyzer tokenizer in Solr, IK automatically converts uppercase letters to lowercase.
<fieldType name="text_zh" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="org.wltea.analyzer.solr.IKTokenizerFactory" useSmart="false"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="org.wltea.analyzer.solr.IKTokenizerFactory" useSmart="false"/>
  </analyzer>
</fieldType>
So no lowercase filter needs to be configured in schema.xml for this field type.
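For completeness, a field can then reference this type in schema.xml; the field name `title` below is a placeholder, not from the original configuration:

```xml
<!-- Illustrative field definition; "title" is a hypothetical field name. -->
<field name="title" type="text_zh" indexed="true" stored="true"/>
```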
/*
* Copyright (c) 2001-2012 Bidlink(Beijing) E-Biz Tech Co.,Ltd.
* All rights reserved.
* 必联(北京)电子商务科技有限公司 版权所有
*
* <p>SolrSearchUtils.java</p>
*
* Created on 2012-6-7 by zhaodong.wang
*
*/
package cn.bidlink.space.common.util;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import org.wltea.analyzer.core.IKSegmenter;
import org.wltea.analyzer.core.Lexeme;
public class SolrSearchUtils {
    /**
     * Segments the given keywords into terms using the IK analyzer.
     *
     * @param keyWords the raw query string to segment
     * @return the segmented terms, in order of appearance
     */
    public static List<String> findIKAnalyserWords(String keyWords) {
        List<String> words = new ArrayList<String>();
        // Normalize all whitespace characters to plain spaces before segmenting.
        Reader input = new StringReader(keyWords.replaceAll("\\s", " "));
        // useSmart = false: fine-grained segmentation, matching the schema config above.
        IKSegmenter ikSeg = new IKSegmenter(input, false);
        Lexeme lexeme;
        try {
            while ((lexeme = ikSeg.next()) != null) {
                words.add(lexeme.getLexemeText());
            }
        } catch (IOException e) {
            // A StringReader cannot actually fail, but IKSegmenter.next() declares IOException.
            throw new IllegalStateException("Unexpected I/O error during segmentation", e);
        }
        return words;
    }
    public static void main(String[] args) {
        System.out.println(findIKAnalyserWords("必联网AAA"));
    }
}
The printed result is: [必, 联网, aaa]. Note that IK automatically converted AAA to aaa.