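1. Similarity: changing how Solr scores documents. With Solr's default ranking, the results did not match what users expected. After a few days of study I picked up the basics, so without further ado:
I dropped the Solr web application straight into an existing web project: everything under solr-5.5.3\server\solr-webapp\webapp was copied into the project's web folder.
Pointing web.xml at the solrHome directory is not covered here.
2. Solr already ships with several similarity implementations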
org.apache.solr.search.similarities.BM25SimilarityFactory
org.apache.solr.search.similarities.DefaultSimilarityFactory
org.apache.solr.search.similarities.DFRSimilarityFactory
org.apache.solr.search.similarities.IBSimilarityFactory
org.apache.solr.search.similarities.LMDirichletSimilarityFactory
org.apache.solr.search.similarities.LMJelinekMercerSimilarityFactory
org.apache.solr.search.similarities.SchemaSimilarityFactory
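Then add a global similarity to managed-schema:
<similarity class="org.apache.solr.search.similarities.SchemaSimilarityFactory"/>
This entry is required. Without it, declaring a similarity directly on a field type fails when the schema is loaded: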
Caused by: org.apache.solr.common.SolrException: Can't load schema E:\solr\solr20170714\solrHome\classify\conf\managed-schema: FieldType 'text_ik' is configured with a similarity, but the global similarity does not support it: class org.apache.solr.search.similarities.ClassicSimilarityFactory
With the global factory in place, a field type can declare its own similarity:
<fieldType name="text_ik" class="solr.TextField">
    <similarity class="org.apache.solr.search.similarities.DefaultSimilarityFactory"></similarity>
    <analyzer type="index" useSmart="false"
              class="org.wltea.analyzer.lucene.IKAnalyzer" />
    <analyzer type="query" useSmart="true"
              class="org.wltea.analyzer.lucene.IKAnalyzer" />
</fieldType>
If results are ranked by similarity on a single field, for example:
<field name="pro_up_tpd" type="text_ik" indexed="true" stored="true"/>
If several fields are searched together, I recommend weighting them with edismax, for example:
query.set("defType", "edismax");
query.set("qf", "pro_name^100 pro_nStore_description^50");
<field name="seacher" type="text_ik" indexed="true" stored="false" multiValued="true" />
<copyField source="pro_name" dest="seacher"/>
<copyField source="pro_nStore_description" dest="seacher"/>
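To get a feel for what the qf weights above do, here is a toy model in plain Java. It is only a sketch and not Solr's code: the class name QfBoostSketch, the helper methods, and the raw per-field scores are all made up for illustration. It assumes each field's score is simply multiplied by its boost and that the fields are combined dismax-style (the best field wins, and the edismax tie parameter lets the other fields contribute a little).

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy model only: illustrates boost weighting, it is not Solr's implementation.
class QfBoostSketch {

    // Best boosted field score, plus tie times the sum of the other boosted scores.
    static double disMax(Map<String, Double> rawScores, Map<String, Double> boosts, double tie) {
        double max = 0.0;
        double sum = 0.0;
        for (Map.Entry<String, Double> e : rawScores.entrySet()) {
            double boosted = e.getValue() * boosts.getOrDefault(e.getKey(), 1.0);
            max = Math.max(max, boosted);
            sum += boosted;
        }
        return max + tie * (sum - max);
    }

    // The boosts from qf = "pro_name^100 pro_nStore_description^50".
    static Map<String, Double> qfBoosts() {
        Map<String, Double> boosts = new LinkedHashMap<>();
        boosts.put("pro_name", 100.0);
        boosts.put("pro_nStore_description", 50.0);
        return boosts;
    }

    // Hypothetical raw per-field scores for one document (0.0 means no match).
    static Map<String, Double> rawScores(double nameScore, double descriptionScore) {
        Map<String, Double> scores = new LinkedHashMap<>();
        scores.put("pro_name", nameScore);
        scores.put("pro_nStore_description", descriptionScore);
        return scores;
    }

    public static void main(String[] args) {
        Map<String, Double> boosts = qfBoosts();
        // A document matching only in the name outranks one matching only in the description.
        System.out.println("name only:        " + disMax(rawScores(1.0, 0.0), boosts, 0.0));
        System.out.println("description only: " + disMax(rawScores(0.0, 1.0), boosts, 0.0));
        // With a non-zero tie, a match in both fields scores slightly above a name-only match.
        System.out.println("both, tie=0.1:    " + disMax(rawScores(1.0, 1.0), boosts, 0.1));
    }
}
```

With equal raw scores, a hit in pro_name ends up weighted twice as heavily as a hit in pro_nStore_description, which is the point of the 100 and 50 weights.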
3. Custom similarity
package cn.com.cacuq.similarity;

import org.apache.lucene.search.similarities.Similarity;
import org.apache.solr.schema.SimilarityFactory;

/**
 * Created by Administrator on 2017/7/31 0031.
 */
public class MySimilarityFactory extends SimilarityFactory {
    @Override
    public Similarity getSimilarity() {
        return new MySimilarity();
    }
}
package cn.com.cacuq.similarity;

import org.apache.lucene.index.FieldInvertState;
import org.apache.lucene.search.similarities.DefaultSimilarity;

/**
 * Created by Administrator on 2017/7/31 0031.
 */
public class MySimilarity extends DefaultSimilarity {
    /**
     * freq is the number of times the term occurs in a document.
     * Returning 1.0f means this factor is ignored.
     */
    @Override
    public float tf(float freq) {
        return 1.0F;
    }

    /**
     * This factor reflects how many of all documents contain the term.
     * It is ignored in the same way.
     */
    @Override
    public float idf(long docFreq, long numDocs) {
        return 1.0F;
    }

    /** The distance between terms in a sloppy phrase match is ignored. */
    @Override
    public float sloppyFreq(int distance) {
        return 1.0F;
    }

    /** No query normalization is applied. */
    @Override
    public float queryNorm(float sumOfSquaredWeights) {
        return 1.0F;
    }

    /**
     * This factor reflects the proportion of query terms that a document matches.
     * It is ignored in the same way.
     */
    @Override
    public float coord(int overlap, int maxOverlap) {
        return 1.0F;
    }

    /** The length of the field is ignored. */
    @Override
    public float lengthNorm(FieldInvertState state) {
        return 1.0F;
    }

    // Has no effect here, because lengthNorm() above returns a constant.
    protected boolean discountOverlaps = false;

    public void setDiscountOverlaps(boolean v) {
        discountOverlaps = v;
    }

    public boolean getDiscountOverlaps() {
        return discountOverlaps;
    }

    @Override
    public String toString() {
        return "MySimilarity";
    }
}
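For comparison, the sketch below restates the default formulas that these overrides replace, as I recall them from the classic TF-IDF similarity in Lucene 5.x. It is plain Java with no Lucene dependency, the class name ClassicTfIdfSketch is made up, and lengthNorm is simplified: it takes the term count directly and leaves out the index-time boost and the lossy one-byte encoding of norms. Treat it as an approximation for building intuition and check the Lucene source for the exact behavior of your version.

```java
// Standalone restatement of the classic TF-IDF factors (assumed from Lucene 5.x); no Lucene dependency.
class ClassicTfIdfSketch {

    // Term frequency: grows with the square root of the number of occurrences.
    static float tf(float freq) {
        return (float) Math.sqrt(freq);
    }

    // Inverse document frequency: rarer terms get a larger value.
    static float idf(long docFreq, long numDocs) {
        return (float) (Math.log(numDocs / (double) (docFreq + 1)) + 1.0);
    }

    // Sloppy phrase matches count less the further apart the terms are.
    static float sloppyFreq(int distance) {
        return 1.0f / (distance + 1);
    }

    // Query normalization: makes scores of different queries more comparable.
    static float queryNorm(float sumOfSquaredWeights) {
        return (float) (1.0 / Math.sqrt(sumOfSquaredWeights));
    }

    // Coordination: fraction of the query terms that the document matches.
    static float coord(int overlap, int maxOverlap) {
        return overlap / (float) maxOverlap;
    }

    // Length normalization (simplified): shorter fields get a larger value.
    static float lengthNorm(int numTerms) {
        return (float) (1.0 / Math.sqrt(numTerms));
    }

    public static void main(String[] args) {
        System.out.println("tf(1)=" + tf(1f) + "  tf(9)=" + tf(9f));
        System.out.println("idf(docFreq=1, numDocs=1000)=" + idf(1, 1000));
        System.out.println("idf(docFreq=500, numDocs=1000)=" + idf(500, 1000));
        System.out.println("coord(1 of 2)=" + coord(1, 2));
        System.out.println("lengthNorm(4 terms)=" + lengthNorm(4)
                + "  lengthNorm(100 terms)=" + lengthNorm(100));
    }
}
```

Under the defaults, repeated terms, rare terms, and short fields all push a document up. MySimilarity returns 1.0 for every one of these factors, so what is left to separate documents is mainly which fields matched and the boosts applied to them.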
Then change the class of the similarity element inside the fieldType to the custom cn.com.cacuq.similarity.MySimilarityFactory.
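Results before the change
Results after the change
The results after the change are clearly closer to what users expect. Thanks for reading!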