Solr 5.5 (5): A Brief Introduction to Similarity

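1. Similarity changes how Solr scores documents. With Solr's default ranking, the results I got did not match what users expected. After a few days of digging I picked up the basics, so let's get straight to it.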

In my setup, the Solr client project lives inside an existing web project, and everything under solr-5.5.3\server\solr-webapp\webapp has been copied into the web folder.


I won't go over how to point solrHome at the right directory in web.xml.

2. Solr already ships with several similarity implementations

org.apache.solr.search.similarities.BM25SimilarityFactory
org.apache.solr.search.similarities.DefaultSimilarityFactory
org.apache.solr.search.similarities.DFRSimilarityFactory
org.apache.solr.search.similarities.IBSimilarityFactory
org.apache.solr.search.similarities.LMDirichletSimilarityFactory
org.apache.solr.search.similarities.LMJelinekMercerSimilarityFactory
org.apache.solr.search.similarities.SchemaSimilarityFactory
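Then add a global similarity to managed-schema:

<similarity class="org.apache.solr.search.similarities.SchemaSimilarityFactory"/>

This declaration is required. Without it, configuring a similarity on a fieldType fails when the schema is loaded, with an error like this: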

Caused by: org.apache.solr.common.SolrException: Can't load schema E:\solr\solr20170714\solrHome\classify\conf\managed-schema: FieldType 'text_ik' is configured with a similarity, but the global similarity does not support it: class org.apache.solr.search.similarities.ClassicSimilarityFactory

<fieldType name="text_ik" class="solr.TextField">
    <similarity class="org.apache.solr.search.similarities.DefaultSimilarityFactory"></similarity>
    <analyzer type="index" useSmart="false"
        class="org.wltea.analyzer.lucene.IKAnalyzer" />
    <analyzer type="query" useSmart="true"
        class="org.wltea.analyzer.lucene.IKAnalyzer" />
</fieldType>
To rank on a single field, declaring the field with that type is enough, for example:
<field name="pro_up_tpd" type="text_ik"  indexed="true"  stored="true"/>
To rank across several fields, I recommend adding weights, e.g. query.set("defType", "edismax") and query.set("qf", "pro_name^100 pro_nStore_description^50").

<field name="seacher" type="text_ik" indexed="true" stored="false" multiValued="true" />
<copyField source="pro_name" dest="seacher"/>
<copyField source="pro_nStore_description" dest="seacher"/>
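
The weighted query above can also be sent as plain request parameters. The following is a minimal sketch that uses only the JDK and builds the query string for a select request. The class name EdismaxParams, its method names, and the keyword "phone" are made up for illustration; only the parameter names (q, defType, qf) and the field weights come from the text above.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper, not part of Solr or SolrJ.
class EdismaxParams {

    // URL-encodes one value as UTF-8; UTF-8 is always available on the JVM.
    static String enc(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e);
        }
    }

    // Parameters for a weighted edismax query, in insertion order.
    static Map<String, String> weightedQuery(String keywords) {
        Map<String, String> p = new LinkedHashMap<>();
        p.put("q", keywords);
        p.put("defType", "edismax"); // no leading space in the parser name
        p.put("qf", "pro_name^100 pro_nStore_description^50");
        return p;
    }

    // Joins the parameters into an encoded query string.
    static String toQueryString(Map<String, String> params) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, String> e : params.entrySet()) {
            if (sb.length() > 0) {
                sb.append('&');
            }
            sb.append(enc(e.getKey())).append('=').append(enc(e.getValue()));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // "phone" is a placeholder keyword.
        System.out.println("/select?" + toQueryString(weightedQuery("phone")));
    }
}
```

With these weights, a match in pro_name counts twice as much as a match in pro_nStore_description, which is the point of using qf instead of searching the combined field alone.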
3. Custom similarity
package cn.com.cacuq.similarity;

import org.apache.lucene.search.similarities.Similarity;
import org.apache.solr.schema.SimilarityFactory;

/**
 * Factory that Solr instantiates from the schema to obtain MySimilarity.
 */
public class MySimilarityFactory extends SimilarityFactory {

    @Override
    public Similarity getSimilarity() {
        return new MySimilarity();
    }
}

package cn.com.cacuq.similarity;

import org.apache.lucene.index.FieldInvertState;
import org.apache.lucene.search.similarities.DefaultSimilarity;

/**
 * Similarity that fixes every TF-IDF scoring factor at 1, so that ranking
 * is driven mainly by query-time boosts such as the qf weights.
 */
public class MySimilarity extends DefaultSimilarity{

    /**
     * freq is the number of times the term occurs in a document.
     * Returning 1.0f means term frequency is not taken into account.
     */
    @Override
    public float tf(float freq) {
        return 1.0F;
    }

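    /**
     * Inverse document frequency: the factor derived from how many documents
     * contain the term relative to the total number of documents.
     * It is ignored in the same way.
     */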
    @Override
    public float idf(long docFreq, long numDocs) {
        return 1.0F;
    }

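    /**
     * Proximity factor for sloppy phrase matches; ignored as well.
     */
    @Override
    public float sloppyFreq(int distance) {
        return 1.0F;
    }

    /**
     * Query normalization factor; fixed at 1 so it does not rescale scores.
     */
    @Override
    public float queryNorm(float sumOfSquaredWeights) {
        return 1.0F;
    }

    /**
     * Coordination factor: the proportion of the query terms that a
     * document matches. It is ignored in the same way.
     */
    @Override
    public float coord(int overlap, int maxOverlap) {
        return 1.0F;
    }

    /**
     * Field length normalization; fixed at 1 so that short and long fields
     * score alike.
     */
    @Override
    public float lengthNorm(FieldInvertState state) {
        return 1.0F;
    }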

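    // Because lengthNorm above returns a constant, this flag has no effect
    // on scoring; it is kept only for completeness.
    protected boolean discountOverlaps = false;

    public void setDiscountOverlaps(boolean v) {
        discountOverlaps = v;
    }

    public boolean getDiscountOverlaps() {
        return discountOverlaps;
    }

    @Override
    public String toString() {
        return "MySimilarity";
    }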

}

Then change the class attribute of the similarity element inside the fieldType to the custom cn.com.cacuq.similarity.MySimilarityFactory.
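
To get a feel for what flattening these factors does, here is a rough sketch in plain Java with no Lucene dependency. It compares a simplified single-term score under the classic TF-IDF formulas with the same score when every factor is fixed at 1. The formulas reflect my understanding of the Lucene 5.x classic defaults (tf = sqrt(freq), idf = 1 + ln(numDocs / (docFreq + 1)), lengthNorm = 1 / sqrt(numTerms)). The class ScoreFactorDemo, its method names, and the sample numbers are all invented for illustration, and the real scorer also encodes norms lossily and combines multiple terms, which is left out here.

```java
// Hypothetical illustration only; this is not how Lucene is invoked.
class ScoreFactorDemo {

    // Classic term-frequency factor: grows with the square root of freq.
    static double classicTf(double freq) {
        return Math.sqrt(freq);
    }

    // Classic inverse document frequency: rarer terms weigh more.
    static double classicIdf(long docFreq, long numDocs) {
        return 1.0 + Math.log((double) numDocs / (docFreq + 1));
    }

    // Classic length normalization: shorter fields weigh more.
    static double classicLengthNorm(int numTerms) {
        return 1.0 / Math.sqrt(numTerms);
    }

    // Simplified single-term score: tf * idf^2 * lengthNorm * boost.
    static double classicScore(double freq, long docFreq, long numDocs,
                               int numTerms, double boost) {
        double idf = classicIdf(docFreq, numDocs);
        return classicTf(freq) * idf * idf * classicLengthNorm(numTerms) * boost;
    }

    // Same formula with tf, idf and lengthNorm all fixed at 1,
    // which is what MySimilarity does: only the boost remains.
    static double flatScore(double boost) {
        return 1.0 * 1.0 * 1.0 * boost;
    }

    public static void main(String[] args) {
        // Invented sample: the term occurs in 9 of 1000 documents.
        // Document A repeats the keyword 4 times in a 20-term field,
        // document B mentions it once in a 5-term field.
        System.out.println("classic A: " + classicScore(4, 9, 1000, 20, 100));
        System.out.println("classic B: " + classicScore(1, 9, 1000, 5, 100));
        System.out.println("flat A:    " + flatScore(100));
        System.out.println("flat B:    " + flatScore(100));
    }
}
```

Under the classic formulas the two documents get different scores because of repetition and field length. With the flattened factors, any document that matches in the same field gets the same contribution, so the qf weights decide the order.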

[Figure: results before the change]

[Figure: results after the change]


The new results are clearly closer to what users expect. Thanks for reading!
