Solr Configuration Files Explained

WaveVector

Category: Engineering Development. Tags: solr, search

Article link: https://blog.csdn.net/wendingzhulu/article/details/42837431

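To support building several types of index and serving different search needs, Solr provides the MultiCore mechanism. In practice, adding a core only requires supplying that core's configuration files, so cores can be added without downtime (hot extension). This article covers the key configuration files under each core and the HTTP GET request interfaces.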

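1. Core directory layout

test_core
|-- conf
|   |-- schema.xml: maps index fields to data fields
|   |-- solrconfig.xml: configures Solr's processing rules, mainly to customize how the index is built
|   |-- data-config.xml: configures the database queries
|-- data
    |-- index
    |-- tlog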
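
For context, a core directory like this has to be made known to Solr. The following is a minimal sketch, assuming the legacy solr.xml format used by Solr 4.x and earlier (newer versions instead discover cores through a core.properties file in each core directory). The second core name is hypothetical.

```xml
<!-- Sketch only: legacy solr.xml in the Solr home directory -->
<solr persistent="true">
  <cores adminPath="/admin/cores">
    <!-- instanceDir is resolved relative to the Solr home directory -->
    <core name="test_core" instanceDir="test_core"/>
    <!-- hypothetical second core with its own conf/ and data/ directories -->
    <core name="another_core" instanceDir="another_core"/>
  </cores>
</solr>
```

Each core then reads its own conf/schema.xml and conf/solrconfig.xml, which is what lets different cores hold different kinds of index.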

2. solrconfig.xml

This file mainly configures index creation, the query interfaces (request handlers), caching, and related settings.
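
As a hedged illustration of what such settings look like, the fragment below sketches two request handlers in solrconfig.xml: a /select search handler with default parameters, and a /dataimport handler that points at the data import configuration file covered in section 4. The default values shown (echoParams, rows) are assumptions chosen for illustration, not taken from the original configuration.

```xml
<!-- Sketch only: fragment of solrconfig.xml, placed inside the <config> root element -->

<!-- Query interface: serves .../test_core/select requests -->
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <!-- assumed defaults; adjust to your needs -->
    <str name="echoParams">explicit</str>
    <int name="rows">10</int>
  </lst>
</requestHandler>

<!-- DataImportHandler: serves .../test_core/dataimport requests.
     The DIH jar must be on the classpath (for example via a <lib> directive). -->
<requestHandler name="/dataimport"
                class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <!-- file name is resolved relative to the core's conf directory -->
    <str name="config">data-config.xml</str>
  </lst>
</requestHandler>
```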

3. schema.xml

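schema.xml has three main parts. The <types> node configures the field types and the index analyzers (tokenizers). It looks roughly like the following, and these definitions are highly reusable across cores: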

<types>
    <fieldType name="int" class="solr.IntField" omitNorms="true"/>
    <fieldType name="date" class="solr.DateField" omitNorms="true"/>
    <fieldType name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true"/>
    <fieldType name="boolean" class="solr.BoolField" sortMissingLast="true" omitNorms="true"/>
    <fieldType name="long" class="solr.TrieLongField" precisionStep="0" positionIncrementGap="0"/>
    <fieldType name="float" class="solr.FloatField" omitNorms="true"/>
    <!-- Tokenization; a Chinese tokenizer such as mmseg or jcseg can also be used -->
    <fieldType name="simpleSeg" class="solr.TextField">
        <analyzer>
            <tokenizer class="org.apache.lucene.analysis.standard.StandardTokenizerFactory"/>
        </analyzer>
    </fieldType>
</types>

The <fields> node configures the mapping between index fields and database table columns. It is usually written to fit the business requirements.
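
A minimal sketch of a <fields> node, using the field names that appear in the data import configuration in section 4. The type, indexed, and stored choices are assumptions for illustration, not the author's actual schema.

```xml
<!-- Sketch only: field types and attributes are assumptions -->
<fields>
  <!-- unique key of each document -->
  <field name="docid" type="string" indexed="true" stored="true" required="true"/>
  <field name="id" type="string" indexed="true" stored="true"/>
  <!-- tokenized text fields use the simpleSeg type defined in <types> -->
  <field name="title" type="simpleSeg" indexed="true" stored="true"/>
  <field name="content" type="simpleSeg" indexed="true" stored="true"/>
  <field name="is_recommended" type="boolean" indexed="true" stored="true"/>
  <field name="tag_id" type="int" indexed="true" stored="true"/>
  <field name="total_review" type="int" indexed="true" stored="true"/>
  <!-- needed by Solr 4.x when the update log (tlog) is enabled -->
  <field name="_version_" type="long" indexed="true" stored="true"/>
</fields>
```

Every name used in a <field column="..." name="..."/> mapping of the import configuration needs a matching field (or dynamic field) here.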

There are also a few miscellaneous search-logic settings, mostly defaults:

<!-- field to use to determine and enforce document uniqueness. -->
<uniqueKey>docid</uniqueKey>
<!-- field for the QueryParser to use when an explicit fieldname is absent -->
<defaultSearchField>title</defaultSearchField>
<!-- for autocomplete suggestions: <copyField source="title" dest="title_autocomplete" /> -->

<!-- SolrQueryParser configuration: defaultOperator="AND|OR"
<solrQueryParser defaultOperator="OR"/> -->
<solrQueryParser defaultOperator="AND"/>

4. data-config.xml

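In this file, the <dataSource> node configures the database connection properties. Several data sources can be configured and are told apart by name, as follows: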

<dataSource type="JdbcDataSource"   name="ds-1"
              driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://127.0.0.1:3306/my" 
              user="root"
              password="654321" batchSize="-1"/>
<dataSource type="JdbcDataSource"   name="ds-2"
              driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://192.0.0.1:3306/my" 
              user="root"
              password="654321" batchSize="-1"/>

The <document> node holds the database queries. Each entity corresponds to one query; to express join logic, entities can be nested. For example:

  <document name="meetyou">
    <!-- name: the alias of the entity -->
    <!-- query: full import; selects the matching rows of the table -->
    <!-- deltaQuery: delta import; selects the primary keys (ids) of rows changed since the last import -->
    <!-- deltaImportQuery: delta import; fetches the row for each id returned by deltaQuery -->
    <!-- deletedPkQuery: delta delete; selects ids to remove from the index. Can be omitted if rows are never deleted -->
    <entity name="my_community_topic" dataSource="ds-1"
            query="select * from community_topic where is_deleted=0 limit ${dih.request.length} offset ${dih.request.offset}"
            deltaQuery="select id from community_topic where published_date > '${dih.last_index_time}' and is_deleted=0"
            deletedPkQuery="select id from community_topic where modified_date > '${dih.last_index_time}' and is_deleted=1"
            deltaImportQuery="select * from community_topic where ID='${dih.delta.id}'"
    >
      <field column="id" name="docid"/>
      <field column="id" name="id"/>
      <field column="title" name="title"/>
      <field column="content" name="content"/>

      <field column="is_recommended" name="is_recommended"/>
      <field column="tag_id" name="tag_id"/>
      <field column="total_review" name="total_review"/>
      <!-- Score, computed with an aggregate function; an entity can be nested inside an entity -->
      <!--<entity name="my_topic_score" dataSource="ds-2"
        query="select if(count(*)=0,1.0,score) AS score from topic_score where id=${my_community_topic.id}">

        <field column="score" name="qscore"/>
      </entity>-->

    </entity>

  </document>
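
The fragments above omit the root element. Here is a minimal sketch of how the pieces fit together in one file, assuming the <dataConfig> root that DataImportHandler expects, with the nested score entity enabled. The queries are shortened from the examples above, and the plain score lookup is a simplification rather than the author's exact SQL.

```xml
<!-- Sketch only: overall shape of the data import configuration file -->
<dataConfig>
  <dataSource type="JdbcDataSource" name="ds-1"
              driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://127.0.0.1:3306/my"
              user="root" password="654321" batchSize="-1"/>
  <dataSource type="JdbcDataSource" name="ds-2"
              driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://192.0.0.1:3306/my"
              user="root" password="654321" batchSize="-1"/>

  <document name="meetyou">
    <entity name="my_community_topic" dataSource="ds-1"
            query="select id, title from community_topic where is_deleted=0">
      <field column="id" name="docid"/>
      <field column="title" name="title"/>

      <!-- Nested entity: runs once per parent row;
           ${my_community_topic.id} resolves to the parent row's id column -->
      <entity name="my_topic_score" dataSource="ds-2"
              query="select score from topic_score where id=${my_community_topic.id}">
        <!-- qscore must also be declared as a field in schema.xml -->
        <field column="score" name="qscore"/>
      </entity>
    </entity>
  </document>
</dataConfig>
```

Because the nested query runs once for every parent row, a large parent table can make the import slow. Doing the join in the parent SQL is a common alternative when both tables live in the same database.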

5. HTTP GET URLs for adding, deleting, and updating the index

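With Solr there is no need to write our own callback interfaces (although special business cases may still require them). Some of the more commonly used interfaces are listed below.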

The basic format is always: "http://" + server + ":" + port + "/" + webapp + "/" + coreName + "/" + params

Specifically:

①"http://" + server + ":" + port + "/" + webapp + "/" + coreName + "/select?q=**&wt=json"

http://localhost:8090/mysimplecn/test_core/select?q=%27%E5%9C%B0%27&wt=json

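select is the query handler and q is the query content. q may name the field to search (for example q=title:...); when it does not, the default field configured in schema.xml is used. wt selects the response format, here JSON; the default is XML.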

②"http://" + server + ":" + port + "/" + webapp + "/" + coreName + "/dataimport?command=status&indent=true&wt=json"

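Returns the current status of the data import (command=status). To run a full import, use command=full-import instead. Because the entity query in section 4 references ${dih.request.length} and ${dih.request.offset}, pass length and offset as request parameters with that command.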

③"http://" + server + ":" + port + "/" + webapp + "/" + coreName + "/dataimport?command=delta-import"

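Runs a delta import. Solr reads the last_index_time value from the dataimport.properties file in the conf directory to get the time of the previous import, runs deltaQuery to find the ids of the changed rows, and then runs deltaImportQuery for each id to fetch the data and index it.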

④"http://" + server + ":" + port + "/" + webapp + "/" + coreName + "/update?optimize=true"

Index optimization: merges index segments to reduce the segment count and speed up searches. It can be triggered on a schedule with curl.
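
Besides GET parameters, the /update handler also accepts XML update messages in the body of an HTTP POST (Content-Type text/xml). A minimal sketch of the delete, commit, and optimize messages follows. Each one is sent as a separate request body, and the document id 42 is a made-up value.

```xml
<!-- Request 1: delete the document whose uniqueKey (docid) is 42 -->
<delete><id>42</id></delete>

<!-- Request 2: make pending changes visible to searches -->
<commit/>

<!-- Request 3: same effect as update?optimize=true; merges index segments -->
<optimize/>
```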
