[Search] Solr (2): Configuration Files

April_CH

Category: Search. Tags: full-text search, lucene, solr

Article link: https://blog.csdn.net/qust_2011/article/details/40106209

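Solr Configuration Files

Solr has two main configuration files, schema.xml and solrconfig.xml, both located under solr\collection1\conf in the Solr home directory.

solrconfig.xml: mainly defines Solr's request handlers and some extension components;
schema.xml: mainly defines the index fields and field types.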

schema.xml

field node

  <field name="id"        type="string"   indexed="true"  stored="true"  multiValued="false" required="true"/>
  <field name="type"      type="string"   indexed="true"  stored="true"  multiValued="false" /> 
  <field name="name"      type="string"   indexed="true"  stored="true"  multiValued="false" /> 
  <field name="core0"     type="string"   indexed="true"  stored="true"  multiValued="false" /> 
  <field name="_version_" type="long"     indexed="true"  stored="true"/> <!-- the _version_ field is required; do not comment it out -->

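type: the field type, which refers to a fieldType node; the analyzer (tokenizer and filters) for that type is configured in the fieldType node
indexed: whether the field is indexed, i.e. searchable
stored: whether the original value is stored; a field that is not stored can still be indexed and searched, but its content cannot be returned in results
multiValued: whether the field may hold multiple values
required: whether the field is mandatory; if so, every document must supply a value for it, otherwise indexing fails with an error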
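
To make the attribute semantics concrete, here is a minimal Python sketch that parses a few of the field definitions above and checks a document against the `required` and `multiValued` settings. The `<fields>` wrapper element and the helpers `load_fields` and `check_document` are assumptions made for illustration; they are not Solr APIs, and Solr performs its own checks when a document is indexed.

```python
import xml.etree.ElementTree as ET

# Field definitions taken from the schema.xml excerpt, wrapped in a
# hypothetical <fields> root so the fragment parses on its own.
SCHEMA_FRAGMENT = """
<fields>
  <field name="id"   type="string" indexed="true" stored="true" multiValued="false" required="true"/>
  <field name="type" type="string" indexed="true" stored="true" multiValued="false"/>
  <field name="name" type="string" indexed="true" stored="true" multiValued="false"/>
</fields>
"""


def load_fields(xml_text):
    """Return {field name: attribute dict} for every <field> element."""
    root = ET.fromstring(xml_text)
    return {f.get("name"): dict(f.attrib) for f in root.iter("field")}


def check_document(doc, fields):
    """Return a list of problems found in doc (illustrative helper only)."""
    problems = []
    for name, attrs in fields.items():
        value = doc.get(name)
        # required="true": the document must carry a non-empty value.
        if attrs.get("required") == "true" and value in (None, ""):
            problems.append(f"missing required field: {name}")
        # multiValued="false": at most one value is allowed.
        if attrs.get("multiValued") == "false" and isinstance(value, list) and len(value) > 1:
            problems.append(f"multiple values for single-valued field: {name}")
    return problems


fields = load_fields(SCHEMA_FRAGMENT)
print(check_document({"id": "1", "name": "solr"}, fields))
print(check_document({"name": ["a", "b"]}, fields))
```

The first document passes; the second is reported twice, once for the missing `id` and once for the two values in the single-valued `name` field.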

dynamicField node

  <dynamicField name="*_i"  type="int"    indexed="true"  stored="true"/>
  <dynamicField name="*_is" type="int"    indexed="true"  stored="true"  multiValued="true"/>
  <dynamicField name="*_s"  type="string"  indexed="true"  stored="true" />
  <dynamicField name="*_ss" type="string"  indexed="true"  stored="true" multiValued="true"/>
  <dynamicField name="*_l"  type="long"   indexed="true"  stored="true"/>
  <dynamicField name="*_ls" type="long"   indexed="true"  stored="true"  multiValued="true"/>
  <dynamicField name="*_t"  type="text_general"    indexed="true"  stored="true"/>

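Dynamic fields cover field names that have no explicit definition in the schema. For example, a field named menu_s is not declared in any field node, but it matches the dynamicField pattern name="*_s", so it is treated as a string field.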
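
The lookup described above can be sketched in Python as follows. This is an illustration of the idea, not Solr's implementation: `resolve_type` is a hypothetical helper, glob matching is done with `fnmatch`, and trying longer patterns first is an assumption of this sketch for breaking ties.

```python
from fnmatch import fnmatchcase

# Explicitly declared fields and the dynamicField patterns from the schema excerpt.
EXPLICIT_FIELDS = {"id": "string", "type": "string", "name": "string", "core0": "string"}
DYNAMIC_FIELDS = [
    ("*_i", "int"), ("*_is", "int"),
    ("*_s", "string"), ("*_ss", "string"),
    ("*_l", "long"), ("*_ls", "long"),
    ("*_t", "text_general"),
]


def resolve_type(field_name):
    """Return the type name for field_name, or None if nothing matches."""
    # An explicit <field> definition always wins.
    if field_name in EXPLICIT_FIELDS:
        return EXPLICIT_FIELDS[field_name]
    # Otherwise fall back to the dynamic patterns (longer patterns first,
    # an assumption made in this sketch).
    for pattern, type_name in sorted(DYNAMIC_FIELDS, key=lambda p: len(p[0]), reverse=True):
        if fnmatchcase(field_name, pattern):
            return type_name
    return None


print(resolve_type("menu_s"))   # string
print(resolve_type("tags_ss"))  # string
print(resolve_type("count_i"))  # int
print(resolve_type("price"))    # None
```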

copyField node
```
  <copyField source="title" dest="text"/>
  <copyField source="author" dest="text"/>
  <copyField source="manu" dest="manu_exact"/>
  <copyField source="price" dest="price_c"/>
```
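Copies the source field into the destination field. With copyField you can gather several fields into one, so a search can cover all of them through that single field without a complicated combination of per-field query clauses.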
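
As a rough illustration of the effect, the Python sketch below applies the four copyField rules from the excerpt to a document before it would be indexed. `apply_copy_fields` is a hypothetical helper and the sample document is made up; Solr does this copying itself at index time.

```python
# (source, dest) pairs from the copyField excerpt.
COPY_FIELDS = [
    ("title", "text"),
    ("author", "text"),
    ("manu", "manu_exact"),
    ("price", "price_c"),
]


def apply_copy_fields(doc):
    """Return a copy of doc with every value as a list and copy rules applied."""
    # Normalise every field to a list of values so copies can be appended.
    as_lists = {n: (list(v) if isinstance(v, list) else [v]) for n, v in doc.items()}
    result = {n: list(v) for n, v in as_lists.items()}
    for source, dest in COPY_FIELDS:
        if source in as_lists:
            # Copy the original source values; the source field itself is unchanged.
            result.setdefault(dest, []).extend(as_lists[source])
    return result


doc = {"id": "1", "title": "Search basics", "author": "Alice"}
indexed = apply_copy_fields(doc)
print(indexed["text"])  # ['Search basics', 'Alice']
```

Because `text` receives values from both `title` and `author`, a destination that collects several sources needs to be declared multiValued.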

fieldType node

<fieldType name="string" class="solr.StrField" sortMissingLast="true" />
<fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
    <!-- in this example, we will only use synonyms at query time
    <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
    -->
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

A fieldType defines how values of that type are processed, including the tokenizer and filters that make up its analyzer.
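
The two analyzer chains of text_general can be imitated in Python to show the order of the steps. This is only a sketch under stated assumptions: the regex tokenizer is a crude stand-in for StandardTokenizerFactory, and the `STOPWORDS` and `SYNONYMS` tables are made-up substitutes for stopwords.txt and synonyms.txt.

```python
import re

STOPWORDS = {"a", "an", "the", "of"}     # hypothetical stand-in for stopwords.txt
SYNONYMS = {"tv": ["tv", "television"]}  # hypothetical stand-in for synonyms.txt


def tokenize(text):
    """Split text into word tokens (simplified tokenizer)."""
    return re.findall(r"\w+", text)


def stop_filter(tokens, ignore_case=True):
    """Drop stop words; with ignore_case the comparison is case-insensitive."""
    return [t for t in tokens if (t.lower() if ignore_case else t) not in STOPWORDS]


def synonym_filter(tokens, ignore_case=True):
    """Expand each token into its synonyms, mirroring expand="true"."""
    out = []
    for t in tokens:
        out.extend(SYNONYMS.get(t.lower() if ignore_case else t, [t]))
    return out


def lowercase_filter(tokens):
    return [t.lower() for t in tokens]


def analyze_index(text):
    # index analyzer: tokenizer -> stop filter -> lowercase
    return lowercase_filter(stop_filter(tokenize(text)))


def analyze_query(text):
    # query analyzer: tokenizer -> stop filter -> synonyms -> lowercase
    return lowercase_filter(synonym_filter(stop_filter(tokenize(text))))


print(analyze_index("The History of TV"))  # ['history', 'tv']
print(analyze_query("The History of TV"))  # ['history', 'tv', 'television']
```

Synonyms are expanded only on the query side here, matching the comment in the excerpt that synonyms are used only at query time.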

Other nodes

<uniqueKey>id</uniqueKey><!-- unique identifier; if a document with an existing key is indexed again, it overwrites the old one -->
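
The overwrite behaviour of the unique key can be pictured with a toy in-memory index in Python. `TinyIndex` is purely hypothetical and only models "same key replaces the old document".

```python
class TinyIndex:
    """Toy index keyed by the uniqueKey field (illustration only)."""

    def __init__(self, unique_key="id"):
        self.unique_key = unique_key
        self.docs = {}

    def add(self, doc):
        """Store doc; return True if it replaced an existing document."""
        key = doc[self.unique_key]
        replaced = key in self.docs
        self.docs[key] = doc
        return replaced


index = TinyIndex()
index.add({"id": "1", "name": "first version"})
index.add({"id": "1", "name": "second version"})  # same id: overwrites
print(len(index.docs), index.docs["1"]["name"])  # 1 second version
```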

-------------------------------------------------------------------------------------

solrconfig.xml

solrconfig.xml contains many settings; most of them can be left at their defaults.

dataDir: index data path
```
<dataDir>${solr.data.dir:}</dataDir>
```
Not specified by default; in that case the index is stored in the data directory under each core.
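
The `${name:default}` placeholders used in this file (for example `${solr.data.dir:}`, whose default is empty) take the value of a property if it is set and the text after the colon otherwise. The Python sketch below mimics that substitution; `substitute` is a hypothetical helper, the path `/var/solr/data` is a made-up example, and raising an error for a placeholder without a default is a choice made in this sketch.

```python
import re

# Matches ${name} or ${name:default}; group 2 is None when no ":" is present.
_PLACEHOLDER = re.compile(r"\$\{([^:}]+)(?::([^}]*))?\}")


def substitute(text, properties):
    """Replace ${name:default} placeholders using the given properties dict."""
    def replace(match):
        name, default = match.group(1), match.group(2)
        if name in properties:
            return properties[name]
        if default is not None:
            return default
        raise KeyError(f"no value and no default for property {name}")
    return _PLACEHOLDER.sub(replace, text)


print(repr(substitute("${solr.data.dir:}", {})))                    # ''
print(substitute("${solr.data.dir:}", {"solr.data.dir": "/var/solr/data"}))
print(substitute("${solr.autoCommit.maxTime:15000}", {}))           # 15000
```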

autoCommit: automatic commit

<autoCommit> 
    <maxTime>${solr.autoCommit.maxTime:15000}</maxTime> 
    <openSearcher>false</openSearcher> <!-- disabled by default -->
</autoCommit>

autoSoftCommit: soft commit (near-real-time search)
```

<autoSoftCommit> 
     <maxTime>${solr.autoSoftCommit.maxTime:-1}</maxTime> 
</autoSoftCommit>
```
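When Solr builds an index, the data is written to disk on commit. This is a hard commit, which ensures that no data is lost even in a power failure. Making new documents searchable in real time this way would require a commit after every update, which is expensive. To provide more timely search without sacrificing performance, Solr offers the soft commit: it commits data to memory only, so the changes become visible to searches but are not yet written to the index files on disk. A typical setup triggers a hard commit automatically every 1 to 10 minutes and a soft commit every second.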
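
The difference between the two commit types can be pictured with a toy model in Python. `CommitModel` is a hypothetical class that tracks only two things, what a search can see and what has reached the index files on disk; it is a deliberate simplification and not how Solr is implemented.

```python
class CommitModel:
    """Toy model of soft vs. hard commit (illustration only)."""

    def __init__(self):
        self.docs = []          # every document added so far
        self.visible_count = 0  # how many of them searches can see
        self.durable_count = 0  # how many are written to index files on disk

    def add(self, doc):
        self.docs.append(doc)

    def soft_commit(self):
        # Make everything searchable without flushing to disk.
        self.visible_count = len(self.docs)

    def hard_commit(self, open_searcher=False):
        # Flush to disk; with open_searcher=False visibility does not change.
        self.durable_count = len(self.docs)
        if open_searcher:
            self.visible_count = len(self.docs)

    def search(self):
        return self.docs[:self.visible_count]

    def on_disk(self):
        return self.docs[:self.durable_count]


model = CommitModel()
model.add("a")
model.soft_commit()
print(model.search(), model.on_disk())   # ['a'] []
model.add("b")
model.hard_commit(open_searcher=False)
print(model.search(), model.on_disk())   # ['a'] ['a', 'b']
```

With `openSearcher` set to false, as in the autoCommit excerpt, the hard commit makes the data durable but leaves visibility to the soft commit.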

Other

Solr (3): Full Indexing and Incremental Indexing