Solr 7: How to Dynamically Add Undefined Fields and Set the Default Field Type

Overview

Suppose data is imported into a Solr collection with select title, content from T_NOTICE, and managed-schema defines neither <field name="title" type="text_ansj"/> nor a matching dynamicField. The import still succeeds: by default Solr generates a new managed-schema file under collection/conf and automatically appends <field name="title" type="text_general"/> and <copyField source="title" dest="title_str" maxChars="256"/> to it. This article examines how that field is generated and how to set a custom FieldType for it.

Search for "unknown" in solrconfig.xml

Keywords

  • add-unknown-fields-to-the-schema
<!-- The update.autoCreateFields property can be turned to false to disable schemaless mode -->
  <updateRequestProcessorChain name="add-unknown-fields-to-the-schema" default="${update.autoCreateFields:true}"
           processor="uuid,remove-blank,field-name-mutating,parse-boolean,parse-long,parse-double,parse-date,add-schema-fields">
    <processor class="solr.LogUpdateProcessorFactory"/>
    <processor class="solr.DistributedUpdateProcessorFactory"/>
    <processor class="solr.RunUpdateProcessorFactory"/>
  </updateRequestProcessorChain>

The last entry in the processor attribute is add-schema-fields.

  • add-schema-fields
<updateProcessor class="solr.AddSchemaFieldsUpdateProcessorFactory" name="add-schema-fields">
    <lst name="typeMapping">
      <str name="valueClass">java.lang.String</str>
      <str name="fieldType">text_general</str>
      <lst name="copyField">
        <str name="dest">*_str</str>
        <int name="maxChars">256</int>
      </lst>
      <!-- Use as default mapping instead of defaultFieldType -->
      <bool name="default">true</bool>
    </lst>
    ...

As shown, the default mapping assigns the field type text_general to java.lang.String values.
In fact, solrconfig.xml already documents this:

       Add unknown fields to the schema

       Field type guessing update processors that will
       attempt to parse string-typed field values as Booleans, Longs,
       Doubles, or Dates, and then add schema fields with the guessed
       field types. Text content will be indexed as "text_general" as
       well as a copy to a plain string version in *_str.

       These require that the schema is both managed and mutable, by
       declaring schemaFactory as ManagedIndexSchemaFactory, with
       mutable specified as true.

See schema-factory-definition-in-solrconfig for details.
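
For reference, the schemaFactory declaration that the comment above requires typically looks like the following in solrconfig.xml. This is a sketch based on the Solr reference guide, not a snippet from the original project; managedSchemaResourceName is shown with its default value and can be omitted. When no schemaFactory is declared at all, Solr 7 implicitly uses a mutable ManagedIndexSchemaFactory.

```xml
<!-- Managed, mutable schema: required for add-schema-fields to work -->
<schemaFactory class="ManagedIndexSchemaFactory">
  <bool name="mutable">true</bool>
  <str name="managedSchemaResourceName">managed-schema</str>
</schemaFactory>
```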

Changing to a custom field type

Our project uses the ansj tokenizer, and the field type is text_ansj, so the mapping is changed as follows:

<updateProcessor class="solr.AddSchemaFieldsUpdateProcessorFactory" name="add-schema-fields">
    <lst name="typeMapping">
      <str name="valueClass">java.lang.String</str>
      <str name="fieldType">text_ansj</str>
      <!-- Use as default mapping instead of defaultFieldType -->
      <bool name="default">true</bool>
    </lst>
    ...
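
Note that the mapping above no longer contains the copyField block, so new fields will not get a *_str copy. If the plain string copy is still wanted, a variant that keeps it might look like the sketch below. It assumes that the text_ansj fieldType and the *_str dynamicField from the default managed-schema both exist in the schema.

```xml
<!-- Variant (assumption): custom text type plus the original *_str string copy -->
<lst name="typeMapping">
  <str name="valueClass">java.lang.String</str>
  <str name="fieldType">text_ansj</str>
  <lst name="copyField">
    <str name="dest">*_str</str>
    <int name="maxChars">256</int>
  </lst>
  <!-- Use as default mapping instead of defaultFieldType -->
  <bool name="default">true</bool>
</lst>
```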

For the default values of Solr field properties, see defining-fields; adjust them to your actual needs.
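
As an alternative to relying on automatic field creation, the fields can be declared explicitly in managed-schema with their properties spelled out. The attribute values below are illustrative assumptions rather than settings from the original project, and the text_ansj fieldType must already be defined in the schema.

```xml
<!-- Hypothetical explicit definitions; adjust indexed/stored/multiValued as needed -->
<field name="title" type="text_ansj" indexed="true" stored="true" multiValued="false"/>
<field name="content" type="text_ansj" indexed="true" stored="true" multiValued="false"/>
```

With these definitions in place, the add-schema-fields processor leaves title and content alone, because it only adds fields that the schema does not already match.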
