Configuring Solr with a Database: An Introduction

Latest recommended article published on 2021-11-07 12:58:00

shuangyidehudie

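This article uses MySQL as an example and walks through the setup in some detail. It relies on Solr's "dataimport" feature (the DataImportHandler).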

1. Add the following to conf\solrconfig.xml to enable the data import feature

 
   
          
<requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">data-config.xml</str>
  </lst>
</requestHandler>
       
 
 

2. Add a data source file, data-config.xml, to the conf\ directory with the following content:

 
<dataConfig>
  <dataSource type="JdbcDataSource"
              driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://172.0.0.1:3306/cmntadmin"
              user="root"
              password=""/>
  <document name="content">
    <entity name="node" query="select id,username,creator from forbiduser">
      <field column="id" name="id"/>
      <field column="username" name="name"/>
      <field column="creator" name="contents"/>
    </entity>
  </document>
</dataConfig>

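This file configures the data source. The content of the entity comes from the result of its "query". Each field maps one column of that result: "column" is the database column name, and "name" must match a field defined in "schema.xml".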
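Since every "name" in data-config.xml has to exist in schema.xml, a quick consistency check can catch typos before an import is attempted. The following is a minimal sketch, not part of Solr: the helper name unmapped_fields is made up for illustration, and the two XML strings are trimmed copies of the files in this article, inlined so that the script runs on its own.

```python
import xml.etree.ElementTree as ET

# Trimmed copies of the two files from this article, inlined for a self-contained run.
DATA_CONFIG = """
<dataConfig>
  <document name="content">
    <entity name="node" query="select id,username,creator from forbiduser">
      <field column="id" name="id"/>
      <field column="username" name="name"/>
      <field column="creator" name="contents"/>
    </entity>
  </document>
</dataConfig>
"""

SCHEMA = """
<schema name="example" version="1.5">
  <fields>
    <field name="id" type="string"/>
    <field name="name" type="text_general"/>
    <field name="contents" type="text_ik"/>
  </fields>
</schema>
"""


def unmapped_fields(data_config_xml, schema_xml):
    """Return data-config field names that schema.xml does not declare."""
    # Collect every <field name="..."> declared in the schema.
    schema_names = {f.get("name") for f in ET.fromstring(schema_xml).iter("field")}
    # Collect the target names used by the import mapping, in document order.
    mapped = [f.get("name") for f in ET.fromstring(data_config_xml).iter("field")]
    return [name for name in mapped if name not in schema_names]


print(unmapped_fields(DATA_CONFIG, SCHEMA))  # prints []
```

An empty list means every mapped name is declared. To check real files, read them from the conf\ directory and pass their contents to the same function.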

3. Create schema.xml

 
   
<?xml version="1.0" encoding="UTF-8" ?>
<schema name="example" version="1.5">
  <fields>
    <!-- If you remove this field, you must _also_ disable the update log in solrconfig.xml
         or Solr won't start. _version_ and update log are required for SolrCloud
    -->
    <field name="_version_" type="long" indexed="true" stored="true"/>

    <!-- points to the root document of a block of nested documents. Required for nested
         document support, may be removed otherwise
    -->
    <field name="_root_" type="string" indexed="true" stored="false"/>
    <field name="id" type="string" indexed="true" stored="true" required="true" multiValued="false" />
    <field name="name" type="text_general" indexed="true" stored="true"/>
    <field name="contents" type="text_ik" indexed="true" stored="true"/>
  </fields>

  <!-- Field to use to determine and enforce document uniqueness.
       Unless this field is marked with required="false", it will be a required field
  -->
  <uniqueKey>id</uniqueKey>

  <!-- DEPRECATED: The defaultSearchField is consulted by various query parsers when
       parsing a query string that isn't explicit about the field.  Machine (non-user)
       generated queries are best made explicit, or they can use the "df" request parameter
       which takes precedence over this.
       Note: Un-commenting defaultSearchField will be insufficient if your request handler
       in solrconfig.xml defines "df", which takes precedence. That would need to be removed.-->
  <defaultSearchField>contents</defaultSearchField>

  <copyField source="name" dest="contents"/>
  <solrQueryParser defaultOperator="OR"/>

  <types>
    <fieldType name="string" class="solr.StrField" sortMissingLast="true" />
    <fieldType name="long" class="solr.TrieLongField" precisionStep="0" positionIncrementGap="0"/>
    <fieldType name="text_general" class="solr.TextField" positionIncrementGap="100">
      <analyzer type="index">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
        <!-- in this example, we will only use synonyms at query time
        <filter class="solr.SynonymFilterFactory" synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
        -->
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt" />
        <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
    </fieldType>
    <fieldType name="text_ik" class="solr.TextField">
      <analyzer class="org.wltea.analyzer.lucene.IKAnalyzer"/>
    </fieldType>
  </types>
</schema>
       
 
 

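Important elements in schema.xml:

The copyField element is what lets Solr search the values of several fields at once. With the settings below, a query that does not name a field searches contents, which also holds the values copied from name:
<defaultSearchField>contents</defaultSearchField>

copyField copies the value of one field into another. For example, you can copy what is in name into the default search field, so that Solr also matches on the content of name when it searches.
<copyField source="name" dest="contents"/>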
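To make the effect of copyField concrete, here is a toy model of what the rule does to one document at index time. It is only a sketch of the idea and not Solr's implementation: the function apply_copy_fields and the sample values ("alice", "admin") are invented for illustration.

```python
def apply_copy_fields(doc, copy_rules):
    """Return a copy of doc with each (source, dest) copy rule applied.

    Values are kept in lists because a destination may receive several values.
    """
    result = {field: list(values) for field, values in doc.items()}
    for source, dest in copy_rules:
        if source in doc:
            # Append the source values to whatever the destination already holds.
            result.setdefault(dest, []).extend(doc[source])
    return result


# Hypothetical row: name comes from username, contents from creator.
doc = {"id": ["1"], "name": ["alice"], "contents": ["admin"]}
indexed = apply_copy_fields(doc, [("name", "contents")])
print(indexed["contents"])  # prints ['admin', 'alice']
```

After the copy, a search against contents can match "alice" even though that value was only supplied for name. Note that in Solr itself, a destination field that ends up with more than one value generally needs multiValued="true" in schema.xml.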

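4. Add the required JAR files

Because this article uses MySQL as the data source, the JDBC driver (mysql-connector.jar) is required. The dataimport feature also needs solr-dataimporthandler-4.7.2.jar and solr-dataimporthandler-extras-4.7.2.jar. These two do not have to be downloaded separately; they ship in the \dist directory.

Copy these three JAR files into the lib directory of the Solr web application under Tomcat (webapps\solr\WEB-INF\lib).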
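A missing JAR usually only shows up as a class-loading error after Tomcat restarts, so it can help to verify the lib directory first. The sketch below assumes the three file names exactly as written in this article (a real driver JAR normally carries a version number in its name, so adjust the list); missing_jars is a hypothetical helper, and the demo runs against a temporary directory standing in for webapps\solr\WEB-INF\lib.

```python
import tempfile
from pathlib import Path

# File names as given in this article; adjust to the versions actually installed.
REQUIRED_JARS = [
    "mysql-connector.jar",
    "solr-dataimporthandler-4.7.2.jar",
    "solr-dataimporthandler-extras-4.7.2.jar",
]


def missing_jars(lib_dir, required=REQUIRED_JARS):
    """Return the required JAR names that are not present in lib_dir."""
    lib = Path(lib_dir)
    return [name for name in required if not (lib / name).is_file()]


# Demo: a throwaway directory that contains only the MySQL driver.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "mysql-connector.jar").touch()
    demo_missing = missing_jars(tmp)

print(demo_missing)
```

In the demo the two dataimporthandler JARs are reported as missing. Pointing missing_jars at the real lib directory should return an empty list once all three files are in place.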

5. Build the index

Restart Tomcat.

A) A full index build can be triggered with a URL:

http://localhost:8080/solr/dataimport?command=full-import

B) Alternatively, run the import from the "dataimport" module on the admin page.
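The URL from option A) can also be requested from a script, for example as part of a scheduled job. The sketch below uses only the Python standard library; the base URL is the one from this article, and the helper names dataimport_url and run_command are made up for illustration. Only the URL is built here, since run_command needs a running Solr instance.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://localhost:8080/solr/dataimport"  # handler URL used in this article


def dataimport_url(command, base_url=BASE_URL, **params):
    """Build a DataImportHandler request URL for the given command."""
    query = {"command": command, **params}
    return f"{base_url}?{urlencode(query)}"


def run_command(command, **params):
    """Send the request to a running Solr instance and return the response body."""
    with urlopen(dataimport_url(command, **params), timeout=30) as response:
        return response.read().decode("utf-8")


full_import = dataimport_url("full-import")
print(full_import)  # prints http://localhost:8080/solr/dataimport?command=full-import
```

With Solr running, run_command("full-import") starts the import, and run_command("status") can then be polled to see how far it has progressed.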