Configuring Solr 3.6.2 to Index a MySQL Database

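Preface

     The steps below look like a lot, but they boil down to the following points:
  • Give Solr the MySQL database driver (step 2.1).
  • Tell Solr that data will be imported from an external source (step 2.2).
  • Tell Solr the MySQL address, user name, password, database name, and so on (step 2.3).
  • Tell Solr which index fields to create for the MySQL data (step 2.4).
  • Import the data from MySQL (step 2.6).
        The Chinese word-segmentation part that follows adds:
  • Create a new tokenized field type, "text_cn", in Solr.
  • Add the IKAnalyzer tokenizer jar.
  • Change the field types from step 2.4 to "text_cn".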

1.      Trial Run of Solr

Open cmd and change to the example directory under the Solr installation: cd  /d  apache-solr-3.6.2\example

Start Solr: java  -jar start.jar

Check that Solr is running by visiting http://localhost:8983/solr/admin/

2.      Configuring Solr to Index a MySQL Table

Preparation

Run the following in the local MySQL database:

SQL statements

DROP TABLE IF EXISTS `documents`; 
CREATE TABLE `documents` ( 
  `id` int(11) NOT NULL auto_increment, 
  `date_added` datetime NOT NULL, 
  `title` varchar(255) NOT NULL, 
  `content` text NOT NULL, 
  PRIMARY KEY  (`id`) 
) ENGINE=InnoDB AUTO_INCREMENT=3 DEFAULT CHARSET=utf8; 
-- ---------------------------- 
-- Records of documents 
-- ---------------------------- 
INSERT INTO `documents` VALUES ('1', '2012-01-11 23:15:59', 'world', 'test1'); 
INSERT INTO `documents` VALUES ('2', '2012-01-11 23:16:30', 'hello', 'test'); 
INSERT INTO `documents` VALUES ('3', now(), 'hello12', 'test'); 
INSERT INTO `documents` VALUES ('4', now(), '我们', 'test');

 

2.1.  Copy mysql-connector-java-5.1.25-bin.jar (download it separately) into the directory apache-solr-3.6.2\example\lib. This is the MySQL JDBC driver.

 

2.2.  Edit apache-solr-3.6.2\example\solr\conf\solrconfig.xml and add the following request handler to the file:

Insert into apache-solr-3.6.2\example\solr\conf\solrconfig.xml

<requestHandler name="/dataimport" 
       class="org.apache.solr.handler.dataimport.DataImportHandler"> 
         <lst name="defaults"> 
            <str name="config">data-config.xml</str> 
         </lst> 
</requestHandler>

Then, in the same file, immediately before the line <lib dir="../../dist/" regex="apache-solr-cell-\d.*\.jar" />, add:

<lib dir="../../dist/" regex="apache-solr-dataimporthandler-\d.*\.jar" />

 

2.3.  In the directory apache-solr-3.6.2\example\solr\conf\, create the file data-config.xml with the content below. It tells Solr the MySQL address, user name, and password, and which table to index.

apache-solr-3.6.2\example\solr\conf\data-config.xml

<dataConfig>   
	<dataSource type="JdbcDataSource" driver="com.mysql.jdbc.Driver"   
    url="jdbc:mysql://localhost/italk" user="username" password="password"/>  
	<document name="documents1" >  
		<entity name="documents"             query="SELECT id, content, date_added, title FROM documents"  > 
			<field column="id" name="id" /> 
			<field column="content" name="content" /> 
			<field column="title" name="title" /> 
			<field column="date_added" name="date_added" /> 
		</entity> 
	</document> 
</dataConfig>
 
 
 

 

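2.4.  Define Solr fields for the database columns by editing apache-solr-3.6.2\example\solr\conf\schema.xml

  • Delete everything between the <fields> and </fields> tags

  • Delete everything between <uniqueKey>id</uniqueKey> and </schema>

apache-solr-3.6.2\example\solr\conf\schema.xml: insert between the <fields> and </fields> tags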

<field name="id" type="string" indexed="true" stored="true" required="true" /> 
<field name="title" type="text_general" indexed="true" stored="true" termVectors="true" termPositions="true" termOffsets="true"/> 
<field name="content" type="text_general" indexed="true" stored="true" termVectors="true" termPositions="true" termOffsets="true"/> 
<field name="date_added" type="date" indexed="false" stored="true"/> 
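
Each name attribute in data-config.xml should match a field declared in schema.xml, and a mismatch is easy to overlook. Below is a minimal Python sketch of that cross-check using only the standard library. The helper name unmapped_fields and the inline XML samples are illustrative assumptions (the target name "added" is deliberately wrong); in practice you would read the two files from the conf directory instead.

```python
import xml.etree.ElementTree as ET

# Illustrative samples only; the mapping to "added" is deliberately wrong.
DATA_CONFIG = """
<dataConfig>
  <document name="documents1">
    <entity name="documents" query="SELECT id, content, date_added, title FROM documents">
      <field column="id" name="id" />
      <field column="content" name="content" />
      <field column="title" name="title" />
      <field column="date_added" name="added" />
    </entity>
  </document>
</dataConfig>
"""

SCHEMA_FIELDS = """
<fields>
  <field name="id" type="string" indexed="true" stored="true" required="true" />
  <field name="title" type="text_general" indexed="true" stored="true" />
  <field name="content" type="text_general" indexed="true" stored="true" />
  <field name="date_added" type="date" indexed="false" stored="true" />
</fields>
"""


def unmapped_fields(data_config_xml, schema_fields_xml):
    """Return data-config target names that are not declared as schema fields."""
    schema_names = {f.get("name") for f in ET.fromstring(schema_fields_xml).iter("field")}
    targets = [f.get("name") for f in ET.fromstring(data_config_xml).iter("field")]
    return [name for name in targets if name not in schema_names]


print(unmapped_fields(DATA_CONFIG, SCHEMA_FIELDS))
```

An empty list means every mapped column has a matching schema field.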


 

2.5.  Restart Solr: java  -jar start.jar

 

2.6.  Run a full import from the database by requesting: http://localhost:8983/solr/dataimport?command=full-import
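
The import can also be triggered from a script rather than the browser. The following Python sketch assumes Solr is running at the default address used in this article. dataimport_url and run are hypothetical helper names; command, clean, and commit are request parameters of the DataImportHandler.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

SOLR_BASE = "http://localhost:8983/solr"  # assumed: the default address used in this article


def dataimport_url(command, **params):
    """Build a DataImportHandler URL such as .../dataimport?command=full-import."""
    query = {"command": command}
    query.update(params)
    return SOLR_BASE + "/dataimport?" + urlencode(query)


def run(command, **params):
    """Send the request to a running Solr and return the raw response body."""
    with urlopen(dataimport_url(command, **params), timeout=10) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    print(dataimport_url("full-import", clean="true", commit="true"))
    print(dataimport_url("status"))
    # With Solr running, uncomment to start the import and then check its progress:
    # print(run("full-import"))
    # print(run("status"))
```

Requesting command=status after the import is a convenient way to see how many rows were fetched and whether the import has finished.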

 

2.7.  Visit http://localhost:8983/solr/admin/ again. The default query string is *:*, which matches every document.

Click the "Search" button to see all the indexed data.
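
The same match-all query can be sent to the /select handler directly. This Python sketch assumes the default address and the JSON response writer (wt=json). select_url and summarize are hypothetical helper names, and SAMPLE_RESPONSE is a hand-written illustration of the response shape, not captured output.

```python
import json
from urllib.parse import urlencode

SOLR_BASE = "http://localhost:8983/solr"  # assumed: the default address used in this article


def select_url(q="*:*", rows=10):
    """Build a /select URL; the default query *:* matches every document."""
    return SOLR_BASE + "/select?" + urlencode({"q": q, "rows": rows, "wt": "json"})


# Hand-written illustration of a Solr JSON response, not real output.
SAMPLE_RESPONSE = """
{"responseHeader": {"status": 0, "QTime": 1},
 "response": {"numFound": 2, "start": 0,
              "docs": [{"id": "1", "content": "test1"},
                       {"id": "2", "content": "test"}]}}
"""


def summarize(body):
    """Return (numFound, list of ids) from a Solr JSON response body."""
    response = json.loads(body)["response"]
    return response["numFound"], [doc["id"] for doc in response["docs"]]


print(select_url())
print(summarize(SAMPLE_RESPONSE))
```

Against a live instance, the body passed to summarize would come from fetching select_url() instead of the sample string.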

 

3.      Adding the Chinese Tokenizer IKAnalyzer

Use IKAnalyzer as Solr's tokenizer. IKAnalyzer download page: http://code.google.com/p/ik-analyzer/downloads/list

 

3.1.  Edit apache-solr-3.6.2\example\solr\conf\schema.xml to add a custom Chinese field type, text_cn, and configure its tokenizer.

Insert between the <types> and </types> tags of apache-solr-3.6.2\example\solr\conf\schema.xml

<fieldType name="text_cn" class="solr.TextField" positionIncrementGap="100">
            <analyzer type="index">
                <tokenizer class="org.wltea.analyzer.solr.IKTokenizerFactory"
                          isMaxWordLength="false"/>
                <filter class="solr.StopFilterFactory"
                    ignoreCase="true" words="stopwords.txt"/>
                <filter class="solr.WordDelimiterFilterFactory"
                    generateWordParts="1" generateNumberParts="1"
                    catenateWords="1" catenateNumbers="1" catenateAll="0"
                    splitOnCaseChange="1"/>
                <filter class="solr.LowerCaseFilterFactory"/>
                <filter class="solr.EnglishPorterFilterFactory"
                    protected="protwords.txt"/>
                <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
            </analyzer>
            <analyzer type="query">
                <tokenizer class="org.wltea.analyzer.solr.IKTokenizerFactory"
                          isMaxWordLength="true"/>
                <filter class="solr.StopFilterFactory"
                    ignoreCase="true" words="stopwords.txt"/>
                <filter class="solr.WordDelimiterFilterFactory"
                    generateWordParts="1" generateNumberParts="1"
                    catenateWords="1" catenateNumbers="1" catenateAll="0"
                    splitOnCaseChange="1"/>
                <filter class="solr.LowerCaseFilterFactory"/>
                <filter class="solr.EnglishPorterFilterFactory"
                    protected="protwords.txt"/>
                <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
            </analyzer>
        </fieldType>

 

3.2.  Add the IKAnalyzer tokenizer jar, IKAnalyzer2012_u6.jar, to the directory apache-solr-3.6.2\example\solr\lib

 

3.3.  In schema.xml, change the type of the content and title fields from "text_general" to "text_cn".


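3.4.  Test the tokenizer: restart Solr and open http://localhost:8983/solr/admin/analysis.jsp

         On that page, select the text_cn type, enter a sample sentence, and check whether it is segmented as expected.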

        

Reference: http://blog.csdn.net/fover717/article/details/7551867 (with thanks to its author)

 
