1. Create the table
-- ----------------------------
-- Table structure for `documents`
-- ----------------------------
DROP TABLE IF EXISTS `documents`;
CREATE TABLE `documents` (
  `id` int(11) NOT NULL auto_increment,
  `date_added` datetime NOT NULL,
  `title` varchar(255) NOT NULL,
  `content` text NOT NULL,
  PRIMARY KEY (`id`)
) ENGINE=InnoDB AUTO_INCREMENT=3 DEFAULT CHARSET=utf8;
-- ----------------------------
-- Records of documents
-- ----------------------------
INSERT INTO `documents` VALUES ('1', '2012-01-11 23:15:59', 'world', 'test1');
INSERT INTO `documents` VALUES ('2', '2012-01-11 23:16:30', 'hello', 'test');
2. Add the DataImportHandler to solr\conf\solrconfig.xml
<requestHandler name="/dataimport"
    class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">data-config.xml</str>
  </lst>
</requestHandler>
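(1) If you get the following error,
SEVERE: org.apache.solr.common.SolrException: Error loading class 'org.apache.solr.handler.dataimport.DataImportHandler'
add a <lib> directive like this one before the request handler, with dir pointing at your own Solr dist directory: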
<lib dir="E:/WindRiver/Solr/apache-solr-3.6.0/dist/" regex="apache-solr-dataimporthandler-\d.*\.jar" />
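The regex attribute decides which jars in that directory get loaded. As a rough illustration, the Python sketch below applies the same pattern to a few file names. The file names are hypothetical examples of what a dist directory might contain, and the sketch assumes that the whole file name has to match the pattern.

```python
import re

# Pattern copied from the <lib> directive above.
LIB_REGEX = re.compile(r"apache-solr-dataimporthandler-\d.*\.jar")

# Hypothetical listing of a dist/ directory; real file names may differ.
candidate_files = [
    "apache-solr-core-3.6.0.jar",
    "apache-solr-dataimporthandler-3.6.0.jar",
    "apache-solr-dataimporthandler-extras-3.6.0.jar",
]

# Assumption: the whole file name must match the pattern.
matched = [name for name in candidate_files if LIB_REGEX.fullmatch(name)]
print(matched)
```

Because the pattern requires a digit right after the last hyphen, only the versioned DataImportHandler jar in this made-up listing is selected.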
3. Also create data-config.xml in the solr/conf directory
<dataConfig>
  <dataSource type="JdbcDataSource"
      driver="com.mysql.jdbc.Driver"
      url="jdbc:mysql://localhost:3306/test"
      user="test"
      password="test"
  />
  <document name="documents1" >
    <entity name="documents"
        query="select id,title,content,date_added from documents"
        deltaImportQuery="select id,title,content,date_added from documents where ID='${dataimporter.delta.id}'"
        deltaQuery="select id from documents where date_added > '${dataimporter.last_index_time}'"
        deletedPkQuery="select id from documents where id=0">
      <field column="id" name="id" />
      <field column="title" name="title" />
      <field column="content" name="content" />
      <field column="date_added" name="date_added" />
    </entity>
  </document>
</dataConfig>
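The dataSource element above specifies the database connection (JDBC driver, URL, user, and password).
query: the SQL statement used for the initial (full) import into the index.
deltaImportQuery: fetches, by ID, the single row that needs to be indexed.
deltaQuery: the SQL statement used for incremental indexing; it returns the IDs of the rows that need to be reindexed.
deletedPkQuery: returns the IDs of the documents that should be deleted from the index. In this example it is a placeholder (id=0) that is not expected to match any row.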
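To make the interplay of query, deltaQuery, and deltaImportQuery more concrete, here is a minimal Python sketch. It uses the standard-library sqlite3 module as a stand-in for MySQL, and the function names full_import and delta_import are hypothetical. It only imitates the order in which the queries would be issued and is not how Solr implements them. The timestamp argument plays the role of ${dataimporter.last_index_time}.

```python
import sqlite3

# Stand-in for the MySQL `documents` table from step 1 (sqlite3 is used only
# so the sketch runs without a database server).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE documents ("
    "id INTEGER PRIMARY KEY, date_added TEXT NOT NULL, "
    "title TEXT NOT NULL, content TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO documents VALUES (?, ?, ?, ?)",
    [
        (1, "2012-01-11 23:15:59", "world", "test1"),
        (2, "2012-01-11 23:16:30", "hello", "test"),
    ],
)


def full_import(conn):
    """Mimic `query`: fetch every row for the initial index build."""
    return conn.execute(
        "SELECT id, title, content, date_added FROM documents ORDER BY id"
    ).fetchall()


def delta_import(conn, last_index_time):
    """Mimic the two-step delta import.

    Step 1 (`deltaQuery`): collect the ids of rows added after last_index_time.
    Step 2 (`deltaImportQuery`): fetch the full row for each of those ids.
    """
    changed_ids = [
        row[0]
        for row in conn.execute(
            "SELECT id FROM documents WHERE date_added > ? ORDER BY id",
            (last_index_time,),
        )
    ]
    rows = []
    for doc_id in changed_ids:
        rows.extend(
            conn.execute(
                "SELECT id, title, content, date_added "
                "FROM documents WHERE id = ?",
                (doc_id,),
            ).fetchall()
        )
    return rows


all_rows = full_import(conn)
delta_rows = delta_import(conn, "2012-01-11 23:16:00")
print(len(all_rows), [row[0] for row in delta_rows])
```

With the two sample rows from step 1, the full import returns both rows, while a delta import with a last index time of 2012-01-11 23:16:00 picks up only the row with id 2.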
4. Declare the indexed fields and their types in schema.xml
<field name="id" type="string" indexed="true" stored="true" required="true" />
<field name="title" type="text" indexed="true" stored="true" termVectors="true" termPositions="true" termOffsets="true"/>
<field name="content" type="text" indexed="true" stored="true" termVectors="true" termPositions="true" termOffsets="true"/>
<field name="date_added" type="date" indexed="false" stored="true"/>
(1) If you get the following error,
SEVERE: org.apache.solr.common.SolrException: undefined field text
add the following to schema.xml:
<field name="text" type="text_general" stored="false" indexed="true" multiValued="true"/>
<defaultSearchField>text</defaultSearchField>
<copyField source="title" dest="text"/>
<copyField source="content" dest="text"/>
5. Run the import by requesting this URL
http://localhost:8080/solr/dataimport?command=full-import
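The same handler also accepts other commands, such as delta-import (which uses the delta queries from step 3) and status. The Python sketch below builds these URLs and, optionally, sends the request. The helper names dataimport_url and run_command are hypothetical, and the base URL is simply the one used above, so adjust it to your own deployment.

```python
from urllib.parse import urlencode
from urllib.request import urlopen

# Base handler URL taken from step 5; adjust host, port, and path as needed.
DATAIMPORT_URL = "http://localhost:8080/solr/dataimport"


def dataimport_url(command, **params):
    """Build a DataImportHandler request URL for the given command."""
    query = {"command": command}
    query.update(params)
    return DATAIMPORT_URL + "?" + urlencode(query)


def run_command(command, **params):
    """Send the request and return the raw response body (needs a running Solr)."""
    with urlopen(dataimport_url(command, **params)) as response:
        return response.read().decode("utf-8")


full_url = dataimport_url("full-import")
delta_url = dataimport_url("delta-import")
status_url = dataimport_url("status")
print(full_url)
print(delta_url)
print(status_url)
```

Calling run_command("status") against a running instance should return the handler's status response. Extra request parameters can be passed as keyword arguments, for example dataimport_url("full-import", clean="false") if the existing index should not be cleared before the import.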