MySQL-to-Solr data latency: scheduled, incremental sync of MySQL data to Solr with Solr DIH

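This article explains in detail how to set up Solr's Data Import Handler (DIH) for scheduled incremental sync from a MySQL database to Solr, covering configuration file changes, error diagnosis, and fixes, so that the indexed data stays up to date.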

Base environment:

(2) Running the delta import as a scheduled task:

Many people use Windows Task Scheduler or Linux cron to request the delta-import URL at regular intervals. This works and should not cause any problems.
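For reference, the cron-style approach boils down to a periodic HTTP GET against the DIH endpoint. The following is a minimal sketch, assuming the handler is registered at /dataimport; the host and core names are placeholders, and the function names are made up for this illustration. It passes clean=false because DIH's clean parameter removes existing documents before importing.

```python
import urllib.parse
import urllib.request


def delta_import_url(host, port, core, webapp="solr"):
    """Build the DIH delta-import URL for one core (names are placeholders)."""
    query = urllib.parse.urlencode(
        {"command": "delta-import", "clean": "false", "commit": "true"}
    )
    return f"http://{host}:{port}/{webapp}/{core}/dataimport?{query}"


def trigger_delta_import(url, timeout=10):
    """Send one GET request and return the HTTP status code."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.status


# Building the URL needs no running server; trigger_delta_import() does.
print(delta_import_url("localhost", 8983, "collection1"))
```

A cron entry or scheduled task would then call trigger_delta_import() once per interval and alert on any status other than 200.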

A more convenient option, and one that integrates more closely with Solr itself, is the scheduler add-on for the Data Import Handler.

1. Download apache-solr-dataimportscheduler-1.0.jar and put it in the WEB-INF/lib directory of the solr application under Tomcat's webapps:

Download: http://yunpan.cn/cdIpMthFdFcgn (extraction code: 5a1c)

Since I use a Jetty + ZooKeeper setup,

I put apache-solr-dataimportscheduler-1.0.jar in solr-4.10.4/example/solr-webapp/webapp/WEB-INF/lib instead.


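2. Part of the configuration file: db-data-config.xml

File location: /solr-4.10.4/example/solr/collection1/conf

3. Head and tail of the configuration file (the JDBC data source attributes):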

url="jdbc:mysql://ip:3306/database"

user="username"

password="password"   />

batchSize="-1"/>

4. Edit the configuration file dataimport.properties

I placed it in /solr-4.10.4/example/solr/conf.

The configuration file is as follows:

#################################################
#                                               #
#       dataimport scheduler properties         #
#                                               #
#################################################

# to sync or not to sync
# 1 - active; anything else - inactive
syncEnabled=1

# which cores to schedule
# in a multi-core environment you can decide which cores you want synchronized
# leave empty or comment it out if using single-core deployment
syncCores=game,resource

# solr server name or IP address
# [defaults to localhost if empty]
server=ip

# solr server port
# [defaults to 80 if empty]
port=8983

# application name/context
# [defaults to current ServletContextListener's context (app) name]
webapp=solr

# URL params [mandatory]
# remainder of URL
# note: clean=true makes DIH delete the existing documents before importing;
# clean=false is the usual choice for delta-import
params=/dataimport?command=delta-import&clean=true&commit=true

# schedule interval
# number of minutes between two runs
# [defaults to 30 if empty]
interval=1

# interval between full index rebuilds, in minutes; default 7200
# (7200 minutes is 5 days; use 1440 for one day)
# empty, 0, or commented out: never rebuild the index
reBuildIndexInterval=7200

# parameters for the index rebuild
reBuildIndexParams=/dataimport?command=full-import&clean=true&commit=true

# start time for the rebuild interval timer; the first actual run happens at
# reBuildIndexBeginTime + reBuildIndexInterval*60*1000
# two formats: 2012-04-11 03:10:00 or 03:10:00; the latter takes its date from the day the service started
reBuildIndexBeginTime=03:10:00
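To make the effect of these settings concrete, here is a rough Python sketch of how the properties combine into the URLs the scheduler requests and into the first rebuild time. It is an illustration, not the scheduler's actual code: the function names are invented, the "solr" fallback for webapp is an assumption, and the single-core case (empty syncCores) is not modelled.

```python
from datetime import datetime, timedelta


def parse_properties(text):
    """Parse simple key=value lines, skipping blank lines and # comments."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition("=")  # split at the first '=' only
        props[key.strip()] = value.strip()
    return props


def scheduled_urls(props):
    """Return one URL per core listed in syncCores."""
    server = props.get("server") or "localhost"  # documented default
    port = props.get("port") or "80"             # documented default
    webapp = props.get("webapp") or "solr"       # assumption for this sketch
    cores = [c.strip() for c in props.get("syncCores", "").split(",") if c.strip()]
    return [f"http://{server}:{port}/{webapp}/{core}{props['params']}" for core in cores]


def first_rebuild_time(begin, interval_minutes):
    """First full rebuild = reBuildIndexBeginTime + reBuildIndexInterval minutes."""
    return begin + timedelta(minutes=interval_minutes)


SAMPLE = """
syncEnabled=1
syncCores=game,resource
server=ip
port=8983
webapp=solr
params=/dataimport?command=delta-import&clean=true&commit=true
interval=1
"""

for url in scheduled_urls(parse_properties(SAMPLE)):
    print(url)
print(first_rebuild_time(datetime(2015, 8, 20, 3, 10, 0), 7200))
```

With the sample values, the first URL has the same shape as the "Full URL" entries in the scheduler log further down, and a 7200-minute interval places the first rebuild five days after the begin time.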

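5. On first startup you will see:

sorry, no dataimport-handler defined!

Fix

In solrconfig.xml under example/solr/collection1/conf, register a request handler that points at db-data-config.xml (the standard DIH registration):

<requestHandler name="/dataimport" class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">db-data-config.xml</str>
  </lst>
</requestHandler>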

6. Error messages after startup:

- 2015-08-19 23:31:13.591; org.apache.solr.handler.dataimport.scheduler.BaseTimerTask; [game] Response message Not Found
INFO - 2015-08-19 23:31:13.592; org.apache.solr.handler.dataimport.scheduler.BaseTimerTask; [game] Response code 404
INFO - 2015-08-19 23:31:13.592; org.apache.solr.core.SolrResourceLoader; JNDI not configured for solr (NoInitialContextEx)
INFO - 2015-08-19 23:31:13.593; org.apache.solr.core.SolrResourceLoader; solr home defaulted to 'solr/' (could not find system property or JNDI)
INFO - 2015-08-19 23:31:13.593; org.apache.solr.core.SolrResourceLoader; new SolrResourceLoader for deduced Solr Home: 'solr/'
INFO - 2015-08-19 23:31:13.609; org.apache.solr.handler.dataimport.scheduler.SolrDataImportProperties; Instance dir = solr/

Cause: as the log shows, the solr.solr.home system property is not set, so Solr home defaults to 'solr/'.

Fix: start Solr with solr.solr.home set explicitly:

java -Dsolr.solr.home=/home/hadoop/cloudsolr/solr-4.10.4/example -DzkHost=192.168.0.157:2181,192.168.0.158:2181,192.168.0.159:2181 -DnumShards=2 -Dbootstrap_confdir=./solr/collection1/conf -Dcollection.configName=myconf -jar start.jar

7. The next error:

1045 [main] ERROR org.apache.solr.handler.dataimport.scheduler.SolrDataImportProperties – Error locating DataImportScheduler dataimport.properties file

java.io.FileNotFoundException: /home/hadoop/cloudsolr/solr-4.10.4/example/conf/dataimport.properties (No such file or directory)

Fix: move dataimport.properties into the directory named in the error, that is, the conf directory directly under Solr home.

8. Error message:

...ter – Could not start Solr. Check solr/home property and the logs
1146 [main] ERROR org.apache.solr.core.SolrCore – null:org.apache.solr.common.SolrException: solr.xml does not exist in /home/hadoop/cloudsolr/solr-4.10.4/example/solr.xml cannot start Solr

at org.apache.solr.core.ConfigSolr.fromFile(ConfigSolr.java:62)

Fix: copy solr.xml into the directory named in the error.
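Errors 7 and 8 both come down to files missing under the directory passed as solr.solr.home. A small pre-flight check can catch them before startup. This is a sketch based only on the two paths that appear in the error messages above; the function name is made up for this example.

```python
from pathlib import Path


def missing_files(solr_home):
    """List the files from errors 7 and 8 that are absent under solr_home."""
    home = Path(solr_home)
    expected = [
        home / "solr.xml",                       # error 8
        home / "conf" / "dataimport.properties", # error 7
    ]
    return [str(path) for path in expected if not path.is_file()]


if __name__ == "__main__":
    # The path below is the Solr home used in this article.
    for path in missing_files("/home/hadoop/cloudsolr/solr-4.10.4/example"):
        print("missing:", path)
```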

9. Error message:

...in] ERROR org.apache.solr.servlet.SolrDispatchFilter – Could not start Solr. Check solr/home property and the logs
3230 [main] ERROR org.apache.solr.core.SolrCore – null:org.apache.solr.common.SolrException: Found multiple cores with the name [collection1], with instancedirs [/home/hadoop/cloudsolr/solr-4.10.4/example/example-schemaless/solr/collection1/] and [/home/hadoop/cloudsolr/solr-4.10.4/example/solr/collection1/]

Fix: rename the example core under example-schemaless/solr/collection1 to something else, and change the name in its core.properties as well.
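Because Solr home now points at the example directory, every core.properties file below it is picked up, which is how the two collection1 cores collide. The sketch below looks for such collisions ahead of time. It is a simplification rather than Solr's own discovery logic: it assumes a core's name comes from the name= line in core.properties and otherwise falls back to the directory name, and it scans all subdirectories without stopping at the first core it finds.

```python
from collections import defaultdict
from pathlib import Path


def core_name(props_file):
    """Read name= from core.properties; fall back to the directory name."""
    for line in props_file.read_text(encoding="utf-8").splitlines():
        key, sep, value = line.partition("=")
        if sep and key.strip() == "name" and value.strip():
            return value.strip()
    return props_file.parent.name


def duplicate_cores(solr_home):
    """Map each core name that occurs more than once to its instance dirs."""
    by_name = defaultdict(list)
    for props_file in sorted(Path(solr_home).rglob("core.properties")):
        by_name[core_name(props_file)].append(str(props_file.parent))
    return {name: dirs for name, dirs in by_name.items() if len(dirs) > 1}


if __name__ == "__main__":
    for name, dirs in duplicate_cores("/home/hadoop/cloudsolr/solr-4.10.4/example").items():
        print(name, "->", dirs)
```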

10. Another error while the scheduler is running:

... Adding 'file:/home/hadoop/cloudsolr/solr-4.10.4/example/lib/jetty-http-8.1.10.v20130312.jar' to classloader
481115 [Timer-0] INFO org.apache.solr.handler.dataimport.scheduler.SolrDataImportProperties – Instance dir = /home/hadoop/cloudsolr/solr-4.10.4/example/
481116 [Timer-0] INFO org.apache.solr.handler.dataimport.scheduler.BaseTimerTask – [resource] Disconnected from server ip
481117 [Timer-0] INFO org.apache.solr.handler.dataimport.scheduler.BaseTimerTask – [resource] Process ended at ................ 20.08.2015 01:37:00 595
541047 [Timer-0] INFO org.apache.solr.handler.dataimport.scheduler.BaseTimerTask – [game] Process started at .............. 20.08.2015 01:38:00 525
541049 [Timer-0] INFO org.apache.solr.handler.dataimport.scheduler.BaseTimerTask – [game] Full URL http://ip:8983/solr/game/dataimport?command=delta-import&clean=true&commit=true
541057 [Timer-0] INFO org.apache.solr.handler.dataimport.scheduler.BaseTimerTask – [game] Response message Not Found
541058 [Timer-0] INFO org.apache.solr.handler.dataimport.scheduler.BaseTimerTask – [game] Response code 404
541058 [Timer-0] INFO org.apache.solr.core.SolrResourceLoader – JNDI not configured for solr (NoInitialContextEx)
541059 [Timer-0] INFO org.apache.solr.core.SolrResourceLoader – using system property solr.solr.home: /home/hadoop/cloudsolr/solr-4.10.4/example
541059 [Timer-0] INFO org.apache.solr.core.SolrResourceLoader – new SolrResourceLoader for deduced Solr Home: '/home/hadoop/cloudsolr/solr-4.10.4/example/'
541061 [Timer-0] INFO org.apache.solr.core.SolrResourceLoader – Adding 'file:/home/hadoop/cloudsolr/solr-4.10.4/example/lib/jetty-deploy-8.1.10.v20130312.jar' to classloader
541061 [Timer-0] INFO org.apache.solr.core.SolrResourceLoader – Adding 'file:/home/hadoop/cloudsolr/solr-4.10.4/example/lib/jetty-xml-8.1.10.v20130312.jar' to classloader
541062 [Timer-0] INFO org.apache.solr.core.SolrResourceLoader – Adding 'file:/home/hadoop/cloudsolr/solr-4.10.4/example/lib/jetty-servlet-8.1.10.v20130312.jar' to classloader
541062 [Timer-0] INFO org.apache.solr.core.SolrResourceLoader – Adding 'file:/home/hadoop/cloudsolr/solr-4.10.4/example/lib/jetty-io-8.1.10.v20130312.jar' to classloader
541063 [Timer-0] INFO org.apache.solr.core.SolrResourceLoader – Adding 'file:/home/hadoop/cloudsolr/solr-4.10.4/example/lib/jetty-util-8.1.10.v20130312.jar' to classloader
541063 [Timer-0] INFO org.apache.solr.core.SolrResourceLoader – Adding 'file:/home/hadoop/cloudsolr/solr-4.10.4/example/lib/jetty-security-8.1.10.v20130312.jar' to classloader
541064 [Timer-0] INFO org.apache.solr.core.SolrResourceLoader – Adding 'file:/home/hadoop/cloudsolr/solr-4.10.4/example/lib/jetty-server-8.1.10.v20130312.jar' to classloader
541065 [Timer-0] INFO org.apache.solr.core.SolrResourceLoader – Adding 'file:/home/hadoop/cloudsolr/solr-4.10.4/example/lib/jetty-continuation-8.1.10.v20130312.jar' to classloader
541065 [Timer-0] INFO org.apache.solr.core.SolrResourceLoader – Adding 'file:/home/hadoop/cloudsolr/solr-4.10.4/example/lib/ext/' to classloader
541066 [Timer-0] INFO org.apache.solr.core.SolrResourceLoader – Adding 'file:/home/hadoop/cloudsolr/solr-4.10.4/example/lib/jetty-webapp-8.1.10.v20130312.jar' to classloader
541067 [Timer-0] INFO org.apache.solr.core.SolrResourceLoader – Adding 'file:/home/hadoop/cloudsolr/solr-4.10.4/example/lib/servlet-api-3.0.jar' to classloader
541067 [Timer-0] INFO org.apache.solr.core.SolrResourceLoader – Adding 'file:/home/hadoop/cloudsolr/solr-4.10.4/example/lib/jetty-jmx-8.1.10.v20130312.jar' to classloader
541068 [Timer-0] INFO org.apache.solr.core.SolrResourceLoader – Adding 'file:/home/hadoop/cloudsolr/solr-4.10.4/example/lib/jetty-http-8.1.10.v20130312.jar' to classloader
541085 [Timer-0] INFO org.apache.solr.handler.dataimport.scheduler.SolrDataImportProperties – Instance dir = /home/hadoop/cloudsolr/solr-4.10.4/example/
541085 [Timer-0] INFO org.apache.solr.handler.dataimport.scheduler.BaseTimerTask – [game] Disconnected from server ip
541086 [Timer-0] INFO org.apache.solr.handler.dataimport.scheduler.BaseTimerTask – [game] Process ended at ................ 20.08.2015 01:38:00 564
541086 [Timer-0] INFO org.apache.solr.handler.dataimport.scheduler.BaseTimerTask – [resource] Process started at .............. 20.08.2015 01:38:00 564
541087 [Timer-0] INFO org.apache.solr.handler.dataimport.scheduler.BaseTimerTask – [resource] Full URL http://ip:8983/solr/resource/dataimport?command=delta-import&clean=true&commit=true
541091 [Timer-0] INFO org.apache.solr.handler.dataimport.scheduler.BaseTimerTask – [resource] Response message Not Found
541091 [Timer-0] INFO org.apache.solr.handler.dataimport.scheduler.BaseTimerTask – [resource] Response code 404
541091 [Timer-0] INFO org.apache.solr.core.SolrResourceLoader – JNDI not configured for solr (NoInitialContextEx)
541091 [Timer-0] INFO org.apache.solr.core.SolrResourceLoader – using system property solr.solr.home: /home/hadoop/cloudsolr/solr-4.10.4/example
541091 [Timer-0] INFO org.apache.solr.core.SolrResourceLoader – new SolrResourceLoader for deduced Solr Home: '/home/hadoop/cloudsolr/solr-4.10.4/example/'
541092 [Timer-0] INFO org.apache.solr.core.SolrResourceLoader – Adding 'file:/home/hadoop/cloudsolr/solr-4.10.4/example/lib/jetty-deploy-8.1.10.v20130312.jar' to classloader
541092 [Timer-0] INFO org.apache.solr.core.SolrResourceLoader – Adding 'file:/home/hadoop/cloudsolr/solr-4.10.4/example/lib/jetty-xml-8.1.10.v20130312.jar' to classloader
541092 [Timer-0] INFO org.apache.solr.core.SolrResourceLoader – Adding 'file:/home/hadoop/cloudsolr/solr-4.10.4/example/lib/jetty-servlet-8.1.10.v20130312.jar' to classloader
541092 [Timer-0] INFO org.apache.solr.core.SolrResourceLoader – Adding 'file:/home/hadoop/cloudsolr/solr-4.10.4/example/lib/jetty-io-8.1.10.v20130312.jar' to classloader
541093 [Timer-0] INFO org.apache.solr.core.SolrResourceLoader – Adding 'file:/home/hadoop/cloudsolr/solr-4.10.4/example/lib/jetty-util-8.1.10.v20130312.jar' to classloader
541093 [Timer-0] INFO org.apache.solr.core.SolrResourceLoader – Adding 'file:/home/hadoop/cloudsolr/solr-4.10.4/example/lib/jetty-security-8.1.10.v20130312.jar' to classloader
541093 [Timer-0] INFO org.apache.solr.core.SolrResourceLoader – Adding 'file:/home/hadoop/cloudsolr/solr-4.10.4/example/lib/jetty-server-8.1.10.v20130312.jar' to classloader
541093 [Timer-0] INFO org.apache.solr.core.SolrResourceLoader – Adding 'file:/home/hadoop/cloudsolr/solr-4.10.4/example/lib/jetty-continuation-8.1.10.v20130312.jar' to classloader
541094 [Timer-0] INFO org.apache.solr.core.SolrResourceLoader – Adding 'file:/home/hadoop/cloudsolr/solr-4.10.4/example/lib/ext/' to classloader
541094 [Timer-0] INFO org.apache.solr.core.SolrResourceLoader – Adding 'file:/home/hadoop/cloudsolr/solr-4.10.4/example/lib/jetty-webapp-8.1.10.v20130312.jar' to classloader
541094 [Timer-0] INFO org.apache.solr.core.SolrResourceLoader – Adding 'file:/home/hadoop/cloudsolr/solr-4.10.4/example/lib/servlet-api-3.0.jar' to classloader
541094 [Timer-0] INFO org.apache.solr.core.SolrResourceLoader – Adding 'file:/home/hadoop/cloudsolr/solr-4.10.4/example/lib/jetty-jmx-8.1.10.v20130312.jar' to classloader
541094 [Timer-0] INFO org.apache.solr.core.SolrResourceLoader – Adding 'file:/home/hadoop/cloudsolr/solr-4.10.4/example/lib/jetty-http-8.1.10.v20130312.jar' to classloader
541106 [Timer-0] INFO org.apache.solr.handler.dataimport.scheduler.SolrDataImportProperties – Instance dir = /home/hadoop/cloudsolr/solr-4.10.4/example/
541106 [Timer-0] INFO org.apache.solr.handler.dataimport.scheduler.BaseTimerTask – [resource] Disconnected from server ip
541111 [Timer-0] INFO org.apache.solr.handler.dataimport.scheduler.BaseTimerTask – [resource] Process ended at ................ 20.08.2015 01:38:00 589

Cause:

The 1.0 scheduler jar does not support this Solr version.

Fix:

Switch to version 1.1 of the jar.

Cause of the next error: the comparison operator in the deltaQuery attribute needs to be written as an XML entity.

deltaQuery="select id, content, avgfeel, state, sentencenum, articlenum,updatetime, createtime  from bns_word  where  updatetime  >=  '${dataimporter.last_index_time}'">

Writing greater-than and less-than signs in XML:

Original symbol    Replacement
<                  &lt;
<=                 &lt;=
>                  &gt;
>=                 &gt;=
&                  &amp;
'                  &apos;
"                  &quot;

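Rather than escaping a query by hand, the replacements in the table can be produced programmatically. The sketch below uses Python's standard xml.sax.saxutils.escape, which handles &, < and > by default; the extra mapping for the double quote is added here because the query sits inside a double-quoted attribute. The shortened query is only an example.

```python
from xml.sax.saxutils import escape


def escape_for_attribute(sql):
    """Escape &, <, > and double quotes so the SQL fits in an XML attribute."""
    return escape(sql, {'"': "&quot;"})


delta_query = (
    "select id from bns_word "
    "where updatetime >= '${dataimporter.last_index_time}'"
)
print(f'deltaQuery="{escape_for_attribute(delta_query)}"')
```

The single quotes around the DIH variable are left alone, which is fine as long as the attribute itself is delimited with double quotes.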
11. After an import, the console reports success but Solr queries return no data

Cause:

In the db-data-config.xml configuration file:

query ="select id, uid, createname, createheadimg, wid, word, content, articlenum, state, feel, forwardnum, supportnum, updatetime, createtime from bns_sentence"

deltaImportQuery ="select id, uid, createname, createheadimg, wid, word, content, articlenum, state, feel, forwardnum, supportnum, updatetime, createtime from bns_sentence where id='${dataimporter.delta.id}'"

The id in dataimporter.delta.id must be written in lowercase.

12. Error at startup after finishing the configuration:

48 [coreLoadExecutor-5-thread-1] ERROR org.apache.solr.core.CoreContainer – Error creating core [collection1]: RequestHandler init failure
org.apache.solr.common.SolrException: RequestHandler init failure
    at org.apache.solr.core.SolrCore.<init>(SolrCore.java:881)
    at org.apache.solr.core.SolrCore.<init>(SolrCore.java:654)
    at org.apache.solr.core.CoreContainer.create(CoreContainer.java:491)
    at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:255)
    at org.apache.solr.core.CoreContainer$1.call(CoreContainer.java:249)
    at java.util.concurrent.FutureTask.run(FutureTask.java:262)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
    at java.lang.Thread.run(Thread.java:745)
Caused by: org.apache.solr.common.SolrException: RequestHandler init failure
    at org.apache.solr.core.RequestHandlers.initHandlersFromConfig(RequestHandlers.java:172)
    at org.apache.solr.core.SolrCore.<init>(SolrCore.java:800)
    ... 8 more
Caused by: org.apache.solr.common.SolrException: Error loading class 'org.apache.solr.handler.dataimport.DataImportHandler'
    at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:490)
    at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:421)
    at org.apache.solr.core.SolrCore.createInstance(SolrCore.java:551)
    at org.apache.solr.core.SolrCore.createRequestHandler(SolrCore.java:624)
    at org.apache.solr.core.RequestHandlers.initHandlersFromConfig(RequestHandlers.java:158)
    ... 9 more
Caused by: java.lang.ClassNotFoundException: org.apache.solr.handler.dataimport.DataImportHandler
    at java.net.URLClassLoader$1.run(URLClassLoader.java:366)
    at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
    at java.net.FactoryURLClassLoader.loadClass(URLClassLoader.java:789)
    at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
    at java.lang.Class.forName0(Native Method)
    at java.lang.Class.forName(Class.java:274)
    at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:474)
    ... 13 more

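Cause: as the ClassNotFoundException shows, the DataImportHandler classes are not on Solr's classpath.

Fix:

Package download: http://yunpan.cn/cHTNPkchYSCrX (extraction code: e5ee)

Copy the following jars from solr-4.10.4/dist

solr-dataimporthandler-4.10.4.jar

solr-dataimporthandler-extras-4.10.4.jar

into the lib directory of the Solr web application, then restart.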

[root@devnote ~]# cp solr-4.5.1/dist/solr-dataimporthandler-*.jar /opt/tomcat/webapps/solr/WEB-INF/lib/

13. Deleting all data from Solr:

http://ip:port/solr/corename/update/?stream.body=%3Cdelete%3E%3Cquery%3E*:*%3C/query%3E%3C/delete%3E&stream.contentType=text/xml;charset=utf-8&commit=true

Reference: http://josh-persistence.iteye.com/blog/2017155
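The percent-encoded URL above is hard to read and easy to mistype. As a sketch, the same request can be assembled from readable parts with Python's standard library; the host, port and core arguments are placeholders, and the function name is made up for this example. urlencode escapes a few more characters than the hand-written URL does (for example * and :), which decodes to the same parameter values.

```python
import urllib.parse


def delete_all_url(host, port, core):
    """Build the update URL that deletes every document in one core."""
    params = {
        "stream.body": "<delete><query>*:*</query></delete>",
        "stream.contentType": "text/xml;charset=utf-8",
        "commit": "true",
    }
    return f"http://{host}:{port}/solr/{core}/update?" + urllib.parse.urlencode(params)


print(delete_all_url("localhost", 8983, "collection1"))
```

Requesting this URL wipes the whole core, so it is worth double-checking the core name before sending it.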

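14. If Solr is integrated with Tomcat (see http://www.aboutyun.com/thread-10496-1-1.html), this step is required:

Edit web.xml under Solr's WEB-INF directory

and add the following child element to the <web-app> element:

<listener>
  <listener-class>org.apache.solr.handler.dataimport.scheduler.ApplicationListener</listener-class>
</listener>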

15. If an "Unsupported Media Type" error appears and the delta import fails

Cause: my Solr is deployed under Tomcat, and its /WEB-INF/lib directory contained apache-solr-dataimportscheduler-1.0.jar.

Fix: delete apache-solr-dataimportscheduler-1.0.jar from /WEB-INF/lib and replace it with solr-dataimportscheduler-1.1.jar.

Package download: http://yunpan.cn/cHTNPkchYSCrX (extraction code: e5ee)
