Elasticsearch-jdbc实现MySQL同步到ElasticSearch详解

elasticsearch-jdbc V2.3.2.0版本不需要安装。以下使用的elasticsearch也是2.3.2测试。 
操作系统:CentOS release 6.5 
1)elasticsearch 2.4.4 安装成功,测试ok。 
2)mysql安装成功,能实现增、删、改、查。 

可供测试的数据库为test,表为cc,具体信息如下:

mysql> select * from cc;
+----+------------+
| id | name |
+----+------------+
| 1 | laoyang |
| 2 | dluzhang |
| 3 | dlulaoyang |
+----+------------+
3 rows in set (0.00 sec)

 

第一步:下载工具。 
地址:http://xbib.org/repository/org/xbib/elasticsearch/importer/elasticsearch-jdbc/2.3.2.0/elasticsearch-jdbc-2.3.2.0-dist.zip 
第二步:导入Centos。路径自己定,笔者放到根目录下,解压。unzip elasticsearch-jdbc-2.3.2.0-dist.zip 
第三步:设置环境变量。

[root@5b9dbaaa148a /]# vi /etc/profile 
export JDBC_IMPORTER_HOME=/elasticsearch-jdbc-2.3.2.0

使环境变量生效: 
[root@5b9dbaaa148a /]# source /etc/profile 
第四步:配置使用。详细参考:https://github.com/jprante/elasticsearch-jdbc 

1、 elasticsearch-jdbc 同步方法一

1)、根目录下新建文件夹odbc_es 如下: 

[root@5b9dbaaa148a /]# ll /odbc_es/ 
drwxr-xr-x 2 root root 4096 Jun 16 03:11 logs 
-rwxrwxrwx 1 root root 542 Jun 16 04:03 mysql_import_es.sh 


2)、新建脚本mysql_import_es.sh,内容如下;

[root@5b9dbaaa148a odbc_es]# cat mysql_import_es.sh
’#!/bin/sh
bin=$JDBC_IMPORTER_HOME/bin
lib=$JDBC_IMPORTER_HOME/lib
echo '{
"type" : "jdbc",
"jdbc": {
"elasticsearch.autodiscover":true,
"elasticsearch.cluster":"my-application", #簇名,详见:/usr/local/elasticsearch/config/elasticsearch.yml
"url":"jdbc:mysql://10.8.5.101:3306/test", #mysql数据库地址
"user":"root", #mysql用户名
"password":"123456", #mysql密码
"sql":"select * from cc",
"elasticsearch" : {
  "host" : "10.8.5.101",
  "port" : 9300
},
"index" : "myindex", #新的index
"type" : "mytype" #新的type
}
}'| java \
  -cp "${lib}/*" \
  -Dlog4j.configurationFile=${bin}/log4j2.xml \
  org.xbib.tools.Runner \
  org.xbib.tools.JDBCImporter

 

3)、为 mysql_import_es.sh 添加可执行权限。 
[root@5b9dbaaa148a odbc_es]# chmod a+x mysql_import_es.sh 
4)执行脚本mysql_import_es.sh 
[root@5b9dbaaa148a odbc_es]# ./mysql_import_es.sh

第五步:测试数据同步是否成功。 
使用elasticsearch检索查询:

[root@5b9dbaaa148a odbc_es]# curl -XGET 'http://10.8.5.101:9200/myindex/mytype/_search?pretty'
{
  "took" : 4,
  "timed_out" : false,
  "_shards" : {
  "total" : 8,
  "successful" : 8,
  "failed" : 0
  },
  "hits" : {
  "total" : 3,
  "max_score" : 1.0,
  "hits" : [ {
  "_index" : "myindex",
  "_type" : "mytype",
  "_id" : "AVVXKgeEun6ksbtikOWH",
  "_score" : 1.0,
  "_source" : {
  "id" : 1,
  "name" : "laoyang"
  }
  }, {
  "_index" : "myindex",
  "_type" : "mytype",
  "_id" : "AVVXKgeEun6ksbtikOWI",
  "_score" : 1.0,
  "_source" : {
  "id" : 2,
  "name" : "dluzhang"
  }
  }, {
  "_index" : "myindex",
  "_type" : "mytype",
  "_id" : "AVVXKgeEun6ksbtikOWJ",
  "_score" : 1.0,
  "_source" : {
  "id" : 3,
  "name" : "dlulaoyang"
  }
  } ]
  }
}

 

出现以上包含mysql数据字段的信息则为同步成功。

2、 elasticsearch-jdbc 同步方法二

[root@5b9dbaaa148a odbc_es]# cat mysql_import_es_simple.sh
#!/bin/sh
bin=$JDBC_IMPORTER_HOME/bin
lib=$JDBC_IMPORTER_HOME/lib
  java \
  -cp "${lib}/*" \
  -Dlog4j.configurationFile=${bin}/log4j2.xml \
  org.xbib.tools.Runner \
  org.xbib.tools.JDBCImporter statefile.json

[root@5b9dbaaa148a odbc_es]# cat statefile.json
{
"type" : "jdbc",
"jdbc": {
"elasticsearch.autodiscover":true,
"elasticsearch.cluster":"my-application",
"url":"jdbc:mysql://10.8.5.101:3306/test",
"user":"root",
"password":"123456",
"sql":"select * from cc",
"elasticsearch" : {
  "host" : "10.8.5.101",
  "port" : 9300
},
"index" : "myindex_2",
"type" : "mytype_2"
}
}

 

脚本和json文件分开,脚本执行前先加载json文件。 
执行方式:直接运行脚本 ./mysql_import_es_simple.sh 即可。

3、Mysql与elasticsearch等价查询

目标:实现从表cc中查询id=3的name信息。 
1)MySQL中sql语句查询:

mysql> select * from cc where id=3;
+----+------------+
| id | name |
+----+------------+
| 3 | dlulaoyang |
+----+------------+
1 row in set (0.00 sec)

 

2)elasticsearch检索:

[root@5b9dbaaa148a odbc_es]# curl http://10.8.5.101:9200/myindex/mytype/_search?pretty -d '
{
"filter" : { "term" : { "id" : "3" } }
}'
{
  "took" : 3,
  "timed_out" : false,
  "_shards" : {
  "total" : 8,
  "successful" : 8,
  "failed" : 0
  },
  "hits" : {
  "total" : 1,
  "max_score" : 1.0,
  "hits" : [ {
  "_index" : "myindex",
  "_type" : "mytype",
  "_id" : "AVVXKgeEun6ksbtikOWJ",
  "_score" : 1.0,
  "_source" : {
  "id" : 3,
  "name" : "dlulaoyang"
  }
  } ]
  }
}

常见错误:

错误日志位置:/odbc_es/logs 
日志内容: 
[root@5b9dbaaa148a logs]# tail -f jdbc.log 
[04:03:39,570][INFO ][org.xbib.elasticsearch.helper.client.BaseTransportClient][pool-3-thread-1] after auto-discovery connected to [{5b9dbaaa148a}{aksn2ErNRlWjUECnp_8JmA}{10.8.5.101}{10.8.5.101:9300}{master=true}]

Bug1、[02:46:23,894][ERROR][importer.jdbc ][pool-3-thread-1] error while processing request: cluster state is RED and not YELLOW, from here on, everything will fail! 
原因: 
you created an index with replicas but you had only one node in the cluster. One way to solve this problem is by allocating them on a second node. Another way is by turning replicas off. 
你创建了带副本 replicas 的索引,但是在你的簇中只有一个节点。

解决方案: 
方案一:允许分配‘它们’到第二个节点。 
方案二:关闭副本replicas(非常可行)。如下:

curl -XPUT 'localhost:9200/_settings' -d '
{
  "index" : {
  "number_of_replicas" : 0
  }
}

 

Bug2、[13:00:37,137][ERROR][importer.jdbc ][pool-3-thread-1] error while processing request: no cluster nodes available, check settings {autodiscover=false, client.transport.ignore_cluster_name=false, client.transport.nodes_sampler_interval=5s, client.transport.ping_timeout=5s, cluster.name=elasticsearch, 
org.elasticsearch.client.transport.NoNodeAvailableException: no cluster nodes available, check 
解决方案: 
见上脚本中新增: 
“elasticsearch.cluster”:”my-application”, #簇名,和/usr/local/elasticsearch/config/elasticsearch.yml 簇名保持一致。

 

转载于:https://my.oschina.net/nonstop/blog/869184

评论
添加红包

请填写红包祝福语或标题

红包个数最小为10个

红包金额最低5元

当前余额3.43前往充值 >
需支付:10.00
成就一亿技术人!
领取后你会自动成为博主和红包主的粉丝 规则
hope_wisdom
发出的红包
实付
使用余额支付
点击重新获取
扫码支付
钱包余额 0

抵扣说明:

1.余额是钱包充值的虚拟货币,按照1:1的比例进行支付金额的抵扣。
2.余额无法直接购买下载,可以购买VIP、付费专栏及课程。

余额充值