MySQL Series: Syncing Data to Elasticsearch

1 Environment

  • Ubuntu 18.04
  • MySQL 5.7.27
  • JDK 1.8
  • Elasticsearch 6.3.0
  • canal 1.1.4
  • ZooKeeper

2 Syncing MySQL Data to Elasticsearch

Use the ROW binlog format so that every row-level data change is recorded.

2.1 Incremental Sync

Common tools for real-time incremental sync include canal, logstash-input-jdbc, and Maxwell; this post uses canal.

2.1.1 Download canal

Download page: https://github.com/alibaba/canal/releases
Latest version at the time of writing: 1.1.4

2.1.2 Configuration Overview

1. MySQL: enable binlog and switch it to ROW format.
2. canal-server: acts as a slave of the MySQL cluster, fetches the master's binlog, and pushes events to canal-adapter.
3. canal-adapter: deployed alongside canal-server on multiple nodes to improve availability.

2.1.3 MySQL Configuration

  • Enable binlog and switch to ROW format
cd /etc/mysql/mysql.conf.d
sudo vim mysqld.cnf
[mysqld]
log-bin=/var/lib/mysql/mysql-binlog
binlog-format=ROW
server_id=1
  • Check binlog status
mysql> show variables like 'binlog_format';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| binlog_format | ROW   |
+---------------+-------+
1 row in set (0.05 sec)
mysql> show variables like 'log_bin';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| log_bin       | ON    |
+---------------+-------+
1 row in set (0.01 sec)
mysql> select version();
+-----------------------------+
| version()                   |
+-----------------------------+
| 5.7.27-0ubuntu0.18.04.1-log |
+-----------------------------+
1 row in set (0.06 sec)
  • Create the canal user
mysql> create user 'canal'@'%' identified by '123456';
mysql> grant all on *.* to 'canal'@'%';
# grant replication privileges
mysql> grant select, replication slave, replication client on *.* to 'canal'@'%';
# apply the privilege changes
mysql> flush privileges;

2.1.4 canal Configuration

  • Extract
tar -zxvf canal.deployer-1.1.4.tar.gz -C /usr/canal/
  • Configure
cd /usr/canal/conf/example
sudo vim instance.properties
canal.instance.dbUsername=canal
canal.instance.dbPassword=123456
canal.instance.connectionCharset = UTF-8
canal.instance.defaultDatabaseName=test_database
canal.instance.enableDruid=false

2.1.5 Start

  • Restart MySQL
sudo service mysql restart
  • Start canal
cd /usr/canal/bin
./startup.sh
  • Tail the canal log
tail -300f /usr/canal/logs/canal/canal.log

2.1.6 Source Code

  • pom.xml
<dependency>
    <groupId>com.alibaba.otter</groupId>
    <artifactId>canal.client</artifactId>
    <version>1.1.4</version>
</dependency>
  • GetDataFromSQL.java
package high.level.rest;
import java.net.InetSocketAddress;
import java.util.List;

import com.alibaba.otter.canal.client.CanalConnectors;
import com.alibaba.otter.canal.client.CanalConnector;
import com.alibaba.otter.canal.common.utils.AddressUtils;
import com.alibaba.otter.canal.protocol.Message;
import com.alibaba.otter.canal.protocol.CanalEntry.Column;
import com.alibaba.otter.canal.protocol.CanalEntry.Entry;
import com.alibaba.otter.canal.protocol.CanalEntry.EntryType;
import com.alibaba.otter.canal.protocol.CanalEntry.EventType;
import com.alibaba.otter.canal.protocol.CanalEntry.RowChange;
import com.alibaba.otter.canal.protocol.CanalEntry.RowData;

public class GetDataFromSQL {
    public static void main(String[] args) {
        // create a connection to the canal server
        CanalConnector connector = CanalConnectors.newSingleConnector(new InetSocketAddress(AddressUtils.getHostIp(),
                11111), "example", "", "");
        int batchSize = 1000;
        int emptyCount = 0;
        try {
            connector.connect();
            connector.subscribe(".*\\..*");
            connector.rollback();
            int totalEmptyCount = 120;
            while (emptyCount < totalEmptyCount) {
                Message message = connector.getWithoutAck(batchSize); // fetch up to batchSize entries
                long batchId = message.getId();
                int size = message.getEntries().size();
                if (batchId == -1 || size == 0) {
                    emptyCount++;
                    System.out.println("empty count : " + emptyCount);
                    try {
                        Thread.sleep(1000);
                    } catch (InterruptedException e) {
                    }
                } else {
                    emptyCount = 0;
                    // System.out.printf("message[batchId=%s,size=%s] \n", batchId, size);
                    printEntry(message.getEntries());
                }

                connector.ack(batchId); // acknowledge the batch
                // connector.rollback(batchId); // on failure, roll back for redelivery
            }

            System.out.println("empty too many times, exit");
        } finally {
            connector.disconnect();
        }
    }

    private static void printEntry(List<Entry> entrys) {
        for (Entry entry : entrys) {
            if (entry.getEntryType() == EntryType.TRANSACTIONBEGIN || entry.getEntryType() == EntryType.TRANSACTIONEND) {
                continue;
            }

            RowChange rowChange = null;
            try {
                rowChange = RowChange.parseFrom(entry.getStoreValue());
            } catch (Exception e) {
                throw new RuntimeException("ERROR ## parsing of binlog event failed, data:" + entry.toString(),
                        e);
            }

            EventType eventType = rowChange.getEventType();
            System.out.println(String.format("================> binlog[%s:%s] , name[%s,%s] , eventType : %s",
                    entry.getHeader().getLogfileName(), entry.getHeader().getLogfileOffset(),
                    entry.getHeader().getSchemaName(), entry.getHeader().getTableName(),
                    eventType));

            for (RowData rowData : rowChange.getRowDatasList()) {
                if (eventType == EventType.DELETE) {
                    printColumn(rowData.getBeforeColumnsList());
                } else if (eventType == EventType.INSERT) {
                    printColumn(rowData.getAfterColumnsList());
                } else {
                    System.out.println("-------> before");
                    printColumn(rowData.getBeforeColumnsList());
                    System.out.println("-------> after");
                    printColumn(rowData.getAfterColumnsList());
                }
            }
        }
    }

    private static void printColumn(List<Column> columns) {
        for (Column column : columns) {
            System.out.println(column.getName() + " : " + column.getValue() + "    update=" + column.getUpdated());
        }
    }
}
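The `getWithoutAck` / `ack` / `rollback` calls above give at-least-once delivery: a fetched batch is only dropped for good once it is acked, and a rollback puts it back for redelivery. A stdlib-only sketch of that contract; `AckQueue` and its methods are illustrative stand-ins, not the canal API:

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class AckQueue {
    private final Deque<String> pending = new ArrayDeque<>();
    private List<String> inFlight = null;

    AckQueue(List<String> entries) { pending.addAll(entries); }

    // Hand out up to batchSize entries; they stay "in flight" until acked.
    List<String> getWithoutAck(int batchSize) {
        inFlight = new ArrayList<>();
        while (inFlight.size() < batchSize && !pending.isEmpty()) {
            inFlight.add(pending.poll());
        }
        return inFlight;
    }

    // Ack: the in-flight batch is dropped permanently.
    void ack() { inFlight = null; }

    // Rollback: push the in-flight batch back, in order, for redelivery.
    void rollback() {
        for (int i = inFlight.size() - 1; i >= 0; i--) {
            pending.addFirst(inFlight.get(i));
        }
        inFlight = null;
    }

    int remaining() { return pending.size(); }
}
```

Processing a batch and crashing before `ack` therefore redelivers it, which is why the consuming side should be idempotent (indexing by `_id`, as the ES code below does, is).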
  • Run
================> binlog[mysql-binlog.000021:219] , name[,] , eventType : QUERY
empty count : 1
empty count : 2
empty count : 3
empty count : 4
empty count : 5
empty count : 6
  • Insert data
mysql> insert into info
    -> (name, address)
    -> values
    -> ("小小","河北");
  • Output
================> binlog[mysql-binlog.000021:219] , name[,] , eventType : QUERY
empty count : 1
empty count : 2
empty count : 3
empty count : 4
empty count : 5
empty count : 6
================> binlog[mysql-binlog.000021:599] , name[test_database,info] , eventType : INSERT
id : 4    update=true
name : 小小    update=true
address : 河北    update=true

2.2 Syncing to Elasticsearch

2.2.1 Syncing to ES with a Client Program

  • Project structure
    (Figure: project structure)
  • pom.xml
<?xml version="1.0" encoding="UTF-8"?>
<project xmlns="http://maven.apache.org/POM/4.0.0"
         xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
         xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd">
    <modelVersion>4.0.0</modelVersion>

    <groupId>com.java.es</groupId>
    <artifactId>java.es</artifactId>
    <version>1.0-SNAPSHOT</version>
    <dependencies>
        <dependency>
            <groupId>org.apache.logging.log4j</groupId>
            <artifactId>log4j-core</artifactId>
            <version>2.10.0</version>
        </dependency>
        <dependency>
            <groupId>org.apache.logging.log4j</groupId>
            <artifactId>log4j-api</artifactId>
            <version>2.10.0</version>
        </dependency>

        <dependency>
            <groupId>org.slf4j</groupId>
            <artifactId>slf4j-api</artifactId>
            <version>1.8.0-beta2</version>
        </dependency>
        <dependency>
            <groupId>org.elasticsearch.client</groupId>
            <artifactId>transport</artifactId>
            <version>6.4.2</version>
        </dependency>
        <dependency>
            <groupId>org.elasticsearch</groupId>
            <artifactId>elasticsearch</artifactId>
            <version>6.3.0</version>
        </dependency>
        <dependency>
            <groupId>org.elasticsearch.client</groupId>
            <artifactId>elasticsearch-rest-high-level-client</artifactId>
            <version>6.3.0</version>
        </dependency>
        <dependency>
            <groupId>com.alibaba.otter</groupId>
            <artifactId>canal.client</artifactId>
            <version>1.1.4</version>
        </dependency>
    </dependencies>
</project>
  • InitDemo.java
package high.level.rest;

import org.apache.http.HttpHost;
import org.elasticsearch.client.RestClient;
import org.elasticsearch.client.RestHighLevelClient;

/**
 * Obtain a Java High Level REST Client.
 *
 * @author xdq
 * @date 2019-09-04
 */
public class InitDemo {

    public static RestHighLevelClient getClient() {

        RestHighLevelClient client = new RestHighLevelClient(
                RestClient.builder(new HttpHost("localhost", 9200, "http")));

        return client;
    }
}
  • HighAPITestV1.java
package high.level.rest;

import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

import org.elasticsearch.action.DocWriteResponse;
import org.elasticsearch.action.index.IndexRequest;
import org.elasticsearch.action.index.IndexResponse;
import org.elasticsearch.client.RestHighLevelClient;

public class HighAPITestV1 {
    public String insertData(String data_id, String name, String address){
        try(RestHighLevelClient client = InitDemo.getClient()){
            Map<String, Object> jsonMap = new HashMap<>();
            String insertStatus="test";
            jsonMap.put("name", name);
            jsonMap.put("address", address);
            IndexRequest request = new IndexRequest("twitter", "info", data_id)
                                                .source(jsonMap);
            // the original listing omitted the call itself: execute the index request
            IndexResponse indexResponse = client.index(request);
            if (indexResponse.getResult() == DocWriteResponse.Result.CREATED){
                insertStatus = "created data";
            } else if (indexResponse.getResult() == DocWriteResponse.Result.UPDATED){
                insertStatus = "updated data";
            }
            return insertStatus;
        }catch (IOException e){
            return "insert error";
        }
    }
}
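One thing worth noting about `insertData`: it opens and closes a fresh `RestHighLevelClient` for every row, which is costly under a steady binlog stream; reusing one client for the process lifetime is the usual pattern. A stdlib-only sketch of the difference, where `Client` is a hypothetical stand-in that counts how many connections get opened:

```java
import java.util.concurrent.atomic.AtomicInteger;

public class ClientReuse {
    // Hypothetical stand-in for RestHighLevelClient; counts open() calls.
    static class Client implements AutoCloseable {
        static final AtomicInteger OPENED = new AtomicInteger();
        Client() { OPENED.incrementAndGet(); }
        void index(String id, String doc) { /* network call elided */ }
        @Override public void close() { }
    }

    // Per-call client, as in insertData: one connection per row.
    static void perCall(String id, String doc) {
        try (Client c = new Client()) {
            c.index(id, doc);
        }
    }

    public static void main(String[] args) throws Exception {
        for (int i = 0; i < 3; i++) perCall(String.valueOf(i), "doc");
        System.out.println("per-call opens: " + Client.OPENED.get()); // 3

        Client.OPENED.set(0);
        // Shared client: one connection for all rows.
        try (Client shared = new Client()) {
            for (int i = 0; i < 3; i++) shared.index(String.valueOf(i), "doc");
        }
        System.out.println("shared opens: " + Client.OPENED.get()); // 1
    }
}
```

In the real code this would mean creating the client once in `GetDataFromSQL.main` and passing it into `HighAPITestV1`, closing it only on shutdown.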
  • GetDataFromSQL.java
package high.level.rest;

import java.net.InetSocketAddress;
import java.util.*;

import com.alibaba.otter.canal.client.CanalConnectors;
import com.alibaba.otter.canal.client.CanalConnector;
import com.alibaba.otter.canal.common.utils.AddressUtils;
import com.alibaba.otter.canal.protocol.Message;
import com.alibaba.otter.canal.protocol.CanalEntry.Column;
import com.alibaba.otter.canal.protocol.CanalEntry.Entry;
import com.alibaba.otter.canal.protocol.CanalEntry.EntryType;
import com.alibaba.otter.canal.protocol.CanalEntry.EventType;
import com.alibaba.otter.canal.protocol.CanalEntry.RowChange;
import com.alibaba.otter.canal.protocol.CanalEntry.RowData;


public class GetDataFromSQL {

    public static void main(String[] args) {
        // create a connection to the canal server
        CanalConnector connector = CanalConnectors.newSingleConnector(new InetSocketAddress(AddressUtils.getHostIp(),
                11111), "example", "", "");
        int batchSize = 1000;
        int emptyCount = 0;
        try {
            connector.connect();
            connector.subscribe(".*\\..*");
            connector.rollback();
            int totalEmptyCount = 120;
            while (emptyCount < totalEmptyCount) {
                Message message = connector.getWithoutAck(batchSize); // fetch up to batchSize entries
                long batchId = message.getId();
                int size = message.getEntries().size();
                if (batchId == -1 || size == 0) {
                    emptyCount++;
                    System.out.println("empty count : " + emptyCount);
                    try {
                        Thread.sleep(1000);
                    } catch (InterruptedException e) {
                    }
                } else {
                    emptyCount = 0;
                    // System.out.printf("message[batchId=%s,size=%s] \n", batchId, size);
                    printEntry(message.getEntries());
                }

                connector.ack(batchId); // acknowledge the batch
                // connector.rollback(batchId); // on failure, roll back for redelivery
            }

            System.out.println("empty too many times, exit");
        } finally {
            connector.disconnect();
        }
    }

    private static void printEntry(List<Entry> entrys) {
        for (Entry entry : entrys) {
            if (entry.getEntryType() == EntryType.TRANSACTIONBEGIN || entry.getEntryType() == EntryType.TRANSACTIONEND) {
                continue;
            }

            RowChange rowChange = null;
            try {
                rowChange = RowChange.parseFrom(entry.getStoreValue());
            } catch (Exception e) {
                throw new RuntimeException("ERROR ## parsing of binlog event failed, data:" + entry.toString(),
                        e);
            }

            EventType eventType = rowChange.getEventType();
            System.out.println(String.format("================> binlog[%s:%s] , name[%s,%s] , eventType : %s",
                    entry.getHeader().getLogfileName(), entry.getHeader().getLogfileOffset(),
                    entry.getHeader().getSchemaName(), entry.getHeader().getTableName(),
                    eventType));

            for (RowData rowData : rowChange.getRowDatasList()) {
                if (eventType == EventType.DELETE) {
                    printColumn(rowData.getBeforeColumnsList());
                } else if (eventType == EventType.INSERT) {
                    printColumn(rowData.getAfterColumnsList());
                } else {
                    System.out.println("-------> before");
                    printColumn(rowData.getBeforeColumnsList());
                    System.out.println("-------> after");
                    printColumn(rowData.getAfterColumnsList());
                }
            }
        }
    }

    private static void printColumn(List<Column> columns) {
        HighAPITestV1 esOperation = new HighAPITestV1();
        System.out.println("Column list: " + columns);
        List<String> listData = new ArrayList<>();
        for (Column column : columns) {
            listData.add(column.getValue());
        }
        esOperation.insertData(listData.get(0), listData.get(1), listData.get(2));
        System.out.println("Extracted values: " + listData);
        String id = listData.get(0);
        System.out.println("id: " + id);
    }
}
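`printColumn` above feeds `insertData` positionally (`listData.get(0)` as the id, and so on), which silently breaks if the table's column order ever changes. A stdlib-only sketch of looking fields up by column name instead; `Column` here is a hypothetical stand-in for canal's `CanalEntry.Column`:

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ColumnMapper {
    // Hypothetical stand-in for canal's CanalEntry.Column (a name/value pair).
    static class Column {
        final String name;
        final String value;
        Column(String name, String value) { this.name = name; this.value = value; }
    }

    // Build a name -> value map so fields are fetched by column name,
    // not by their position in the row.
    static Map<String, String> toMap(List<Column> columns) {
        Map<String, String> row = new LinkedHashMap<>();
        for (Column c : columns) {
            row.put(c.name, c.value);
        }
        return row;
    }

    public static void main(String[] args) {
        List<Column> cols = new ArrayList<>();
        cols.add(new Column("id", "4"));
        cols.add(new Column("name", "xiaoxiao"));
        cols.add(new Column("address", "Hebei"));
        Map<String, String> row = toMap(cols);
        System.out.println(row.get("id"));      // 4
        System.out.println(row.get("address")); // Hebei
    }
}
```

With the real canal type, the same loop would call `column.getName()` and `column.getValue()`.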
  • Startup order
1. Start Elasticsearch
2. Start canal
3. Start the java.es project
4. Insert data into MySQL
  • Query
# query the indexed data in Elasticsearch
http://localhost:9200/twitter/info/_search
  • Response
{
    "took": 1,
    "timed_out": false,
    "_shards": {
        "total": 3,
        "successful": 3,
        "skipped": 0,
        "failed": 0
    },
    "hits": {
        "total": 6,
        "max_score": 1,
        "hits": [
            {
                "_index": "twitter",
                "_type": "info",
                "_id": "2",
                "_score": 1,
                "_source": {
                    "address": "沈阳",
                    "name": "小黄"
                }
            },
            {
                "_index": "twitter",
                "_type": "info",
                "_id": "40",
                "_score": 1,
                "_source": {
                    "address": "佛罗伦萨州",
                    "name": "小笑"
                }
            },
            {
                "_index": "twitter",
                "_type": "info",
                "_id": "da",
                "_score": 1,
                "_source": {
                    "address": "da",
                    "name": "39"
                }
            },
            {
                "_index": "twitter",
                "_type": "info",
                "_id": "1",
                "_score": 1,
                "_source": {
                    "address": "沈阳",
                    "name": "小黑嘿嘿"
                }
            },
            {
                "_index": "twitter",
                "_type": "info",
                "_id": "3",
                "_score": 1,
                "_source": {
                    "address": "沈阳",
                    "name": "小三三"
                }
            },
            {
                "_index": "twitter",
                "_type": "info",
                "_id": "41",
                "_score": 1,
                "_source": {
                    "address": "北卡罗来纳州",
                    "name": "小了"
                }
            }
        ]
    }
}

2.2.2 Syncing to ES with canal-adapter (I did not get this working)

mkdir /usr/canal-adapter
tar -zxvf canal.adapter-1.1.4.tar.gz -C /usr/canal-adapter/
  • Configure application.yml
cd /usr/canal-adapter/conf
sudo vim application.yml
server:
  port: 8081
spring:
  jackson:
    date-format: yyyy-MM-dd HH:mm:ss
    time-zone: GMT+8
    default-property-inclusion: non_null

canal.conf:
  mode: tcp # kafka rocketMQ
  canalServerHost: 127.0.0.1:11111
  batchSize: 500
  syncBatchSize: 1000
  retries: 0
  timeout:
  accessKey:
  secretKey:
  srcDataSources:
    defaultDS:
      # source database: test_database
      url: jdbc:mysql://127.0.0.1:3306/test_database?useUnicode=true
      username: canal
      password: 123456
  canalAdapters:
  - instance: example # canal instance Name or mq topic name
    groups:
    - groupId: g1
      outerAdapters:
      - name: logger
      - name: es
        # es transport port 9300
        hosts: 127.0.0.1:9300 # 127.0.0.1:9200 for rest mode
        properties:
          cluster.name: xdq
  • Configure the ES mapping
cd /usr/canal-adapter/conf/es
sudo vim mytest_user.yml
dataSourceKey: defaultDS
destination: example
groupId: g1
esMapping:
  _index: twitter
  _type: _doc
  _id: _id
  upsert: true

3 Summary

(1) The MySQL-to-ES sync flow:

(Figure: MySQL-to-ES sync diagram)

(2) The master MySQL server applies data changes with binlog enabled, so every change is recorded. canal runs as a slave of that master, watches the binlog, and forwards each change to Elasticsearch, analogous to master-slave replication with read/write splitting.
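The flow in (2) can be exercised in-process without MySQL or Elasticsearch: a queue stands in for the binlog stream that canal-server pushes, and a map stands in for the index. All names here are illustrative, not canal or ES APIs:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class SyncSketch {
    // Apply one row event (id, document) to the "index",
    // mirroring the getWithoutAck -> insertData -> ack path.
    static void apply(String[] event, Map<String, String> index) {
        index.put(event[0], event[1]);
    }

    public static void main(String[] args) throws InterruptedException {
        // stands in for the binlog stream pushed by canal-server
        BlockingQueue<String[]> binlog = new ArrayBlockingQueue<>(16);
        // stands in for the Elasticsearch index (id -> document)
        Map<String, String> index = new HashMap<>();

        // "master" side: a row change enters the stream
        binlog.put(new String[]{"4", "name=xiaoxiao,address=Hebei"});

        // "slave" side: poll the stream and upsert into the index
        apply(binlog.take(), index);
        System.out.println(index.get("4")); // name=xiaoxiao,address=Hebei
    }
}
```

Because the "index" is keyed by id, replaying the same event is harmless, which is what makes the at-least-once delivery of canal batches safe.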


