Dumping MySQL into Elasticsearch

一、ES Configuration

1. Download the files

Download Elasticsearch: https://www.elastic.co/cn/downloads/past-releases#elasticsearch

https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.9.3-linux-x86_64.tar.gz

Plugin (elasticsearch-head): https://github.com/mobz/elasticsearch-head/archive/v5.0.0.tar.gz
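In 7.x the head plugin runs as a standalone Node.js app rather than as an ES plugin. A minimal way to run it, assuming Node.js and npm are installed (9100 is its default port):

tar -zxvf v5.0.0.tar.gz
cd elasticsearch-head-5.0.0
npm install
npm run start     # UI at http://localhost:9100

For the browser UI to reach ES, CORS also needs to be enabled in elasticsearch.yml (http.cors.enabled: true, http.cors.allow-origin: "*").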

2. Install and configure

Edit elasticsearch.yml:

   node.name: node-1
   network.host: 0.0.0.0
   cluster.initial_master_nodes: ["node-1"]

Edit jvm.options:

   -Xms1g
   -Xmx1g

3. Start

bin/elasticsearch &
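Once the process is up (HTTP on the default port 9200), a quick sanity check:

curl http://127.0.0.1:9200
curl http://127.0.0.1:9200/_cluster/health?pretty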

二、Logstash Configuration

1. Download the files

Download Logstash: https://www.elastic.co/cn/downloads/past-releases#logstash

https://artifacts.elastic.co/downloads/logstash/logstash-7.9.3.tar.gz

2. Install and configure

Business requirement: sync the data in the user and user_detail tables of the database into ES.

Create a config file that reads from MySQL into ES:

 cp config/logstash-sample.conf  config/logstash-mysql2es.conf

Edit logstash-mysql2es.conf:

input {
    stdin {
    }

    jdbc {
      tags => ["user_tag"]
      jdbc_connection_string => "jdbc:mysql://127.0.0.1:3301/member?useUnicode=true&characterEncoding=utf-8"
      jdbc_user => "root"
      jdbc_password => "123456"
      jdbc_driver_library => "/home/elk/logstash-7.9.3/mysql-transfer/mysql-connector-java-6.0.5.jar"
      jdbc_driver_class => "com.mysql.jdbc.Driver"
      statement => " select id,user_name, code, status, create_time,update_time   from t_user where update_time >= :sql_last_value order by update_time,id asc" 
      jdbc_paging_enabled => "true"
      jdbc_page_size => "50000"
      use_column_value => true
      tracking_column => "update_time"
      tracking_column_type => "timestamp"
      record_last_run => true
      last_run_metadata_path => "./logstash_last_id"
      clean_run => false
      schedule => "* * * * *"
    }

    jdbc {
      tags => ["user_detail_tag"]
      jdbc_connection_string => "jdbc:mysql://127.0.0.1:3301/member?useUnicode=true&characterEncoding=utf-8"
      jdbc_user => "root"
      jdbc_password => "123456"
      jdbc_driver_library => "/home/elk/logstash-7.9.3/mysql-transfer/mysql-connector-java-6.0.5.jar"
      jdbc_driver_class => "com.mysql.jdbc.Driver"
      statement => "select id,user_id, type,amount, status, remark,create_time,update_time  from t_user_detail  where update_time >= :sql_last_value order by update_time,id asc" 
      jdbc_paging_enabled => "true"
      jdbc_page_size => "50000"
      use_column_value => true
      tracking_column => "update_time"
      tracking_column_type => "timestamp"
      record_last_run => true
      last_run_metadata_path => "./logstash_last_detail_id"
      clean_run => false
      schedule => "* * * * *"
    }
}
output {
        if "user_tag" in [tags] {
          elasticsearch {
                hosts => ["127.0.0.1:9200"]
                index => "user_index"
                document_type => "_doc"
                document_id => "%{id}"
          }
        }

        if "user_detail_tag" in [tags] {
          elasticsearch {
                hosts => ["127.0.0.1:9200"]
                index => "user_detail_index"
                document_type => "_doc"
                document_id => "%{id}"
          }
        }

        stdout {
            codec => json_lines
        }

}
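On each run the jdbc input substitutes the value stored at last_run_metadata_path for :sql_last_value, then persists the newest update_time it saw, so only changed rows are re-read; deleting that file (or setting clean_run => true) restarts the sync from scratch. The checkpoint is a small YAML file, something like this (the value shown is illustrative):

cat ./logstash_last_id
# --- 2021-01-01 12:00:00.000000000 +08:00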

Start

bin/logstash -f config/logstash-mysql2es.conf
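After the first scheduled run (the schedule "* * * * *" fires every minute), both indices should exist and hold documents; a quick check:

curl "http://127.0.0.1:9200/_cat/indices/user_index,user_detail_index?v"
curl "http://127.0.0.1:9200/user_index/_count"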

Note: if the database connection goes through sharding-proxy, the following settings must be removed (it does not support paging):

      jdbc_paging_enabled => "true"
      jdbc_page_size => "50000"

Don't forget this!

三、Canal Configuration

1. Download the files

Download Canal: https://github.com/alibaba/canal/releases (download the deployer, adapter, and admin packages)

https://github.com/alibaba/canal/releases/download/canal-1.1.4/canal.deployer-1.1.4.tar.gz

https://github.com/alibaba/canal/releases/download/canal-1.1.4/canal.adapter-1.1.4.tar.gz

https://github.com/alibaba/canal/releases/download/canal-1.1.4/canal.admin-1.1.4.tar.gz

  • canal-deployer (canal-server)
    Tails the MySQL binlog directly by posing as a MySQL replica; it only receives data and does no processing of its own.
  • canal-adapter (canal-client)
    Acts as canal's client: it pulls data from canal-server and syncs it onward to targets such as MySQL, Elasticsearch, and HBase.
  • canal-admin
    Provides overall configuration management and node operations for canal, with a relatively friendly web UI for quick and safe operation.

2. Install and configure

2.1 Deploy the deployer

mkdir /usr/local/canal/deployer

tar -zxvf canal.deployer-1.1.4.tar.gz -C /usr/local/canal/deployer

cd /usr/local/canal/deployer
 

Edit conf/example/instance.properties and change the following four settings:

canal.instance.master.address = 192.168.158.211:3306
canal.instance.dbUsername=canal
canal.instance.dbPassword=canal

canal.instance.filter.regex=test_.*\\..*
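These settings assume the usual canal prerequisites on the source MySQL: binlog enabled in ROW format, and an account with replication privileges matching dbUsername/dbPassword above. A minimal sketch (exact my.cnf location and root credentials depend on your install):

# my.cnf on the source MySQL should contain something like:
#   [mysqld]
#   log-bin=mysql-bin
#   binlog-format=ROW
#   server_id=1
mysql -uroot -p -e "CREATE USER 'canal'@'%' IDENTIFIED BY 'canal';
GRANT SELECT, REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO 'canal'@'%';
FLUSH PRIVILEGES;"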

Start

sh bin/startup.sh
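If startup succeeded, the server and instance logs (paths as laid out in the 1.1.4 tarball) should show canal connecting to MySQL and starting the example instance:

tail -f logs/canal/canal.log
tail -f logs/example/example.log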

2.2 Deploy the adapter

mkdir /usr/local/canal/adapter

tar -zxvf canal.adapter-1.1.4.tar.gz -C /usr/local/canal/adapter

cd /usr/local/canal/adapter
 

Edit conf/application.yml:

server:
  port: 8081
spring:
  jackson:
    date-format: yyyy-MM-dd HH:mm:ss
    time-zone: GMT+8
    default-property-inclusion: non_null

canal.conf:
  mode: tcp     #tcp kafka rocketMQ rabbitMQ
  flatMessage: true
  zookeeperHosts:         # zk address when running in cluster mode
  syncBatchSize: 1000  # batch size per sync
  retries: 0
  timeout:
  accessKey:
  secretKey:
  consumerProperties:
    # canal tcp consumer
    canal.tcp.server.host: 127.0.0.1:11111    # canal-server address
    canal.tcp.zookeeper.hosts:
    canal.tcp.batch.size: 500
    canal.tcp.username:
    canal.tcp.password:

  srcDataSources:   # source database configuration
    defaultDS:
      url: jdbc:mysql://127.0.0.1:3301/member?useUnicode=true
      username: canal
      password: canal
  canalAdapters:
  - instance: example  # canal instance name, or MQ topic name
    groups:
    - groupId: g1    # group id; used in MQ mode
      outerAdapters:
      - name: logger
      - name: es
        hosts: 127.0.0.1:9200  # ES connection address
        properties:
          mode: rest    # transport(9300) or rest(9200)
          cluster.name: elasticsearch  # ES cluster name

Add the business mapping configs

Edit conf/es/user.yml:

dataSourceKey: defaultDS
destination: example
groupId: g1
esMapping:
  _index: user_index             # the ES index
  _id: id
  sql: "select id,user_name, code, status, create_time,update_time   from t_user"
  etlCondition: "where create_time>={}"
  commitBatch: 3000

Edit conf/es/user_detail.yml:

dataSourceKey: defaultDS   # key of the source datasource; matches an entry under srcDataSources above
destination: example            # canal instance name, or MQ topic
groupId: g1   # groupId for MQ mode; only data for this groupId is synced
esMapping:
  _index: user_detail_index   # the ES index name
  _id: id                                  # the ES _id; if this is not configured, the pk item must be configured instead and ES will auto-assign the _id
  sql: "select id,user_id, type,amount, status, remark,create_time,update_time  from t_user_detail"
  etlCondition: "where create_time>={}"
  commitBatch: 3000
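The ES adapter writes into existing indices; since the Logstash step above already created user_index and user_detail_index, it is worth confirming both mappings are in place before starting the adapter:

curl "http://127.0.0.1:9200/user_index/_mapping?pretty"
curl "http://127.0.0.1:9200/user_detail_index/_mapping?pretty"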

Start

sh bin/startup.sh
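Once the adapter is running, incremental changes to t_user and t_user_detail flow into ES through canal. The adapter log shows sync activity, and a full re-import of a single mapping can be triggered through the REST API it exposes on the port configured above (the path assumes the conf/es directory used here):

tail -f logs/adapter/adapter.log
curl -X POST "http://127.0.0.1:8081/etl/es/user.yml"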