Environment Prerequisites
- A ZooKeeper environment
- A Kafka environment
- A canal environment
This article walks through setting up the canal environment and shows how to inspect the captured data through Kafka.
Introduction
canal's main purpose is to parse MySQL's incremental logs (binlogs) and provide incremental data subscription and consumption.
Use cases built on incremental log subscription and consumption include:
- Refreshing business caches
- Incremental data processing with business logic
- Real-time database backup
How canal works
- canal emulates the MySQL slave interaction protocol: it poses as a MySQL slave and sends the dump protocol to the MySQL master
- The MySQL master receives the dump request and starts pushing its binary log to the slave (i.e. canal)
- canal parses the binary log objects (raw byte stream)
For more details, see the official documentation: link
Now, on to the main part:
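The "raw byte stream" mentioned in the last step follows MySQL's binlog event format. As a rough illustration of what parsing it involves (this is not canal's actual code), every binlog event begins with a fixed 19-byte header that can be unpacked like this:

```python
import struct

# MySQL binlog v4 event header layout: 19 bytes total, little-endian.
# timestamp(4) + type_code(1) + server_id(4) + event_size(4)
# + next_log_pos(4) + flags(2)
HEADER_FMT = "<IBIIIH"
HEADER_LEN = struct.calcsize(HEADER_FMT)  # 19

def parse_event_header(raw: bytes) -> dict:
    """Unpack the fixed header of a single binlog event."""
    ts, type_code, server_id, size, next_pos, flags = struct.unpack(
        HEADER_FMT, raw[:HEADER_LEN]
    )
    return {
        "timestamp": ts,
        "type_code": type_code,
        "server_id": server_id,
        "event_size": size,
        "next_log_pos": next_pos,
        "flags": flags,
    }

# Fabricated header bytes, purely for demonstration:
sample = struct.pack(HEADER_FMT, 1700000000, 0x1E, 1, 45, 1024, 0)
print(parse_event_header(sample))
```

The payload after the header varies by event type; canal decodes row events into structured entries so downstream consumers never deal with raw bytes.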
Understanding the configuration
Two configuration files matter most for our purposes:
- canal/config/canal.properties:
This is canal's server-wide configuration (only part of it is shown below)
#################################################
######### common argument #############
#################################################
# tcp bind ip
canal.ip =
# register ip to zookeeper
canal.register.ip =
canal.port = 11111
canal.metrics.pull.port = 11112
# Zookeeper address
canal.zkServers = 192.168.25.10:2181
# flush data to zk
canal.zookeeper.flush.period = 1000
canal.withoutNetty = false
# tcp, kafka, RocketMQ
canal.serverMode = kafka
# flush meta cursor/parse position to file
canal.file.data.dir = ${canal.conf.dir}
canal.file.flush.period = 1000
## memory store RingBuffer size, should be Math.pow(2,n)
canal.instance.memory.buffer.size = 16384
## memory store RingBuffer used memory unit size , default 1kb
canal.instance.memory.buffer.memunit = 1024
## memory store gets mode, MEMSIZE or ITEMSIZE
canal.instance.memory.batch.mode = MEMSIZE
canal.instance.memory.rawEntry = true
##################################################
######### MQ #############
##################################################
# Kafka broker address
canal.mq.servers = 192.168.25.10:9092
canal.mq.retries = 0
canal.mq.batchSize = 16384
canal.mq.maxRequestSize = 1048576
canal.mq.lingerMs = 100
canal.mq.bufferMemory = 33554432
canal.mq.canalBatchSize = 50
canal.mq.canalGetTimeout = 100
canal.mq.parallelThreadSize = 8
canal.mq.flatMessage = true
canal.mq.compressionType = none
canal.mq.acks = all
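With canal.mq.flatMessage = true, canal publishes each change to Kafka as a flat JSON message instead of protobuf, so any JSON library can consume it. The sketch below parses a hand-written sample (the field values are illustrative; the field names such as database, table, type, isDdl, and data follow canal's flat-message format):

```python
import json

# Hand-written sample of a canal flat message (values are illustrative).
sample = """
{
  "database": "crm",
  "table": "customer",
  "isDdl": false,
  "type": "INSERT",
  "ts": 1700000000000,
  "data": [{"id": "1", "name": "alice"}],
  "old": null
}
"""

msg = json.loads(sample)
# Skip DDL events; handle row changes by type (INSERT/UPDATE/DELETE).
if not msg["isDdl"] and msg["type"] == "INSERT":
    for row in msg["data"]:
        print(f"new row in {msg['database']}.{msg['table']}: {row}")
```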
- canal/example/instance.properties:
Per-instance configuration (partial):
#################################################
## mysql serverId , v1.0.26+ will autoGen
canal.instance.mysql.slaveId=12
# enable gtid use true/false
canal.instance.gtidon=false
# position info
canal.instance.master.address=192.168.25.10:3306
canal.instance.master.journal.name=
canal.instance.master.position=
canal.instance.master.timestamp=
canal.instance.master.gtid=
# database username, password, and default database name
canal.instance.dbUsername=root
canal.instance.dbPassword=123456
canal.instance.defaultDatabaseName=crm
canal.instance.connectionCharset = UTF-8
# enable druid Decrypt database password
#canal.instance.enableDruid=false
#canal.instance.pwdPublicKey=MFwwDQYJKoZIhvcNAQEBBQADSwAwSAJBALK4BUxdDltRRE5/zXpVEVPUgunvscYFtEip3pmLlhrWpacX7y7GCMo2/JM6LeHmiiNdH1FWgGCpUfircSwlWKUCAwEAAQ==
# mq config
canal.mq.topic=example
# dynamic topic route by schema or table regex
#canal.mq.dynamicTopic=mytest1.user,mytest2\\..*,.*\\..*
canal.mq.partition=0
# hash partition config
#canal.mq.partitionsNum=3
#canal.mq.partitionHash=test.table:id^name,.*\\..*
#################################################
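When canal.mq.partitionsNum and canal.mq.partitionHash are set, canal routes each row to a Kafka partition by hashing the configured key column(s) modulo the partition count, so all changes to the same primary key land in the same partition and keep their order. A rough sketch of that idea (not canal's exact hash function):

```python
def choose_partition(pk_value: str, partitions_num: int = 3) -> int:
    """Route a row to a partition by hashing its key column value.

    Canal's real implementation differs in detail; this only shows
    why same-key rows stay ordered: they always map to one partition.
    """
    # Deterministic string hash (Java-style), stable across runs,
    # unlike Python's randomized built-in hash() for strings.
    h = 0
    for ch in pk_value:
        h = (31 * h + ord(ch)) & 0x7FFFFFFF
    return h % partitions_num

# The same key always yields the same partition number:
print(choose_partition("42"))
print(choose_partition("42"))
```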
Startup commands
- Start ZooKeeper
./bin/zkServer.sh start
- Start Kafka
./bin/kafka-server-start.sh -daemon config/server.properties
- Start canal
./bin/startup.sh
Verifying the result:
Start a Kafka console consumer:
./bin/kafka-console-consumer.sh --bootstrap-server 192.168.25.10:9092 --topic example --from-beginning
Insert some rows into a table in the configured database, and the corresponding change messages will appear in the consumer:
The next article will cover how to consume this data.