1 A TiDB test cluster, installed, deployed, and operated with tiup.
The cluster status is as follows:
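2 A brief introduction to the TiDB Binlog architecture
A TiDB Binlog cluster consists of two main components, Pump and Drainer, plus the binlogctl tool:
Pump
Pump records the binlogs produced by TiDB in real time, sorts them by transaction commit time, and serves them to Drainer for consumption.
Drainer
Drainer collects binlogs from every Pump and merges them, converts them into SQL or data in a specified format, and finally replicates them to the downstream.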
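The merge step described for Drainer can be pictured as a k-way merge of streams that are each already sorted by commit timestamp. The following is only a toy sketch of that idea: the `Binlog` tuple and `merge_pump_streams` are hypothetical names, and the real components exchange protobuf messages over gRPC rather than Python lists.

```python
import heapq
from typing import Iterable, Iterator, NamedTuple


class Binlog(NamedTuple):
    """Hypothetical stand-in for one binlog entry."""
    commit_ts: int
    payload: str


def merge_pump_streams(*streams: Iterable[Binlog]) -> Iterator[Binlog]:
    """Merge per-Pump streams, each already sorted by commit_ts, into one ordered stream."""
    return heapq.merge(*streams, key=lambda b: b.commit_ts)


# Two made-up Pump outputs, each sorted by commit timestamp.
pump_a = [Binlog(1, "a1"), Binlog(4, "a4"), Binlog(9, "a9")]
pump_b = [Binlog(2, "b2"), Binlog(3, "b3"), Binlog(10, "b10")]

merged = list(merge_pump_streams(pump_a, pump_b))
for entry in merged:
    print(entry.commit_ts, entry.payload)
```

`heapq.merge` is lazy, so the sketch never needs to hold a whole stream in memory, which is the property a merge over continuously growing inputs needs.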
binlogctl tool
binlogctl is the operations tool that ships with TiDB Binlog. It can:
get the current TSO of the TiDB cluster;
view Pump/Drainer state;
modify Pump/Drainer state;
pause or take offline a Pump/Drainer.
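A TSO returned by the cluster is easier to read once it is split into its parts. A TiDB TSO packs a physical timestamp in milliseconds into the high bits and an 18-bit logical counter into the low bits. The sketch below decodes one; the function names are hypothetical and the sample value is constructed for illustration, not taken from this cluster.

```python
from datetime import datetime, timedelta, timezone

LOGICAL_BITS = 18  # the low 18 bits of a TSO hold the logical counter
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def split_tso(tso: int) -> tuple[int, int]:
    """Return (physical milliseconds since the Unix epoch, logical counter)."""
    return tso >> LOGICAL_BITS, tso & ((1 << LOGICAL_BITS) - 1)


def tso_to_datetime(tso: int) -> datetime:
    """Convert the physical part of a TSO to a UTC datetime."""
    physical_ms, _ = split_tso(tso)
    return EPOCH + timedelta(milliseconds=physical_ms)


# Illustrative TSO built from a made-up physical time and logical counter.
sample_ms = 1598843918322
sample_tso = (sample_ms << LOGICAL_BITS) | 5
print(split_tso(sample_tso), tso_to_datetime(sample_tso).isoformat())
```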
3 Deployment and log locations of pump and drainer in the test environment
[root@xxx-ticdc-23 drainer-8249]# ls bin/
drainer
[root@xxx-ticdc-23 drainer-8249]# ls conf/
drainer.toml
[root@xxx-ticdc-23 drainer-8249]# ls log/
drainer.log drainer_stderr.log
4 The contents of the configuration file drainer.toml are as follows
# WARNING: This file is auto-generated. Do not edit! All your modification will be overwritten!
# You can use 'tiup cluster edit-config' and 'tiup cluster reload' to update the configuration
# All configuration items you want to change can be added to:
# server_configs:
# drainer:
# aa.b1.c3: value
# aa.b2.c4: value
[syncer]
db-type = "tidb"
[syncer.to]
host = "192.168.1.88"
password = "TEST@2020"
port = 4000
user = "root"
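Editing drainer.toml directly is not recommended, because tiup regenerates the file and overwrites manual changes; the steps below modify it through tiup instead.
5 Replicate the TiDB binlog to a specified directory
For example, create the directory /xxxxx/file_from_drainer on the drainer node, then change drainer's downstream configuration: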
# tiup cluster edit-config tidb-test
The configuration opens in vi. Find drainer_servers and set syncer.db-type and syncer.to.dir:
drainer_servers:
- host: 192.168.1.23
  ssh_port: 22
  port: 8249
  deploy_dir: /xxxxx/tidb-deploy/drainer-8249
  data_dir: /xxxxx/tidb-data/drainer-8249
  commit_ts: -1
  config:
    syncer.db-type: file
    syncer.to.dir: /xxxxx/file_from_drainer
  arch: amd64
  os: linux
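The dotted keys under config correspond to nested sections of drainer.toml: syncer.to.dir ends up as dir under [syncer.to], matching the layout of the generated file shown earlier. The sketch below illustrates that mapping only; `expand_dotted` is a hypothetical helper and not tiup's own code.

```python
def expand_dotted(flat: dict) -> dict:
    """Expand keys such as 'syncer.to.dir' into nested dictionaries."""
    nested: dict = {}
    for dotted_key, value in flat.items():
        *parents, leaf = dotted_key.split(".")
        node = nested
        for part in parents:
            # Create the intermediate table on first use, then descend into it.
            node = node.setdefault(part, {})
        node[leaf] = value
    return nested


topology_config = {
    "syncer.db-type": "file",
    "syncer.to.dir": "/xxxxx/file_from_drainer",
}
expanded = expand_dotted(topology_config)
print(expanded)
```

Read as TOML, the top-level key syncer becomes the [syncer] table and its child to becomes [syncer.to].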
6 After saving the configuration, reload the drainer component; by default this restarts drainer
# tiup cluster reload tidb-test -N 192.168.1.23:8249
7 Check the binlog files under /xxxxx/file_from_drainer
[root@xxx-ticdc-23 xxxxx]# ls file_from_drainer/
binlog-0000000000000000-20200827152650 binlog-0000000000000042-20200827160329 binlog-0000000000000084-20200827165204 binlog-0000000000000126-20200827182750
......
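Judging only from the names listed above, each file looks like binlog-<16-digit index>-<14-digit timestamp>. Under that assumption, which is inferred from this listing rather than from any specification, a name can be split into its index and creation time as follows; `parse_binlog_name` is a hypothetical helper.

```python
import re
from datetime import datetime

# Pattern inferred from the file names in the listing; treat it as an assumption.
NAME_RE = re.compile(r"^binlog-(\d{16})-(\d{14})$")


def parse_binlog_name(name: str) -> tuple[int, datetime]:
    """Return (file index, timestamp) parsed from a binlog file name."""
    match = NAME_RE.match(name)
    if match is None:
        raise ValueError(f"unexpected binlog file name: {name!r}")
    index = int(match.group(1))
    created = datetime.strptime(match.group(2), "%Y%m%d%H%M%S")
    return index, created


names = [
    "binlog-0000000000000000-20200827152650",
    "binlog-0000000000000042-20200827160329",
]
for name in names:
    print(parse_binlog_name(name))
```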
Take a look at the file contents; only fragments of it are human-readable:
[root@xxx-ticdc-23 xxxxx]# head -n 10 file_from_drainer/binlog-0000000000000000-20200827152650
�l
����������-
�
sbtestsbtest4"
idint��"
kint �"�
c�char"z�37632108549-35152435769-71163438075-76047767136-94022208193-37827536126-61826128934-31159094192-50908702293-70158721962"M
pad�char"=v48470296239-65598868380-44347847785-96102491963-02379738623
�
sbtestsbtest4"
idint��"
8 The main focus: replicating the TiDB binlog to a downstream Kafka
Change the drainer configuration to kafka, then reload drainer again.
drainer_servers:
- host: 192.168.1.23
  ssh_port: 22
  port: 8249
  deploy_dir: /xxxxx/tidb-deploy/drainer-8249
  data_dir: /xxxxx/tidb-data/drainer-8249
  commit_ts: -1
  config:
    syncer.db-type: kafka
    syncer.to.kafka-addrs: "192.168.1.209:9092,192.168.1.210:9092,192.168.1.211:9092"
    syncer.to.kafka-version: "2.0.0"
    syncer.to.kafka-max-messages: 1024
  arch: amd64
  os: linux
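Before reloading, it can help to sanity-check the broker list, since a typo in syncer.to.kafka-addrs only shows up once drainer restarts. The sketch below splits the comma-separated string into host and port pairs and offers an optional TCP reachability probe. Both helpers are hypothetical, and an open port only shows that something is listening, not that it is a working Kafka broker.

```python
import socket


def parse_kafka_addrs(addrs: str) -> list[tuple[str, int]]:
    """Split 'host:port,host:port' into a list of (host, port) pairs."""
    pairs = []
    for item in addrs.split(","):
        host, sep, port = item.strip().rpartition(":")
        if not sep or not host or not port.isdigit():
            raise ValueError(f"bad broker address: {item!r}")
        pairs.append((host, int(port)))
    return pairs


def is_reachable(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


brokers = parse_kafka_addrs("192.168.1.209:9092,192.168.1.210:9092,192.168.1.211:9092")
print(brokers)
```

From the drainer host, calling `is_reachable(host, port)` for each pair gives a quick check of network connectivity before running the reload.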
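9 The drainer log shows the complete initialization process from pump to drainer
The leading part of each log line, in the format [2020/08/31 11:18:38.322 +08:00] [INFO], is omitted below;
each remaining line starts with the Go source file and line number.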
[version.go:50] ["Welcome to Drainer"] ["Release Version"=v4.0.0-rc.2] ["Git Commit Hash"=a75036cf8933a581cac42c1007bf92c9e5417b90] ["Build TS"="2020-05-27 11:00:45"] ["Go Version"=go1.13] ["Go OS/Arch"=linux/amd64]
[main.go:46] ["start drainer..."] [config="{\"log-level\":\"info\",\"node-id\":\"192.168.1.23:8249